Maybe we do have “ChatGPT at home,” now.
OpenAI recently released two actually-open LLMs, suitable for running on local, consumer-grade PC hardware. gpt-oss:20b and gpt-oss:120b appear to be among the most capable, efficient, and reliable local LLMs I’ve tested so far. They’re no GPT-5-Thinking, of course, but even the 20b model has handled every logic puzzle I’ve thrown at it, including competently writing a C function to find the midpoint of a great-circle path anywhere on Earth.
Performance, at least for the 20b model, is quite good considering the high quality of the responses. Inference runs at about 12-13 tokens per second on a Core i9 system with an RTX 4070 GPU. (128GB system RAM; ~16GB total used, so it nearly fits in the 12GB of VRAM.) The 120b model runs at 4-5 tokens per second, which is fair considering it’s 6x larger. (I believe the 120b model uses a mixture-of-experts scheme to limit how much of the model is active at any one time.)
The ability to have local intelligent agents handling various tasks will open up a whole range of new, interesting projects. The next step is to get a sense of what kinds of tasks each LLM model size can handle. qwen3:0.6b is really fast, but usually loses the plot when asked anything beyond a basic question. gpt-oss:120b is very capable, but communication is so slow that it might as well happen via Morse code.
