
OpenAI's newest chatbot is less hallucinatory and may even count to three

OpenAI has unleashed yet another chatbot on us poor, innocent humans. Meet o1, a new chatbot designed to perform more advanced reasoning. It's said to be better than other chatbots at coding, math, and solving multi-step problems.

The biggest change compared to previous OpenAI LLMs may be the shift from mimicking patterns in text training data to a more direct problem-solving approach, courtesy of reinforcement learning. The net result, according to OpenAI's researchers, is a more accurate and consistent chatbot.

OpenAI research lead Jerry Tworek told The Verge that "we have noticed that this model hallucinates a lot less." Of course, "hallucinates less" does not mean no hallucinations. "We can't claim to have solved hallucinations," Tworek says. Ah.

o1 is said to use a "chain-of-thought" approach, similar to the way humans solve problems step by step. This reportedly contributes to much higher performance on tasks like coding and math.

According to reports, o1 scored an impressive 83% on the qualifying exam for the International Mathematics Olympiad. That is a far cry from the meager 13% that GPT-4o managed. It has also done well in coding contests, and OpenAI claims that a forthcoming update will allow it to match PhD students "in challenging benchmark tasks in physics, chemistry, and biology."

In some ways, however, the new bot is actually worse. It can't browse the web or process images, and it has less information about the world. It is also slower to respond than GPT-4o.

One immediate question that arises from all of this is whether the new chatbot suffers from any of the surprising limits of previous bots. Can o1, say, count to three?

Yes, it can. GPT-4o is apparently unable to count the r's in the word "strawberry", only managing to reach two. But o1 can count to three.
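For the record, conventional code has never struggled with this one. A minimal Python sketch, offered purely as our own illustration and saying nothing about how either model works internally:

```python
# Count occurrences of the letter "r" in "strawberry" with plain string handling.
word = "strawberry"
r_count = word.count("r")
print(r_count)
```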

This step-change in counting does not come cheap. Developer access costs $15 per 1 million input tokens and $60 per 1 million output tokens. That is three and four times more expensive, respectively, than GPT-4o.
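To put those rates in per-request terms, here is a rough Python sketch. The o1 prices come from the figures above; the GPT-4o prices are not quoted in this article and are simply back-derived from the stated "three and four times" ratios, so treat them as an assumption. The workload of 2,000 input tokens and 1,000 output tokens is hypothetical.

```python
# Prices in USD per 1 million tokens.
# o1 figures are taken from the article.
O1_PRICE = {"input": 15.00, "output": 60.00}
# Assumption: GPT-4o figures are back-derived from the article's 3x and 4x ratios.
GPT4O_PRICE = {"input": O1_PRICE["input"] / 3, "output": O1_PRICE["output"] / 4}


def request_cost(price, input_tokens, output_tokens):
    """Return the USD cost of one request at the given per-million-token prices."""
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2,000 input tokens and 1,000 output tokens.
o1_cost = request_cost(O1_PRICE, 2_000, 1_000)
gpt4o_cost = request_cost(GPT4O_PRICE, 2_000, 1_000)
print(f"o1: ${o1_cost:.4f}  GPT-4o: ${gpt4o_cost:.4f}")
```

Because output tokens carry the steeper multiplier, the gap widens for requests that generate a lot of text.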

ChatGPT Plus and Team users have reportedly been given access to a preview version of the bot called o1-preview. OpenAI has not yet announced a release date for a free version called o1-mini.

It sounds like a bot that can provide more reliable responses along with more practical reasoning. That would be a step toward something more useful in the real world, and closer to general or human-like intelligence.

That is OpenAI's plan. Bob McGrew, OpenAI's Chief Research Officer, says: "We spent many months working on reasoning because we believe this is the critical breakthrough. Fundamentally, this is a different way of modeling to be able to solve the really difficult problems required to progress to human-like intelligence levels."

If it can really count to three, I'm impressed. As a routine precaution, it goes without saying that, well, you'll know the rest.
