
The new OpenAI models impress, but they remain far from AGI

Every time OpenAI presents new research, it’s worth considering how close the company is to its mission: achieving artificial general intelligence.

The emergence of AGI – a type of AI that can emulate the ingenuity, judgment and reasoning of humans – has been an industry obsession since Alan Turing’s work in the 1950s gave rise to the Turing test. Three months after the launch of ChatGPT, OpenAI reaffirmed its ambition to deliver AGI.

So how does its latest release fare?

On Thursday, after much anticipation, the San Francisco-based company led by Sam Altman finally unveiled OpenAI o1, a new series of AI models that are “designed to spend more time thinking before responding.”

OpenAI’s big claims about the models suggest they are entering a new paradigm in the generative AI boom. Some experts agree. But do they put the industry on the cusp of AGI? Not yet.

AGI is still a long way off


OpenAI’s new o1 models bring improvements to AI models’ ability to reason. VCG/Getty Images



OpenAI has tried to strike a careful balance between managing expectations and generating hype in unveiling its new models.

OpenAI said the current GPT-4o model behind the chatbot remains better at “browsing the web for information,” and the o1 models lack “many of the features that make ChatGPT useful.” Even so, the company claimed the o1 models represent a “significant improvement” on complex reasoning tasks.

The company is so confident in this claim that it said it is “resetting the counter back to one” with the release of these new models – limited to a preview version for now – naming them “o1” to symbolize the new paradigm they represent.

In some ways, the o1 models do usher OpenAI into a new paradigm.

The company said the models emulate the capabilities of Ph.D. students on “challenging benchmark assignments in physics, chemistry, and biology.” They can also excel in tough competitions, such as the International Mathematical Olympiad and the Codeforces programming competition, OpenAI added.

There appear to be several reasons for this performance increase. OpenAI said it “trained these models to spend more time thinking about problems before responding, just like a person.”

“Through training, they learn to refine their thought process, try different strategies and admit their mistakes,” the company noted in a blog post.

Noam Brown, a researcher at OpenAI, provided a useful way to think about this. The models, he wrote on X, were trained to have a “private chain of thought” before responding, which essentially means they spend more time “thinking” before they speak.

Where previous AI models were limited by the data fed to them during the “pre-training” phase, Brown wrote, the o1 models show that “we can now scale inference” – the stage at which a model generates responses to inputs it has not seen before.

Jim Fan, a senior researcher at Nvidia, pointed to the technical shift that made this key capability of OpenAI’s o1 models possible.

As Fan wrote, that’s because a huge amount of computing power once reserved for the training portion of building an AI model has been “turned instead to serve inference.”

However, it is not clear that this brings OpenAI much closer to AGI.

After the release of the o1 models, OpenAI chief Altman responded to an X post by Will Depue – an OpenAI employee who pointed out how far large language models have come in the past four years – writing, “stochastic parrots can fly so high...”

It was a subtle reference from Altman to a research paper published in 2021 that characterized the types of AI models OpenAI works on as technologies that appear to understand the language they generate, but don’t. Was Altman suggesting that the o1 models are still stochastic parrots, just remarkably capable ones?

Meanwhile, others have pointed out that the models seem to be stuck with some of the same issues associated with previous models. Uncertainty hangs over how the o1 models will perform more broadly.

Ethan Mollick, a Wharton management professor who spent some time experimenting with the o1 models before their unveiling on Thursday, noted that despite the clear leap in reasoning capabilities, “errors and hallucinations still happen.”

Nvidia’s Fan also noted that applying o1 to products is “much more difficult than reaching the academic standards” that OpenAI was using to showcase the reasoning capabilities of its new models.

It remains to be seen how OpenAI and the wider AI industry work to address these issues.

While the o1 models’ reasoning capabilities move OpenAI into a new era of artificial intelligence development, the company itself placed its technology at stage two on a five-step intelligence scale this summer.

If it is serious about reaching its ultimate goal of AGI, it has a long way to go.
