OpenAI Brings GPT-5 Closer to Reality
Category: Artificial Intelligence | Monday, November 27, 2023, 01:46 UTC

OpenAI is reported to be developing its most advanced AI system yet, GPT-5. The team has spoken about the difficulties it faces, especially around access to data, but has made recent advances in speed and computing power. It has also been in conversations to access proprietary data that is not available to the public.
At an MIT event in March, OpenAI cofounder and CEO Sam Altman said his team wasn’t yet training its next AI, GPT-5. "We are not and won’t for some time," he told the audience.
This week, however, new details about GPT-5’s status emerged.
In an interview, Altman told the Financial Times the company is now working to develop GPT-5. Though the article did not specify whether the model is in training—it likely isn’t—Altman did say it would need more data. The data would come from public online sources—which is how such algorithms, called large language models, have previously been trained—and proprietary private datasets.
This lines up with OpenAI’s call last week for organizations to collaborate on private datasets, as well as prior work to acquire valuable content from major publishers like the Associated Press and News Corp. In a blog post, the team said they want to partner on text, images, audio, or video but are especially interested in “long-form writing or conversations rather than disconnected snippets” that express “human intention.”
It’s no surprise OpenAI is looking to tap higher-quality sources not available publicly. AI’s extreme data needs are a sticking point in its development. The rise of the large language models behind chatbots like ChatGPT was driven by ever-bigger algorithms consuming more data. Of the two, more and higher-quality data may yield the greater near-term gains. Recent research suggests smaller models fed larger amounts of data perform as well as or better than larger models fed less.
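As a rough illustration of that tradeoff, the sketch below applies the widely cited heuristic from compute-optimal scaling research of roughly 20 training tokens per model parameter. The heuristic and the model sizes are assumptions chosen for illustration, not figures from OpenAI or the article.

```python
# Illustrative only: a rough sketch of the data-vs-model-size tradeoff described above,
# using the oft-cited "Chinchilla" heuristic of ~20 training tokens per parameter.
# Both the ratio and the model sizes below are assumptions, not figures from this article.

def compute_optimal_tokens(num_params: float, tokens_per_param: float = 20.0) -> float:
    """Estimate how many training tokens a model of a given size 'wants'."""
    return num_params * tokens_per_param

for params in (7e9, 70e9, 400e9):  # hypothetical model sizes, in parameters
    tokens = compute_optimal_tokens(params)
    print(f"{params / 1e9:>5.0f}B params -> ~{tokens / 1e12:.1f}T training tokens")
```

Under this heuristic, data demand grows in lockstep with model size, which is why a better-quality (and larger) data supply matters as much as a bigger model.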
“The trouble is that, like other high-end human cultural products, good prose ranks among the most difficult things to produce in the known universe,” Ross Andersen wrote in The Atlantic this year. “It is not in infinite supply, and for AI, not any old text will do: Large language models trained on books are much better writers than those trained on huge batches of social-media posts.”
After scraping much of the internet to train GPT-4, it seems the low-hanging fruit has largely been picked. A team of researchers estimated last year the supply of publicly accessible, high-quality online data would run out by 2026. One way around this, at least in the near term, is to make deals with the owners of private information hoards.
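The sketch below shows the kind of back-of-envelope arithmetic behind such an exhaustion estimate: a fixed stock of high-quality tokens set against training demand that grows each year. The stock, demand, and growth figures are placeholders for illustration, not the researchers’ actual numbers.

```python
# Back-of-envelope sketch of a data-exhaustion estimate. All numbers are placeholders.

def years_until_exhaustion(stock_tokens: float, demand_tokens: float, demand_growth: float) -> int:
    """Count the years until cumulative training demand exceeds the available stock."""
    year = 0
    used = 0.0
    while used < stock_tokens:
        used += demand_tokens       # tokens consumed this year
        demand_tokens *= demand_growth  # demand grows year over year
        year += 1
    return year

# Hypothetical: ~10T high-quality tokens available, 1T consumed this year, demand doubling yearly.
print(years_until_exhaustion(stock_tokens=10e12, demand_tokens=1e12, demand_growth=2.0))
```

With demand compounding like this, even a large fixed stock runs out within a handful of years, which is the basic logic behind the 2026 projection.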
Computing is another roadblock Altman addressed in the interview.
Foundation models like OpenAI’s GPT-4 require vast supplies of graphics processing units (GPUs), a type of specialized computer chip widely used to train and run AI. Chipmaker Nvidia is the leading supplier of GPUs, and since the launch of ChatGPT, its chips have been the hottest commodity in tech. Altman said OpenAI recently took delivery of a batch of Nvidia’s latest H100 chips, and he expects supply to loosen up further in 2024.
In addition to greater availability, the new chips appear to be speedier too.
In tests released this week by AI benchmarking organization MLPerf, the chips trained large language models nearly three times faster than the mark set just five months ago.