Gemini: The AI That Could Supercharge Google?
Category: Artificial Intelligence · Monday, July 3, 2023, 15:33 UTC
Recent progress in AI has been impressive, and OpenAI's flagship GPT-4 may soon be one-upped by Google DeepMind's new algorithm, Gemini. The new model will combine the strengths of AlphaGo with the capabilities of GPT-4: it will fuse multiple types of data, use reinforcement learning, and be designed for API integrations. Training it on Google's data sets could yield wild results. It remains to be seen what Google's end goal with Gemini is.
Recent progress in Artificial Intelligence (AI) has been startling. Barely a week goes by without a new algorithm, application, or implication making headlines. But OpenAI, the source of much of the hype, only recently completed its flagship algorithm, GPT-4, and according to OpenAI CEO Sam Altman, its successor, GPT-5, hasn't begun training yet.
It's possible the tempo will slow down in the coming months, but don't bet on it. A new AI model as capable as GPT-4, or more so, may drop sooner rather than later.
This week, in an interview with Will Knight, Google DeepMind CEO Demis Hassabis said their next big model, Gemini, is currently in development, “a process that will take a number of months.” Hassabis said Gemini will be a mashup drawing on AI's greatest hits, most notably DeepMind's AlphaGo, which employed reinforcement learning to topple a champion at Go in 2016, years before experts expected the feat.
“At a high level you can think of Gemini as combining some of the strengths of AlphaGo-type systems with the amazing language capabilities of the large models,” Hassabis told Wired. “We also have some new innovations that are going to be pretty interesting.” All told, the new algorithm should be better at planning and problem-solving, he said.
The Era of AI Fusion
Many recent gains in AI have been thanks to ever-bigger algorithms consuming more and more data. As engineers increased the number of internal connections—or parameters—and began to train them on internet-scale data sets, model quality and capability increased like clockwork. As long as a team had enough cash to buy chips and access to data, progress was nearly automatic because the structure of the algorithms, called transformers, didn't have to change much.
Then in April, Altman said the age of big AI models was over. Training costs and computing power had skyrocketed, while gains from scaling had leveled off. “We'll make them better in other ways,” he said, but didn't elaborate on what those other ways would be.
GPT-4, and now Gemini, offer clues.
Last month, at Google's I/O developer conference, CEO Sundar Pichai announced that work on Gemini was underway. He said the company was building it “from the ground up” to be multimodal—that is, trained on and able to fuse multiple types of data, like images and text—and designed for API integrations (think plugins). Now add in reinforcement learning and perhaps, as Knight speculates, other DeepMind specialties in robotics and neuroscience, and the next step in AI is beginning to look a bit like a high-tech quilt.
But Gemini won't be the first multimodal algorithm. Nor will it be the first to use reinforcement learning or support plugins. OpenAI has integrated all of these into GPT-4 with impressive effect.
If Gemini goes that far, and no further, it may match GPT-4. What's interesting is who's working on the algorithm. Earlier this year, DeepMind joined forces with Google Brain. The latter invented the first transformers in 2017; the former designed AlphaGo and its successors. Mixing DeepMind's toolkit with Google's transformers, and training the combination on Google's data sets, could yield wild results. It's an all-in-one AI lab.
It remains unclear what Google is building with Gemini. Is it just a powerful language machine, like GPT-4, capable of out-thinking human Jeopardy contestants and writing articles? Or will it be something deeper, with a modicum of common sense and robot hands adapted for the real world? We'll have to wait and see. The question is, when will we know?