The Pitfalls of OpenAI ChatGPT and Google Gemini Large Language Models
Category: Artificial Intelligence - Monday, March 4, 2024, 07:34 UTC
The lack of an internal model of the physical world in large language models such as OpenAI's ChatGPT and Google's Gemini has caused glitches and inaccuracies in generated images and videos. This is a major concern for using these systems in tasks such as self-driving cars and humanoid robots, because the generated data may not adhere to real-world physics. The flaw stems from the systems' inability to accurately reconstruct reality and cannot be solved by adding more data. Constraints such as truthfulness and adherence to the laws of physics cannot be encoded in generative AI systems, highlighting the dangers of relying on them for important tasks.
Generative AI has been a hot topic in recent years, with major advances from companies such as OpenAI and Google. Their large language models, ChatGPT and Gemini respectively, have astounded the world with their ability to generate text and other media. However, as with any new technology, flaws and ethical concerns are bound to arise. In the case of these models, one major flaw stands out: the lack of an internal model of the physical world.
It may seem like a small oversight, but the consequences are significant. Without a model of the physical world, these systems struggle to generate accurate images and videos: basic concepts such as gravity and object permanence are not respected, producing glitches in which objects appear, disappear, or defy the laws of physics. For casual use this may be tolerable, but when these systems are applied to tasks like self-driving cars or humanoid robots, accuracy and safety are of the utmost importance.
OpenAI has claimed that its video generation system can provide real-world training data for such tasks, but if the generated videos do not adhere to real-world physics, this would be counterproductive. Human testers would ultimately need to verify the accuracy of the data by hand, defeating the purpose of using artificial intelligence in the first place. This has caused concern among experts and critics, who point out that the flaws in these systems are not a product of the training data but of the systems' inability to reconstruct reality.
One expert, Gary Marcus, has been vocal about this issue. As the founder of Geometric Intelligence, a company that specialized in world models and was later acquired by Uber, Marcus is well versed in the importance of an accurate representation of the physical world in AI systems. He notes that the glitches in generated images and videos resemble artifacts from a game engine such as Unreal, and that the generated footage looks more like morphing and splicing than a faithful simulation of the real world, making it a poor substitute for real-world data.
The root of the issue lies in the system's inability to accurately reconstruct reality. These glitches are not present in the training data, but rather arise from a flaw in how the system processes and compresses information. This is a fundamental flaw that cannot be solved by simply adding more data. Additionally, there is no way to encode constraints such as truthfulness or adherence to the laws of physics in generative AI systems. This further highlights the dangers of relying on these systems for important tasks.
It is clear that space, time, and causality are crucial components of any world model, and without them, there will continue to be issues with generative AI systems. While OpenAI's ChatGPT and Google's Gemini have made significant strides in the field of AI, it is important to recognize and address these flaws in order to ensure the ethical and safe use of this technology.