I have many conversations with people about Large Language Models like ChatGPT and Copilot. The idea that “it makes convincing sentences, but it doesn’t know what it’s talking about” is a difficult concept to convey or wrap your head around. Because the sentences are so convincing.
Any good examples on how to explain this in simple terms?
Interesting thoughts! Now that I think about this, we as humans have a huge advantage by having not only language, but also sight, smell, hearing and taste. An LLM basically only has “language.” We might not realize how much meaning we create through those other senses.
To add to this insight, there are many recent publications showing the dramatic improvements of adding another modality like vision to language models.
While this is my conjecture that is loosely supported by existing research, I personally believe that multimodality is the secret to understanding human intelligence.