I have many conversations with people about Large Language Models like ChatGPT and Copilot. The idea that “it makes convincing sentences, but it doesn’t know what it’s talking about” is a difficult concept to convey or wrap your head around. Because the sentences are so convincing.
Any good examples on how to explain this in simple terms?
I think a good example would be finding similar prompts that reliably give contradictory information.
It’s sort of like auto pilot. It just believes everything and follows everything as if they’re instructions. Prompt injection and jail breaking are examples of this. It’s almost exactly like the trope where you trick an AI into realizing it’s had a contradiction and it explodes.