Research Hit: Humans are Better than AI at Reading the Room
Unlike human beings, AI is terrible at predicting what is happening in social scenes.
I would imagine that humans are better than AI at reading the room.
Yes and no. AI is reaching amazing levels of competence in multiple areas, and in many it outperforms human beings, such as in medical screening. This is because AI can process so much more information and can compare and learn from massive datasets.
But AI lacks the predictive ability of human beings.
What do you mean by predictive ability?
Let me give you an example from cars and driving. I live close to the centre of a small town (because I love it) in beautiful Switzerland, and on one side of the building is a large road. I often step out for a walk, and while walking down the road I may head towards a crossing: I slowly veer towards it and look over my shoulder in preparation, while still a few metres away.
When I do this, cars normally slow down, or stop, to allow me to cross. Yet I wasn’t in an obvious crossing position, such as waiting at the crossing; I had merely changed direction by a few degrees and glanced over my shoulder. But drivers notice these subtle cues, predict that I may want to cross, and slow down, or at least keep an eye on me. This is our predictive brain at play in real time. We human beings have excellent predictive brains (in fact, I and others argue that a key feature of the brain is its predictive ability).
We can often see what people are about to do or say in advance, from very subtle and nuanced cues. This is important, for example, for autonomous driving and, as the title here says, for reading the room.
But do we know that AI can’t do this?
Well, this is precisely what Kathy Garcia et al. of Johns Hopkins University studied. The team of researchers got human participants to caption over 300 pictures and short videos of human interactions. These could be something like “two people arguing” or “three people socialising and laughing together over a shared joke”.
Then over 350 AI language, video, and image models were asked to predict how humans would respond, and also to predict human brain activity (that part is interesting too).
So how well did AI do?
In short: abysmally. Human beings consistently responded in similar ways to one another, but the AI tools failed to predict these responses. For example, even the image models could not predict whether the people in the pictures were communicating with each other.
Large language models such as ChatGPT were better at predicting human behaviour, and video models had a reasonable stab at predicting activity in the human brain (but this is less truly predictive, because it is the current content that predicts the neural activity).
But AI does seem to be very impressive in many ways?
Yes indeed. But tasks like matching static scenes, at which AI is now incredibly competent (with the occasional glaring mistake), are very different from prediction. We are built to predict the next move, the next steps, and the follow-on actions; this is the only way we can respond so effectively in real time in the real world.
In fact, as I write about in my Handbook of The Brain in Business, having a predictive brain is a core principle, and some argue THE core principle, of the human brain.
But will AI catch up here?
Maybe. But this may be a limitation of the current technology: the AI I reported on below may be able to do better, though it is at a very early stage at the moment.
Well, I’m happy that, at least for now, we human beings are clearly superior to AI.
Me too!
Reference
Kathy Garcia, Emalie McMahon, Colin Conwell, et al.
Modeling dynamic social vision highlights gaps between deep learning and humans
International Conference on Learning Representations (ICLR), 2025
https://iclr.cc/virtual/2025/poster/27867