39 episodes of 'CSI' used to build AI's natural language model
The show's predictability makes it the ideal robo-cop training tool
A group of University of Edinburgh boffins have turned CSI: Crime Scene Investigation scripts into a natural language training dataset.
Their aim is to improve how bots understand what's said to them – natural language understanding.
Drawing on 39 episodes from the first five seasons of the series, Lea Frermann, Shay Cohen and Mirella Lapata have broken the scripts up into inputs for an LSTM (long short-term memory) model.
The boffins used the show because of its worst flaw: a rigid adherence to formulaic scripts that makes it utterly predictable. Hence the name of their paper: “Whodunnit? Crime Drama as a Case for Natural Language Understanding”.
“Each episode poses the same basic question (i.e., who committed the crime) and naturally provides the answer when the perpetrator is revealed”, the boffins write. In other words, identifying the perpetrator is a straightforward sequence labelling problem.
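For readers who want a feel for what "sequence labelling" means here, the following is a minimal sketch, assuming each sentence of a script gets a binary label: 1 if it mentions the perpetrator, 0 otherwise. The function name `label_sentences`, the toy sentences and the string-matching rule are all invented for illustration; they are not the researchers' code or data, and their model learns the labels rather than matching names.

```python
from typing import List


def label_sentences(sentences: List[str], perpetrator: str) -> List[int]:
    """Return 1 for each sentence that mentions the perpetrator, else 0.

    Hypothetical helper: a plain case-insensitive substring match stands in
    for the human annotation a real dataset would need.
    """
    name = perpetrator.lower()
    return [1 if name in sentence.lower() else 0 for sentence in sentences]


# Invented toy "episode"; not taken from the researchers' dataset.
episode = [
    "Grissom examines the body in the kitchen.",
    "The neighbour, Mr Smith, says he heard nothing.",
    "Catherine finds fibres on the window.",
    "The fibres match Smith's jacket.",
]

gold_labels = label_sentences(episode, "Smith")
print(gold_labels)  # [0, 1, 0, 1]
```

A model trained on this framing sees the sentences one at a time and has to predict each label as it goes, which is what lets it "guess" before the episode ends.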
What the researchers wanted was for their model to follow the kind of reasoning a viewer goes through in an episode: learn about the crime and the cast of characters, then start to guess who the perp is. They also wanted to see whether the model could outperform the humans.
The human sample was small – just three individuals – but those who worry robots will replace humans can at least take heart that we can still outperform the AI in answering “whodunnit?”
Humans beat the LSTM model on precision, but we're more cautious: the researchers' model would put in its first guess (right or wrong) at the 190th sentence in an episode, whereas humans typically waited for 300 sentences.
“Once humans guess the perpetrator, however, they are very precise and consistent”, the researchers write; when models got the identification right, they guessed sooner. The models also picked out “first mentions” of perpetrators quickly.
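The trade-off between guessing early and guessing accurately can be sketched in a few lines. This is an illustration only, assuming per-sentence binary predictions (1 means "this sentence mentions the perp"); the helper names and the toy numbers are invented and are not the paper's results or evaluation code.

```python
from typing import List, Optional


def first_guess_index(predictions: List[int]) -> Optional[int]:
    """1-based position of the first positive guess, or None if there is none."""
    for position, guess in enumerate(predictions, start=1):
        if guess == 1:
            return position
    return None


def precision(predictions: List[int], gold: List[int]) -> float:
    """Share of positive guesses that were correct (0.0 if nothing was guessed)."""
    guesses = sum(predictions)
    if guesses == 0:
        return 0.0
    correct = sum(1 for p, g in zip(predictions, gold) if p == 1 and g == 1)
    return correct / guesses


# Invented toy data: five sentences, the perp is mentioned in sentences 2, 4 and 5.
gold = [0, 1, 0, 1, 1]
eager = [1, 1, 0, 1, 0]     # guesses from the first sentence, one wrong guess
cautious = [0, 0, 0, 1, 1]  # waits, then gets every guess right

print(first_guess_index(eager), precision(eager, gold))
print(first_guess_index(cautious), precision(cautious, gold))
```

In this toy case the eager guesser commits at sentence 1 with two of three guesses correct, while the cautious one waits until sentence 4 and makes no wrong guesses, which mirrors the model-versus-human pattern described above.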
The best way to confuse the AI, it turned out, was to have no perpetrator at all: in the one episode of the 39 involving a suicide, human viewers worked out the twist about two-thirds of the way through, while the model kept guessing right to the end.
The researchers' annotated screenplays are on GitHub.