(Image credit: Pixabay)
Stanford University’s SQuAD (Stanford Question Answering Dataset) is a reading comprehension test built from 500+ Wikipedia articles. The answer to every question is a segment of text taken from the corresponding reading passage. With 100,000+ question/answer pairs, it’s a tough test, even for the best of us, but apparently not so much for artificial intelligence.
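To make the "answer is a segment of the passage" idea concrete, here is a minimal Python sketch of a span-based question/answer record and a simplified scoring check. The class name, field names, sample passage and the `exact_match` helper are all assumptions made up for illustration; they are not taken from Stanford's dataset or its official evaluation script, which applies more normalization than this.

```python
from dataclasses import dataclass


@dataclass
class QAExample:
    """One hypothetical span-based question/answer record."""
    context: str       # the reading passage
    question: str      # the question asked about the passage
    answer_start: int  # character offset where the answer begins
    answer_text: str   # the answer, copied verbatim from the passage

    def answer_span(self) -> str:
        # Recover the answer by slicing the passage at the stored offset.
        end = self.answer_start + len(self.answer_text)
        return self.context[self.answer_start:end]


def normalize(text: str) -> str:
    # Simplified normalization (an assumption): lowercase and collapse whitespace.
    return " ".join(text.lower().split())


def exact_match(prediction: str, reference: str) -> bool:
    # True only when the predicted answer equals the reference after normalization.
    return normalize(prediction) == normalize(reference)


# Made-up passage for illustration only; not an actual dataset entry.
context = (
    "Rain forms when water vapor condenses into droplets "
    "that grow heavy enough to fall under gravity."
)
answer_text = "water vapor condenses into droplets"

example = QAExample(
    context=context,
    question="What causes rain?",
    answer_start=context.index(answer_text),
    answer_text=answer_text,
)
```

In this sketch a system "answers" by pointing at a stretch of the passage, so `example.answer_span()` returns the stored answer, and `exact_match` only gives credit when the predicted text lines up with it.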
A recent Bloomberg article reports that Microsoft and Alibaba’s AI platforms have scored better than humans on Stanford’s tough test. A deep neural network from Alibaba’s Institute of Data Science and Technologies was the first to surpass the human mark for reading comprehension, scoring 82.44 against the highest human score of 82.304. The next day, Microsoft’s platform managed to edge out Alibaba with a score of 82.650.
Stanford’s set of test questions was designed to analyze whether machine-learning platforms can process large amounts of information accurately before producing precise answers. In a statement to Bloomberg, Alibaba’s chief scientist for natural language processing, Luo Si, was quoted as saying, “That means objective questions such as ‘what causes rain’ can now be answered with high accuracy by machines. The technology underneath can be gradually applied to numerous applications such as customer service, museum tutorials and online responses to medical inquiries from patients, decreasing the need for human input in an unprecedented way.”
While those AI platforms performed remarkably well, even edging past the human score, it doesn’t mean they have the same comprehension skills humans do. They don’t understand what they’re reading. For example, they can’t tell you who Luke Skywalker really is beyond answering a question about Mark Hamill. They also can’t make sense of sentences or passages that are ambiguous in context, such as “The bouncer refused them entry because he feared violence.” That is something natural for humans to grasp, but AI doesn’t know who “them” refers to.
With that said, it also doesn’t mean AI isn’t on the fast track to understanding natural language and how it applies to comprehension. Work is being done in this area, with researchers figuring out ways for AI to gain a grasp of human understanding.
Have a story tip? Message me at: cabe(at)element14(dot)com