Artificial Intelligence And The Tokyo Test

80% As Good As A Human?

There can be no doubt about Artificial Intelligence’s high performance. The issue is that AI’s ability is judged against human standards, which is inaccurate, if not somewhat narcissistic. The best-known method of testing AI is the Turing Test, first described in a 1950 paper by Alan Turing. If a questioner can’t work out which of two test participants is human, then the AI has passed. For an idea of what this might look like when applied to artificially intelligent humanoids, watch the anxiety-inducing film Ex Machina. Pop culture aside, Turing’s brainchild is beginning to look more and more outdated. How do we test AI that doesn’t just perform as well as humans, but better than them?

Jumping through virtual hoops

Some AI systems already appear to come close to passing the Turing Test – customer service bots and some social robots, for example. It’s also clear that AI can outperform humans at a number of tasks – could you instantly digest and analyse a giant dataset? With this in mind, testing AI on how human it is has become less relevant. Researchers in Japan have taken this on board and proposed an alternative testing method called the Tokyo Test. Unlike the Turing Test, it doesn’t just ask AI to pass as a human. Instead, AI must convince examiners that it is high-functioning enough to pass the notoriously difficult entrance exam for the University of Tokyo.

Japan’s National Institute of Informatics set about creating an AI called the Todai Robot to test the effectiveness of the exam itself. The machine failed to pass, but its score was still higher than 80 per cent of human participants. It’s not outlandish to assume that one day, AI will make the grade. Other researchers have looked to tests that explore how intelligent systems handle real-world information. As well as needing AI to crunch numbers and deliver insights, humanity also needs to trust it to make difficult moral choices. The way AI has been tested until now is no longer fit for purpose – but how will this affect the relationship between humans and machines?

You vs. AI

Testing AI in the same way as humans leads to some unsettling home truths. If humans and AI can be tested in the same way, then what sets them apart? Automation and AI are increasingly part of our everyday lives, and their influence will only grow. Changing the way we test machines could also disrupt human education and examination. Speaking at the April 2017 TED Conference, AI expert Noriko Arai expressed concern over the format of human education – should we test humans differently now that AI can achieve extensive data retention? While AI may be built to store and sort information, that is arguably still a skill humans should have. That said, perhaps educators should test students not on their ability to regurgitate information, but on skills which are inherently human: conversation, compassion, creativity, collaboration and adaptability. This would require a complete overhaul of the existing, fact-focused infrastructure. Given that creative skills are becoming more valued by businesses that can employ AI to deal with data, that isn’t so difficult to imagine.

In terms of the technology itself, developers are likely to build ever more complex tests to push AI to its absolute limit. One day, AI may well be able to pass as a high-performing human – but factual intelligence simply isn’t enough. Researchers need to trust that AI can function in everyday scenarios, making decisions that humans often struggle with… and neither the Turing Test nor the Tokyo Test will reveal that.

Artificial Intelligence may not have definitively passed the Turing Test or the Tokyo Test, but at this rate, it seems it won’t be long before it does. However, while it shows that AI could potentially pass a challenging exam, even the Tokyo Test seems to miss the point. AI’s wider application in business and society suggests that tests would be more effective if they focused on how software copes with real-world environments. Examining how quickly it can make a moral decision will probably be more useful than how well it writes an essay. If human nature runs true to form, researchers will keep pushing AI to do more, be more, deliver more insights and take on more tasks. But when does it end? At what point do AI developers step back and admit that AI has surpassed human ability? Here’s a scary suggestion – never.

What other considerations should be made when testing AI? Will researchers continue to push the boundaries of AI until we reach the singularity? Share your thoughts.