Expert guest contributor Andrew Burgess discusses AI and clustering
It was the French philosopher Voltaire who famously said that you should judge a person by the questions they ask rather than the answers they give. These may be very wise words when applied to humans, but for artificial intelligence the situation is even simpler: the machine doesn’t need to know what the question is in the first place in order to give a respectable answer.
This is the whole idea behind Clustering, one of the eight core capabilities of AI that I describe in my book. You can present a large volume of data to a suitable algorithm and it will find clusters of similar data points. These clusters may depend on many different features; not just a couple of things like salary and propensity to buy quinoa (for example), but, in some cases, many hundreds. The AI is providing mathematical muscle beyond the capability of a human brain to find these clusters.
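To make the idea concrete, here is a minimal sketch of one common clustering method, k-means, written in plain Python. It is an illustration under stated assumptions rather than the approach used by any particular product: the data rows are invented, the function names (`kmeans`, `squared_distance`) are hypothetical, and the simple 'farthest-first' starting rule is a shortcut chosen to keep the example deterministic. Note that the columns carry no labels at all; the algorithm sees only numbers.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length rows."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Group rows of numbers into k clusters; returns one label per row."""
    # Farthest-first initialisation (an assumption made for simplicity):
    # start from the first row, then keep adding the row that is farthest
    # from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign every row to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 2: move each centroid to the mean of the rows assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Invented data: six rows, four unnamed features each.
rows = [
    (1.0, 2.0, 1.5, 0.5),
    (1.2, 1.8, 1.0, 0.7),
    (0.8, 2.2, 1.4, 0.4),
    (9.0, 8.5, 9.5, 9.2),
    (9.3, 8.8, 9.1, 9.0),
    (8.7, 9.1, 9.4, 8.8),
]

labels = kmeans(rows, k=2)
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

The same code would run unchanged on rows with hundreds of features; only the cost of the arithmetic grows, which is where the 'mathematical muscle' comes in.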
But these clusters are not (or, more accurately, don’t have to be) based on any predetermined ideas or questions. The algorithm will just treat the information as lots of numbers to crunch, without a care whether it represents data about cars, houses, animals or people. But, whilst this naivety about the data is one of AI’s strengths, it can also be considered a flaw.
For big data clustering solutions, the algorithm may find patterns in data that correlate but are not causal. To take a rather whimsical example, if an AI system found a correlation between eye colour and propensity to buy yoghurt, it would take a human to work out that this is very unlikely to be meaningful; the machine would be naive to that level of insight.
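A short sketch shows how easily this can happen. The data below is pure random noise, so by construction nothing causes anything else, yet with enough features and only a few observations some pair will usually look strongly related. The sizes (20 people, 200 features), the seed and the helper name `pearson` are all assumptions chosen for illustration.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable
n_people, n_features = 20, 200

# Every feature is independent random noise: no real relationships exist.
features = [
    [rng.gauss(0, 1) for _ in range(n_people)] for _ in range(n_features)
]

# Search all pairs of features for the strongest apparent relationship.
strongest = max(
    abs(pearson(features[i], features[j]))
    for i, j in combinations(range(n_features), 2)
)
print(f"Strongest correlation among unrelated features: {strongest:.2f}")
```

With nearly 20,000 pairs to compare, the strongest one is very likely to look convincing on paper, even though it is a coincidence. The machine has no way of knowing that; a human reviewer does.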
The AI may also find patterns that do not align with social norms or expectations – these usually centre around issues such as race and gender. There is plenty written already on the challenges of unintended bias, but in this case an awkward correlation of purely factual data may naively be exposed by the algorithm. The challenge for those responsible for that algorithm is to establish whether this is a coincidence or whether there is actually a causal link that has to be faced up to. How that is handled will have to be judged on a case-by-case basis, and with plenty of sensitivity.
There is also the infamous example of the Microsoft tweetbot (automated Twitter account) that turned into a pornography-loving racist. It was originally intended that Tay, as they called the bot, would behave as a ‘carefree teenager’ learning how to act through interactions with other Twitter users. But it quickly turned nasty as the human users fed it racist and pornographic lines which it then learned from, and duly repeated back to other users. Tay, as a naive AI, simply assumed that this was ‘normal’ behaviour. It only took a few hours of interaction before Microsoft were forced to take the embarrassing tweetbot offline.
One useful way of thinking about the naivety of AI is to consider how dogs learn. Like all other dogs, my own, Benji, loves going for a walk. I know this because he gets excited at the first signs that a walk might be imminent. These include things like me locking the back door and putting my shoes on. Now, Benji has no idea what the concepts of ‘locking the back door’ or ‘putting my shoes on’ are, but he does know that when these two events happen in close succession then there is a high probability of me taking him for a walk. In other words, he is completely naive to what the preceding events mean – they are just data points to him – but he can correlate them into a probable outcome.
This dog/AI analogy is quite useful and can be extended further: my dog is quite lazy, so if he sees me lock the back door but then put my running shoes on, he goes and hides to make sure I don’t take him with me. In this scenario, he is using increased granularity to calculate the outcome – it’s not just ‘shoes’ but ‘type of shoes’. Of course, he doesn’t know that my running shoes are specially designed for running, just that they are different enough from my walking shoes. It may be the different colour/shade, a different smell, the different place where they are kept, etc. This demonstrates the opaqueness issue of AI: I have no real idea (unless I do some pretty thorough controlled testing) what aspect of the shoes switches the outcome from ‘Excellent, I’m going for a walk’ to ‘Hide, he’s going for a run’, but it clearly does have a binary impact. I should point out that the dog/AI analogy also has its limitations: Benji has lots of other basic cognitive skills, such as knowing when it is time for his dinner without being able to tell the time, but because AIs are currently very specialised in their capabilities, an AI that predicted walks would not be able to predict dinner time.
So, the naivety of AI systems can be a real headache for their users. Suffice it to say that the outcomes from the clustering must be used carefully and wisely if they are to yield their full value. Data scientists and AI developers must be aware of the consequences of their creations, and must apply heaps of common sense to the outputs to make sure that they make sense in the context for which they were intended.
Andrew Burgess is a regular guest contributor to D/SRUPTION. This post is an extract from his latest book, The Executive Guide To Artificial Intelligence.
Also available direct from the publisher: https://www.palgrave.com/gb/book/9783319638195