AI is now a proven technology in discrete applications. How can it be used in more all-encompassing roles?
Although materially beneficial corporate deployments of AI are beginning to proliferate, the AI activities of the majority still amount to a few isolated pilot projects conceived on an ad hoc basis. Organisations without a clear AI strategy – and that’s most – run the risk of falling behind as better-organised industry players move forward.
That said, while individual AI solutions can be transformative within the scope of their application, that’s not as clear-cut an argument for front-to-back change as, say, the digital transformation of a high street retailer. Developing an AI strategy requires an exercise of careful discrimination – acknowledging the present limitations of AI as well as its strengths in order to identify where one can, cannot, or even should not exploit it.
This article is about the ‘what’ of an AI strategy rather than the equally important ‘how’. We will look at business areas where AI solutions are already having an impact, we’ll try to characterise the boundary line of its applicability and we’ll also hear from some of the areas of research that may eventually bring more business and operations areas within the scope of AI solutions.
Some terminology first though. We prefer ‘machine learning’ over ‘AI’ for being less loaded with singularity overtones. Better still would be the simple term ‘machine prediction’ since, in truth, what the machine learning community anthropomorphically calls ‘learning’ generally means not much more than fitting a model to data.
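To make the point concrete, here is a minimal sketch of what 'learning' amounts to in the simplest case: fitting a straight line to a handful of data points by ordinary least squares and then using it to predict. The data values and function names are illustrative assumptions, not drawn from any real deployment.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical 'training data': the model is just two fitted numbers.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
forecast = predict(model, 5)
```

Modern models have millions of fitted numbers rather than two, but the principle of adjusting parameters until the model matches past data is the same.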
Probabilistic Pattern Recognition
This is an important point for business leaders to understand: that the diverse, remarkable, almost magical achievements of machine learning (from image captioning and real-time language translation to poker playing and music generation) are all founded on the basic principle of probabilistic pattern recognition. No more, no less. Understand this and you can appreciate where it can and cannot be applied in your own business sector.
Researchers are constantly extending this scope. Natural language processing is now at, or close to, human-level performance for translation across a range of language pairs. However, generating a complex, goal-directed end-to-end conversation with a human is still very much a work-in-progress. While it is important to keep abreast of such specific developments (the Electronic Frontier Foundation tracks progress against a wide range of benchmark tests), abstracting to higher level considerations can be useful.
Figure 1 provides one such abstraction and considers two factors in particular: the cost of making a mistaken decision versus how different future experience is likely to be from past experience. Rules-based systems operate in the darkest shaded region of the diagram – where the future is expected to be so similar to the past that simple rules will hold true for most circumstances. Machine learning takes over where the rigidity of these rules breaks down, although it can struggle if the future task environment differs too much from that upon which the algorithm has been trained.
The vertical axis is important too. It’s not a disaster if a voice-controlled grocery app mistakenly delivers salt & vinegar crisps when a customer requests ready salted. An automated system mistakenly telling a sick patient that they don’t have a life-threatening medical condition is something else though. As we move up the vertical axis, machine learning deployments must increasingly rely on exception management, typically by defaulting to human judgement when the machine passes a threshold definition of ‘failure’. If exception management is critical but not feasible – in a real-time, customer-facing, high-risk decision scenario, for example – then a machine learning solution may not be viable.
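Exception management of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions rather than a production design: the `route` function, the `Decision` record and the threshold values are all hypothetical, and the probabilities stand in for whatever confidence scores a real model would produce.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    label: str       # predicted label, or "" when the case is deferred
    handled_by: str  # "machine" or "human"


def route(probabilities: dict, threshold: float) -> Decision:
    """Accept the top prediction only if its probability clears the threshold."""
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    if confidence >= threshold:
        return Decision(label, "machine")
    # Below the threshold: treat as an exception and hand over to a person.
    return Decision("", "human")


# Low cost of error (grocery order): a modest threshold is acceptable.
grocery = route({"ready salted": 0.62, "salt & vinegar": 0.38}, threshold=0.6)

# High cost of error (medical triage): demand far higher confidence.
triage = route({"benign": 0.91, "serious": 0.09}, threshold=0.99)
```

Here the grocery order is handled by the machine, while the triage case is passed to a human even though the model is 91% confident. The higher up the vertical axis a decision sits, the higher the threshold and the larger the share of cases that need a human available to take them.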
This scope of applicability is now so broad that corporates are exploiting machine learning in the ways shown in Figure 2.
1. Enhanced ‘core business’ prediction
Predictive analytics has been used for years to support business decision-making – particularly in marketing and risk management. Machine learning can accommodate large numbers of predictive variables and variable relationships, across both structured and unstructured data. It also dynamically improves predictive power as new data is received, meaning both that historical models may need upgrading, and also that predictive analytics is now being applied in new business domains, such as evaluating the level of people risk faced by an organisation as a function of a multitude of internal behavioural practices, or evaluating the level of compliance risk posed by specific client interactions, and so on.
The intelligent enablement of physical devices and equipment (even those that were formerly not IP-addressable) is bringing active end-to-end flow optimisation, fault prediction and root cause analysis to great swathes of heavy and light industry operations that hitherto have been operations management blind spots. Business functions that historically may have had little ex-ante predictive analytics deployed for management purposes are now firmly within scope – supply chain management, finance & risk operations, health & safety and so on.
At least three types of automation solutions are being deployed, listed here in loosely decreasing order of maturity:
Extractive – The automated extraction of structured information from unstructured sources.
Orchestrative – Automating processes or activities where simple rules-based approaches (which underpin most RPA solutions) break down, such as predicting the most appropriate clauses for a legal contract from a range of possibilities, based on the parametrised requirements of each contract.
Generative – These models are being used to automatically ‘blank page’ generate passable ecommerce product descriptions. They offer tremendous potential to excel in architecture, for example, where creativity and stylisation meet structured constraint.
3. New customer propositions
In B2B, John Deere has acquired Blue River to provide precision agriculture solutions that optimise the cost and effectiveness of pesticides, while Bank of America’s Intelligent Receivables solution matches incoming payments with invoices. In B2C, Google’s Pixel Buds permit near-real-time face-to-face language translation, while Amazon Go’s machine vision enables a checkout-free shopping experience.
Despite the much-extended scope of machine learning applications, organisations can still be blissfully unaware of the value of their own data. Investors in early stage machine learning startups routinely value the privileged access to training datasets much more highly than they do the startup’s machine learning algorithms. Opportunities may even exist to build new businesses founded entirely on machine learning-driven competitive advantage. New techniques such as representation learning are making it more straightforward to integrate a corporate’s data asset with that of relevant third parties to support predictive performance in new non-core (but potentially highly commercialisable) areas. As an absolute minimum, businesses must start to give consideration to IP ownership of third party models trained on their data.
5. Disruptive models
Machine learning has the potential to radically revise pre-existing business models and/or cost to compete. Ocado’s swarm robot warehouse automation approach – now fully implemented – is arguably an example. Not only is order picking fully automated, it is also done in a way that minimises the number of components involved (reducing maintenance and repair) and maximises cost efficiency. It does this by making full use of a warehouse’s 3D space and variabilising the amount of that volume required at any given time.
Such innovations have enabled Ocado to successfully become the ecommerce fulfilment provider to other retailers in other countries.
As these five examples demonstrate, the potential applications of machine learning are extremely broad. This raises the importance of having a systematic and comprehensive approach to identifying them. Structured reviews of an organisation’s operating model (see Figure 3), financial model and customer propositions – undertaken by mixed teams of business practitioners and machine learning scientists – are proving to be a successful way of doing that.
How should any given opportunity be evaluated? Below are some of the criteria that we have found important.
Exception processing – As mentioned, if exception management is critical but not feasible, deploying a machine learning solution may not be possible.
Explicability – There has been much commentary about the difficulty of explaining the decision-making of deep learning models. Some models, such as random forests, lend themselves better to explicability and, in many cases, perform just as well as (and sometimes better than) deep learning models. The practical preferences of the machine learning community itself are also tending towards better explicability through, for example, newer regularisation approaches that explicitly weight fewer rather than more variables. If deep learning is still preferred, approaches exist that permit a degree of interpretability of a given decision by approximating the complex model with a simpler one in the region of that specific decision. Nonetheless, explicability – and the degree to which any lack of it can be mitigated – may be a key criterion where models must operate in highly regulated environments.
Data protection regulation – Opportunities may need to be evaluated, and potentially re-specified, for compliance with relevant data protection regulation, such as the upcoming GDPR.
Legal and reputational risk – Any biases in the training data will migrate and manifest themselves in the model. If the training data captures socio-economic trends which, although real, are ethically unacceptable, the machine learning model will reproduce and even reinforce them – potentially exposing the corporate to legal and reputational risk.
Make vs buy – Determining whether there are robust vendor solutions that will work in a given organisational context is key. Properly measuring prediction accuracy, evaluating the robustness of the underlying algorithmic approach and determining the degree of correspondence between the vendor data model and that of the organisation may be important evaluation criteria.
Organisational change – Determining whether the organisation can even accommodate the requisite business change is, for some, a critical criterion that’s often overlooked.
Machine learning today vs tomorrow
Machine learning research is proceeding in a multitude of directions, all of which can be fascinating to the enthusiast. Nonetheless, two key areas of development are increasing the scope of business applicability:
1. Growing adoption of Bayesian learning approaches
Machine learning algorithms don’t always get it right and, in many deployment scenarios, don’t have to – provided they are able to give a probabilistic measure of their confidence in their output. Classical models that represent the vast majority of historical corporate deployments of predictive analytics can’t do this. Furthermore, classical models typically only provide the most likely (maximum likelihood) answer. This is not good enough in many scenarios. Setting optimal levels of supermarket inventory requires an ability to trade off the costs of perishability against the opportunity cost of a missed sale. When the utility of getting a decision right is different from the cost of getting it wrong, classical models can be inadequate.
The growing sophistication and adoption of Bayesian methods – which yield probability distributions over possible outputs to address these classical limitations – is proving key to wider corporate deployment.
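The supermarket example can be sketched as follows. This is a simplified illustration, assuming we already have samples drawn from a model's predictive distribution over daily demand; the sample values, cost figures and function names are hypothetical. Rather than stocking the single most likely demand, we choose the stock level that minimises expected cost across the whole distribution.

```python
def expected_cost(stock, demand_samples, cost_waste, cost_missed_sale):
    """Average cost of a stock level over samples of possible demand."""
    total = 0.0
    for demand in demand_samples:
        if stock > demand:
            total += cost_waste * (stock - demand)        # unsold, perished units
        else:
            total += cost_missed_sale * (demand - stock)  # sales we could not serve
    return total / len(demand_samples)


def best_stock_level(demand_samples, cost_waste, cost_missed_sale):
    """Search every integer stock level in the sampled range for the cheapest."""
    candidates = range(min(demand_samples), max(demand_samples) + 1)
    return min(
        candidates,
        key=lambda s: expected_cost(s, demand_samples, cost_waste, cost_missed_sale),
    )


# Hypothetical samples from a predictive distribution over daily demand.
demand_samples = [8, 9, 10, 10, 11, 12, 14, 17, 20]

# Equal costs: the best stock level is the median demand.
symmetric = best_stock_level(demand_samples, cost_waste=1.0, cost_missed_sale=1.0)

# A missed sale costs three times as much as a wasted unit: stock more.
asymmetric = best_stock_level(demand_samples, cost_waste=1.0, cost_missed_sale=3.0)
```

With equal costs the sketch selects 11 units; when a missed sale is three times as costly as waste it selects 14. A model that reports only its single best guess cannot support this trade-off, because the answer depends on the shape of the whole distribution.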
2. Reducing the training data required
A variety of methods are reducing the volume of labelled training data required. The ability to train models in simulations of real-world environments has become a critical requirement for models operating at the extreme end of complexity, in training self-driving cars for example, or anomaly detection using the digital twins of IoT-enabled physical assets. As the ability to simulate real-world environments improves, so too will our ability to train models with ever-more modest demands on training data.
An added benefit of the Bayesian approach described above is that it has an in-built ability to incorporate existing domain knowledge into models, which can result in improved performance and shorter training times whenever such knowledge exists.
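A small worked example shows how this works in the simplest Bayesian setting, estimating a fault rate with a Beta prior. It is a sketch only: the prior counts, the observations and the function name are illustrative assumptions. The prior acts like a set of previously seen cases that are added to the new observations.

```python
def posterior_mean(prior_faults, prior_ok, observed_faults, observed_ok):
    """Mean of the Beta posterior given a Beta(prior_faults, prior_ok) prior."""
    alpha = prior_faults + observed_faults
    beta = prior_ok + observed_ok
    return alpha / (alpha + beta)


# Very little data: 3 faults found in 3 inspections.

# No domain knowledge: a uniform Beta(1, 1) prior.
no_knowledge = posterior_mean(1, 1, observed_faults=3, observed_ok=0)

# Engineers believe the fault rate is around 10%: a Beta(2, 18) prior.
with_knowledge = posterior_mean(2, 18, observed_faults=3, observed_ok=0)
```

Without prior knowledge, three faults in three inspections yields an estimated fault rate of 80%. With the engineers' prior belief built in, the same data yields roughly 22%, a far more plausible figure from so small a sample, and the estimate will still move towards the data as more inspections arrive.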
Multi-task learning – simultaneously training models on a range of tasks – can result in faster training on any one of them. Similarly, transfer learning, in which a pre-trained model is moved from one related environment to another, can require considerably less training than starting from scratch.
Corporates in all industries must now craft their own AI strategy. In doing so, a judicious approach will yield the highest return with the fewest adverse surprises. Actually, a perfect illustration of the limitations of AI today is the nature of the exercise itself of identifying and assessing potential AI opportunities. Doing so requires informed and structured imaginative thinking along with the application of experience-based judgement – both of which will be the preserve of humans for some time to come.
Tariq Khatri is Managing Director of machinable, an advisory firm that helps leadership teams design and take control of their machine learning agenda.