A Guide to Big Data

The big surprise is in what you don’t know you don’t know

Firstly, let's deal with the hype. Big data has become a phenomenon in the past few years, with interest illustrated by a simple Google Trends search, as below (the main interest is from India, where a lot of the world's data scientists are coming from). Gartner even predict that big data will account for $232 billion in tech spending through 2016.

[Figure: Google Trends web search interest in "big data", worldwide, 2004 to present]

But what on earth is it?

Its name is hardly a giveaway, and it seems to be the preserve of maths geeks. Rather than try to explain the principles, we'll give some examples instead. This one came from a recent article in Forbes (prepare to be shocked):

Every time you go shopping, you share intimate details about your consumption patterns with retailers. And many of those retailers are studying those details to figure out what you like, what you need, and which coupons are most likely to make you happy. Target, for example, has figured out how to data-mine its way into your womb, to figure out whether you have a baby on the way long before you need to start buying diapers.

Charles Duhigg outlines in the New York Times how Target tries to hook parents-to-be at that crucial moment before they turn into rampant — and loyal — buyers of all things pastel, plastic, and miniature. He talked to Target statistician Andrew Pole — before Target freaked out and cut off all communications — about the clues to a customer’s impending bundle of joy. Target assigns every customer a Guest ID number, tied to their credit card, name, or email address that becomes a bucket that stores a history of everything they’ve bought and any demographic information Target has collected from them or bought from other sources. Using that, Pole looked at historical buying data for all the ladies who had signed up for Target baby registries in the past. From the NYT:

“[Pole] ran test after test, analysing the data, and before long some useful patterns emerged. Lotions, for example. Lots of people buy lotion, but one of Pole’s colleagues noticed that women on the baby registry were buying larger quantities of unscented lotion around the beginning of their second trimester. Another analyst noted that sometime in the first 20 weeks, pregnant women loaded up on supplements like calcium, magnesium and zinc. Many shoppers purchase soap and cotton balls, but when someone suddenly starts buying lots of scent-free soap and extra-big bags of cotton balls, in addition to hand sanitisers and washcloths, it signals they could be getting close to their delivery date.
Or have a rather nasty infection…

“As Pole’s computers crawled through the data, he was able to identify about 25 products that, when analysed together, allowed him to assign each shopper a “pregnancy prediction” score. More important, he could also estimate her due date to within a small window, so Target could send coupons timed to very specific stages of her pregnancy.

One Target employee I spoke to provided a hypothetical example. Take a fictional Target shopper named Jenny Ward, who is 23, lives in Atlanta and in March bought cocoa-butter lotion, a purse large enough to double as a diaper bag, zinc and magnesium supplements and a bright blue rug. There’s, say, an 87 percent chance that she’s pregnant and that her delivery date is sometime in late August.

via How Companies Learn Your Secrets – NYTimes.com.

And perhaps that it’s a boy based on the colour of that rug?
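
For the curious, the scoring idea described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the product weights, the baseline and the function name `pregnancy_score` are all invented here, and the logistic formula is simply a common way to turn weighted signals into a 0-to-1 score, not something the article says Target used.

```python
import math

# Hypothetical weights, invented purely for illustration.
# They are NOT Target's figures; the article does not publish any.
WEIGHTS = {
    "unscented lotion": 1.2,
    "zinc supplement": 0.9,
    "magnesium supplement": 0.9,
    "calcium supplement": 0.8,
    "cotton balls (large bag)": 0.7,
    "scent-free soap": 0.6,
}

# Hypothetical baseline: with no signal products the score stays low.
BIAS = -3.0


def pregnancy_score(basket):
    """Return a score between 0 and 1 for a list of purchased items.

    Each distinct signal product adds its weight; anything not in
    WEIGHTS is ignored. The total is squashed with a logistic function.
    """
    total = BIAS + sum(WEIGHTS.get(item, 0.0) for item in set(basket))
    return 1.0 / (1.0 + math.exp(-total))


if __name__ == "__main__":
    quiet_basket = ["bright blue rug"]
    busy_basket = ["unscented lotion", "zinc supplement", "magnesium supplement"]
    print(pregnancy_score(quiet_basket))
    print(pregnancy_score(busy_basket))
```

The point of the sketch is that no single purchase gives the game away; it is the combination of several individually innocent items that pushes the score up.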

So Target started sending coupons for baby items to customers according to their pregnancy scores. Duhigg shares an anecdote — so good that it sounds made up — that conveys how eerily accurate the targeting is. An angry man went into a Target outside of Minneapolis, demanding to talk to a manager:

“My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”

The manager didn’t have any idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologised and then called a few days later to apologise again.
“On the phone, though, the father was somewhat abashed. “I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”
There's another excellent example from eConsultancy on how Virgin uses big data for targeting, another on how big data will be used in healthcare, and another in transport.

Big data will disrupt marketing for sure and allow for much better targeting of products (disclaimer: the author, John Straw, has an investment in the data personalisation startup Ctrlio). Whether customers will tolerate what may be perceived as an intrusion into their privacy is a different matter.