Thursday, January 15, 2015

Why Doesn't Facebook Know My Friends?

I went to Facebook today to add someone, and the list of people who I might know was shockingly terrible. I knew about four of the first ~50 suggested, and they were all friends with my wife.

Why is this analysis so difficult? I suspect it has to do with one of three problems:

  1. Algorithm development: It turns out to be harder than you would suspect to come up with a good "lift" formula based on disparate data. To me, this problem seems closest to market basket analysis, the classic formulation of which is "people who buy [item] are also likely to buy [item]," such as bananas and milk. Facebook may just not have spent as much effort on this area as, say, Netflix did.
  2. Computing power: Market basket analysis is really computationally intensive, so it sucks up a ton of processing time. Perhaps Facebook has decided it is not worthwhile to spend the money on this element of its business? I am a bit surprised, as it would seem that Facebook users are only as happy as the involvement and extent of their network, but the cost answer is possible.
  3. Lack of knowledge: This is another way to say "stupidity." Actually, that's a bit unfair. I believe that many data analytics groups in companies are approaching the problem wrong. They are trying to find "people closest in the network to this person" rather than "people with whom this person might be most interested in connecting." For example, I see a lot of people on the suggested list who are friends with my wife. That's reasonable, but a smarter approach would be to see that I have lots of connections to college classmates and suggest more classmates who are close to my network, or to notice that I just switched jobs and suggest people at my new workplace.

Some of the suggestions in #3 would involve "tweaking" the algorithm, which data scientists sometimes object to doing. They want purity so that every special group does not need its own algorithm. That's where the marketing folks come in.
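
For readers who have not met the "lift" idea from item 1, here is a minimal sketch of the classic market basket calculation in Python. The function name `pair_lift` and the tiny basket list are made up for illustration; I have no idea what Facebook actually computes. Lift compares how often two items show up together with how often they would if they were unrelated, so a value above 1 suggests a real association.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair that co-occurs at least once."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        # lift = P(a and b) / (P(a) * P(b))
        expected = (item_counts[a] / n) * (item_counts[b] / n)
        lifts[(a, b)] = (together / n) / expected
    return lifts


# Hypothetical baskets, purely for illustration
baskets = [
    ["bananas", "milk"],
    ["bananas", "milk", "bread"],
    ["bread"],
    ["milk"],
]
lifts = pair_lift(baskets)
for pair, value in sorted(lifts.items()):
    print(pair, round(value, 2))
```

Even this toy version hints at the computing-power problem in item 2: the number of candidate pairs grows roughly with the square of the number of items (or people), before you even look at triples.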

The Product Manager's job ought to be thinking deeply about what makes people happy on Facebook and then challenging the data scientists to move towards that goal. If more than one algorithm is required (e.g., one for "currently working" and one for "not employed"), the calculation ought to be about cost versus long-term benefit. Those long-term benefits include customer loyalty, a calculation I have discussed previously and which lately is coming into doubt for me and Facebook. Sorry, Mark Zuckerberg.

Thursday, August 15, 2013

Find Your Co-Data

"Co-data" is my term for data that goes well with and augments your core data set. I particularly like that The Weather Channel has found consumer behavior data predicting what you will buy depending on the weather. You don't need The Weather Channel's giant data set to find this kind of data. It could be as easy as looking at your fellow local businesses' websites.

Let's say you're a cab driver. You want to minimize wait times and maximize distance driven. How about finding out when colleges in your area start up again? Or checking out when a particular bar closes? Or finding out the time a particular show (preferably one with drunken attendees interested in safe-cabbing it home) gets out?

I tried this simple method when I worked at PPG Industries. Of course, our sales of exterior paint increased when the weather got pleasant. Pulling free data off the NOAA Climate Data Center enabled me to do some rudimentary comparisons between our past sales by region and temperature. I found that people start painting more at about 50 degrees F, and that over about 84 degrees F the amount they paint starts to drop off (too hot out).
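
A rough sketch of that kind of rudimentary comparison, in Python: group sales records into temperature bands and average each band. The numbers below are invented for illustration (they are not PPG's data), and `mean_sales_by_band` is just a name I made up.

```python
from collections import defaultdict
from statistics import mean


def mean_sales_by_band(records, band_width=10):
    """Average gallons sold within each temperature band.

    records: iterable of (temperature_f, gallons_sold) pairs.
    Returns {band_lower_bound: mean_gallons}, ordered by band.
    """
    bands = defaultdict(list)
    for temp_f, gallons in records:
        lower = int(temp_f // band_width) * band_width
        bands[lower].append(gallons)
    return {lower: mean(values) for lower, values in sorted(bands.items())}


# Hypothetical weekly records: (average temperature in F, gallons sold)
records = [
    (42, 100), (48, 120), (55, 300), (63, 420),
    (71, 480), (78, 500), (86, 380), (91, 310),
]
by_band = mean_sales_by_band(records)
for lower, avg in by_band.items():
    print(f"{lower}-{lower + 9} F: {avg} gallons")
```

With real data you would want narrower bands and a split by region, but a table like this is enough to see where painting picks up and where it tails off.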

Using such basic data and simple correlation, I was able to optimize the load-in for our largest retailer's stores so that we had enough exterior paint early in the season... but not too early. I also found that using last year's sales to predict when we should ship this year was a lousy measure; better to use the average over the past three years and then build back two weeks for safety.
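
One possible reading of that rule, sketched in Python: average the week of the year in which sales ramped up over the last three years, then ship two weeks earlier. The week numbers and the name `ship_week` are hypothetical; the actual calculation was done against real sales history.

```python
from statistics import mean


def ship_week(ramp_week_by_year, years=3, safety_weeks=2):
    """Pick a ship week from the average ramp-up week of the most recent years."""
    recent_years = sorted(ramp_week_by_year)[-years:]
    average_ramp = mean(ramp_week_by_year[y] for y in recent_years)
    return round(average_ramp) - safety_weeks


# Hypothetical week-of-year in which exterior paint sales took off
ramp_weeks = {2009: 18, 2010: 14, 2011: 15, 2012: 16}

week = ship_week(ramp_weeks)                # three-year average, minus safety
naive_week = ship_week(ramp_weeks, years=1) # last year only, minus safety
print(week, naive_week)
```

The point of averaging is simply that one odd spring does not drag the whole plan early or late.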

At Vocollect, we're discovering lots of cool ways to use the information we have to make our workers' lives easier. We're helping by giving simple suggestions such as prompting the user to access a feature when we notice the feature could be used to solve a problem we deduce the worker is having. The next phase will be to combine this user data with simple information we have from other sources to help suggest, for example, how two coworkers can avoid each other in a distribution center aisle to ease congestion delays.

All this work goes back to my feeling about big data: you don't need it if you have plain old "data" that you're not using in the first place.

Wednesday, March 13, 2013


No, I'm not talking about Key Performance Indicators. My KPI stands for Keep Pricing Intuitive.

Look at how Southwest Airlines presents its pricing (on the left). It's a thing of beauty. Three categories: "Business Select," "Anytime," and "Wanna Get Away." It's clear what the purpose is, and it's pretty clear that the Wanna Get Away seats will disappear first, then the Anytime, and finally the Business Select.

Why does an intuitive pricing scheme matter? It communicates to your customers something about your brand. In the case of Southwest Airlines, they are saying, "We provide excellent value and make it easy to do business with us." The pricing scheme fits perfectly with the brand image.

I wish more companies, including my own, would understand the value of simplicity and intuitiveness in pricing. That doesn't mean you have to be the lowest price or even the simplest system as long as the pricing is consonant with your brand values.

Simplicity in pricing helps internal people explain what's going on (especially Customer Service). It helps explain to current and potential partners and customers the value of moving to a higher level of engagement, commitment or partnership. And it makes it easy to justify why one customer gets one price and another customer gets an entirely different price, even if those prices are unbelievably different.

For most businesses, salespeople want to get price out of the way in order to talk about the value the product, service or solution can bring. In my opinion, the only way to do so is to make pricing easy to understand. If the salesperson can explain it simply, the conversation is short, allowing the sales representative to focus on more important things such as value-in-use (to justify why your prices are higher than those of your competitors).

Can anyone tell I have been struggling with pricing issues this week?

Friday, March 8, 2013

Hurray for Accurate Depictions of Big Data

I have long been a fan of Cory Doctorow's incredible, eclectic blog. Today he has an excellent review of a book on Big Data. In the review, he describes big data as:
"a computational approach to business, regulation, science and entertainment that uses data-mining applied to massive, Internet-connected data-sets to learn things that previous generations weren't able to see because their data was too thin and diffuse."
Awesome definition. Notice that "big data" means massive, Internet-connected data sets. Analyzing your CRM data is not big data. It's just data. Applying weather corrections to sales (which some companies have been doing for 25 years) is not Big Data. It's just data. Figuring out your customers' various warehouse sizes in order to tailor solutions to them is not Big Data. It's just data.

If you have been following my blog, you know that I believe passionately in the value of small data. Most companies do not use the data they have. Therefore, I would assert that these companies are ill-advised to investigate Big Data. Rather, they would be better off figuring out what customers want and how to aggregate the information they already have to serve those needs.

In a consulting engagement I had when I first moved to Pittsburgh, I met the CEO of a large regional grocery. He said these exact words to me: "Our problem is that we have all this data, but we don't know what to do with it." Unfortunately, the next moment he was pulled away, and I never got to say to him what I wanted to say:

  • Figure out how customers could help themselves and provide the data to them. For example, let customers opt in to a system that links pharmacy information with shopping data and then let customers scan foods to ensure that they don't run afoul of prescription or health restrictions such as salt content. Customers would have a reason to consolidate their pharmacy purchases at the grocery, and the grocery would provide a great service.
  • Figure out what products sell well together. For example, determine how sales of core items such as spices or core canned vegetables such as kidney beans affect the sales of other items that might be in a recipe and then adjust inventory levels to ensure the critical items are always in stock.
  • Attack low-profit brands with house brands. Purposefully stock out of the national brand on occasion and see who switches to the store brand and what type of person doesn't switch. Target incentives to the non-switchers and align pricing and shelf displays to maximize house brand sales.
  • Provide a way to scan products on the grocery cart itself. Use this data to negotiate with suppliers and optimize brand mix by seeing what products customers consider before they decide on a brand.
  • Capture location-based information on the grocery cart. Use this information in conjunction with sales to reorganize higher-value items in locations where the grocery carts pass more frequently.
  • Analyze sales at a particular time of day to see what high-profit items might fall in popularity. Time screen-based in-store advertising to promote those products at the "off times."

There are so many ways to capture the value of "small data" that many companies just do not consider. Why invest millions of dollars in "Big Data" when you aren't using the data you have? And why not combine "Big Data" approaches with existing "small data" to amaze and please your customers? You don't need a genius to get started on either project.

Tuesday, March 5, 2013

1% Inspiration, 99% Perspiration

Popular wisdom holds that new businesses are 1% inspiration, 99% perspiration. At Vocollect, we find that many potential new markets for our core competitive advantage look very attractive from the outside until one gets into the detail. Fortunately, we have the resources of a large company to investigate these new markets before proverbially "leaving our jobs" to enter these markets.

Start-ups don't have the same resources, but they can be more agile and resourceful when it comes to fulfilling customer needs. Usually, individuals starting the company also do not need a $5 million business within one or two years in order to be successful. Consequently, many smart start-ups target a few customers and learn to serve them well before expanding. In these situations, a gigantic market opportunity will serve them particularly well. Check out my favorite resource on start-ups for more insight.

In case I ever need that 1% inspiration, today I started a new blog to collect all my great (and not so great) business ideas. If you have the sweat but need the inspiration, feel free to steal one of my ideas. Just let me know if you're doing it, please, so that I can track your success... and know not to compete head-on.

Tuesday, February 19, 2013

A/B Testing For Everyone

The folks over at the phenomenal Marketing Experiments Blog had yet another post about A/B testing that reminded me of some consulting work I did in the past. Often, I have found that organizations think you have to be a gigantic company to do A/B testing. The reality is that a company of any size can A/B test just about anything, sometimes to dramatic effect. And a small company can apply very sophisticated marketing analysis very inexpensively in this age of free, high-powered statistical languages.

When I worked for Strategic Energy, management believed we couldn't just send our customers a contract and re-sign them for three years of electricity usage. I said, "What's the harm in trying?" We sent a hundred customers a thank-you for letting us serve them along with a new contract for service. About 35 of them sent us back a signed contract. How much did that test cost? About $300 and half a day of work. After that experience, Strategic Energy started sending every customer under a certain size a renewal contract, saving tens of thousands in sales costs per year for those that responded.

We then sent out postcards to the remaining customers plus about 200 more asking them to contact us about their contract renewal. On one postcard, we put an existing customer photo and an inspirational message about saving their business money. On the other postcard, we put a funny beach photo and a message to the effect of, "Wouldn't you rather be spending your time on the beach than renewing an electricity contract?" We assigned customers randomly to one or the other. To our surprise, the beach one got a statistically significantly better response. Simple A/B test done. Learning learned.
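
For anyone wondering what "statistically significantly better" means in practice, here is a minimal sketch of a two-proportion z-test using only Python's standard library. I no longer have the real response counts, so the numbers below are hypothetical, and the function name is my own.

```python
from math import erfc, sqrt


def two_proportion_z_test(responses_a, sent_a, responses_b, sent_b):
    """Two-sided z-test for a difference in response rates between two groups."""
    rate_a = responses_a / sent_a
    rate_b = responses_b / sent_b
    # Pooled rate under the assumption that both postcards perform the same
    pooled = (responses_a + responses_b) / (sent_a + sent_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_b - rate_a) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value from the normal curve
    return z, p_value


# Hypothetical counts: postcard A (customer photo) vs. postcard B (beach photo)
z, p = two_proportion_z_test(responses_a=12, sent_a=132, responses_b=27, sent_b=133)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A p-value under 0.05 is the usual (if somewhat arbitrary) threshold for calling the difference significant. With only a few hundred postcards the approximation is rough, but it is good enough to decide which card to print next time.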

I applied this kind of analysis to the funding solicitation work of the Jewish Federation of Greater Pittsburgh to equally powerful effect. In this case, some simple linear regression showed that one of the greatest factors influencing the size of the gift was whether the gift was given online (even when holding donor age constant). Pushing donors to the website to donate increased the size of the gifts, and some tweaking to the website itself increased gift sizes even further. All that we needed to complete this analysis was a history of donations and some basic information about the donors and when they responded.
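
"Holding donor age constant" just means putting age into the regression alongside the online flag, so the online coefficient reflects the difference between donors of the same age. Below is a small pure-Python sketch of ordinary least squares. The donor records are invented and deliberately noise-free so the arithmetic is easy to follow; `ols` is a hypothetical helper, and any free statistical package would do this in one line.

```python
def ols(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical donors: (gave_online as 0/1, age) and the gift size in dollars
donors = [(0, 30), (1, 30), (0, 45), (1, 50), (0, 60), (1, 65), (0, 70), (1, 40)]
gifts = [65.0, 125.0, 87.5, 155.0, 110.0, 177.5, 125.0, 140.0]

intercept, online_effect, age_effect = ols(donors, gifts)
print(f"online adds about ${online_effect:.2f} at any given age")
print(f"each year of age adds about ${age_effect:.2f}")
```

In this made-up data the online effect comes out at $60 regardless of age, which is the kind of result that justifies pushing donors to the website.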

The barrier to basic A/B testing usually lies in company culture, not in cost or capabilities. Companies need to get wired for a "learning culture" that emphasizes marketing science over gut feel. This change must emanate from the senior executive team, and they have to understand how powerful data management and analysis can be to improve marketing response rates, revenues and profits.

As analysis professionals, we need to bring these smarts to the executive team so that they can bring culture change to the rest of the company. I try to remind myself of this goal periodically when I find myself tiring of yet another explanatory meeting with the VPs. Although sometimes repetitive and tiresome, the meetings in which we explain what we plan to do with the test results build the executive support necessary to internalize the learning from the testing over the long term.

Friday, February 15, 2013

Revenge of the Data

I have been following with relish the story about Elon Musk's war with the New York Times over a negative review of the Tesla Model S electric vehicle. What I loved about Musk's retort to the New York Times story is how Tesla Motors managed to use device data to refute the story. The war ends up being a debate between the hard data in the device and the reporter's notes.

[Image: Reporter's vehicle log as annotated by an angry Elon Musk]

I take away three conclusions from this episode:

  1. Data is power. Companies that think about information they could or already do have available and then exploit that data create sustainable competitive advantage through their installed base. I learned this firsthand at PPG Industries, where we were able to use tint machine data to examine paint color usage by region. I only wish that PPG had been more open to using the color chip rack to collect data (discreetly and privately) about user interactions with the display. At Vocollect, we are exploring a wide variety of ways to aggregate data from our wearable devices to enhance the user experience.
  2. Companies should get data in the hands of users. I see this war in part as a problem stemming from the New York Times reporter's inability to get all of the information he could have had available...information Tesla then gathered from the log files. Perhaps giving this information to the user in the first place in a snazzy interface could have prevented some of the reporter's frustrations. Heck, a number of device manufacturers give the data to users in an API and end up getting cool tools for their other users for free, created essentially by fans of the brand.
  3. Don't get into a pissing match in public. Elon Musk, known for his huge ego, could have been more diplomatic and apologetic to the reporter. Abusing customers or potential customers does not position the brand for success. And essentially accusing a reporter at one of the most prestigious papers in the world of journalistic fraud qualifies as abusing potential customers in my book. Tesla Motors might have gotten a better response from the Times and better publicity by working with them to diagnose what had happened rather than by working against them. Unless you believe that all publicity is good publicity, in which case Musk did the right thing by making this story huge.

I will anxiously await the innovations from car companies and any other company that has direct interaction with the actual consumer, enabling us to understand and improve our own behavior. As you know if you read this blog regularly, I hope to be at the forefront of that user empowerment given my sincere belief in the power of some Major Data Geekitude to improve our collective future.