Thursday, January 15, 2015

Why Doesn't Facebook Know My Friends?

I went to Facebook today to add someone, and the list of people I might know was shockingly terrible. I knew about four of the first ~50 suggested, and they were all friends of my wife.

Why is this analysis so difficult? I suspect it has to do with one of three problems:

  1. Algorithm development: It turns out to be harder than you would suspect to come up with a good "lift" formula based on disparate data. (Lift measures how much more often two things occur together than they would if they were unrelated.) To me, this problem seems closest to market basket analysis, the classic formulation of which is "people who buy [item] are also likely to buy [item]," such as bananas and milk. Facebook may just not have spent as much effort on this area as, say, Netflix did.
  2. Computing power: Market basket analysis is computationally intensive, because the number of candidate pairs grows with the square of the number of items, so it sucks up a ton of processing time. Perhaps Facebook has decided it is not worthwhile to spend the money on this element of its business? I am a bit surprised, since it would seem that Facebook users are only as happy as their networks are large and active, but the cost answer is possible.
  3. Lack of knowledge: This is another way to say "stupidity." Actually, that's a bit unfair. I believe that many data analytics groups in companies are approaching the problem the wrong way. They are trying to find "people closest in the network to this person" rather than "people with whom this person might be most interested in connecting." For example, I see a lot of people on the suggested list who are friends with my wife. That's reasonable, but a smarter approach would be to see that I have lots of connections to college classmates and suggest more classmates who are close to my network, or to notice that I just switched jobs and suggest people at my new workplace.
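
To make the "lift" idea from #1 concrete, here is a minimal Python sketch of pairwise lift on a toy set of shopping baskets. The function name `pair_lift` and the sample baskets are made up for illustration; this is the textbook market basket calculation, not anything Facebook is known to run. It also hints at the cost in #2, since every pair of items in every basket has to be counted.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(a, b): lift} for every item pair that co-occurs at least once.

    lift(a, b) = P(a and b) / (P(a) * P(b)); values above 1 mean the pair
    shows up together more often than independence would predict.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # sort so each pair has one canonical order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        p_both = together / n
        p_a = item_counts[a] / n
        p_b = item_counts[b] / n
        lifts[(a, b)] = p_both / (p_a * p_b)
    return lifts


# Hypothetical toy data: four shopping baskets.
baskets = [
    {"bananas", "milk"},
    {"bananas", "milk", "bread"},
    {"bread"},
    {"milk"},
]
lifts = pair_lift(baskets)
for pair, value in sorted(lifts.items()):
    print(pair, round(value, 2))
```

In this toy data, bananas and milk come out with a lift above 1, while bread and milk come out below 1. Swap "items in a basket" for "people in a friend list" and you have one plausible starting point for friend suggestions.
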
Some of the suggestions in #3 would involve "tweaking" the algorithm, which data scientists sometimes object to doing. They want purity, so that not every special group needs its own algorithm. That's where the marketing folks come in.
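
As a rough illustration of what such a "tweak" might look like, the Python sketch below ranks candidates by mutual-friend count and then optionally adds a bonus for shared context such as the same employer or college. Everything here is hypothetical: the `suggest` function, the `context_weight` parameter, the profile fields, and the sample people are assumptions for the example, not a description of Facebook's system.

```python
def mutual_friends(friends, user, candidate):
    """Count friends that user and candidate have in common."""
    return len(friends[user] & friends[candidate])


def suggest(friends, profile, user, context_weight=0.0):
    """Rank non-friends of `user`, best first.

    score = mutual friends + context_weight * (number of shared profile fields)
    With context_weight=0 this is a pure "closest in the network" ranking.
    """
    me = profile[user]
    scores = {}
    for candidate in friends:
        if candidate == user or candidate in friends[user]:
            continue  # skip myself and people I am already connected to
        shared = sum(
            1
            for key in ("college", "employer")  # hypothetical profile fields
            if me.get(key) is not None and me.get(key) == profile[candidate].get(key)
        )
        scores[candidate] = mutual_friends(friends, user, candidate) + context_weight * shared
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical toy network: "cara" is a new coworker with no mutual friends.
friends = {
    "me": {"wife"},
    "wife": {"me", "ann", "bob"},
    "ann": {"wife"},
    "bob": {"wife"},
    "cara": set(),
}
profile = {
    "me": {"employer": "Acme", "college": "State"},
    "wife": {},
    "ann": {},
    "bob": {},
    "cara": {"employer": "Acme"},
}

print(suggest(friends, profile, "me"))                      # network distance only
print(suggest(friends, profile, "me", context_weight=2.0))  # with the context bonus
```

With the weight at zero, my wife's friends lead the list and the new coworker comes last; with the bonus turned on, the coworker moves to the top. The point is that one extra term, not a whole new algorithm per group, can change who gets suggested.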

The Product Manager's job ought to be thinking deeply about what makes people happy on Facebook and then challenging the data scientists to move towards that goal. If more than one algorithm is required (e.g., one for "currently working" and one for "not employed"), the calculation ought to be about cost versus long-term benefit. Those long-term benefits include customer loyalty, a calculation I have discussed previously and which lately is coming into doubt for me and Facebook. Sorry, Mark Zuckerberg.