Building Better Algorithms Requires Human Judgment (and Values)

At the end of last week Facebook announced that it would stop having humans write summaries of trending news stories. This follows previous allegations that the selection of trending stories had been changed to remove conservative ones. This is a good time to revisit some foundational questions about the spread of information in a digital world.

First, it is too easy for information cascades to occur. It takes much less effort to like, share, or retweet than it does to read, think and possibly even verify. In the language of Daniel Kahneman’s Thinking, Fast and Slow, the internet is geared towards thinking fast, which is based on the emotional and instinctive systems in our brain. Any naive unweighted algorithm for selecting trending topics will always amplify this problem. What could be done? An algorithm could over time determine who seems to be more prone to fact checking and thinking and could weight those people more heavily. Given that all information has been kept by Facebook (and also Twitter), it would be possible to do this now, going back in time using stories that turned out to be flat out wrong or maybe just not all that important after all, and seeing who recognized that and who didn’t. How do you do any of that? Human judgment of course. The whole point here should not be about eliminating judgment, but about discovering high quality judgment.
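To make the weighting idea concrete, here is a minimal sketch in Python. It is not how Facebook or Twitter rank anything; the function names, the data shapes and the smoothing prior are all assumptions made for illustration. The one input that cannot be automated is `labels`, the human verdict on which past stories held up.

```python
from collections import defaultdict


def reliability_weights(history, labels, prior=1.0):
    """Estimate how much each user's shares should count.

    history: iterable of (user, story) pairs from past sharing activity.
    labels:  dict mapping story -> True if human reviewers judged it sound,
             False if it turned out to be wrong or unimportant (assumed input).
    prior:   smoothing so that a user with little history stays near 0.5.
    """
    good = defaultdict(float)
    total = defaultdict(float)
    for user, story in history:
        if story in labels:  # ignore stories nobody has judged yet
            total[user] += 1
            if labels[story]:
                good[user] += 1
    return {u: (good[u] + prior) / (total[u] + 2 * prior) for u in total}


def trending_scores(shares, weights, default=0.5):
    """Score current stories by summing the weights of the users sharing them.

    A naive algorithm would add 1 per share; here each share counts for the
    sharer's estimated reliability, with `default` for unknown users.
    """
    scores = defaultdict(float)
    for user, story in shares:
        scores[story] += weights.get(user, default)
    return dict(scores)
```

With this scheme two stories with the same number of shares no longer tie: the one shared by people with a better track record ranks higher.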

Second, we risk becoming the Digital Balkans. Because of confirmation bias (another example of thinking fast) we are much happier reading stories that fit with our existing beliefs. We also tend to know more people who are like us and spend time with them. Again, then, a naive algorithm for trending topics will look primarily or only among the people I am connected to. Instead, what we really need is what I have called the “Opposing View Reader.” Again, both Facebook and Twitter have all of the information necessary to construct this. They can analyze which networks I am part of and then find other networks that hold different views (and hence are likely quite separate in a network analysis). And just as above, this analysis cannot be done entirely by machine. Humans will have to look at topics and the associated networks and make judgments.
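A rough sketch of the machine half of an Opposing View Reader might look like the following Python. It assumes that distance in the connection graph is a usable stand-in for "holds different views", which is exactly the kind of assumption a human would have to check topic by topic. The names `neighborhood` and `opposing_view_stories` and the `max_hops` cutoff are hypothetical.

```python
from collections import Counter, deque


def neighborhood(graph, start, max_hops=2):
    """Return everyone within max_hops connections of start (breadth-first)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the cutoff
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen


def opposing_view_stories(graph, shares, user, max_hops=2, top_n=3):
    """Suggest stories popular outside the user's own part of the network.

    graph:  dict mapping user -> list of connected users.
    shares: iterable of (user, story) pairs.
    Stories already circulating near the user are skipped, since the
    user is likely to see those anyway.
    """
    near = neighborhood(graph, user, max_hops)
    already_seen = {story for u, story in shares if u in near}
    counts = Counter(
        story for u, story in shares
        if u not in near and story not in already_seen
    )
    return [story for story, _ in counts.most_common(top_n)]
```

The output is only a list of candidates. Whether a distant cluster actually represents an opposing view, rather than simply a different hobby or language, is the judgment call that would still fall to people.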

The data exists for us to build better algorithms to deal with both the velocity problem (information cascades) and the network fragmentation problem (Digital Balkans). The question is: are we willing to tackle these hard problems and start by admitting that this requires judgment and hence values? Or will we, as seems more likely, retreat to a seemingly safe (from a business point of view) but ultimately self-defeating position of naive self-reinforcing algorithms?