Bill Gates is naive, data is not objective

Dan Crawford | February 1, 2013 8:47 pm

by Cathy O’Neil, re-posted with the permission of the author, who writes the blog mathbabe and works and teaches in New York City as a quant with style. This piece is a comment on data collection, not on Bill Gates’s intent.

In his recent essay in the Wall Street Journal, Bill Gates proposed to “fix the world’s biggest problems” through “good measurement and a commitment to follow the data.” Sounds great!

Unfortunately it’s not so simple.

Gates describes a positive feedback loop when good data is collected and acted on. It’s hard to argue against this: given perfect data-collection procedures with relevant data, specific models do tend to improve, according to their chosen metrics of success. In fact this is almost tautological.

As I’ll explain, however, rather than focusing on how individual models improve with more data, we need to worry more about which models and which data have been chosen in the first place, why that process is successful when it is, and – most importantly – who gets to decide what data is collected and what models are trained.

Take Gates’s example of Ethiopia’s commitment to health care for its people. Let’s face it, it’s not new information that we should ensure “each home has access to a bed net to protect the family from malaria, a pit toilet, first-aid training and other basic health and safety practices.” What’s new is the political decision to do something about it. In other words, where Gates credits the measurement and data-collection for this, I’d suggest we give credit to the political system that allowed both the data collection and the actual resources to make it happen.

Gates also brings up the campaign to eradicate polio and how measurement has helped so much there as well. Here he sidesteps an enormous amount of politics and debate about how that campaign has been fought and, more importantly, how many scarce resources have been put towards it. But he has framed this fight himself, and has collected the data and defined the success metric, so that’s what he’s focused on.

Then he talks about teacher scoring and how great it would be to do that well. Teachers might not agree, and I’d argue they are correct to be wary about scoring systems, especially if they’ve experienced the random number generator called the Value Added Model. Many of the teacher strikes and failed negotiations are being caused by this system where, again, the people who own the model have the power.

Then he talks about college rankings and suggests we replace the flawed U.S. News & World Report system with his own idea, namely “measures of which colleges were best preparing their graduates for the job market”. Note I’m not arguing for keeping that U.S. News & World Report model, which is embarrassingly flawed and is consistently gamed. But the question is, who gets to choose the replacement?

This is where we get the closest to seeing him admit what’s really going on: that the person who defines the model defines success, and by obscuring this power behind a data collection process and incrementally improved model results, it seems somehow sanitized and objective when it’s not.

Let’s see some more examples of data collection and model design not being objective:

We see that cars are safer for men than women because the crash-test dummies are men.
We see that cars are safer for thin people because the crash-test dummies are thin.
We see drugs are safer and more effective for white people because blacks are underrepresented in clinical trials (which is a whole other story about power and data collection in itself).
We see that Polaroid film used to only pick up white skin because it was optimized for white people.
We see that poor people are uninformed by definition of how we take opinion polls (read the fine print).

Bill Gates seems genuinely interested in tackling some big problems in the world, and I wish more people thought long and hard about how they could contribute like that. But the process he describes so lovingly is in fact highly fraught and dangerous.

Don’t be fooled by the mathematical imprimatur: behind every model and every data set is a political process that chose that data and built that model and defined success for that model.

5 Comments

blaoism says:

February 1, 2013 at 10:11 pm

There is no theory-n`
PJR says:

February 2, 2013 at 12:28 pm

Great posting. Gates and other philanthropists often do good things but they are wielding their power (money) to do what THEY wish our politicians and governments would do. That’s the exercise of political power, be it through politicians or with their blessings. If we don’t like their projects or metrics, or give them low priority, or think they’re counterproductive, well, we’ve agreed that it’s not our money (or our society). And by opposing and avoiding taxes, philanthropists try to keep it that way as much as possible.
Roy Cameron says:

February 2, 2013 at 2:41 pm

A pleasure to see this criticism of Bill Gates. He is naive and worse.

To quote a USC biz school prof I have had many conversations with: business execs, and thinkers in general, lack a well-differentiated feeling function.

What does that mean?

Ha! It means they are completely blind to the value-judgments that underpin their character and world-view.

Their “objectivity”, an apparent reliance on data upon which they reflect without imposing their subject-hood, is a fraud and it masks some very self-serving POVs that you will never be allowed to question.

This lack is often why business leaders cannot lead. They have no vision, no grasp of how the values get put together.

Their solutions don’t solve.

This has been going on since the inception of the Age of Reason and is not likely to end anytime soon.
coberly says:

February 2, 2013 at 4:42 pm

Roy

it’s been going on a lot longer than that. it is a basic fact of human cognition.

the Bill Gates fallacy is one we reminded ourselves of formally at least once a week in graduate school.

it had little effect on our own field, and of course no effect whatsoever on what the run of the mill “non partisan expert” thinks.

not to mention the mush brained congress and the voters who vote them there.

but, btw, “leading” has nothing to do with thinking clearly. it’s a matter of being half a step ahead of the crowd, persuasive oratory, and having a little money to spend on buying “followers.”
simpleDon says:

February 3, 2013 at 7:33 am

Successful businessmen don’t realize that to a degree their success has come from their ability to narrow their focus to their product, to convince themselves of its superiority above others and to convince their employees and their customers of the same. In other words, to build enthusiasm for the product. It is not a model that translates well into the world of objective research.