I Don’t Have the Time. Do You?
We’ve seen the manipulation of data since people began collecting it. It’s a good tool for fraudsters and a great one for propagandists because of the widespread association between numbers and inviolate objectivity. In fact, one needs only to cherry-pick some facts or fudge some numbers, snowball them into a cause, and send them rolling through society to influence thinking. Not a hard process: Anyone can do it. There are books on the subject: Cathy O’Neil’s Weapons of Math Destruction and Kit Yates’s The Math of Life & Death: 7 Mathematical Principles That Shape Our Lives both come to mind.* And this morning I stumbled onto an article in the online Wall Street Journal that addresses the trustworthiness of researchers, namely, of “scientists.”**
The Journal’s article relates the work of Joe Simmons, Leif Nelson and Uri Simonsohn, three “debunkers” who have voluntarily taken on the task of cross-checking the data in “scientific” papers and who have as a result debunked published work. Thanks, guys. I don’t think I have the energy to go through tens to hundreds to thousands of annual publications in print or online to discover mistakes, manipulation, or outright lies. That’s a time-consuming effort that I lazily leave to others because I prefer to trust even though I know from experience that such trust is naive in a world of “p-hacking” and AI in service of some special interest agenda or self-aggrandizement.
Fight Numbers with Numbers, Not Innuendo
That the three cross-checkers have done this yeoman work for the rest of us is commendable. That they have received pushback not on the grounds of accurate and detailed data, but on the grounds of “gender” bias from one of their targets is typical of our times. In our litigious times of wokeness, attacking the critic and not the criticism is a common defense tactic.
Apparently, it’s easy to rely on some “special interest” theme for those who cannot argue facts; it’s always a matter of suing anyone on the grounds of some phobia or bias. And that’s what the three “debunkers” now face in a suit filed by Francesca Gino. “In her lawsuit, Gino said Harvard’s investigation was flawed as well as biased against her because of her gender.” Can you imagine? Touchy-feely Harvard going anti-woke. It’s beyond me. Could it be true? Could Simmons, Nelson and Simonsohn have spent all those hours sifting through data because they are heartless “gender-phobes”?
“Whoa, Francesca!” I shout. “Is this another of those throw-some-mud-and-hope-it-sticks arguments used for distraction and defense? That’s using more non-science to combat the charge of unscientific work, a ploy typical in an age of social media that fosters ad hominem attacks as refutations: Kill the messengers, not their message. Wouldn’t a ‘scientific’ Francesca Gino have been better off restoring her supposedly lost reputation by demonstrating the validity of her data? If Joe Simmons, Leif Nelson and Uri Simonsohn are wrong, then prove them wrong through logic and the data they say you manipulated.”
But this is the world as it is: Messengers, not messages, are the targets. Messengers must be untrustworthy because they plot against researchers on the basis of all those currently acceptable special interests, such as Gino’s gender, whatever that is, by the way. Whom or what should we trust as we sail the seas of data?
We Really Are Pawns until We Aren’t
It’s easy to trust some statement when it is centered on a number. Yates points this out in a discussion of BMI, a measure of “body mass” that insurance companies, doctors, and even athletic scouts use to determine the “fitness” of a person. But as Yates notes, “The main problem with BMI is that it can’t distinguish between muscle and fat” (48). Because BMI is part of the social psyche, it has become an “indicator of health,” which, by the way, it isn’t. You can find an app for your phone or watch, however, that tells you this supposedly important datum.
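For anyone who wants to see why that single number says so little, here is a minimal sketch of the arithmetic. BMI is just weight in kilograms divided by the square of height in meters. The two people below are hypothetical, invented only for illustration, and the body-fat figures are assumptions, not measurements from Yates or anyone else.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


# Two hypothetical people with the same height and weight
# but very different body composition (illustrative values only).
athlete = {"weight_kg": 90.0, "height_m": 1.80, "body_fat_pct": 10}
sedentary = {"weight_kg": 90.0, "height_m": 1.80, "body_fat_pct": 32}

athlete_bmi = bmi(athlete["weight_kg"], athlete["height_m"])
sedentary_bmi = bmi(sedentary["weight_kg"], sedentary["height_m"])

# Body fat never enters the formula, so both people get the same number.
print(round(athlete_bmi, 1), round(sedentary_bmi, 1))
```

Both come out to the same BMI because the formula has no term for muscle or fat, which is the limitation Yates describes.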
Thanks, Pythagoras. From your time to ours we have learned to love numbers because we’ve been taught that they never lie. As much as they were for you, they have become the stuff of our religion of truth.
So, we receive daily doses of them in polls, surveys, and research that “objectively” cover every aspect of personal and social life from that useless BMI to crime data used for enacting ever more anti-gun legislation. As Yates writes, “Ultimately, the degree to which we believe the stats we come across should depend on how complete a picture the artist paints for us. If it is a richly detailed, realist landscape with context, a trusted source, clear expositions, and chains of reasoning, then we should be content in the veracity of the numbers. If, however, it is a dubiously inferred claim, supported by a minimalist single statistic on an otherwise empty canvas, we should think hard about whether we believe this ‘truth’” (143).
That “with context” is exactly what is missing from the data forced into the public record by “climate experts” like John Kerry, Al Gore, and Greta Thunberg. As evidence for this minimalist approach, the current graph on a “climate T-shirt” that Greta’s compatriots are selling with other “climate” paraphernalia shows a series of colorful bars representing temperatures from blue (colder) to red (warmer) that “paints” a cherry-picked data set to support inordinate global warming. Think, also, of the general claim that sea level is rising, which it has been doing for more than ten millennia. Or consider the panicked newscasters who echoed the claims of “the warmest day on record” with that record going back a mere 70 to 150 years on a 4.56 billion-year-old planet. Context? It is what frames the truth of numbers that are on their own meaningless. Take “five,” for example. It’s meaningless unless we apply it in context: Halfway to a first down; 50% of the survey’s participants; not quite a half dozen eggs; a child’s age; years of a contract; cost of a lottery ticket; speed of a car in mph or kph. The contexts are endless, but necessary.
Do the Trustworthy Choose the Algorithms?
Universities are prone to claims of superiority. Who, for example, would dare question the excellence of Harvard, Yale, Princeton, and other Ivy League schools? Who would dare compare Westmoreland Community College to a prestigious research institute and a school whose alumni include presidents, ambassadors, and the movers and shakers of industry? But everyone wants some acknowledgement that a school ranks high on a list of “America’s Best Colleges.” In 1988 journalists at U.S. News decided they could fashion a scale for ranking schools, one that included everything from the SAT scores of incoming freshmen to the financial contributions of outgoing alumni.
In her account of the ranking process, Cathy O’Neil writes, “Three-quarters of the ranking would be produced by an algorithm--an opinion formalized in code—that incorporated these [data] proxies. In the other quarter, they would factor in the subjective views of college officials throughout the country” (53). For me, the key words in this passage lie in the parenthetical “an opinion formalized in code.”
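To make “an opinion formalized in code” concrete, here is a minimal sketch of such a scoring scheme. Only the three-quarters and one-quarter split comes from O’Neil’s account. The school names, the two proxies, their 0-to-1 values, and the weights are all hypothetical, chosen by me for illustration, and this is not the actual U.S. News formula.

```python
def composite_score(proxies, proxy_weights, reputation, algo_share=0.75):
    """Blend weighted data proxies (algo_share) with a subjective reputation score."""
    algo = sum(proxy_weights[name] * proxies[name] for name in proxy_weights)
    return algo_share * algo + (1 - algo_share) * reputation


def rank(schools, proxy_weights):
    """Return school names ordered from highest to lowest composite score."""
    return sorted(
        schools,
        key=lambda name: composite_score(
            schools[name]["proxies"], proxy_weights, schools[name]["reputation"]
        ),
        reverse=True,
    )


# Hypothetical schools with proxy values normalized to a 0-1 scale.
schools = {
    "School A": {"proxies": {"sat": 0.9, "alumni_giving": 0.3}, "reputation": 0.6},
    "School B": {"proxies": {"sat": 0.5, "alumni_giving": 0.9}, "reputation": 0.6},
}

# Two equally "objective" weightings; choosing between them is an opinion.
favor_test_scores = {"sat": 0.7, "alumni_giving": 0.3}
favor_donations = {"sat": 0.3, "alumni_giving": 0.7}

print(rank(schools, favor_test_scores))
print(rank(schools, favor_donations))
```

The data never change between the two runs. Only the weights do, and the order of the schools flips with them, which is why the choice of weights is where the opinion hides.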
And that’s the heart of so many untrustworthy, but heavily data-laden research documents. It’s tough to take the human element out of the equation. It’s tough to do research in an era of grant financing and social media fame that does not rely on objective algorithms. And it’s especially tough to resist the temptation to fudge in the competitive world of “publish or perish” academia or in a world governed by censors who favor an agenda over doubt.
The war on anyone who bucked the dictates of the COVID restrictions, or who questioned the science behind giving little kids the vaccine when they appeared to be rather invulnerable to severe consequences of the disease, reveals what an opinion formalized in code can do. But the unintended consequences of censoring competing data and the conclusions they warrant have generated distrust. The ramifications of those algorithms, aimed not just at those who questioned the wisdom of kindergarteners wearing masks every day at school but also at conservative speech, have morphed into a wariness of data used to support any cause emanating from different branches of the government, from the Justice Department and its subagencies to the National Institutes of Health to Homeland Security.
It is common now for media sycophants and those in power to proclaim that any data contrary to their policy positions are either purposely false or naively misleading. Think of the number of border crossings as a context for statements by the Secretary of Homeland Security that the “border is secure.” If the data don’t support the outcomes of the policy, the best tactic is to ignore the data or attack the doubters as xenophobes. But there are, in fact, some numbers that appear in real contexts, such as the number of illegal immigrants who have crowded into New York City to the despair of the mayor and the dissatisfaction of the city’s residents.
Inundated by Data, Devoid of Wisdom and Integrity
So, this is our world. We’re swamped by data on every aspect of individual, social, and political life. We have data on health; we have data on crime; we have data…well, we have data on everything we believe we can quantify. But in attempting to quantify all that is, we find we must rely on the data gatherers and interpreters, some of whom are manipulators. And we find that those manipulators are highly selective in favor of social and political agendas.
It’s not that manipulation is new. You know it isn’t. What is new is the facility any unscrupulous twenty-first-century person has to influence behavior not just across America, but also across the planet. One YouTube video on “findings” in any field of research can reach hundreds of thousands to millions of people in a blink. And without checkers and balancers like Joe Simmons, Leif Nelson and Uri Simonsohn, both true and falsified data set in motion movements and behaviors that might take decades to reverse. Recall that one lone doctor in the 1960s set in motion the notion of a low-fat, high-carb diet as a panacea for heart health. Ancel Keys was celebrated on the cover of Time magazine and in the halls of government and medical centers. America swore off fat, and people became fatter, all on the basis of manipulated data that supported his claim and ignored data that didn’t.
That’s why Cathy O’Neil says we need an analog of the Hippocratic Oath for people who handle data. She also advocates for “data auditors” like Joe Simmons, Leif Nelson and Uri Simonsohn.
This is, as I said, our world. We began accumulating data in great quantities with the rise of the modern world. As more humans peopled the planet, more found purpose in exploring previously untapped and even unknown data sources. We went from those early encyclopedists like Denis Diderot, chief editor of Encyclopédie, to the many tomes of the Encyclopedia Britannica, to Wikipedia, to online publications of research without peer review. Anyone can publish anything, and that means no one really knows whom to trust. That even “peer-reviewed research” contains falsified, misinterpreted, and misleading data should concern everyone because of our undeniable interconnectedness.
We have more data than any previous generation of humans. But we don’t have more wisdom, and it’s going to be increasingly difficult to become wiser than our ancestors if the information we use is false. As Heraclitus wrote in the Fragments: One can’t have facts without wisdom nor wisdom without facts. The two are mutually dependent. Because of our interconnectedness, those who foist false data on any segment of society inhibit the growth of wisdom in all of us.
*O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York: Crown. And Yates, Kit. 2019. The Math of Life & Death: 7 Mathematical Principles That Shape Our Lives. New York: Scribner.
**Subbaraman, Nidhi. 24 Sept 2023. The Band of Debunkers Busting Bad Scientists. Online at https://www.wsj.com/science/data-colada-debunk-stanford-president-research-14664f3 Accessed September 25, 2023