A report at NewScientist describes a research paper from a Google team as presenting a “fix” for the spread of “garbage” across the Internet: an algorithm that would rank web pages based on their “trustworthiness” by automatically detecting and tabulating “false facts” on each web page.
Like every other pretense of calculating Objective Truth with a formula – or “fact-checking” the Internet with a supposedly disinterested and unbiased clergy of truth-seekers – it’s a concept brimming with the potential for abuse. The ink isn’t even dry on the government takeover of the Internet, and we’re already setting up office space for the Ministry of Truth? Everything really does happen faster on the Internet.
The “problem” addressed by the research paper NewScientist references is that Google’s search algorithm currently ranks websites based on their popularity, using “the number of incoming links to a web page as a proxy for quality.” The drawback to this approach is that “websites full of misinformation can rise up the rankings, if enough people link to them.”
Such rankings can even be influenced by a large number of links from Internet users seeking to challenge the claims made on the page, which is one reason it’s become increasingly common practice to link to third-party references to a disputed page. This, in turn, can create self-reinforcing rings of mistaken skepticism, in which those who challenge a website link only to each other, circulating increasingly inaccurate citations of the original page that was challenged… and perhaps denying readers an opportunity to see clarifications, updates, or retractions made on the original page.
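To make the complaint concrete, here is a minimal sketch of ranking by raw incoming-link counts. It is a deliberate simplification, not Google's actual algorithm (real link-based ranking weights links rather than merely counting them), and the page names and the `rank_by_incoming_links` helper are invented for illustration. The point it shows is that a link made in order to rebut a page counts exactly like a link made to endorse it.

```python
from collections import Counter


def rank_by_incoming_links(links):
    """Rank pages by how many links point at them (hypothetical helper).

    `links` is an iterable of (source_page, target_page) pairs. The counter
    has no idea *why* a link was made, so criticism and praise look alike.
    """
    counts = Counter(target for _source, target in links)
    # Most-linked first; ties broken alphabetically so the order is stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _count in ordered]


# Invented example data: two of the three links to the dubious page
# come from critics who linked to it in order to dispute it.
links = [
    ("fan-blog", "dubious-claims.example"),
    ("critic-a", "dubious-claims.example"),
    ("critic-b", "dubious-claims.example"),
    ("library", "careful-reference.example"),
]

ranking = rank_by_incoming_links(links)
print(ranking)
```

In this toy data the disputed page comes out on top, helped along by the very people trying to challenge it, which is why critics so often link to a third-party reference instead.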
The pursuit of Objective Truth is a difficult business, and it will probably continue to stubbornly resist even the most well-meaning efforts to automate it. Not all of those efforts are well-meaning. The NewScientist post approvingly references a few “fact-check” websites that have themselves been rocked by devastating challenges to their impartiality and accuracy.
Some self-described “fact-check” sites are outright jokes. Google’s “Knowledge Vault,” the prospective source of pure and undiluted truth for trustworthiness rankings, is described as a “vast store” of facts validated by the near-unanimous agreement of the Web. Gee, what could go wrong with that?
The temptation for a self-appointed Gatekeeper of Truth, especially one as powerful as Google, to fudge its sacred (and enormously complex) truth-detecting formula would be immense. Even if the formula is kept pure and delivers initially sound results, it could be corrupted by false input data, or manipulated by writers who learn how to beat its tests.
Over time, it’s not unreasonable to assume that the websites most commonly beaten down in the rankings due to “trustworthiness” errors would be those written by people who haven’t carefully studied the trustworthiness algorithm and learned how to play games with it.
Then there’s the matter of the Devil’s favorite sort of deception: the “half-truth,” a false claim packed with valid, but insufficient, nuggets of fact. Inferences are difficult to mechanically evaluate. An algorithm designed to detect assertions that run contrary to verified data isn’t going to detect truth left undelivered, context that isn’t properly established, or contrary evidence conveniently left unmentioned.
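A rough sketch shows why the half-truth slips through. Assume, purely for illustration, that a “trustworthiness” score is the share of a page's checkable claims that agree with a store of verified facts. The `trust_score` function, the claim keys, and the data below are all hypothetical, not taken from the research paper.

```python
def trust_score(page_claims, known_facts):
    """Fraction of a page's checkable claims that match the fact store.

    Hypothetical scoring rule: claims absent from `known_facts` are skipped,
    and facts the page never mentions are not counted against it at all.
    Returns None when nothing on the page can be checked.
    """
    checked = [(key, value) for key, value in page_claims.items()
               if key in known_facts]
    if not checked:
        return None
    correct = sum(1 for key, value in checked if known_facts[key] == value)
    return correct / len(checked)


# Invented fact store: the product works, but it also has a serious drawback.
known_facts = {
    "product_works": True,
    "product_has_serious_drawback": True,
}

full_story = {"product_works": True, "product_has_serious_drawback": True}
half_truth = {"product_works": True}  # the drawback is simply left out
falsehood = {"product_has_serious_drawback": False}

print(trust_score(full_story, known_facts))
print(trust_score(half_truth, known_facts))
print(trust_score(falsehood, known_facts))
```

Under this assumed rule the half-truth earns the same perfect score as the full story, because the formula can only grade what a page says, never what it conveniently leaves unsaid.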
The level of confidence associated with automatically tabulated “trustworthiness” rankings seems likely to exceed the actual trustworthiness of the pages, as most people understand the meaning of that term. Cleaning up the “garbage” on the Internet would involve a lot more than reducing the search-engine priority of a few highly popular but empirically incorrect web pages; there is danger in persuading users to believe that such measures are sufficient.
There’s also danger in asserting the power to do such things automatically, without an opt-in from users. (If this “ranking by trustworthiness” concept gets past the theoretical stage, perhaps Google will implement it with such an opt-in. Big Internet companies have been burned a few times in the recent past by public outcry over the subtle manipulation of users’ behavior through stealthy changes to their Web experience, made without explicit user awareness and consent.) A lot seems to be happening to us “automatically” these days; some of these systems are accepted as helpful, while others feed the uneasy sense that average end users have lost control of the Internet.
There’s a considerable paradigm shift involved in this ranking-by-facts concept, as it would transform the ranking of web pages from an external process controlled by the great and unruly mass of users – who make pages popular by linking to them – into an internal procedure controlled by Google, and those who learn how to take advantage of its system. Not that existing search algorithms are impossible to manipulate, of course – far from it! – but that transition to internal control is something users might want to ponder at length before signing on to it, assuming they are given the choice of not signing on.