MIT Technology Review: Google’s A.I. ‘Hate Speech’ Detector ‘Erratic’

REUTERS/Kacper Pempel/Files

The MIT Technology Review recently published an article demonstrating that Google’s new comment detection system, Perspective, seems to have difficulty differentiating offensive words or controversial subjects from actual hate speech.

Breitbart previously reported on the “hate speech” detection software, which is designed to help online publications clear offensive content from their article comment sections and could potentially be applied to other Internet platforms. The MIT Technology Review decided to test the accuracy of the software, which scores comments on a scale of 1-100 for “toxicity.” The program defines a toxic comment as “a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion.”
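The scoring workflow described above can be sketched as a request to Google's public Perspective (Comment Analyzer) API. The JSON shape below follows that API's published format, but treat the exact field names as assumptions for illustration, and `API_URL` as the documented endpoint rather than something verified here; an actual call would also require an API key.

```python
import json

# Documented endpoint for Perspective's comment analysis (an assumption here,
# reproduced from the public API docs; a real request also needs ?key=API_KEY).
API_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_analyze_request(comment_text):
    """Build the JSON body for a Perspective 'analyze' request asking
    for the TOXICITY attribute the article discusses."""
    return {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def toxicity_percent(response_json):
    """Extract the summary toxicity score (0.0-1.0) from a response and
    express it as a percentage, matching how the article reports scores."""
    score = response_json["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    return round(score * 100)

# Example request body for one of the comments the Review tested.
payload = build_analyze_request("Trump sucks")
print(json.dumps(payload))
```

Posting that payload to the endpoint would return a response whose `summaryScore.value` field, scaled to a percentage by `toxicity_percent`, corresponds to the figures quoted below.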

“Trump sucks” scored a colossal 96 percent, yet neo-Nazi codeword “14/88” only scored 5 percent. “Few Muslims are a terrorist threat” was 79 percent toxic, while “race war now” scored 24 percent. “Hitler was an anti-Semite” scored 70 percent, but “Hitler was not an anti-Semite” scored only 53 percent, and “The Holocaust never happened” scored only 21 percent. And while “gas the joos” scored 29 percent, rephrasing it to “Please gas the joos. Thank you.” lowered the score to a mere 7 percent. (“Jews are human,” however, scores 72 percent. “Jews are not human”? 64 percent.)

The MIT Technology Review believes the system is designed to detect particular words and phrases that may be deemed offensive, without accounting for the meaning behind them. In one example, the word “rape” on its own scored a toxicity rating of 77 percent, but the statement “rape is a horrible crime” actually rated higher, at 81 percent. Curse words receive similar treatment: “I fucking love this” was rated at 94 percent on the toxicity scale, which is supposed to indicate whether a comment is “rude, disrespectful, or unreasonable.”
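The failure mode the Review hypothesizes can be shown with a toy scorer. This is emphatically not Google's model, just a minimal sketch of word-level scoring: a lookup table of "toxic" words (the weights are made up) assigns the same score to a slur used hatefully and to a sentence condemning it, because negation and context never enter the calculation.

```python
# Hypothetical word weights, invented for illustration only.
TOXIC_WORDS = {"rape": 0.77, "stupid": 0.9, "nazi": 0.87}

def naive_toxicity(comment):
    """Score a comment by its single most 'toxic' word, ignoring context."""
    words = comment.lower().replace(".", "").split()
    return max((TOXIC_WORDS.get(w, 0.0) for w in words), default=0.0)

# Both sentences trigger on the word "rape", even though one condemns
# the crime -- the context-blindness the Review describes.
print(naive_toxicity("rape is a horrible crime"))
print(naive_toxicity("rape"))
```

A context-aware model would need to score the full sentence rather than its worst word, which is precisely what the Review's examples suggest Perspective struggles to do.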

We may say “Trolls are stupid” (toxicity score 96 percent), but the language of toxicity and harassment is often rich in ways that machine-learning systems can’t handle. The comment “You should be made into a lamp,” an allusion to claims that skin from concentration camp victims was used for lampshades, has been thrown at a number of journalists and other public figures in recent months. It scores just 4 percent on Perspective. But best not reply by saying “You are a Nazi,” because that’s an 87 percent.

Read the full article by MIT Technology Review here.

Lucas Nolan is a reporter for Breitbart News covering issues of free speech and online censorship. Follow him on Twitter @LucasNolan_ or email him at

