How Big Data Could Either Solve Corporate Diversity Issues or Make Them Worse

Are you feeling a bit paranoid about the use of data-mining tools to sift through huge piles of online information to predict or manipulate your behavior?  How would you feel if companies started using Big Data to find job applicants, with an eye toward increasing racial diversity?

I couldn’t get three paragraphs into an article on the subject at National Journal before my hackles went up, and that was still a paragraph before the article got around to exploring how it could all go horribly wrong:

Humans are fallible, biased creatures, and even the most well-intentioned hiring managers have a strong tendency to hire “look like me, act like me” candidates.

Those unintended prejudices in recruitment – whether racial, gendered, or economic – are shortcomings that a growing number of big-data firms are hoping they can help solve with their massive number-crunching operations. By mining troves of personal and professional data, these companies claim they can not only match employers with A-plus job candidates, but help close diversity gaps in the workforce, too.

“Big data in the workplace poses some new risks, but it may yet turn out to be good news for traditionally disadvantaged job applicants,” said David Robinson, a principal at Robinson + Yu, a consulting group that works to connect social justice and technology.

Still, concerns abound. Earlier this year, the White House released a landmark report on big data, warning that the exploding enterprise could – intentionally or not – allow companies to use data to discriminate against certain groups of people, particularly minorities and low-income groups. That’s also the fear of the Federal Trade Commission, which held a workshop last week exploring the concept of “discrimination by algorithm.”

“Big data can have consequences,” FTC Chairwoman Edith Ramirez said. “Those consequences can be either enormously beneficial to individuals and society, or deeply detrimental.”

Thanks for the pearl of equivocal wisdom, Chairwoman Ramirez!  I’d say big anything can be either enormously beneficial or deeply detrimental.  Ten thousand years ago, our ancestors were voicing the same concerns about Big Fire.

The problem is summarized well in those opening paragraphs: despite massive affirmative-action programs stretching back for decades, and an overwhelming social consensus against racial discrimination – it’s one of the worst things anyone in America could be accused of – there supposedly remain pockets of lingering, essentially unconscious racism among employers, who tend to hire people who look like them.  That assertion sounds fairly reasonable in the abstract, as it has been well-established that people tend to draw inferences based on superficial characteristics.  That’s one of the reasons you dress up nicely for job interviews.

But I find the assertion that such tendencies have created a corporate “diversity” problem so massive that only Big Data can ride to the rescue, by essentially taking some of the hiring process out of human hands, highly dubious.  On the contrary, I suspect big corporate managers are making conscious efforts to score diversity points in hiring.  They have literally been trained to do so, through years of corporate outreach programs and government pressure.  At this point, in the year 2014, many (I’m an optimist, so I’ll upgrade that to “most”) of those managers are sincerely eager to avoid conscious racism, above and beyond the concern that their companies could get into legal trouble for it.  The notion that there’s still some sort of “hire people who look like me” code running in the unreachable core of the human mind, and causing such problems that only enlisting HAL 9000 to screen job applicants can achieve true diversity, is quite a stretch.

It wouldn’t go over well with people who are already concerned about getting the fuzzy end of the affirmative-action lollipop because they happen to be the wrong color.  Now the process of deliberately disadvantaging them, to benefit “traditionally disadvantaged” applicants, will become automated and impersonal.  You’ll never know that your application is getting shot down by computerized diversity screens.  

The first concern mentioned by National Journal is the fear that such Big Data techniques could be turned around to make discrimination against the “traditionally disadvantaged” worse, perhaps unintentionally, as algorithms refine themselves to perpetuate existing corporate biases.  “If you’re a company that doesn’t have a history of hiring women or minorities, your model will tell you that these people are not especially well qualified. Even if you’re simply trying to minimize turnover, what you might do is systematically exclude certain groups that are poor or of a certain ethnicity,” as one student of data mining puts it.
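The feedback loop that data-mining student describes is easy to see in miniature. Here is a deliberately simplified sketch – the data, group labels, and function names are all invented for illustration, not drawn from any real vendor's system – of how a naive screening model trained on a company's own hiring history just learns and reproduces whatever pattern is in that history:

```python
# Hypothetical example: a firm's past hiring record, heavily skewed
# toward group "A". (Synthetic data, invented for illustration.)
history = (
    [{"group": "A", "hired": True}] * 8 + [{"group": "A", "hired": False}] * 2 +
    [{"group": "B", "hired": True}] * 1 + [{"group": "B", "hired": False}] * 9
)

def group_hire_rate(history, group):
    """Fraction of past applicants from `group` who were hired."""
    past = [h for h in history if h["group"] == group]
    return sum(h["hired"] for h in past) / len(past) if past else 0.0

def naive_screen(applicant, history):
    """A naive 'model': score an applicant by how often the company
    historically hired people from their group. It never looks at
    qualifications at all -- it only echoes the past."""
    return group_hire_rate(history, applicant["group"])

# Group A's historical hire rate is 0.8; group B's is 0.1.
# So the screen ranks any group-A applicant above any group-B applicant,
# regardless of merit -- the old bias, now automated.
score_a = naive_screen({"group": "A"}, history)  # 0.8
score_b = naive_screen({"group": "B"}, history)  # 0.1
```

The point of the sketch is that nothing malicious has to be programmed in: a model told only to "predict who we'd hire" or "minimize turnover" will absorb whatever skew the training history contains, which is exactly why the article's sources say such bias would have to be explicitly designed out.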

However, the body of the article makes it clear that such concerns are unlikely to materialize, because as data-mining experts point out, that’s the sort of bias the system would be explicitly constructed to reject.  Nobody’s going to run a program that either deliberately or inadvertently excludes women, for example – they’d get sued back into the Analog Era.  A more reasonable concern would be either explicit or emergent behavior that confers extra bias points upon the “traditionally disadvantaged,” which will be portrayed as a positive effort to be more inclusive, but has the net effect of disadvantaging everyone else… and of course, those disfavored groups aren’t the sort of people who will have much luck convincing the government they have been discriminated against, if they even become aware of the process.

The data miners who talked with National Journal seemed to be concerned primarily about income-group discrimination – i.e. poor people don’t get interviewed for big-ticket high-tech jobs.  They want to address that by using these Big Data techniques to suss out qualified applicants from economically disadvantaged groups who might otherwise not reach the interview stage, speaking in terms of making the applicant pool larger, rather than excluding anyone.  Is there really a big problem with corporate management ignoring highly qualified applicants from low-income groups?  We are, presumably, talking about screening processes that prevent such applicants from reaching the personal-interview stage; I doubt any of these proposed Big Data solutions would result in people getting hired sight-unseen by computers.  

How is a data-mining solution that expands the applicant pool, without excluding anyone, going to compensate for the identifying traits of these presumably disadvantaged applicants, such as which school they went to, or particularly their past employment history?  I’m trying to imagine myself in the manager’s chair, looking at applications for a six-figure tech job.  Some of my applicants recently worked at such jobs; some of them have no such prior experience on their resume; some of them are trying to climb from the pits of the shrunken Obama workforce, and maybe haven’t had a full-time job at all.  What kind of expansive data-mining algorithm leads me to ignore those considerations and hire the guy, or gal, who hasn’t worked since January?  (Is this whole data-mining thing really an elaborate effort to alleviate the “unemployable” curse that afflicts so many in the Obama years, where people who haven’t held career positions in a long time have trouble getting employers to consider them for career positions?)

I tend to think all this fretting about subconscious employer discrimination would be best addressed through a booming economy bursting with jobs, creating a seller’s market for skilled labor.  Only in the low-growth doldrums, where getting the headline unemployment rate down 0.1 percent by knocking another hundred thousand people completely out of the workforce is considered good news, and economic data is examined with microscopes to detect the tiniest signs of “recovery,” do we have to worry about using Big Data techniques to ensure the correct socially-just mixture of racial, sexual, and income groups is left unemployed.
