Where were you when you first learned about implicit bias?
Maybe you were on your couch at home watching one of the 2016 US presidential debates when Hillary Clinton brought up implicit bias in her exchange with Donald Trump. Many psychologists like me had been hearing about implicit bias—and were sick of hearing about it—for nearly two decades by then.
I can remember precisely where I was when I learned about the measurement of these supposedly hidden prejudices. I was a PhD student at Brown University in 1998 (back when Hillary’s husband was President), sitting in my advisor’s office, when she had me complete an implicit association test (IAT) on the computer. The test consisted of pictures of Black and white faces and positive and negative words, and my job was to sort the faces by race and the words by emotional valence. What quickly became apparent was that I had a much tougher time pairing Black faces with positive words than with negative words. And I didn’t have to think too hard to grok what my difficulty meant about me: I was a racist.
Let me back up. In the 1980s and 1990s, psychology had a problem. We knew people were biased, but whenever we asked them directly, they acted shocked. "Prejudiced? Who, me? Never!" Meanwhile, the world kept providing evidence that bias was alive and well.
The issue was that prejudice had become so socially unacceptable that no one would admit to holding negative stereotypes about groups other than their own. And make no mistake: bias was real, as anyone could see from persistent disparities in hiring, housing, and criminal justice. So simply asking people about their racial attitudes wouldn’t cut it: people would lie to the psychologists, and even to themselves. Even worse, people might not even be aware of their racist attitudes and impulses, so how could we expect them to self-report on them?
To solve this, creative psychologists devised ever more clever ways to assess bias, from elaborate fake polygraphs—the bogus pipeline, which convinced participants a machine could detect their true attitudes, so they might as well report them honestly—to indirect questionnaires. But these approaches had obvious limitations: they depended on deception, they didn’t scale, and they still ultimately relied on self-report.
Around the same time, the cognitive revolution was in full swing, and cognitive psychologists were making real progress in mapping the structure of mental concepts using speeded reaction time tasks, in which people respond to stimuli as quickly as possible while researchers measure their response latencies. In one influential demonstration of semantic priming, participants recognized the word “doctor” faster when it was preceded by the word “nurse,” suggesting these concepts were not only semantically related but stored in memory as linked nodes. More than that, these mental networks didn’t just encode meaning; they encoded evaluation, too. So when people saw a positive word like “flower,” they were quicker to recognize another positive word like “good,” a finding that launched decades of work on evaluative priming. It also planted the seeds for how to assess implicit bias.
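To make that logic concrete, here is a minimal sketch, in Python with invented numbers, of how an evaluative-priming effect is typically scored: responses to evaluatively congruent prime-target pairs (e.g., “flower” then “good”) should be faster than responses to incongruent pairs. The trial data and labels are hypothetical illustrations, not from any real study.

```python
from statistics import mean

# Each trial: (prime_valence, target_valence, reaction time in ms).
# These numbers are made up for illustration.
trials = [
    ("positive", "positive", 512), ("positive", "negative", 587),
    ("negative", "negative", 530), ("negative", "positive", 601),
    ("positive", "positive", 498), ("negative", "positive", 615),
]

# Congruent trials: prime and target share a valence.
congruent = [rt for prime, target, rt in trials if prime == target]
incongruent = [rt for prime, target, rt in trials if prime != target]

# A positive priming effect means faster responses on congruent trials.
priming_effect_ms = mean(incongruent) - mean(congruent)
print(f"Evaluative priming effect: {priming_effect_ms:.0f} ms")
```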
If “flower” and “good” go together in memory, then maybe racial groups and evaluations work the same way. Perhaps our cultural environment wires “Black person” to “bad” and “white person” to “good,” even if we consciously reject those links. That was the clever insight: rather than asking people how they feel about race, psychologists could measure how quickly they paired “Black” with “good” or “bad.” The slower you were to link Black with good, the stronger your supposed bias. From this idea came the IAT and other speeded reaction time tasks, tools meant to bypass self-report and catch bias in the act. Researchers called these tools bona fide pipelines—to contrast with the bogus kind—because they were thought to be pipelines to the soul, unobtrusively measuring racial attitudes.
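Here is a stripped-down sketch of the scoring logic. The widely used D-score algorithm (Greenwald, Nosek, & Banaji, 2003) divides the mean latency difference between the two critical pairing blocks by the standard deviation of latencies pooled across both blocks; real scoring includes further steps (error penalties, trial trimming) omitted here, and these latencies are invented.

```python
from statistics import mean, stdev

# Invented latencies (ms) from the two critical IAT blocks.
congruent_block = [620, 655, 601, 688, 640]    # e.g., white+good / Black+bad pairing
incongruent_block = [812, 745, 790, 860, 772]  # e.g., Black+good / white+bad pairing

# Simplified D-score: latency difference scaled by pooled variability.
pooled_sd = stdev(congruent_block + incongruent_block)
d_score = (mean(incongruent_block) - mean(congruent_block)) / pooled_sd

# Larger positive D-scores were read as stronger implicit preference for
# the "congruent" pairing, which is the very inference now in dispute.
print(f"D-score: {d_score:.2f}")
```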
These measures became psychology's darling almost overnight. The promise was irresistible: objective, scientific measurement of subjective, hidden bias.
There was just one tiny problem, man. (You know what’s coming.) The measures don't work as advertised.
Olivier Corneille and Bertram Gawronski just published a paper that systematically dismantles the entire edifice of implicit measures, and to be honest, I don’t see how the field recovers. New shit has come to light. Before you dismiss their critique as the work of a couple of iconoclasts sniping at the establishment, know that these authors, especially Gawronski, are no dilettantes. Along with Galen Bodenhausen, Gawronski developed the influential APE (Associative-Propositional Evaluation) model, which became one of the most important theoretical frameworks for understanding how implicit and explicit attitudes work together. When someone who helped build the house tells you the foundation is cracked, you listen.
Corneille and Gawronski examine six core claims about why implicit measures are supposedly superior to self-reports, and every single one crumbles under scrutiny.
Remember, the whole point of developing these measures was to get around the problems with self-reports—the lying, the lack of awareness, the social-desirability effects. The first and most crucial claim was that implicit measures are immune to social-desirability and context effects while self-reports are hopelessly compromised.
Except it's wrong.
Implicit measures are just as sensitive to social context as self-reports, sometimes more so. Change the race of the experimenter and scores shift. Test people in public versus private settings and results change again. Responses on implicit tests can be controlled and even faked. The supposedly pure pipeline to unconscious bias turns out to be just as contaminated as the self-reports we'd rejected. According to the authors, “there is no compelling evidence that implicit measures are immune or less sensitive than self-reports to social-desirability effects.”
Ouch.
We spent decades trying to bypass the problems with asking people directly about their biased racial attitudes, but our solution doesn’t work.
Equally devastating is the consciousness claim. Implicit measures are thought to tap into unconscious thoughts people genuinely don't know they have. This was supposed to solve the other half of our self-report problem: even if people wanted to tell us about their biases, they might not be aware of them. However, study after study shows people can predict their implicit scores with surprising accuracy. If you can guess what your so-called unconscious bias test will reveal, how unconscious is it really? Even prominent scholars of implicit measures like Anthony Greenwald now admit that these measures cannot assess unconscious thoughts and feelings.
The authors also dispute claims about automaticity (that only implicit measures can tap automatic processes), the ability to capture simple associations (that only implicit measures tap mental links between concepts), robustness (that only implicit measures are stable and resistant to change), and systemic bias (that only implicit measures can detect community-level patterns of prejudice). At every turn, well-designed self-report measures either match or outperform their new-fangled competitors.
Think about all the intellectual energy, research funding, and graduate student careers that went into developing and refining these measures. All the while, the simple approach of asking people about their attitudes—the very thing we were trying to escape—could have been improved and refined to work better than what we built to replace it.
The paper stands on its own, but I wanted to add three comments. First, this entire mess could have been avoided if we’d just spoken to some of our personality colleagues down the hall. If we had, they would have told us that boring measurement basics matter. Psychology spent over a century learning how to build reliable, valid tests, but when these shiny new tools came along, we acted like reliability and validity were bureaucratic hurdles rather than essential foundations. Worse: when folks who knew a thing or two about psychometrics criticized these new measures, prominent proponents suggested these critics were mentally unwell.
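For a sense of what those boring measurement basics look like in practice, here is a toy sketch of test-retest reliability, the kind of routine psychometric check that should precede theorizing: give the same people the same test twice and correlate the scores. The scores below are invented for illustration.

```python
from statistics import mean, stdev

# Hypothetical scores for the same seven people, tested weeks apart.
session_1 = [0.42, 0.15, 0.61, 0.30, 0.55, 0.08, 0.47]
session_2 = [0.18, 0.33, 0.40, 0.05, 0.62, 0.21, 0.35]

def pearson_r(xs, ys):
    """Pearson correlation: sample covariance over the product of sample SDs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# A measure of a stable trait should correlate strongly with itself over time.
print(f"Test-retest reliability: r = {pearson_r(session_1, session_2):.2f}")
```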
Second, I can’t help but think we rushed to theorize about implicit measures and implicit bias before we understood what we were measuring. The moment we saw interesting patterns emerge in reaction times, we started spinning elaborate stories about the differences between implicit and explicit attitudes and about unconscious cognition. As I've written before, when researchers rush to theorize, they start seeing confirming evidence everywhere, becoming so invested in their theoretical hunches that they'll bend their methods to fit their beautiful ideas rather than following where the data actually lead. A big hell yes to theories! But only after we've done the basic foundational work.
Third, Bertram Gawronski raised an important point when we discussed this piece (thanks, Bertram!). While implicit measures don't work as advertised, that doesn't mean implicit bias itself isn't real. The idea that people can behave in biased ways without being aware of it is probably still valid. The problem is that researchers assumed they could study this phenomenon by simply giving people an IAT rather than doing the harder work of investigating unconscious bias in the real world. Bertram’s comments suggest a true irony: our obsession with implicit measures may have made us more ignorant about implicit bias, because it gave us the illusion that we were studying unconscious prejudice when all we were doing was studying performance on a computer task.
The implications of Corneille and Gawronski’s paper ripple far beyond academic psychology. Think about all the diversity training programs built around the premise that people can't or won't accurately report their biases. The police departments that spent millions addressing unconscious prejudice on the assumption that conscious measures were inadequate. The hiring practices restructured around biases that implicit measures supposedly revealed but self-reports couldn't. We've built an entire industry on false premises.
I want to be clear: bias is absolutely real. Anyone paying attention can see how prejudice can shape things from job interviews to traffic stops. But if our primary scientific tool for measuring bias is fundamentally flawed, and if the problems with self-reports were less insurmountable than we thought, then we might have been solving the wrong problems in the wrong ways for decades.
Corneille and Gawronski deserve mad respect for doing the unglamorous work of checking whether our tools work. When they found that implicit measures fail to deliver on their promises, they had the courage to say so publicly.
To me, the lesson here is that we need to slow down and do the unglamorous work of careful observation, description, and psychometrics before we start spinning grand theories. Maybe if we'd listened to the critics instead of dismissing them, if we'd been humbler about what we actually knew versus what we wanted to believe, this entire detour might not have lasted so long.
I am not a fan of the IAT (in fact, I distinctly remember writing one of my seminar papers on the problems of the IAT, especially on whether it measures actual bias or mere association), but I actually think that Corneille and Gawronski, and you yourself, are being a bit too harsh here. The main question, and it really is more theoretical than empirical I think, is what bias is to begin with. If bias is indeed a single construct that self-reports and the IAT both measure differently, then it makes sense to compare them and ask which one is better, more robust, etc.
However, if there really is a difference between implicit and explicit bias (as Gawronski himself agrees), then comparing the IAT, a measure of implicit bias, to self-reports, i.e., a measure of explicit bias, is rather odd to begin with. (That's like measuring fluency by comparing reading comprehension to reaction times on a word/nonword task: sure, they both tap into a similar cognitive process, but it is very likely that they are better suited to evaluating different aspects of it, and comparing them would be useless.)
So the question becomes: what evidence do we have that the IAT badly measures implicit bias? I think some of the points raised by Corneille and Gawronski don't directly answer that, instead just showing that the IAT is not a great measure of bias in general. Even worse, other points just show that the IAT is a bad measure of [insert false assumption regarding implicit bias], which is definitely not a problem with the IAT, which may be accurately reflecting a reality we just didn't hypothesize.
Therefore the problem, I think, is more related to your second point, about the rush to theorize and measure. Fortunately, that does not mean the field cannot recover. It actually seems to me like a healthy process of regrouping: matching our expectations of the IAT to its actual capabilities and properly theorizing the differences between implicit and explicit biases.
That being said, I 100% share both the mad respect for Corneille and Gawronski and the aversion to the rush of the theory-evidence-intervention-policy process. BTW, I think it is a very American (perhaps Anglo-Saxon?) thing: at least from what I see in Europe and here (Israel), people are actually pretty cautious (probably too cautious) about turning social psychology findings into policy.
>bias is absolutely real. Anyone paying attention can see how prejudice can shape things from job interviews to traffic stops.
Prejudice and bias aren't interchangeable. Researchers were biased in favor of the IAT (irrationally seeking only confirmatory evidence), whereas it is unclear to what extent a preference for white-sounding names is based in bias (someone's race may legitimately correlate with desired characteristics in an employee that aren't on the resume, like their tendency to show up on time or not).
Frankly, the field failed here because the discussion never examines the biases and assumptions of the researchers themselves, who typically hold a very left-wing perspective: that racism is a strong explanatory factor in behavior, that biases are specific instead of general, etc.