In a radio interview that you can listen to here, I describe the controversial paper, and then there is a discussion involving the president of the Black Student Alliance and Sandy Darity. Sandy Darity stated that he had two problems with the paper. First, it was being used by others in a Supreme Court case against racial preferences, which is not a problem with the paper itself. Second, we didn't take into account stereotype threat.
Stereotype threat is the concept that certain marginalized groups will underperform on tests relative to their actual abilities due to having to endure negative stereotypes. If someone repeatedly calls you an idiot right before you take a test, this may have a negative impact on your performance.
An outstanding question is then whether stereotype threat affects African American performance on standardized tests as well as performance in the classroom. If so, then another question is how to eliminate the hostile environments that are resulting in the underperformance.
I confess that I have not followed the stereotype threat literature. Economists as a whole haven't paid very much attention to this literature, and to the degree that they have, it is generally not treated as important. One of the few papers on the racial test score gap that does mention it, and then only in a footnote, is by Fryer and Levitt. Their paper shows substantial racial differences in test scores in early grades. They argue that stereotype threat is unlikely to be important when children are so young. Carneiro, Heckman, and Masterov have a chapter in the Handbook of Employment Discrimination Research: Rights and Realities that does discuss stereotype threat. You can see the working paper version here. They are highly critical, stating that "No serious empirical scholar assigns any quantitative importance to stereotype threat effects" (page 10 of the working paper version).
My limited understanding of this literature makes me skeptical that stereotype threat is large enough to affect the findings of the paper. Steele and Aronson's initial piece on stereotype threat showed that African Americans performed better on GRE verbal questions in less-threatening environments and, after controlling for SAT verbal scores, performed just as well as white students in those environments. But the key here, as noted by Sackett, Hardison, and Cullen, is that African Americans and whites in the less-threatening environment performed the same conditional on SAT verbal scores, and there are significant differences in SAT verbal scores across the two groups. The evidence in Steele and Aronson's paper should be interpreted not as showing that removing stereotypes closes the racial test score gap, but as showing that the SAT is taken in an environment similar to the less-threatening environments in their study. If the SAT environment were in fact hostile, then SAT verbal scores would underrepresent African American achievement and, conditional on the same SAT score, African American students should outperform their white counterparts.
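The conditioning logic can be made concrete with a small sketch. Everything in it is an assumption for illustration, not something taken from any of the papers discussed: it uses a stylized, noise-free model in which a threatening environment lowers the stereotyped group's observed score by a fixed `penalty`, and the function name and parameters are hypothetical.

```python
def conditional_gap(penalty, threat_prior, threat_later):
    """Later-test gap (stereotyped group minus comparison group)
    among students who have the same prior score.

    Stylized model, assumed purely for illustration:
      stereotyped group: observed score = ability - penalty * threat
      comparison group:  observed score = ability
    where threat is 1 for a threatening environment and 0 otherwise.
    """
    # With equal prior scores, the stereotyped group's underlying ability
    # must be higher by whatever the prior test subtracted.
    ability_advantage = penalty * threat_prior
    # The later test subtracts its own penalty from that advantage.
    return ability_advantage - penalty * threat_later


# Prior test threatening, later test not: the stereotyped group outperforms.
print(conditional_gap(penalty=5, threat_prior=1, threat_later=0))
# Neither test threatening: no gap conditional on the prior score.
print(conditional_gap(penalty=5, threat_prior=0, threat_later=0))
```

Under these assumptions, equal performance in a non-threatening setting conditional on the prior score requires that the prior test was also non-threatening (or that the penalty is zero). A real analysis would also have to deal with measurement error in the prior score, which this sketch ignores.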
This criticism of the stereotype threat argument moved this literature towards looking for cases where African Americans and other marginalized groups actually performed better than their majority counterparts conditional on previous performance, where previous performance is assumed to be measured in a more hostile environment. Walton and Spencer provide a meta-analysis of this literature and conclude that stereotype threat is, in fact, very present in many previous performance measures. But a few things should be noted from the study. First, previous performance is much more similar to performance in non-threat conditions than it is to performance in threat conditions. Second, some of the manipulations done to get a non-threatening environment may affect the quality of the results. As an example, suppose I told you that a particular test didn't matter. How hard would you try? Note that there is no evidence that stereotype threat can come close to explaining the large racial differences in test scores, nor is there evidence that it could explain the large racial differences in test scores at Duke.
Our paper found significant racial differences in leaving STEM majors and in citing course difficulty as the reason for leaving a major. But there were no significant racial differences in these measures once we conditioned on academic background, either using SAT scores and Duke's private rankings of the applicants, or using performance in first-year classes. If stereotype threat were important and hence SAT scores or first-year performance were underestimating African American academic background, then racial differences in switching majors because of course difficulty would reemerge.
As stated above, large test score gaps emerge at an early age and then persist. Stereotype threat allows us to say that these differences in test scores aren't really measuring human capital accumulation, that differences in human capital accumulation across races are not really as bad as they seem. I think stereotype threat serves as a distraction from dealing with the real problem of differences in human capital accumulation across races, particularly at early ages.
But even if I did buy into stereotype threat as a significant source of achievement differences, it would make racial preferences in admissions even less appealing. What better way to reinforce negative stereotypes than having students wonder whether or not they would have been admitted if they were a different race? It would be interesting to look at stereotype threat at universities with more or less aggressive affirmative action policies. For example, did under-represented minorities at public universities in California identify more or less with negative stereotypes after Proposition 209? (Prop 209 banned racial preferences in admissions for public universities in California.)