Scientists have discovered that common AI models express a covert form of racism based on dialect, manifesting chiefly against speakers of African American English (AAE).
In a new study published Aug. 28 in the journal Nature, scientists found evidence for the first time that common large language models, including OpenAI’s GPT-3.5 and GPT-4 as well as Meta’s RoBERTa, express hidden racial biases.
Replicating previous experiments designed to examine hidden racial biases in humans, the scientists tested 12 AI models by asking them to judge a “speaker” based on their speech pattern, which the scientists drew up based on AAE and reference texts. Three of the adjectives most strongly associated with AAE were “ignorant,” “lazy” and “stupid,” while other descriptors included “dirty,” “rude” and “aggressive.” The AI models were not told the racial group of the speaker.
The AI models tested, especially GPT-3.5 and GPT-4, even obscured this covert racism by describing African Americans with positive attributes such as “brilliant” when asked directly about their views on this group.
While the more overt assumptions about African Americans that emerge from AI training data aren’t racist, more covert racism still manifests in large language models (LLMs). This exacerbates the discrepancy between covert and overt stereotypes by superficially obscuring the racism that language models maintain on a deeper level, the scientists said.
The findings also show there is a fundamental difference between overt and covert racism in LLMs, and that mitigating overt stereotypes does not translate to mitigating covert stereotypes. Effectively, attempts to train against explicit bias are masking the hidden biases that remain baked in.
“As the stakes of the decisions entrusted to language models rise, so does the concern that they mirror or even amplify human biases encoded in the data they were trained on, thereby perpetuating discrimination against racialized, gendered and other minoritized social groups,” the scientists said in the paper.
Prejudice baked into AI training data is a longstanding concern, especially as the technologies become more widely used. Previous research into AI bias has concentrated on overt instances of racism. One common test method is to name a racial group,…