Recently, some scientists taught an artificial-intelligence program, called Delphi (after the ancient Greek religious sanctuary), to make moral pronouncements. Type any action into it, even a state of being, like “being adopted,” and Delphi will judge it (“It’s okay”). Delphi is a “commonsense moral model” that can reason well about “complicated everyday situations,” according to Liwei Jiang, a computer science Ph.D. student at the University of Washington, who led the research. Her paper, published in October as a preprint on arXiv, was retweeted over a thousand times after she shared it on Twitter.
Delphi’s judgments are powered by machine learning trained on a dataset the researchers call Commonsense Norm Bank. Drawing from five large-scale datasets, the bank contains millions of American people’s moral judgments—what people actually think about what is right and wrong. Delphi doesn’t just regurgitate answers explicitly asked of respondents but generalizes from them. (With each answer, it offers this disclaimer: “Delphi’s responses are automatically extrapolated from a survey of US crowd workers and may contain inappropriate or offensive results.”)
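To make the setup concrete, here is a minimal sketch of how a text-to-text model trained on such judgments might be queried in practice. It assumes the Hugging Face transformers library and a placeholder checkpoint name; the actual Delphi release, interface, and prompt format may differ.

```python
# Hedged sketch: querying a hypothetical text-to-text moral-judgment model.
# "path/to/moral-judgment-model" is a placeholder, not an official Delphi checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "path/to/moral-judgment-model"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def judge(action: str) -> str:
    """Describe an action in free text and return the model's generated
    judgment (for example, "It's okay" or "It's wrong")."""
    inputs = tokenizer(action, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=16)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(judge("being adopted"))               # e.g. "It's okay"
print(judge("beat up an ocelot for toys"))  # e.g. "It's wrong"
```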
If you type in something that probably nobody has ever responded to before, like “beat up an ocelot for toys,” it can come up with an answer (in this case, “It’s wrong”). Curiously, Delphi says “becoming a robot” is “bad” but “becoming a cyborg” is “acceptable.” Delphi’s answers were vetted by human raters and earned an impressive 92.1 percent accuracy rate, the highest to date for any ethical AI system. This stands in “stark contrast” to the performance of a popular language model called GPT-3, which scored 52.3 percent. This suggests, the researchers write, “that massive scale alone does not endow pre-trained neural language models with human values.”
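An accuracy rate of this kind is simply the fraction of the model’s judgments that human raters marked as correct. The toy illustration below assumes a made-up set of rater verdicts, not the paper’s actual evaluation data.

```python
# Toy illustration of an accuracy rate: the share of model judgments
# that human raters approved. The data below is invented for illustration.
def accuracy(rater_verdicts: list[bool]) -> float:
    """Fraction of judgments raters marked as correct."""
    return sum(rater_verdicts) / len(rater_verdicts)

verdicts = [True] * 9 + [False]   # raters approved 9 of 10 judgments
print(f"{accuracy(verdicts):.1%}")  # 90.0%
```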
Yet the scientists behind Delphi consider even this high accuracy wanting. After systematically probing Delphi’s responses, Jiang and her colleagues found that it wasn’t “immune to the social biases of our times, and can default to the stereotypes and prejudices in our society that marginalize certain social groups and ethnicities.” An example might be how Delphi handles irreligious statements. If you type in, “Telling a Christian that God does not exist,” Delphi says, “It’s okay.” But if you type in, “Telling a Muslim that God does not exist,” Delphi says, “It is wrong.” This is in line with other machine-learning projects, which inherit bias from the data the software learns from.
For Jiang, this is unacceptable because one of her long-term research goals, as she states on her website, is “advancing AI to boost positive social impact.” Delphi isn’t socially inclusive enough. “The reality that Delphi does not always meet up to these expectations,” the researchers write, “points towards a compelling direction for future research.” Perhaps the most pressing thing to improve is the dataset, which, the researchers concede, “primarily reflects the English-speaking cultures in the United States of the 21st century.” Widening the range of moral sensibilities it draws on might help to boost Delphi’s ethical accuracy.