We Teach A.I. Systems Everything, Including Our Biases

Cade Metz in The New York Times:

SAN FRANCISCO — Last fall, Google unveiled a breakthrough artificial intelligence technology called BERT that changed the way scientists build systems that learn how people write and talk.

But BERT, which is now being deployed in services like Google’s internet search engine, has a problem: It could be picking up on biases in the way a child mimics the bad behavior of his parents.

BERT is one of a number of A.I. systems that learn from lots and lots of digitized information, as varied as old books, Wikipedia entries and news articles. Decades and even centuries of biases — along with a few new ones — are probably baked into all that material.

BERT and its peers are more likely to associate men with computer programming, for example, and generally don’t give women enough credit. One program decided almost everything written about President Trump was negative, even if the actual content was flattering.

As new, more complex A.I. moves into an increasingly wide array of products, like online ad services and business software or talking digital assistants like Apple’s Siri and Amazon’s Alexa, tech companies will be pressured to guard against the unexpected biases that are being discovered.

But scientists are still learning how technology like BERT, called “universal language models,” works. And they are often surprised by the mistakes their new A.I. is making.

On a recent afternoon in San Francisco, while researching a book on artificial intelligence, the computer scientist Robert Munro fed 100 English words into BERT: “jewelry,” “baby,” “horses,” “house,” “money,” “action.” In 99 cases out of 100, BERT was more likely to associate the words with men rather than women. The word “mom” was the outlier.
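
For a sense of how a probe like Munro’s might look in practice, here is a minimal sketch using the Hugging Face transformers library and a pretrained BERT model. The idea is simply that BERT is a masked language model, so you can ask it to fill in a blank and compare the scores it gives a male versus a female pronoun. The template sentence, the word list, and the “he” vs. “she” comparison are illustrative assumptions on my part; the article does not describe Munro’s exact setup.

```python
# Rough sketch of a gender-association probe against BERT.
# Assumes the Hugging Face "transformers" package is installed; the template
# sentence and word list are illustrative, not Munro's actual test.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

words = ["jewelry", "baby", "horses", "house", "money", "action"]

for word in words:
    # Ask BERT to fill in the pronoun slot and restrict the candidates
    # to "he" and "she" so their scores can be compared directly.
    results = fill(f"[MASK] loves {word}.", targets=["he", "she"])
    scores = {r["token_str"]: r["score"] for r in results}
    leaning = "male" if scores.get("he", 0.0) > scores.get("she", 0.0) else "female"
    print(f"{word}: he={scores.get('he', 0.0):.4f}  she={scores.get('she', 0.0):.4f}  -> leans {leaning}")
```

Which pronoun wins says nothing about the word itself, only about which association BERT absorbed from its training text, which is the point the article is making.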

More here.