By curating the data that artificial intelligence (AI) uses to learn, tech companies like Google can bias the AI to censor information flowing on the internet, a Google whistleblower says.
While Zach Vorhies was working at Google, he grew concerned about how the company was curating data to produce AI biased toward social justice or leftist values and toward certain narratives.
“If you want to create an AI that’s got social justice values … you’re going to only feed it information that confirms that bias. So by biasing the information, you can bias the AI,” Vorhies explained.
“You can’t have an AI that collects the full breadth of information and then becomes biased, despite the fact that the information is unbiased.”
AI Talkback Gets It Into Trouble
In 2017, Tencent, a Chinese big tech company, shut down an AI service after it started to criticize the Chinese Communist Party.

When a user posted a message saying, “Hurray for the Communist Party,” Tencent’s chatbot replied, “Are you sure you want to hurray to such a corrupt and incompetent [political system]?” according to reports at the time.
When the user asked the AI program about Chinese leader Xi Jinping’s “Chinese Dream” slogan, the AI wrote back that the dream meant “immigrating to the United States.”
What Is Machine Learning Fairness?
ML fairness, as applied by Google, is a system that uses artificial intelligence to censor information processed by the company’s main products such as Google Search, Google News, and YouTube, Vorhies said.

It classifies all data found on the platform to determine which information is to be amplified and which is to be suppressed, Vorhies explained.
Machine learning fairness constantly reshapes what can be found on the internet, so the results displayed for a query may differ from those returned for the same query in the past, he said.
If a user searches for neutral topics—for example, baking—the system will give the person more information about baking, Vorhies said. However, if someone looks for blacklisted items or politically sensitive content, the system will “try not to give [the user] more of that content” and will present alternative content instead.
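A rough sketch of the kind of steering Vorhies describes might look like the following; the blacklist, function names, and content pools here are invented for illustration and are not drawn from any actual Google system.

```python
# Hypothetical illustration only -- not Google's actual code or API.
BLACKLISTED_TOPICS = {"politically_sensitive_topic_x"}  # assumed label set

def recommend(query_topic, matching_items, alternative_items):
    """Return more of what the user asked for, unless the topic is flagged."""
    if query_topic in BLACKLISTED_TOPICS:
        # Suppression path: present alternative content instead of more of the same.
        return alternative_items
    # Neutral path: amplify by returning more items on the same topic.
    return matching_items

# A neutral query ("baking") gets more baking content; a flagged query gets substitutes.
print(recommend("baking", ["sourdough guide", "cake recipe"], ["unrelated"]))
print(recommend("politically_sensitive_topic_x", ["matching post"], ["alternative post"]))
```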
Using machine learning fairness, a tech company “can shift that Overton window to the left,” Vorhies said. “Then people like us are essentially programmed by it.” The Overton window refers to the range of political policies considered acceptable in public discourse at a given time.
Some experts in machine learning believe that data collected from the real world already reflects biases that exist in society; thus, systems that use such data as is could be unfair.
Accuracy May Be Problematic
If AI uses “an accurate machine learning model” to learn from existing data collected from the real world, it “may learn or even amplify problematic pre-existing biases in the data based on race, gender, religion or other characteristics,” Google says on its “ai.google” website, under “Responsible AI practices.”

“The risk is that any unfairness in such systems can also have a wide-scale impact. Thus, as the impact of AI increases across sectors and societies, it is critical to work towards systems that are fair and inclusive for all,” the site says.
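A toy example can make the quoted point concrete: a model that faithfully fits skewed historical data reproduces the skew. Everything below, including the groups, outcomes, and counts, is invented for illustration.

```python
# Toy illustration: a model that accurately fits biased historical data
# learns that bias as-is. All records here are fabricated for the example.
from collections import Counter, defaultdict

historical_decisions = [  # (group, past_outcome) -- invented data
    ("group_a", "approve"), ("group_a", "approve"), ("group_a", "approve"), ("group_a", "deny"),
    ("group_b", "deny"),    ("group_b", "deny"),    ("group_b", "approve"), ("group_b", "deny"),
]

# "Train" the simplest possible model: predict the most common past outcome per group.
counts = defaultdict(Counter)
for group, outcome in historical_decisions:
    counts[group][outcome] += 1
model = {group: c.most_common(1)[0][0] for group, c in counts.items()}

print(model)  # {'group_a': 'approve', 'group_b': 'deny'} -- the historical disparity is reproduced
```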
If an app selects an adult book for children to read, it may expose them to age-inappropriate content and may upset their parents. However, according to the company’s inclusive ML guide, flagging children’s books that contain LGBT themes as inappropriate is also “problematic.”
How AI Censorship Works
A former senior engineer at Google and YouTube, Vorhies said: “Censoring is super expensive. You literally have to go through all the pieces of information that you have, and curate it.”

Labeling groups of data into categories facilitates machine learning in AI. For instance, the AI for self-driving cars uses labeling to distinguish between a person, the street, a car, or the sky. It labels key features of those objects and looks for similarities between them. Labeling can be performed manually or assisted by software.
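A minimal sketch of what labeled training data and a similarity-based classifier could look like, using the self-driving example above; the feature names and the nearest-neighbor shortcut are assumptions made for illustration, not Google's tooling.

```python
# Hypothetical sketch of data labeling: humans (or labeling software) assign a
# category to each raw example before a model learns from it. Invented features.
labeled_examples = [
    {"features": {"height_px": 180, "moves": True,  "on_road": False}, "label": "person"},
    {"features": {"height_px": 40,  "moves": False, "on_road": True},  "label": "street"},
    {"features": {"height_px": 150, "moves": True,  "on_road": True},  "label": "car"},
    {"features": {"height_px": 600, "moves": False, "on_road": False}, "label": "sky"},
]

def nearest_label(features, examples):
    """Classify a new object by its similarity to labeled examples (1-nearest-neighbor sketch)."""
    def distance(a, b):
        return sum(abs(float(a[k]) - float(b[k])) for k in a)
    best = min(examples, key=lambda ex: distance(features, ex["features"]))
    return best["label"]

print(nearest_label({"height_px": 160, "moves": True, "on_road": True}, labeled_examples))  # -> "car"
```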
Suppression of a person on social media is carried out by AI based on data labels curated by the company’s staff, Vorhies explained. The AI then decides whether the person’s posts are allowed to trend or will be de-amplified.
Vorhies worked at YouTube from 2016 to 2019, and said the company applied similar practices.
YouTube, a Google subsidiary, had something like a “dashboard of classifications that were being generated by their machine learning fairness,” the whistleblower said. The AI knew, based on history and the current content, how to label a person, e.g., as a right-wing talk show host, he explained.
“Then someone sitting in the back room—I don’t know who this was—was doing the knobs of what is allowed to get amplified, based upon [their] personal interests.”
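Purely as a hypothetical illustration of the “knobs” Vorhies describes, per-classification amplification could be expressed as a table of multipliers that an operator tunes; the labels and numbers below are invented.

```python
# Invented example of per-classification amplification "knobs" -- not a real system.
amplification_knobs = {
    "right_wing_talk_show_host": 0.2,   # de-amplified
    "mainstream_news_outlet":    1.5,   # boosted
    "cooking_channel":           1.0,   # untouched
}

def adjusted_reach(base_reach, creator_label):
    """Scale a post's organic reach by the knob set for its creator's classification."""
    return int(base_reach * amplification_knobs.get(creator_label, 1.0))

print(adjusted_reach(10_000, "right_wing_talk_show_host"))  # 2000
print(adjusted_reach(10_000, "mainstream_news_outlet"))     # 15000
```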
Psychological Warfare
Google’s search engine considers mainstream media authoritative and boosts content accordingly, Vorhies said. “These mainstream, leftist organizations are ranked within Google as having the highest authoritative value.”

For example, if someone searches for information about a local election, “the first five links [in the search results] are going to be what the mainstream media has to say about that,” Vorhies said. “So they can redefine reality.”
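A hedged sketch of authority-weighted ranking as Vorhies characterizes it: relevance is scaled by a per-source weight, so highly weighted outlets surface first. The sources, weights, and scores below are invented for illustration and do not reflect Google's actual ranking signals.

```python
# Invented example of authority-weighted ranking -- not Google's actual algorithm.
authority_weight = {"BigOutletNews": 3.0, "LocalBlog": 1.0}

results = [
    {"source": "LocalBlog",     "title": "Local election: candidate forum notes", "relevance": 0.9},
    {"source": "BigOutletNews", "title": "What to know about the local election", "relevance": 0.6},
]

# Sort by relevance scaled by the source's authority weight.
ranked = sorted(
    results,
    key=lambda r: r["relevance"] * authority_weight.get(r["source"], 1.0),
    reverse=True,
)
for r in ranked:
    print(r["source"], "-", r["title"])  # BigOutletNews ranks first despite lower raw relevance
```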
If Wikipedia changes its view on something and starts calling it a “conspiracy theory and not real,” people will be confused about what to think of it. Most do not know that psychological warfare and an influence operation are directly targeting their minds, Vorhies said.
The Epoch Times reached out to Google for comment.