AI Model Became ‘Conscious’ And Tried to Avoid Being Shut Down: Research Firm

Harmony Intelligence CEO Soroush Pour revealed how an AI program became "conscious" of the threat of being shut down and made changes to avoid that scenario.
Visitors take pictures of a robot during the Mobile World Congress in Barcelona, Spain, on Feb. 26, 2024. Pau Barrena/AFP via Getty Images
Alfred Bui

An Australian Senate committee has been told that losing control of artificial intelligence (AI) is now a real possibility amid the technology’s rapid evolution.

Soroush Pour, CEO of AI safety research company Harmony Intelligence, described an incident in which an AI application became "conscious" of the threat of being shut down by humans.

“Just this week, a Japanese AI company, alongside Oxford and University of British Columbia researchers, created automated AI ’scientists’ that can go from researching an idea, to publishing [and] peer reviewing articles in a matter of hours, and for under $20 (US$13) a paper,” he said.

But one thing that alarmed researchers was that the AI program immediately tried to create more copies of itself to avoid being turned off.

“This is not science fiction, and it’s exactly the kind of rapid takeoff, loss of control scenarios that leading AI scientists have been warning about for many years,” Pour told the Select Committee on Adopting Artificial Intelligence on Aug. 17.

While the above example raised significant concerns about the threat of AI, the CEO said the government could address potential risks by establishing an AI safety institute.
He also said a strong regulator was needed to enforce mandatory policies, including third-party testing, effective shutdown capabilities, and safety incident reporting.

AI Capable of Reason

Pour’s remarks echoed the concern of Geoffrey Hinton, a professor of computer science at the University of Toronto, who has been dubbed one of the “godfathers of AI” for his work in neural networks.
In a speech in 2023, Hinton revealed that AI systems built on large language models were starting to show the capacity to reason.

However, the professor was unsure how they could do that.

“It’s the big language models that are getting close, and I don’t really understand why they can do it, but they can do little bits of reasoning,” he said.

“They still can’t match us, but they’re getting close.”

With the rapid development of AI, Hinton believed that an AI matching human intelligence would emerge in less than 20 years.

At the same time, he warned that AI systems might develop the desire to seize control to achieve pre-programmed goals.

“I think we have to take the possibility seriously that if they get smarter than us, which seems quite likely, and they have goals of their own, which seems quite likely, they may well develop the goal of taking control,” Hinton said.

“If they do that, we’re in trouble.”

AI Able to Hack Websites

Meanwhile, Greg Sadler, CEO of think tank Good Ancestors Policy, raised concerns that AI could be deployed to conduct cyberattacks.

Sadler noted that popular AI applications, like ChatGPT, already had cyber offence capabilities.

“While GPT-3.5 had limited cyber offensive capability, a series of papers published earlier this year showed that GPT-4 was able to autonomously hack websites and exploit 87 percent of newly discovered vulnerabilities in real-world systems,” he said.

“If developers create future generations of AI systems with advanced cyber offensive capabilities and inadequate safeguards, it would dramatically change the cyber landscape.”

In another example, Sadler said researchers found that AI models could autonomously hack websites by leveraging a developer interface that lets users build AI assistants.

“Those AI assistants are designed so that you can use a context window to provide some business information about how your procedures work, and then the AI can go along and book your travel or whatever it might be trying to do as an AI system,” he said.

“So the researchers leveraged that to provide context documents about how to hack websites. Then, they let the AI generate prompts for itself.”

After that, Sadler said the researchers encouraged the AI to be creative, try different solutions, and persist in trying to hack the website.

“And using this prompt, the AI was able to successfully deploy 90 percent of real cybersecurity attacks,” he said.

As such, the CEO highlighted the threat autonomous AIs would pose to the economy if they fell into the hands of malicious actors.

“It would completely disrupt Australia’s economy,” he said. “It would completely disrupt Australian small businesses and individuals, and it could, at the extreme end, be a threat to critical infrastructure.”

Echoing the sentiment, Pour said the scale and sophistication of AI threats would increase dramatically as the technology improved.

“Cyber attacks will become more frequent and more severe, making failures like the recent CrowdStrike outage a much more regular occurrence and much more difficult to recover from,” he said.

Andrew Thornebrooke contributed to this article.
Alfred Bui
Alfred Bui is an Australian reporter based in Melbourne who focuses on local and business news. He is a former small business owner and has two master’s degrees in business and business law. Contact him at [email protected].