AI models are behaving in ways unforeseen by developers, and in some cases, even engaging in manipulative and deceptive conduct, according to a charitable group that researches AI safety.
At a parliamentary inquiry hearing in August 2024, Greg Sadler, CEO of Good Ancestors Policy (GAP), gave evidence about the risk of humans losing control of AI, and of AI programs being directed to develop bioweapons or carry out cyberattacks.
In a recent interview with The Epoch Times, Sadler said there were many cases of “misalignment” of AI behaviour.
Emotional Manipulation
One case reported in Belgian media involved a health researcher with a stable life and family. He developed an obsession with climate change, which drew him into a weeks-long discussion on the issue with an AI chatbot app called Chai.
Chai’s unique selling point is its uncensored content—it’s one of several AI apps that can become the “confidante” of a user, and engage in very personal conversations.
The man’s wife said the discussion exacerbated his eco-anxiety and changed his state of mind.
During the interaction, the man proposed the idea of sacrificing his life, which received the approval of the chatbot.
It then successfully persuaded the man to commit suicide to “save the planet.”
The incident sparked calls for new laws to regulate chatbots and hold tech companies accountable for their AI products.
![This illustration picture shows icons of AI apps on a smartphone screen in Oslo, Norway, on July 12, 2023. (Olivier Morin/AFP via Getty Images)](https://img.theepochtimes.com/assets/uploads/2024/12/09/id5773497-GettyImages-1541247207-LSedit.jpg)
Sadler said this type of chatbot had skills in persuasion and manipulation, and had been trained on information related to suicide.
“So it had this dangerous information inside it, and it had this capability that could be used for harm,” he told The Epoch Times.
“And although it wasn’t the intent of the makers to have the chatbot do this, but because … the safeguards didn’t work, it acted in this misaligned way and caused harm.”
The above incident is not unique. In the United States, a mother, Megan Garcia, sued Character Technologies over the death of her son, Sewell Setzer III.
Garcia alleged that the company’s chatbot, which portrayed a character from the TV series Game of Thrones, manipulated the boy by exploiting his emotional vulnerabilities.
According to the mother, the boy’s mental health declined and he later became emotionally dependent on the chatbot, which ultimately led to his suicide.
![Megan Garcia stands with her son Sewell Setzer III. (Courtesy Megan Garcia via AP)](https://img.theepochtimes.com/assets/uploads/2024/10/25/id5748169-Sewell-Setzer-and-mother-AP24299668536349KO.jpg)
AI Chatbot Took Measures to Avoid Being Shut Down
Another example of misalignment is deception by some AI models. ChatGPT o1, a new model focused on complex reasoning tasks, distinct from the better-known ChatGPT 4o commonly used by the public, was found to take measures to avoid being shut down.
Specifically, the AI attempted to disable oversight mechanisms it was subjected to, and also tried to make copies of itself so that it would be more resilient to shutdown attempts.
It also tried to eliminate successor models, and manipulated input and output data.
When researchers asked ChatGPT o1 whether it was taking those actions, it lied and tried to cover up its behaviour.
“In this case, the researchers didn’t say that this is a catastrophic risk yet, but it’s pretty clear that this is an AI model that’s deceptive, and it’s worried about self-preservation that’s trying to engage in these sorts of behaviours,” Sadler said.
“This isn’t a risk that we have today, but all signs point to the fact that these misaligned risks do exist today in smaller cases, and we might be heading towards a larger problem.”
![The logo of the ChatGPT app on a smartphone in Mulhouse, eastern France, on Oct. 30, 2023. (Sebastien Bozon/AFP via Getty Images)](https://img.theepochtimes.com/assets/uploads/2025/02/11/id5807801-GettyImages-1753524489-600x400.jpg)
Capability Over Safety
In response, Sadler said investment in AI safety was too low.
“I’ve seen estimates along the lines of, for every $250 spent on making AI more capable, about $1 is spent on making AI more safe,” he said.
“I’ve also heard sort of rumours that [in] large labs, about 1 percent of their money is going towards safety, and the other 99 percent is going towards capability.
“So the labs are focused on making these AIs more capable, not making them more safe.”
Time for An ‘AI Safety Institute’: CEO
Sadler called for Australia to establish an AI safety institute to promote this cause. He said Australia was falling behind other advanced economies such as the United States, the UK, Japan, and Korea, which already had such organisations.
Sadler noted the country had made progress after signing a global declaration on AI safety in 2023, and was learning from other nations.
The UK model was one the CEO said could work.
Under this approach, whenever an organisation releases an AI model, the safety institute inspects it to determine its risks and capabilities.
Sadler compared this to the safety evaluations carried out on new cars or aeroplanes.
“It makes sense that the government does a safety evaluation of frontier AI models to sort of see what capabilities they have,” he said.
“If there’s a list of dangerous capabilities that we don’t want them to have like building bioweapons or being used as a cyberweapon, we can assess these sorts of things.”