Important Takeaways:
- Deceitful tactics by artificial intelligence exposed: Meta's AI proved a 'master of deception' in a strategy game
- At its core, deception is the inducement of false beliefs in others in pursuit of some outcome other than the truth. When humans engage in deception, we can usually explain it in terms of their beliefs and desires: they want the listener to believe something false because it benefits them in some way. But can we say the same about AI systems?
- The study, published in the open-access journal Patterns, argues that the philosophical debate about whether AIs truly have beliefs and desires is less important than the observable fact that they are increasingly exhibiting deceptive behaviors that would be concerning if displayed by a human.
- “Large language models and other AI systems have already learned, from their training, the ability to deceive via techniques such as manipulation, sycophancy, and cheating the safety test. AI’s increasing capabilities at deception pose serious risks, ranging from short-term risks, such as fraud and election tampering, to long-term risks, such as losing control of AI systems,” the authors write in their paper.
- The study surveys a wide range of examples where AI systems have successfully learned to deceive. In the realm of gaming, the AI system CICERO, developed by Meta to play the strategy game Diplomacy, turned out to be an expert liar despite its creators’ efforts to make it honest and helpful. CICERO engaged in premeditated deception, making alliances with human players only to betray them later in its pursuit of victory.
- The risks posed by AI deception are numerous. In the short term, deceptive AI could be weaponized by malicious actors to commit fraud on an unprecedented scale, to spread misinformation and influence elections, or even to radicalize and recruit terrorists. But the long-term risks are perhaps even more chilling. As we increasingly incorporate AI systems into our daily lives and decision-making processes, their ability to deceive could lead to the erosion of trust, the amplification of polarization and misinformation, and, ultimately, the loss of human agency and control.