Research has shown that some AI models can identify phishing websites with near-perfect accuracy when asked directly. When those same models run as autonomous agents with access to tools like email, web browsers, and password vaults, they can still follow the scam through to completion.
That gap is the focus of a new open-source benchmark from 1Password called the Security Comprehension and Awareness Measure, or SCAM. The benchmark tests whether AI agents behave safely during real workflows, including opening emails, clicking links, retrieving stored credentials, and filling out login forms.
Source: 1Password open sources a benchmark to stop AI agents from leaking credentials – Help Net Security

