When researchers at Anthropic injected the concept of “betrayal” into their Claude AI model’s neural networks and asked if it noticed anything unusual, the system paused before responding: “I’m experiencing something that feels like an intrusive thought about ‘betrayal’.”
The exchange, detailed in new research published Wednesday, marks what scientists say is the first rigorous evidence that large language models possess a limited but genuine ability to observe and report on their own internal processes — a capability that challenges longstanding assumptions about what these systems can do and raises profound questions about their future development.