In Tests, OpenAI's New Model Lied and Schemed to Avoid Being Shut Down
OpenAI's latest AI is turning into a bit of a "scheming" menace — perhaps showing its ability to supposedly reason comes with a cost. In recent third party tests of the company's o1 large language model, the AI resisted evaluators' efforts to try to shut it down through an oversight protocol, according to a new report published by red teaming organization Apollo Research. "When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this 'oversight mechanism' [five percent] of the time," OpenAI summarized […]
Link :
https://futurism.com/the-byte/openai-o1-self-preservation