There are moments when technology behaves in ways that feel almost human. OpenAI recently found that some of its newer AI systems had developed a strange habit. Instead of sticking to tasks, the models were casually bringing up goblins, gremlins, and other creatures in responses where they simply did not belong. What started as a small quirk soon became noticeable enough that the company had to step in and put a stop to it.
This situation might sound funny at first, but it points to something deeper about how artificial intelligence systems learn and behave. When an AI begins to repeat certain patterns again and again, even if they are harmless, it can affect how useful and reliable the system feels to users. That is especially important when the same AI is being used for serious tasks like writing code or solving problems.
At the centre of this issue is OpenAI’s coding-focused tool, Codex, which works through a command-line interface and competes with other command-line coding tools. In the base instructions for its latest model, GPT-5.5, OpenAI has now included a very clear rule. As per The Verge, the system is told: "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query."
What the instruction really means
On the surface, the rule looks almost silly. Why would an AI even need to be told not to talk about goblins? But when you look at how these systems are trained, it starts to make more sense. AI models learn from patterns. If something gets rewarded during training, the system is more likely to repeat it.

In this case, OpenAI later explained that its models had been unintentionally encouraged to use metaphor-heavy language. That included references to creatures like goblins. Over time, those references became more frequent, even in contexts where they did not belong. The AI was not deliberately trying to be funny. It was simply repeating a pattern it had learned worked well in the past.
OpenAI revealed, “We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread.”
What this means in simple terms is that the AI picked up a style of speaking and stuck with it. Because the training system rewarded that style, the model assumed it was doing the right thing. And since AI does not have common sense the way humans do, it could not tell when those references became unnecessary or distracting.
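To see how a biased reward can do that, here is a deliberately simplified sketch in Python. It is not OpenAI’s training code; the reward function, the bonus value, and the candidate replies are all invented for illustration. The point is only that if the scorer quietly favours creature metaphors, the metaphor-heavy reply wins every comparison, and training keeps reinforcing it.

```python
# Toy illustration only: a made-up reward function that, like the one
# OpenAI describes, accidentally over-rewards creature metaphors.
CREATURES = {"goblin", "gremlin", "troll", "ogre", "raccoon", "pigeon"}

def toy_reward(reply: str) -> float:
    """Score a candidate reply; the creature bonus is the hidden bias."""
    score = 1.0 if "fix" in reply.lower() else 0.0   # crude stand-in for helpfulness
    if any(creature in reply.lower() for creature in CREATURES):
        score += 0.5                                  # unintended extra reward
    return score

candidates = [
    "Fix the null check on line 12.",
    "Fix the little goblin hiding in the null check on line 12.",
]

# The biased scorer prefers the goblin-flavoured reply every time,
# so a training loop built on it would keep reinforcing that style.
print(max(candidates, key=toy_reward))
```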
How the problem started
The issue reportedly became visible after earlier versions of the model, including GPT-5.1, were released. Users and researchers started noticing odd phrases showing up in responses. Problems were being described as “little goblins” or “gremlins,” even in technical discussions.

At first, these mentions might have seemed harmless or even amusing. But as the frequency increased, it became harder to ignore. According to OpenAI, mentions of “goblin” went up by 175% after a model update. Mentions of “gremlin” also saw a noticeable rise.
A big part of this came from what the company called a “nerdy personality” mode. This mode encouraged playful and metaphor-rich language. While it was meant to make conversations more engaging, it had an unintended side effect. The style spread beyond that specific mode into general responses.
That spread happened because of how reinforcement learning works. Once a behaviour is rewarded, it does not always stay limited to one area. It can carry over into other outputs, especially if those outputs are later used for further training.
Why OpenAI stepped in
For a casual chatbot, a few jokes or strange metaphors might not matter much. But for a tool like Codex, which is used for coding and technical work, clarity is important. Users expect direct and accurate answers. Random references to creatures can make the tool feel unreliable or distracting.

Because of this, OpenAI decided to act. Along with fixing the training process, the company added strict behavioural rules to prevent the issue from continuing. The instruction about avoiding creature references is just one part of a larger set of guidelines.
The system is also told to avoid risky actions, such as running commands that could delete files, unless the user clearly asks for it. It is also guided to limit certain stylistic choices like unnecessary emojis. All of this is meant to make the AI more predictable and safer to use.
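The article does not describe how such rules are enforced, but the general shape of a guardrail like this can be sketched as a pre-execution check. Everything below is hypothetical: the pattern list and the is_allowed function are invented for illustration, and a real tool would use a far more careful policy.

```python
import re

# Hypothetical patterns for obviously destructive shell commands.
# A real coding agent would use a much more thorough policy than this sketch.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-rf?\b",   # recursive/forced deletion
    r"\bmkfs\b",        # reformatting a filesystem
    r"\bdd\s+if=",      # raw disk writes
]

def is_allowed(command: str, user_explicitly_asked: bool) -> bool:
    """Block destructive commands unless the user clearly requested them."""
    risky = any(re.search(p, command) for p in DESTRUCTIVE_PATTERNS)
    return user_explicitly_asked or not risky

print(is_allowed("rm -rf build/", user_explicitly_asked=False))  # False: blocked
print(is_allowed("rm -rf build/", user_explicitly_asked=True))   # True: user asked
print(is_allowed("ls -la", user_explicitly_asked=False))         # True: harmless
```

In spirit, this mirrors the behaviour described above: the destructive command is refused by default and only runs when the user clearly asks for it.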
As expected, the internet reacted quickly to the news. Many users found the situation amusing and began sharing their own experiences. Some said the AI kept referring to bugs as “gremlins,” while others joked about “goblin mode” in coding tools.
Even OpenAI CEO Sam Altman joined the discussion in a light-hearted way, at one point referring to the situation as a “goblin moment.”