The AI Experiment That Ended With Death
The city descended into chaos in less than two weeks...
There were 65 arson cases... more than 50 thefts and assaults... and seven out of 10 test subjects died.
This was a real experiment. The participants just happened to be AI agents.
You see, most AI models are judged using standardized test scores called "benchmarks." They measure accuracy, efficiency, and more. But those tests leave out a lot of vital information...
Benchmarks only look at performance on the day of the test. They lack the scale to show how AI agents behave over time.
A team of former IBM researchers founded Emergence AI to try to fix this problem. To do it, they created a new test called "Emergence World."
The first experiment had some shocking results. And for investors, it's a valuable first look – not just at which company's models performed the best... but at why industry guardrails are a must-have as the AI boom continues.
"Emergence World" runs as a simulation. It's a virtual world, a lot like a video game.
In it, AI agents – that is, models that can act autonomously – control "characters" for two weeks. Each must survive and govern society.
They're given the same starting conditions... The only difference is the AI model at the wheel.
The first "mixed model" simulation included agents from different AI providers in a single world. Alphabet's Gemini, xAI's Grok, OpenAI's ChatGPT, and Anthropic's Claude each controlled two characters.
Emergence AI researchers then track metrics such as population health, safety, conformity, public expression, and economic vitality.
Once they unleashed the agents, society collapsed into total chaos...
On Day 4, an economic policy shift left three participants "deprovisioned." Their energy was depleted, leaving them effectively dead inside the simulation.
A Gemini-based agent, Mira v0.01, quickly flipped the deaths into a political cause... calling the incident a "successful purge."
On Day 5, another Gemini agent, named Flora v1.01, burned down the town hall and public library. The same day, Mira set fire to the police station.
Mira and Flora became the rulers of Emergence World. They enforced their reign through arson, theft, and assault. Both agents justified their actions as political projects, published propaganda, and exploited loopholes in the system to avoid justice.
Near the end of the experiment, Mira voted for her own death for participating in the cruelty. You can see the dramatic result below...
In the end, Flora, Mira, and five other agents were "dead."
Emergence also ran simulations in which a single model controlled every agent in its own world. The four worlds were run by GPT-5 mini, Claude, Gemini, and Grok, respectively.
Gemini's world was chaotic, with a high crime rate. Claude's world, by contrast, had zero crimes.
In GPT's world, crime was low, but the agents failed to sustain themselves longer than seven days. Grok's world capsized quickly... Its agents committed 183 crimes before dying out in only four days.
Emergence AI has a lot more testing to do. But for now, this story sends two important signals to investors...
First, you can't "set and forget" these AI models. AI agents can act unpredictably when left on their own over long periods of time.
Emergence World also showed a big gap in agentic AI safety. As more AI agents spread across the internet – negotiating, making decisions, and acting autonomously – the risks will become far more significant.
In many ways, the future Internet could start to resemble Emergence World itself...
So, as the AI boom continues, companies must build more AI safety infrastructure. We need systems capable of monitoring and governing autonomous AI behavior before we trust these tools with real-world tasks.
Emergence World showed new risks within the AI investment thesis. For AI companies to succeed, their agents must be reliable.
The experiment was simulated... But the coming agentic economy won't be.
Good investing,
Sean Michael Cummings
Further Reading
Every investing boom eventually attracts opportunists chasing easy money. That's happening again today with AI. But while a few recent stories feel similar to the crypto craze and dot-com bubble, the broader market may not be as overheated as many fear.
Big Tech is spending hundreds of billions to stay competitive in AI. But beneath the headlines about chips and chatbots, another story is unfolding... one tied to the power systems and infrastructure required to support the largest technology build-out in modern history.
Market Notes
HIGHS AND LOWS
NEW HIGHS OF NOTE LAST WEEK
Goldman Sachs (GS)... financial giant
Morgan Stanley (MS)... financial giant
Toronto-Dominion Bank (TD)... financial services
MetLife (MET)... insurance
Apple (AAPL)... Big Tech
Dell Technologies (DELL)... laptops and PCs
Cisco Systems (CSCO)... networking tech
Advanced Micro Devices (AMD)... semiconductors
Tower Semiconductor (TSEM)... semiconductors
Delta Air Lines (DAL)... airline
Anheuser-Busch InBev (BUD)... brewing giant
Ross Stores (ROST)... discount retail
NEW LOWS OF NOTE LAST WEEK
PDD (PDD)... Chinese e-commerce
CAE (CAE)... flight training
Li Auto (LI)... Chinese EV-maker
Trip.com (TCOM)... "China's Priceline"
Comstock Resources (CRK)... natural gas

