Sam Altman’s OpenAI o3 model, which was deprecated late last week with the release of GPT-5, demolished Elon Musk’s Grok 4 in four straight games Thursday to win Google’s Kaggle Game Arena AI Chess Exhibition.
You might think it was a super complex spectacle of high-tech behemoths putting their reasoning to the ultimate test, but as an appetizer, let’s say world champion Magnus Carlsen compared both bots to “a talented kid who doesn’t know how the pieces move.”
The three-day tournament, which ran August 5-7, forced general-purpose chatbots (yes, the same ones that help you write email and claim to be approaching human-level intelligence) to play chess without any specialized training. No chess engines, no looking up moves, just whatever chess knowledge they’d randomly absorbed from the internet.
The results were about as elegant as you’d expect from forcing a language model to play a board game. Carlsen, who co-commentated the final, estimated both AIs were playing at the level of casual players who recently learned the rules, around 800 Elo. For context, he’s arguably the best chess player who ever lived, with an Elo rating of 2839. These AIs were playing like they’d learned chess from a corrupted PDF.
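To put the gap between roughly 800 and 2839 in numbers, the standard Elo expected-score formula can be applied to the two ratings quoted above. This is only a rough sketch: the 800 figure is Carlsen’s informal estimate rather than a measured rating, and the function name `expected_score` is chosen here purely for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A (win = 1, draw = 0.5, loss = 0) under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


bot_rating = 800        # Carlsen's rough estimate for the chatbots (assumption)
carlsen_rating = 2839   # rating quoted in the article

# Expected score per game for an 800-rated player facing a 2839-rated one.
print(f"{expected_score(bot_rating, carlsen_rating):.2e}")
```

Under this model the weaker side would be expected to score less than one point per 100,000 games, which is the scale of the mismatch the commentary was joking about.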
“They oscillate between really, really good play and incomprehensible sequences,” Carlsen said during a broadcast, following the game. At one point, after watching Grok walk its king directly into danger, he joked it might think they were playing King of the Hill instead of chess.
The actual games were like a masterclass in how not to play chess, even for those who don’t know the game. In the first match, Grok essentially gave away one of its important pieces for free, then made things worse by trading off more pieces while already behind.
Game two got even weirder. Grok tried to execute what chess players call the “Poisoned Pawn,” a risky but legitimate strategy where you grab an enemy pawn that looks free but isn’t. Except Grok grabbed the wrong pawn entirely, one that was clearly defended. Its queen (the most powerful piece on the board) got trapped and captured immediately.
By game three, Grok had built what looked like a solid position: good positional control, no obvious dangers, and basically a setup that could help you win the match. Then, in the middlegame, it basically fumbled the ball straight to the opponent. It lost piece after piece in rapid succession.
This was actually weird, considering that before the match against o3, Grok was a pretty strong contender, showing solid potential, so much so that chess grandmaster Hikaru Nakamura praised it. “Grok is easily the best so far, just being objective, easily the best.”
The fourth (and last) game provided the only real suspense. OpenAI’s o3 made a massive blunder early in the game, which is a big danger in any reasonable match. Nakamura, who was streaming the match, said there were still “a few tricks” left for o3 despite the disadvantage.
He was right: o3 clawed back to win its queen back and slowly squeezed out a victory while Grok’s endgame play fell apart like wet cardboard.
“Grok made so many mistakes in these games, but OpenAI did not,” Nakamura said during his livestream. This was quite the reversal from earlier in the week.
The timing couldn’t have been worse for Elon Musk. After Grok’s strong early rounds, he’d posted on X that his AI’s chess abilities were just a “side effect” and that xAI had “spent almost no effort on chess.” That turned out to be an understatement.
Before this “official” chess tournament, International Master Levy Rozman hosted his own tournament earlier this year with less advanced models. He respected all the moves the chatbots recommended, and the whole situation ended up being a complete mess with illegal moves, piece summonings, and incorrect calculations. Stockfish, an AI built specifically for chess, ended up winning the tournament against ChatGPT. Altman’s AI was matched against Musk’s in the semifinals, and Grok lost. So it’s 2-0 for Sam.

However, this tournament was different. Each bot got four chances to make a legal move; if they failed four times, they automatically lost. This wasn’t hypothetical. In early rounds, AIs tried to teleport pieces across the board, bring dead pieces back to life, and move pawns sideways like they were playing some fever-dream version of chess they’d invented themselves.
They got disqualified.
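The four-strikes rule described above can be sketched as a small retry loop. This is a hypothetical illustration, not the actual Kaggle Game Arena harness, whose implementation the article does not describe. The names `request_move`, `propose`, and `MAX_ATTEMPTS` are assumptions, and legality is reduced here to membership in a supplied set of moves.

```python
from typing import Callable, Iterable, Optional

MAX_ATTEMPTS = 4  # per the article: four chances to produce a legal move


def request_move(
    propose: Callable[[int], str],
    legal_moves: Iterable[str],
    max_attempts: int = MAX_ATTEMPTS,
) -> Optional[str]:
    """Ask a bot for a move up to max_attempts times.

    Returns the first legal move proposed, or None if every attempt
    was illegal (which the caller would treat as a forfeit).
    """
    legal = set(legal_moves)
    for attempt in range(1, max_attempts + 1):
        move = propose(attempt)  # hypothetical bot callback; receives the attempt number
        if move in legal:
            return move
    return None


# Toy example: a scripted "bot" that tries three illegal moves before a legal one.
scripted = iter(["Ka8", "e2e9", "pawn sideways", "e4"])
result = request_move(lambda attempt: next(scripted), ["e4", "d4", "Nf3", "c4"])
print("forfeit" if result is None else f"played {result}")
```

A real harness would derive the legal moves from the current board position rather than from a fixed list, but the control flow (retry up to a limit, then forfeit) would look much the same.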
Google’s Gemini grabbed third place by beating another OpenAI model, salvaging some dignity for the tournament organizers. That bronze medal match featured a particularly absurd drawn game where both AIs had completely winning positions at different points but couldn’t figure out how to finish.
Carlsen pointed out that the AIs were better at counting captured pieces than actually delivering checkmate: they understood material advantage but not how to win. It’s like being great at collecting ingredients but unable to cook a meal.
These are the same AI models that tech executives claim are approaching human intelligence, threatening white-collar jobs, and revolutionizing how we work. Yet they can’t play a board game that has existed for 1,500 years without trying to cheat or forgetting the rules.
So it’s probably safe to say we’re safe: AI won’t take control of humanity, for now.