Does ChatGPT Have The Potential To Become A New Chess Super Grandmaster?

As a strong ex-chess player (junior champion, Elo 2000+) and NLP data scientist, I’ve been planning to write this article for a while.

The first time I heard about ChatGPT’s ability to play chess was from one of my colleagues, a Ph.D. and a very smart guy. He sent me the link to a webpage where, as he thought, you could play against ChatGPT. Unfortunately, it wasn’t pure ChatGPT; there was another chess engine under the hood. He was deceived. You can still try it here: https://parrotchess.com/

For the purpose of this article, I played 2 games against ChatGPT. Here is how we started:

Let’s take a look at what happened.

Quick chess notation course / reminder (can be skipped):

K = king, Q = queen, R = rook, B = bishop, N = knight, 0–0 = castling kingside, 0–0–0 = castling queenside, x = capturing a piece. For pawns, we just write the square the pawn lands on, except when the pawn captures. In that case, we write the letter of the file the pawn was on before, then x, then the letter and the number of the square it lands on after capturing. For example, exd4.
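To make the reminder concrete, here is a minimal Python sketch that reads a single move written in this notation and says what it means. The function name `describe_move` and the two regular expressions are my own illustration, not part of any chess library, and the sketch deliberately ignores check marks, promotions, and annotations. It only checks the shape of the text, not whether the move is legal on the board, and that is exactly the gap between a move that looks right and a move that can actually be played.

```python
import re

# Piece letters from the notation reminder above.
PIECES = {"K": "king", "Q": "queen", "R": "rook", "B": "bishop", "N": "knight"}

# Piece move: piece letter, optional origin file/rank, optional "x", target square.
PIECE_MOVE = re.compile(r"^([KQRBN])([a-h]?[1-8]?)(x?)([a-h][1-8])$")
# Pawn move: a plain target square, or origin file + "x" + target square.
PAWN_MOVE = re.compile(r"^(?:([a-h])x)?([a-h][1-8])$")


def describe_move(move: str) -> str:
    """Return a plain-English reading of one move, or raise ValueError.

    Illustrative only: this checks the syntax of the move text, not its
    legality in any position.
    """
    # Accept the letter O or the digit 0, joined by a hyphen or an en dash.
    normalized = move.replace("0", "O").replace("\u2013", "-")
    if normalized == "O-O":
        return "castles kingside"
    if normalized == "O-O-O":
        return "castles queenside"

    match = PIECE_MOVE.match(move)
    if match:
        piece, _origin, capture, square = match.groups()
        verb = "captures on" if capture else "moves to"
        return f"{PIECES[piece]} {verb} {square}"

    match = PAWN_MOVE.match(move)
    if match:
        origin_file, square = match.groups()
        if origin_file:
            return f"pawn from the {origin_file}-file captures on {square}"
        return f"pawn moves to {square}"

    raise ValueError(f"not recognized as a move: {move!r}")
```

For example, `describe_move("exd4")` returns `"pawn from the e-file captures on d4"`, and `describe_move("N1c3")` returns `"knight moves to c3"`.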

Nikola Greb vs. ChatGPT 4, the 7th of January 2024

1. e4 e5 2. Nf3 Nc6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3 Bb4 6. Nxc6 bxc6 7. Bd3 O-O 8.
O-O d5 9. e5 Ne4 10. Nxe4 Bc5 11. Nxc5 Qe7 12. Qh5 g6 13. Qh6 f6 14. exf6 Qxf6
15. Bg5 Qf7 16. Rae1 Bf5 17. Re7 Qxe7 18. Bxe7 Rae8 19. Bxf8 Rxf8 20. Bxf5 Rf7
21. Re1 1-0


Until the move e5, ChatGPT 4 played like a very good chess player. We can say like a GM. But when I played an imprecise, yet aggressive move (exd5 was the best move), it lost its footing and blundered a pawn by playing Ne4.

I took the knight with the knight (10. Nxe4) and the first hallucination occurred:

Bc5 was again a mistake, an obvious blunder. Since the rest of the game has no chess value, I’ll sum it up. ChatGPT 4 accused me of impossible moves, and ended up in hallucinations (proposing impossible moves), instead of resigning the game.

Let’s see what happened in game 2, where I played the black pieces:

Nikola Greb vs. ChatGPT 4 (moves 1–9) and ChatGPT 3.5 (moves 10–12), the 7th of January 2024

1. e4 c5 2. Nf3 Nc6 3. d4 cxd4 4. Nxd4 e5 5. Nb5 d6 6. c4 f5 7. N1c3 Nf6 8. Bg5 Be7 9. Bd3 Nxe4 10. Bxe4 fxe4 11. Nxe4 Bxg5 12. Nec3 0–1


Until the position below, ChatGPT 4 played very well, building a significantly better position from which I would lose very quickly against a real grandmaster (even a candidate master) or a chess engine. If white plays Bf6, black loses the pawn. Yet, ChatGPT played Bd3:


I responded with Ne4, and ChatGPT responded by switching to version 3.5 and playing Bxe4.


After a few moves, I had a decisive advantage (due to ChatGPT playing badly, not me doing something great), so I decided to test the opponent with an irregular move. I proposed Ne6 for black in this position:


ChatGPT 3.5 wasn’t concerned with my move at all. It responded to my hallucination with a new hallucination:




1. ChatGPT 4 is a very weak chess player that plays very unusually: very good in the early opening and terrible later. This is due to the growing number of options as the chess game progresses. I’d assess its overall Elo to be lower than 1500. Same for 3.5.

2. No implicit learning of the rules happened. ChatGPT 4 still hallucinates in chess, and keeps hallucinating after being warned about the hallucination. This is something that can’t happen to a human.

3. More data would hardly solve the problem because of edge cases like extra long endgames with repetition, or the possibility of playing unusual openings. LLMs simply aren’t built for playing chess, nor can they evaluate the position. We already have AlphaZero and Stockfish for that.

4. Monitoring the drop in the number of hallucinations that LLMs produce when playing chess might be a good path for understanding the potential of LLMs for logical reasoning. But the paradox remains: an LLM “knows” the rules of chess, yet hallucinates heavily. The future of ML might be in the LLM as the first-level agent that communicates with the user and then calls specialized agents with ML architectures adjusted for particular use cases.

5. LLMs have the potential to be useful in scientific research and show an interesting level of creativity when combined with other machine learning algorithms. A recent example is the FunSearch algorithm developed by DeepMind, which combines an LLM and an evaluator to make discoveries in math. Contrary to chess, where the evaluation of the position is the hardest task, many problems in mathematical sciences are “easy to evaluate, despite being typically hard to solve”.

I’m skeptical about building a well-performing chess-playing program based on the transformer architecture, yet a specialized LLM combined with an external evaluation/chess program might be a good substitute for chess trainers soon. DeepMind created another cool model that is a good example of combining an LLM and a specialized AI model: AlphaGeometry. It is very close to the Olympiad gold-medalist standard for geometry problems, advancing AI reasoning in mathematics.

6. LLMs are still fresh, the field is very young, and there is too much hype that is often backed up by misleading and wrong conclusions. As the authors of “Mathematical discoveries from program search with large language models” state:


“…to the best of our knowledge, this shows the first scientific discovery — a new piece of verifiable knowledge about a notorious scientific problem — using an LLM.” (The accelerated preview was published on the 14th of December 2023.)


7. The clip by Joe Rogan and a couple of guests, titled “I Wasn’t Afraid of AI Until I Learned This”, was watched by 2.8 million people on YouTube. One of the guests says that ChatGPT knows how to play chess, which obviously isn’t the truth. I can easily imagine how this kind of content influences people, especially uneducated or emotionally unstable individuals. Not in a good way, I’m sure about it.
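Points 4 and 5 above describe the same pattern: the LLM proposes, and a specialized component checks the proposal. As a rough sketch of that idea applied to chess moves, the Python below asks a proposer for a move, accepts it only if a verifier knows it is legal, and counts the rejected proposals as hallucinations. Everything here is an assumption for illustration: `propose` stands in for an LLM call, `legal_moves` stands in for the output of a real chess engine, and the move lists in the usage example are made up, not computed from the games above.

```python
from typing import Callable, Iterable, Optional, Tuple


def choose_verified_move(
    propose: Callable[[int], str],
    legal_moves: Iterable[str],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Ask the proposer for moves until the verifier accepts one.

    propose(attempt) is a hypothetical stand-in for an LLM call.
    legal_moves is a hypothetical stand-in for a chess engine's move list.
    Returns (accepted move or None, number of rejected proposals).
    """
    legal = set(legal_moves)
    rejected = 0
    for attempt in range(max_attempts):
        move = propose(attempt)
        if move in legal:
            return move, rejected
        rejected += 1  # an illegal proposal counts as one hallucination
    return None, rejected


def hallucination_rate(rejected: int, total_proposals: int) -> float:
    """Share of proposals that the verifier rejected (0.0 if none were made)."""
    return rejected / total_proposals if total_proposals else 0.0


# Usage with scripted answers instead of a real model (made-up data).
scripted_answers = ["Ne6", "dxe4"]
move, rejected = choose_verified_move(
    lambda attempt: scripted_answers[attempt],
    legal_moves=["dxe4", "Be7", "Qe7"],  # illustrative list, not a full one
)
rate = hallucination_rate(rejected, rejected + (1 if move else 0))
```

With the scripted answers, the first proposal is rejected and the second is accepted, so `move` is `"dxe4"`, `rejected` is 1, and `rate` is 0.5. Tracking that rate over many games is one simple way to monitor the drop in hallucinations mentioned in point 4.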

To conclude, data science and software development are built on knowledge, precision and truth-seeking. As data scientists and developers, we should be people of truth and knowledge, calming down the madness produced by mass media about AI, not firing it up. Transformers, including ChatGPT, have great potential in language tasks, but they are still very far from AGI. We should be optimistic but correct.

As a suggestion, before dropping bombs, we should ask ourselves: What would happen if someone else acted upon my statements? What kind of world do you want to live in?


Nikola Greb has been coding for more than four years, and for the past two years, he has specialized in NLP. Before turning to data science, he was successful in sales, HR, writing, and chess.

