For a second year, he runs a Chess Championship using chatbots.
The openings are usually fine because the chatbots have a lot to copy from the training material, but once one of them makes an unusual move, things can get chaotic.
Some chatbots play quite well in unusual situations, while others make illegal moves, make pieces appear out of thin air, or forget they still have a piece on the board. If you look carefully, many illegal moves would be sensible in a different game, as if the chatbot were copying the move from the training material, but sometimes it's difficult to guess.
Anyway, he accepts most illegal moves to get funny content, unless the chatbot is cheating too much, a call he sometimes makes arbitrarily. I'm not sure how this messes with the internal representation the other chatbot has. Once an illegal move is played, it's hard to guess whether later illegal moves are errors or a different interpretation of the consequences of that first illegal move.
Quarter final 1: ChatGPT vs Gemini
Quarter final 2: Grok vs Copilot
Quarter final 3: Snapchat vs Claude
Quarter final 4: Meta AI vs DeepSeek