A data-driven exploration of the evolution of chess: Moves, captures, and checkmates

For the 4th installment in my series of blog posts exploring a data set of over 650,000 chess tournament games ranging back to the 15th century, I wanted to look at how chess moves have changed over time. Again, I only have reliable data on chess games back to 1850, so 1850 will be my starting point.

One thing I was interested in is whether preferences for specific chess moves have changed over time. Was the all-powerful Queen more popular in the past, then lost favor as new strategies developed? Or has capturing pieces become more common nowadays than in previous years?

Thankfully, each chess game is recorded in PGN format, which means that it stores every move each player made, the outcome of the game, etc. Here’s an example game in PGN format:

[Event "F/S Return Match"]
[Site "Belgrade, Serbia Yugoslavia|JUG"]
[Date "1992.11.04"]
[Round "29"]
[White "Fischer, Robert J."]
[Black "Spassky, Boris V."]
[Result "1/2-1/2"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 {This opening is called the Ruy Lopez.} 4. Ba4 Nf6 5. O-O Be7 6. Re1 b5 7. Bb3 d6 8. c3 O-O 9. h3 Nb8 10. d4 Nbd7 11. c4 c6 12. cxb5 axb5 13. Nc3 Bb7 14. Bg5 b4 15. Nb1 h6 16. Bh4 c5 17. dxe5 Nxe4 18. Bxe7 Qxe7 19. exd6 Qf6 20. Nbd2 Nxd6 21. Nc4 Nxc4 22. Bxc4 Nb6 23. Ne5 Rae8 24. Bxf7+ Rxf7 25. Nxf7 Rxe1+ 26. Qxe1 Kxf7 27. Qe3 Qg5 28. Qxg5 hxg5 29. b3 Ke6 30. a3 Kd6 31. axb4 cxb4 32. Ra5 Nd5 33. f3 Bc8 34. Kf2 Bf5 35. Ra7 g6 36. Ra6+ Kc5 37. Ke1 Nf4 38. g3 Nxh3 39. Kd2 Kb5 40. Rd6 Kc5 41. Ra6 Nf2 42. g4 Bd3 43. Re6 1/2-1/2

Notice the extra notation beside the square that the piece moved to: N, B, x, +, etc. Each of these symbols has a particular meaning, e.g., “Qxe7” means that the Queen moved to e7 and captured the piece standing there. This notation makes it fairly easy to parse out which pieces are moving where, how many pieces were captured, etc. I charted the evolution of preferences for these movements below.
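To make the parsing step concrete, here is a minimal Python sketch of how the move text might be split into individual moves and read symbol by symbol. It is not the code used for this analysis; the helper names `san_moves` and `describe` are made up for illustration, and counting a castle as a King move is an assumption.

```python
import re

# Opening moves of the example game above, including its inline comment.
MOVETEXT = (
    "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 {This opening is called the Ruy Lopez.} "
    "4. Ba4 Nf6 5. O-O Be7"
)

PIECES = {"K": "King", "Q": "Queen", "R": "Rook", "B": "Bishop", "N": "Knight"}
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def san_moves(movetext):
    """Return the list of move tokens found in a PGN movetext string."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # strip {comments}
    text = re.sub(r"\d+\.+", " ", text)         # strip move numbers such as "12."
    return [tok for tok in text.split() if tok not in RESULTS]


def describe(san):
    """Describe one move token: the piece moved, and whether it captured or checked."""
    san = san.rstrip("!?")            # drop annotation marks such as "!" or "?"
    castle = san.startswith("O-O")
    return {
        # Assumption: a castle is counted as a King move.
        "piece": "King" if castle else PIECES.get(san[0], "Pawn"),
        "capture": "x" in san,
        "check": san.endswith("+"),
        "mate": san.endswith("#"),
        "castle": castle,
    }


moves = san_moves(MOVETEXT)
print(len(moves), "ply parsed")
print(describe("Qxe7"))
```

A move with no leading capital letter (such as “e4” or “cxb5”) is a Pawn move, which is why the lookup falls back to "Pawn".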

Piece captures

One of the biggest questions I wanted to answer with this project is whether capturing pieces has become more or less common nowadays. From my own experience, I noticed that as I became more skilled at chess, I became less focused on capturing all of my opponent’s pieces and more focused on controlling the board. Has chess as a sport similarly progressed this way over time?

The chart below shows the rate at which pieces were captured over time. “0.2” means that a piece was captured every 1 / 0.2 = 5 ply.
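For readers who want to reproduce a rate like this, a rough Python sketch follows. It assumes each game is already a list of move tokens tagged with its year, and it pools all ply within a year before dividing; whether the chart pools ply or averages per-game rates is not stated here, so treat that choice as an assumption. The two toy games are made up for illustration.

```python
from collections import defaultdict


def yearly_capture_rates(games):
    """games: iterable of (year, list of move tokens). Returns {year: captures per ply}."""
    captures = defaultdict(int)
    plies = defaultdict(int)
    for year, moves in games:
        captures[year] += sum("x" in move for move in moves)  # "x" marks a capture
        plies[year] += len(moves)
    return {year: captures[year] / plies[year] for year in plies if plies[year]}


# Hypothetical toy data, not taken from the real data set.
GAMES = [
    (1850, ["e4", "d5", "exd5", "Qxd5"]),                                # 2 captures in 4 ply
    (2014, ["d4", "Nf6", "c4", "e6", "Nc3", "Bb4", "Qc2", "O-O", "a3", "Bxc3+"]),  # 1 in 10 ply
]

rates = yearly_capture_rates(GAMES)
for year, rate in sorted(rates.items()):
    print(year, f"rate={rate:.2f}", f"one capture every {1 / rate:.1f} ply")
```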

[Figure: chess piece capture rate over time]

Over time, the average chess game has consistently ended with about 16 pieces captured between the two sides. Despite the fact that chess games are getting longer, more pieces aren’t being captured in that extended time period. Whereas a piece was captured every 4 ply in 1850, a piece is captured every 5 ply in 2014. This may indeed be because chess games are becoming more strategic, focusing on gaining control of the board rather than capturing more pieces.

Checkmates

If chess games are becoming more strategic, then we should expect to see more checkmates over time. Surprisingly, we see the opposite: Fewer than 2% of expert chess games ended in a checkmate in 2014, down from 8% in 1850. I was puzzled by this finding until it occurred to me that most expert chess players are able to predict the next few moves in every game, and therefore resign or agree to a draw well before the checkmate has occurred. The overall decline in checkmates over time is possibly explained by the fact that draws are becoming the norm in expert chess play, meaning there are fewer games that even come close to a checkmate.
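One way to count checkmates from the notation is to look at the final move of each game. The Python sketch below assumes the file marks mate with a trailing “#” (some older files may not) and uses made-up toy games; the function name `classify_ending` is hypothetical. Note that a decisive result without “#” cannot tell a resignation from a time forfeit.

```python
from collections import Counter


def classify_ending(moves, result):
    """Label how a game ended, using only its last move token and its Result tag."""
    last = moves[-1].rstrip("!?") if moves else ""
    if last.endswith("#"):
        return "checkmate"
    if result == "1/2-1/2":
        return "draw"
    if result in ("1-0", "0-1"):
        return "decisive without mate"  # resignation, time forfeit, etc.
    return "unknown"


# Hypothetical toy games: (move tokens, result).
GAMES = [
    (["f3", "e5", "g4", "Qh4#"], "0-1"),
    (["e4", "e5", "Nf3"], "1/2-1/2"),
    (["d4", "d5", "c4"], "1-0"),
    (["e4", "c5"], "1-0"),
]

counts = Counter(classify_ending(moves, result) for moves, result in GAMES)
mate_share = counts["checkmate"] / len(GAMES)
print(counts)
print(f"share of games ending in checkmate: {mate_share:.0%}")
```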

[Figure: chess checkmates over time]

Piece preferences

We can also look at whether chess players have preferred to use different pieces over time. My favorite piece when I first started to play was always the Queen, but in more recent games I’ve discovered how powerful Knights can be early on. Below, I charted out the move rates of the pieces over time. “0.33” means that the piece is moved every 1 / 0.33 = 3 ply.
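As a hedged illustration of how a move rate could be computed, the Python sketch below counts which piece each move token refers to and divides by the total ply of both players. Using all ply as the denominator and keeping castles in their own bucket are assumptions, and `piece_of` and `move_rates` are names invented for this example.

```python
from collections import Counter

PIECES = {"K": "King", "Q": "Queen", "R": "Rook", "B": "Bishop", "N": "Knight"}


def piece_of(san):
    """Map a move token to the piece that moved; castles get their own bucket."""
    if san.startswith("O-O"):
        return "Castle"
    return PIECES.get(san[0], "Pawn")  # no leading piece letter means a Pawn move


def move_rates(moves):
    """Return {piece: fraction of all ply in which that piece moved}."""
    if not moves:
        return {}
    counts = Counter(piece_of(move) for move in moves)
    return {piece: n / len(moves) for piece, n in counts.items()}


# First ten ply of the example game quoted earlier in the post.
MOVES = ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Ba4", "Nf6", "O-O", "Be7"]

rates = move_rates(MOVES)
for piece, rate in sorted(rates.items()):
    print(f"{piece}: {rate:.2f} (moved about every {1 / rate:.1f} ply)")
```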

[Figure: chess Rook moves over time]

Rooks became considerably more popular between 1850 and 1900, and have since leveled off at about one move every 7 ply. I’d love to hear a chess historian’s perspective on this. Perhaps a popular chess book was published around 1850 highlighting new strategies for the Rook?

[Figure: chess Knight moves over time]

Meanwhile, Knights and Pawns saw a steady increase in popularity from 1920 to 1970. My best guess is that this is a direct result of the rise of the Hypermodernism school of chess after WWI, which advocated the use of Pawn chains and Knight outposts, both strategies heavily involving Pawns and Knights.

[Figure: chess Pawn moves over time]

Interestingly, Pawns and Knights started falling out of favor in the 1970s, right when Bobby Fischer shook up the chess world with his dramatic march to become World Champion. Did Fischer’s breathtaking campaign change chess as we knew it?

Finally, King, Queen, and Bishop move rates have remained more-or-less the same over time. I actually find it pretty incredible that we can see much of a change in piece preferences at all, considering that chess strategies are changing so much over time.

Castling

Castling has been a very popular move throughout chess history. From 1850 to 2014, only 16% of the games had one player not castle, and fewer than 4% of the games had both players not castle. (Surprisingly, many of those games lasted more than 50 ply!)
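Counts like these can be sketched in a few lines of Python. The example below assumes every game starts from the normal position with White to move, so even-numbered ply (counting from zero) belong to White, and that castling is written with capital letter O as in standard PGN. The function names and the three toy games are made up for illustration.

```python
def castling_by_side(moves):
    """Return which way each side castled ("kingside", "queenside", or None)."""
    sides = {"white": None, "black": None}
    for ply, san in enumerate(moves):
        side = "white" if ply % 2 == 0 else "black"
        if san.startswith("O-O-O"):      # check the longer token first
            sides[side] = "queenside"
        elif san.startswith("O-O"):
            sides[side] = "kingside"
    return sides


def no_castle_counts(games):
    """Return (games where exactly one player never castled, games where neither did)."""
    one = both = 0
    for moves in games:
        missing = sum(v is None for v in castling_by_side(moves).values())
        one += missing == 1
        both += missing == 2
    return one, both


# Hypothetical toy games, not taken from the real data set.
GAMES = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Ba4", "Nf6", "O-O", "Be7",
     "Re1", "b5", "Bb3", "d6", "c3", "O-O"],                       # both castle kingside
    ["d4", "d5", "Nc3", "Nf6", "Bf4", "e6", "Qd2", "Be7", "O-O-O"],  # only White castles
    ["e4", "e5", "Qh5", "Nc6", "Bc4", "Nf6", "Qxf7#"],               # nobody castles
]

one_missing, both_missing = no_castle_counts(GAMES)
print(castling_by_side(GAMES[1]))
print(one_missing, "game(s) with one player not castling;", both_missing, "with neither")
```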

[Figure: chess kingside castling over time]

Kingside castling is by far the more popular of the two castling moves (used by 80% of players today) because it requires only 2 pieces to be moved out of the way before it can be done, instead of 3 pieces with the queenside castle. Considering that preferences for queenside castling (below) have remained fairly consistent, I can only guess that players who previously never castled started to realize the value of castling in the 1890s.

[Figure: chess queenside castling over time]

In contrast to the kingside castle, players have consistently used the queenside castle only about 8% of the time from 1850 to 2014. It seems that expert chess players want to get their Rooks into play as quickly as possible, leaving the queenside pieces for play later in the game.

That’s all for today. In the next installment, I’ll be looking at preferences for specific locations on the board.

Dr. Randy Olson is a Senior Data Scientist at the University of Pennsylvania, where he develops state-of-the-art machine learning algorithms with a focus on biomedical applications.

Posted in analysis, data visualization
  • PG Benoit

    Hi Randy,
    The article is very interesting and should be printed in the USCF magazine! One thought about Kingside versus Queenside castling is that for O-O it takes 4 moves for the rook to reach the e1 square to attack. When O-O-O occurs it takes 4 moves for the rook to reach d1 for attacking chances. Parity?!
    Food for thought.
    PG Benoit

    • Stefan

      About that, after 0-0-0, the King position on c1 is less safe than on g1 after 0-0, so it is not uncommon to move Kb1 for safety. That would make the Queenside castling more expensive to reach a symmetric position.

  • It is interesting how the variance of each statistic decreases over time as well. Could this be because chess is getting “figured out” and masters of the game have agreed on a certain style of play which is most effective?

    • It’s more a matter of the number of games I have in the data set each year. This data set has way more games in the 21st century, which means my confidence in the values I’m reporting goes up over time (i.e., law of large numbers).

  • Klapaucjusz

    Hi
    Let me say the whole post is great 🙂 As for the checkmate graph: resignations skew the results. Even if the position has checkmate in three, for example, the losing side almost always resigns, so it is not reported in the db as checkmate. As for rook usage: this is the piece most used in endings, and since games are getting longer, the number of rook moves is rising.

  • Cary Utterberg

    Rooks are generally more active in the endgame, so there is probably a relation between how few pieces are left and how often rooks move, or length of game and how often rooks move. Between 1850 and 1900, the number of published endgames and studies increased significantly, plus games lasted into the endgame more often (due to improved defensive play), so it’s not surprising that the use of rooks increased over that period.

  • Fascinating graphs, though I think the comments also show some of the dangers of trying to draw conclusions from raw data when you don’t have enough knowledge of the domain to provide insight on what’s going on.

    > Despite the fact that chess games are getting longer, more pieces aren’t being captured in that extended time period.

    There are only a finite number of pieces to be captured, so longer games automatically mean fewer captures per move. The causality goes in both directions… a) the faster pieces are exchanged, the shorter the game will probably be, because exchanges move you towards the endgame, and b) the longer the game, the more the necessarily finite number of exchanges must be spread out, reducing the captures per move value.

    Probably the decrease in captures per move is mostly explained by the increase in game length, not something that would be expected to rise with it.

    > chess games are becoming more strategic, then we should expect to see more checkmates over time

    In the normal chess jargon, going for checkmate would mean a more “tactical” style, whereas a more “strategic” style would most likely be about eventually getting to a winning endgame rather than delivering mate. In either case masters would usually resign when it became clear that the position is hopeless.

    Whether a given game ends in checkmate or resignation is more likely about the prevailing etiquette. In most cases it would be considered bad behavior to force the opponent to play out a game that he’s clearly going to win. But sometimes when he’s about to win brilliantly, it’s a nice gesture to let him show everyone the brilliant thing he was going to do.

    Tactical games are more likely to be brilliancies, and therefore I’d guess more likely to contain an actual checkmate. Strategic games are less likely to include a checkmate because both sides will see the inevitable coming from a long way off, but there is no clever finish imminent that you’d sportingly allow the opponent to demonstrate on the board.

  • A thing that might be interesting to look at is whether checkmates happen more often in time scrambles. My hypothesis would be that they do because…

    – There is less time for the defender to see all the possibilities and they might overlook some tactic
    – The momentum of rattling moves out quickly tends to carry you forward. You see that you’re about to be mated but reflexes keep you playing those last few moves anyway instead of resigning.
    – If the attacker is also in time trouble there is hope they might miss something, or run out of time themselves. If there’s a realistic chance they’ll stumble because they have to make instant decisions, it is not bad etiquette to play the game out all the way.

    Finding out the applicable time controls might be tricky though. If I remember correctly most serious tournaments had one at move 40, and then maybe every 20 moves after that.

  • Nick Hildebrant

    Interesting visualizations. I wonder why the trends narrow in later years. Are there fewer recorded games, or is there really less variance in how we play now?

  • It’s because there are more games in the database in later years.

  • There is a significant skew in the older data: While we nowadays annotate and collect all high-level games, in the 1800s people recorded “interesting” games, either because they were high-profile or – and this is the problem – because they displayed interesting chess – which, in this romantic period, meant a lot of attacks, clever tactics, captures and checkmates.

    There is simply no way to know what the regular games were like back in those days. I am not sure when systematic collection of all games at master level began, but I think post-World War 2 the data quality is probably good enough to make inferences. Before that, you are comparing apples to oranges.

    Your trends, however, do support the general view of chess historians, that in the old days tactics and a swashbuckling, attacking style were favored over solid, strategic lines, and that the theory provided by Steinitz, Tarrasch and Nimzowitsch during the period 1900-1930 marked a clear change towards strategic and positional chess.

    It would be really interesting to see if Nimzowitsch’s modern school (as prescribed in My System, 1925-27) can be seen in the data; if Nimzowitsch’s ideas spread, you should for instance see far fewer central pawn moves, particularly for black, in the opening, say, the first ten moves.

    • Very interesting – thank you for bringing these issues up. I wonder: could we still try to draw inferences from the pre-WW2 games if we treat them as a sample of all chess games (as I have in these articles) and report the statistical uncertainty around the values measured? Or do you think the sampling — well, recording in this case — is far too biased?

      In any case, I’ve been hoping to team up with a knowledgeable chess historian to write some more serious articles (rather than explorations) on this topic. Do you have any recommendations?

  • Gilbert Mathew Hoover

    Very, Very Nice job here: I love explorations like this! Thank You for your good work!