Analyzing blitz ELO disparities among major chess websites

A statistical review and in-depth analysis of differences in blitz ratings within the two largest online chess sites

Random beginners have asked me about my chess.com and Lichess ratings a hundred times too many. When I proudly announce my hard-earned 2400 rapid, they immediately start asking me if I'm a GM. They all seem to share that same disappointed look on their faces when I burst their bubble and tell them that I'm not even close.

All these mini-conversations have made me curious about what online ratings actually translate to in real life. In a perfect, boring ELO world, a 2400 online ELO would equal a 2400 FIDE rating, and I would be a happy IM. But this is far from a perfect ELO world, mainly because of rating disparities.

What are rating disparities?

Rating disparities are quite self-explanatory. Basically, they're the ELO differences a player of the same strength ends up with across different federations or player pools.

Let's take my aforementioned online rapid ELO of 2400 as an example. My rapid rating on Lichess is 2400. However, I'd be squashed by a real 2400 in a matter of seconds, and my real level lies closer to 1900 FIDE. Therefore, my Lichess rating has a "disparity" of 500 ELO beyond my FIDE rating.

Gaps like this show up throughout the online chess world, and their size is exactly what I'm going to analyze in this blog.

Procedure/methodology

I will be using blitz ratings for my study of online websites, as blitz is the time control masters play most consistently both online and face-to-face. I would really love to compare online rapid against classical, but masters simply don't play rapid as much.

I will also be using FIDE blitz ratings as the base blitz "skill level" for a player, as FIDE ELO is the most objective/standard rating.

There are seven main steps I took to calculate and analyze rating disparities for chess.com and Lichess. If you're a statistician or good with math/data, please tell me if I did anything wrong.

  1. Fetched blitz ratings for a selection of 35 masters whose Lichess/chess.com accounts have roughly 1000 or more blitz games on either site. I made sure to leave out accounts that have not been active in the past 3-4 years.
  2. Calculated differences between FIDE and Lichess/chess.com ratings (Website - FIDE = difference).
  3. Averaged these 35 differences to get a mean disparity value (MDV).
  4. Calculated standard deviation (SD) over all collected differences to account for outliers and extremes.
  5. Calculated the coefficient of variation (CV) as a percentage using the formula SD ÷ MDV. The higher the CV percentage is, the more volatile the data set is.
  6. Calculated mean absolute deviation (MAD) to determine the average distance of each data point from the MDV.
  7. Analyzed and compared results.
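
As a rough illustration, steps 2 through 6 could be sketched in Python like this. The ratings in `players` are made up for the example and are not the data behind this post, the function name `disparity_stats` is hypothetical, and the sketch assumes the sample standard deviation (the post does not say whether the sample or population version was used).

```python
from statistics import mean, stdev

# Hypothetical (website blitz rating, FIDE blitz rating) pairs.
# These numbers are invented for illustration only.
players = [
    (2850, 2400),
    (2900, 2350),
    (2700, 2300),
    (2950, 2450),
    (2800, 2250),
]


def disparity_stats(pairs):
    """Return MDV, SD, CV (in percent) and MAD for a list of rating pairs."""
    # Step 2: Website - FIDE = difference
    diffs = [site - fide for site, fide in pairs]
    # Step 3: mean disparity value
    mdv = mean(diffs)
    # Step 4: standard deviation (sample SD is an assumption)
    sd = stdev(diffs)
    # Step 5: SD divided by MDV, expressed as a percentage
    cv = sd / mdv * 100
    # Step 6: average absolute distance of each difference from the MDV
    mad = mean([abs(d - mdv) for d in diffs])
    return {"mdv": mdv, "sd": sd, "cv": cv, "mad": mad}


stats = disparity_stats(players)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Running the same function once per website, with the real 35 pairs, would give one row of the results table for each site.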

I may do a more extensive study of disparities and/or inflation in the near future with many more data sets, master players, and ratings to come, but as this is my first rodeo with research, I kept the data set relatively concise.

Rating disparities within chess.com and Lichess, calculated and crunched

Tables & charts

Website      Mean Disparity Value   S.D.   C.V.    M.A.D.
Chess.com    +499.6                 88.1   17.6%   67.4
Lichess      +231.9                 93.6   40.6%   72.4

What conclusions can we derive from this data?

There's one main thing I'm sure all of us have noticed: chess.com consistently shows higher positive rating disparities in blitz, with an MDV over 250 points higher than that of Lichess. However, it's important to try to analyze the "why" of these differences, and after thinking about and researching this exact question for a bit too long, I believe I've reached a solid conclusion.

Higher blitz ratings at the master level on chess.com can be attributed to two main factors: the website's much larger player base and its different systemic approach to calculating ELO.

Let's begin with the player base, which I believe is the primary cause of this trend. Chess.com attracts many more members than Lichess, and a vast majority of these members are casuals or beginners. These hordes of beginners feed tons of ELO to intermediates, who then pass it on to a small population of advanced players. All of this rating flows to the master level through a massive ELO food chain at an almost constant rate due to the sheer number of games going on at any given time, resulting in a tiny but extraordinarily highly rated blitz elite.

There's also a secondary factor that helps create this ELO gap: chess.com's different rating calculation engine. Chess.com uses Glicko; meanwhile, Lichess uses Glicko-2. These engines are similar in that they both track rating and rating deviation (RD), where this deviation rises with a player's inactivity and decreases with play. However, Glicko-2 has one key difference from Glicko: it also accounts for volatility. Volatility measures how erratically a player's rating has been changing. An improving or inconsistent player will have high volatility, and this high value translates to larger rating gains and losses until the player's rating eventually stabilizes. Meanwhile, a consistent player will have low volatility, so each game changes their rating less and a few individual results cannot produce drastic ELO growth. In practice, I believe this makes Lichess's player ratings more accurate and controlled than those of chess.com.
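
To make the role of RD concrete, here is a minimal sketch of a single-game update using the published Glicko-1 formulas. It is not chess.com's actual implementation, treating one game as a full rating period is an assumption, and the function names are hypothetical. Glicko-2's extra volatility term is not modeled here.

```python
import math

Q = math.log(10) / 400  # scaling constant from the Glicko-1 formulas


def g(rd):
    """Dampening factor: the less certain the opponent's rating, the less a result counts."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected score against an opponent, discounted by the opponent's RD."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko1_update(r, rd, r_opp, rd_opp, score):
    """Return (new_rating, new_rd) after one game; score is 1, 0.5 or 0."""
    e = expected_score(r, r_opp, rd_opp)
    d_squared = 1 / (Q**2 * g(rd_opp) ** 2 * e * (1 - e))
    denom = 1 / rd**2 + 1 / d_squared
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / denom)
    return new_r, new_rd


# Same win against the same 2500-rated opponent, with two different RDs.
settled_r, settled_rd = glicko1_update(2500, 50, 2500, 50, 1)
rusty_r, rusty_rd = glicko1_update(2500, 200, 2500, 50, 1)
print(f"settled player gains {settled_r - 2500:.1f}, RD 50 -> {settled_rd:.1f}")
print(f"rusty player gains {rusty_r - 2500:.1f}, RD 200 -> {rusty_rd:.1f}")
```

The player with the larger RD moves further after the identical result, and both RDs shrink after playing, which matches the "rises with inactivity, decreases with play" behaviour described above.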

Why do systemic rating disparities even happen in the first place?

Behind this comparison lies a more interesting and foundational question. Simply put, ELO disparities can be attributed to differing inflation rates among different websites and federations.

Inflation basically happens when rating points enter a pool without any form of exit. This may be a little difficult to visualize, so let me paint a picture of what this process often looks like.

A new player watches a couple of GothamChess videos and chooses to join chess.com, thinking he will easily pick up the game. He begins as a 1200, loses a couple dozen games, drops down to 800 ELO, and then quits chess for good out of rage. The 400 ELO that player lost now stays in the chess ecosystem permanently, with no way back out.

This entrance-without-exit can happen once, twice, or even a hundred times with little effect on the overall player pool. Multiply this process by tens of thousands of beginners, however, and you have a continuous influx of rating eventually leading up to the highest level, thereby resulting in inflation for top players. As I mentioned before, this rating injection is accelerated by a larger player base, hence the higher rates of master-level inflation on more popular websites.
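
The entrance-without-exit idea can be shown with a toy simulation. This sketch uses plain zero-sum Elo with a made-up K-factor of 32 rather than the Glicko systems the real sites use, and every name and number in it is hypothetical. A newcomer enters at 1200, loses every game to a small pool of regulars, then quits.

```python
K = 32  # hypothetical K-factor for this toy model


def expected_score(r_a, r_b):
    """Elo expected score of player A against player B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))


def record_win(ratings, winner, loser):
    """Zero-sum update: the winner gains exactly what the loser drops."""
    delta = K * (1 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def simulate_quitter(n_regulars=5, games=30):
    """Return (pool_before, pool_after, points_lost) for one quitting newcomer."""
    ratings = {f"regular_{i}": 1000.0 for i in range(n_regulars)}
    pool_before = sum(ratings.values())

    ratings["newcomer"] = 1200.0
    for game in range(games):
        opponent = f"regular_{game % n_regulars}"
        record_win(ratings, opponent, "newcomer")

    # The newcomer quits; the points they lost stay with the regulars.
    points_lost = 1200.0 - ratings.pop("newcomer")
    pool_after = sum(ratings.values())
    return pool_before, pool_after, points_lost


before, after, lost = simulate_quitter()
print(f"newcomer lost {lost:.1f} points")
print(f"regulars' total went from {before:.1f} to {after:.1f}")
```

The regulars' combined rating rises by exactly the amount the newcomer lost, even though none of them got any stronger. That is the inflation mechanism in miniature.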


It is noteworthy that FIDE ratings have also inflated considerably over time at the highest level. As more players were drawn to chess and joined FIDE, both because of chess's overall growth and the organization's changing rating policy, this same upward flow of ELO has grown significantly over the last few decades, leading to a similar inflationary result.

This is demonstrated by the number of 2700s worldwide. Five or six decades ago, long before my existence was even an idea, reaching 2700 made someone a top 10 global player or even a contender for the best in the world. Today, however, 2700-rated players are considerably higher in number, and although they are world-class players in their own right, their relative "power" is nowhere close to that of 2700s in the past.

However, overall FIDE ratings have recently experienced considerable deflation, and that's a topic I will cover in a future blog.


Thanks for reading, and I hope you found this post helpful or insightful!