Finding the position of a random player in a leaderboard, in a way that is scalable, presents a common challenge when dealing with databases.
Several factors will influence the solution you choose, including:
- Total number of players
- The speed at which individual players are accumulating scores
- The rate at which new scores are being added (number of concurrent players * previous factor)
- Score range: Limited or Unlimited
- Distribution of scores (uniform, or specific 'hot scores')
Simple Approach
The usual simple method involves counting all players with a higher score, for example
SELECT count(id) FROM players WHERE score > {playerScore}
.
While this approach works for small scales, it becomes slow and resource-intensive as your player base expands, especially in MongoDB and Cloud Firestore.
Cloud Firestore does not support count
natively due to scalability issues. You would need to count the returned documents on the client-side or use Cloud Functions for Firebase for server-side aggregation to avoid extra data transfer.
Regular Update
Instead of updating rankings continuously, consider updating them periodically, like once every hour. Take Stack Overflow's ranking system as an example, where rankings are updated daily.
To implement this, you could schedule a function, or schedule App Engine if the process takes more than 540 seconds. This function would create a list of players in a ladder
collection with a new rank
field. Retrieving the top X players plus the player's rank would then be efficient in O(X) time.
An optimization would be to explicitly store the top X players in a single document, reducing the read operation to just 2 documents, enhancing performance and saving costs.
This method can cater to any number of players and write rates since it operates independently. Adjusting the update frequency may be necessary as your player base grows based on financial considerations. With 30K players per hour, the cost would amount to $0.072 per hour ($1.73 per day), unless optimizations are applied (e.g., excluding players with zero scores).
Inverted Index
In this technique, an inverted index structure is created. It is effective when there is a limited score range much smaller than the total number of players (e.g., scores from 0-999 against 30K players). Alternatively, it could also work for an unbounded score range but with a significantly lower count of unique scores compared to players.
By utilizing a separate collection named 'scores', each unique score has a corresponding document containing a field called player_count
.
When a player achieves a new total score, 1-2 writes occur in the scores
collection. One write increments player_count
for their new score while decrementing the old score if applicable. This method supports both "Current score equals latest score" and "Current score is highest achieved score" ladder styles.
Determining a player's precise rank is as simple as executing something like
SELECT sum(player_count)+1 FROM scores WHERE score > {playerScore}
.
Since Cloud Firestore lacks support for sum()
, this computation is done on the client side. The addition of 1 accounts for the player themselves among those above them, determining their rank.
With this approach, reading up to 999 documents (averaging around 500) is required to obtain a player's rank, although this number decreases if scores with no players are removed.
Understanding the write rate of new scores is crucial, as updating an individual score is limited to once every 2 seconds on average. For a fair distribution of scores ranging from 0-999, this translates to supporting approximately 500 new scores per second. Implementing distributed counters for each score can enhance this capability.
* As each score generates two writes, limiting updates to one per 2 seconds
** Assuming an average game duration of 2 minutes, handling 500 new scores per second could accommodate up to 60,000 simultaneous players without distributed counters. Higher volumes are feasible with a "Highest score is current score" mechanism.
Sharded N-ary Tree
Considered the most challenging option, the Sharded N-ary Tree method offers faster and real-time ranking positions for all players. It serves as a read-optimized counterpart to the Inverted Index methodology discussed earlier, which leans towards write optimization.
Refer to the related article on 'Fast and Reliable Ranking in Datastore' for a suitable general strategy. A bounded score is recommended for this approach (although possible with an unbounded range, modifications would be necessary).
I do not recommend this route due to requiring distributed counters for top-level nodes in frequent-update scenarios, potentially offsetting gains in read-time efficiency.
https://i.sstatic.net/V81hf.png
Final Thoughts
By combining different approaches, the leaderboard optimization can be maximized depending on the frequency of player leaderboard views.
Merging 'Inverted Index' with 'Periodic Update' at shorter intervals can provide O(1) ranking accessibility for all players.
If the leaderboard is viewed over 4 times during the 'Periodic Update' timeframe, money can be saved, and the leaderboard will operate faster.
Within each period, such as 5-15 minutes, all documents from scores
can be read in descending order. By tracking the total players_count
incrementally, scores can be rewritten into a new collection named scores_ranking
with an additional players_above
field. This field sums up player counts excluding the current score, simplifying rank retrieval.
Obtaining a player's rank now only requires fetching the player's score document from score_ranking
-> Their rank corresponds to players_above
+ 1.