Reading Time: 2 minutes
When you need to calculate rank and save it to a field in Postgres. This example uses Rails ActiveRecord but if you can execute the query you are good.
A naive way would be to build an iterative method with an index. An example in Ruby:
Product.order(price: :desc).each_with_index do |product, index|
product.price_rank = index + 1
This might work great if you have a limit(10) or have a small data sample. But what if you have 100,000 records or even worse 1 million records.
For 100 items the above code took 0.181s.
For 1,000 items .49s.
For 100,000 items 155.04s that’s roughly 2 minutes 36 seconds.
Now using Postgres I know it’ll be faster but how much?
query = "UPDATE products SET price_rank = r.rnk FROM (SELECT id, RANK() OVER (ORDER BY price DESC) as rnk FROM products) r WHERE products.id = r.id"
I am getting run times of .0005 because I am dumping the work onto the database.
Running EXPLAIN ANALYZE on the query.
Update on products (cost=16813.40..25200.70 rows=101002 width=112) (actual time=15247.855..15247.855 rows=0 loops=1)
-> Hash Join (cost=16813.40..25200.70 rows=101002 width=112) (actual time=8693.946..12299.080 rows=101002 loops=1)
Hash Cond: (products.id = r.id)
-> Seq Scan on products (cost=0.00..3391.02 rows=101002 width=68) (actual time=0.584..990.371 rows=101002 loops=1)
-> Hash (cost=14563.87..14563.87 rows=101002 width=56) (actual time=8693.010..8693.010 rows=101002 loops=1)
Buckets: 65536 Batches: 4 Memory Usage: 2685kB
-> Subquery Scan on r (cost=11786.32..14563.87 rows=101002 width=56) (actual time=2278.120..7232.235 rows=101002 loops=1)
-> WindowAgg (cost=11786.32..13553.85 rows=101002 width=24) (actual time=2278.083..5283.472 rows=101002 loops=1)
-> Sort (cost=11786.32..12038.82 rows=101002 width=16) (actual time=2278.053..3262.086 rows=101002 loops=1)
Sort Key: products_1.price DESC
Sort Method: external merge Disk: 2576kB
-> Seq Scan on products products_1 (cost=0.00..3391.02 rows=101002 width=16) (actual time=0.639..992.346 rows=101002 loops=1)
Planning time: 3.271 ms
Execution time: 15252.513 ms
Over 100,000 records ranked and updated with values in 15s. Not bad.
You also get the added logic of a true ranking when there is a tie like #1, #2, #2, #2, #5, #6, etc. The calculated version was simple and did not account for it. If you do want to, you can implement it but with Ruby, you’ll add more bloat and complexity.