When you need to calculate rank and save it to a field in Postgres.
A naive way would be to build an iterative method with an index.
Product.order(price: :desc).each_with_index do |p, i| p.price_rank = i + 1 p.save end
This might work great if you have a limit(10) or have a small data sample. But what if you 100,000 records or even worse 1 million records.
For 100 items the above code took 0.181s.
For 1,000 items .49s.
For 100,000 items 155.04s.
Now using Postgres I know it’ll be faster but how much?
query = "UPDATE products SET price_rank = r.rnk FROM (SELECT id, RANK() OVER (ORDER BY price DESC) as rnk FROM products) r WHERE products.id = r.id" ActiveRecord::Base.connection.execute(query)
I am getting run times of .0005 because I am dumping the work onto the database.
Running EXPLAIN ANALYZE on the query.
Update on products (cost=16813.40..25200.70 rows=101002 width=112) (actual time=15247.855..15247.855 rows=0 loops=1) -> Hash Join (cost=16813.40..25200.70 rows=101002 width=112) (actual time=8693.946..12299.080 rows=101002 loops=1) Hash Cond: (products.id = r.id) -> Seq Scan on products (cost=0.00..3391.02 rows=101002 width=68) (actual time=0.584..990.371 rows=101002 loops=1) -> Hash (cost=14563.87..14563.87 rows=101002 width=56) (actual time=8693.010..8693.010 rows=101002 loops=1) Buckets: 65536 Batches: 4 Memory Usage: 2685kB -> Subquery Scan on r (cost=11786.32..14563.87 rows=101002 width=56) (actual time=2278.120..7232.235 rows=101002 loops=1) -> WindowAgg (cost=11786.32..13553.85 rows=101002 width=24) (actual time=2278.083..5283.472 rows=101002 loops=1) -> Sort (cost=11786.32..12038.82 rows=101002 width=16) (actual time=2278.053..3262.086 rows=101002 loops=1) Sort Key: products_1.price DESC Sort Method: external merge Disk: 2576kB -> Seq Scan on products products_1 (cost=0.00..3391.02 rows=101002 width=16) (actual time=0.639..992.346 rows=101002 loops=1) Planning time: 3.271 ms Execution time: 15252.513 ms
Over a 100,000 records ranked and updated with values in 15s. Not bad.
You also get the added logic of a true ranking when there is a tie like #1, #2, #2, #2, #5, #6, etc. The calculated version was simple and did not account for it. If you do want to, you can implement it but with Ruby, you’ll add more bloat and complexity.