Is Rails Scalable?


Being an optimist, I would say yes! Having worked on a large-scale, high-visibility site, I've personally seen it happen, so definitely yes. When budget was not a limiting factor, I saw vertical and horizontal scaling at its best. Get the biggest, baddest app server, then multiply that by 20 behind multiple load balancers. CDN-cache everything possible. Scaling solved. Millions of views and thousands of transactions per second, no problem. The databases did start to buckle under all the connections; enter PgBouncer connection pooling and more load balancers. But that is the point: you can scale to the point where the problem is no longer code but infrastructure.

But what if you can't? Here are my experiences.

Vertical Scaling

You add more RAM, more CPUs, faster pipes (fiber directly to another server or network), SSDs. This all works, but it has a threshold: once the hardware reaches its limits, it becomes the limiting factor, and with enough users the server can still melt down. Vertical scaling suits linear growth, not exponential growth.

The same holds true for the database server, since most scaling issues are database related. Slow queries run faster when you beef up the hardware they run on.

I consider this a quick fix for linear or predictable growth.

Horizontal Scaling

Load balancing! Get more servers to do the job; two or twenty are better than one, and they share the load. This can and will get costly, whether it's cloud, metal, or a combination of the two. Done correctly, horizontal scaling can be a really easy solution to scaling issues.

With Docker and cloud platforms this option actually gets easier. I think this is how most applications solve scaling, since cost per request is linear and predictable. If it costs $1,000 for one server that can handle 100,000 requests, it will cost $2,000 for 200,000 requests. On AWS you can even autoscale based on usage or a schedule.

But much like database sharding, when bugs or problems arise, debugging them gets exponentially harder. You'll also find multiple points of failure. Is the load balancer itself load-balanced or sharded? Why is all the traffic going to one server while the other 19 only get 3% of it? Did every server get the updated code?

* A caveat: as you spin up more application servers, your database server(s) will get swamped with connections. You'll need to look into PgBouncer and load balancing your database connections, and possibly into database sharding/partitioning/clustering.

CDN Caching

Often overlooked: caching multimedia, pages, data, and assets with CDNs like Akamai, Cloudflare, etc. Not to be confused with content caching on your own server, this uses a distributed group of servers that cache assets regionally or globally. Images, CSS files, fonts, and video can bog down a server when request volume is high; a CDN mitigates that by serving assets from the location closest to the end user.

Constantly changing files and dynamic data are not good subjects for this type of caching.

Product images, stale or historic content, fonts, and videos are good candidates for CDNs. You are using the provider's servers, so the load on your application is lower. With Rails, it is as simple as setting an asset_host.
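
A minimal sketch, assuming a hypothetical CDN hostname:

# config/environments/production.rb
Rails.application.configure do
  # Serve precompiled assets (images, CSS, JS, fonts) from the CDN instead of the app server
  config.asset_host = "https://cdn.example.com"  # placeholder hostname
end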

The Database

I think the database is going to be a big part of scaling issues.

Connections

Slow queries? How about lots of small queries? I often hear, "We have a scaling issue, we have 100k users hitting our app per second." How about that background task one user runs that hits the database a million times per second with those 1-microsecond queries?

It's like copying one large 2GB file to a thumb drive versus 20 million 1KB files.

We batch and paginate records from the database when we use them in views. Be just as mindful in background tasks and in the reports that one user runs.

Also, use lookup tables that live in the app's memory. An example: we have a background job that processes clients' websites. We thought, OK, we'll store client IDs mapped to website URLs in Redis and have the jobs look the URLs up there. Redis is fast, but hammering it with lots of tiny lookups is not, and Redis had a meltdown! We had simply moved the problem from one database server to another: Redis is still an external connection, we were still processing a lot, and it was running out of memory. Our first fix was to batch the work so each job grabbed one client's list of websites. That gave modest gains, but not great ones, since most clients had a single URL. The second attempt moved the data into a hash loaded into memory when each job was initialized. Massive gains, and we just had to make sure the delayed background jobs were torn down correctly.
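
A rough sketch of that second attempt (class, table, and column names are assumptions, not our actual code):

# Hypothetical background job with an in-memory lookup table
class ProcessClientWebsitesJob
  def initialize
    # One query when the job is initialized; every lookup after this stays in-process
    @urls_by_client_id = Website.pluck(:client_id, :url)
                                .group_by(&:first)
                                .transform_values { |rows| rows.map(&:last) }
  end

  def perform(client_id)
    @urls_by_client_id.fetch(client_id, []).each do |url|
      # ... fetch and process the site ...
    end
  end
end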

Try to think of any server as an external service. Redis, databases, and background job servers are all external resources. Connections are expensive, and exponentially more expensive when you try to scale.

Calculations

As with counter caching, try to calculate values ahead of time and save them in the database as much as possible. I know aggregate functions in databases are super fast, but if you can, pre-calculate values and save them for display.

A caveat: maintaining pre-calculated values places a larger load on writes, so constant recalculation can cause issues.

Often you can avoid this by adding indexes your aggregate queries can use. But if the query is complex and indexes won't help, consider pre-calculation and denormalizing the database.
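
A small counter-cache sketch, with made-up post and comment models, of storing the count on write instead of aggregating on read:

# Migration version tag is an assumption
class AddCommentsCountToPosts < ActiveRecord::Migration[5.2]
  def change
    add_column :posts, :comments_count, :integer, default: 0, null: false
  end
end

class Comment < ApplicationRecord
  # Rails keeps posts.comments_count in sync on create and destroy
  belongs_to :post, counter_cache: true
end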

Optimize Queries

Use pluck. The call below runs SELECT id, name, location FROM contacts; without instantiating full ActiveRecord objects.

Contact.pluck(:id, :name, :location)

Learn EXPLAIN. It’ll help you understand why your queries are slow, sometimes.
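
For example, from the console you can ask ActiveRecord for the database's plan for a slow relation (model and columns are made up):

Contact.where(name: "Bob", location: "California").explain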

Use find_each or find_in_batches; I say always do this, e.g. find_in_batches(batch_size: 100).
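
For example, walking a large table 100 rows at a time (the mailer is a made-up example):

Contact.find_each(batch_size: 100) do |contact|
  # Each batch of 100 is loaded, processed, and released before the next one
  ContactMailer.newsletter(contact).deliver_later
end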

Indices & Partial Indices

Add indexes! And know which ones to add. If you search or find by name often, add an index. If you search by name and location, add a compound index. For example, if I search contacts for all Bobs in California, I would add a compound index on name, then location. The goal is to narrow down the candidate rows as quickly as possible.

add_index :contacts, [:name, :location]

A partial index adds a where clause to your index:

add_index :orders, :billed, where: "billed = false"

One more caveat: add too many indexes and you can end up making things worse. Indexes take up memory, so the more you add, the more memory you use. Inserts are also affected, since every insert has to update the index trees as well.

Reducing N+1

I think this is a given. Use the Bullet gem, and learn and know about eager loading.
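
A quick sketch with made-up models: the first loop fires one comments query per post (the N+1), the second eager loads everything in two queries total.

Post.all.each { |post| puts post.comments.size }                   # N+1 queries
Post.includes(:comments).each { |post| puts post.comments.size }   # 2 queries total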

Restrict Database or Use a Different Database or a Service

Use Redshift, or keep a cloned database for long-running queries. If your sales team needs to run reports on real-time data, think about a replicated database, or use a publish/subscribe data model feeding a service like Datadog or even another database.

Data accuracy might not be 100%.
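
A minimal sketch of pointing report models at a replica, assuming a "replica" entry exists in config/database.yml; the model and table names are made up:

class ReportingRecord < ActiveRecord::Base
  self.abstract_class = true
  establish_connection :replica   # long-running report queries hit the replica, not the primary
end

class SalesReport < ReportingRecord
  self.table_name = "orders"
end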

Skip ActiveRecord / Sequel / DataMapper / Etc.

For example, if you are importing or exporting lots of data from/to a CSV file, you can forgo the ORM and insert records directly with SQL, or better yet, import the CSV file directly into your database.

Postgres has COPY:

# to import
COPY states FROM 'path/csv/states.csv' WITH (FORMAT csv);

# to export
COPY states TO 'path/csv/states.csv' DELIMITER ',' CSV HEADER;

# to export plucking
COPY states(name, abbr_name) TO 'path/csv/states.csv' DELIMITER ',' CSV HEADER;

MySQL has LOAD DATA INFILE and SELECT INTO OUTFILE:

# to import
LOAD DATA INFILE 'path/csv/states.csv' INTO TABLE states
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

# to export
SELECT id, name, abbr_name FROM states WHERE active=true
INTO OUTFILE 'path/csv/states.csv'
FIELDS ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\r\n';

Advanced Database Tools & Capabilities

Use pivot tables, materialized views, stored procedures, and all the other tools your database offers. They can greatly improve performance; use the database the way it's designed to be used.
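
For example, a Postgres materialized view can pre-aggregate data once and be refreshed on a schedule. A rough sketch through ActiveRecord's raw connection, with made-up table and column names:

# Build the view once; reads then hit a small, pre-aggregated table
ActiveRecord::Base.connection.execute(<<~SQL)
  CREATE MATERIALIZED VIEW order_totals AS
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
SQL

# Refresh from a scheduled or background job instead of aggregating on every request
ActiveRecord::Base.connection.execute("REFRESH MATERIALIZED VIEW order_totals")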

Background Processing

Video encoding, PDF generation, image processing: anything you can punt off to another process.

Processing large data files? Generating or calculating data sets? Delayed Job, Sidekiq, Resque, any of these can help.

If immediacy isn't required, you can always let something else do the job so your main application doesn't get bogged down.
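
A minimal ActiveJob sketch of punting PDF generation to a worker (backed by Sidekiq, Resque, or Delayed Job); the class names are made up:

class InvoicePdfJob < ApplicationJob
  queue_as :default

  def perform(invoice_id)
    invoice = Invoice.find(invoice_id)
    InvoicePdfGenerator.new(invoice).generate   # hypothetical generator class
  end
end

# The controller enqueues and returns immediately instead of generating inline
InvoicePdfJob.perform_later(invoice.id)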

Rails Caching

Depending on the version of Rails you use, read up on caching in the docs. Current versions support fragment, Russian-doll, and shared partial caching.
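
A minimal Russian-doll sketch in an ERB view, with made-up view path, models, and partial names:

<%# Hypothetical app/views/posts/show.html.erb %>
<% cache @post do %>
  <h1><%= @post.title %></h1>
  <%# Each comment gets its own cached fragment nested inside the post fragment %>
  <%= render partial: "comments/comment", collection: @post.comments, cached: true %>
<% end %>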

Code Optimization

The fun part of scaling. Refactoring and optimizations.

Back in Rails 2, just adding an explicit return at the end of a method actually sped up code.

rubycritic can help find code smells.

Learning how to optimize Ruby code also translates to Rails. Ruby uses a lot of memory: everything is an object, and objects take up memory. Rails, plus all the cool gems your team keeps adding, means even more objects. Variables hold objects, nil is an object, every object ultimately descends from BasicObject; it goes on and on.

Little things help too, like string interpolation vs. << or +: "#{first_name} #{last_name}" is generally the faster way to build that string.
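
Rather than take anyone's word for it, measure it. A quick sketch with the stdlib Benchmark module (names and iteration count are arbitrary):

require "benchmark"

first_name = "Bob"
last_name  = "Smith"
n = 1_000_000

Benchmark.bm(15) do |x|
  x.report("interpolation") { n.times { "#{first_name} #{last_name}" } }
  x.report("plus")          { n.times { first_name + " " + last_name } }
  x.report("shovel")        { n.times { String.new << first_name << " " << last_name } }
end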

Optimize the algorithm before you try to optimize the code.

Know Your Gems

Sometimes it's not your code, I know, right? Using too many gems can be bad, and not all gems are made alike. Keep gems updated!

bundler-audit checks your Gemfile.lock for gems with known vulnerabilities, a good nudge to keep them updated.

Using Haml and it feels slow? Look into Hamlit. Wink: check out hamlit-rails.

Using will_paginate and it's slow? Look at Pagy.

Consider fast_jsonapi, Panko, Oj, or Nativeson if you aren't happy with Active Model Serializers. Nativeson, if you're on Postgres, generates JSON natively from the database, so no serializing is needed. This is just an example of how many options you'll have when looking at other libraries for the same task.

Importing massive amounts of data into your database? Row-by-row inserts are pretty bad, so look into the activerecord-import gem.
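
A rough sketch with activerecord-import, turning many inserts into one multi-row INSERT; the file path, model, and columns are placeholders:

require "csv"

columns = [:name, :abbr_name]
values  = CSV.read("path/csv/states.csv", headers: true).map do |row|
  [row["name"], row["abbr_name"]]
end

# One bulk INSERT, skipping per-record validations
State.import columns, values, validate: false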

If you don't know your gems, Google each one to see if there are better alternatives.

When Optimizing and Refactoring

Know your metrics. Have a test suite and make sure you have adequate code coverage.

Use tools like New Relic, Scout, and Hound, and pay attention to logs to catch hidden bottlenecks.

Use gems that help:

  • bullet, for N+1 queries
  • ruby-prof, a profiler for MRI Ruby
  • benchmark, a stdlib module with methods to measure and report execution time
  • activerecord-explainer, explains all queries
  • PgHero, a Postgres performance dashboard
  • Etc., find things that will help you track down and document what is slow

If all else fails and none of this works, you can always rewrite the code in another language, or even better, assembly.
