How to optimize DB writes in Ruby on Rails

How to optimize DB writes in Ruby on Rails

Foreword

While read queries can heavily impact performance, writes too can do some damage. In this article, part 2 on our Rails Performance Series, we will examine ways to speed up write queries.

Checkout the part 1 about DB reads here.

Transactions

Transactions allow us to execute a series of operations atomically, as they were a single operation. If any operation inside the transaction fails, the whole blocks is canceled, and no operation is persisted (rollback).

A nice thing to know is that a database is much faster at committing a transaction with n writes than n writes each. So generally:

1000.times do
  User.create!(email: 'hey@example.com')
end

is much slower than

User.transaction do
  1000.times do
    User.create!(email: 'hey@example.com')
  end
end

Validations

Rails validations are very useful, although they introduce an overhead on record save. Uniqueness constraints are particularly costly, as they imply a SELECT query before saving to check if the value respects the constraint.
In some cases disabling validations can significantly speed up operations, especially when we are dealing with massive operations like imports and such.
To skip validations we can call the save method as:

record.save(validate: false)

DO NOT do this unless you are sure that performance is a real problem.

Massive insertions / updates

From Rails 6 onwards some handy methods were added to do mass operations, using a single INSERT INTO or UPDATE operation.

Those methods are insert_all, insert_all!, upsert_all and upsert_all!.

The main advantage is that insertion time is orders of magnitude faster.

Warning: with those methods model validations and callbacks will not be invoked. (before_create, after_save, etc.)

If you're using Rails < 6, you can achieve the same functionality using this gem:
github.com/jamis/bulk_insert

Massive deletions

The same thing is valid for deletions.

If you need to prune mass data, consider that this:

User.each(&:destroy)

Is much slower than this:

User.transaction do
  User.each(&:destroy)
end

And it's much much slower than this:

User.delete_all

But, as noted before, delete_all skips all validations and before_destroy hooks, so be aware of it.

Minimize number of queries

It's pretty obvious, but a way to reduce writing performance is, as always, to reduce the number of queries.

So, try to make changes to a model all in the same place or collect changes in the model and invoke a query just once.

For instance:

  • Don't make call an #update in the after_create hook. Use before_create and change the column, instead.

Don't

class User
  after_create do
    update(is_valid_user: true) if email.ends_with?('@gmail.com')
  end
end

Do

class User
  before_create do
    self.is_valid_user = true if email.ends_with?('@gmail.com')
  end
end
  • Don't call #touch if you just made a save to the model. It's been already touch-ed by the save
  • Don't call save multiple times on the same model during the same request.

Don't

user.update!(email: '___', password: '___')
...
...
user.update!(is_valid: true) if user.email.ends_with?('@gmail.com')
user.update!(metadata: metadata) unless metadata.nil?

Do

user.email = '___'
user.password: '___'
...
user.is_valid = true if user.email.ends_with?('@gmail.com')
user.metadata = metadata unless metadata.nil?
user.save! # Just one query
######
# Or, as an alternative:
######
user.assign_attributes(email: '___', password: '___')
...
user.is_valid = true if user.email.ends_with?('@gmail.com')
user.metadata = metadata unless metadata.nil?
user.save! # Just one query

Indexes, triggers and performance

Indexes are great for performance, in general. They speed up read queries a LOT, making a trade-off between speed and occupied memory.

However, you should consider that they also add an extra overhead during writes. In general, this is trivial.

In some scenario, however, if you have too many (unused) indexes, you may see some slowdowns during inserts.

To find it out, just run the query in the console and check the execution time:

User Create (4.3ms)  INSERT INTO "users" ("email", "created_at", "updated_at", "confirmation_token", "confirmation_sent_at") VALUES ($1, $2, $3, $4, $5) RETURNING "id"

In general, it should be a few milliseconds. If it's much higher (like 300ms), it could be caused by too many indexes. Try to prune redundant ones.

A few hints:

  • Indexes on column with a low cardinality are usually less effective and useless (example: booleans, enumerators, values often NULL)
  • Compound indexes always cover single indexes on the first column. So INDEX ON (first_name, last_name) also covers searches on the first_name column on ly (but not on last_name).

The same issue can appear with triggers: don't over use them, because they can slow down a LOT your write performance.

About Us

This article is part of the Mònade Developers Blog, written by Piero Dotti.