Read Replica

Early in my career, I wrote code that read from a MySQL read replica and shipped a bug.

A user posts a comment and immediately reloads the page. The write went to the primary. The read hit the replica. Replication hadn't caught up. Their own comment wasn't there. "Did my post go through?" comes the support ticket. It did. The replica just hadn't caught up yet. I'd completely failed to account for replication lag.

Writes go to the primary. Reads go to replicas. Most web traffic is reads, so lining up replicas gets you a long way in terms of scale.

After that bug, I started paying attention. If data isn't found on the replica right after an INSERT, read from the primary. If you need immediate consistency after an update, don't read from the replica at all. Crude but reliable. I also picked up the habit of monitoring Seconds_Behind_Source — it used to go by a different name.

Incidentally, MySQL 8.0 lets you compare commit timestamps directly through the Performance Schema. Seconds_Behind_Source can be inaccurate with multi-threaded replication, so that's the recommended approach now. The name changed, the measurement changed, but the job is the same. Measuring how far behind the replica is.

The more you distribute, the further consistency drifts. The CAP theorem says so. Reading is easy. Knowing where to read from is hard.

Knowing when to read is harder still.

Read Replica ​

Read Replica