Dry Run
I once watched a friend wrestle with JMeter. A major update was about to ship. He was running load tests on a setup built to mirror production.
Load testing before a major release is miserable. You need the same data volume, the same traffic patterns, the same middleware configuration. All of it. If the dataset runs to several terabytes, just copying it takes half a day. Batch jobs, external API integrations, cache hit rates. Even when you match every variable, it still is not production.
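Matching the traffic pattern, not just the volume, is the part people skip. A minimal sketch of one approach, replaying recorded request timestamps so bursts and lulls survive; the `fire` callback and the offsets are illustrative assumptions, not a JMeter API:

```python
import time

def replay(offsets, fire, speedup=1.0):
    """Replay requests with their recorded inter-arrival gaps.

    offsets: seconds since the first recorded request, sorted ascending.
    fire: callback invoked once per request (a real test would issue an
    HTTP call here; this callback is a stand-in, not a real tool's API).
    Reproducing the shape of the traffic matters more than hitting a
    flat requests-per-second average.
    """
    t0 = time.perf_counter()
    for i, off in enumerate(offsets):
        # Sleep until this request's scheduled moment, compensating
        # for time already spent firing earlier requests.
        delay = off / speedup - (time.perf_counter() - t0)
        if delay > 0:
            time.sleep(delay)
        fire(i)

fired = []
replay([0.0, 0.01, 0.02, 0.05], fired.append, speedup=2.0)
```

The point of the sketch is the scheduling loop: a flat rate generator would smooth out exactly the bursts you are trying to test.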
Infrastructure as Code did not exist back then. Building a second production environment in the cloud meant someone did it by hand. Follow the runbook, stand up each server, install the middleware, align the settings. Drift from production hides somewhere in the stack. You never spot it in time.
Bringing production data into a test environment means masking personal information. Names, email addresses, phone numbers, physical addresses. Tedious but non-negotiable. Mask too loosely and you have an incident. Mask too aggressively and the data distribution shifts. The test loses meaning.
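One way to thread that needle is deterministic pseudonymization with a keyed hash: the same input always maps to the same output, so joins and uniqueness counts survive, but the original value is unrecoverable without the key. A sketch, with illustrative names and an assumed email-only scope:

```python
import hashlib
import hmac

# Masking key; in practice this stays out of the test environment
# entirely and gets rotated.
SECRET = b"rotate-me"

def mask_email(email: str) -> str:
    """Deterministically pseudonymize the local part of an email.

    Same input -> same output, so referential integrity and the
    distribution of distinct values are preserved; the domain is kept
    so routing-related behavior still looks realistic.
    """
    local, _, domain = email.partition("@")
    digest = hmac.new(SECRET, local.encode(), hashlib.sha256).hexdigest()[:12]
    return f"user_{digest}@{domain}"

masked = mask_email("alice@example.com")
```

The same idea extends to names and phone numbers, with the caveat that format-sensitive fields need format-preserving variants.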
Cloud infrastructure carries its own problem. Identical instance specs do not guarantee identical performance. Cloud instances share physical hosts with other tenants. A neighbor consuming CPU or disk I/O drags your instance down with it. The noisy neighbor problem. Yesterday's benchmark looked clean. Today it is slow. Non-reproducible performance degradation is the worst kind. Bare-metal instances exist, technically, but at that cost you might as well stay on-premises. I have never seen anyone use them for this.
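The only real defense is to never trust a single run. A sketch of the habit: time the same workload repeatedly on the same instance and look at the relative spread. A wide spread points at interference rather than a regression in the code under test; any cutoff you pick for "too wide" is a judgment call, not a standard:

```python
import statistics
import time

def measure(workload, runs=7):
    """Time the same workload several times and report the spread.

    Returns (mean seconds, coefficient of variation). A high CV on a
    supposedly idle instance suggests a noisy neighbor, not your code.
    """
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - t0)
    mean = statistics.mean(samples)
    cv = statistics.stdev(samples) / mean if mean > 0 else 0.0
    return mean, cv

mean, cv = measure(lambda: sum(range(50_000)))
```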
In the end, production load can only be reproduced in production. You take a maintenance window, cut over, and pray. Load testing is the work of stacking evidence beneath that prayer.
That friend eventually stopped accepting load-testing requests. I understand. It was fascinating to watch from the side. Ask me to do it myself and I would politely decline.