Cost Optimization in Cloud Architecture: Spot Instances and Reserved Capacity Strategies
The $2M Surprise
A mid-stage startup migrates to AWS, runs everything on On-Demand EC2, ships fast, and then opens the billing dashboard three months later. The number is not what they expected. It never is. The workload was predictable: a steady 200-instance baseline with weekend traffic spikes. Had they structured their purchasing strategy deliberately, that bill could have been 60–70% smaller. The gap between “it works” and “it works cost-efficiently” in cloud infrastructure is almost always a purchasing strategy problem, not an architecture problem.
Three Ways to Buy the Same Compute
Cloud providers sell compute capacity through three purchasing models. Understanding the mechanics of each — not just the discount percentages — is what separates engineers who architect cost-efficient systems from those who over-provision and overpay.
On-Demand is the baseline. You pay per second (or per hour), no commitment, full list price. The flexibility is real: spin up, tear down, no questions asked. But you’re paying a significant premium for that optionality. On-Demand is the right choice for unpredictable spikes, stateful workloads you can’t gracefully interrupt, and new services whose utilization profile you haven’t yet characterized.
Reserved Instances (RIs) / Savings Plans commit you to a usage level for 1 or 3 years in exchange for 30–60% discounts versus On-Demand. There are two RI flavors worth distinguishing. Standard RIs lock you into a specific instance type and region: maximum discount, minimum flexibility. Convertible RIs let you change instance families or operating systems during the term, at a smaller discount. AWS Savings Plans are an evolution: instead of reserving specific instances, you commit to a dollar-per-hour spend level, and the discount applies automatically to any eligible usage. For teams with evolving workloads, a Compute Savings Plan is almost always preferable to Standard RIs, because you’re not penalized for rightsizing or changing instance families. (EC2 Instance Savings Plans give up part of that flexibility: the commitment is tied to one instance family in one region, in exchange for a deeper discount.)
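To make the commitment mechanics concrete, here is a minimal sketch of one hour of billing under a Savings Plan. It assumes a single flat discount across all usage (real plans price each instance type separately), and the function name, parameters, and dollar figures are illustrative, not part of any AWS billing API.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Total charge for one hour under a simplified Savings Plan model.

    on_demand_usage: what the hour's usage would cost at On-Demand rates ($)
    commitment:      committed spend for the hour, at discounted rates ($)
    discount:        fractional discount versus On-Demand (0.5 means 50% off)
    """
    # On-Demand-equivalent usage that the commitment can absorb.
    covered_capacity = commitment / (1.0 - discount)
    # Anything beyond that is billed at full On-Demand rates.
    uncovered = max(0.0, on_demand_usage - covered_capacity)
    # The commitment is owed in full even when usage falls short of it.
    return commitment + uncovered


busy_hour = hourly_bill(on_demand_usage=30.0, commitment=10.0, discount=0.5)
idle_hour = hourly_bill(on_demand_usage=0.0, commitment=10.0, discount=0.5)
```

With a $10/hour commitment at an assumed 50% discount, an hour with $30 of On-Demand-equivalent usage costs $20, and a completely idle hour still costs $10. That second case is why the size of the commitment matters more than the headline discount.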
The non-obvious failure mode with RIs: teams buy them based on current peak utilization rather than the stable baseline. If you reserve 500 c5.2xlarge instances because your peak hits 500, you’ll be paying for reserved capacity that sits idle 70% of the time. Reserve for your steady-state floor, meaning the capacity that runs 24/7 regardless of traffic. Use On-Demand or Spot for everything above that.
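The arithmetic behind that advice can be sketched in a few lines. The prices (in cents per instance-hour) and the one-day demand profile are made up for illustration, and the model ignores upfront payments and partial hours.

```python
ON_DEMAND_CENTS = 100  # hypothetical On-Demand price per instance-hour
RESERVED_CENTS = 60    # hypothetical reserved price (a 40% discount)


def steady_state_floor(hourly_demand):
    """The capacity that runs every hour: the minimum of the demand series."""
    return min(hourly_demand)


def reservation_cost(hourly_demand, reserved):
    """Total cost in cents when `reserved` instances are reserved
    and any demand above that runs On-Demand."""
    total = 0
    for demand in hourly_demand:
        total += reserved * RESERVED_CENTS                     # owed even when idle
        total += max(0, demand - reserved) * ON_DEMAND_CENTS   # overflow above the reservation
    return total


# One illustrative day: 20 hours at a 200-instance baseline, 4 hours at a 500-instance peak.
demand = [200] * 20 + [500] * 4

floor = steady_state_floor(demand)
cost_at_peak = reservation_cost(demand, max(demand))
cost_at_floor = reservation_cost(demand, floor)
cost_all_on_demand = reservation_cost(demand, 0)
```

Under these assumed numbers, reserving at the peak costs $7,200 for the day, which is more than the $6,000 of buying nothing in advance. Reserving at the 200-instance floor comes to $4,080.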
Spot Instances are unused EC2 capacity sold at 60–90% discounts versus On-Demand. The catch: AWS can reclaim them with a two-minute warning when that capacity is needed elsewhere. This isn’t a theoretical risk. Spot interruption rates vary by instance family and region, ranging from under 1% to over 20% depending on demand conditions. Building on Spot means accepting interruption as a design constraint, not an exception.
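One way to treat interruption as a design constraint is to poll for the notice from inside the instance. The sketch below assumes the EC2 instance metadata service (IMDSv2) and its spot/instance-action path, which answers 404 until an interruption is scheduled. The names watch_for_interruption and drain, and the polling interval, are hypothetical choices for illustration.

```python
import json
import time
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"


def parse_instance_action(status, body):
    """Return the interruption notice as a dict, or None if none is pending."""
    if status == 404:  # no interruption scheduled yet
        return None
    if status == 200:  # e.g. {"action": "terminate", "time": "..."}
        return json.loads(body)
    raise RuntimeError(f"unexpected metadata response: {status}")


def fetch_instance_action(timeout=1.0):
    """Query the metadata service; only works from inside an EC2 instance."""
    # IMDSv2: obtain a short-lived session token first.
    token_request = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()

    request = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return parse_instance_action(response.status, response.read().decode())
    except urllib.error.HTTPError as error:
        # urllib raises on 404, which here simply means "nothing pending".
        return parse_instance_action(error.code, "")


def watch_for_interruption(drain, poll_seconds=5.0,
                           fetch=fetch_instance_action, sleep=time.sleep):
    """Poll until a notice appears, then hand it to `drain` and return it."""
    while True:
        notice = fetch()
        if notice is not None:
            drain(notice)
            return notice
        sleep(poll_seconds)
```

In a real service, the drain callback would stop accepting new work, checkpoint whatever is in flight, and deregister the instance, all within the two-minute window. The fetch and sleep parameters are injectable so the loop can be exercised without a live instance.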
The practical approach for most production systems is a mixed fleet: Reserved Instances for the steady-state baseline (your always-on application tier), On-Demand for burst capacity that needs to be reliable, and Spot for batch workloads, CI/CD runners, data processing jobs, and stateless services that can handle interruption gracefully.
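The mixed-fleet split can be sketched as a simple allocation rule: fill the reserved baseline first, send interruptible overflow to Spot, and put the rest on On-Demand. This is one possible policy rather than a prescribed one, and the prices (cents per instance-hour) and names below are assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical prices in cents per instance-hour.
RESERVED_CENTS = 60
ON_DEMAND_CENTS = 100
SPOT_CENTS = 30


@dataclass(frozen=True)
class Allocation:
    reserved: int
    on_demand: int
    spot: int


def allocate(demand, reserved_baseline, interruptible):
    """Split one hour of demand (in instances) across the three purchasing models.

    interruptible: number of demanded instances whose work tolerates interruption.
    """
    reserved = min(demand, reserved_baseline)   # baseline is used first
    overflow = demand - reserved                # everything above the baseline
    spot = min(overflow, interruptible)         # interruptible overflow goes to Spot
    return Allocation(reserved=reserved, on_demand=overflow - spot, spot=spot)


def hourly_cost(allocation, reserved_baseline):
    """Cost in cents; the full reserved baseline is owed even if partly unused."""
    return (reserved_baseline * RESERVED_CENTS
            + allocation.on_demand * ON_DEMAND_CENTS
            + allocation.spot * SPOT_CENTS)


peak = allocate(demand=500, reserved_baseline=200, interruptible=180)
quiet = allocate(demand=150, reserved_baseline=200, interruptible=180)
```

At the 500-instance peak this yields 200 reserved, 120 On-Demand, and 180 Spot instances, for 29,400 cents per hour against 50,000 cents if everything ran On-Demand at the assumed price. In the quiet hour nothing overflows, but the 200-instance reservation is still billed in full.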


