Inek

RDS Right-Sizing: Why It's Harder Than It Looks

01 Mar 2026

RDS is often the largest single line item in an AWS bill and the one teams are most afraid to touch.

On paper, right-sizing looks simple. In practice, it rarely is.

The reasoning usually looks like this:

This is where most cost optimization attempts go wrong.

The false signals

CPU utilization lies.

A common strategy:

“Monitor maximum CPU over the last N weeks or months. Resize if it stays below a certain threshold.”

This is partially right, but it is not sufficient.

CPU can look healthy while the database is already constrained, or look scary while the database is perfectly fine.

CPU tells you how busy the CPU cores are, not whether the database is the bottleneck.

How CPU lies

Lie #1: CPU is low, but the DB is struggling

This happens when the bottleneck is elsewhere:

In these cases:

You pay more.

The problem stays.

CPU is a signal, not a verdict.

Lie #2: CPU spikes don’t mean “we need a bigger instance”

CPU spikes are normal.

Common examples:

If you size for peak CPU, you end up paying 24/7 for minutes of pressure.

This is one of the most common RDS cost traps. How many downsize initiatives are frozen because of a single isolated weekly peak?
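One way to keep a single isolated peak from blocking a downsize is to look at the distribution of CPU samples rather than only the maximum. The sketch below is a minimal illustration, not a sizing rule: the sample data, the 80% "busy" threshold, and the function names are all assumptions made for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_cpu(samples, busy_threshold=80.0):
    """Contrast the single worst datapoint with the sustained picture."""
    busy = sum(1 for s in samples if s >= busy_threshold)
    return {
        "max": max(samples),
        "p95": percentile(samples, 95),
        # Share of samples at or above the (assumed) busy threshold.
        "busy_fraction": busy / len(samples),
    }


# Hypothetical week of hourly CPU readings: quiet, plus one 2-hour batch job.
week = [12.0] * 166 + [95.0, 97.0]
summary = summarize_cpu(week)
print(summary)
```

In this made-up week the maximum says "almost saturated" while the 95th percentile and the busy fraction say the instance is idle nearly all the time. Whether the two busy hours can be tolerated, moved, or tuned is a separate question that the numbers alone do not answer.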

Lie #3: CPU hides bad architecture

High CPU is often caused by:

Scaling RDS “fixes” it (temporarily) and locks in higher cost.

This is cost optimization debt.

Memory is not optional

A database does not gracefully degrade when memory is tight. It fails loudly.

Memory in RDS is used for:

When memory is insufficient:

And here’s the trap:

Teams see “FreeableMemory still > 0” and assume they’re safe.

They are not.

Why FreeableMemory lies

FreeableMemory includes:

But databases want to keep memory hot.

A low-but-stable FreeableMemory is often good.

A sawtooth pattern (drops -> sudden reclaim -> drops again) is a warning sign:
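The difference between "low but stable" and "sawtooth" can be checked roughly in code. The sketch below counts sudden upward jumps in a FreeableMemory series; the 50% jump ratio, the minimum of two events, and the sample values are assumptions chosen for illustration and would need tuning against real data.

```python
def count_reclaims(freeable, min_jump_ratio=0.5):
    """Count upward jumps of at least min_jump_ratio relative to the previous sample."""
    events = 0
    for prev, cur in zip(freeable, freeable[1:]):
        if prev > 0 and (cur - prev) / prev >= min_jump_ratio:
            events += 1
    return events


def looks_sawtooth(freeable, min_events=2, min_jump_ratio=0.5):
    """Flag a series with repeated sudden reclaims (drop, jump back up, drop again)."""
    return count_reclaims(freeable, min_jump_ratio) >= min_events


# Hypothetical FreeableMemory samples in GiB.
stable = [2.0, 2.1, 1.9, 2.0, 2.1]
sawtooth = [8.0, 6.0, 4.0, 2.0, 8.0, 6.0, 4.0, 2.0, 8.0]

print("stable flagged:", looks_sawtooth(stable))
print("sawtooth flagged:", looks_sawtooth(sawtooth))
```

A heuristic like this only narrows down which instances deserve a closer look. It does not explain why memory is being reclaimed.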

Cost optimization mistake #1

“CPU is low, so let’s downsize.”

This downsizes memory, so:

This is why memory is not optional.

It is the foundation, not an optimization lever.

If memory pressure exists, instance downscaling is not a cost optimization. It’s risk creation. Before downsizing an instance, study the memory pattern.

Storage & IOPS: the hidden coupling

The illusion

“We’re not I/O bound. CPU is fine.”

Reality:

You don’t choose to care about IOPS. Memory problems force you to.

Once memory becomes insufficient, storage performance stops being a secondary concern and becomes the primary constraint.
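A back-of-the-envelope model makes the coupling concrete. It assumes that every read not served from memory becomes one storage read, which ignores writes, read-ahead, and OS-level caching, so treat it as a simplification rather than a prediction. The function name and the workload figures are hypothetical.

```python
def physical_reads_per_sec(logical_reads_per_sec, cache_hit_ratio):
    """Reads that miss the in-memory cache and must go to storage."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return logical_reads_per_sec * (1.0 - cache_hit_ratio)


# Hypothetical workload: 20,000 logical reads per second.
logical = 20_000
for hit_ratio in (0.99, 0.95, 0.90):
    reads = physical_reads_per_sec(logical, hit_ratio)
    print(f"hit ratio {hit_ratio:.2f} -> ~{reads:,.0f} storage reads/s")
```

Under this model, a hit ratio that slips from 99% to 95% multiplies storage reads by five, because the miss rate goes from 1% to 5%. A small loss of memory headroom can therefore turn into a large increase in I/O demand.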

The dangerous coupling (especially on GP2)

With GP2:

Result:

This is classic cost displacement, not optimization.
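The coupling can be written down as arithmetic. The sketch below uses the published EBS gp2 figures (baseline of 3 IOPS per GiB, a floor of 100 IOPS, a ceiling of 16,000 IOPS per volume) and ignores burst credits. RDS may stripe storage across several volumes at larger sizes, so actual limits can differ; check the current AWS documentation before relying on these numbers. The function names are made up for the example.

```python
import math

# Published EBS gp2 figures; assumed here to apply to a single volume.
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000


def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a gp2 volume of the given size (burst not modeled)."""
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return min(GP2_MAX_IOPS, max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib))


def gp2_size_for_iops(target_iops):
    """Smallest size in GiB whose baseline meets target_iops."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("target exceeds the single-volume gp2 ceiling")
    if target_iops <= GP2_MIN_IOPS:
        return 1  # the 100 IOPS floor applies at any size
    return math.ceil(target_iops / GP2_IOPS_PER_GIB)


for size in (20, 500, 10_000):
    print(f"{size} GiB -> {gp2_baseline_iops(size)} baseline IOPS")
print("GiB needed for 3000 IOPS:", gp2_size_for_iops(3000))
```

Read in one direction, this is how storage gets over-provisioned just to buy IOPS. Read in the other direction, it shows how shrinking a gp2 volume also shrinks its baseline performance.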

GP3 helps — but doesn’t save you

GP3 decouples:

Good. But:

Answer: because memory was never sufficient.

Cost optimization mistake #2

“Let’s reduce storage, it’s expensive.”

Reducing storage without understanding I/O:

You save dollars and spend credibility.

Memory pressure silently converts compute cost into I/O cost.

What right-sizing actually requires:

Why teams delay (human factors)

Even when metrics suggest an instance is oversized, teams hesitate.

Databases are stateful, critical, and often poorly understood outside the platform team.

Fear of downtime, lack of representative staging environments, and unclear rollback plans make inaction feel safer than change.

Cost optimization without operational confidence rarely happens.

What “responsible” right-sizing actually looks like

Responsible right-sizing is not a one-off resize action.

It requires:

Without feedback loops, right-sizing becomes guesswork.

The real cost of getting it wrong

RDS right-sizing fails when it is treated as a resizing exercise instead of a systems decision.

CPU, memory, storage, and I/O are not independent knobs. They trade cost and risk across layers.

When teams optimize one signal in isolation, they often move cost rather than remove it or, worse, create instability that freezes future optimization attempts.

Right-sizing done responsibly is slower, more deliberate, and less dramatic.

But it is also repeatable.

And repeatability is what turns cost optimization from a one-time win into a discipline.
