Updated 21 May 2025 • 5 min read

A guide to IOPS, the input/output operations per second metric that defines storage performance. It covers what IOPS is, how it is measured, how it differs from throughput and latency, how IOPS works on AWS and Azure, how to size it for your workload, and how to avoid paying for performance you never use.
If your application ever feels slow for no obvious reason, storage is often the quiet culprit. The CPU looks fine. Memory looks fine. Yet requests crawl. In many of those cases, the bottleneck turns out to be IOPS.
IOPS is one of those terms that gets thrown around in cloud and infrastructure conversations, usually without a clear definition. People mix it up with speed, with bandwidth, with throughput. Getting it right matters because IOPS affects both how fast your systems run and how much you pay for storage.
This guide explains what IOPS actually is, how it is measured, how it differs from throughput and latency, and how it plays out on cloud platforms like AWS and Azure. By the end, you will know how to size storage for your workload without overpaying for performance you never use.
IOPS stands for Input/Output Operations Per Second. It is a measure of how many read and write operations a storage device or volume can complete in one second.
In plain terms, IOPS tells you how busy your storage can get. Every time an application reads a file, writes a log line, or updates a database row, that counts as an input/output operation. IOPS simply counts how many of those operations a disk or volume can handle each second.
A higher IOPS number means the storage can serve more requests per second. A traditional hard drive might manage a couple of hundred IOPS. A modern NVMe solid-state drive can deliver hundreds of thousands. That huge gap is exactly why IOPS matters so much for databases, virtual machines, and any latency-sensitive workload.
Here is the short answer if you only need one line. IOPS is the speed limit for how many small read and write requests your storage can process per second, and it is one of the three numbers that decide whether your storage feels fast or painfully slow.
IOPS is not a single fixed number stamped on a disk. The same volume can deliver very different IOPS depending on how it is used. Several factors shape the result, and the I/O block size is among the most important.
There is also a simple relationship worth memorizing. Throughput equals IOPS multiplied by I/O size. So a workload running 3,000 IOPS at a 4 KB block size moves roughly 12 MB per second. This is why you cannot talk about IOPS sensibly without also knowing the block size behind it.
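The relationship above can be sketched in a few lines of Python. This is a minimal illustration rather than a benchmarking tool: the function names are hypothetical, and the sketch assumes decimal units (1 MB = 1,000 KB), which is how 3,000 IOPS at 4 KB works out to roughly 12 MB per second.

```python
def throughput_mb_per_s(iops: float, block_size_kb: float) -> float:
    """Throughput = IOPS x I/O size, converted from KB/s to MB/s.

    Assumes decimal units (1 MB = 1,000 KB).
    """
    return iops * block_size_kb / 1000


def iops_for_throughput(throughput_mb_s: float, block_size_kb: float) -> float:
    """Invert the relationship: the IOPS needed to sustain a given
    throughput at a given block size."""
    return throughput_mb_s * 1000 / block_size_kb


# The example from the text: 3,000 IOPS at a 4 KB block size.
print(throughput_mb_per_s(3000, 4))  # 12.0

# The same IOPS at a larger block size moves far more data.
print(throughput_mb_per_s(3000, 64))  # 192.0
```

Running the same IOPS figure through different block sizes shows why a bare IOPS number says little until the block size is known.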
IOPS rarely travels alone. Storage performance is really a story told by three metrics together, and confusing them is the most common mistake people make.
| Metric | What It Measures | Unit | Simple Analogy |
|---|---|---|---|
| IOPS | Number of read/write operations per second | Operations per second | How many cars pass per minute |
| Throughput | Volume of data moved per second | MB/s or GB/s | How wide the highway is |
| Latency | Delay to complete a single operation | Milliseconds or microseconds | How long each car waits at the toll |
Here is how to think about it. IOPS counts the operations. Throughput measures the data those operations carry. Latency tells you how quickly each one finishes. A database needs high IOPS and low latency. A video streaming or backup workload cares far more about throughput. Match the metric to the job and the storage decision becomes much easier.
Different storage media live in different performance worlds. The numbers below are general ranges, not exact specs, but they show the scale of the differences.
| Storage Type | Typical IOPS Range | Best For |
|---|---|---|
| HDD (spinning disk) | 55 to 180 IOPS | Archives, backups, cold and bulk data |
| SATA SSD | 7,500 to 20,000 IOPS | General-purpose servers and apps |
| Enterprise SAS SSD | Tens of thousands of IOPS | Busy databases and virtualized hosts |
| NVMe SSD | Hundreds of thousands to 1M+ IOPS | High-performance databases and analytics |
In the cloud, you do not buy physical disks. You choose a volume type, and that choice sets your IOPS ceiling. This is where IOPS stops being a hardware spec and becomes a budgeting decision.
Amazon Elastic Block Store, or EBS, is the most common example. According to the official AWS EBS documentation, each volume type offers a different IOPS profile:
| EBS Volume Type | Max IOPS per Volume | Best For |
|---|---|---|
| gp3 (General Purpose SSD) | Up to 80,000 | Most workloads, boot volumes, mid-size databases |
| io2 Block Express (Provisioned IOPS) | Up to 256,000 | Mission-critical, I/O-intensive databases |
| st1 (Throughput Optimized HDD) | Lower IOPS, high throughput | Big data, logs, streaming workloads |
| sc1 (Cold HDD) | Lowest IOPS | Infrequently accessed, cost-sensitive data |
A useful detail: every gp3 volume includes a baseline of 3,000 IOPS and 125 MB/s of throughput at no extra cost, and you only pay more when you provision above that. At the top end, io2 Block Express is built for sub-millisecond latency and 99.999 percent durability, which is why it shows up under demanding databases like SAP HANA and Oracle.
Microsoft Azure follows the same idea with its managed disks. As covered in the Azure managed disk documentation, tiers like Premium SSD v2 and Ultra Disk let you set IOPS independently of disk size, scaling well into the hundreds of thousands of IOPS for the most demanding workloads.
One catch that trips up many teams: your virtual machine or instance has its own IOPS limit, separate from the disk. You can attach a very fast volume and still be capped by the instance. Always check both numbers.
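The "check both numbers" rule can be expressed as taking the smaller of the two limits. The sketch below is an assumption-laden illustration: `effective_iops` is a hypothetical helper, and the figures in the example are made up, not limits of any real instance type.

```python
def effective_iops(volume_iops_limits, instance_iops_limit):
    """Return the IOPS ceiling a workload can actually reach.

    The attached volumes cannot deliver more than their combined
    limits, and the instance caps whatever the volumes could deliver.
    """
    return min(sum(volume_iops_limits), instance_iops_limit)


# Hypothetical numbers: one 16,000 IOPS volume on an instance capped at 10,000.
print(effective_iops([16_000], 10_000))  # 10000

# Two 4,000 IOPS volumes on the same instance stay under the cap.
print(effective_iops([4_000, 4_000], 10_000))  # 8000
```

In the first case, 6,000 of the provisioned IOPS can never be used, regardless of how the workload behaves.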
Guessing your IOPS requirement is how budgets get wasted. A quick, structured estimate is far better. Here is a simple approach.
This five-step habit replaces the two failure modes most teams fall into: provisioning for an imagined worst case, or under-provisioning and discovering it during an outage.
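One way to make such an estimate repeatable is to encode the sizing rule used elsewhere in this guide: measured peak usage plus a buffer of 20 to 30 percent. The sketch below assumes that rule and a 3,000 IOPS included baseline (the gp3 figure); the function name and default values are illustrative, not a prescribed method.

```python
def recommended_iops(measured_peak_iops: int,
                     headroom_percent: int = 25,
                     included_baseline: int = 3000) -> int:
    """Suggest a provisioned IOPS figure from a measured peak.

    headroom_percent: buffer on top of the peak (20 to 30 is the
        range this guide suggests; 25 is an arbitrary midpoint).
    included_baseline: IOPS included at no extra cost, so there is
        nothing to gain by provisioning below it.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    target = (measured_peak_iops * (100 + headroom_percent) + 99) // 100
    return max(target, included_baseline)


print(recommended_iops(4000))        # 5000
print(recommended_iops(1000))        # 3000 (baseline already covers it)
print(recommended_iops(10001, 20))   # 12002
```

The input should be a peak observed over a representative period, not a forecast, since the point of the rule is to size against measured demand.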
Here is the part most performance guides skip. In the cloud, IOPS is not free, and provisioned IOPS is one of the easiest line items to overspend on.
On AWS gp3, IOPS above the free 3,000 baseline carries an additional per-IOPS monthly charge, and extra throughput is billed separately too. Provisioned IOPS volumes like io2 add an even higher per-IOPS cost. None of this is expensive on its own. The problem is scale. A few hundred over-provisioned volumes quietly turn into a serious monthly number.
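The scale effect is easy to see with a rough calculation. The sketch below assumes a gp3-style model in which only IOPS above an included baseline are billed. The per-IOPS price is a placeholder chosen for illustration, not a quoted AWS rate, so substitute the current price for your region.

```python
def extra_iops_monthly_cost(provisioned_iops: int,
                            price_per_iops_month: float,
                            included_iops: int = 3000) -> float:
    """Monthly charge for IOPS provisioned above the included baseline."""
    billable_iops = max(provisioned_iops - included_iops, 0)
    return billable_iops * price_per_iops_month


def fleet_monthly_cost(provisioned_per_volume, price_per_iops_month: float) -> float:
    """Sum the extra-IOPS charge across many volumes."""
    return sum(extra_iops_monthly_cost(iops, price_per_iops_month)
               for iops in provisioned_per_volume)


# Placeholder price for illustration only; check current provider pricing.
PLACEHOLDER_PRICE = 0.005

# Hypothetical fleet: 200 volumes, each provisioned at 8,000 IOPS.
fleet = [8_000] * 200
print(extra_iops_monthly_cost(8_000, PLACEHOLDER_PRICE))
print(fleet_monthly_cost(fleet, PLACEHOLDER_PRICE))
```

A single volume's charge looks negligible, and the fleet total does not, which is the pattern described above.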
In our experience, over-provisioned IOPS is one of the most common storage cost leaks, and it usually hides because the volume still works fine. Nothing breaks, so nobody looks. Treating storage performance as part of your wider cloud cost optimization effort, rather than a pure engineering setting, is what surfaces this kind of waste.
EXPERT INSIGHT
A pattern we see often: a team provisions io2 with high IOPS for a database launch, traffic never reaches the forecast, and the volume runs for months at a fraction of its provisioned performance. The fix is rarely dramatic. It is usually a switch to gp3, or simply dialing the provisioned IOPS down to match real demand. The savings are real, and the application does not notice the change at all.
Most IOPS problems are not exotic. They come from the same handful of mistakes, on the performance side and the cost side alike.
Ignoring instance-level limits. Attaching a fast volume to an instance that caps IOPS below the volume's limit means paying for performance the instance can never use. This is one of several common cloud cost mistakes that quietly inflate an AWS bill while performance still looks acceptable.
Never monitoring actual usage. If you do not track real read and write operations, you cannot tell whether you are over-provisioned or under-provisioned.
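Monitoring tools commonly report operation counts per interval rather than IOPS directly (on AWS, for example, the CloudWatch `VolumeReadOps` and `VolumeWriteOps` metrics for EBS). The sketch below shows how such counts could be turned into an IOPS figure and compared with what is provisioned. The helper names and the 40 and 90 percent thresholds are assumptions for illustration, not recommended values.

```python
def average_iops(read_ops: int, write_ops: int, period_seconds: int) -> float:
    """Convert operation counts over an interval into average IOPS."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_ops + write_ops) / period_seconds


def provisioning_status(peak_iops: float, provisioned_iops: float,
                        low: float = 0.4, high: float = 0.9) -> str:
    """Classify a volume by how much of its provisioned IOPS it uses.

    The low/high thresholds are illustrative assumptions.
    """
    utilization = peak_iops / provisioned_iops
    if utilization >= high:
        return "under-provisioned"
    if utilization <= low:
        return "over-provisioned"
    return "ok"


# Hypothetical 5-minute sample: 540,000 reads and 360,000 writes.
iops = average_iops(540_000, 360_000, 300)
print(iops)                                 # 3000.0
print(provisioning_status(iops, 16_000))    # over-provisioned
```

Averages over long intervals hide short bursts, so a real review would look at the highest values across many short samples rather than a single average.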
Good IOPS management is a balance. You want enough performance for the busy moments and not a dollar more. A few practical habits get you there.
None of these steps are difficult. They simply require treating storage as something you measure and revisit, not something you set once and forget.
Understanding IOPS is the first step. Keeping storage performance and storage spend in balance, across hundreds of volumes, is the harder ongoing job. That is where Opslyft helps.
Opslyft is a FinOps platform that brings visibility and accountability to cloud spend across AWS, Azure, GCP, and Kubernetes, including the storage layer where IOPS costs live. Instead of finding over-provisioned volumes by accident, teams see them clearly.
In practice, Opslyft supports storage and IOPS cost management in a few concrete ways:
The goal is simple: to turn storage performance from a setting nobody revisits into a cost you actively manage.
IOPS is one of the most important storage metrics, yet one of the most misunderstood. It measures how many operations your storage can handle, and it works hand in hand with throughput and latency to decide whether your systems feel fast.
Size IOPS to your real workload, watch how it differs from throughput, and review it regularly. Get that right and you gain something rare in the cloud: strong performance and a storage bill you can actually predict.
IOPS stands for Input/Output Operations Per Second. It measures how many read and write operations a storage device or cloud volume can complete in one second, and it is a core indicator of storage performance.
IOPS counts the number of read and write operations per second, while throughput measures the volume of data moved per second, usually in MB/s. IOPS is about operation count; throughput is about data size. Throughput equals IOPS multiplied by the I/O block size.
There is no universally good IOPS number. It depends on the workload. A small website may be fine with a few hundred IOPS, while a busy transactional database can need tens of thousands. The right target is your measured peak usage plus a buffer of 20 to 30 percent.
Cloud providers charge for provisioned IOPS above a baseline. On AWS gp3, the first 3,000 IOPS are included, and anything above that adds a per-IOPS monthly cost. Provisioned IOPS volumes cost even more, so over-provisioning quietly inflates storage bills.
More IOPS does not always mean better performance. If your workload is limited by throughput or latency rather than operation count, adding IOPS will not help. You may also be capped by your instance-level IOPS limit. Always match the metric you increase to the bottleneck you actually have.
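A rough first check is to compare observed usage against each limit and see which one the workload is actually pressing on. The sketch below is a simplified heuristic under stated assumptions: the function name and the 90 percent "near the limit" threshold are hypothetical, and it does not consider latency or instance-level caps.

```python
def likely_bottleneck(observed_iops: float, iops_limit: float,
                      observed_mb_s: float, throughput_limit_mb_s: float,
                      near_limit: float = 0.9) -> str:
    """Guess which limit a volume is hitting, if any.

    near_limit is an assumed threshold: a metric at or above this
    fraction of its limit is treated as saturated.
    """
    iops_ratio = observed_iops / iops_limit
    throughput_ratio = observed_mb_s / throughput_limit_mb_s
    if iops_ratio >= near_limit and iops_ratio >= throughput_ratio:
        return "iops"
    if throughput_ratio >= near_limit:
        return "throughput"
    return "neither"


# Hypothetical volume with a 3,000 IOPS and 125 MB/s limit.
print(likely_bottleneck(1_000, 3_000, 120, 125))   # throughput
print(likely_bottleneck(2_900, 3_000, 40, 125))    # iops
print(likely_bottleneck(500, 3_000, 20, 125))      # neither
```

In the first case, buying more IOPS would change nothing, because the volume is moving as much data as its throughput limit allows.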