Good news for developers, SREs, and cloud engineers: the Amazon CloudWatch Agent now supports collecting detailed performance statistics for EBS volumes attached to EC2 instances and EKS nodes.
This means you can finally monitor and troubleshoot your EBS storage like a pro, with visibility into NVMe-level metrics such as:
- IOPS (read/write operations)
- Throughput (bytes read/written)
- I/O wait time
- Queue depth
Let's break it down with a real-world example.
Use Case: App is Slow, But CPU & RAM Look Fine?
You're running a production web app on EC2 with a gp3 EBS volume.
The app gets sluggish during peak hours, but CloudWatch shows:
- CPU: fine
- Memory: fine
- Network: fine
Now, thanks to the new update, you can collect EBS disk-level metrics and discover the real problem.
Step-by-Step Example
Step 1: Enable EBS Metrics in CloudWatch Agent
Update your amazon-cloudwatch-agent.json config:
{
  "metrics": {
    "metrics_collected": {
      "diskio": {
        "resources": ["*"],
        "measurement": [
          "reads", "writes", "read_bytes", "write_bytes",
          "io_time", "await", "util", "queue"
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}
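Before restarting the agent, it can be worth a quick structural check of the file. The sketch below is a hypothetical helper (`check_diskio_config` is not part of the agent) that only confirms the JSON parses and that the `diskio` section has the keys used in the config above. It does not verify that each measurement name is one the agent accepts; that still needs checking against the agent documentation for your version.

```python
import json

# Keys the config in this article uses; this is NOT an official schema.
EXPECTED_KEYS = ("resources", "measurement")

def check_diskio_config(text: str) -> list[str]:
    """Return a list of structural problems (empty if none were found)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(config, dict):
        return ["top-level JSON value must be an object"]

    diskio = config.get("metrics", {}).get("metrics_collected", {}).get("diskio")
    if diskio is None:
        return ["missing metrics.metrics_collected.diskio"]

    problems = [f"diskio.{key} is missing" for key in EXPECTED_KEYS if key not in diskio]

    # Flag measurement names listed more than once (string entries only).
    names = [m for m in diskio.get("measurement", []) if isinstance(m, str)]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        problems.append(f"duplicate measurements: {duplicates}")
    return problems

# The same config as above, inlined so the sketch is self-contained.
SAMPLE = """
{
  "metrics": {
    "metrics_collected": {
      "diskio": {
        "resources": ["*"],
        "measurement": ["reads", "writes", "read_bytes", "write_bytes",
                        "io_time", "await", "util", "queue"],
        "metrics_collection_interval": 60
      }
    }
  }
}
"""

print(check_diskio_config(SAMPLE) or "no structural problems found")
```

In practice you would read the text from your own config path instead of the inlined `SAMPLE` string.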
Then restart the agent:
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config -m ec2 -c file:/path/to/config.json -s
Step 2: View in CloudWatch
You'll now see custom metrics like:
- await: time the app waits for I/O
- queue: how many I/O ops are waiting
- io_time: total time EBS spends on operations
- read_bytes, write_bytes: data throughput
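If you would rather pull these metrics programmatically than browse the console, here is a minimal sketch of the query payload. It rests on a few assumptions: the agent publishes to its default `CWAgent` namespace, metric names carry a `diskio_` prefix, and the metrics have `InstanceId` and `name` (device) dimensions. The dimensions in particular depend on how your agent config is set up. `build_queries`, the instance ID, and the device name are all hypothetical.

```python
import json

def build_queries(instance_id, device, measurements, period=60, stat="Average"):
    """Build a MetricDataQueries list for CloudWatch GetMetricData."""
    queries = []
    for index, measurement in enumerate(measurements):
        queries.append({
            "Id": f"m{index}",  # query IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "CWAgent",                # assumed: agent default
                    "MetricName": f"diskio_{measurement}",  # assumed: plugin prefix
                    "Dimensions": [
                        {"Name": "InstanceId", "Value": instance_id},
                        {"Name": "name", "Value": device},
                    ],
                },
                "Period": period,
                "Stat": stat,
            },
            "ReturnData": True,
        })
    return queries

# Placeholder identifiers; substitute your own instance and device.
queries = build_queries("i-0123456789abcdef0", "nvme1n1", ["io_time", "read_bytes"])
print(json.dumps(queries[0], indent=2))
```

The resulting list can be passed as `MetricDataQueries` to boto3's `get_metric_data`, together with a `StartTime` and `EndTime`. If nothing comes back, check the metric's actual dimensions in the console first.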
Step 3: Analyze & Act
During peak load:
- queue = 22 (too high)
- await = 120ms (delays noticeable)
- write_bytes drops sharply
Root cause: EBS is bottlenecked. Time to provision more IOPS or switch from gp3 to io2.
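A rough cross-check on numbers like these comes from Little's law (average items in a system = arrival rate × average time in the system), which lets you estimate the I/O rate implied by a queue depth and a wait time. The sketch below assumes `queue` and `await` are averages over the same window and that `await` is a per-operation latency; `implied_iops` is a hypothetical helper. The gp3 baseline of 3,000 IOPS is used only as a reference point.

```python
GP3_BASELINE_IOPS = 3000  # gp3 baseline; your volume may be provisioned higher

def implied_iops(queue_depth: float, await_ms: float) -> float:
    """Estimate operations per second from Little's law: rate = depth / latency."""
    if await_ms <= 0:
        raise ValueError("await_ms must be positive")
    return queue_depth / (await_ms / 1000.0)

# Peak-load numbers from the example above.
iops = implied_iops(queue_depth=22, await_ms=120)
print(f"implied IOPS: {iops:.0f}")  # 183
print(f"share of gp3 baseline IOPS: {iops / GP3_BASELINE_IOPS:.1%}")
```

If the implied rate sits far below the volume's provisioned IOPS, it is worth comparing `read_bytes` and `write_bytes` against the volume's throughput limit too. That helps you decide whether extra IOPS or extra throughput is the better fix.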
Why This Matters
Benefit | Impact |
---|---|
Granular storage insights | Understand app latency at disk level |
Real-time metrics | Catch slowdowns before users do |
Automation ready | Build alarms & dashboards |
Works with EC2 + EKS | Great for both VMs & containers |
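To make the "automation ready" row concrete, here is a minimal sketch of alarm parameters for a queue-depth metric. The parameter names match CloudWatch's PutMetricAlarm API, but the metric name, dimensions, threshold, and identifiers are assumptions for illustration, and `queue_alarm_params` is a hypothetical helper.

```python
def queue_alarm_params(instance_id, device, metric_name, threshold, periods=3):
    """Build keyword arguments for CloudWatch PutMetricAlarm."""
    return {
        "AlarmName": f"ebs-queue-depth-{instance_id}-{device}",
        "Namespace": "CWAgent",     # assumed: agent default namespace
        "MetricName": metric_name,  # use the name shown in your console
        "Dimensions": [
            {"Name": "InstanceId", "Value": instance_id},
            {"Name": "name", "Value": device},
        ],
        "Statistic": "Average",
        "Period": 60,                # seconds, matching the collection interval
        "EvaluationPeriods": periods,  # consecutive breaching periods required
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "missing",
    }

# Placeholder values: "diskio_queue" and the threshold of 16 are assumptions.
params = queue_alarm_params("i-0123456789abcdef0", "nvme1n1", "diskio_queue", 16)
print(params["AlarmName"], params["Threshold"])
```

With boto3 installed and credentials configured, something like `boto3.client("cloudwatch").put_metric_alarm(**params)` would create the alarm. Requiring several consecutive periods keeps a single noisy minute from paging anyone.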
TL;DR
CloudWatch Agent now supports:
- NVMe-based EBS performance metrics
- Queue depth, IOPS, throughput, and more
- Alarms, dashboards, and smarter diagnostics
No more guessing: now you can see and solve storage bottlenecks confidently.
Have you started using EBS metrics in your monitoring stack?
Drop your setup or questions in the comments.