Latchu@DevOps

📊 New: Amazon CloudWatch Agent Now Supports Detailed EBS Performance Metrics (June 2025)

Good news for developers, SREs, and cloud engineers: Amazon CloudWatch Agent now supports collecting detailed performance statistics for EBS volumes attached to EC2 instances and EKS nodes.

This means you can finally monitor and troubleshoot your EBS storage like a pro, with visibility into NVMe-level metrics such as:

  • 🔍 IOPS (read/write operations)
  • 📦 Throughput (bytes read/written)
  • ⏱️ I/O wait time
  • 🎯 Queue depth

Let's break it down with a real-world example.


🔧 Use Case: App Is Slow, But CPU & RAM Look Fine?

You're running a production web app on EC2 with a gp3 EBS volume.
The app gets sluggish during peak hours, but CloudWatch shows:

  • CPU: fine
  • Memory: fine
  • Network: fine

Now, thanks to the new update, you can collect EBS disk-level metrics and discover the real problem.


🧪 Step-by-Step Example

Step 1: Enable EBS Metrics in CloudWatch Agent

Update your amazon-cloudwatch-agent.json config:

{
  "metrics": {
    "metrics_collected": {
      "diskio": {
        "resources": ["*"],
        "measurement": [
          "reads", "writes", "read_bytes", "write_bytes",
          "io_time", "await", "util", "queue"
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}

Then restart the agent:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 -c file:/path/to/config.json -s
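
If you want to confirm the agent actually picked up the new config, the same ctl script has a standard status action (this is general agent tooling, not something specific to the new EBS feature):

# Check that the agent is running with the fetched configuration
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status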

Step 2: View in CloudWatch

You'll now see custom metrics like:

  • await → time the app waits for I/O
  • queue → how many I/O ops are waiting
  • io_time → total time EBS spends on operations
  • read_bytes, write_bytes → data throughput
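
To double-check what actually landed in CloudWatch, you can query it from the CLI. A minimal sketch, assuming the agent's default CWAgent namespace; the exact metric name (diskio_await here) and the InstanceId/device dimensions depend on your measurement list and append_dimensions settings, so adjust them to whatever list-metrics returns:

# List every metric the agent publishes into its default custom namespace
aws cloudwatch list-metrics --namespace CWAgent

# Pull the last hour of the (assumed) await metric for one instance and device
aws cloudwatch get-metric-statistics \
  --namespace CWAgent \
  --metric-name diskio_await \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 Name=name,Value=nvme1n1 \
  --statistics Average \
  --period 60 \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"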

Step 3: Analyze & Act

During peak load:

  • queue = 22 (too high)
  • await = 120 ms (noticeable delays)
  • write_bytes drops sharply

🧠 Root cause: EBS is the bottleneck. Time to provision more IOPS or switch from gp3 to io2.
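
Before (or instead of) resizing anything, it's worth putting an alarm on the queue-depth metric so the next spike pages you instead of your users. A hedged sketch with the AWS CLI: the metric name (diskio_queue), namespace, dimensions, and the SNS topic ARN are assumptions/placeholders based on the config above, not guaranteed names.

# Alarm when average queue depth stays above 15 for 3 consecutive minutes
aws cloudwatch put-metric-alarm \
  --alarm-name ebs-queue-depth-high \
  --namespace CWAgent \
  --metric-name diskio_queue \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 Name=name,Value=nvme1n1 \
  --statistic Average \
  --period 60 \
  --evaluation-periods 3 \
  --threshold 15 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts

Tune the threshold to your workload; 15 is only an example value.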


✅ Why This Matters

Benefit → Impact:

  • Granular storage insights → understand app latency at the disk level
  • Real-time metrics → catch slowdowns before users do
  • Automation ready → build alarms & dashboards
  • Works with EC2 + EKS → great for both VMs & containers
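
If you want a dashboard as well, the same metrics can go straight into a CloudWatch dashboard widget. A minimal sketch under the same assumptions as above (CWAgent namespace, diskio_await metric, placeholder instance ID and region):

ebs-dashboard.json:

{
  "widgets": [
    {
      "type": "metric",
      "x": 0, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "EBS I/O wait (await)",
        "metrics": [["CWAgent", "diskio_await", "InstanceId", "i-0123456789abcdef0", "name", "nvme1n1"]],
        "stat": "Average",
        "period": 60,
        "region": "us-east-1"
      }
    }
  ]
}

Then publish it:

# Create or update the dashboard from the JSON body above
aws cloudwatch put-dashboard \
  --dashboard-name ebs-performance \
  --dashboard-body file://ebs-dashboard.json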

🧾 TL;DR

🚀 CloudWatch Agent now supports:

  • NVMe-based EBS performance metrics
  • Queue depth, IOPS, throughput, and more
  • Alarms, dashboards, and smarter diagnostics

No more guessing: now you can see and solve storage bottlenecks confidently.


Have you started using EBS metrics in your monitoring stack?
Drop your setup or questions in the comments!
