Metrics

Detailed guide to understanding and using metrics for monitoring your DanubeData resources.

Overview

Metrics provide quantitative measurements of resource performance and health over time.

VPS Metrics

CPU Metrics

CPU Usage %

Current CPU utilization
Range: 0-100%
Alert: > 80% sustained
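
"Sustained" is the important word in this alert: a single sample above 80% is usually harmless. The sketch below shows one way to check for a sustained breach, assuming samples arrive at a fixed interval. The function name and the five-sample window are illustrative choices, not part of DanubeData's alerting.

```python
def is_sustained(samples, threshold=80.0, window=5):
    """Return True if the last `window` samples all exceed `threshold`.

    `window` is a hypothetical setting: with 1-minute samples,
    window=5 means "above the threshold for five minutes".
    """
    if len(samples) < window:
        return False  # not enough data to call the breach sustained
    return all(value > threshold for value in samples[-window:])
```

A short spike such as `[85, 90, 40, 95, 88, 91]` does not trigger, because one of the last five samples falls below the threshold.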

CPU Load Average

Average number of tasks running or waiting to run over the last 1, 5, and 15 minutes
Healthy values scale with CPU count
Alert: > CPU count
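
Because the threshold depends on CPU count, it can help to normalize load per core before comparing it. A minimal sketch, with illustrative function names:

```python
def load_per_core(load_avg, cpu_count):
    """Normalize a load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def load_alert(load_avg, cpu_count):
    """Apply the 'load average > CPU count' rule from this guide."""
    return load_per_core(load_avg, cpu_count) > 1.0
```

On a Unix-like system the inputs can be read with `os.getloadavg()` and `os.cpu_count()` from the Python standard library.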

CPU Steal

Percentage of CPU time the hypervisor withheld from this VPS while it had work to run (should be near 0)
Alert: > 5%
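
On a Linux guest, steal time can be cross-checked from inside the VPS. The sketch below assumes the standard Linux `/proc/stat` layout, where the aggregate `cpu` line lists user, nice, system, idle, iowait, irq, softirq, and steal counters in that order. It computes the steal percentage between two readings of that line; the function names are illustrative.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def steal_percent(prev_line, curr_line):
    """Steal time as a percentage of all CPU time between two readings."""
    prev = parse_cpu_line(prev_line)
    curr = parse_cpu_line(curr_line)
    # Sum the first eight counters only: guest time (fields 9 and 10)
    # is already included in user and nice.
    total_delta = sum(curr[:8]) - sum(prev[:8])
    if total_delta <= 0:
        return 0.0  # no time elapsed, or the counters were reset
    steal_delta = curr[7] - prev[7]
    return 100.0 * steal_delta / total_delta
```

Take the two readings a few seconds apart, for example by reading the first line of `/proc/stat` twice.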

Memory Metrics

Memory Usage %

RAM utilization
Range: 0-100%
Alert: > 85%

Memory Available

Free memory plus reclaimable cache/buffers
Alert: < 10% of total

Swap Usage

Swap space used
Alert: > 0 (indicates memory pressure)
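
The memory thresholds above can be checked from inside a Linux guest. This sketch assumes `/proc/meminfo`-style input with `MemTotal`, `MemAvailable`, `SwapTotal`, and `SwapFree` fields. Treating usage as `100 - available %` is an assumption of this example, not necessarily how the dashboard computes Memory Usage %.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def memory_report(text):
    """Evaluate the available-memory and swap alerts from this guide."""
    info = parse_meminfo(text)
    available_pct = 100.0 * info["MemAvailable"] / info["MemTotal"]
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "available_pct": available_pct,
        "usage_pct": 100.0 - available_pct,  # assumed definition
        "swap_used_kb": swap_used,
        "low_memory_alert": available_pct < 10.0,
        "swap_alert": swap_used > 0,
    }
```

Pass it the contents of `/proc/meminfo`, for example `memory_report(open("/proc/meminfo").read())`.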

Disk Metrics

Disk I/O

Read/write MB/s
IOPS (operations per second)

Disk Usage %

Storage consumption
Alert: > 85%
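
For a quick local check of the disk usage threshold, Python's standard `shutil.disk_usage` reports total, used, and free bytes for a path. This is a sketch: the percentage here is `used / total`, which can differ slightly from what `df` prints because of filesystem reserved blocks.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return used space as a percentage of total space for `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def disk_alert(path="/", threshold=85.0):
    """Apply the '> 85%' rule from this guide to one mount point."""
    return disk_usage_percent(path) > threshold
```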

Network Metrics

Network In/Out

Bandwidth usage (MB/s)
Track against allocation

Network Packets

Packets per second
Useful for diagnosing issues
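
Throughput and packet rates are typically derived from counters that only increase, such as bytes or packets since boot. The sketch below turns two counter readings into a per-second rate. It assumes 1 MB = 1,000,000 bytes, which may not match the unit used in the dashboard, and the function names are illustrative.

```python
def rate_per_second(prev_value, curr_value, elapsed_seconds):
    """Convert two readings of a monotonic counter into a per-second rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards (for example after a reboot),
        # so this interval cannot be measured.
        return None
    return delta / elapsed_seconds


def bytes_to_mb_per_s(bytes_per_second):
    """Convert bytes/s to MB/s, assuming decimal megabytes."""
    return bytes_per_second / 1_000_000
```

The same function works for packets per second and for disk IOPS, since both are counter deltas divided by elapsed time.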

Database Metrics

Performance Metrics

Query Time

Average query execution time
Alert: > 100ms average

Slow Queries

Queries exceeding threshold
Alert: > 10/minute

Throughput

Queries per second
Monitor for capacity planning

Connection Metrics

Active Connections

Current client connections
Alert: > 80% of max_connections

Connection Rate

New connections per second
Alert: Sudden spikes above the normal baseline

Cache Metrics

Buffer Cache Hit Rate

Percentage of data reads served from the buffer cache instead of disk
Target: > 99%
Alert: < 95%
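
A hit rate is the share of lookups answered from cache: hits / (hits + misses). The sketch below computes it and applies the target and alert thresholds above. Where the hit and miss counters come from depends on the database engine, so they are plain arguments here, and the status labels are illustrative.

```python
def hit_rate_percent(hits, misses):
    """Return the cache hit rate as a percentage, or None with no traffic."""
    total = hits + misses
    if total == 0:
        return None
    return 100.0 * hits / total


def classify_buffer_cache(hit_rate):
    """Map a hit rate onto this guide's target (> 99%) and alert (< 95%)."""
    if hit_rate is None:
        return "no data"
    if hit_rate < 95.0:
        return "alert"
    if hit_rate > 99.0:
        return "on target"
    return "below target"
```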

Cache Size

Memory used for query cache
Monitor for sizing

Replication Metrics

Replication Lag

Delay between primary and replica
Alert: > 5 seconds

Replication Status

Connected/Disconnected status
Alert: Disconnected

Cache (Redis) Metrics

Memory Metrics

Memory Usage

Current RAM consumption
Alert: > 90% of allocated

Memory Fragmentation

Ratio of RSS to used memory
Alert: > 1.5 (consider restart)
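
The ratio can be reproduced from the `used_memory_rss` and `used_memory` fields that Redis reports in the output of its `INFO` command. This is a minimal sketch that parses that text; the parser and function names are illustrative, and how you obtain the `INFO` output on a managed instance is left open.

```python
def parse_redis_info(text):
    """Parse the 'key:value' lines of Redis INFO output into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def fragmentation_ratio(info):
    """Return RSS divided by used memory, or None if nothing is allocated."""
    used = int(info["used_memory"])
    rss = int(info["used_memory_rss"])
    if used == 0:
        return None
    return rss / used
```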

Evicted Keys

Keys removed due to memory pressure
Alert: > 100/second

Performance Metrics

Hit Rate

Cache hit percentage
Target: > 90%
Alert: < 80%

Operations/Sec

Commands processed per second
Monitor for capacity

Latency

Average command execution time
Alert: > 10ms

Connection Metrics

Connected Clients

Active connections
Alert: > 80% of max

Blocked Clients

Clients waiting on blocking operations
Alert: Sustained blocked clients

Metric Collection

Data Retention

1-minute granularity: retained for 1 hour
5-minute granularity: retained for 1 day
15-minute granularity: retained for 1 week
1-hour granularity: retained for 30 days
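
When querying historical data, the retention table determines the finest resolution you can expect for a given start time. The sketch below encodes the table and picks the finest granularity that still covers the start of a query. It assumes retention is measured backwards from the current time, which this guide does not state explicitly.

```python
from datetime import datetime, timedelta, timezone

# (granularity, retention) pairs from the table above, finest first.
RETENTION = [
    (timedelta(minutes=1), timedelta(hours=1)),
    (timedelta(minutes=5), timedelta(days=1)),
    (timedelta(minutes=15), timedelta(weeks=1)),
    (timedelta(hours=1), timedelta(days=30)),
]


def finest_granularity(start, now):
    """Return the finest granularity still retained at `start`, or None."""
    age = now - start
    for granularity, retention in RETENTION:
        if age <= retention:
            return granularity
    return None  # older than the longest retention window
```

For example, a query starting three days ago would be served at 15-minute granularity under these assumptions.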

API Access

Fetch metrics programmatically:

# -G sends the -d values as URL query parameters instead of a request body
curl -G \
  https://api.danubedata.com/v1/resources/{id}/metrics \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -d 'metric=cpu_usage&start=2024-10-01T00:00:00Z&end=2024-10-02T00:00:00Z'
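
The same request can be made from Python with only the standard library. This is a sketch under two assumptions: that `metric`, `start`, and `end` are passed as query-string parameters, and that the response body is JSON. The response format is not documented here, and the helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.danubedata.com/v1"


def build_metrics_url(resource_id, metric, start, end):
    """Build the metrics URL with URL-encoded query parameters."""
    query = urllib.parse.urlencode(
        {"metric": metric, "start": start, "end": end}
    )
    safe_id = urllib.parse.quote(resource_id, safe="")
    return f"{API_BASE}/resources/{safe_id}/metrics?{query}"


def fetch_metrics(resource_id, metric, start, end, token):
    """Fetch metrics for one resource; assumes a JSON response body."""
    request = urllib.request.Request(
        build_metrics_url(resource_id, metric, start, end),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Call `fetch_metrics("YOUR_RESOURCE_ID", "cpu_usage", "2024-10-01T00:00:00Z", "2024-10-02T00:00:00Z", "YOUR_TOKEN")` with your own resource ID and token.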

Best Practices

Regular Monitoring: Check metrics daily
Set Baselines: Know normal values
Correlate Metrics: Look at multiple metrics together
Trend Analysis: Watch for gradual changes
Alert Configuration: Set meaningful thresholds
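
One simple way to put "Set Baselines" into practice is to summarize recent history as a mean and standard deviation, then flag values that fall far outside it. The sketch below does this; the three-sigma cutoff is a common starting point rather than a DanubeData recommendation.

```python
import statistics


def baseline(samples):
    """Return (mean, population standard deviation) of historical samples."""
    return statistics.mean(samples), statistics.pstdev(samples)


def deviates(value, samples, n_sigma=3.0):
    """Return True if `value` is more than `n_sigma` deviations from the mean."""
    mean, stdev = baseline(samples)
    if stdev == 0:
        return value != mean  # flat history: any change is a deviation
    return abs(value - mean) > n_sigma * stdev
```

With a history of `[40, 42, 38, 41, 39]`, a reading of 41 is within the baseline while a reading of 60 is flagged.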
