Nov 21st, ‘24 / 4 min read

Extracting Account-Level CDN Metrics from Akamai Logs with Last9

Learn how to extract and analyze account-level CDN metrics from Akamai logs using Last9 for real-time insights and better customer tracking.

As our multi-tenant SaaS platform grew to serve thousands of customers, we faced a critical challenge: understanding CDN usage patterns per customer account.

While Akamai provides excellent CDN services, getting granular, account-level metrics for bytes transferred isn't straightforward.

Here's how we solved this using log parsing and Last9.

The Challenge: Why Metrics Weren't Enough

〽️
"Can you tell me how much bandwidth customer X used last month?"

This seemingly simple question from our billing team sent us down a rabbit hole. While Akamai's built-in metrics are great for overall CDN monitoring, they don't provide the account-level granularity we needed.

Our application serves content through URLs like:

https://cdn.example.com/accounts/{account-id}/assets/image.jpg

Each request contains vital information about which account is consuming bandwidth, but this information isn't automatically parsed into metrics. Instead, it lives buried in our CDN logs.

Requirements: What We Needed to Solve

  • Extract account IDs from request URLs in real-time
  • Calculate bytes transferred per account
  • Create time-series data for trending and analysis
  • Set up alerting for unusual patterns
  • Correlate this data with other parts of our stack

The Solution: From Logs to Actionable Metrics

Step 1: Log Ingestion and Parsing

First, we needed to extract account information at ingestion time rather than query time.

This approach has several advantages:

  • Reduced query processing overhead
  • Faster dashboard rendering
  • More efficient storage use

Using Last9's log processing pipeline, we set up custom parsing rules to extract account IDs from our URLs.

Here's what this looks like in practice:

When a log line arrives containing:

GET /accounts/12345/assets/logo.png 200 1048576 bytes

We extract and attach labels:

bytes_transferred{account_id="12345"} 1048576
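To make the extraction step concrete, here is a minimal Python sketch of the same transformation. It is not Last9's parsing rule syntax; the regular expression, the function names, and the assumption that every line follows the simplified format shown above are all illustrative. Real Akamai log formats carry more fields and would need a different pattern.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for the simplified log line shown above:
#   METHOD /accounts/<account-id>/<path> STATUS BYTES bytes
LOG_PATTERN = re.compile(
    r"^(?P<method>[A-Z]+)\s+/accounts/(?P<account_id>[^/\s]+)/\S*\s+"
    r"(?P<status>\d{3})\s+(?P<bytes>\d+)\s+bytes$"
)


def parse_log_line(line: str) -> Optional[Tuple[str, int]]:
    """Return (account_id, bytes_transferred), or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.group("account_id"), int(match.group("bytes"))


def to_metric_sample(account_id: str, byte_count: int) -> str:
    """Render a labeled sample in the same form as the example above."""
    return f'bytes_transferred{{account_id="{account_id}"}} {byte_count}'


parsed = parse_log_line("GET /accounts/12345/assets/logo.png 200 1048576 bytes")
if parsed is not None:
    print(to_metric_sample(*parsed))
```

Lines that do not contain an account path return None, so requests outside /accounts/ are skipped rather than attributed to the wrong customer.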

Step 2: Aggregation and Storage

Once we had the labeled data, we needed to aggregate it effectively. Last9's metric aggregation allows us to:

  • Sum bytes transferred per account in 1-minute intervals
  • Maintain historical data for trend analysis
  • Create roll-ups for different time windows (hourly, daily, monthly)

This gives us queries like:

sum by (account_id) (sum_over_time(bytes_transferred[1h]))
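The aggregation itself happens inside Last9, but the idea behind the 1-minute sums and the larger roll-ups can be sketched in a few lines of Python. The sample tuple layout, the function names, and the use of fixed windows aligned to the Unix epoch are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical sample shape: (unix_timestamp_seconds, account_id, bytes)
Sample = Tuple[int, str, int]
Totals = Dict[Tuple[str, int], int]  # (account_id, window_start) -> bytes


def aggregate(samples: Iterable[Sample], window_seconds: int) -> Totals:
    """Sum bytes per (account_id, window_start) over fixed, aligned windows."""
    totals: Totals = defaultdict(int)
    for timestamp, account_id, byte_count in samples:
        window_start = timestamp - (timestamp % window_seconds)
        totals[(account_id, window_start)] += byte_count
    return dict(totals)


def roll_up(fine_totals: Totals, window_seconds: int) -> Totals:
    """Re-aggregate finer-grained totals into a larger window."""
    as_samples = (
        (start, account_id, total)
        for (account_id, start), total in fine_totals.items()
    )
    return aggregate(as_samples, window_seconds)


samples = [
    (0, "12345", 100),
    (30, "12345", 200),
    (61, "12345", 50),
    (10, "67890", 500),
    (3600, "12345", 7),
]
per_minute = aggregate(samples, 60)
per_hour = roll_up(per_minute, 3600)
print(per_hour)
```

Building the hourly totals from the 1-minute totals, rather than from raw samples, mirrors the roll-up approach: the coarser windows never need the raw logs again.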

Step 3: Setting Up Alerts

With our data properly labeled and aggregated, we set up several types of alerts:

  1. Usage Spikes: Alert when an account's bandwidth usage rises to 3x its normal level
  2. Quota Monitoring: Notify when accounts approach their bandwidth limits
  3. Anomaly Detection: Flag unusual patterns that might indicate security issues
⚠️
Last9's alerting system lets us define these conditions using our labeled metrics and send notifications through multiple channels (Slack, PagerDuty, email).
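The first two alert types reduce to simple threshold checks. The sketch below shows that logic in plain Python; it is not Last9's alert configuration. The AccountUsage fields, the 80% quota ratio, and the minimum-baseline value are assumptions chosen for illustration, and anomaly detection is left out because it depends on the model used.

```python
from dataclasses import dataclass
from typing import List

SPIKE_MULTIPLIER = 3.0                  # "3x normal" from the list above
QUOTA_WARNING_RATIO = 0.8               # assumed warning level
MIN_BASELINE_BYTES = 10 * 1024 * 1024   # hypothetical floor to ignore tiny accounts


@dataclass
class AccountUsage:
    account_id: str
    current_bytes: int     # usage in the most recent window
    baseline_bytes: float  # normal usage for a window of the same size
    period_bytes: int      # usage so far in the billing period
    quota_bytes: int       # bandwidth limit for the billing period


def evaluate(usage: AccountUsage) -> List[str]:
    """Return the names of the alert conditions this account currently meets."""
    alerts: List[str] = []
    has_meaningful_baseline = usage.baseline_bytes >= MIN_BASELINE_BYTES
    if has_meaningful_baseline and usage.current_bytes >= SPIKE_MULTIPLIER * usage.baseline_bytes:
        alerts.append("usage_spike")
    if usage.quota_bytes > 0 and usage.period_bytes >= QUOTA_WARNING_RATIO * usage.quota_bytes:
        alerts.append("quota_warning")
    return alerts


MB = 1024 * 1024
busy = AccountUsage("12345", current_bytes=400 * MB, baseline_bytes=100 * MB,
                    period_bytes=850, quota_bytes=1000)
print(evaluate(busy))
```

The baseline floor matters for small accounts: going from 1 MB to 10 MB is a 10x jump but rarely worth a page.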

Step 4: Cross-Stack Correlation

The real power came when we started correlating this data across our stack. By using consistent account ID labels, we could now:

  • Compare CDN usage with API calls
  • Track end-to-end request flows
  • Identify performance bottlenecks per customer
For example, we could now answer questions like: "Is high CDN usage for account X correlated with increased API latency?"
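One way to approach a question like this, once both series share an account ID label, is to compute a correlation coefficient over aligned time windows. The sketch below uses a plain Pearson correlation; the function name and the sample numbers are made up for illustration and are not taken from our production data or from a Last9 API.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equally long series; 0.0 if either is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length and at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical hourly values for one account, aligned by hour.
cdn_bytes_per_hour = [120.0, 150.0, 400.0, 380.0, 130.0]
api_latency_ms = [80.0, 85.0, 140.0, 135.0, 82.0]
print(round(pearson(cdn_bytes_per_hour, api_latency_ms), 3))
```

A value near 1 suggests the two series move together for that account. It does not show which one drives the other, so it is a starting point for investigation rather than a conclusion.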

Implementation Tips and Tricks

After implementing this solution across several environments, here are key lessons learned:

  1. Label Cardinality: Be careful with high-cardinality labels. We keep the account ID as a label, but store account names and other details in a separate lookup rather than as additional labels, which keeps cardinality manageable.
  2. Aggregation Timing: Process and aggregate logs as close to ingestion as possible. This reduces storage costs and query latency.
  3. Historical Data: Keep raw logs for a shorter period (7-14 days) but maintain aggregated metrics for longer (months/years) for trending.
  4. Alert Tuning: Start with conservative thresholds and adjust based on actual patterns. We began with:
    • 3x usage increase over 24-hour average
    • Warnings at 80% of quota
    • Minimum baseline thresholds to avoid noise
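The first tip above, keeping account details out of the label set, can be sketched as a join done at display time. The lookup contents, the AccountDetails fields, and the enrich function are hypothetical; in practice the lookup might live in a database or an internal service.

```python
from typing import Dict, List, NamedTuple


class AccountDetails(NamedTuple):
    name: str
    plan: str


# Hypothetical lookup kept outside the metrics store, keyed by account ID.
ACCOUNT_LOOKUP: Dict[str, AccountDetails] = {
    "12345": AccountDetails(name="Example Corp", plan="enterprise"),
}


def enrich(usage_by_account: Dict[str, int]) -> List[dict]:
    """Attach account details to per-account totals, largest consumer first."""
    rows = []
    ranked = sorted(usage_by_account.items(), key=lambda item: item[1], reverse=True)
    for account_id, total_bytes in ranked:
        details = ACCOUNT_LOOKUP.get(account_id)
        rows.append({
            "account_id": account_id,
            "name": details.name if details else "unknown",
            "plan": details.plan if details else "unknown",
            "bytes": total_bytes,
        })
    return rows


for row in enrich({"12345": 1048576, "99999": 5242880}):
    print(row)
```

Because the metric carries only account_id, renaming a customer or changing a plan does not create a new time series.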
📈
Last9 is designed to handle massive volumes of data and efficiently run queries with high cardinality. Most importantly, it provides real-time insights into unused metrics.

Looking Forward: Future Improvements

We're currently working on several enhancements:

  1. Machine Learning: Using historical patterns to predict future usage and detect anomalies more accurately.
  2. Cost Attribution: Directly linking CDN costs to customer accounts for better business metrics.
  3. Real-time Dashboard: Building customer-facing dashboards showing CDN usage and trends.
💡
For the latest product updates, check out our changelog!

Conclusion

While getting account-level CDN metrics from Akamai required some creative log parsing, the end result has been invaluable for our operations.

Using Last9's capabilities for log processing, metric aggregation, and alerting, we've built a robust system that gives us the granular visibility we need.

The ability to track and alert on per-account CDN usage has helped us:

  • Optimize costs through a better understanding of usage patterns
  • Improve customer experience by catching issues early
  • Make data-driven decisions about infrastructure scaling

Remember, logs contain valuable information that often isn't captured in standard metrics. You can transform this data into actionable insights with the right tools and approach.

🤝
Have questions about implementing similar monitoring for your Akamai setup? Feel free to reach out to us on Twitter @last9io

Authors
Prathamesh Sonpatki

Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.

Aditya Godbole

CTO at Last9