
AWS Centralized Logging: A Complete Implementation Guide

Learn how to set up centralized logging in AWS, from basic setup to advanced implementations, with troubleshooting tips for smooth operations.


In cloud environments, logs are often spread across numerous services, making it difficult to track down issues or gather meaningful insights. For AWS users, this challenge can become especially time-consuming. Centralized logging in AWS helps by bringing all your logs into a single platform, making management and analysis easier.

This guide covers everything DevOps engineers need to know about setting up centralized logging in AWS, from the basics to advanced setups, along with troubleshooting tips for resolving common issues.

What is AWS Centralized Logging?

AWS centralized logging is the practice of collecting, storing, and analyzing logs from multiple AWS services in a single location. Instead of jumping between CloudWatch, S3, and various application logs, you get one unified view of your entire infrastructure.

The core benefit? You can spot patterns, troubleshoot faster, and get better insights across your entire AWS environment without the headache of context-switching between different services.

💡
If you’re looking to improve your AWS workflow automation, check out our article on AWS Step Functions.

Benefits of Centralized Logging for DevOps Teams

The typical AWS infrastructure generates tons of logs from EC2 instances, Lambda functions, API Gateway, RDS, and dozens of other services. Without centralization, you're essentially:

  • Wasting time hunting down logs across different services
  • Missing connections between related events
  • Struggling to get a complete picture during incidents
  • Finding it impossible to set up meaningful alerts

According to AWS, teams using centralized logging typically cut their incident response times by 30-50%. That's not just faster fixes—it's better uptime and happier customers.

Choosing the Right AWS Logging Architecture for Your Needs

Before diving into architectures, let's review the key AWS services that generate logs you'll want to centralize:

| AWS Service | Log Type | Description |
|---|---|---|
| EC2 | Application logs, system logs | Logs from your applications and the OS |
| CloudTrail | API activity logs | Records all AWS API calls made in your account |
| VPC Flow Logs | Network traffic logs | Captures IP traffic going to and from network interfaces |
| AWS Config | Configuration change logs | Records configuration changes to your AWS resources |
| AWS WAF | Web application firewall logs | Logs traffic patterns and blocked attacks |
| Load Balancers | Access logs | Records client connections and requests |
| RDS | Database logs | Includes error logs, audit logs, and slow query logs |
| Lambda | Function execution logs | Records function invocation and execution details |

Now, let's look at the three most popular approaches for AWS centralized logging:

CloudWatch Logs + Insights: The AWS Native Solution

The simplest approach uses AWS's built-in services:

  1. All AWS services send logs to CloudWatch Logs
  2. CloudWatch Log Insights provides the search and analysis layer
  3. CloudWatch Dashboards visualize the important metrics

Pros: Native AWS integration, minimal setup, works with most AWS services out of the box

Cons: Limited retention options, can get expensive at scale, less flexible for custom analysis
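
If you'd rather script that search layer than use the console, here's a minimal boto3 sketch that runs a Logs Insights query programmatically; the log group name is a placeholder:

import time
import boto3

logs = boto3.client("logs")

# Query the last hour of a hypothetical log group for errors
query = logs.start_query(
    logGroupName="/aws/lambda/production-api",
    startTime=int(time.time()) - 3600,
    endTime=int(time.time()),
    queryString="fields @timestamp, @message | filter @message like /ERROR/ | limit 20",
)

# Logs Insights queries are asynchronous, so poll until done
while True:
    result = logs.get_query_results(queryId=query["queryId"])
    if result["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

for row in result.get("results", []):
    print({field["field"]: field["value"] for field in row})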

💡
To learn how to reduce your AWS CloudWatch costs, check out our article on cutting down AWS CloudWatch costs.

ELK Stack on AWS: The Open Source Powerhouse

For more power and flexibility:

  1. AWS services send logs to a collection pipeline (Logstash or Fluentd)
  2. Logs are processed and stored in Elasticsearch
  3. Kibana provides visualization and search capabilities

Pros: Highly customizable, powerful search capabilities, great visualizations

Cons: More complex to set up and maintain, requires separate infrastructure

S3 + Athena + QuickSight: The Cost-Effective Data Lake Approach

The data lake approach:

  1. Logs are stored in S3 buckets (can be automated with Log Forwarding)
  2. AWS Athena runs SQL queries against the log data
  3. QuickSight creates dashboards and visualizations

Pros: Cost-effective for long-term storage, works well for compliance, scales infinitely

Cons: Not real-time, requires SQL knowledge, more setup work
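
To give step 2 some shape, here's a hedged boto3 sketch that runs an Athena query over load balancer logs; the alb_logs table, logs_db database, and results bucket are assumptions you'd replace with your own setup:

import boto3

athena = boto3.client("athena")

# Count 5xx responses in a hypothetical table built over the S3 log bucket
response = athena.start_query_execution(
    QueryString="""
        SELECT elb_status_code, count(*) AS requests
        FROM alb_logs
        WHERE elb_status_code LIKE '5%'
        GROUP BY elb_status_code
        ORDER BY requests DESC
    """,
    QueryExecutionContext={"Database": "logs_db"},
    ResultConfiguration={"OutputLocation": "s3://athena-query-results-bucket/"},
)
print(response["QueryExecutionId"])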

Step-by-Step AWS Centralized Logging Setup

Let's walk through implementing a CloudWatch-based centralized logging solution, which offers the best balance of simplicity and power for most teams.

Step 1: Configuring Comprehensive Log Collection Across Services

First, ensure all your AWS services are sending logs to CloudWatch:

# Example: Configure CloudWatch agent on EC2 instance
sudo yum install -y amazon-cloudwatch-agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json

Lambda functions send logs to CloudWatch automatically; to capture richer runtime telemetry as well, enable Lambda Insights in your function configuration:

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: function/
      Handler: app.handler
      Runtime: nodejs18.x
      Tracing: Active
      Policies:
        - CloudWatchLambdaInsightsExecutionRolePolicy
      Layers:
        - !Sub "arn:aws:lambda:${AWS::Region}:580247275435:layer:LambdaInsightsExtension:14"

Step 2: Optimizing Log Retention for Compliance and Cost Balance

Manage your log retention periods to balance cost and compliance:

# Set 30-day retention for production logs
aws logs put-retention-policy --log-group-name "/aws/lambda/production-api" --retention-in-days 30

# Set 7-day retention for development logs
aws logs put-retention-policy --log-group-name "/aws/lambda/development-api" --retention-in-days 7

Step 3: Implementing Real-time Log Processing with Subscriptions

To forward logs to other services:

# Create subscription filter to send logs to a Lambda processor
aws logs put-subscription-filter \
    --log-group-name "/aws/lambda/api-gateway-logs" \
    --filter-name "ErrorProcessor" \
    --filter-pattern "ERROR" \
    --destination-arn "arn:aws:lambda:us-east-1:123456789012:function:LogProcessor"

Step 4: Building Actionable Dashboards for Operational Visibility

Build dashboards for your most important metrics:

Save the widget definition as dashboard.json; the Logs Insights query contains single quotes, so passing the body as a file avoids shell-quoting problems:

{
  "widgets": [
    {
      "type": "log",
      "x": 0,
      "y": 0,
      "width": 24,
      "height": 6,
      "properties": {
        "query": "SOURCE '/aws/lambda/production-api' | fields @timestamp, @message\n| filter @message like /ERROR/\n| sort @timestamp desc\n| limit 20",
        "region": "us-east-1",
        "title": "Recent API Errors",
        "view": "table"
      }
    }
  ]
}

# Create the dashboard from the file
aws cloudwatch put-dashboard \
  --dashboard-name "ServiceHealthOverview" \
  --dashboard-body file://dashboard.json

Step 5: Scaling to Enterprise: Multi-Account Logging Strategies

For multi-account setups:

  1. In the destination account, create a CloudWatch Logs destination
  2. Set up IAM permissions to allow the source account to write to it
  3. Create subscription filters in the source accounts pointing to the destination

Here's a simplified version of the destination access policy required (attached to the destination in the central account):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::SOURCE_ACCOUNT_ID:root"
      },
      "Action": "logs:PutSubscriptionFilter",
      "Resource": "arn:aws:logs:REGION:DESTINATION_ACCOUNT_ID:destination:DESTINATION_NAME"
    }
  ]
}
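
As a rough boto3 sketch of steps 1 and 2 (the account IDs, Kinesis stream, role, and destination names are placeholders), the destination and its access policy can be created like this:

import json
import boto3

logs = boto3.client("logs", region_name="us-east-1")

# Step 1: in the destination account, point a Logs destination at a
# Kinesis stream (destinations forward to Kinesis or Firehose)
logs.put_destination(
    destinationName="central-log-destination",
    targetArn="arn:aws:kinesis:us-east-1:999999999999:stream/central-logs",
    roleArn="arn:aws:iam::999999999999:role/CWLtoKinesisRole",
)

# Step 2: attach the access policy shown above so source accounts
# can create subscription filters against this destination
access_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111111111111:root"},
        "Action": "logs:PutSubscriptionFilter",
        "Resource": "arn:aws:logs:us-east-1:999999999999:destination:central-log-destination",
    }],
}
logs.put_destination_policy(
    destinationName="central-log-destination",
    accessPolicy=json.dumps(access_policy),
)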
💡
If you’re looking to track and monitor AWS activity more effectively, check out our article on AWS CloudTrail.

Advanced Techniques to Take Your AWS Logging to the Next Level

Once you have the basics running, consider these advanced techniques:

Leveraging Kinesis Firehose for Flexible Log Delivery and Storage

For more flexibility in where your logs end up, Amazon Kinesis Data Firehose is a game-changer:

# Create a Firehose delivery stream to S3
# (ErrorOutputPrefix is only available on the extended S3 configuration)
aws firehose create-delivery-stream \
    --delivery-stream-name "centralized-logs-to-s3" \
    --delivery-stream-type DirectPut \
    --extended-s3-destination-configuration '{
        "RoleARN": "arn:aws:iam::123456789012:role/FirehoseS3DeliveryRole",
        "BucketARN": "arn:aws:s3:::centralized-logs-bucket",
        "Prefix": "logs/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/",
        "ErrorOutputPrefix": "errors/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/!{firehose:error-output-type}/",
        "BufferingHints": {"SizeInMBs": 128, "IntervalInSeconds": 300}
    }'

# Set up CloudWatch to send logs to Firehose
aws logs put-subscription-filter \
    --log-group-name "/aws/lambda/production-api" \
    --filter-name "SendToFirehose" \
    --filter-pattern "" \
    --destination-arn "arn:aws:firehose:us-east-1:123456789012:deliverystream/centralized-logs-to-s3" \
    --role-arn "arn:aws:iam::123456789012:role/CWLtoKinesisFirehoseRole"

Protecting Your Log Data: Essential Security Measures and Encryption

Protect your log data with these critical security measures:

Start by setting up CloudTrail to log access to your log data itself. Then, implement strict IAM policies for log access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:StartQuery",
        "logs:GetQueryResults"
      ],
      "Resource": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/lambda/*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/Department": "SecurityTeam"
        }
      }
    }
  ]
}

Encrypt logs at rest:

# Create a log group encrypted with a customer-managed KMS key
# (use "aws logs associate-kms-key" to encrypt an existing log group)
aws logs create-log-group \
    --log-group-name "/aws/lambda/secure-service" \
    --kms-key-id "arn:aws:kms:us-east-1:123456789012:key/abcd1234-ab12-cd34-ef56-abcdef123456"

Extracting Insights with CloudWatch Logs Insights

CloudWatch Logs Insights lets you run powerful queries across your logs:

fields @timestamp, @message
| filter @message like /Exception/
| parse @message "user: *, action: *" as user, action
| stats count(*) as exceptionCount by user, action
| sort exceptionCount desc
| limit 10

This query finds the top 10 user/action combinations causing exceptions.

Automated Responses with EventBridge

Set up automated responses to specific log patterns:

  1. Create a CloudWatch Logs Metric Filter to detect patterns
  2. Set a CloudWatch Alarm on that metric
  3. Configure an EventBridge rule to trigger automated actions

For example, automatically restart a service when it logs specific error messages:

# Create metric filter
aws logs put-metric-filter \
  --log-group-name "API-Gateway-Execution-Logs" \
  --filter-name "5xxErrorFilter" \
  --filter-pattern '{ $.status = 5* }' \
  --metric-transformations \
      metricName=5xxErrors,metricNamespace=APIGateway,metricValue=1

# Create alarm
aws cloudwatch put-metric-alarm \
  --alarm-name "APIGateway5xxAlarm" \
  --metric-name "5xxErrors" \
  --namespace "APIGateway" \
  --threshold 5 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --evaluation-periods 1 \
  --period 60 \
  --statistic Sum \
  --alarm-actions "arn:aws:sns:us-east-1:123456789012:AlertTopic"
💡
If you're looking to improve security in your AWS environment, check out our article on getting started with AWS WAF.

How to Enhance Visibility with Third-Party Platforms

While AWS's native tools are useful, specialized observability platforms provide more advanced capabilities.

Our platform, Last9, is designed specifically for high-cardinality observability at scale—essential as your AWS infrastructure expands. Unlike more expensive alternatives, we offer a managed observability solution that fits within your budget.

What sets our platform apart in AWS centralized logging is its integration with OpenTelemetry and Prometheus. This allows you to unify your metrics, logs, and traces for complete visibility.

Many teams rely on Last9 for real-time insights with correlated monitoring and alerting—something proven valuable for companies like Probo, CleverTap, and Replit that require reliable monitoring for high-scale operations.

Other options that integrate well with AWS include:

  • Grafana
  • Dynatrace
  • Sumo Logic

Optimization Strategies for AWS Centralized Logging

Logging costs can add up quickly. Here's how to keep them in check:

Strategic Log Retention: Balancing Compliance Requirements with Costs

Not all logs are equally valuable:

| Log Type | Suggested Retention | Reasoning |
|---|---|---|
| Security Audit Logs | 1+ years | Compliance requirements, breach investigations |
| Production Error Logs | 30-90 days | Troubleshooting, pattern analysis |
| Debug/Verbose Logs | 3-7 days | Short-term debugging only |
| Access Logs | 30 days | Traffic analysis, security investigations |

Reducing Volume While Maintaining Visibility

For busy services, consider sampling logs instead of recording everything:

import random

def lambda_handler(event, context):
    # Skip ~90% of successful events to cut volume
    if event['success'] and random.random() > 0.1:
        return

    # Errors (and the sampled 10% of successes) are always logged
    print(f"Event ID: {event['id']}, Status: {event['status']}")

Log Compression Techniques for Long-term Archiving

When storing logs in S3, use compression:

# CloudWatch Logs can't compress a log group in place; instead, keep
# retention short and archive compressed copies to S3
aws logs put-retention-policy \
    --log-group-name "/aws/lambda/high-volume-service" \
    --retention-in-days 1
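
A convenient detail keeps the archiving Lambda short: CloudWatch Logs subscription payloads are already gzip-compressed, so a subscription-fed forwarder can write them to S3 untouched. A minimal sketch, assuming a hypothetical bucket name and key scheme:

import base64
import time
import boto3

s3 = boto3.client("s3")
BUCKET = "centralized-logs-bucket"  # hypothetical archive bucket

def lambda_handler(event, context):
    # The subscription payload is base64-encoded and already gzipped,
    # so it can be archived to S3 without re-compressing
    compressed = base64.b64decode(event["awslogs"]["data"])
    key = f"archive/{time.strftime('%Y/%m/%d')}/{context.aws_request_id}.gz"
    s3.put_object(Bucket=BUCKET, Key=key, Body=compressed)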
💡
Now, fix production log issues instantly—right from your IDE, with AI and Last9 MCP. Bring real-time production context—logs, metrics, and traces—into your local environment to auto-fix code faster. Setup here!

Troubleshooting Your AWS Centralized Logging Setup

Even well-designed logging systems run into issues. Here's how to tackle the most common problems:

Solving Disappearing Data: Fixing Missing Log Issues

If logs aren't showing up where expected:

  1. Check IAM permissions for the logging services
  2. Verify the CloudWatch agent is running (for EC2 instances)
  3. Look for throttling in the AWS CloudTrail logs
  4. Confirm log group names match exactly what you're querying

Example fix for IAM permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}

Reducing Log Delivery Delays: Addressing High Latency Problems

If logs are delayed:

  1. Look for network bottlenecks between services
  2. Check if you're hitting CloudWatch API limits
  3. Consider batching log writes more efficiently (see the sketch after this list)
  4. Monitor CloudWatch service health in the AWS status page
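
On the batching point, here's a rough boto3 sketch; the log group and stream names are hypothetical, and production code should also respect the PutLogEvents limits (10,000 events or about 1 MB per batch):

import time
import boto3

logs = boto3.client("logs")
_buffer = []

def log(message: str, flush_at: int = 100):
    # Accumulate events locally instead of making one API call per line
    _buffer.append({"timestamp": int(time.time() * 1000), "message": message})
    if len(_buffer) >= flush_at:
        flush()

def flush():
    global _buffer
    if _buffer:
        logs.put_log_events(
            logGroupName="/aws/app/batch-demo",   # hypothetical group
            logStreamName="writer-1",             # hypothetical stream
            logEvents=_buffer,
        )
        _buffer = []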

Standardizing Formats and Fixing Parsing Errors

When logs aren't parsing correctly:

  1. Standardize log formats across services
  2. Use structured logging (JSON format) where possible
  3. Create test cases for your parsing logic
  4. Update filter patterns to match the actual format

Example of good structured logging in Node.js:

const { performance } = require('perf_hooks');
const logger = require('pino')();

function processRequest(req) {
  logger.info({
    requestId: req.id,
    user: req.user,
    action: req.action,
    duration: performance.now() - req.startTime,
    status: "success"
  });
}

Controlling Unexpected Logging Costs

If your logging costs spike:

  1. Look for runaway logging in specific services
  2. Check for recursive logging patterns (logs about logs)
  3. Review and adjust retention periods
  4. Implement log sampling for high-volume, low-value logs
💡
Last9's Alert Studio is an end-to-end alerting tool built to handle high cardinality use cases. It's designed to reduce alert fatigue and improve MTTD. Check it out here!

How to Set Up Effective Logging Alerts

Proactive monitoring is crucial for effective operations:

Creating Smart Alert Thresholds

Set up graduated alerting based on severity:

| Log Pattern | Threshold | Action | Recovery Time |
|---|---|---|---|
| 5xx Errors | >10 in 5 min | Slack notification | Low: Investigate during work hours |
| 5xx Errors | >50 in 5 min | PagerDuty alert | Medium: Investigate within 1 hour |
| 5xx Errors | >200 in 5 min | Incident declared | High: Immediate response |
| Auth Failures | >20 in 10 min | Security team alert | Medium: Investigate within 1 hour |
| Latency >2s | >5% of requests | Performance team notification | Low: Investigate during work hours |

Setting Up CloudWatch Alarms for Log Metrics

# Create metric filter for critical errors
aws logs put-metric-filter \
  --log-group-name "/aws/lambda/payment-processor" \
  --filter-name "CriticalErrorFilter" \
  --filter-pattern "ERROR" \
  --metric-transformations \
      metricName=CriticalErrors,metricNamespace=PaymentService,metricValue=1

# Create alarm with SNS notification
aws cloudwatch put-metric-alarm \
  --alarm-name "PaymentProcessorCriticalErrors" \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --evaluation-periods 1 \
  --metric-name "CriticalErrors" \
  --namespace "PaymentService" \
  --period 300 \
  --statistic Sum \
  --threshold 5 \
  --alarm-description "Alarm when 5 or more critical errors occur within 5 minutes" \
  --alarm-actions "arn:aws:sns:us-east-1:123456789012:AlertsTopic"

Conclusion

To wrap things up, setting up centralized logging in AWS helps bring clarity to your log management, making it easier to troubleshoot and monitor your cloud environment.

By combining the right AWS services and configurations, you’ll simplify the process and gain valuable insights. As your system grows, centralized logging will save you time and keep things running smoothly.

💡
And if you’d like to discuss anything further, our Discord community is here for you. We have a dedicated channel where you can connect with other developers and share your specific use case.

FAQs

Q: How much does AWS centralized logging cost?
A: The cost varies based on volume and retention. For a medium-sized application, expect $200-500/month using native AWS services. Costs can be higher with third-party tools, but often come with more capabilities.

Q: Can I use AWS centralized logging for compliance requirements?
A: Yes, with proper configuration. For regulations like HIPAA, PCI-DSS, or SOC2, you'll need to set appropriate retention periods, encryption settings, and access controls.

Q: How do I handle log data across multiple AWS regions?
A: You can either set up regional log aggregation points or forward all logs to a central region. The best approach depends on your latency requirements and data residency needs.

Q: What's the best way to handle PII in logs?
A: Implement log scrubbing before centralization. Use regex patterns to identify and mask sensitive data like credit cards, emails, and personal information.
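
A tiny sketch of that scrubbing step (the patterns are illustrative, not exhaustive):

import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def scrub(message: str) -> str:
    # Mask obvious PII before the log line leaves the service
    message = EMAIL.sub("[EMAIL]", message)
    return CARD.sub("[CARD]", message)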

Q: How do I monitor the health of my logging system itself?
A: Set up metrics on log volume, ingestion rates, and query performance. Alert on unexpected drops in log volume, which often indicate logging failures.

Q: Can serverless applications use the same centralized logging approach?
A: Yes, though the implementation differs slightly. Lambda functions automatically integrate with CloudWatch Logs, but you may need custom code for forwarding to other destinations.
