
Cluster Monitoring

Monitor your Kubernetes cluster with Last9 using Prometheus stack for comprehensive metrics collection

Monitor your Kubernetes cluster with Last9 using the Prometheus Kubernetes monitoring stack. This integration deploys Prometheus in agent mode together with kube-state-metrics and node-exporter, and forwards cluster, node, and container metrics to Last9 over Prometheus remote write.

Prerequisites

Before setting up Kubernetes cluster monitoring, ensure you have:

  • Kubernetes Cluster: A running Kubernetes cluster (v1.19+)
  • kubectl: Configured and connected to your cluster
  • Helm: Installed (v3.9 or higher)
  • Cluster Admin Access: Required for creating cluster-wide resources
  • Last9 Account: With Prometheus remote write credentials
  1. Create Monitoring Namespace

    Create a dedicated namespace for Last9 monitoring components:

    kubectl create namespace last9
  2. Add Prometheus Community Helm Repository

    Add and update the Prometheus community Helm repository:

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update
  3. Set Up Remote Write Credentials

    Create a Kubernetes secret containing your Last9 Prometheus remote write credentials:

    kubectl create secret generic last9-remote-write-secret \
    -n last9 \
    --from-literal=username="{{ .Metrics.Username }}" \
    --from-literal=password="{{ .Metrics.WriteToken }}"

    Replace the placeholder values with your actual Last9 credentials from the Last9 Integrations page.

  4. Create Monitoring Configuration

    Create a file named k8s-monitoring-values.yaml with the following Helm chart values configuration:

    # Disable default deployments
    alertmanager:
      enabled: false
    grafana:
      enabled: false
    prometheus:
      enabled: true
      agentMode: true
      prometheusSpec:
        # Select all ServiceMonitors and PodMonitors, not only those
        # labeled for this Helm release
        serviceMonitorSelectorNilUsesHelmValues: false
        podMonitorSelectorNilUsesHelmValues: false
        # Global external labels to add to all metrics
        externalLabels:
          cluster: my-cluster-name # Replace with your cluster name
        # Configure remote write
        remoteWrite:
          - url: "{{ .Metrics.WriteURL }}" # Replace with your Last9 remote write URL
            remoteTimeout: 60s
            queueConfig:
              capacity: 10000
              maxSamplesPerSend: 3000
              batchSendDeadline: 20s
              minShards: 4
              maxShards: 200
              minBackoff: 100ms
              maxBackoff: 10s
            basicAuth:
              username:
                name: last9-remote-write-secret
                key: username
              password:
                name: last9-remote-write-secret
                key: password
            writeRelabelConfigs:
              - sourceLabels: [__name__]
                regex: "up|kube_.*|container_.*|node_.*" # Keep relevant metrics
                action: keep
    # Keep kube-state-metrics
    kubeStateMetrics:
      enabled: true
    # Keep node-exporter
    nodeExporter:
      enabled: true
    # Scrape the kubelet, including cAdvisor container metrics
    kubelet:
      enabled: true
      serviceMonitor:
        resource: true # Scrape the kubelet /metrics/resource endpoint
        cAdvisor: true # Scrape cAdvisor container metrics
    # Disable unnecessary components
    prometheusOperator:
      admissionWebhooks:
        enabled: false
      tls:
        enabled: false
    # Keep API server scraping; disable the other control plane scrapers
    kubeApiServer:
      enabled: true
    kubeControllerManager:
      enabled: false
    kubeDns:
      enabled: false
    kubeEtcd:
      enabled: false
    kubeProxy:
      enabled: false
    kubeScheduler:
      enabled: false

    Configuration Explanation:

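    • Agent Mode: Prometheus runs in agent mode, which scrapes targets and forwards samples over remote write without keeping a queryable local database
    • Metric Filtering: The writeRelabelConfigs keep rule forwards only metrics whose names match up, kube_*, container_*, or node_*, which reduces data volume
    • External Labels: Adds a cluster label to every metric so that data from different clusters can be told apart
    • Remote Write: Sets explicit queue capacity, batch size, shard, and backoff limits so that samples are buffered and retried during transient failures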
  5. Install Monitoring Stack

    Deploy the Kubernetes monitoring stack using Helm:

    helm upgrade --install last9-k8s-monitoring prometheus-community/kube-prometheus-stack \
    -n last9 \
    -f k8s-monitoring-values.yaml \
    --version 75.15.1 \
    --create-namespace

    This command installs:

    • Prometheus Operator: Manages Prometheus instances and configurations
    • Prometheus: In agent mode for metric collection and remote write
    • kube-state-metrics: Exposes Kubernetes object state as metrics
    • node-exporter: Collects hardware and OS metrics from cluster nodes
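
Before installing, it can be useful to render the chart locally and see which resource kinds it would create. The sketch below is one way to do that, assuming the release name, chart version, and values file used in this guide. The function names render_stack and summarize_kinds are hypothetical helpers, not part of Helm or Last9.

```shell
# Hypothetical helper: count resource kinds in rendered manifests read from stdin.
summarize_kinds() {
  grep -E '^kind:' | sort | uniq -c
}

# Hypothetical helper: render the chart locally without touching the cluster.
# Uses the same release name, namespace, values file, and version as step 5.
render_stack() {
  helm template last9-k8s-monitoring prometheus-community/kube-prometheus-stack \
    -n last9 \
    -f k8s-monitoring-values.yaml \
    --version 75.15.1
}

# Usage (requires helm, the repo from step 2, and the values file from step 4):
#   render_stack | summarize_kinds
```

If the summary includes components you disabled in the values file, such as Alertmanager or Grafana, the YAML indentation in k8s-monitoring-values.yaml is a likely place to look first.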
  6. Verify Installation

    Check that all monitoring components are running correctly:

    kubectl get pods -n last9

    You should see pods similar to:

    NAME                                                READY   STATUS    RESTARTS   AGE
    last9-k8s-monitoring-kube-state-metrics-xxx-xxx     1/1     Running   0          2m
    last9-k8s-monitoring-operator-xxx-xxx               1/1     Running   0          2m
    last9-k8s-monitoring-prometheus-node-exporter-xxx   1/1     Running   0          2m
    prometheus-last9-k8s-monitoring-prometheus-0        2/2     Running   0          2m
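
Instead of polling the pod list by hand, the readiness check can be scripted with kubectl wait. This is a minimal sketch: the function name wait_for_monitoring_pods and the 300s default timeout are assumptions chosen for illustration.

```shell
# Hypothetical helper: block until every pod in the namespace reports Ready,
# or return a non-zero status once the timeout expires.
# $1 = namespace (default: last9), $2 = timeout (default: 300s, illustrative).
wait_for_monitoring_pods() {
  namespace="${1:-last9}"
  timeout="${2:-300s}"
  kubectl wait --for=condition=Ready pods --all \
    -n "$namespace" --timeout="$timeout"
}

# Usage:
#   wait_for_monitoring_pods            # last9 namespace, 300s timeout
#   wait_for_monitoring_pods last9 10m  # explicit namespace and timeout
```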

Understanding the Setup

Prometheus Agent Mode

The setup runs Prometheus in agent mode. Its main characteristics are:

  • Optimized for Remote Write: No queryable local TSDB; samples are held in a write-ahead log only until they are forwarded
  • Reduced Resource Usage: Lower memory and storage requirements than a full Prometheus server
  • Reliable Data Transfer: Built-in queue management and retry logic
  • Automatic Discovery: The Prometheus Operator discovers services and pods through ServiceMonitor and PodMonitor resources

Metrics Collected

The monitoring stack automatically collects:

Cluster-Level Metrics

  • kube-state-metrics: Kubernetes object states (deployments, pods, services, etc.)
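  • API Server Metrics: Kubernetes API server performance and availability. The default keep regex does not match apiserver_* series, so add apiserver_.* to the regex to forward request latency and error metrics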
  • Cluster Resource Usage: CPU, memory, and storage across the cluster

Node-Level Metrics

  • node-exporter: Hardware and OS metrics from each node
  • kubelet: Container runtime metrics via cAdvisor
  • Node Resources: CPU, memory, disk, and network utilization

Container Metrics

  • Container Resources: CPU and memory usage per container
  • Pod Metrics: Lifecycle, restart counts, and resource requests/limits
  • Network Metrics: Network I/O per pod and container

Verification and Monitoring

  1. Check Prometheus Remote Write

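    Verify that Prometheus has started its remote write client and is not logging send errors. If the Prometheus pod has a different name in your cluster, substitute the name reported by kubectl get pods -n last9:

    kubectl logs -n last9 prometheus-last9-k8s-monitoring-prometheus-0 -c prometheus | grep -i "remote"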
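
Logs show errors only after they happen, so it can also help to read the remote write counters that Prometheus exposes about itself. The sketch below sums one metric from Prometheus text exposition. The helper name sum_metric is hypothetical, and the metric name in the usage note assumes a recent Prometheus version.

```shell
# Hypothetical helper: sum every sample of one metric from Prometheus text
# exposition on stdin. $1 is the metric name; label sets are added together.
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)            # strip the {label="..."} part
      if (metric == name) total += $NF   # last field is the sample value
    }
    END { printf "%d\n", total }
  '
}

# Usage (assumes the pod name from step 6 and local port 9090 being free):
#   kubectl port-forward -n last9 pod/prometheus-last9-k8s-monitoring-prometheus-0 9090:9090 &
#   curl -s http://localhost:9090/metrics | sum_metric prometheus_remote_storage_samples_failed_total
```

A failed-sample total that keeps growing between two runs suggests that Last9 is rejecting writes or is unreachable, even when the pod itself looks healthy.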
  2. Validate Secret Access

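    Confirm that the secret exists in the last9 namespace and contains the username and password keys referenced by the remoteWrite basicAuth section. The values in the output are base64-encoded, not encrypted, so avoid pasting it into shared channels:

    kubectl get secret last9-remote-write-secret -n last9 -o yaml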
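
To check that a stored value matches what the Last9 Integrations page shows, a single key can be decoded directly. This is a sketch only: read_secret_key is a hypothetical helper name, and base64 -d is the GNU and BusyBox flag (older macOS releases use -D instead).

```shell
# Hypothetical helper: print the decoded value of one key in the
# last9-remote-write-secret secret. $1 is the key name (username or password).
read_secret_key() {
  key="$1"
  kubectl get secret last9-remote-write-secret -n last9 \
    -o "jsonpath={.data.$key}" | base64 -d
}

# Usage:
#   read_secret_key username
```

A decoded value that still contains a literal {{ .Metrics.Username }} placeholder means the secret was created before the placeholders were replaced.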
  3. Monitor Resource Usage

    Check resource consumption of monitoring components:

    kubectl top pods -n last9
  4. Verify Metrics in Last9

    Log into your Last9 account and check that Kubernetes metrics are being received in Grafana.

    Look for metrics like:

    • up{job="kube-state-metrics"}
    • kube_pod_info
    • node_cpu_seconds_total
    • container_memory_usage_bytes

Configuration Customization

Cluster Identification

Update the external labels to identify your cluster:

prometheus:
  prometheusSpec:
    externalLabels:
      cluster: production-us-east-1
      environment: production
      team: platform

Metric Filtering

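Customize the write relabel configs, under prometheus.prometheusSpec.remoteWrite, to include or exclude specific metrics. Rules run in order, so the drop rule below applies only to series that passed the keep rule:

writeRelabelConfigs:
  - sourceLabels: [__name__]
    regex: "up|kube_.*|container_.*|node_.*|prometheus_.*"
    action: keep
  - sourceLabels: [__name__]
    regex: "kube_pod_container_status_.*"
    action: drop # Example only: also removes kube_pod_container_status_restarts_total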
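
A keep and drop chain can be tried against a list of metric names before it is deployed. The sketch below approximates the two rules above with grep. Prometheus anchors relabel regexes at both ends, which grep -x imitates by matching whole lines. The helper name filter_metric_names is hypothetical.

```shell
# Hypothetical helper: apply the keep rule, then the drop rule, to metric
# names read from stdin (one name per line). Prints the names that would
# still be forwarded.
filter_metric_names() {
  grep -Ex 'up|kube_.*|container_.*|node_.*|prometheus_.*' |
    grep -Evx 'kube_pod_container_status_.*'
}

# Usage:
#   printf '%s\n' up kube_pod_info apiserver_request_total | filter_metric_names
```

In that usage example, apiserver_request_total is filtered out because no alternative in the keep regex matches it.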

Resource Limits

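Configure resource limits for monitoring components. In kube-prometheus-stack, node-exporter is a subchart, so its resources are set under the prometheus-node-exporter key rather than nodeExporter:

prometheus:
  prometheusSpec:
    resources:
      limits:
        cpu: 2000m
        memory: 4Gi
      requests:
        cpu: 1000m
        memory: 2Gi
prometheus-node-exporter:
  resources:
    limits:
      cpu: 200m
      memory: 200Mi
    requests:
      cpu: 100m
      memory: 100Mi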

Uninstallation

To remove the monitoring stack:

helm uninstall last9-k8s-monitoring -n last9
kubectl delete namespace last9
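
Helm does not delete the CustomResourceDefinitions that the chart installed, so they remain after the commands above. The sketch below removes them. The CRD list is an assumption based on the kube-prometheus-stack README at the time of writing; compare it with the output of kubectl get crd before running. Deleting a CRD also deletes every resource of that type in the cluster, so skip this step if another Prometheus Operator installation shares the cluster. The function name delete_operator_crds is hypothetical.

```shell
# Assumed list of CRDs in the monitoring.coreos.com group; verify it with:
#   kubectl get crd | grep monitoring.coreos.com
PROMETHEUS_OPERATOR_CRDS="alertmanagerconfigs alertmanagers podmonitors probes prometheusagents prometheuses prometheusrules scrapeconfigs servicemonitors thanosrulers"

# Hypothetical helper: delete each CRD, ignoring ones that are already gone.
delete_operator_crds() {
  for crd in $PROMETHEUS_OPERATOR_CRDS; do
    kubectl delete crd "$crd.monitoring.coreos.com" --ignore-not-found
  done
}

# Usage:
#   delete_operator_crds
```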

Troubleshooting

Prometheus Not Starting

Check Prometheus logs for configuration issues:

kubectl logs -n last9 prometheus-last9-k8s-monitoring-prometheus-0 -c prometheus

Remote Write Failures

Verify credentials and network connectivity:

kubectl describe secret last9-remote-write-secret -n last9
kubectl logs -n last9 prometheus-last9-k8s-monitoring-prometheus-0 -c prometheus | grep -i error

High Resource Usage

Monitor resource consumption and adjust limits:

kubectl top pods -n last9
kubectl describe pod -n last9 prometheus-last9-k8s-monitoring-prometheus-0

Missing Metrics

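Check that the expected ServiceMonitor and PodMonitor resources exist. The commands below list the last9 namespace; replace -n last9 with -A to list every namespace: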

kubectl get servicemonitors -n last9
kubectl get podmonitors -n last9

Best Practices

  • Cluster Naming: Use consistent cluster naming across environments
  • Resource Limits: Set appropriate CPU and memory limits for your cluster size
  • Metric Filtering: Filter metrics to reduce costs and improve query performance
  • Monitoring: Set up alerts for monitoring stack health and remote write failures
  • Updates: Regularly update the Helm chart to get latest features and security fixes

Need Help?

If you encounter any issues or have questions: