Cluster Monitoring
Monitor your Kubernetes cluster with Last9 using the Prometheus Kubernetes monitoring stack (the kube-prometheus-stack Helm chart). This integration collects cluster-level metrics, node metrics, and application insights, and forwards them to Last9 through Prometheus remote write.
Prerequisites
Before setting up Kubernetes cluster monitoring, ensure you have:
- Kubernetes Cluster: A running Kubernetes cluster (v1.19+)
- kubectl: Configured and connected to your cluster
- Helm: Installed (v3.9 or higher)
- Cluster Admin Access: Required for creating cluster-wide resources
- Last9 Account: With Prometheus remote write credentials
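The version minimums above can be checked before you start. The following is a minimal sketch, not part of the official setup: version_ge is a hypothetical helper name, and it assumes GNU sort with the -V (version sort) option is available.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumption: GNU `sort -V` (version sort) is installed.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example wiring (run where Helm is installed):
#   helm_version="$(helm version --template '{{.Version}}' | sed 's/^v//')"
#   version_ge "$helm_version" "3.9.0" || echo "Helm 3.9 or higher is required" >&2
```

The same helper can be pointed at the Kubernetes server version against the v1.19 minimum.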
Create Monitoring Namespace
Create a dedicated namespace for Last9 monitoring components:
    kubectl create namespace last9
Add Prometheus Community Helm Repository
Add and update the Prometheus community Helm repository:
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update
Set Up Remote Write Credentials
Create a Kubernetes secret containing your Last9 Prometheus remote write credentials:
    kubectl create secret generic last9-remote-write-secret \
      -n last9 \
      --from-literal=username="{{ .Metrics.Username }}" \
      --from-literal=password="{{ .Metrics.WriteToken }}"

Replace the placeholder values with your actual Last9 credentials from the Last9 Integrations page.
Create Monitoring Configuration
Create a file named k8s-monitoring-values.yaml with the following Helm chart values configuration:

    # Disable default deployments
    alertmanager:
      enabled: false
    grafana:
      enabled: false
    prometheus:
      enabled: true
      agentMode: true
      prometheusSpec:
        # Enable only necessary scrape configs
        serviceMonitorSelectorNilUsesHelmValues: false
        podMonitorSelectorNilUsesHelmValues: false
        # Global external labels to add to all metrics
        externalLabels:
          cluster: my-cluster-name # Replace with your cluster name
        # Configure remote write
        remoteWrite:
          - url: "{{ .Metrics.WriteURL }}"
            remoteTimeout: 60s
            queueConfig:
              capacity: 10000
              maxSamplesPerSend: 3000
              batchSendDeadline: 20s
              minShards: 4
              maxShards: 200
              minBackoff: 100ms
              maxBackoff: 10s
            basicAuth:
              username:
                name: last9-remote-write-secret
                key: username
              password:
                name: last9-remote-write-secret
                key: password
            writeRelabelConfigs:
              - sourceLabels: [__name__]
                regex: "up|kube_.*|container_.*|node_.*" # Keep relevant metrics
                action: keep
    # Keep kube-state-metrics
    kubeStateMetrics:
      enabled: true
    # Keep node-exporter
    nodeExporter:
      enabled: true
    # Enable cadvisor via kubelet service monitor
    kubelet:
      enabled: true
      serviceMonitor:
        resource: true
        cAdvisor: true # Enables scraping of cadvisor metrics
    # Disable unnecessary components
    prometheusOperator:
      admissionWebhooks:
        enabled: false
      tls:
        enabled: false
    # Keep the API server scrape; disable other exporters
    kubeApiServer:
      enabled: true
    kubeControllerManager:
      enabled: false
    kubeDns:
      enabled: false
    kubeEtcd:
      enabled: false
    kubeProxy:
      enabled: false
    kubeScheduler:
      enabled: false

Configuration Explanation:
- Agent Mode: Prometheus runs in agent mode, which is optimized for remote write and keeps only a write-ahead log instead of a local queryable database
- Metric Filtering: The keep rule forwards only up, kube_*, container_*, and node_* metrics, which reduces data volume
- External Labels: Adds a cluster label to all metrics so that each cluster can be identified
- Remote Write: Queue capacity, shard counts, and backoff are set explicitly for reliable data transmission
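To see which metric names the keep rule lets through, the regex can be tried against sample names locally. This is a rough sketch rather than a faithful relabeling engine: keeps_metric is a hypothetical helper, the sample names are only illustrations, and grep -x (whole-line match) stands in for the way Prometheus anchors relabel regexes at both ends.

```shell
# Regex copied from writeRelabelConfigs in k8s-monitoring-values.yaml.
KEEP_REGEX='up|kube_.*|container_.*|node_.*'

# keeps_metric NAME: succeeds when NAME matches the keep regex in full.
# grep -x approximates Prometheus' fully anchored relabel matching.
keeps_metric() {
  printf '%s\n' "$1" | grep -Eqx "$KEEP_REGEX"
}

# Sample names are illustrative only.
for metric in up kube_pod_info node_cpu_seconds_total prometheus_build_info upstream_requests_total; do
  if keeps_metric "$metric"; then
    echo "keep  $metric"
  else
    echo "drop  $metric"
  fi
done
```

Because the match is anchored, a name such as upstream_requests_total is not kept merely because it starts with up.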
Install Monitoring Stack
Deploy the Kubernetes monitoring stack using Helm:
    helm upgrade --install last9-k8s-monitoring prometheus-community/kube-prometheus-stack \
      -n last9 \
      -f k8s-monitoring-values.yaml \
      --version 75.15.1 \
      --create-namespace

This command installs:
- Prometheus Operator: Manages Prometheus instances and configurations
- Prometheus: In agent mode for metric collection and remote write
- kube-state-metrics: Exposes Kubernetes object state as metrics
- node-exporter: Collects hardware and OS metrics from cluster nodes
Verify Installation
Check that all monitoring components are running correctly:
    kubectl get pods -n last9

You should see pods similar to:

    NAME                                                READY   STATUS    RESTARTS   AGE
    last9-k8s-monitoring-kube-state-metrics-xxx-xxx     1/1     Running   0          2m
    last9-k8s-monitoring-operator-xxx-xxx               1/1     Running   0          2m
    last9-k8s-monitoring-prometheus-node-exporter-xxx   1/1     Running   0          2m
    prometheus-last9-k8s-monitoring-prometheus-0        2/2     Running   0          2m
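On a busy namespace it can be easier to list only the pods that need attention. The sketch below is one possible way to do that, under assumptions: not_ready_pods is a hypothetical helper, and it expects the default kubectl get pods column layout (NAME, READY, STATUS) with a header row.

```shell
# not_ready_pods: reads `kubectl get pods` output on stdin and prints the name
# of every pod that is not Running with all of its containers ready.
# Assumption: default column layout with a header row on the first line.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column looks like "1/1"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage against a live cluster:
#   kubectl get pods -n last9 | not_ready_pods
```

An empty result means every listed pod is Running and fully ready.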
Understanding the Setup
Prometheus Agent Mode
The setup uses Prometheus in agent mode, which offers:
- Optimized for Remote Write: No local query storage (only a write-ahead log), designed specifically for forwarding metrics
- Reduced Resource Usage: Lower memory and storage requirements
- Reliable Data Transfer: Built-in queue management and retry logic
- Automatic Discovery: Discovers services and pods automatically via service monitors
Metrics Collected
The monitoring stack automatically collects:
Cluster-Level Metrics
- kube-state-metrics: Kubernetes object states (deployments, pods, services, etc.)
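- API Server Metrics: Kubernetes API server performance and availability (the default keep rule forwards only metric names matching up, kube_*, container_*, and node_*, so add apiserver_.* to the regex to forward API server request metrics)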
- Cluster Resource Usage: CPU, memory, and storage across the cluster
Node-Level Metrics
- node-exporter: Hardware and OS metrics from each node
- kubelet: Container runtime metrics via cAdvisor
- Node Resources: CPU, memory, disk, and network utilization
Container Metrics
- Container Resources: CPU and memory usage per container
- Pod Metrics: Lifecycle, restart counts, and resource requests/limits
- Network Metrics: Network I/O per pod and container
Verification and Monitoring
Check Prometheus Remote Write
Verify that Prometheus is successfully sending data to Last9:
    kubectl logs -n last9 prometheus-last9-k8s-monitoring-prometheus-0 -c prometheus | grep "remote_write"
Validate Secret Access
Ensure Prometheus can access the remote write credentials:
    kubectl get secret last9-remote-write-secret -n last9 -o yaml
Monitor Resource Usage
Check resource consumption of monitoring components:
    kubectl top pods -n last9
Verify Metrics in Last9
Log in to your Last9 account and check that Kubernetes metrics are being received in Grafana.
Look for metrics like:
    up{job="kube-state-metrics"}
    kube_pod_info
    node_cpu_seconds_total
    container_memory_usage_bytes
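If you have a list of the metric names visible in Last9, the check above can be automated. This is only a sketch: missing_metrics is a hypothetical helper, and how you export the list of observed names (one per line) is left as an assumption, since it depends on your Last9 setup.

```shell
# Metric names taken from the list above (the up series is omitted because it
# is selected by job label rather than by name alone).
EXPECTED_METRICS='kube_pod_info
node_cpu_seconds_total
container_memory_usage_bytes'

# missing_metrics: reads observed metric names (one per line) on stdin and
# prints each expected name that does not appear among them.
missing_metrics() {
  observed="$(cat)"
  printf '%s\n' "$EXPECTED_METRICS" | while IFS= read -r name; do
    printf '%s\n' "$observed" | grep -Fxq "$name" || echo "$name"
  done
}

# Usage (observed-names.txt is a hypothetical export of metric names):
#   missing_metrics < observed-names.txt
```

No output means all expected names were found.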
Configuration Customization
Cluster Identification
Update the external labels to identify your cluster:
    prometheus:
      prometheusSpec:
        externalLabels:
          cluster: production-us-east-1
          environment: production
          team: platform

Metric Filtering
Customize the write relabel configs to include/exclude specific metrics:
    writeRelabelConfigs:
      - sourceLabels: [__name__]
        regex: "up|kube_.*|container_.*|node_.*|prometheus_.*"
        action: keep
      - sourceLabels: [__name__]
        regex: "kube_pod_container_status_.*"
        action: drop # Remove noisy metrics

Resource Limits
Configure resource limits for monitoring components:
    prometheus:
      prometheusSpec:
        resources:
          limits:
            cpu: 2000m
            memory: 4Gi
          requests:
            cpu: 1000m
            memory: 2Gi
    nodeExporter:
      resources:
        limits:
          cpu: 200m
          memory: 200Mi
        requests:
          cpu: 100m
          memory: 100Mi

Uninstallation
To remove the monitoring stack:
    helm uninstall last9-k8s-monitoring -n last9
    kubectl delete namespace last9

Troubleshooting
Prometheus Not Starting
Check Prometheus logs for configuration issues:
    kubectl logs -n last9 prometheus-last9-k8s-monitoring-prometheus-0 -c prometheus

Remote Write Failures
Verify credentials and network connectivity:
    kubectl describe secret last9-remote-write-secret -n last9
    kubectl logs -n last9 prometheus-last9-k8s-monitoring-prometheus-0 -c prometheus | grep -i error

High Resource Usage
Monitor resource consumption and adjust limits:
    kubectl top pods -n last9
    kubectl describe pod -n last9 prometheus-last9-k8s-monitoring-prometheus-0

Missing Metrics
Check service monitor selection and pod discovery:
    kubectl get servicemonitors -n last9
    kubectl get podmonitors -n last9

Best Practices
- Cluster Naming: Use consistent cluster naming across environments
- Resource Limits: Set appropriate CPU and memory limits for your cluster size
- Metric Filtering: Filter metrics to reduce costs and improve query performance
- Monitoring: Set up alerts for monitoring stack health and remote write failures
- Updates: Regularly update the Helm chart to get the latest features and security fixes
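As a lightweight starting point for watching stack health before proper alerts are in place, recent Prometheus logs can be scanned for error lines. This is a sketch under assumptions, not a substitute for alerting: count_error_lines is a hypothetical helper, and matching the word "error" is a coarse heuristic that mirrors the grep -i error check in the troubleshooting section.

```shell
# count_error_lines: reads log text on stdin and prints how many lines
# mention "error" (case-insensitive).
count_error_lines() {
  grep -ci 'error' || true   # grep exits non-zero when nothing matches
}

# Usage against a live cluster (last 10 minutes of logs):
#   kubectl logs -n last9 prometheus-last9-k8s-monitoring-prometheus-0 \
#     -c prometheus --since=10m | count_error_lines
```

A count that keeps growing between runs is a hint to inspect the remote write credentials and network connectivity.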
Need Help?
If you encounter any issues or have questions:
- Join our Discord community for real-time support
- Contact our support team at support@last9.io