Your dashboards show 99.9% uptime — everything looks healthy. Synthetic tests pass, and backend latency sits comfortably at 50 ms.
Yet mobile users in Southeast Asia report slow checkouts, and your support team sees a rise in “page not loading” tickets. When the product team tries to reproduce the problem, everything seems fine.
That’s where Real User Monitoring helps — it shows how users actually experience your app, across different devices, networks, and regions.
TL;DR
What it is: Real User Monitoring (RUM) collects performance data from real user sessions—not from synthetic or scripted tests. It reveals how your app behaves across actual devices, networks, and usage conditions.
When to use it: Use RUM to understand the real impact on users, debug region-specific or device-specific issues, and link performance insights with outcomes like conversion or engagement rates.
The Problem RUM Solves
Traditional monitoring tells you what’s happening on your servers. Synthetic monitoring shows what could happen in ideal test conditions. But neither tells you what your users are actually going through.
Here’s where the gap shows up:
Scenario 1: The Regional Blind Spot
Your traces show an API response time of 100 ms. Yet, users in India face 4-second page loads—caused by CDN misconfigurations, slow third-party scripts, or poor network routes your synthetic tests never hit.
Scenario 2: The Device Diversity Gap
Your checkout works flawlessly on a high-end MacBook Pro but breaks on older Android phones with unstable 3G connections—the same devices used by nearly 40% of your customers.
Scenario 3: The Business Impact Mystery
Backend metrics look fine, but conversions drop 15%. Without RUM, you’re left guessing. With it, you notice that LCP spiked by 800 ms after a recent deployment—enough delay for users to leave before the CTA appears.
RUM bridges the gap between technical metrics, user experience, and business outcomes. It’s the missing layer of visibility.
What Real User Monitoring Is
Real User Monitoring (RUM) is observability from the user’s point of view. Instead of running predefined tests from fixed locations (like synthetic monitoring does), RUM collects telemetry from actual user sessions as they happen — every click, page load, and network hop.
It tracks performance across:
- Devices: From high-end desktops to mid-range Android phones and older tablets.
- Networks: 5G, LTE, 3G, corporate VPNs, and even flaky café Wi-Fi.
- Geographies: Different regions, local ISPs, and how your CDN behaves across them.
- Browsers: Chrome, Safari, Firefox, Edge — all with their quirks and versions.
- Usage patterns: Peak traffic surges, rare edge cases, and unexpected user flows that tests rarely hit.
RUM data comes from lightweight JavaScript agents embedded in your frontend or from server-side instrumentation. These agents capture timing data, resource loads, user interactions, and errors in real time — giving you a ground-level view of how your app performs where it matters most: in your users’ hands.
RUM in Your Observability Stack
If you already think in terms of the three pillars — metrics, logs, and traces — RUM adds a missing fourth one: user experience.
Your existing observability stack tells you how your systems are behaving. RUM tells you how users are experiencing that behavior.
For example, your traces might show a lightning-fast 50 ms API response. But that doesn’t include:
- DNS resolution
- SSL handshake
- Network latency
- Browser parsing and rendering
- Third-party script delays
- Client-side JavaScript execution
RUM captures all of it.
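To make that concrete, most of those client-side phases can be read straight from the browser’s Navigation Timing API. Here’s a minimal sketch; sendMetric is a placeholder for whatever reporting call your RUM agent exposes (the same placeholder the snippets later in this article use):

// Break a real page load into the client-side phases a backend trace never sees.
// sendMetric is a hypothetical reporting helper, not a specific vendor API.
const [nav] = performance.getEntriesByType('navigation');
if (nav) {
  sendMetric('dns_ms', nav.domainLookupEnd - nav.domainLookupStart);
  sendMetric('tls_ms', nav.secureConnectionStart > 0
    ? nav.connectEnd - nav.secureConnectionStart
    : 0);
  sendMetric('ttfb_ms', nav.responseStart - nav.requestStart);
  sendMetric('download_ms', nav.responseEnd - nav.responseStart);
  sendMetric('dom_processing_ms', nav.domContentLoadedEventEnd - nav.responseEnd);
}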
During incidents, instead of assuming how bad things are for users, you can see it: rising error rates, abandoned sessions, or region-specific slowdowns — all tied to real user impact.
RUM vs. Synthetic Monitoring
You don’t have to pick one over the other. Synthetic monitoring and Real User Monitoring (RUM) play different but complementary roles. Think of synthetic as “what should happen,” and RUM as “what does happen.”
Where Synthetic Monitoring Shines
Synthetic monitoring runs scripted, predictable tests from fixed points. It’s great when you want stability, control, and early warnings.
Use it to:
- Validate deployments before they hit production
- Establish performance baselines that you can compare over time
- Monitor uptime continuously (is your API up right now?)
- Catch regressions when you add new features
- Benchmark performance vs. competitors from neutral locations
In short, synthetic is reliable, repeatable, and proactive.
When RUM Becomes Essential
RUM captures the real experience of real users: across devices, networks, geographies — all the messy, unpredictable bits that synthetic tests often miss.
Use it to:
- Understand real user impact, not just theoretical performance
- Spot regional or ISP-specific issues that tests don’t trigger
- Tie performance to business outcomes like conversions and churn
- Segment by device, app version, or user tier to see who’s affected
- Catch issues exposed only under real traffic patterns
RUM gives you the context and clarity that synthetic monitoring alone can't.
How They Work Together
Most teams start with synthetic monitoring, since it’s easier to set up and gives quick guardrails. As the product grows, RUM is layered in to close the visibility gap.
- For critical flows (login, checkout, onboarding), run synthetic tests and track RUM data in parallel.
- Use synthetic alerts to know when something breaks. Use RUM to see how much it bothers users and where.
- Over time, the two reinforce each other: synthetic helps prevent downtime, while RUM tells you whether users suffer when things go wrong.
What RUM Measures
Real User Monitoring (RUM) collects telemetry from actual user sessions—data from the browsers, devices, and networks your users rely on. It captures what happens during live interactions, giving you visibility into application performance from the user’s side.
Frontend Performance Metrics
Modern browsers expose detailed timing APIs that RUM agents use to record how a page loads and responds. These numbers define how responsive and stable the experience feels.
Core Web Vitals (Google’s UX metrics):
- Largest Contentful Paint (LCP): Marks when the main content becomes visible. Target: under 2.5 seconds. A slower LCP usually points to large images, blocking scripts, or render delays.
- First Input Delay (FID) / Interaction to Next Paint (INP): Measures how quickly a page reacts to user actions; INP has since replaced FID as the official Core Web Vital. Target: under 100 ms for FID or 200 ms for INP. High values indicate JavaScript or main-thread blocking.
- Cumulative Layout Shift (CLS): Tracks how much layout elements move while loading. Target: under 0.1. High CLS means the layout reflows too often, affecting usability.
Additional Loading Metrics:
- Time to First Byte (TTFB): Time from request to the first byte received. Includes DNS lookup, connection, and backend response.
- First Contentful Paint (FCP): When the first visible element appears. Useful for identifying render-blocking resources such as CSS or fonts.
- DOM Content Loaded: Indicates when HTML parsing completes and the DOM is ready. Longer times often mean synchronous scripts or large bundles are blocking progress.
These metrics reflect how efficiently your application loads and renders for users—not just how fast your backend responds.
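The Core Web Vitals snippets appear later in this article; the additional loading metrics above are just as easy to capture. A small sketch for FCP and DOM Content Loaded, again assuming a hypothetical sendMetric helper:

// First Contentful Paint, via the Paint Timing API
new PerformanceObserver((entryList) => {
  const fcp = entryList.getEntriesByName('first-contentful-paint')[0];
  if (fcp) sendMetric('fcp', fcp.startTime);
}).observe({ type: 'paint', buffered: true });

// DOM Content Loaded, via the Navigation Timing API (read after the load event,
// when domContentLoadedEventEnd is guaranteed to be populated)
window.addEventListener('load', () => {
  const [nav] = performance.getEntriesByType('navigation');
  if (nav) sendMetric('dom_content_loaded', nav.domContentLoadedEventEnd - nav.startTime);
});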
Error Tracking with Context
RUM provides full context for JavaScript errors in production, capturing more than the stack trace. It includes browser version, device details, user actions, active feature flags, and session attributes.
For example:
“Error on checkout page — iPhone 12, iOS 16.3, new-payment-flow flag enabled, TypeError at checkout.js:847.”
With this context, debugging becomes faster and more precise.
Backend Response Patterns
RUM can extend beyond frontend metrics through server-side instrumentation. It helps link client-side delays to backend behavior.
You can track:
- Request traces from browser to database
- Database query latency under live workloads
- Third-party API performance and dependency delays
- Resource contention during traffic spikes, such as connection pool saturation or cache misses
This correlation shows how backend performance affects user experience.
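The usual way to make that link is to propagate a trace context header from the browser, so backend spans carry the same trace ID as the user session. A rough sketch, assuming a backend that accepts the W3C traceparent header; makeId and sendMetric are simplified placeholders for a real agent’s helpers:

// Attach a W3C traceparent header so backend traces can be joined
// with the user session that triggered the request.
function makeId(bytes) {
  return Array.from(crypto.getRandomValues(new Uint8Array(bytes)))
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

async function tracedFetch(url, options = {}) {
  const traceId = makeId(16);
  const spanId = makeId(8);
  const headers = { ...(options.headers || {}), traceparent: `00-${traceId}-${spanId}-01` };

  const start = performance.now();
  const response = await fetch(url, { ...options, headers });

  sendMetric('client_request', performance.now() - start, {
    url,
    trace_id: traceId, // the same ID shows up on the backend spans
    status: response.status
  });
  return response;
}

In practice, OpenTelemetry’s browser instrumentation can handle this propagation for you; the sketch just shows what travels over the wire.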
User Interaction Data
RUM also captures how users engage with your application, connecting system performance to real behavior.
It can measure:
- Click activity: Which elements users interact with most
- Form usage: Time to complete, abandonment rate, or field-level friction
- Navigation patterns: Common paths and drop-off points
- Feature adoption: Which capabilities drive engagement
This data helps teams understand how technical performance influences usability and business outcomes.
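As an illustration of the form-usage point above, here’s a minimal sketch of abandonment tracking; the #checkout-form selector and sendMetric helper are assumptions for the example:

// Rough form-friction tracking: time from first focus to submit,
// plus an abandonment event if the user leaves mid-form.
// '#checkout-form' and sendMetric are illustrative assumptions.
const form = document.querySelector('#checkout-form');
let firstFocusAt = null;
let submitted = false;

form?.addEventListener('focusin', () => {
  if (firstFocusAt === null) firstFocusAt = performance.now();
});

form?.addEventListener('submit', () => {
  submitted = true;
  sendMetric('form_complete_time', performance.now() - firstFocusAt, { form_id: form.id });
});

window.addEventListener('pagehide', () => {
  if (firstFocusAt !== null && !submitted) {
    sendMetric('form_abandoned', 1, { form_id: form.id });
  }
});

A real agent would flush that last event with navigator.sendBeacon so it isn’t lost when the page unloads.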
How RUM Works
Real User Monitoring (RUM) works by observing what happens during real user sessions — from the first click in the browser to the last database query on the backend.
It combines client-side instrumentation with backend traces to give you a complete picture of how your application performs in production.
Client-Side Implementation
RUM usually starts with a lightweight JavaScript snippet added to your frontend. It loads asynchronously, so it doesn’t block the main thread or delay rendering.
// Async RUM initialization - doesn't block page rendering
(function() {
  const rumScript = document.createElement('script');
  rumScript.async = true;
  rumScript.src = 'https://rum-provider.com/collector.js';
  rumScript.onload = function() {
    RUM.init({
      apiKey: 'your-api-key',
      service: 'your-service-name',
      environment: 'production',
      sampleRate: 0.1, // Capture 10% of sessions
      trackInteractions: true, // Clicks, scrolls, inputs
      trackResources: true, // CSS, JS, images, fonts
      allowedDomains: ['api.yourdomain.com'],
      metadata: {
        userTier: window.userTier,
        featureFlags: window.activeFlags,
        appVersion: '2.4.1'
      }
    });
  };
  const firstScript = document.getElementsByTagName('script')[0];
  firstScript.parentNode.insertBefore(rumScript, firstScript);
})();

Asynchronous loading ensures the monitoring doesn’t interfere with what it’s measuring. Once initialized, the agent begins collecting timing data, resource metrics, and interaction events directly from the user’s browser.
Track Core Web Vitals
Modern browsers expose rich APIs that RUM tools use to measure frontend performance. These metrics — LCP, FID, and CLS — represent how fast your content appears, how quickly it responds, and how stable it feels.
// Track Largest Contentful Paint (LCP)
new PerformanceObserver((entryList) => {
  const entries = entryList.getEntries();
  const lastEntry = entries[entries.length - 1];
  sendMetric('lcp', lastEntry.startTime, {
    element: lastEntry.element?.tagName,
    url: lastEntry.url,
    size: lastEntry.size
  });
}).observe({ type: 'largest-contentful-paint', buffered: true });

// Track First Input Delay (FID)
new PerformanceObserver((entryList) => {
  const firstInput = entryList.getEntries()[0];
  const fid = firstInput.processingStart - firstInput.startTime;
  sendMetric('fid', fid, {
    eventType: firstInput.name,
    target: firstInput.target?.tagName
  });
}).observe({ type: 'first-input', buffered: true });

// Track Cumulative Layout Shift (CLS)
let clsScore = 0;
new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    if (!entry.hadRecentInput) clsScore += entry.value;
  }
  sendMetric('cls', clsScore);
}).observe({ type: 'layout-shift', buffered: true });

These values help you go from “the page is slow” to “LCP is 4 seconds because the hero image is too large.” It’s precise, actionable insight instead of guesswork.
Capture Errors with Context
Error tracking in production often lacks the context developers need. RUM improves this by attaching details like user environment, active feature flags, and recent actions.
// Standard errors
window.addEventListener('error', (event) => {
  const errorData = {
    message: event.error?.message || event.message,
    filename: event.filename,
    line: event.lineno,
    column: event.colno,
    stack: event.error?.stack,
    userAgent: navigator.userAgent,
    url: window.location.href,
    timestamp: Date.now(),
    userId: getCurrentUserId(),
    sessionId: getSessionId(),
    featureFlags: getActiveFeatureFlags(),
    userTier: getUserTier(),
    recentActions: getRecentUserActions()
  };
  sendErrorMetric(errorData);
});

It also covers unhandled promise rejections and network failures — two common blind spots in modern web apps.
// Unhandled promise rejections
window.addEventListener('unhandledrejection', (event) => {
  sendErrorMetric({
    type: 'unhandled_promise_rejection',
    reason: event.reason?.toString(),
    stack: event.reason?.stack,
    url: window.location.href,
    timestamp: Date.now(),
    asyncContext: getCurrentAsyncOperation()
  });
});

// Track API calls and network errors
const originalFetch = window.fetch;
window.fetch = async function(...args) {
  const startTime = performance.now();
  try {
    const response = await originalFetch.apply(this, args);
    const duration = performance.now() - startTime;
    sendMetric('api_call', duration, {
      url: args[0],
      method: args[1]?.method || 'GET',
      status: response.status,
      success: response.ok
    });
    return response;
  } catch (error) {
    sendErrorMetric({
      type: 'network_error',
      url: args[0],
      error: error.message
    });
    throw error;
  }
};

With this setup, you not only know that a request failed, but also what the user was doing, which feature flags were active, and which API call caused the issue.
Server-Side Instrumentation
To connect frontend performance with backend activity, RUM integrates with tracing frameworks like OpenTelemetry. This adds the missing link between a user session and the corresponding backend requests.
from flask import Flask, jsonify
from opentelemetry import trace
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor
from opentelemetry.instrumentation.sqlalchemy import SQLAlchemyInstrumentor

app = Flask(__name__)

# Auto-instrument the web framework, outbound HTTP calls, and the ORM
FlaskInstrumentor().instrument_app(app)
RequestsInstrumentor().instrument()
SQLAlchemyInstrumentor().instrument()

tracer = trace.get_tracer(__name__)

@app.route('/api/users/<user_id>')
def get_user(user_id):
    with tracer.start_as_current_span("get_user") as span:
        span.set_attribute("user.id", user_id)
        span.set_attribute("endpoint", "/api/users")

        # 'database' stands in for your data-access layer
        user = database.get_user(user_id)

        span.set_attribute("user.tier", user.tier)
        span.set_attribute("cache.hit", user.from_cache)
        return jsonify(user.to_dict())

When a user reports a slow checkout, this trace helps you follow the path from browser event → API gateway → microservice → database. You see where the latency originates — not just that it exists.
Sampling Strategies
Capturing every session can get expensive fast. Sampling helps you balance insight with cost.
const samplingConfig = {
  errorSampleRate: 1.0, // Always keep errors
  successSampleRate: (session) => {
    if (session.userTier === 'enterprise') return 1.0;
    if (session.userTier === 'paid') return 0.5;
    return 0.1;
  },
  geographicSampleRate: {
    US: 0.1,
    EU: 0.1,
    APAC: 0.3,
    LATAM: 0.3,
    default: 0.2
  },
  featureFlagSampleRate: (flags) => {
    if (flags.includes('beta-checkout-flow')) return 1.0;
    return 0.1;
  }
};

Capture everything critical — errors, enterprise users, problem regions, or new features — and downsample the rest.
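How the config gets applied depends on your RUM agent, but the decision usually happens once per session. Here’s a minimal sketch, assuming the samplingConfig object above and a hypothetical session descriptor with the fields it references:

// Decide whether to record a session using the config above.
// 'session' is a hypothetical descriptor a RUM agent might build at startup.
function shouldRecordSession(session, config) {
  if (session.hasErrors) return Math.random() < config.errorSampleRate;

  const byTier = config.successSampleRate(session);
  const byRegion = config.geographicSampleRate[session.region]
    ?? config.geographicSampleRate.default;
  const byFlags = config.featureFlagSampleRate(session.featureFlags || []);

  // Use the most generous rate so important segments are never dropped
  return Math.random() < Math.max(byTier, byRegion, byFlags);
}

// Example: a paid-tier APAC user on the beta checkout flow is always kept
shouldRecordSession(
  { userTier: 'paid', region: 'APAC', featureFlags: ['beta-checkout-flow'] },
  samplingConfig
);

Taking the maximum keeps the guarantees intact: an enterprise session stays at 100% even if its region would normally be sampled at 10%.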
Business Metrics Beyond Technical Telemetry
Performance metrics tell you what happened. Business metrics tell you why it matters.
By combining both, you can see how performance affects conversions, retention, or engagement.
// Track feature engagement
document.addEventListener('click', (event) => {
  if (event.target.matches('[data-track-feature]')) {
    const feature = event.target.dataset.trackFeature;
    sendMetric('feature_interaction', 1, {
      feature_name: feature,
      user_tier: getUserTier(),
      page_context: window.location.pathname,
      current_lcp: getCurrentLCP(),
      page_load_time: getPageLoadTime()
    });
  }
});

An increase in LCP by 500 ms is interesting.
An increase in LCP by 500 ms that coincides with an 8% drop in checkout conversion is something you can act on.
Data Privacy and Compliance
Real User Monitoring captures data from real users, which makes privacy and compliance a core part of any implementation. Handling this data responsibly builds user trust and helps you stay within legal boundaries such as GDPR and CCPA.
PII Scrubbing
Before any telemetry leaves the browser, ensure personally identifiable information (PII) is removed or masked. RUM data should include performance context, not sensitive user data.
// Automatic PII detection and scrubbing
function scrubPII(data) {
  const piiPatterns = {
    email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
    phone: /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g,
    ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
    creditCard: /\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b/g,
    ipAddress: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g
  };
  let scrubbedData = JSON.stringify(data);
  for (const [type, pattern] of Object.entries(piiPatterns)) {
    scrubbedData = scrubbedData.replace(pattern, `[REDACTED_${type.toUpperCase()}]`);
  }
  return JSON.parse(scrubbedData);
}

// Apply to all outgoing RUM data
function sendMetric(name, value, attributes) {
  const scrubbedAttributes = scrubPII(attributes);
  rum.send(name, value, scrubbedAttributes);
}

Automated scrubbing like this ensures your observability pipeline only stores technical signals, not personal information.
Consent Management
Respecting user consent is essential, both for compliance and user trust. Initialize RUM only after confirming that analytics tracking is permitted.
// Initialize RUM only with consent
if (hasUserConsent('analytics')) {
  RUM.init({
    // ... configuration
    respectDNT: true, // Honor browser "Do Not Track"
    dataRetentionDays: 90, // Limit storage duration
    dataRegion: getUserRegion() // Keep EU data within the EU
  });
} else {
  // Minimal mode: capture errors only, without identifying data
  RUM.init({
    mode: 'minimal',
    trackErrors: true,
    trackPerformance: false,
    trackSessions: false,
    disableSessionReplay: true
  });
}

This approach ensures that observability remains functional while still honoring privacy choices.
What RUM Won’t Do
Real User Monitoring brings valuable visibility, but like any observability signal, it has clear boundaries:
1. Data has variability
Every user session is different — devices, browsers, networks, and extensions all influence results. Some outliers will always appear, and that’s expected in real-world telemetry.
How to approach it:
- Focus on percentiles (p75, p95) instead of averages.
- Use segmentation by device, region, or network type for better comparisons.
- Track trends over time to understand meaningful changes, not single-session anomalies.
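For instance, here’s a minimal sketch of why percentiles beat averages in practice; the LCP samples are made-up values and percentile() uses the simple nearest-rank method:

// Percentiles describe what a typical user experiences; averages get skewed by outliers.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

const lcpSamples = [1800, 1900, 2000, 2100, 2200, 9800]; // illustrative LCP values in ms
console.log('average:', lcpSamples.reduce((a, b) => a + b, 0) / lcpSamples.length); // 3300 ms, skewed by one outlier
console.log('p75:', percentile(lcpSamples, 75)); // 2200 ms, what a typical user sees
console.log('p95:', percentile(lcpSamples, 95)); // 9800 ms, the slow tail worth investigating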
2. Generates large data volumes
RUM captures continuous streams of frontend metrics, events, and user interactions. Without boundaries, this data can grow quickly and become expensive to store or analyze.
How to approach it:
- Start with low sampling rates and increase only where needed.
- Prioritize important segments — errors, enterprise users, active experiments, or regions under review.
- Combine RUM with backend metrics for focused investigations rather than raw data exploration.
3. Some sessions won’t be visible
Not every user interaction will be recorded. Privacy extensions, corporate firewalls, and certain ad blockers may prevent your RUM script from running.
How to approach it:
- Use RUM alongside synthetic and server-side monitoring to fill those gaps.
- Measure coverage (sessions seen vs. total traffic) to know where visibility ends.
- Treat RUM data as representative, not exhaustive.
4. Client-side instrumentation adds cost
RUM runs in the browser, which means some CPU, memory, and bandwidth overhead. If session replay is enabled, DOM capture adds more.
How to approach it:
- Load scripts asynchronously to avoid blocking rendering.
- Keep session replay sampling low (around 5–10%) and focus on sessions with errors.
- Regularly measure the RUM agent’s footprint—its size, load time, and CPU impact.
Use hints like <link rel="preconnect" href="https://rum-collector.com"> to reduce connection overhead.
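For the footprint check itself, the Resource Timing API is usually enough. A quick sketch, assuming the illustrative collector URL from the init snippet earlier and the same sendMetric placeholder:

// Measure what the RUM agent itself costs the page.
// 'rum-provider.com/collector.js' matches the illustrative init snippet above.
const agentEntry = performance
  .getEntriesByType('resource')
  .find((entry) => entry.name.includes('rum-provider.com/collector.js'));

if (agentEntry) {
  sendMetric('rum_agent_overhead', agentEntry.duration, {
    transfer_size_bytes: agentEntry.transferSize,
    started_at_ms: agentEntry.startTime
  });
}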
5. Privacy and compliance remain ongoing work
RUM collects data from real people. That makes privacy, consent, and compliance non-negotiable. Different regions enforce different regulations, and those rules evolve.
How to approach it:
- Initialize RUM only after user consent is granted.
- Scrub PII before export and anonymize all session data.
- Enforce data residency and retention based on the user region.
- Keep documentation and vendor DPAs up to date.
6. Correlation needs validation
RUM often highlights relationships between performance and user behavior — for instance, slower pages correlating with lower conversions. But these patterns still need testing.
How to approach it:
- Use RUM to form hypotheses.
- Validate them through controlled experiments or A/B testing.
- Track both technical and business metrics to confirm actual impact.
RUM gives you the real-world view of performance. When paired with backend telemetry, synthetic checks, and responsible data practices, it becomes one of the most reliable tools in your observability stack.
How Last9 Helps You with RUM
Real User Monitoring (RUM) in Last9 connects frontend performance to the same telemetry pipeline as your metrics, logs, and traces. Instead of checking separate tools, you see how users actually experience your system — and what’s causing slowdowns underneath.

When a user session shows a 4-second LCP, you can immediately see:
- which API call took longer than expected,
- what the service latency looked like, and
- whether infrastructure limits were hit.
That’s the difference between observing a symptom and understanding its cause.
What You Get
Last9’s RUM setup automatically tracks:
- Core Web Vitals: TTFB, FCP, LCP, CLS, INP
- Error insights: JavaScript exceptions with full stack traces
- Custom events: Checkout, search, and other key interactions
- Segmentation: Filter by region, device, app version, or network
- OpenTelemetry support: Correlate frontend data with backend traces
The Outcome
- One place to view user experience and system performance
- Faster debugging across frontend, backend, and infra layers
- Performance budgets you can enforce in CI/CD
With RUM built into Last9, user experience isn’t a separate metric — it becomes part of how you observe, debug, and improve your systems.
Getting started takes just a few minutes — follow our RUM setup guide for detailed steps. If you need help at any point, or want to understand how Last9 fits within your stack, connect with our experts!
FAQs
What's the difference between RUM and APM?
APM (Application Performance Monitoring) focuses on server-side application health: API response times, database queries, backend errors, service dependencies. RUM focuses on client-side user experience: page load times, browser rendering, user interactions, and frontend errors. You need both—APM shows what's happening on your servers, RUM shows what users experience.
Do I need RUM if I have synthetic monitoring?
Yes. Synthetic monitoring runs scripted tests from controlled locations with consistent conditions. It's good for uptime checks and catching regressions. But it can't capture the diversity of real users: different devices, network conditions, geographies, and actual usage patterns. RUM shows what synthetic tests miss.
How much data will RUM generate?
A lot. A medium-traffic site (1M page views/month) with 10% sampling generates roughly 100K sessions’ worth of data. With session replay at 5% sampling, expect 5K video-like recordings. Storage and processing costs scale with traffic. Start with conservative sampling and increase only where needed.
Will RUM slow down my application?
If implemented poorly, yes. If done right, the overhead is minimal (<50 ms). The keys: load the RUM script asynchronously, use sampling to reduce data volume, avoid synchronous processing, and monitor the RUM agent's own performance impact. The monitoring shouldn't slow down what you're monitoring.
How do I handle GDPR compliance with RUM?
Obtain user consent before tracking, automatically scrub PII from all events, mask sensitive data in session replays, implement data retention policies (typically 90 days), ensure data residency requirements (EU data in EU regions), allow users to request data deletion, and update your privacy policy. Consult legal counsel for your specific situation.
Can RUM track users across domains?
Yes, with proper implementation. Use shared session identifiers stored in cookies (with consent), configure CORS properly for cross-domain data collection, and ensure your RUM provider supports cross-domain tracking. Common for flows that span multiple domains (e.g., checkout on a separate subdomain).
What's the ROI of implementing RUM?
It varies, but typically: identifying and fixing performance issues that improve conversion rates (a 100ms LCP improvement can increase conversion by 1-5%), reducing debugging time (seeing exact user sessions vs. guessing), preventing revenue loss from undetected errors, and validating that performance investments actually improve user experience. Most teams see ROI within 2-3 months.
How do I connect RUM data with business metrics?
Add business context to RUM events: user tier, subscription plan, cart value, feature flags. Track custom events for business interactions: form completions, feature usage, conversion funnels. Query RUM data alongside business metrics to find patterns. Example: "Users with LCP > 3s have 15% lower checkout completion." Use A/B tests to validate causation.
What's the difference between session replay and screen recording?
Session replay reconstructs user interactions from DOM events—it's not a video recording. It captures clicks, scrolls, form inputs, and page transitions, then rebuilds them as a video-like playback. This makes it much more efficient than screen recording (smaller data size) and enables better privacy controls (you can mask sensitive elements). But it can't capture everything a true screen recording would (like browser extensions or visual bugs outside the DOM).
Should I sample successful sessions differently from error sessions?
Yes. Errors are rare and high-impact, so capture 100% of sessions with errors. Successful sessions can be sampled more aggressively (5-20% depending on traffic). You can also increase sampling for high-value users (enterprise accounts), problem regions (where you see issues), or new features (during testing). This optimizes for signal while managing data volume.