GraphQL

Instrument your GraphQL server (Apollo Server, GraphQL Yoga, or any Node.js GraphQL framework) with OpenTelemetry to send traces and metrics to Last9

Use OpenTelemetry to instrument your Node.js GraphQL server and send telemetry data to Last9. This integration provides automatic instrumentation for GraphQL operations, DataLoader batching, and outbound HTTP calls — without instrumenting individual field resolvers, which keeps trace volume manageable at production scale.

You can either run an OpenTelemetry Collector as a sidecar or send telemetry directly from your application to Last9.

Prerequisites

Before setting up GraphQL monitoring, ensure you have:

  • Node.js 16.0 or higher
  • A GraphQL server using Apollo Server, GraphQL Yoga, or similar
  • Last9 account with integration credentials
  • npm or yarn package manager
  1. Install OpenTelemetry Packages

    Install the core OTel packages plus the GraphQL instrumentation:

    npm install \
    @opentelemetry/api \
    @opentelemetry/auto-instrumentations-node \
    @opentelemetry/exporter-metrics-otlp-proto \
    @opentelemetry/exporter-trace-otlp-proto \
    @opentelemetry/sdk-metrics \
    @opentelemetry/sdk-node \
    @opentelemetry/sdk-trace-node

    @opentelemetry/auto-instrumentations-node bundles @opentelemetry/instrumentation-graphql, so no separate GraphQL package is needed.

  2. Set Environment Variables

    Configure the following environment variables before starting your application:

    export OTEL_SERVICE_NAME="your-graphql-service"
    export OTEL_EXPORTER_OTLP_ENDPOINT="{{ .Logs.WriteURL }}"
    export OTEL_EXPORTER_OTLP_HEADERS="Authorization={{ .Logs.AuthValue }}"
    export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=production"

    If you are routing through a local OpenTelemetry Collector instead of sending directly to Last9, set:

    export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
  3. Create Instrumentation File

    Create instrumentation.ts (or instrumentation.js) in your project root. This file must be imported before any other module in your entry point.

    // instrumentation.ts
    import { SpanStatusCode } from "@opentelemetry/api";
    import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
    import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-proto";
    import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
    import { PeriodicExportingMetricReader } from "@opentelemetry/sdk-metrics";
    import { NodeSDK } from "@opentelemetry/sdk-node";
    import {
      AlwaysOnSampler,
      BatchSpanProcessor,
      ParentBasedSampler,
    } from "@opentelemetry/sdk-trace-node";

    const OTLP_ENDPOINT =
      process.env.OTEL_EXPORTER_OTLP_ENDPOINT || "http://localhost:4318";
    const SERVICE_NAME = process.env.OTEL_SERVICE_NAME || "graphql-service";

    function parseHeaders(raw: string): Record<string, string> {
      if (!raw) return {};
      return raw.split(",").reduce(
        (acc, pair) => {
          const idx = pair.indexOf("=");
          if (idx > 0)
            acc[pair.slice(0, idx).trim()] = pair.slice(idx + 1).trim();
          return acc;
        },
        {} as Record<string, string>,
      );
    }

    const sdk = new NodeSDK({
      serviceName: SERVICE_NAME,
      spanProcessor: new BatchSpanProcessor(
        new OTLPTraceExporter({
          url: `${OTLP_ENDPOINT}/v1/traces`,
          headers: parseHeaders(process.env.OTEL_EXPORTER_OTLP_HEADERS || ""),
        }),
        {
          maxQueueSize: 4096,
          maxExportBatchSize: 512,
          scheduledDelayMillis: 5000,
          exportTimeoutMillis: 30_000,
        },
      ),
      metricReader: new PeriodicExportingMetricReader({
        exporter: new OTLPMetricExporter({
          url: `${OTLP_ENDPOINT}/v1/metrics`,
          headers: parseHeaders(process.env.OTEL_EXPORTER_OTLP_HEADERS || ""),
        }),
        exportIntervalMillis: 30_000,
      }),
      sampler: new ParentBasedSampler({ root: new AlwaysOnSampler() }),
      instrumentations: [
        getNodeAutoInstrumentations({
          "@opentelemetry/instrumentation-graphql": {
            // Disable per-resolver spans: they generate O(fields) spans per request
            // and add no value in Last9's APM views, which are built on SERVER/CLIENT spans.
            ignoreResolveSpans: true,
            // Track GraphQL-level errors (distinct from HTTP 500s) as span attributes.
            // This is set in responseHook (after execution) rather than the sampler,
            // where attributes are not yet available.
            responseHook: (span: any, info: any) => {
              const raw =
                info?.errors ??
                info?.result?.errors ??
                info?.response?.errors ??
                null;
              const errs = Array.isArray(raw) ? raw : raw ? [raw] : [];
              const hasErr = errs.length > 0;
              span.setAttribute(
                "graphql.execute.error",
                hasErr ? "true" : "false",
              );
              if (hasErr) {
                span.setStatus({
                  code: SpanStatusCode.ERROR,
                  message: "GraphQL operation failed",
                });
                span.setAttribute(
                  "graphql.error.message",
                  errs[0]?.message || errs[0]?.toString() || "unknown error",
                );
              }
            },
            allowValues: false,
          },
          "@opentelemetry/instrumentation-http": {
            headersToSpanAttributes: {
              server: {
                requestHeaders: ["content-type", "user-agent"],
                responseHeaders: ["content-type"],
              },
            },
          },
          // Disable high-noise, low-signal instrumentations
          "@opentelemetry/instrumentation-fs": { enabled: false },
          "@opentelemetry/instrumentation-dns": { enabled: false },
          "@opentelemetry/instrumentation-net": { enabled: false },
        }),
      ],
    });

    sdk.start();
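    const shutdown = async () => {
      try {
        // Flush spans and metrics that are still buffered.
        await sdk.shutdown();
      } catch (_e) {
        /* best-effort */
      } finally {
        // Registering a signal handler replaces Node's default exit behaviour,
        // so exit explicitly once the flush has finished. If your server has its
        // own graceful-shutdown handler, call sdk.shutdown() from there instead.
        process.exit(0);
      }
    };
    process.on("SIGTERM", shutdown);
    process.on("SIGINT", shutdown);

    console.log(
      `[otel] Initialised — exporting to ${OTLP_ENDPOINT} as "${SERVICE_NAME}"`,
    );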
  4. Import Instrumentation First

    Import the instrumentation file at the very top of your server entry point, before Apollo Server or any other imports:

    // server.ts
    import "./instrumentation"; // ← must be first
    import { ApolloServer } from "@apollo/server";
    import { startStandaloneServer } from "@apollo/server/standalone";
    // ... rest of your imports

    Important: Loading order matters. Importing any GraphQL or HTTP library before the instrumentation file means those modules are already loaded and patching them has no effect.

  5. Start Your Server

    Start your application as normal:

    # Development
    npx ts-node server.ts
    # Production (PM2 example)
    pm2 start server.js --name graphql-service
    # Production (Node.js)
    node server.js
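
To make the link between steps 2 and 3 concrete, the sketch below shows how the two environment variables turn into exporter settings. It reuses the first-"=" splitting from parseHeaders; the helper name buildExporterConfig, the trailing-slash trimming, and the example endpoint and credential are illustrative assumptions, not part of the Last9 setup.

```typescript
// Hypothetical helper; only the environment variable names come from the steps above.
type Env = Record<string, string | undefined>;

interface ExporterConfig {
  tracesUrl: string;
  metricsUrl: string;
  headers: Record<string, string>;
}

// Split "k1=v1,k2=v2" on commas, then on the FIRST "=" of each pair so that
// values containing "=" (for example base64 padding) stay intact.
function parseHeaders(raw: string): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const pair of raw.split(",")) {
    const idx = pair.indexOf("=");
    if (idx > 0) headers[pair.slice(0, idx).trim()] = pair.slice(idx + 1).trim();
  }
  return headers;
}

function buildExporterConfig(env: Env): ExporterConfig {
  // Fall back to a local collector, and drop trailing slashes so that
  // appending the signal path never produces "//v1/traces".
  const base = (env.OTEL_EXPORTER_OTLP_ENDPOINT || "http://localhost:4318").replace(/\/+$/, "");
  return {
    tracesUrl: `${base}/v1/traces`,
    metricsUrl: `${base}/v1/metrics`,
    headers: parseHeaders(env.OTEL_EXPORTER_OTLP_HEADERS || ""),
  };
}

// Placeholder endpoint and credential, for illustration only.
const config = buildExporterConfig({
  OTEL_EXPORTER_OTLP_ENDPOINT: "https://otlp.example.com/",
  OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Basic dXNlcjpwYXNz==",
});
```

Because pairs are separated on commas, a header value that itself contains a comma would be split incorrectly; that limitation applies to the parseHeaders helper in step 3 as well.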

What Gets Instrumented

When using automatic GraphQL instrumentation, OpenTelemetry captures:

GraphQL Operations

  • Operation name, type (query, mutation, subscription), and execution time
  • GraphQL-level errors (distinct from HTTP errors) via the graphql.execute.error attribute
  • Operation field values (when allowValues: true — disabled by default to prevent PII leakage into span attributes)

HTTP Layer

  • Incoming POST /graphql requests with method, route, and status code
  • Response timing and content-length
  • Configured request/response headers

DataLoader Batching

  • dataloader.load spans tracking individual load calls and their batching behaviour
  • Useful for identifying N+1 query patterns

Outbound HTTP Calls

  • All downstream service calls via http, https, axios, node-fetch, and similar libraries
  • Propagates W3C traceparent header automatically, enabling end-to-end trace continuity across services
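
For readers who want to inspect the propagated header while debugging, here is a minimal sketch of the version 00 traceparent layout (version, trace ID, parent span ID, flags). The function name parseTraceparent is hypothetical, and the instrumentation handles this for you; the sketch only shows what travels between services.

```typescript
// Field layout follows the W3C Trace Context "traceparent" header, version 00.
interface TraceParent {
  version: string;
  traceId: string;  // 32 lowercase hex characters, shared by every span in the trace
  parentId: string; // 16 lowercase hex characters, the caller's span id
  sampled: boolean; // lowest bit of the trace-flags byte
}

function parseTraceparent(header: string): TraceParent | null {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // Version "ff" and all-zero ids are invalid.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

const parsed = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
```

When two services report the same traceId, their spans belong to one trace, which is what makes the end-to-end view possible.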

Controlling Resolver Span Depth

Resolver spans are the biggest source of span volume in GraphQL services. A single operation with 20 fields nested 3 levels deep generates ~60 INTERNAL resolver spans on top of the regular SERVER and CLIENT spans. The depth option controls how many levels of resolvers are instrumented:

depth value | Spans generated | When to use
-1 (default) | All resolver spans at every nesting level | Local debugging only
1 | Top-level resolver spans only (e.g. Query.homeMatches) | When you need to identify which root field is slow
0 | No resolver spans | Production default (use with ignoreResolveSpans: true)

The instrumentation above uses ignoreResolveSpans: true (equivalent to depth: 0), which is more efficient as it skips resolver hook registration entirely. Use depth: 1 as a middle ground when top-level resolver timing matters:

'@opentelemetry/instrumentation-graphql': {
  depth: 1, // top-level resolvers only; eliminates the nested explosion
  allowValues: false,
  responseHook: ...
}

The spans that remain at depth: 0 (POST /graphql, query OperationName, dataloader.load, and outbound calls) are sufficient to diagnose latency, errors, and downstream dependency issues in Last9 APM.
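
The arithmetic behind the depth settings can be sketched as a small estimate. This is a simplified model, assuming one resolver span per resolved field and reading the example above as 20 fields at each of 3 levels; real counts also depend on list sizes. The function name estimateResolverSpans is hypothetical.

```typescript
// Simplified model: one resolver span per resolved field, ignoring list
// multiplicity. fieldsPerLevel[i] is the number of fields resolved at
// nesting level i + 1.
function estimateResolverSpans(fieldsPerLevel: number[], depth: number): number {
  // A negative depth means "no limit"; otherwise only levels 1..depth produce spans.
  const levels = depth < 0 ? fieldsPerLevel.length : Math.min(depth, fieldsPerLevel.length);
  return fieldsPerLevel.slice(0, levels).reduce((sum, n) => sum + n, 0);
}

const operation = [20, 20, 20]; // 20 fields at each of 3 levels

const allLevels = estimateResolverSpans(operation, -1); // 60
const topLevel = estimateResolverSpans(operation, 1);   // 20
const none = estimateResolverSpans(operation, 0);       // 0
```

Multiplying these figures by your request rate gives a rough sense of how much span volume each setting adds.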

Advanced Configuration

Capturing Query Variables Safely

By default no variable values are recorded. To capture specific variables as span attributes — useful for debugging without logging full payloads — use responseHook to selectively promote them:

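'@opentelemetry/instrumentation-graphql': {
  depth: 0,
  responseHook: (span: any, info: any) => {
    // Promote only allowlisted, non-sensitive variables (here: operationId).
    // Assumption: the hook payload exposes variableValues. If your version of the
    // instrumentation passes only the execution result (data, errors), this is
    // undefined and no attribute is set.
    const vars = info?.variableValues ?? {}
    if (vars.operationId) span.setAttribute('graphql.variable.operationId', vars.operationId)
    // ... error handling as above
  },
  allowValues: false, // keep field values out of spans
}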

Note: Never capture variables that may contain PII (user IDs, tokens, personal data) without explicit review. Use an allowlist, not a blocklist.
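
One way to apply the allowlist rule is a small filter that runs before anything is written to a span. This is a sketch under stated assumptions: the helper name pickAllowedVariables, the allowlisted names, and the graphql.variable. attribute prefix are illustrative choices, not a Last9 or OpenTelemetry convention.

```typescript
type AttributeValue = string | number | boolean;

// Hypothetical allowlist; replace the names with variables you have reviewed.
const ALLOWED_VARIABLES: ReadonlySet<string> = new Set(["operationId", "pageSize"]);

function pickAllowedVariables(
  variables: Record<string, unknown>,
  allowed: ReadonlySet<string> = ALLOWED_VARIABLES,
): Record<string, AttributeValue> {
  const attributes: Record<string, AttributeValue> = {};
  for (const [name, value] of Object.entries(variables)) {
    if (!allowed.has(name)) continue; // anything not listed is dropped
    // Keep primitives only, so nested objects cannot carry extra data through.
    if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
      attributes[`graphql.variable.${name}`] = value;
    }
  }
  return attributes;
}

const attrs = pickAllowedVariables({
  operationId: "op-42",
  email: "a@example.com", // not allowlisted, so it never reaches the span
  pageSize: 25,
});
```

The resulting object can be passed to span.setAttributes(attrs) inside a hook. A new variable added to the schema stays out of telemetry until someone adds it to the list, which is the property a blocklist cannot give you.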

Tracking GraphQL Errors vs HTTP Errors

HTTP status codes alone do not capture GraphQL errors: a response that fails at the GraphQL layer typically still has HTTP status 200, with an errors array in the response body. The responseHook in the instrumentation above sets graphql.execute.error to "true" on the execution span when this occurs, making GraphQL errors queryable and alertable in Last9.
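
The distinction can be illustrated with a small classifier that looks at both the status code and the body. The type and function names here are hypothetical, and the example payloads are made up for illustration.

```typescript
interface GraphQLResponseBody {
  data?: unknown;
  errors?: Array<{ message?: string }>;
}

type Outcome = "ok" | "graphql_error" | "http_error";

// Check the HTTP status first, then the GraphQL errors array that a
// 200 response can still carry.
function classifyResponse(status: number, body: GraphQLResponseBody): Outcome {
  if (status >= 400) return "http_error";
  if (Array.isArray(body.errors) && body.errors.length > 0) return "graphql_error";
  return "ok";
}

const partialFailure = classifyResponse(200, {
  data: { user: null },
  errors: [{ message: "User not found" }],
});
const success = classifyResponse(200, { data: { user: { id: "1" } } });
const serverFault = classifyResponse(500, {});
```

An alert based only on status codes would treat partialFailure as healthy, which is the gap the span attribute closes.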

Custom Spans for Business Operations

Add custom spans to track business-specific logic within a resolver:

import { SpanStatusCode, trace } from "@opentelemetry/api";

const tracer = trace.getTracer("graphql-service");

// fetchTeams stands in for your own data-access function.
async function resolveUserTeams(userId: string) {
  const span = tracer.startSpan("resolve_user_teams");
  try {
    span.setAttribute("user.id", userId);
    const teams = await fetchTeams(userId);
    span.setAttribute("teams.count", teams.length);
    return teams;
  } catch (err: any) {
    span.recordException(err);
    // SpanStatusCode is a top-level export of @opentelemetry/api, not a member of trace.
    span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
    throw err;
  } finally {
    span.end();
  }
}

Apollo Federation — Disable Inline Tracing

If your service is a federated subgraph, Apollo Server enables ApolloServerPluginInlineTrace by default, which embeds a serialized trace in every response to the gateway. This adds response body overhead and duplicates what OTel already captures.

Disable it explicitly:

import { ApolloServer } from "@apollo/server";
import { ApolloServerPluginInlineTraceDisabled } from "@apollo/server/plugin/disabled";

const server = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [
    ApolloServerPluginInlineTraceDisabled(), // OTel handles tracing
  ],
});

Verification

  1. Check Startup Log

    On startup you should see:

    [otel] Initialised — exporting to http://localhost:4318 as "graphql-service"
  2. Send a Test Query

    curl -X POST http://localhost:4000/graphql \
    -H "Content-Type: application/json" \
    -d '{"query": "{ __typename }"}'
  3. Confirm Traces in Last9

    Open Last9 APM and search for your service name. You should see:

    • POST /graphql server spans
    • GraphQL execution spans named after the operation: query OperationName for a named query, or just query for the anonymous test query above
    • Outbound call spans for any downstream services

Troubleshooting

No traces appearing

Verify the instrumentation file is the first import and environment variables are set:

env | grep OTEL_

Enable debug logging to see what the SDK is doing:

import { diag, DiagConsoleLogger, DiagLogLevel } from "@opentelemetry/api";
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);
// Add this before sdk.start()

GraphQL errors not reflected as span errors

Ensure the responseHook is configured. HTTP 200 responses with a GraphQL errors array in the body do not automatically set span status to ERROR — the responseHook is required for this.

High memory usage on the collector

If routing through a local OTel Collector, ensure memory_limiter is configured with an absolute limit_mib (not limit_percentage) when the collector shares a host with your application. See the OpenTelemetry Collector documentation for recommended settings.

Instrumentation import order

// ✅ Correct
import "./instrumentation";
import { ApolloServer } from "@apollo/server";
// ❌ Incorrect — ApolloServer already loaded before patching
import { ApolloServer } from "@apollo/server";
import "./instrumentation";

Please get in touch with us on Discord or Email if you have any questions.