Processors and Transforms

Reduce telemetry volume and shape data before it reaches Last9 using the filter, transform, batch, and memory_limiter processors in the OpenTelemetry Collector.

Processors run on telemetry data between the receiver and the exporter. They are the primary tool for reducing volume, dropping noise, and enriching data before it reaches Last9.

This guide covers the four processors most commonly used in production deployments:

Processor      | What it does
---------------|------------------------------------------------------------
filter         | Drop spans, logs, or metrics that match a condition
transform      | Rename, enrich, or modify telemetry using OTTL expressions
memory_limiter | Prevent the collector from OOMing under traffic spikes
batch          | Buffer and flush data in efficient batches

All processors must be listed in the service.pipelines section to take effect.
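
For example, a processor defined under processors: only runs once a pipeline references it. A minimal traces pipeline (the component names mirror the examples later in this guide) looks like:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter/drop_internal_spans, batch]
      exporters: [otlp/last9]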


filter processor

The filter processor drops telemetry that matches one or more OTTL conditions. Use it to eliminate spans, logs, or metrics that add volume without adding value.

Drop internal spans

Internal spans from frameworks like GraphQL routers, ORM layers, and service meshes can account for the majority of your trace volume. Drop them to reduce ingestion by up to 60–70%:

processors:
  filter/drop_internal_spans:
    error_mode: ignore
    traces:
      span:
        - 'kind == SPAN_KIND_INTERNAL and status.code != STATUS_CODE_ERROR'

Drop internal spans for specific services

When you need finer control — for example, dropping internal spans only from known high-volume services:

processors:
  filter/drop_internal_spans:
    error_mode: ignore
    traces:
      span:
        - 'resource.attributes["service.name"] == "gql-router" and kind == SPAN_KIND_INTERNAL'
        - 'resource.attributes["service.name"] == "kong" and kind == SPAN_KIND_INTERNAL'
        - 'resource.attributes["service.name"] == "api-gateway" and kind == SPAN_KIND_INTERNAL and status.code != STATUS_CODE_ERROR'

Drop database noise spans

Transaction bookkeeping spans (BEGIN, COMMIT, ROLLBACK) inflate trace counts without providing actionable information:

processors:
  filter/drop_db_noise:
    error_mode: ignore
    traces:
      span:
        - 'attributes["db.system"] != "" and IsMatch(name, "^(BEGIN|COMMIT|ROLLBACK)$")'

Drop logs by severity

Drop DEBUG and TRACE logs to reduce log volume in production. Logs at INFO and above pass through:

processors:
  filter/drop_debug_logs:
    error_mode: ignore
    logs:
      log_record:
        - 'severity_number < SEVERITY_NUMBER_INFO'

Drop redundant metric buckets

For Prometheus histograms, you often only need one of _count, _sum, or _bucket depending on your use case. Drop the ones you don’t query:

processors:
  filter/drop_histogram_sum:
    error_mode: ignore
    metrics:
      datapoint:
        - 'IsMatch(metric.name, ".*_sum$")'

transform processor

The transform processor modifies telemetry in place using OTTL statements. Use it to rename spans, normalize operation names, add missing attributes, or fix instrumentation gaps.

Remove unique IDs from span names

Auto-instrumented frameworks sometimes embed request-specific values (UUIDs, user IDs, numeric IDs) in span names, creating unbounded cardinality in APM:

processors:
  transform/normalize_span_names:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # Remove query params: "GET /users?id=abc123" → "GET /users"
          - replace_pattern(name, "\\?.*$", "")
          # Remove numeric path segments: "/api/users/12345/orders" → "/api/users/{id}/orders"
          - replace_pattern(name, "/[0-9]+", "/{id}")
          # Remove UUIDs: "/sessions/550e8400-e29b-41d4-a716-446655440000" → "/sessions/{uuid}"
          - replace_pattern(name, "/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", "/{uuid}")

Fix GraphQL span visibility in APM

Apollo Server and other GraphQL frameworks emit the HTTP transport span and the named operation span both as SERVER kind. This inflates throughput 2× in APM and breaks the Operations tab. The fix demotes the HTTP layer to INTERNAL so the named operation is the single source of truth:

processors:
  transform/fix_graphql_spans:
    error_mode: ignore
    trace_statements:
      # Step 1: Give the HTTP span a meaningful name before demotion
      - context: span
        conditions:
          - instrumentation_scope.name == "@opentelemetry/instrumentation-http" and kind == SPAN_KIND_SERVER
        statements:
          - set(name, Concat([attributes["http.method"], " ", attributes["http.target"]], "")) where IsString(attributes["http.target"])
      # Step 2: Demote HTTP SERVER → INTERNAL when a named GraphQL operation span exists
      - context: span
        conditions:
          - instrumentation_scope.name == "@opentelemetry/instrumentation-http" and kind == SPAN_KIND_SERVER and IsMatch(name, ".*/graphql.*")
        statements:
          - set(kind, SPAN_KIND_INTERNAL)
      # Step 3: Add http.method to GraphQL operation spans so the APM Operations tab shows them
      - context: span
        conditions:
          - IsString(attributes["graphql.operation.type"]) and kind == SPAN_KIND_SERVER and attributes["http.method"] == nil
        statements:
          - set(attributes["http.method"], "POST")
          - set(attributes["http.status_code"], "500") where attributes["http.status_code"] == nil and status.code == STATUS_CODE_ERROR
          - set(attributes["http.status_code"], "200") where attributes["http.status_code"] == nil

Add static labels to CloudWatch metrics

CloudWatch metrics arrive without service_name or environment labels, making service-level alerting impossible. Use a transform processor to add them:

processors:
  transform/enrich_cloudwatch:
    error_mode: ignore
    metric_statements:
      - context: datapoint
        statements:
          - set(attributes["service_name"], "payments-service") where resource.attributes["aws.cloudwatch.namespace"] == "AWS/ApplicationELB"
          - set(attributes["deployment_environment"], "production")

Propagate resource attributes to span attributes

Some backends and dashboards expect certain attributes on the span rather than the resource. Copy them across:

processors:
  transform/promote_resource_attrs:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - set(attributes["k8s.namespace"], resource.attributes["k8s.namespace.name"]) where attributes["k8s.namespace"] == nil
          - set(attributes["host"], resource.attributes["host.name"]) where attributes["host"] == nil

memory_limiter processor

The memory_limiter processor protects the collector from OOMing under sudden traffic spikes. It checks memory usage on an interval and begins dropping data when usage crosses a threshold.

Always place memory_limiter first in every pipeline.

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 25

Field                  | Value | Meaning
-----------------------|-------|------------------------------------------------------------------
check_interval         | 1s    | How often to check memory usage
limit_percentage       | 80    | Hard limit as a percentage of available memory; above this the processor refuses new data and forces garbage collection
spike_limit_percentage | 25    | Expected spike between checks; the soft limit, where backpressure begins, is limit_percentage minus this value (55% here)

Setting the memory limit for the collector process

Set an explicit memory limit for the collector process via the GOMEMLIMIT environment variable (the older --mem-ballast-size-mib flag is deprecated in newer collector releases). If the collector runs in Kubernetes, set a memory limit on the pod and configure memory_limiter to 80% of that limit:

# Kubernetes pod spec
resources:
  limits:
    memory: 4Gi
  requests:
    memory: 2Gi

# Corresponding memory_limiter config
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80        # 80% of 4Gi = ~3.2Gi
    spike_limit_percentage: 25
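
The Go runtime limit can also be set directly on the collector container. A minimal sketch, assuming the 4Gi pod limit above; the exact value (roughly 80% of the limit) is an illustrative choice:

# Collector container spec (illustrative)
env:
  - name: GOMEMLIMIT
    value: "3276MiB"   # ~80% of the 4Gi pod memory limit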

batch processor

The batch processor buffers spans, logs, and metrics before sending them to the exporter. Batching reduces the number of outbound connections and improves compression ratios.

processors:
  batch:
    timeout: 5s
    send_batch_size: 1000
    send_batch_max_size: 2000

Field               | Value | Meaning
--------------------|-------|------------------------------------------------------------------
timeout             | 5s    | Send the current batch after this interval even if send_batch_size is not reached
send_batch_size     | 1000  | Target number of items per batch
send_batch_max_size | 2000  | Maximum batch size; 0 means no limit

Putting it all together

A complete pipeline configuration using all four processors. Order matters: memory_limiter first, then filtering, then transforms, then batching.

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 25
  filter/drop_internal_spans:
    error_mode: ignore
    traces:
      span:
        - 'kind == SPAN_KIND_INTERNAL and status.code != STATUS_CODE_ERROR'
  filter/drop_debug_logs:
    error_mode: ignore
    logs:
      log_record:
        - 'severity_number < SEVERITY_NUMBER_INFO'
  transform/normalize_span_names:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - replace_pattern(name, "\\?.*$", "")
          - replace_pattern(name, "/[0-9]+", "/{id}")
  batch:
    timeout: 5s
    send_batch_size: 1000
    send_batch_max_size: 2000

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors:
        - memory_limiter
        - filter/drop_internal_spans
        - transform/normalize_span_names
        - batch
      exporters: [otlp/last9]
    logs:
      receivers: [otlp]
      processors:
        - memory_limiter
        - filter/drop_debug_logs
        - batch
      exporters: [otlp/last9]
    metrics:
      receivers: [otlp, prometheus]
      processors:
        - memory_limiter
        - batch
      exporters: [otlp/last9]

Troubleshooting

  • Spans still appearing after filter

    Check that the processor name exactly matches what’s listed in service.pipelines. A processor defined but not listed in processors: in the pipeline is silently ignored.

  • error_mode: ignore vs error_mode: propagate

    ignore skips items that cause evaluation errors (e.g. missing attributes) and continues processing. Use propagate during development to surface OTTL syntax errors — switch to ignore in production.

  • High CPU from transform processor

    OTTL expressions with regex (IsMatch, replace_pattern) are evaluated per span. For very high-throughput services, move static attribute assignments to the resource or resourcedetection processors, which operate on resource attributes rather than evaluating an expression for every span (see the sketch after this list).
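
As a sketch of that last point, a static attribute can be attached with the resource processor instead of a per-span OTTL statement; the processor name and values below are illustrative:

processors:
  resource/static_labels:
    attributes:
      - key: deployment_environment
        value: production
        action: upsert

Because this runs once per resource in a batch, no regex or where clause is evaluated for individual spans.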

Please get in touch with us on Discord or Email if you have any questions.