github.com/open-telemetry/opentelemetry-collector-contrib/processor/groupbyattrsprocessor

  • v0.114.0
  • Go

Group by Attributes processor

Status
Stability: beta (traces, metrics, logs)
Distributions: contrib, k8s
Code Owners: @rnishtala-sumo

Description

This processor re-associates spans, log records and metric datapoints with a Resource that matches the specified attributes. As a result, all spans, log records or metric datapoints with the same values for the specified attributes are "grouped" under the same Resource.

Typical use cases:

  • extract resources from "flat" data formats, such as Fluentbit logs or Prometheus metrics
  • associate Prometheus metrics with a Resource that describes the relevant host, based on a label present on all metrics
  • optimize data packaging by extracting common attributes
  • compact multiple records that share the same Resource and InstrumentationLibrary attributes but are spread across multiple ResourceSpans/ResourceMetrics/ResourceLogs into a single ResourceSpans/ResourceMetrics/ResourceLogs (when an empty list of keys is provided). This can happen, for example, when the groupbytrace processor is used or when data arrives in multiple requests. Compacted data takes less memory, is processed and serialized more efficiently, and requires fewer export requests.

It is recommended to use the groupbyattrs processor together with the batch processor, as a consecutive step, as this reduces the fragmentation of data (by grouping records together under a matching Resource/InstrumentationLibrary).

Examples

Grouping metrics

Consider the below metrics, all originally associated to the same Resource:

Resource {host.name="localhost",source="prom"}
  Metric "gauge-1" (GAUGE)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "gauge-1" (GAUGE) // Identical to previous Metric
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-B",id="eth0"}
  Metric "mixed-type" (SUM)
    DataPoint {host.name="host-A",id="eth0"}
    DataPoint {host.name="host-A",id="eth0"}
  Metric "dont-move" (GAUGE)
    DataPoint {id="eth0"}

With the configuration below, the groupbyattrs processor re-associates the metrics with either host-A or host-B, based on the value of the host.name attribute.

processors:
  groupbyattrs:
    keys:
      - host.name

The output of the processor will therefore be:

Resource {host.name="localhost",source="prom"}
  Metric "dont-move" (GAUGE)
    DataPoint {id="eth0"}

Resource {host.name="host-A",source="prom"}
  Metric "gauge-1"
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (SUM)
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}

Resource {host.name="host-B",source="prom"}
  Metric "gauge-1"
    DataPoint {id="eth0"}
    DataPoint {id="eth0"}
  Metric "mixed-type" (GAUGE)
    DataPoint {id="eth0"}

Notes:

  • The DataPoints for the gauge-1 (GAUGE) metric were originally split under 2 Metric instances and have been merged in the output
  • The DataPoints of the mixed-type (GAUGE) and mixed-type (SUM) metrics have not been merged under the same Metric, because their DataType is different
  • The dont-move metric DataPoints don't have a host.name attribute and therefore remained under the original Resource
  • The new Resources inherited the attributes from the original Resource (source="prom"), plus the specified attributes from the processed metrics (host.name="host-A" or host.name="host-B")
  • The specified "grouping" attributes that are set on the new Resources are also removed from the metric DataPoints
  • While not shown in the above example, the processor also merges collections of records under matching InstrumentationLibrary
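The regrouping described in the notes above can be modelled with a few lines of Python. This is only a minimal sketch under stated assumptions, not the processor's actual implementation: the function name group_datapoints and the tuple-based data layout are hypothetical and chosen for illustration, and InstrumentationLibrary handling is left out.

```python
from collections import defaultdict


def group_datapoints(resource_attrs, metrics, keys):
    """Illustrative model of regrouping datapoints by attribute keys.

    metrics: list of (name, type, [datapoint_attrs, ...]) tuples.
    Returns {resource_key: {(name, type): [datapoint_attrs, ...]}}, where
    resource_key is the sorted tuple of the Resource's attribute items.
    """
    out = defaultdict(lambda: defaultdict(list))
    for name, mtype, datapoints in metrics:
        for dp in datapoints:
            # Grouping attributes that are actually present on this datapoint.
            matched = {k: dp[k] for k in keys if k in dp}
            # The target Resource inherits the original attributes and takes
            # the grouping values from the datapoint.
            new_resource = {**resource_attrs, **matched}
            # Grouping attributes are removed from the datapoint itself.
            remaining = {k: v for k, v in dp.items() if k not in matched}
            res_key = tuple(sorted(new_resource.items()))
            # Datapoints merge only when metric name AND type are equal.
            out[res_key][(name, mtype)].append(remaining)
    return {res: dict(ms) for res, ms in out.items()}


resource = {"host.name": "localhost", "source": "prom"}
a = {"host.name": "host-A", "id": "eth0"}
b = {"host.name": "host-B", "id": "eth0"}
metrics = [
    ("gauge-1", "GAUGE", [a, a, b]),
    ("gauge-1", "GAUGE", [a, a, b]),  # identical to the previous metric
    ("mixed-type", "GAUGE", [a, a, b]),
    ("mixed-type", "SUM", [a, a]),
    ("dont-move", "GAUGE", [{"id": "eth0"}]),
]
grouped = group_datapoints(resource, metrics, ["host.name"])
```

Run on the example input, the model yields three Resources: the original one holding only dont-move, and one each for host-A and host-B, with the two gauge-1 entries merged and the two mixed-type metrics kept apart.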

Compaction

In some cases, data arrives at the collector in many separate requests or becomes fragmented through use of the groupbytrace processor. Even after batching, there may be multiple duplicated ResourceSpans/ResourceLogs/ResourceMetrics objects, which leads to additional memory consumption, higher processing costs, inefficient serialization and more export requests. As a remedy, the groupbyattrs processor can be used to compact data with matching Resource and InstrumentationLibrary properties.

For example, consider the following input:

Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
  Spans
    Span {span_id=1, ...}
  InstrumentationLibrary {name="OtherLibrary"}
  Spans
    Span {span_id=2, ...}
    
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
  Spans
    Span {span_id=3, ...}
    
Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
  Spans
    Span {span_id=4, ...}
    
Resource {host.name="otherhost"}
  InstrumentationLibrary {name="MyLibrary"}
  Spans
    Span {span_id=5, ...}

With the configuration below, the groupbyattrs processor re-associates the spans with their matching Resource and InstrumentationLibrary.

processors:
  batch:
  groupbyattrs:

service:
  pipelines:
    traces:
      processors: [batch, groupbyattrs]
      ...

The output of the processor will therefore be:

Resource {host.name="localhost"}
  InstrumentationLibrary {name="MyLibrary"}
  Spans
    Span {span_id=1, ...}
    Span {span_id=3, ...}
    Span {span_id=4, ...}
  InstrumentationLibrary {name="OtherLibrary"}
  Spans
    Span {span_id=2, ...}

Resource {host.name="otherhost"}
  InstrumentationLibrary {name="MyLibrary"}
  Spans
    Span {span_id=5, ...}
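The compaction step can be pictured as a merge keyed first on Resource attributes and then on InstrumentationLibrary name. The Python below is a rough sketch of that idea rather than the processor's code: the compact function and the nested tuple layout are hypothetical, and a library is identified by its name only.

```python
def compact(resource_spans):
    """Illustrative model of compaction with an empty list of keys.

    resource_spans: list of (resource_attrs, [(library_name, [span_id, ...])]).
    Returns {resource_key: {library_name: [span_id, ...]}}.
    """
    merged = {}
    for resource_attrs, libraries in resource_spans:
        # Resources with identical attributes collapse into one entry.
        res_key = tuple(sorted(resource_attrs.items()))
        libs = merged.setdefault(res_key, {})
        for library_name, spans in libraries:
            # Spans from the same library are appended in arrival order.
            libs.setdefault(library_name, []).extend(spans)
    return merged


batch = [
    ({"host.name": "localhost"}, [("MyLibrary", [1]), ("OtherLibrary", [2])]),
    ({"host.name": "localhost"}, [("MyLibrary", [3])]),
    ({"host.name": "localhost"}, [("MyLibrary", [4])]),
    ({"host.name": "otherhost"}, [("MyLibrary", [5])]),
]
compacted = compact(batch)
```

For the example input, the four incoming Resource entries collapse into two, with spans 1, 3 and 4 gathered under MyLibrary on localhost.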

Configuration

The configuration only requires an array of attribute keys that will be used to "group" spans, log records or metric data points together, as in the example below:

processors:
  groupbyattrs:
    keys:
      - foo
      - bar

The keys property describes which attribute keys will be considered for grouping:

  • If the processed span, log record or metric data point has at least one of the specified attribute keys, it is moved to a Resource with the same values for these attributes. The Resource is created if none exists with the same attributes.
  • If none of the specified attribute keys is present in the processed span, log record or metric data point, it remains associated with the same Resource (no change), and multiple instances of the same Resource are still compacted.
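The two rules above come down to a per-record decision. The following Python sketch shows that decision for a single record, assuming attributes are plain dictionaries; the function name regroup and the sample attribute names (env, msg) are hypothetical, and the real processor operates on the collector's internal data model instead.

```python
def regroup(resource_attrs, record_attrs, keys):
    """Return the (resource, record) attribute pair after applying `keys`."""
    present = {k: record_attrs[k] for k in keys if k in record_attrs}
    if not present:
        # None of the keys is set: the record stays on its Resource.
        return resource_attrs, record_attrs
    # At least one key is set: the record moves to a Resource that carries
    # the original attributes plus the matched grouping values.
    new_resource = {**resource_attrs, **present}
    # The matched grouping attributes are dropped from the record.
    new_record = {k: v for k, v in record_attrs.items() if k not in present}
    return new_resource, new_record


keys = ["foo", "bar"]
resource = {"env": "dev"}

# Only "foo" is present; that is enough for the record to be moved.
moved = regroup(resource, {"foo": "1", "msg": "hello"}, keys)

# Neither "foo" nor "bar" is present; nothing changes.
kept = regroup(resource, {"msg": "hello"}, keys)
```

Here moved points at a Resource with env="dev" and foo="1", while kept leaves both the Resource and the record untouched.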

Internal Metrics

The following internal metrics are recorded by this processor:

Metric                     Description
num_grouped_spans          number of spans that had attributes grouped
num_non_grouped_spans      number of spans that did not have attributes grouped
span_groups                distribution of groups extracted for spans
num_grouped_logs           number of logs that had attributes grouped
num_non_grouped_logs       number of logs that did not have attributes grouped
log_groups                 distribution of groups extracted for logs
num_grouped_metrics        number of metrics that had attributes grouped
num_non_grouped_metrics    number of metrics that did not have attributes grouped
metric_groups              distribution of groups extracted for metrics
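To make the difference between the "grouped", "non grouped" and "groups" figures concrete, the sketch below tallies them for one batch of records. It is an illustration only: the tally function is hypothetical, and treating a "group" as one distinct combination of grouping-key values is an assumption, since the table does not say exactly how the processor counts groups.

```python
def tally(records, keys):
    """Count grouped and non-grouped records, plus distinct groups."""
    # A record counts as grouped when it carries at least one grouping key.
    grouped = [attrs for attrs in records if any(k in attrs for k in keys)]
    # Assumption: one group per distinct combination of grouping-key values.
    groups = {tuple((k, attrs[k]) for k in keys if k in attrs) for attrs in grouped}
    return {
        "num_grouped": len(grouped),
        "num_non_grouped": len(records) - len(grouped),
        "groups": len(groups),
    }


spans = [
    {"host.name": "host-A"},
    {"host.name": "host-A"},
    {"host.name": "host-B"},
    {"id": "eth0"},  # no grouping key present
]
counts = tally(spans, ["host.name"])
```

With host.name as the only key, three of the four sample spans count as grouped, one as non-grouped, and two distinct groups (host-A and host-B) are found.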

Package last updated on 18 Nov 2024
