github.com/open-telemetry/opentelemetry-collector-contrib/connector/servicegraphconnector

  • v0.114.0
  • Go

Service Graph Connector

Status
Distributions | contrib, k8s
Issues | Open issues, Closed issues
Code Owners | @jpkrohling, @mapno, @JaredTan95

Supported Pipeline Types

Exporter Pipeline Type | Receiver Pipeline Type | Stability Level
traces | metrics | alpha

Overview

The service graphs connector builds a map representing the interrelationships between various services in a system. The connector will analyse trace data and generate metrics describing the relationship between the services. These metrics can be used by data visualization apps (e.g. Grafana) to draw a service graph.

Service graphs are useful for a number of use cases:

  • Infer the topology of a distributed system. As distributed systems grow, they become more complex. Service graphs can help you understand the structure of the system.
  • Provide a high-level overview of the health of your system. Service graphs show error rates and latencies, among other relevant data.
  • Provide a historical view of a system’s topology. Distributed systems change very frequently, and service graphs offer a way of seeing how these systems have evolved over time.

This component is based on Grafana Tempo's service graph processor.

How it works

Service graphs work by inspecting traces and looking for spans with a parent-child relationship that represents a request. The connector uses the OpenTelemetry semantic conventions to detect several kinds of requests. It currently supports the following:

  • A direct request between two services where the outgoing and the incoming span must have span.kind client and server respectively.
  • A request across a messaging system where the outgoing and the incoming span must have span.kind producer and consumer respectively.
  • A database request; in this case the connector looks for spans containing attributes span.kind=client as well as db.name.

Every span that can be paired up to form a request is kept in an in-memory store until its corresponding pair span is received or the maximum waiting time has passed. When either condition is met, the request is recorded and removed from the local store.
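
As a rough illustration of how the waiting time and capacity could be tuned, the sketch below uses the store and store_expiration_loop settings described under Configuration. The values are arbitrary assumptions for a system with slow or bursty requests, not recommendations.

```yaml
connectors:
  servicegraph:
    store:
      # Assumed value: keep a span for up to 5s while waiting for its pair.
      ttl: 5s
      # Assumed value: hold at most 5000 items in the in-memory store.
      max_items: 5000
    # How often old entries are expired from the store (2s is the documented default).
    store_expiration_loop: 2s
```

A longer ttl gives slow counterpart spans more time to arrive at the cost of holding more entries in memory, so max_items may need to grow with it.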

Each emitted metric series has client and server labels, corresponding to the service making the request and the service receiving it:

traces_service_graph_request_total{client="app", server="db", connection_type="database"} 20

TLDR: The connector will try to find spans belonging to requests as seen from the client and the server and will create a metric representing an edge in the graph.

Metrics

The following metrics are emitted by the connector:

Metric | Type | Labels | Description
traces_service_graph_request_total | Counter | client, server, connection_type | Total count of requests between two nodes
traces_service_graph_request_failed_total | Counter | client, server, connection_type | Total count of failed requests between two nodes
traces_service_graph_request_server_seconds | Histogram | client, server, connection_type | Time for a request between two nodes as seen from the server
traces_service_graph_request_client_seconds | Histogram | client, server, connection_type | Time for a request between two nodes as seen from the client
traces_service_graph_unpaired_spans_total | Counter | client, server, connection_type | Total count of unpaired spans
traces_service_graph_dropped_spans_total | Counter | client, server, connection_type | Total count of dropped spans

Duration is measured both from the client and the server sides.

Possible values for connection_type: unset, messaging_system, or database.

Additional labels can be included using the dimensions configuration option. Those labels will have a prefix to mark where they originate (client or server span kinds). The client_ prefix relates to the dimensions coming from spans with SPAN_KIND_CLIENT, and the server_ prefix relates to the dimensions coming from spans with SPAN_KIND_SERVER.
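
A minimal sketch of the dimensions option follows. The attribute name http.method is a hypothetical choice made only for illustration; any span attribute present in your traces could be listed instead.

```yaml
connectors:
  servicegraph:
    dimensions:
      # Hypothetical span attribute, chosen only for illustration.
      - http.method
# With this setting, values read from client spans are attached under a
# client_-prefixed label and values read from server spans under a
# server_-prefixed label. The exact label spelling in the backend depends
# on the exporter in use.
```

Each extra dimension multiplies the number of emitted series, so it is usually worth limiting the list to low-cardinality attributes.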

Since the service graph connector has to process both sides of an edge, it needs to see all spans of a trace to function properly. If the spans of a trace are spread across multiple collector instances, they are not paired up reliably. A possible solution is to use the load balancing exporter in a layer in front of the collector instances running this connector.
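
One way such a front layer might look is sketched below, assuming the contrib loadbalancing exporter with a static resolver. The hostnames are placeholders, and the insecure TLS setting is an assumption suitable only for a trusted internal network.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  loadbalancing:
    # Route by trace ID so that all spans of a trace reach the same backend.
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true  # assumption: plaintext traffic inside a trusted network
    resolver:
      static:
        hostnames:
          # Placeholder addresses of the collectors running the servicegraph connector.
          - collector-1.example:4317
          - collector-2.example:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
```

The second-tier collectors would then run a configuration like the samples under Example configurations.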

Visualization

Service graph metrics have been natively supported by Grafana since v9.0.4. To enable the visualization, configure the 'Service Graphs' section of a Tempo data source by linking it to the Prometheus backend where the metrics are sent:

apiVersion: 1
datasources:
  # Prometheus backend where metrics are sent
  - name: Prometheus
    type: prometheus
    uid: prometheus
    url: <prometheus-url>
    jsonData:
      httpMethod: GET
    version: 1
  - name: Tempo
    type: tempo
    uid: tempo
    url: <tempo-url>
    jsonData:
      httpMethod: GET
      serviceMap:
        datasourceUid: 'prometheus'
    version: 1

Configuration

The following settings are required:

  • latency_histogram_buckets: the list of durations defining the latency histogram buckets.
    • Default: [2ms, 4ms, 6ms, 8ms, 10ms, 50ms, 100ms, 200ms, 400ms, 800ms, 1s, 1400ms, 2s, 5s, 10s, 15s]
  • dimensions: the list of dimensions to add together with the default dimensions defined above.

The following settings can be optionally configured:

  • store: defines the config for the in-memory store used to find requests between services by pairing spans.
    • ttl: TTL is the time to live for items in the store.
      • Default: 2s
    • max_items: MaxItems is the maximum number of items to keep in the store.
      • Default: 1000
  • cache_loop: the interval at which to clean the cache.
    • Default: 1m
  • store_expiration_loop: the interval at which expired entries are periodically removed from the store.
    • Default: 2s
  • virtual_node_peer_attributes: the list of attributes, ordered by priority, whose presence in a client span will result in the creation of a virtual server node. An empty list disables virtual node creation.
    • Default: [peer.service, db.name, db.system]
  • virtual_node_extra_label: adds an extra label virtual_node with an optional value of client or server, indicating which node is the uninstrumented one.
    • Default: false
  • metrics_flush_interval: the interval at which metrics are flushed to the exporter.
    • Default: Metrics are flushed on every received batch of traces.
  • database_name_attribute: the attribute name used to identify the database name from span attributes.
    • Default: db.name
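
The fragment below sketches how a few of the optional settings above might be combined. All values are assumptions chosen for illustration, and db.namespace is assumed to be the attribute your spans use for the database name.

```yaml
connectors:
  servicegraph:
    # Assumed value: clean the cache every 5 minutes instead of the default 1m.
    cache_loop: 5m
    # Assumed value: flush metrics on a fixed interval instead of on every received batch.
    metrics_flush_interval: 15s
    # Assumption: spans carry the database name in db.namespace rather than db.name.
    database_name_attribute: db.namespace
```

Settings that are left out keep the defaults listed above.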

Example configurations

Sample with custom buckets and dimensions

receivers:
  otlp:
    protocols:
      grpc:

connectors:
  servicegraph:
    latency_histogram_buckets: [100ms, 250ms, 1s, 5s, 10s]
    dimensions:
      - dimension-1
      - dimension-2
    store:
      ttl: 1s
      max_items: 10

exporters:
  prometheus/servicegraph:
    endpoint: localhost:9090
    namespace: servicegraph

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [servicegraph]
    metrics/servicegraph:
      receivers: [servicegraph]
      exporters: [prometheus/servicegraph]

Sample with options for uninstrumented services identification

receivers:
  otlp:
    protocols:
      grpc:

connectors:
  servicegraph:
    dimensions:
      - db.system
      - messaging.system
    virtual_node_peer_attributes:
      - db.name
      - db.system
      - messaging.system
      - peer.service
    virtual_node_extra_label: true

exporters:
  prometheus/servicegraph:
    endpoint: localhost:9090
    namespace: servicegraph

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [servicegraph]
    metrics/servicegraph:
      receivers: [servicegraph]
      exporters: [prometheus/servicegraph]

Package last updated on 18 Nov 2024
