🚀 Big News: Socket Acquires Coana to Bring Reachability Analysis to Every Appsec Team.Learn more
Socket
DemoInstallSign in
Socket

fluent-plugin-xml-parser

Package Overview
Dependencies
Maintainers
2
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

fluent-plugin-xml-parser

1.0.0
Rubygems
Version published
Maintainers
2
Created
Source

Fluent::Plugin::XmlParser

Fluent plugin for Parsing XML Input

Requirements

fluent-plugin-xml-parserFluentdRuby
>= 1.0.0>= v0.14.0>= 2.1
< 1.0.0>= v0.12.0>= 1.9

Installation

Add this line to your application's Gemfile:

gem 'fluent-plugin-xml-parser'

And then execute:

$ bundle

Or install it yourself as:

$ gem install fluent-plugin-xml-parser

Usage

Fluent::Plugin::XmlParser provides input data conversion from simple XML data like sensor data into Ruby hash structure for emitting next procedure in fluentd. In order to enable Fluent::Plugin::XmlParser, 'format xml' option needs to be specified at 'source' directive.

The followings are an example configuration for reformatting Libelium SmartCity sensor data to fit ElasticSearch received via MQTT protocol(fluent-plugin-mqtt-io).

For v1.0

Put configuration options in <parse> tag


<source>
  type mqtt
  bind 127.0.0.1
  port 11883
  topic 'Libelium/+/#'
  @label @MQTT_OUT
  <parse>
    type @xml
    time_xpath '["cap:alert/cap:info/cap:onset", "text"]'
    time_key '@timestamp'
    attr_xpaths '[["cap:alert/cap:info/cap:parameter/cap:valueName", "text"]]'
    value_xpaths '[["cap:alert/cap:info/cap:parameter/cap:value", "text"]]'
  </parse>
</source>

For v0.12

Use format instead of <parse></parse and put configuration options in <source> tag


<source>
  type mqtt
  bind 127.0.0.1
  port 11883
  topic 'Libelium/+/#'
  format xml
  time_xpath '["cap:alert/cap:info/cap:onset", "text"]'
  time_key '@timestamp'
  attr_xpaths '[["cap:alert/cap:info/cap:parameter/cap:valueName", "text"]]'
  value_xpaths '[["cap:alert/cap:info/cap:parameter/cap:value", "text"]]'
  @label @MQTT_OUT
</source>

The target XML file of this example is as follows (an old style of Libelium sensor data):

<?xml version="1.0" encoding="UTF-8"?>
<cap:alert xmlns:cap="urn:oasis:names:tc:emergency:cap:1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:oasis:names:tc:emergency:cap:1.2 CAP-v1.2-os.xsd">
  <cap:identifier>Libelium_2015-09-21T18:54:24+:00:00495</cap:identifier>
  <cap:sender>Libelium_SmartCity_387243170</cap:sender>
  <cap:sent>2015-09-21T18:54:24+:00:00</cap:sent>
  <cap:status>Actual</cap:status>
  <cap:msgType>Alert</cap:msgType>
  <cap:scope>Public</cap:scope>
  <cap:code>KPI</cap:code>
  <cap:info>
    <cap:category>Other</cap:category>
    <cap:event>Libelium</cap:event>
    <cap:urgency>Inmediate</cap:urgency>
    <cap:severity>Unknown</cap:severity>
    <cap:certainty>Observed</cap:certainty>
    <cap:onset>2015-09-21T18:54:24+:00:00</cap:onset>
    <cap:senderName>Libelium</cap:senderName>
    <cap:headline>Waspmote sensors</cap:headline>
    <cap:description>Sensor data from Waspmote devices: MCP</cap:description>
    <cap:parameter>
      <cap:valueName>MCP</cap:valueName>
      <cap:value>50</cap:value>
    </cap:parameter>
  </cap:info>
</cap:alert>

Configurable options are the following:

  • time_xpath: A value for fluentd time field. An array with two strings means xpath of the value and the attribute of the XML element (name, text etc). If this option is omitted, current time is used.
  • time_key: An attribute name of extra timestamp field appended to the record. If Output Plugin does not provide timestamp configuration, you can specify field name by this option. If this option is omitted, extra timestamp field is not appended.
  • time_format: You can specify time format. If this is omitted, ISO8601 format is used.
  • attr_xpaths: indicates attribute name of the target value. Each array with two strings means xpath of the attribute name and the attribute of the XML element (name, text etc). XPath can be omitted as 'null' and specify your own attribute name as the second parameter. A third element in the array can be used to provide a default value if the XML element is missing.
  • value_xpaths: indicates the target value to be extracted. Each array with two strings means xpath of the target value and the attribute of the XML element (name, text etc) and each value is stored into the Hash with the key specified at an array instance in the attr_xpaths with the same index. XPath can be omitted as 'null' and specify your own value as the second parameter. A third element in the array can be used to provide a default value if the XML element is missing.

The extracted fields are packed into Hash structure (record field) to emit the next procedure in fluentd.

You can check your own XML data structure by using irb or pry


require 'rexml/document'
doc = REXML::Document.new(open("test.xml"))
doc.elements['cap:alert/cap:info'].children

Contributing

FAQs

Package last updated on 16 Apr 2019

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts