logstash-output-datahub

  • 1.0.1
  • Rubygems
Aliyun DataHub Plugin for LogStash

Getting Started


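Introduction

  • This is an output plugin for Logstash. It ships collected data to DataHub, a storage service on Alibaba Cloud (Aliyun).

Installation

  • Requirements: Linux, JDK 1.7+, and Logstash (optional; it does not need to be installed beforehand).
  • Download the tar package from the official Alibaba Cloud StreamCompute website, then install it with the commands below.

If Logstash is not installed yet, use the following steps: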

$ tar -xzvf logstash-with-datahub-2.3.0.tar.gz
$ cd logstash-with-datahub-2.3.0

If Logstash is already installed, obtain logstash-output-datahub-1.0.0.gem and install it with the following command:

$ ${LOGSTASH_HOME}/bin/logstash-plugin install --local logstash-output-datahub-1.0.0.gem

Example

  • Collecting logs

The following is a log line written by an application:

20:04:30.359 [qtp1453606810-20] INFO  AuditInterceptor - [13pn9kdr5tl84stzkmaa8vmg] end /web/v1/project/fhp4clxfbu0w3ym2n7ee6ynh/statistics?executionName=bayes_poc_test GET, 187 ms

The schema of the DataHub topic is:

request_time, STRING
thread_id, STRING
log_level, STRING
class_name, STRING
request_id, STRING
detail, STRING

The Logstash configuration is:

input {
    file {
        path => "${APP_HOME}/log/app.log"
        start_position => "beginning"
    }
}

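filter {
    grok {
        # Capture the bracketed request ID as well, so that every column in the topic schema is populated.
        match => {
            "message" => "(?<request_time>\d\d:\d\d:\d\d\.\d+)\s+\[(?<thread_id>[\w\-]+)\]\s+(?<log_level>\w+)\s+(?<class_name>\w+)\s+\-\s+\[(?<request_id>\w+)\]\s+(?<detail>.+)"
        }
    }
}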

output {
    datahub {
        access_id => ""
        access_key => ""
        endpoint => ""
        project_name => ""
        topic_name => ""
        #shard_id => "0"
        #shard_keys => ["thread_id"]
        dirty_data_continue => true
        dirty_data_file => "/Users/ph0ly/trash/dirty.data"
        dirty_data_file_max_size => 1000
    }
}
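
One way to check the grok expression before deploying is to run an equivalent regular expression against the sample log line. The Ruby sketch below is illustrative only and is not part of the plugin. `extract_fields` is a hypothetical helper, and the pattern assumes a separate `request_id` group so that every column of the topic schema gets a value (drop that group if your filter does not capture it).

```ruby
# Named-group regex equivalent to the grok expression (both use (?<name>...) syntax).
# Assumption: request_id is captured as its own group to match the topic schema.
PATTERN = /(?<request_time>\d\d:\d\d:\d\d\.\d+)\s+\[(?<thread_id>[\w\-]+)\]\s+(?<log_level>\w+)\s+(?<class_name>\w+)\s+\-\s+\[(?<request_id>\w+)\]\s+(?<detail>.+)/

SAMPLE = '20:04:30.359 [qtp1453606810-20] INFO  AuditInterceptor - [13pn9kdr5tl84stzkmaa8vmg] end /web/v1/project/fhp4clxfbu0w3ym2n7ee6ynh/statistics?executionName=bayes_poc_test GET, 187 ms'

# Hypothetical helper: returns a Hash of field name => captured text,
# or nil when the line does not match the pattern.
def extract_fields(line)
  m = PATTERN.match(line)
  m && m.names.zip(m.captures).to_h
end

fields = extract_fields(SAMPLE)
fields.each { |name, value| puts "#{name}: #{value}" }
```

Each key of the resulting hash corresponds to one column of the topic schema; a line that does not match yields `nil`, which is the kind of event the dirty-data options are meant to handle.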

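Parameters

access_id(Required): Alibaba Cloud access ID
access_key(Required): Alibaba Cloud access key
endpoint(Required): Service endpoint of Alibaba Cloud DataHub
project_name(Required): DataHub project name
topic_name(Required): DataHub topic name
retry_times(Optional): Number of retries. -1 retries indefinitely, 0 disables retries, and a value greater than 0 retries at most that many times
retry_interval(Optional): Interval before the next retry, in seconds
shard_keys(Optional): Array of field names used for shard routing. The plugin hashes the values of these fields to assign each record to a shard. If neither shard_keys nor shard_id is specified, records are distributed across shards round-robin
shard_id(Optional): Writes all records to the specified shard. If neither shard_keys nor shard_id is specified, records are distributed across shards round-robin
dirty_data_continue(Optional): Whether to keep running when dirty data is encountered. Defaults to false. When set to true, dirty records are skipped and processing continues. Enabling this option requires dirty_data_file to be set
dirty_data_file(Optional): Name of the dirty data file. Must be set when dirty_data_continue is enabled. Note that the dirty data file is split into two parts, .part1 and .part2: part1 holds the older dirty data and part2 the newer
dirty_data_file_max_size(Optional): Maximum size of the dirty data file. The plugin tries to keep the file below this size, but the value is currently only a guideline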
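
The retry_times and retry_interval semantics from the parameter list above can be sketched in Ruby as follows. This is a hypothetical helper written for illustration, not the plugin's implementation: `with_retries` and its `sleeper` argument are assumptions, the latter added so the wait can be stubbed out in tests.

```ruby
# Hypothetical helper, not part of the plugin: runs a block and retries it
# according to the documented retry_times / retry_interval semantics.
def with_retries(retry_times:, retry_interval:, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retries_used = attempts - 1
    # -1 retries forever; 0 never retries; N > 0 allows at most N retries.
    raise unless retry_times == -1 || retries_used < retry_times
    sleeper.call(retry_interval) # wait retry_interval seconds before the next attempt
    retry
  end
end

# Usage: fail twice, then succeed on the third attempt (no real sleeping).
calls = 0
result = with_retries(retry_times: 3, retry_interval: 1, sleeper: ->(_seconds) {}) do |attempt|
  calls += 1
  raise 'transient failure' if attempt < 3
  "ok after #{attempt} attempts"
end
puts result
```

With retry_times set to 0 the first failure is re-raised immediately, which matches the "no retry" setting described above.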

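The shard routing rules from the parameter list (a fixed shard_id, hashing on shard_keys, or round-robin when neither is set) can be illustrated with a short Ruby sketch. Everything here is an assumption made for illustration: `ShardPicker` is a hypothetical class, CRC32 stands in for whatever hash the plugin actually uses, and giving shard_id precedence over shard_keys when both are set is a guess, since the README does not cover that case.

```ruby
require 'zlib'

# Hypothetical sketch, not the plugin's code: chooses a shard for one event.
class ShardPicker
  def initialize(shard_ids, shard_id: nil, shard_keys: nil)
    @shard_ids = shard_ids
    @shard_id = shard_id
    @shard_keys = shard_keys
    @next = 0
  end

  def pick(event)
    # Fixed shard: every event goes to shard_id (assumed to win over shard_keys).
    return @shard_id if @shard_id

    if @shard_keys && !@shard_keys.empty?
      # Hash the values of the configured fields; CRC32 is a stand-in hash.
      key = @shard_keys.map { |k| event[k].to_s }.join("\u0000")
      return @shard_ids[Zlib.crc32(key) % @shard_ids.size]
    end

    # Default: round-robin over the available shards.
    shard = @shard_ids[@next % @shard_ids.size]
    @next += 1
    shard
  end
end

shards = %w[0 1 2]
round_robin = ShardPicker.new(shards)
order = 4.times.map { round_robin.pick({}) }
puts order.inspect

keyed = ShardPicker.new(shards, shard_keys: ['thread_id'])
a = keyed.pick({ 'thread_id' => 'qtp1453606810-20' })
b = keyed.pick({ 'thread_id' => 'qtp1453606810-20' })
puts a == b
```

Hashing on shard_keys keeps all events with the same field values on one shard, which preserves their relative order there, while round-robin spreads load evenly without any ordering guarantee across shards.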
References


  • LogStash home page
  • LogStash plugin development

Authors && Contributors


License


Licensed under the Apache License 2.0.

Package last updated on 14 Jun 2017
