Socket
Socket
Sign inDemoInstall

com.ecfront:keyword-extract

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

com.ecfront:keyword-extract

URL关键词提取


Version published
Maintainers
1
Source

== URL关键词提取

image::https://img.shields.io/travis/gudaoxuri/keyword-extract.svg[link="https://travis-ci.org/gudaoxuri/keyword-extract"] image:https://api.codacy.com/project/badge/Grade/f2fc8d2aa9594a0bae6e2a445caa56db["Codacy code quality", link="https://www.codacy.com/app/gudaoxuri/keyword-extract?utm_source=github.com&utm_medium=referral&utm_content=gudaoxuri/keyword-extract&utm_campaign=Badge_Grade"] image:https://img.shields.io/badge/license-ASF2-blue.svg["Apache License 2",link="https://www.apache.org/licenses/LICENSE-2.0.txt"] image:https://maven-badges.herokuapp.com/maven-central/com.ecfront/keyword-extract/badge.svg["Maven Central",link="https://maven-badges.herokuapp.com/maven-central/com.ecfront/keyword-extract/"]

单文件、无三方依赖、支持在线规则升级、非标准协议的URL关键词提取工具。

=== 使用

[source,xml]

com.ecfront keyword-extract 1.6 ----

[source,java]

// 关键词提取 KeyWordExtract.Result result = KeyWordExtract.extract(url);

// 使用在线规则 KeyWordExtract.loadOnlineRules("https://raw.githubusercontent.com/gudaoxuri/keyword-extract/master/src/main/resources/kwe-rules.txt");

=== 规则配置说明

本地规则文件默认已打到jar中,如要修改可在classpath根目录中创建kwe-rules.txt文件,此文件会覆盖默认规则。

使用在线规则会覆盖自定义规则。


一行一条规则,配置项以|分隔

规则分一般规则和自定义规则,后者使用js代码处理

一般规则

<名称>||<关键字所在位置,query:查询条件中,path:url路径中>|<对于query位置指定关键字的key,对于path位置指定以/分隔的偏移量>|<解码方式,目前只支持decodeURI,空>|<编码>

e.g. :

百度|www.baidu.com|query|wd|decodeURI|UTF-8 搜狗微信|weixin.sogou.com|query|query|encodeURI|UTF-8 苏宁|search.suning.com|path|0|decodeURI|UTF-8

自定义规则

<名称>||<js代码,入参为uri,返回值为result>

e.g. :

微博|s.weibo.com|var uri = decodeURI(decodeURI(uri)); var kv = uri.split("/")[2]; result = kv.split("&Refer=")[0];

自定义协议支持

app://app1/somepath?q=URL关键词提取 custom://custom1/somepath?q=URL关键词提取


FAQs

Package last updated on 13 Nov 2018

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc