
light-search
https://github.com/hecomi/node-mecab-async
MeCab example: http://www.edrdg.org/~jwb/mecabdemo.html
NBest: for the input 东京大学 (University of Tokyo), NBest=1 segments it into 1 word; NBest=2 segments it into 3 words: 东京, 大学, 东京大学
Similarity algorithms
2.1 cos θ (cosine similarity)
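As a rough illustration of the cosine measure named above, the following sketch compares two sparse term-weight vectors. The function name cosineSimilarity and the plain-object vector layout are assumptions made for this example and are not taken from light-search itself.

```javascript
// Cosine similarity between two sparse term-weight vectors.
// Each vector is a plain object mapping term -> weight (hypothetical layout).
function cosineSimilarity(a, b) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const term of Object.keys(a)) {
    normA += a[term] * a[term];
    // Only terms present in both vectors contribute to the dot product.
    if (has(b, term)) dot += a[term] * b[term];
  }
  for (const term of Object.keys(b)) {
    normB += b[term] * b[term];
  }
  // An all-zero vector has no direction; this sketch returns 0 in that case.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const q = { "东京": 1, "大学": 1 };
const d = { "东京": 1, "大学": 1, "东京大学": 1 };
console.log(cosineSimilarity(q, d).toFixed(3)); // prints 0.816
```

In practice the weights would more likely be tf * idf values than the raw 1s used here.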
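Dictionary path: /usr/local/Cellar/mecab/0.996/lib/mecab/dic
tf (term frequency): number of times the word occurs / total number of words in the sentence. A word that occurs repeatedly is more representative of the document.
idf (inverse document frequency): log(total number of documents / number of documents containing the word). A word that appears in only one document is representative of that document's meaning. With idf, words that occur very often across different documents can be ignored.
tf * idf: combines the two factors above; the larger the value, the more important the word.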
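The tf, idf and tf * idf definitions above can be sketched as follows. This is a minimal example under stated assumptions: the function names, the token-array input format, the natural logarithm, and the choice of idf = 0 for words absent from the corpus are all illustrative and are not confirmed by the source.

```javascript
// documents: array of token arrays, already segmented (for example by MeCab).
// Returns a Map of term -> number of documents containing that term.
function documentFrequency(documents) {
  const df = new Map();
  for (const tokens of documents) {
    // A Set counts each term at most once per document.
    for (const term of new Set(tokens)) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }
  return df;
}

// Computes tf, idf and tf * idf for every distinct term in `tokens`.
function tfidf(tokens, documents) {
  const df = documentFrequency(documents);
  const counts = new Map();
  for (const term of tokens) {
    counts.set(term, (counts.get(term) || 0) + 1);
  }
  const result = new Map();
  for (const [term, n] of counts) {
    const tf = n / tokens.length; // occurrences / words in the sentence
    const count = df.get(term) || 0; // documents containing the term
    // Assumption: a term that is absent from the corpus gets idf 0.
    const idf = count > 0 ? Math.log(documents.length / count) : 0;
    result.set(term, { n, tf, idf, tfidf: tf * idf });
  }
  return result;
}

const corpus = [
  ["东京", "大学"],
  ["东京", "天气"],
  ["大学", "入学", "考试"],
];
console.log(tfidf(["东京", "大学", "东京"], corpus));
```

A word that occurs in every document gets idf = log(1) = 0, which is how very common words end up being ignored.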
Besides cosine similarity:
=> Jaccard similarity is also commonly used: http://blog.csdn.net/xceman1997/article/details/8600277
=> simhash, principle and code implementation: http://blog.sina.cn/dpool/blog/s/blog_81e6c30b0101cpvu.html
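For comparison with cosine similarity, Jaccard similarity works on sets of words rather than weighted vectors: the size of the intersection divided by the size of the union. The sketch below assumes token arrays as input; the function name jaccardSimilarity and the choice to return 0 for two empty inputs are illustrative conventions, not part of light-search.

```javascript
// Jaccard similarity of two token lists: |A ∩ B| / |A ∪ B|.
function jaccardSimilarity(tokensA, tokensB) {
  const a = new Set(tokensA);
  const b = new Set(tokensB);
  // Convention chosen for this sketch: two empty sets compare as 0.
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  for (const term of a) {
    if (b.has(term)) intersection += 1;
  }
  // |A ∪ B| = |A| + |B| - |A ∩ B|
  return intersection / (a.size + b.size - intersection);
}

console.log(
  jaccardSimilarity(["东京", "大学"], ["东京", "大学", "东京大学"]).toFixed(3)
); // prints 0.667
```

Unlike tf * idf with cosine, this measure ignores how often a word occurs and how rare it is across documents.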
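k: kanji
r: reading
v: {
  w: weight
  n: number of times the word occurs in the given sentence
  tf: tf value
  idf: idf value
  tfidf: tf multiplied by idf
  count: number of documents in which the word occurs
  weight: weight
  sum: number of words in the given sentence
  total: total number of documents
}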
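To show how the fields above relate to each other, here is a hypothetical helper that assembles one such record from raw counts. The name buildRecord, the default of 1 for w and weight, and the natural logarithm are assumptions; the source does not say how w and weight differ or how they are computed, so this sketch simply passes them through.

```javascript
// Builds one word record with the field names k, r and v listed above.
// Assumption: w and weight are supplied by the caller and default to 1.
function buildRecord(k, r, { n, sum, count, total, w = 1, weight = 1 }) {
  const tf = n / sum; // occurrences in the sentence / words in the sentence
  const idf = Math.log(total / count); // log(total documents / documents with the word)
  return {
    k,
    r,
    v: { w, n, tf, idf, tfidf: tf * idf, count, weight, sum, total },
  };
}

// Illustrative values only: the word occurs once in a 2-word sentence
// and appears in 2 of 4 documents.
const record = buildRecord("东京", "トウキョウ", { n: 1, sum: 2, count: 2, total: 4 });
console.log(record.v.tf); // prints 0.5
```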
Homepage: http://www.coreseek.cn/opensource/mmseg/
yum install make gcc gcc-c++ libtool autoconf automake
wget http://www.coreseek.cn/uploads/csft/3.2/mmseg-3.2.14.tar.gz
tar zxvf mmseg-3.2.14.tar.gz
cd mmseg-3.2.14
./bootstrap
./configure --prefix=/usr/local/mmseg3
make && make install
ln -s /usr/local/mmseg3/bin/mmseg /bin/mmseg3
brew install m4
brew install libtool
brew install automake
brew install autoconf
brew install autoconf-archive
cd mmseg-3.2.14
./bootstrap
./configure --prefix=/usr/local/mmseg3
make && make install
If ./bootstrap fails with the error
bootstrap: line 24: libtoolize: command not found
then libtoolize should be written as glibtoolize. In addition, after the line #include <string>, add the line #include <ext/hash_map>.
See:
http://www.qinbin.me/mac%E4%B8%AD%E5%AE%89%E8%A3%85coreseeksphinx%E4%B8%AD%E6%96%87%E5%88%86%E8%AF%8D%E5%85%A8%E6%96%87%E6%A3%80%E7%B4%A2/
http://blog.shiniv.com/2013/08/mac-install-coreseek-full-text-search/
Step 1: Install the dependency package (the stdc package installed by default is usually a newer version)
yum install compat-libstdc++-33.x86_64
Step 2: Copy the dictionary files to the system directory (a prebuilt copy is in the project directory)
cp -r /LightSearch/ictclas /usr/lib/ictclas
Step 3: Create the symbolic link
ln -s /usr/lib/ictclas ictclas
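Two files must be placed in the application's working directory.
Note: the data path defined in Configure.xml must be a relative path, not an absolute path. For that reason, a link to /usr/lib/ictclas must also be created in the APP root directory.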
Custom vocabulary/g