
Company News
Socket Partners with Replit to Block Malicious Packages in AI-Powered Development
Replit is integrating Socket Firewall into its AI-powered development experience to help protect builders from malicious open source packages.
| Language | version |
|---|---|
| Python | |
| Rust |
LTP(Language Technology Platform) 提供了一系列中文自然语言处理工具,用户可以使用这些工具对于中文文本进行分词、词性标注、句法分析等等工作。
如果您在工作中使用了 LTP,您可以引用这篇论文
@article{che2020n,
title={N-LTP: A Open-source Neural Chinese Language Technology Platform with Pretrained Models},
author={Che, Wanxiang and Feng, Yunlong and Qin, Libo and Liu, Ting},
journal={arXiv preprint arXiv:2009.11616},
year={2020}
}
参考书: 由哈工大社会计算与信息检索研究中心(HIT-SCIR)的多位学者共同编著的《自然语言处理:基于预训练模型的方法 》(作者:车万翔、郭江、崔一鸣;主审:刘挺)一书现已正式出版,该书重点介绍了新的基于预训练模型的自然语言处理技术,包括基础知识、预训练词向量和预训练模型三大部分,可供广大LTP用户学习参考。
pip install -U ltp ltp-core ltp-extension -i https://pypi.org/simple # 安装 ltp
注: 如果遇到任何错误,请尝试使用上述命令重新安装 ltp,如果依然报错,请在 Github issues 中反馈。
import torch
from ltp import LTP
ltp = LTP("LTP/small") # 默认加载 Small 模型
# 将模型移动到 GPU 上
if torch.cuda.is_available():
# ltp.cuda()
ltp.to("cuda")
output = ltp.pipeline(["他叫汤姆去拿外衣。"], tasks=["cws", "pos", "ner", "srl", "dep", "sdp"])
# 使用字典格式作为返回结果
print(output.cws) # print(output[0]) / print(output['cws']) # 也可以使用下标访问
print(output.pos)
print(output.sdp)
# 使用感知机算法实现的分词、词性和命名实体识别,速度比较快,但是精度略低
ltp = LTP("LTP/legacy")
# cws, pos, ner = ltp.pipeline(["他叫汤姆去拿外衣。"], tasks=["cws", "ner"]).to_tuple() # error: NER 需要 词性标注任务的结果
cws, pos, ner = ltp.pipeline(["他叫汤姆去拿外衣。"], tasks=["cws", "pos", "ner"]).to_tuple() # to tuple 可以自动转换为元组格式
# 使用元组格式作为返回结果
print(cws, pos, ner)
use std::fs::File;
use itertools::multizip;
use ltp::{CWSModel, POSModel, NERModel, ModelSerde, Format, Codec};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let file = File::open("data/legacy-models/cws_model.bin")?;
let cws: CWSModel = ModelSerde::load(file, Format::AVRO(Codec::Deflate))?;
let file = File::open("data/legacy-models/pos_model.bin")?;
let pos: POSModel = ModelSerde::load(file, Format::AVRO(Codec::Deflate))?;
let file = File::open("data/legacy-models/ner_model.bin")?;
let ner: NERModel = ModelSerde::load(file, Format::AVRO(Codec::Deflate))?;
let words = cws.predict("他叫汤姆去拿外衣。")?;
let pos = pos.predict(&words)?;
let ner = ner.predict((&words, &pos))?;
for (w, p, n) in multizip((words, pos, ner)) {
println!("{}/{}/{}", w, p, n);
}
Ok(())
}
| 深度学习模型 | 分词 | 词性 | 命名实体 | 语义角色 | 依存句法 | 语义依存 | 速度(句/S) |
|---|---|---|---|---|---|---|---|
| Base | 98.7 | 98.5 | 95.4 | 80.6 | 89.5 | 75.2 | 39.12 |
| Base1 | 99.22 | 98.73 | 96.39 | 79.28 | 89.57 | 76.57 | --.-- |
| Base2 | 99.18 | 98.69 | 95.97 | 79.49 | 90.19 | 76.62 | --.-- |
| Small | 98.4 | 98.2 | 94.3 | 78.4 | 88.3 | 74.7 | 43.13 |
| Tiny | 96.8 | 97.1 | 91.6 | 70.9 | 83.8 | 70.1 | 53.22 |
| 感知机算法 | 分词 | 词性 | 命名实体 | 速度(句/s) | 备注 |
|---|---|---|---|---|---|
| Legacy | 97.93 | 98.41 | 94.28 | 21581.48 | 性能详情 |
注:感知机算法速度为开启16线程速度
make bdist
感知机算法
深度学习算法
FAQs
Unknown package
We found that ltp/base2 demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Company News
Replit is integrating Socket Firewall into its AI-powered development experience to help protect builders from malicious open source packages.

Security News
npm confirmed a tooling bug incorrectly marked several one-character packages as security holders and said it was working on a rollback.

Research
/Security News
Newer packages in this compromise use native extensions and .pth loaders to execute JavaScript stealers in developer environments.