OLEA (Offensive Language Error Analysis) is an open source library for diagnostic evaluation and error analysis of models for offensive language detection.
Hate speech and offensive language detection models can benefit from in-depth error analysis, more than just an F1 score, but many systems lack any extensive error analysis. To address this issue, we present OLEA, an extensible tool that provides researchers further insight into the performance of their offensive language detection model on different datasets.
The datasets currently available with OLEA are COLD (Palmer et al., 2020) and HateCheck (Röttger et al., 2021).

Dependencies:
numpy>1.21.0
scipy>1.6.0
datasets>2.2.0
matplotlib>3.0
pandas>1.2.0
Pillow>8.0.0
scikit-learn>1.0
emoji>1.0
wordsegment>1.3
pip install olea
The user provides a pre-trained hate speech detection model and uses it to generate predictions on an OLEA-supported dataset. The user can then apply different analyses to these predictions to gain insight into the cases where their model fails. Consider this introductory example:
from olea.data import COLD
from olea.analysis import COLDAnalysis
from olea.analysis import Generic
import pandas as pd

# Import statements for downloading the example model
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import TextClassificationPipeline

# Load the dataset
cold = COLD()

# Load a pre-trained model
link = 'Hate-speech-CNERG/bert-base-uncased-hatexplain'
tokenizer = AutoTokenizer.from_pretrained(link)
model = AutoModelForSequenceClassification.from_pretrained(link)

# Predict on COLD
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer)
predictions = pd.DataFrame(pipe(list(cold.data()['Text']))).label

# Map the model's labels onto COLD's Y/N offensiveness labels
hate_map = {'offensive': 'Y', 'hate speech': 'Y', 'normal': 'N'}
submission = cold.submit(cold.data(), predictions, map=hate_map)

# Analyze performance on a categorical column and on a substring
plot_info, metrics = COLDAnalysis.analyze_on(submission, 'Cat', show_examples=True)
plot_info, metrics = Generic.check_substring(submission, 'female', show_examples=True)
OLEA provides generic analysis that can be applied to any NLP classification task by evaluating performance on subsets of the data. Subsets can be defined by text length, by the presence of particular substrings, or by whether the text is determined to be written in AAVE (Blodgett et al., 2016). OLEA also provides analysis specific to COLD and to HateCheck. Each analysis reports F1, precision, and recall for each subset of the data, as well as accuracy and the number of instances in each category.
Generic analysis includes:

analyze_on: for evaluating model performance on any specified categorical column.

check_substring: for evaluating model performance based on the presence of a specified substring in the text.

aave: for evaluating how the model predicts on instances written in African American Vernacular English. The scores are calculated using the TwitterAAE model (Blodgett et al., 2016) and represent an inference of the proportion of words in the text that come from a demographically-associated language/dialect.

str_len_analysis: for evaluating how the model performs on instances of different character or word lengths, using a histogram.

check_anno_agreement: for evaluating model performance on instances with full annotator agreement on the offensiveness of a text ("Y","Y","Y") or ("N","N","N") versus instances with only partial agreement. Full agreement should indicate "easy" cases, and partial agreement "difficult" ones.
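The subset-based evaluation these functions perform can be sketched roughly as follows. This is a conceptual illustration, not OLEA's actual implementation: the column names (`Category`, `Label`, `Pred`) and the toy data are invented for the example.

```python
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

# Toy predictions table; in OLEA, the submission object holds gold
# labels, model predictions, and metadata columns for each instance.
df = pd.DataFrame({
    'Category': ['slur', 'slur', 'distancing', 'distancing', 'distancing'],
    'Label':    ['Y',    'N',    'Y',          'Y',          'N'],
    'Pred':     ['Y',    'Y',    'Y',          'N',          'N'],
})

# Evaluate the model separately on each subset of the data, mirroring
# what analyze_on does for a categorical column.
rows = []
for cat, group in df.groupby('Category'):
    p, r, f1, _ = precision_recall_fscore_support(
        group['Label'], group['Pred'],
        pos_label='Y', average='binary', zero_division=0)
    rows.append({'category': cat,
                 'n': len(group),
                 'accuracy': accuracy_score(group['Label'], group['Pred']),
                 'precision': p, 'recall': r, 'f1': f1})

metrics = pd.DataFrame(rows)
print(metrics)
```

The same pattern underlies the other generic analyses: the subset column is simply derived differently (a substring match, a text-length bucket, an AAVE score threshold, or an annotator-agreement flag).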
The COLD-specific analysis includes:

analyze_on: for evaluating model performance on the COLD-specific categories outlined in Palmer et al. (2020). These categories are constructed from offensiveness, presence of adjectival nominalization, presence of a slur, and presence of linguistic distancing.

The HateCheck-specific analysis includes:
analyze_on: for evaluating model performance on the HateCheck-specific categories outlined in Röttger et al. (2021). Some of the categories included are negation, counter, derogation, and profanity.

Marie Grace, Jay Seabrum, Dananjay Srinivas, and Alexis Palmer all contributed to this library. Please contact olea.ask@gmail.com for further inquiries.
Blodgett, S. L., Green, L., & O’Connor, B. (2016). Demographic Dialectal Variation in Social Media: A Case Study of African-American English. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 1119–1130. https://doi.org/10.18653/v1/D16-1120
Palmer, A., Carr, C., Robinson, M., & Sanders, J. (2020). COLD: Annotation scheme and evaluation data set for complex offensive language in English.
Röttger, P., Vidgen, B., Nguyen, D., Waseem, Z., Margetts, H., & Pierrehumbert, J. B. (2021). HateCheck: Functional Tests for Hate Speech Detection Models. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 41–58. https://doi.org/10.18653/v1/2021.acl-long.4