
job-offer-classifier
Classify job candidate emails
A sentiment classifier for emails from job candidates. It labels each email response according to whether the candidate appears interested in the job position.
The sentiment classifier can be found on PyPI so you can just run:
pip install job-offer-classifier
For an editable install, clone the GitHub repository, cd into the cloned repo directory, and run:
pip install -e job_offer_classifier
First load and run the data science pipeline by importing the module:
from job_offer_classifier.pipeline_classifier import Pipeline
Instantiate the class Pipeline and call the pipeline method. This method loads the dataset, then trains and evaluates the model. The source file is the dataset of payloads annotated with 'positive' and 'negative' labels.
pl = Pipeline(src_file='../data/interim/payloads.csv', random_state=931696214)
pl.pipeline()
The parameter random_state is the seed pandas uses when splitting the dataframe. Fixing it makes the results deterministic. The value used here was chosen from the results of the k-fold validation.
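As a rough illustration of why a fixed seed gives deterministic results, the sketch below splits a toy dataframe with pandas. The `split` helper, the 80/20 ratio, and the toy data are assumptions made for this example. The source does not show how Pipeline performs its split internally.

```python
import pandas as pd

# Hypothetical toy data; the real payloads.csv is not reproduced here.
df = pd.DataFrame({
    "payload": [f"email {i}" for i in range(10)],
    "sentiment": ["positive"] * 8 + ["negative"] * 2,
})


def split(frame, seed, train_frac=0.8):
    """Split a dataframe into train and test parts using a fixed seed.

    Illustrative helper only, not part of job_offer_classifier.
    """
    train = frame.sample(frac=train_frac, random_state=seed)
    test = frame.drop(train.index)  # every row not drawn for training
    return {"train": train, "test": test}


first = split(df, seed=931696214)
second = split(df, seed=931696214)

# The same seed selects the same rows in the same order on every run.
print(first["train"].index.equals(second["train"].index))
```

A different seed would generally select different rows, so any reported score depends on which seed is fixed.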
To make a prediction, use the sentiment method:
pl.sentiment(''' Thank you for offering me the position of Merchandiser with Thomas Ltd.
I am thankful to accept this job offer and look ahead to starting my career with your company
on June 27, 2000.''')
'positive'
You can also take an example from the test set, which is stored in the dfs attribute. This attribute is a dictionary of pandas dataframes.
example = pl.dfs['test'].sample(random_state=1213702178).payload.iloc[0]
print(example.strip())
thank you for offering me the position of financial analyst at Lozano-Carlson.
i was delighted to meet
you and learn more about the company.
although i verbally agreed to accept the position, i have given it a lot of thought and decided to turn
down the post.
i believe it is in my, and your company’s, best interests.
ultimately, i elected to take on a
position at a firm where i believe my skills and experience are a better fit. i truly apologise for any
inconvenience i have caused.
i was impressed with Lozano-Carlson during the interview, and continue to be at this time.
wishing you
all the best in the future and hope to still see you in attendance at the snow terrace financial conference
in june.
pl.sentiment(example)
'negative'
We use two tools to assess the performance of the model: the confusion matrix and k-fold validation.
To plot the confusion matrix, the Pipeline has the method plot_confusion_matrix.
pl.plot_confusion_matrix('train')
pl.plot_confusion_matrix('test')
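For readers unfamiliar with the plot, a confusion matrix simply counts how often each actual label was predicted as each label. The following is a minimal sketch using only the standard library. The `confusion_matrix` function and the sample labels are hypothetical and are not taken from job_offer_classifier.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return rows indexed by actual label and columns by predicted label."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels, for illustration only.
actual = ["positive", "positive", "negative", "positive", "negative"]
predicted = ["positive", "negative", "negative", "positive", "positive"]
labels = ["negative", "positive"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>8}: {row}")
```

The diagonal entries are the correct predictions, and the off-diagonal entries show which classes are confused with each other.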
To assess the performance of the model with k-fold validation, import the class KFoldPipe:
from job_offer_classifier.validations import KFoldPipe
Run the k_fold_validation method:
kfp = KFoldPipe(src_file='../data/interim/payloads.csv', n_splits=4)
kfp.k_fold_validation()
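The general idea behind k-fold validation can be sketched as follows: the samples are shuffled, divided into n_splits folds, and each fold serves once as the test set while the rest are used for training. This is a generic sketch under that assumption, not the implementation of KFoldPipe. The function name `k_fold_indices` and its parameters are hypothetical.

```python
import random


def k_fold_indices(n_samples, n_splits, seed):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_splits folds of near-equal size.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=10, n_splits=4, seed=0))
for train_idx, test_idx in splits:
    print(len(train_idx), "train samples,", len(test_idx), "test samples")
```

Averaging the scores over all folds, as the averages attribute does, gives an estimate that depends less on any single split.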
The averaged scores are stored in averages
kfp.averages['train']
{'accuracy': 0.9954212456941605,
'accuracy_baseline': 0.7985348105430603,
'auc': 0.9987489432096481,
'auc_precision_recall': 0.9996496587991714,
'average_loss': 0.02481173211708665,
'label/mean': 0.7985348105430603,
'loss': 0.03453406784683466,
'precision': 0.9954595416784286,
'prediction/mean': 0.7989358454942703,
'recall': 0.9988532066345215,
'global_step': 12500.0,
'f1_score': 0.9971447710408015}
kfp.averages['test']
{'accuracy': 0.980555534362793,
'accuracy_baseline': 0.800000011920929,
'auc': 0.995563268661499,
'auc_precision_recall': 0.9989252239465714,
'average_loss': 0.060208675917238,
'label/mean': 0.800000011920929,
'loss': 0.060208675917238,
'precision': 0.986666664481163,
'prediction/mean': 0.8020820915699005,
'recall': 0.9895833283662796,
'global_step': 12500.0,
'f1_score': 0.9880000766313914}
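Two of the reported figures can be related to the others with a little arithmetic. The sketch below assumes the conventional definition of F1 as the harmonic mean of precision and recall, and it assumes that accuracy_baseline is the accuracy of always predicting the majority class. The source states neither definition, although the second is consistent with accuracy_baseline equalling label/mean in the figures above. The values are copied from the test averages.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (conventional F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Values copied from kfp.averages['test'] above.
test_averages = {
    "precision": 0.986666664481163,
    "recall": 0.9895833283662796,
    "label/mean": 0.800000011920929,
    "f1_score": 0.9880000766313914,
}

f1_from_averages = f1_score(test_averages["precision"], test_averages["recall"])

# Assumed baseline: always predict the more frequent of the two classes.
label_mean = test_averages["label/mean"]
majority_baseline = max(label_mean, 1 - label_mean)

print("F1 from averaged precision/recall:", f1_from_averages)
print("Reported averaged F1:             ", test_averages["f1_score"])
print("Majority-class baseline accuracy: ", majority_baseline)
```

The recomputed F1 is close to the reported one but not identical. One plausible explanation, again an assumption, is that the reported value is an average of per-fold F1 scores rather than the F1 of the averaged precision and recall.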
The seed of the best F1 score is stored in best_seed
kfp.best_seed
427851256
The library also supports labels with more than two classes. The following statement imports the multiclass classifier:
from job_offer_classifier.multiclass import Multiclass
The sibatel_web_intekglobal_payloads.csv file contains three types of sentiment: 'positive', 'negative' and 'neutral'. Instantiate the Multiclass class, specifying the number of classes:
mc = Multiclass(
    src_file='../data/raw/sibatel_web_intekglobal_payloads.csv',
    random_state=931696214,
    n_classes=3
)
mc.pipeline()
mc.plot_confusion_matrix('train')
mc.plot_confusion_matrix('test')
For more on the training parameters and on how to store and load trained models, see the pipeline docs and multiclass docs. The validation method is described in the validations docs.
https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub