
Security News
MCP Community Begins Work on Official MCP Metaregistry
The MCP community is launching an official registry to standardize AI tool discovery and let agents dynamically find and install MCP servers.
Analyze resumes and job descriptions by extracting key sections and scoring them based on keyword relevance
dsresumatch
is a Python library designed to do the analysis, evaluation, and scoring of resumes in PDF format. With tools to extract, clean, and analyze text, it allows users to identify missing, count word frequencies, evaluate keyword matches, and generate scores for resumes based on predefined criteria. This package is especially useful for recruiters, hiring managers, and job seekers aiming to optimize resumes for keyword-based Applicant Tracking Systems (ATS).
Text Processing
read_pdf(file_path)
: Extracts raw text from a PDF file.clean_text(raw_text)
: Cleans and preprocesses extracted text by removing punctuation, stop words, and converting to lowercase.count_words_in_pdf(file_path)
: Counts word frequencies in a PDF file.Section Evaluation
missing_section(clean_text, add_benchmark_sections=None)
: Identifies sections missing from the resume compared to benchmark sections.Keyword Analysis
evaluate_keywords(cleaned_text, keywords=None, use_only_supplied_keywords=False)
: Matches keywords against the resume text and evaluates the coverage.Resume Scoring
resume_score(cleaned_text, keywords=None, use_only_supplied_keywords=False, add_benchmark_sections=[], feedback=True)
: Scores resumes based on keyword matching, benchmark sections, and provides detailed feedback on missing or extra keywords and sections.dsresumatch
Fit into the Python Ecosystem?dsresumatch
addresses a unique niche in the Python ecosystem by focusing on resume analysis and scoring, particularly for optimizing resumes for Applicant Tracking Systems (ATS) for Data Scientists. While there are general-purpose text analysis libraries such as:
There are no Python packages that consistently support resume matching (e.g., resume-matcher, which was last updated in February 2024). However, there are some Python programs available, such as resume-job-matcher and Resume Compatibility.
If you are looking for general PDF text extraction, libraries like PyPDF2 and pdfplumber might suit your needs. However, dsresumatch
builds on this functionality to provide domain-specific tools tailored to resume evaluation.
$ pip install dsresumatch
The full documentation can be found here.
Before using dsresumatch
, please ensure that your PDF file is placed in the root directory of the project repository.
Here is an example of using dsresumatch
to extract text from pdf, count words from pdf:
# Import required functions
from dsresumatch.pdf_cv_processing import read_pdf, count_words_in_pdf, clean_text
from dsresumatch.sections_check import missing_section
from dsresumatch.evaluate_keywords import evaluate_keywords
from dsresumatch.resume_scoring import resume_score
#additonal imports would be determined later
file_path = "~/Desktop/my_example_cv.pdf" # Specify the file path
raw_text = read_pdf(file_path) # Read text from the PDF
cleaned_text = clean_text(raw_text) # Clean and preprocess the text
word_counts = count_words_in_pdf(file_path) # Count words in the PDF
# (Optional) give keywords
add_benchmark_sections = ["Work Experience", "Education", "Skills", "Projects", "Certifications"]
# Identify missing sections
missing = missing_section(cleaned_text, benchmark_sections)
keywords = ["Python", "Data Analysis", "Machine Learning"] # Evaluate keywords
keyword_evaluation = evaluate_keywords(cleaned_text, keywords)
resume_summary = resume_score(
cleaned_text,
keywords=keywords,
add_benchmark_sections=benchmark_sections,
feedback=True,
) # Score the resume
print("Word Counts:", word_counts)
print("Missing Sections:", missing)
print("Keyword Evaluation:", keyword_evaluation)
print("Resume Summary:", resume_summary)
A vignette, outlining the full functionality of the package can be found here.
Nelli Hovhannisyan, Ashita Diwan, Timothy Singh, Jia Quan Lim
As a developer, are you innterested in contributing? Check out the contributing guidelines.
You may find the guided instructions in CONTRIBUTING.md Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
dsresumatch
was created by Nelli Hovhannisyan, Ashita Diwan, Timothy Singh, Jia Quan Lim. It is licensed under the terms of the MIT license.
dsresumatch
was created with cookiecutter
and the py-pkgs-cookiecutter
template.
FAQs
Analyze resumes and job descriptions by extracting key sections and scoring them based on keyword relevance
We found that dsresumatch demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
The MCP community is launching an official registry to standardize AI tool discovery and let agents dynamically find and install MCP servers.
Research
Security News
Socket uncovers an npm Trojan stealing crypto wallets and BullX credentials via obfuscated code and Telegram exfiltration.
Research
Security News
Malicious npm packages posing as developer tools target macOS Cursor IDE users, stealing credentials and modifying files to gain persistent backdoor access.