ScholarVista is a tool that analyzes research papers and extracts and plots information from them. It utilizes Grobid, a library for extracting content from research papers, to extract all the relevant data. The extracted data is then plotted and displayed using Python.
ScholarVista extracts and plots information from a set of academic research papers in PDF or TEI XML format. To process PDFs, it uses Grobid to generate the TEI XML files. It then extracts the relevant information from the TEI XML files and generates a keyword cloud and a list of URLs for each paper, together with a histogram comparing the number of figures in each paper.
If you want to generate the results from a set of PDF academic papers, you must ensure that the Grobid service is installed and running on your machine. See the Grobid installation instructions here.
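Before running the PDF workflow, it can help to confirm that Grobid is reachable. The following is a minimal sketch, not part of ScholarVista: the helper name `grobid_is_alive` is hypothetical, and it assumes Grobid is listening on its usual default address `http://localhost:8070` and exposes its `/api/isalive` health-check endpoint. Adjust the URL if your installation differs.

```python
import urllib.error
import urllib.request


def grobid_is_alive(base_url: str = "http://localhost:8070", timeout: float = 3.0) -> bool:
    """Return True if a Grobid service answers its health-check endpoint.

    Assumption: the service exposes GET /api/isalive at `base_url`.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/isalive", timeout=timeout) as resp:
            # Any HTTP 200 answer is treated as "service is up".
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or a malformed URL.
        return False
```

Calling `grobid_is_alive()` before `process-pdfs` gives a clear failure early instead of an error partway through a batch.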
If you already have the TEI XML files generated, you can directly generate the information from them.
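To illustrate what working directly from TEI XML can look like, here is a small sketch using only the Python standard library. It is not ScholarVista's actual implementation: the function `summarize_tei` and the sample document are hypothetical, and it assumes Grobid-style TEI in which figures appear as `<figure>` elements (with tables marked as `type="table"`) and links appear as `<ptr target="...">` elements.

```python
import xml.etree.ElementTree as ET

# TEI documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def summarize_tei(xml_text: str) -> dict:
    """Count figures and collect the distinct http(s) URLs in a TEI XML string."""
    root = ET.fromstring(xml_text)
    # Assumption: tables are encoded as <figure type="table">, so skip them.
    figures = [
        fig for fig in root.findall(".//tei:figure", TEI_NS)
        if fig.get("type") != "table"
    ]
    urls = sorted({
        el.get("target")
        for el in root.findall(".//tei:ptr[@target]", TEI_NS)
        if el.get("target", "").startswith(("http://", "https://"))
    })
    return {"figure_count": len(figures), "urls": urls}


# Hypothetical, hand-written sample; real Grobid output is much larger.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>Code is at <ptr target="https://example.org/code"/>.</p>
    <figure xml:id="fig_0"><head>Figure 1</head></figure>
    <figure xml:id="fig_1"><head>Figure 2</head></figure>
    <figure type="table" xml:id="tab_0"><head>Table 1</head></figure>
  </body></text>
</TEI>"""

summary = summarize_tei(SAMPLE)
print(summary["figure_count"], summary["urls"])
```

Per-paper figure counts collected this way are the kind of data a comparison histogram could be built from.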
$ pip install scholarvista
When using pip, it is good practice to use virtual environments. Check out the official documentation on virtual environments here.
The most convenient way of using ScholarVista is by using its CLI.
The CLI tool generates a keyword cloud and a list of URLs for each PDF analyzed, together with a histogram comparing the number of figures in each PDF, and saves them to a directory.
Usage: scholarvista [OPTIONS] COMMAND [ARGS]...
ScholarVista's CLI main entry point.
Options:
--input-dir PATH Directory containing PDF files. [required]
--output-dir PATH Directory to save results. Defaults to current directory.
--help Show this message and exit.
Commands:
process-pdfs Process all PDFs in the given directory.
process-xmls Process all TEI XMLs in the given directory.
See example.py
You can execute ScholarVista CLI from your shell like this:
# Process PDF files and save the results to a specified directory
$ scholarvista --input-dir ./pdfs --output-dir ./output process-pdfs
Note: The process-pdfs command requires the Grobid service to be up and running, as described in the requirements.
# Process TEI XML files and save the results to the current directory
$ scholarvista --input-dir ./xmls process-xmls
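The same invocations can be scripted from Python by calling the CLI as a subprocess. This is a sketch under stated assumptions: the helpers `build_command` and `run_scholarvista` are hypothetical names introduced here, and they rely only on the options and commands shown in the usage text above, with the `scholarvista` executable available on your `PATH`.

```python
import subprocess
from typing import List, Optional

# Commands listed in the CLI usage text.
COMMANDS = ("process-pdfs", "process-xmls")


def build_command(command: str, input_dir: str, output_dir: Optional[str] = None) -> List[str]:
    """Build the argument list for a scholarvista CLI call."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    args = ["scholarvista", "--input-dir", input_dir]
    if output_dir is not None:
        # When omitted, the CLI defaults to the current directory.
        args += ["--output-dir", output_dir]
    args.append(command)  # global options come before the command
    return args


def run_scholarvista(command: str, input_dir: str, output_dir: Optional[str] = None) -> int:
    """Run the CLI and return its exit code (requires scholarvista to be installed)."""
    return subprocess.run(build_command(command, input_dir, output_dir), check=False).returncode
```

For example, `run_scholarvista("process-xmls", "./xmls")` mirrors the second shell command above.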
Please refer to the LICENSE file.
For further assistance or to contribute to the project, please refer to the CONTRIBUTING.md file.