
Python3 client for the arXiv API.
Install the arxiv_client package from PyPI.
This differs from the pre-existing arxiv.py project in that it further abstracts away the arXiv API so you do not need to learn to construct query strings. The overall goal is to enable users to skip reading the API docs entirely.
import arxiv_client as arx
client = arx.Client()
articles = client.rss_by_subject(arx.Subject.COMPUTER_SCIENCE)
import arxiv_client as arx
categories = [arx.Category.CS_AI, arx.Category.CS_CL, arx.Category.CS_IR]
client = arx.Client()
articles = client.search(arx.Query(keywords=["llm"], categories=categories, max_results=10))
for article in articles:
    print(article)
When using the structured Query fields, multiple values within a single field are combined using OR, and multiple fields are combined using AND.
The Query object accepts the following field filters:
- keywords: terms across all fields
- title_keywords: terms in the article title
- author_names: names in the author list
- categories: arXiv subject categories
- abstract_keywords: terms in the article abstract
- comment_keywords: terms in the author-provided comment
- article_ids: arXiv article IDs
- custom_params: custom query string

Query(keywords=["llm"], categories=[Category.CS_AI, Category.CS_IR], max_results=5)
# Query(
# keywords=['llm'],
# title_keywords=[],
# author_names=[],
# categories=[<Category.CS_AI: 'cs.AI'>, <Category.CS_IR: 'cs.IR'>],
# abstract_keywords=[],
# comment_keywords=[],
# article_ids=[],
# custom_params=None,
# sort_criterion=SortCriterion(sort_by=<SortBy.LAST_UPDATED_DATE: 'lastUpdatedDate'>, sort_order=<SortOrder.DESC: 'descending'>),
# start=0,
# max_results=5
# )
Results in the following query logic:
("llm") in any field AND (cs.AI OR cs.IR) in the categories
See the Query class for more information.
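The OR-within-a-field, AND-across-fields rule can be sketched in a few lines of standalone Python. This is only an illustration of the combination logic, not the library's actual implementation: `FIELD_PREFIXES` and `build_search_query` are hypothetical names, and the prefixes (`all`, `ti`, `au`, `cat`, `abs`, `co`) are the field prefixes documented for the arXiv API.

```python
from typing import Dict, List

# Hypothetical mapping from Query-style field names to arXiv API field prefixes.
FIELD_PREFIXES = {
    "keywords": "all",
    "title_keywords": "ti",
    "author_names": "au",
    "categories": "cat",
    "abstract_keywords": "abs",
    "comment_keywords": "co",
}


def build_search_query(fields: Dict[str, List[str]]) -> str:
    """Join values within a field with OR, then join the fields with AND."""
    groups = []
    for name, prefix in FIELD_PREFIXES.items():
        values = fields.get(name) or []
        if not values:
            continue  # empty filters do not contribute to the query
        # Quote multi-word terms so they are treated as a phrase.
        terms = [f'{prefix}:"{v}"' if " " in v else f"{prefix}:{v}" for v in values]
        joined = " OR ".join(terms)
        # Parenthesize only when there is more than one alternative.
        groups.append(f"({joined})" if len(terms) > 1 else joined)
    return " AND ".join(groups)


print(build_search_query({"keywords": ["llm"], "categories": ["cs.AI", "cs.IR"]}))
# prints: all:llm AND (cat:cs.AI OR cat:cs.IR)
```

The exact string the client sends may differ in quoting or parenthesization; the sketch only mirrors the logic described above.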
If the provided simple query logic is insufficient, the Query object takes a self-built query string through the custom_params attribute. You do not need to URL encode this value.
See arXiv Query Construction for more information on building your own queries.
custom = f"cat:{Category.CS_AI.value} ANDNOT cat:{Category.CS_RO.value}"
Query(keywords=["paged attention", "attention window"], custom_params=custom)
# Query(
# keywords=['paged attention', 'attention window'],
# title_keywords=[],
# author_names=[],
# categories=[],
# abstract_keywords=[],
# comment_keywords=[],
# article_ids=[],
# custom_params='cat:cs.AI ANDNOT cat:cs.RO',
# sort_criterion=SortCriterion(sort_by=<SortBy.LAST_UPDATED_DATE: 'lastUpdatedDate'>, sort_order=<SortOrder.DESC: 'descending'>),
# start=0,
# max_results=10
# )
Results in the following query logic:
("paged attention" OR "attention window") in any field AND (cs.AI AND NOT cs.RO) in the categories
Equivalent query string:
(all:"paged attention" OR all:"attention window") AND (cat:cs.AI ANDNOT cat:cs.RO)
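To show why a raw query string like this needs no manual encoding, here is a minimal sketch of how such a string could be turned into a request URL with the standard library. It assumes the public arXiv API endpoint and its `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters; `build_request_url` is a hypothetical helper, and the client may well construct its requests differently.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public arXiv API query endpoint (assumed here; the client may use another base URL).
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_request_url(search_query: str, start: int = 0, max_results: int = 10) -> str:
    """Percent-encode the raw query string and paging/sort parameters into a URL."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
        "sortBy": "lastUpdatedDate",
        "sortOrder": "descending",
    }
    # urlencode handles spaces, quotes, colons, and parentheses.
    return f"{ARXIV_API_URL}?{urlencode(params)}"


raw = '(all:"paged attention" OR all:"attention window") AND (cat:cs.AI ANDNOT cat:cs.RO)'
url = build_request_url(raw)
print(url)

# Decoding the URL recovers the original, unencoded query string.
decoded = parse_qs(urlsplit(url).query)["search_query"][0]
assert decoded == raw
```

Because the encoding step happens once, at request time, passing an already-encoded string would likely double-encode it, which is presumably why the raw form is expected.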
The arXiv search API is unreliable, especially for large queries.
The API sometimes returns incomplete results, or no entries at all, even though the response itself is valid. See this GitHub issue for discussion on the topic.
If you are encountering this problem, some things that may help include:
100 seems to have a relatively high success rate

Retries often help with the issue, but are sometimes insufficient. If you need more reliable access to large query results, consider looking into the arXiv Bulk Data Access options.
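A retry wrapper for the empty-response case might look like the sketch below. It is not part of arxiv_client: `fetch_with_retries` is a hypothetical helper that takes any zero-argument callable returning a list, and the default 3-second pause is an assumption based on arXiv's general request-rate guidance.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def fetch_with_retries(
    fetch: Callable[[], List[T]],
    max_attempts: int = 3,
    delay_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Call fetch() until it returns a non-empty list or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    results: List[T] = []
    for attempt in range(1, max_attempts + 1):
        results = fetch()
        if results:
            return results
        if attempt < max_attempts:
            sleep(delay_seconds)  # pause before asking the API again
    # Still empty after every attempt: the query may genuinely have no matches.
    return results


# Simulated flaky source: empty twice, then a real result.
responses = iter([[], [], ["article-1", "article-2"]])
print(fetch_with_retries(lambda: next(responses), sleep=lambda _: None))
# prints: ['article-1', 'article-2']
```

With the client, the callable could be something like `lambda: list(client.search(query))`, assuming `search` returns an iterable of articles as in the usage example earlier. An empty result is not always an error, so the helper returns the empty list rather than raising.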
This uses hatch for project management.
We found that arxiv-client demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.