===============================
Signposting link parser library
===============================

Finding signposting in FAIR resources

.. image:: https://img.shields.io/pypi/v/signposting
   :target: https://pypi.org/project/signposting/
   :alt: pypi install signposting

.. image:: https://img.shields.io/pypi/pyversions/signposting
   :target: https://pypi.org/project/signposting/
   :alt: Python

.. image:: https://img.shields.io/github/license/stain/signposting
   :target: https://www.apache.org/licenses/LICENSE-2.0
   :alt: Apache License v.2.0

.. image:: https://github.com/stain/signposting/workflows/Tests/badge.svg?branch=main
   :target: https://github.com/stain/signposting/actions?workflow=Tests
   :alt: Test Status

.. image:: https://github.com/stain/signposting/workflows/Package%20Build/badge.svg?branch=main
   :target: https://github.com/stain/signposting/actions?workflow=Package%20Build
   :alt: Package Build

.. image:: https://codecov.io/gh/stain/signposting/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/stain/signposting
   :alt: Codecov

.. image:: https://img.shields.io/readthedocs/signposting/latest?label=Read%20the%20Docs
   :target: https://signposting.readthedocs.io/en/latest/index.html
   :alt: Read the Docs

.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.6815412.svg
   :target: https://doi.org/10.5281/zenodo.6815412
   :alt: DOI 10.5281/zenodo.6815412

Summary
=======

This library helps clients discover links that follow the
`signposting`_ conventions, most notably `FAIR Signposting`_.

This can then be used to navigate between:

- Persistent identifiers
- HTML landing pages
- File downloads/items
- Structured metadata

Method
======

The library works by inspecting the HTTP messages for ``Link`` headers
from a given URI with ``find_signposting_http``, which categorizes them
by their ``rel`` `Link relation`_ into a ``Signposting`` object with
absolute URIs.
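
To illustrate the categorization step, the sketch below parses a ``Link`` header value using only the standard library and groups absolute target URIs by ``rel``. It is not this library's implementation: ``categorize_links``, the regular expressions and the example URLs are assumptions made for illustration, and quoted parameter values that contain commas or semicolons are not handled.

```python
import re
from collections import defaultdict
from urllib.parse import urljoin

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# One parameter, with or without surrounding double quotes.
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";,]*)"?')


def categorize_links(context_url, link_header):
    """Group the targets of a Link header by relation, as absolute URIs.

    Hypothetical helper for illustration only.
    """
    by_rel = defaultdict(set)
    for target, raw_params in LINK_RE.findall(link_header):
        params = {name.lower(): value for name, value in PARAM_RE.findall(raw_params)}
        # Relative targets are resolved against the URI that was requested.
        absolute = urljoin(context_url, target)
        # A single rel parameter may carry several space-separated relations.
        for rel in params.get("rel", "").split():
            by_rel[rel.lower()].add(absolute)
    return dict(by_rel)


# Made-up header value, as a landing page might send it.
header = (
    '<https://orcid.org/0000-0002-1825-0097>; rel="author", '
    '</meta.jsonld>; rel="describedby"; type="application/ld+json", '
    '<https://doi.org/10.1234/example>; rel="cite-as"'
)
links = categorize_links("https://example.org/page/7", header)
for rel, targets in sorted(links.items()):
    print(rel, sorted(targets))
```

Sets are used here because the order of links with the same relation is rarely significant.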

It is up to the clients of this library to decide how to further
navigate or retrieve the associated resources, e.g. using an
RDF library like `rdflib`_ or retrieving resources using `urllib`_.
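
As a rough sketch of such a follow-up step, a client could prepare a ``urllib`` request for a discovered metadata URI and ask for the media type advertised by the link. The helper ``build_metadata_request`` and the example URL are hypothetical; the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_metadata_request(url, media_type=None):
    """Prepare (but do not send) a GET request for a discovered URI.

    Hypothetical helper: media_type, if given, is sent as the Accept header.
    """
    headers = {"Accept": media_type} if media_type else {}
    return Request(url, headers=headers, method="GET")


# Made-up metadata URI, e.g. taken from a describedby link.
req = build_metadata_request("https://example.org/meta.jsonld", "application/ld+json")
print(req.get_method(), req.full_url, req.get_header("Accept"))
# urllib.request.urlopen(req) would perform the actual retrieval.
```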

Future versions of this library may also provide ways to discover
FAIR signposting in HTML ``<link>`` annotations and in
`linkset`_ documents.
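
To give an idea of what HTML-based discovery involves, here is a minimal standard-library sketch that collects ``<link>`` elements from a page and resolves their ``href`` against the page URI. It is independent of this library's API; the class name ``LinkCollector`` and the sample markup are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (rel, absolute href, type) tuples from <link> elements.

    Hypothetical helper for illustration only.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # The rel attribute may list several space-separated relations.
        for rel in (attr.get("rel") or "").split():
            self.links.append(
                (rel.lower(), urljoin(self.base_url, href), attr.get("type"))
            )


# Made-up landing page markup.
page = (
    '<html><head>'
    '<link rel="cite-as" href="https://doi.org/10.1234/example">'
    '<link rel="describedby" type="application/ld+json" href="meta.jsonld" />'
    '</head><body></body></html>'
)
collector = LinkCollector("https://example.org/page/7")
collector.feed(page)
for link in collector.links:
    print(link)
```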

Motivation
==========

`FAIR Signposting`_ has been proposed as a mechanism for automated clients to find
metadata and persistent identifiers for FAIR data residing in repositories that follow
the traditional PID-to-landing-page metaphor.
This avoids the need for client guesswork with content-negotiation, and allows structured
metadata to be provided by the repository rather than just PID providers like DataCite.

The main idea of FAIR Signposting is to re-use the existing HTTP mechanism for links, using
existing relations like ``describedby``, ``cite-as`` and ``item``.
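
Seen from the repository side, this amounts to sending ordinary ``Link`` headers with the landing page. The sketch below formats such a header value from a list of links; ``format_link_header`` and all URLs are made up for illustration and do not come from this library.

```python
def format_link_header(links):
    """Serialize (target, rel, media_type) tuples into a Link header value.

    Hypothetical helper; media_type may be None when no type is advertised.
    """
    parts = []
    for target, rel, media_type in links:
        value = f'<{target}>; rel="{rel}"'
        if media_type:
            value += f'; type="{media_type}"'
        parts.append(value)
    return ", ".join(parts)


# Made-up links a landing page might advertise.
header_value = format_link_header([
    ("https://doi.org/10.1234/example", "cite-as", None),
    ("https://example.org/meta.jsonld", "describedby", "application/ld+json"),
    ("https://example.org/files/data.csv", "item", "text/csv"),
])
print("Link:", header_value)
```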

The aim of this library is to help such clients find and consume FAIR resources
for further processing. It is out of scope for this code to handle parsing of the
structured metadata files.

Copyright and license
=====================

© Copyright 2022 The University of Manchester, UK.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
SPDX-License-Identifier: Apache-2.0

See the `authors`_ page for a full list of contributors.

How to use this repository
==========================

The `documentation`_ pages explain briefly how to use this library, including a listing of modules and methods.

Issues and Discussions
======================

As with any GitHub-based project, raise an `issue`_ if you find a bug or have a suggestion, or open a `discussion`_ if you want to discuss or talk :-)

Version
=======

v0.9.9

.. _GitHub Actions: https://github.com/features/actions
.. _PyPI: https://pypi.org
.. _bump2version: https://github.com/c4urself/bump2version
.. _discussion: https://github.com/stain/signposting/discussions
.. _documentation: https://signposting.readthedocs.io/
.. _issue: https://github.com/stain/signposting/issues
.. _main branch: https://github.com/stain/signposting/tree/main
.. _pdb-tools: https://github.com/haddocking/pdb-tools
.. _project's documentation: https://signposting.readthedocs.io/en/latest/index.html
.. _pytest: https://docs.pytest.org/en/stable/git
.. _test.pypi.org: https://test.pypi.org
.. _ReadTheDocs: https://readthedocs.org/
.. _signposting: https://signposting.org/conventions/
.. _FAIR Signposting: https://signposting.org/FAIR/
.. _Link Relation: https://www.iana.org/assignments/link-relations/
.. _rdflib: https://rdflib.readthedocs.io/en/stable/
.. _urllib: https://docs.python.org/3/library/urllib.html
.. _linkset: https://signposting.org/FAIR/#linksetrec
.. _authors: https://signposting.readthedocs.io/en/latest/authors.html

Changelog
=========

v0.9.9 (2024-03-23)
- CLI outputs Describes if present
v0.9.8 (2024-03-23)
- CLI by default now looks up all HTTP and HTML links combined
- Added CLI options ``--http`` ``--html`` ``--linkset`` ``--distinct`` to control the above
- Added CLI option ``--any-context`` to also report links from other contexts
- Added experimental parsing of URI-based link relations (--extensions)
- Added link relation rel="describes"
v0.9.7 (2024-02-25)
- Experimental CLI support for HTTP and HTML links added
v0.9.6 (2024-01-08)
v0.9.5 (2024-01-08)
- Updated changelog and codemeta contributors
v0.9.4 (2023-11-16)
v0.9.3 (2023-09-21)
- Fix error in text (contributed by Vincent Emonet)
v0.9.2 (2023-07-03)
- Added ``CITATION.cff``
- Allow ``#fragments`` in profile URIs (e.g. for JSON-LD)
v0.9.1 (2022-10-03)
- Added ``codemeta.json`` and contributors
v0.9.0 (2022-10-03)
- Deprecated ``Signposting.context_url``, use ``Signposting.context`` instead
- Removed deprecated ``find_signposting`` method, use ``find_signposting_http_link``
v0.8.3 (2022-10-02)
- Fix typos, unused imports, missing docstring, and other small issues (contributed by Bruno P. Kinoshita)
v0.8.2 (2022-09-29)
- Improved code coverage
- ``Signposting.linksets`` now included in iteration
v0.8.1 (2022-09-29)
- Documentation markup fixes.
- Indicated Development Status raised to Alpha
v0.8.0 (2022-09-29)
- Added ``warn_duplicate`` option to ``Signposting`` constructor
- str() on ``Signposting`` now includes ``Link`` from other contexts
- ``Signposting`` added support for ``+`` (add) and ``|`` (merge) operations
- Added ``Signpost`` and ``Signposting`` support for ``==`` and ``hash()``
- str() on ``Signpost`` correctly shows context as ``anchor=``
- Added ``Signpost.with_context`` to change a signpost's ``for_context``
v0.7.3 (2022-09-29)
- Prototyped operators for ``Signposting`` and ``Signpost``
- Revised API documentation and cross-links
- Further code coverage by tests
v0.7.2 (2022-09-26)
- Prototyped Signpost/Signposting support for ``==`` and ``hash()``
- Prototyped ``Signpost.with_context`` to change a signpost's ``for_context``
v0.7.1 (2022-08-22)
v0.7.0 (2022-08-20)
- Support multiple contexts in ``Signposting``; users of ``find_signposting_linkset`` should take particular care to look up using ``for_context``
- RFC7231 update: Don't resolve context according to ``Content-Location`` header
v0.6.1 (2022-08-19)
- ``find_signposting_linkset`` listed in module
v0.6.0 (2022-08-14)
- Linkset parsing exposed as ``find_signposting_linkset``
- Optional explicit content negotiation for linksets
- Integration tests for linksets using a2a-fair-metrics benchmarks
v0.5.2 (2022-08-14)
- Handle missing Content-Type header
v0.5.1 (2022-08-14)
- Unit tests compatible with Python 3.7
v0.5.0 (2022-08-13)
- Add experimental RFC9264 linkset parsing (text and json)
v0.4.0 (2022-08-13)
- Deprecated ``find_signposting``, renamed to ``find_signposting_http_link``
- More unit tests for ``signposting.htmllinks``
v0.3.3 (2022-08-12)
v0.3.2 (2022-08-12)
- Unit tests for ``signposting.htmllinks``
v0.3.1 (2022-08-11)
- Refactor ``signposting.htmllinks`` module
v0.3.0 (2022-08-09)
- Expose ``find_signposting_html`` in public API
v0.2.6 (2022-08-09)
- Improved type safety in ``htmllinks``
v0.2.5 (2022-08-08)
- Further documentation improvements
- Initial HTML parsing of elements (import ``signposting.htmllinks`` for now)
- Added str/repr for ``Signposting`` and ``Signpost`` classes. ``str(s)`` returns HTTP link headers.
- Added ``Signposting.signposts`` property
- ``Signposting`` is now iterable
v0.2.4 (2022-07-08)
- Documentation improvements
v0.2.3 (2022-07-08)
v0.2.2 (2022-06-07)
- Tidy up ``__init__.py`` public API
v0.2.1 (2022-06-05)
- API Change: Refactored to new ``Signposting`` classes to avoid exposing the ``ParsedLink`` implementation.
- Note: ``Signposting`` attributes like ``.authors`` are now sets to indicate order is not (very) important.
- Removed rdflib dependency
v0.1.3 (2022-05-17)
- Hide draft implementation for now
v0.1.2 (2022-05-17)
- Draft implementation of ``Signposting`` classes
v0.1.1 (2022-04-13)
v0.1.0 (2022-04-13)
v0.0.15 (2022-04-13)
- Documentation improvements
v0.0.14 (2022-04-13)
- Documentation improvements
v0.0.13 (2022-04-13)
- Documentation improvements
v0.0.12 (2022-04-13)
v0.0.11 (2022-04-13)
v0.0.10 (2022-04-12)
v0.0.9 (2022-04-11)
- Documented changelog for old versions
v0.0.8 (2022-04-11)
v0.0.7 (2022-04-11)
- Command line tool functional
v0.0.6 (2022-04-11)
- Initial draft of command line tool
v0.0.5 (2022-04-10)
v0.0.4 (2022-04-06)
- API Documentation drafted
- ``find_landing_page`` renamed ``find_signposting_http``
v0.0.3 (2022-04-06)
- README updates
- More tests until a2a-fair-metrics test #17
v0.0.2 (2022-04-06)
- Initial HTTP Link header parsing
v0.0.1 (2022-04-01)
- Generated from joaomcteixeira/python-project-skeleton