acdh-transkribus-utils

Version 2.11 (PyPI)

A Python package providing utility functions for interacting with the Transkribus API.

Installation

pip install acdh-transkribus-utils

Usage

Authentication

Set your Transkribus credentials as environment variables:

export TRANSKRIBUS_USER=some@mail.com
export TRANSKRIBUS_PASSWORD=verysecret

Alternatively, create a file called env.secret modeled on env.dummy and run source export_env_variables.sh. You can also pass your credentials as parameters, e.g.:

import os

from transkribus_utils.transkribus_utils import ACDHTranskribusUtils


tr_user = os.environ.get("TRANSKRIBUS_USER")
tr_pw = os.environ.get("TRANSKRIBUS_PASSWORD")

client = ACDHTranskribusUtils(user=tr_user, password=tr_pw)
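If either variable is unset, os.environ.get returns None, and the problem may only surface later as a failed login. The following is a minimal sketch of an early check; the helper name load_transkribus_credentials is hypothetical and not part of acdh-transkribus-utils.

```python
import os


def load_transkribus_credentials(env=None):
    """Return (user, password) from the environment, or raise if either is missing.

    Hypothetical helper, not part of the package. `env` defaults to os.environ
    and can be replaced by a plain dict for testing.
    """
    env = os.environ if env is None else env
    user = env.get("TRANSKRIBUS_USER")
    password = env.get("TRANSKRIBUS_PASSWORD")
    missing = [
        name
        for name, value in (
            ("TRANSKRIBUS_USER", user),
            ("TRANSKRIBUS_PASSWORD", password),
        )
        if not value  # treats both unset and empty values as missing
    ]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return user, password


# Intended use (left commented out so the sketch has no side effects):
# tr_user, tr_pw = load_transkribus_credentials()
# client = ACDHTranskribusUtils(user=tr_user, password=tr_pw)
```

Failing early with the name of the missing variable tends to be easier to debug than an authentication error from the API.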

List all collections

collections = client.list_collections()
for x in collections[-7:]:
    print(x["colId"], x["colName"])

# 188933 bv-play
# 188991 Kasten_blau_45_11
# 190357 acdh-transkribus-utils
# 193145 palm
# 195363 Österreichische Bundesverfassung: Datenset A
# 196428 Österreichische Bundesverfassung: Datenset B
# 196429 Österreichische Bundesverfassung: Datenset C
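Since the result is used above as a list of dicts with colId and colName keys, it can be filtered with plain Python. The sketch below looks up collections by name; find_collections is a hypothetical helper, and the sample data only mimics the two keys shown in the output above.

```python
def find_collections(collections, name_part):
    """Return the collections whose colName contains name_part (case-insensitive)."""
    needle = name_part.casefold()
    return [c for c in collections if needle in c.get("colName", "").casefold()]


# Sample data shaped like the entries printed above; a real response may
# carry more keys, which this helper ignores.
sample = [
    {"colId": 190357, "colName": "acdh-transkribus-utils"},
    {"colId": 193145, "colName": "palm"},
    {"colId": 195363, "colName": "Österreichische Bundesverfassung: Datenset A"},
]

for c in find_collections(sample, "bundesverfassung"):
    print(c["colId"], c["colName"])
```

With a live client, the same helper could be called as find_collections(client.list_collections(), "bundesverfassung").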

List all documents from a given collection

col_id = 142911
documents = client.list_docs(col_id)
n = -3
for x in documents[n:]:
    print(x["docId"], x["title"], x["author"], x["nrOfPages"])

# 950920 Kasten_blau_44_9_0050 Pfalz-Neuburg, Eleonore Magdalena Theresia von 1
# 950921 Kasten_blau_44_9_0037 Pfalz, Johann Wilhelm Joseph Janaz von der 4
# 950922 Kasten_blau_44_9_0239 Pfalz, Johann Wilhelm Joseph Janaz von der 1
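The document entries can be aggregated in the same way. As a small sketch, the hypothetical helper total_pages below sums nrOfPages over a list of documents; the sample data reuses the three documents printed above and assumes nrOfPages is an integer or an integer-like string.

```python
def total_pages(documents):
    """Sum the nrOfPages value over a list of document dicts."""
    return sum(int(d.get("nrOfPages", 0)) for d in documents)


# Sample data taken from the output above (author omitted for brevity).
sample_docs = [
    {"docId": 950920, "title": "Kasten_blau_44_9_0050", "nrOfPages": 1},
    {"docId": 950921, "title": "Kasten_blau_44_9_0037", "nrOfPages": 4},
    {"docId": 950922, "title": "Kasten_blau_44_9_0239", "nrOfPages": 1},
]

print(total_pages(sample_docs))  # 6
```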


Download METS files from a collection

from transkribus_utils.transkribus_utils import ACDHTranskribusUtils

COL_ID = 51052
# no arguments: credentials are expected in the environment variables described under Authentication
client = ACDHTranskribusUtils()
client.collection_to_mets(COL_ID)
# downloads a METS file for each document in the given collection into the folder `./{COL_ID}`

client.collection_to_mets(COL_ID, file_path='./foo')
# downloads a METS file for each document in the given collection into the folder `./foo/{COL_ID}`

client.collection_to_mets(COL_ID, filter_by_doc_ids=[230161, 230155])
# downloads METS files only for the documents with IDs 230161 and 230155 into the folder `./{COL_ID}`
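To check what a download produced, the target folder can be inspected with pathlib. The sketch below assumes only the folder layout described in the comments above (a subfolder named after the collection ID inside file_path, or inside the current directory by default); it makes no assumption about file names. list_downloaded_files is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def list_downloaded_files(col_id, file_path="."):
    """Return the sorted names of the files in {file_path}/{col_id}.

    Returns an empty list if the folder does not exist yet.
    """
    target = Path(file_path) / str(col_id)
    if not target.is_dir():
        return []
    return sorted(p.name for p in target.iterdir() if p.is_file())


# Intended use after client.collection_to_mets(COL_ID, file_path="./foo"):
# print(list_downloaded_files(COL_ID, file_path="./foo"))
```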
