apd - pypi Package Compare versions

Comparing version 0.6.0 to 0.7.0

.gitlab-ci.yml (+3 -3)
###############################################################################
# (c) Copyright 2021 CERN for the benefit of the LHCb Collaboration #
# (c) Copyright 2021-2023 CERN for the benefit of the LHCb Collaboration #
# #

@@ -15,3 +15,3 @@ # This software is distributed under the terms of the GNU General Public #

image: registry.cern.ch/docker.io/library/python:3.9
image: registry.cern.ch/docker.io/library/python:3.10

@@ -62,3 +62,3 @@ .setup_environment:

- tags@lhcb-dpa/analysis-productions/apd
image: registry.cern.ch/docker.io/library/python:3.9
image: registry.cern.ch/docker.io/library/python:3.10
before_script:

@@ -65,0 +65,0 @@ - pip install build twine

@@ -34,2 +34,10 @@ # See https://pre-commit.com for more information

- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.3.0
hooks:
- id: mypy
files: src
additional_dependencies: [types-requests]
args: [--show-error-codes]
- repo: "https://gitlab.cern.ch/lhcb-core/LbDevTools.git"

@@ -36,0 +44,0 @@ rev: 2.0.38

Metadata-Version: 2.1
Name: apd
Version: 0.6.0
Version: 0.7.0
Summary: Tool to access the Analysis production Data

@@ -10,5 +10,17 @@ License: BSD-3-Clause

Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.9
License-File: LICENSE
Requires-Dist: requests
Requires-Dist: click
Requires-Dist: click-log
Requires-Dist: rich
Provides-Extra: testing
License-File: LICENSE
Requires-Dist: black; extra == "testing"
Requires-Dist: flake8; extra == "testing"
Requires-Dist: flake8-bugbear; extra == "testing"
Requires-Dist: pylint; extra == "testing"
Requires-Dist: pytest; extra == "testing"
Requires-Dist: pytest-cov; extra == "testing"
Requires-Dist: responses; extra == "testing"

@@ -18,3 +30,5 @@ Analysis Production Data

*EXPERIMENTAL* prototype of Python module to fulfill DPA's grand task https://gitlab.cern.ch/lhcb-dpa/project/-/issues/134.
Programmatic interface to the LHCb experiment Analysis Productions database,
which allows retrieving information about the samples produced.
It queries a REST endpoint provided by the web application, and caches the data locally.

@@ -24,3 +38,3 @@ Usage

The `apd` package is available in the ``lb-conda default`` environment.
The ``apd`` Python package is available in the ``lb-conda default`` environment.

@@ -27,0 +41,0 @@ From Python

Analysis Production Data
========================
*EXPERIMENTAL* prototype of Python module to fulfill DPA's grand task https://gitlab.cern.ch/lhcb-dpa/project/-/issues/134.
Programmatic interface to the LHCb experiment Analysis Productions database,
which allows retrieving information about the samples produced.
It queries a REST endpoint provided by the web application, and caches the data locally.

@@ -9,3 +11,3 @@ Usage

The `apd` package is available in the ``lb-conda default`` environment.
The ``apd`` Python package is available in the ``lb-conda default`` environment.

@@ -12,0 +14,0 @@ From Python

@@ -11,2 +11,3 @@ [metadata]

Programming Language :: Python :: 3.10
Programming Language :: Python :: 3.11

@@ -43,2 +44,3 @@ [options]

apd-list-pfns = apd.command:cmd_list_pfns
apd-list-lfns = apd.command:cmd_list_lfns
apd-list-samples = apd.command:cmd_list_samples

@@ -45,0 +47,0 @@ apd-summary = apd.command:cmd_summary

@@ -5,2 +5,3 @@ [console_scripts]

apd-dump-info = apd.command:cmd_dump_info
apd-list-lfns = apd.command:cmd_list_lfns
apd-list-pfns = apd.command:cmd_list_pfns

@@ -7,0 +8,0 @@ apd-list-samples = apd.command:cmd_list_samples

Metadata-Version: 2.1
Name: apd
Version: 0.6.0
Version: 0.7.0
Summary: Tool to access the Analysis production Data

@@ -10,5 +10,17 @@ License: BSD-3-Clause

Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.9
License-File: LICENSE
Requires-Dist: requests
Requires-Dist: click
Requires-Dist: click-log
Requires-Dist: rich
Provides-Extra: testing
License-File: LICENSE
Requires-Dist: black; extra == "testing"
Requires-Dist: flake8; extra == "testing"
Requires-Dist: flake8-bugbear; extra == "testing"
Requires-Dist: pylint; extra == "testing"
Requires-Dist: pytest; extra == "testing"
Requires-Dist: pytest-cov; extra == "testing"
Requires-Dist: responses; extra == "testing"

@@ -18,3 +30,5 @@ Analysis Production Data

*EXPERIMENTAL* prototype of Python module to fulfill DPA's grand task https://gitlab.cern.ch/lhcb-dpa/project/-/issues/134.
Programmatic interface to the LHCb experiment Analysis Productions database,
which allows retrieving information about the samples produced.
It queries a REST endpoint provided by the web application, and caches the data locally.

@@ -24,3 +38,3 @@ Usage

The `apd` package is available in the ``lb-conda default`` environment.
The ``apd`` Python package is available in the ``lb-conda default`` environment.

@@ -27,0 +41,0 @@ From Python

###############################################################################
# (c) Copyright 2021 CERN for the benefit of the LHCb Collaboration #
# (c) Copyright 2021-2023 CERN for the benefit of the LHCb Collaboration #
# #

@@ -27,5 +27,6 @@ # This software is distributed under the terms of the GNU General Public #

"authw",
"ApdReturnType",
]
from .analysis_data import AnalysisData, get_analysis_data
from .analysis_data import AnalysisData, ApdReturnType, get_analysis_data
from .ap_info import (

@@ -32,0 +33,0 @@ SampleCollection,

@@ -23,3 +23,5 @@ ###############################################################################

import os
from enum import Enum
from pathlib import Path
from typing import Any, Optional

@@ -45,2 +47,8 @@ from apd.ap_info import (

class ApdReturnType(Enum):
PFN = 0
LFN = 1
SAMPLE = 2
def _load_and_setup_cache(

@@ -229,3 +237,3 @@ cache_dir, working_group, analysis, ap_date=None, api_url="https://lbap.app.cern.ch"

# Map contains AnalysisData objects already loaded
__analysis_map = {}
__analysis_map: dict[str, Any] = {}

@@ -341,3 +349,3 @@

*,
return_pfns=True,
return_type=ApdReturnType.PFN,
check_data=True,

@@ -350,4 +358,4 @@ use_local_cache=True,

"""Main method that returns the dataset info.
The normal behaviour is to return the PFNs for the samples but setting
return_pfns to false returns the SampleCollection"""
The normal behaviour is to return the PFNs for the samples, but setting
return_type to ApdReturnType.SAMPLE returns the SampleCollection"""

@@ -398,9 +406,14 @@ # Establishing the list of samples to run on

if return_pfns:
if use_local_cache:
return self._transform_pfns(samples.PFNs())
return samples.PFNs()
if return_type == ApdReturnType.SAMPLE:
return samples
return samples
if return_type == ApdReturnType.LFN:
print("Returning lfns")
return samples.LFNs()
# by default we return the PFns
if use_local_cache:
return self._transform_pfns(samples.PFNs())
return samples.PFNs()
def _transform_pfns(self, pfns):
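
The hunk above replaces the boolean return_pfns flag with a three-valued return_type. A minimal, self-contained sketch of that dispatch follows. It is not the package's code: FakeSamples and select_output are hypothetical names used only for illustration, and the local-cache branch (_transform_pfns) is left out.

```python
from enum import Enum


class ApdReturnType(Enum):
    """Same members as the enum added in 0.7.0."""

    PFN = 0
    LFN = 1
    SAMPLE = 2


class FakeSamples:
    """Hypothetical stand-in for SampleCollection (illustration only)."""

    def __init__(self, pfns, lfns):
        self._pfns = pfns
        self._lfns = lfns

    def PFNs(self):
        return list(self._pfns)

    def LFNs(self):
        return list(self._lfns)


def select_output(samples, return_type=ApdReturnType.PFN):
    """Pick what to hand back, in the same order as the diff's checks."""
    if return_type == ApdReturnType.SAMPLE:
        return samples
    if return_type == ApdReturnType.LFN:
        return samples.LFNs()
    # Any other value falls through to the default: PFNs.
    return samples.PFNs()
```

Callers that previously passed return_pfns=False would, under this scheme, pass return_type=ApdReturnType.SAMPLE, as the cmd_list_samples hunk further down does.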

@@ -422,3 +435,3 @@ """Method to return PFNs, useful as it can be overriden in inheriting classes"""

def summary(self, tags: list = None) -> dict:
def summary(self, tags: Optional[list] = None) -> dict:
"""Prepares a summary of the Analysis Production info."""

@@ -425,0 +438,0 @@

@@ -374,2 +374,9 @@ ###############################################################################

def LFNs(self):
"""Collects the LFNs"""
lfns = []
for sample in self.info:
lfns += sample["lfns"].keys()
return lfns
def byte_count(self):
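
The new LFNs method flattens the keys of each sample's "lfns" mapping into one list. The sketch below reproduces that loop as a free function on made-up data. The function name collect_lfns and the sample records are assumptions; only the "lfns" key and the use of its dictionary keys come from the diff.

```python
def collect_lfns(info):
    """Gather the LFNs (the keys of each sample's "lfns" dict) into one list."""
    lfns = []
    for sample in info:
        # list += any iterable; dict keys keep their insertion order
        lfns += sample["lfns"].keys()
    return lfns


# Hypothetical sample records: the values stored per LFN are not shown in
# the diff, so empty lists are used as placeholders here.
example_info = [
    {"lfns": {"/lhcb/MC/file_1.root": [], "/lhcb/MC/file_2.root": []}},
    {"lfns": {"/lhcb/MC/file_3.root": []}},
]
all_lfns = collect_lfns(example_info)
```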

@@ -376,0 +383,0 @@ """Collects the number of files from all the samples"""

###############################################################################
# (c) Copyright 2022 CERN for the benefit of the LHCb Collaboration #
# (c) Copyright 2022-2023 CERN for the benefit of the LHCb Collaboration #
# #

@@ -24,3 +24,2 @@ # This software is distributed under the terms of the GNU General Public #

from pathlib import Path
from typing import NoReturn

@@ -43,3 +42,3 @@ import requests

def _login_gitlab_jwt(tokens_file: Path) -> NoReturn:
def _login_gitlab_jwt(tokens_file: Path):
r = requests.get(

@@ -55,3 +54,3 @@ f"{LBAPI_BASE_URL}/gitlab/credentials/",

def _login_sso(token_file: Path) -> NoReturn:
def _login_sso(token_file: Path):
"""Login to the CERN SSO and obtain a user token."""

@@ -73,3 +72,3 @@ token_response = device_authorization_login(SSO_CLIENT_ID)

def _write_request(r: requests.Response, path: Path) -> NoReturn:
def _write_request(r: requests.Response, path: Path):
"""Write the response (i.e. the token) as a read-only file on disk"""

@@ -89,3 +88,3 @@ # Ensure the data is valid JSON

def _auth_ok(token) -> NoReturn:
def _auth_ok(token):
r = requests.get(

@@ -99,3 +98,3 @@ f"{LBAPI_BASE_URL}/user/",

def get_auth_headers() -> dict[str, str]:
def get_auth_headers() -> dict[str, dict[str, str]]:
if "LBAP_TOKENS_FILE" in os.environ:

@@ -102,0 +101,0 @@ tokens_file = Path(os.environ["LBAP_TOKENS_FILE"])

###############################################################################
# (c) Copyright 2021 CERN for the benefit of the LHCb Collaboration #
# (c) Copyright 2021-203 CERN for the benefit of the LHCb Collaboration #
# #

@@ -20,7 +20,12 @@ # This software is distributed under the terms of the GNU General Public #

import click
import click_log
import click # type: ignore[import]
import click_log # type: ignore[import]
import requests
from .analysis_data import APD_DATA_CACHE_DIR, APD_METADATA_CACHE_DIR, get_analysis_data
from .analysis_data import (
APD_DATA_CACHE_DIR,
APD_METADATA_CACHE_DIR,
ApdReturnType,
get_analysis_data,
)
from .ap_info import cache_ap_info

@@ -228,2 +233,63 @@ from .authentication import get_auth_headers, logout

@common_docstr()
def cmd_list_lfns(
working_group,
analysis,
cache_dir,
tag,
value,
eventtype,
datatype,
polarity,
config,
name,
version,
date,
):
"""List the LFNs for the analysis, matching the tags specified.
This command checks that the arguments are not ambiguous."""
# Loading the data and filtering/displaying
datasets = get_analysis_data(
working_group, analysis, metadata_cache=cache_dir, ap_date=date
)
filter_tags = _process_common_tags(
eventtype, datatype, polarity, config, name, version
)
filter_tags |= dict(zip(tag, value))
for f in datasets(**filter_tags, return_type=ApdReturnType.LFN):
click.echo(f)
@click.command()
@click.argument("working_group")
@click.argument("analysis")
@click.option(
"--cache_dir",
default=os.environ.get(APD_METADATA_CACHE_DIR, None),
help="Specify location of the cache for the analysis metadata",
)
@click.option("--tag", default=None, help="Tag to filter datasets", multiple=True)
@click.option(
"--value",
default=None,
help="Tag value used if the name is specified",
multiple=True,
)
@click.option(
"--eventtype", default=None, help="eventtype to filter the datasets", multiple=True
)
@click.option(
"--datatype", default=None, help="datatype to filter the datasets", multiple=True
)
@click.option(
"--polarity", default=None, help="polarity to filter the datasets", multiple=True
)
@click.option(
"--config", default=None, help="Config to use (e.g. lhcb or mc)", multiple=True
)
@click.option("--name", default=None, help="dataset name")
@click.option("--version", default=None, help="dataset version")
@click.option("--date", default=None, help="analysis date in ISO 8601 format")
@click_log.simple_verbosity_option(logger)
@common_docstr()
def cmd_list_samples(
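
In cmd_list_lfns, the repeated --tag and --value options are paired positionally and merged into the filter dictionary with filter_tags |= dict(zip(tag, value)). A small sketch of that merge, assuming Python 3.9 or later for the |= operator on dicts (the package declares Requires-Python: >=3.9). The helper name merge_filter_tags is hypothetical.

```python
def merge_filter_tags(base, tag, value):
    """Return base updated with the (tag, value) pairs, matched by position."""
    merged = dict(base)  # copy so the caller's dict is not modified
    # zip stops at the shorter sequence, so an unmatched tag or value is dropped
    merged |= dict(zip(tag, value))
    return merged


filters = merge_filter_tags(
    {"polarity": "magdown"},
    ("eventtype", "datatype"),
    ("11164047", "2011"),
)
```

Because zip truncates silently, a --tag given without a matching --value would simply be ignored by this merge; whether the real command validates that elsewhere is not visible in the diff.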

@@ -254,3 +320,5 @@ working_group,

filter_tags |= dict(zip(tag, value))
matching = datasets(check_data=False, return_pfns=False, **filter_tags)
matching = datasets(
check_data=False, return_type=ApdReturnType.SAMPLE, **filter_tags
)
click.echo(matching)

@@ -257,0 +325,0 @@

###############################################################################
# (c) Copyright 2022 CERN for the benefit of the LHCb Collaboration #
# (c) Copyright 2022-2023 CERN for the benefit of the LHCb Collaboration #
# #

@@ -22,3 +22,3 @@ # This software is distributed under the terms of the GNU General Public #

if not hasattr(_find_suitable_token, "tokens"):
_find_suitable_token.tokens = json.loads(
_find_suitable_token.tokens = json.loads( # type: ignore[attr-defined]
Path(os.environ["LBAP_TOKENS_FILE"]).read_text()

@@ -28,3 +28,3 @@ )

path = PurePosixPath(str(path)[1:])
for eos_token in _find_suitable_token.tokens["eos_tokens"]:
for eos_token in _find_suitable_token.tokens["eos_tokens"]: # type: ignore[attr-defined]
if allow_write and not eos_token["allow_write"]:

@@ -40,3 +40,3 @@ continue

msg += "Available tokens:\n"
for eos_token in _find_suitable_token.tokens["eos_tokens"]:
for eos_token in _find_suitable_token.tokens["eos_tokens"]: # type: ignore[attr-defined]
msg += f" * {eos_token['path']}"

@@ -51,10 +51,10 @@ if not eos_token["allow_write"]:

original_url = url
url = urlparse.urlparse(url)
url = urlparse.urlparse(url) # type: ignore[assignment]
# skip the files not on root if requested
if ignore_nonroot:
if not url.scheme or url.scheme == "file":
if not url.scheme or url.scheme == "file": # type: ignore[attr-defined]
return original_url
token = _find_suitable_token(PurePosixPath(url.path), allow_write)
token = _find_suitable_token(PurePosixPath(url.path), allow_write) # type: ignore[attr-defined]
token = urlparse.unquote(token)

@@ -61,0 +61,0 @@ url_parts = list(url)

###############################################################################
# (c) Copyright 2021 CERN for the benefit of the LHCb Collaboration #
# (c) Copyright 2021-2023 CERN for the benefit of the LHCb Collaboration #
# #

@@ -12,5 +12,5 @@ # This software is distributed under the terms of the GNU General Public #

from rich.console import Console
from rich.console import Console # type: ignore[import]
console = Console(stderr=True)
error_console = Console(stderr=True, style="bold red")

@@ -14,9 +14,9 @@ ###############################################################################

# isort: off
try:
from snakemake.remote.XRootD import (
RemoteProvider as XRootDRemoteProvider, # pylint: disable=import-error
)
from snakemake.remote.XRootD import RemoteProvider as XRootDRemoteProvider # type: ignore[import]
except Exception as exc:
raise Exception("apd.snakemake requires snakemake to be available") from exc
from .analysis_data import ApdReturnType
from .analysis_data import get_analysis_data as std_get_analysis_data

@@ -73,4 +73,4 @@ from .eos import auth as std_auth

results = self.analysisData(*args, **kwargs)
return_pfns = kwargs.get("return_pfns", True)
if not return_pfns:
return_type = kwargs.get("return_type", ApdReturnType.PFN)
if return_type != ApdReturnType.PFN:
return results

@@ -77,0 +77,0 @@ return [remote(f) for f in results]

@@ -83,10 +83,10 @@ ###############################################################################

def test_sample_check_load_dataset_error(apinfo_multipleversions):
datasets = AnalysisData(
"SL",
"RDs",
metadata_cache=apinfo_multipleversions,
datatype=["2012", "2018"],
polarity="magup",
)
with pytest.raises(ValueError):
datasets = AnalysisData(
"SL",
"RDs",
metadata_cache=apinfo_multipleversions,
datatype=["2012", "2018"],
polarity="magup",
)
datasets()

@@ -93,0 +93,0 @@

###############################################################################
# (c) Copyright 2021 CERN for the benefit of the LHCb Collaboration #
# (c) Copyright 2021-2023 CERN for the benefit of the LHCb Collaboration #
# #

@@ -14,3 +14,3 @@ # This software is distributed under the terms of the GNU General Public #

from apd import AnalysisData
from apd import AnalysisData, ApdReturnType

@@ -193,2 +193,3 @@

def test_unknown_tag_value(apd_cache):
"""CHeck that we throw a ValueError when a value does not exist for a given tag."""
datasets = AnalysisData("b2oc", "b02dkpi")

@@ -203,1 +204,17 @@

)
def test_lfn(apd_cache):
"""Check that the method to return the LFNs is functional."""
datasets = AnalysisData("b2oc", "b02dkpi")
lfns = datasets(
datatype="2011",
eventtype="11164047",
polarity="magdown",
return_type=ApdReturnType.LFN,
)
assert (
len(lfns) == 5
and all("00128098_0000" in x for x in lfns)
and all(x.startswith("/lhcb") for x in lfns)
)