circuit-maintenance-parser
Python library to parse Circuit Maintenance notifications and return structured data
circuit-maintenance-parser is a Python library that parses circuit maintenance notifications from Network Service Providers (NSPs), converting heterogeneous formats to a well-defined structured format.
Every network depends on external circuits provided by NSPs who interconnect them to the Internet, to office branches or to external service providers such as Public Clouds.
Obviously, these services occasionally require operation windows to upgrade or to fix related issues, and usually, they happen in the form of circuit maintenance periods. NSPs generally notify customers of these upcoming events so that customers can take actions to minimize the impact on the regular usage of the related circuits.
The challenge faced by many customers is that almost every NSP defines its own maintenance notification format, even though the relevant information is mostly the same across NSPs. This library is built to parse notification formats from several providers and to always return the same object structure, which makes it easier to process them afterwards.
The format of this output follows the BCOP defined during a NANOG meeting that aimed to promote the usage of the iCalendar format. If the NSP uses the proposed iCalendar format, parsing is straightforward and there is no need to define custom logic. For providers that do not follow this proposed practice, the library supports custom parsing that produces the same outcome.
You can leverage this library in your automation framework to process circuit maintenance notifications and use the standardized Maintenance model to handle your received circuit maintenance notifications in a simple way. Every Maintenance object contains the following attributes:
- provider, account, maintenance_id, circuits, start, end, stamp, organizer, summary
- status¹
- sequence: defaults to 1 for most non-iCalendar parsed notifications.²
- uid: defaults to 0 for most non-iCalendar parsed notifications.
Please refer to the BCOP for more details about the standardized meaning of these attributes.
¹ Per the BCOP, the status (X-MAINTNOTE-STATUS) is an optional field in iCalendar notifications. However, a Maintenance object will always contain a status value; in the case where an iCalendar notification omits this field, the status will be set to "NO-CHANGE", and it's up to the consumer of this library to determine how to appropriately handle this case. Parsers of other notification formats are responsible for setting an appropriate value for this field based on the notification contents, and may or may not include "NO-CHANGE" as one of the possible reported values.
² Per the BCOP, the sequence is a mandatory field in iCalendar notifications. However, some NSPs have been seen to send notifications which, while otherwise consistent with the BCOP, omit the SEQUENCE field; in such cases, this library will report a sequence number of -1.
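A consumer might treat the two sentinel values from the footnotes as in the following minimal sketch. MaintenanceStub is a hypothetical stand-in that carries only the status and sequence fields, and the "UNKNOWN" fallback is an assumption made for illustration; neither is part of the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MaintenanceStub:
    """Hypothetical stand-in with only the two fields discussed in the footnotes."""

    status: str
    sequence: int


def effective_status(current: MaintenanceStub, previous_status: Optional[str]) -> str:
    """Resolve "NO-CHANGE" against the last status seen for the same maintenance."""
    if current.status == "NO-CHANGE":
        # "UNKNOWN" is an illustrative placeholder chosen for this sketch.
        return previous_status if previous_status is not None else "UNKNOWN"
    return current.status


def has_reliable_sequence(current: MaintenanceStub) -> bool:
    """A sequence of -1 means the notification omitted the SEQUENCE field."""
    return current.sequence >= 0


update = MaintenanceStub(status="NO-CHANGE", sequence=-1)
print(effective_status(update, previous_status="CONFIRMED"))
print(has_reliable_sequence(update))
```

When the sequence is not reliable, ordering updates by the stamp attribute is one possible fallback.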
1. Instantiate a Provider, directly or via the init_provider method, which returns the corresponding instance depending on the selected type.
2. Create an instance of the NotificationData class. This instance groups together DataParts, each of which contains some content and a specific type (that will match a specific Parser). For example, a NotificationData might describe a received email message, with DataParts corresponding to the subject line and body of the email. There are factory methods to initialize a NotificationData describing a single chunk of binary data, as well as others to initialize one directly from a raw email message or email.message.EmailMessage instance.
3. Each Provider uses one or more Processors that will be used to build Maintenances when the Provider.get_maintenances(data) method is called.
4. Each Processor class uses one or more Parsers to process each type of data that it handles. It can have custom logic to combine the parsed data from multiple Parsers to create the final Maintenance object.
5. Each Parser class supports one or a set of related data types, and implements the Parser.parse() method used to retrieve a Dict with the relevant keys/values.
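The relationship between these classes can be illustrated with a few toy classes. Everything in this sketch (ToySubjectParser, ToyBodyParser, ToyProcessor, ToyProvider, the type tags, and the "first processor that yields data wins" rule) is a hypothetical simplification, not the library's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataPart:
    """One chunk of a notification: a type tag plus raw content."""

    type: str
    content: bytes


class ToySubjectParser:
    """Hypothetical parser that only handles 'email-subject' parts."""

    data_types = ["email-subject"]

    def parse(self, raw: bytes) -> Dict:
        return {"summary": raw.decode().strip()}


class ToyBodyParser:
    """Hypothetical parser that reads 'key: value' lines from a text body."""

    data_types = ["text/plain"]

    def parse(self, raw: bytes) -> Dict:
        result = {}
        for line in raw.decode().splitlines():
            key, sep, value = line.partition(":")
            if sep:
                result[key.strip()] = value.strip()
        return result


class ToyProcessor:
    """Runs each parser on the parts whose type it supports and merges the results."""

    def __init__(self, parsers: List) -> None:
        self.parsers = parsers

    def process(self, parts: List[DataPart]) -> Dict:
        merged: Dict = {}
        for parser in self.parsers:
            for part in parts:
                if part.type in parser.data_types:
                    merged.update(parser.parse(part.content))
        return merged


class ToyProvider:
    """Tries its processors in order (an assumption made for this sketch)."""

    def __init__(self, processors: List[ToyProcessor]) -> None:
        self.processors = processors

    def get_maintenances(self, parts: List[DataPart]) -> List[Dict]:
        for processor in self.processors:
            result = processor.process(parts)
            if result:
                return [result]
        return []


parts = [
    DataPart("email-subject", b"Planned work\n"),
    DataPart("text/plain", b"maintenance_id: ABC-1\nstatus: CONFIRMED"),
]
provider = ToyProvider([ToyProcessor([ToySubjectParser(), ToyBodyParser()])])
print(provider.get_maintenances(parts))
```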
By default, there is a GenericProvider that supports a SimpleProcessor using the standard ICal Parser, being the easiest path to start using the library in case the provider uses the reference iCalendar standard.
(*) Providers in both lists, with BCOP standard and nonstandard parsers.
Note: Because these providers do not support the BCOP standard natively, there may be gaps in the implemented parsers that will be refined with new test cases. We encourage you to report related issues!
The library supports an optional parser option leveraging Large Language Models (LLM) to provide best-effort parsing when the specific parsers have not been successful.
Warning: Some of these integrations, such as OpenAI, require extra installation parameters. Check the extras section.
When the appropriate environment variable(s) are set (see below), these LLM parsers are automatically appended after all existing processors for each defined Provider.
These integrations may involve some costs for API usage. Use them carefully! As an order of magnitude, parsing one email with the OpenAI gpt-3.5-turbo model costs $0.004.
These are the currently supported LLM integrations:
- PARSER_LLM_QUESTION_STR (Optional): question to overwrite the default one. Change it carefully. It has precedence over PARSER_LLM_QUESTION_FILEPATH.
- PARSER_LLM_QUESTION_FILEPATH (Optional): a path to a file that contains a question to overwrite the default one.
OpenAI, these are the supported ENVs:
- PARSER_OPENAI_API_KEY (Required): OpenAI API Key.
- PARSER_OPENAI_MODEL (Optional): The LLM model to use, defaults to "gpt-3.5-turbo".
Each Maintenance comes with a metadata attribute to provide information about the provider used and the process and parsers used in the successful parsing of the maintenance.
This information is relevant to validate the actual content of the Maintenance, because it may have been generated using an LLM-powered parser, which means that the confidence level is lower than with a pre-defined parser. You can check the Metadata.generated_by_llm boolean to find out.
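The precedence between the environment variables listed above can be pictured with the following sketch. The functions resolve_question and resolve_openai_model and the DEFAULT_QUESTION placeholder are hypothetical names introduced here; the library's own lookup code may differ.

```python
import os
from typing import Mapping

# Placeholder for this sketch only; the library ships its own default question.
DEFAULT_QUESTION = "placeholder default question"


def resolve_question(env: Mapping[str, str] = os.environ) -> str:
    """Pick the LLM question: the inline string wins over the file path."""
    question = env.get("PARSER_LLM_QUESTION_STR")
    if question:
        return question
    path = env.get("PARSER_LLM_QUESTION_FILEPATH")
    if path:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    return DEFAULT_QUESTION


def resolve_openai_model(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured model, falling back to the documented default."""
    return env.get("PARSER_OPENAI_MODEL", "gpt-3.5-turbo")


example_env = {
    "PARSER_LLM_QUESTION_STR": "Summarize the maintenance window.",
    "PARSER_LLM_QUESTION_FILEPATH": "/path/that/is/never/opened",
}
print(resolve_question(example_env))
print(resolve_openai_model(example_env))
```

Passing the environment as a mapping keeps the sketch easy to exercise without touching the real process environment.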
The library is available as a Python package on PyPI and can be installed with pip:
pip install circuit-maintenance-parser
pip install circuit-maintenance-parser[openai]
The library requires two things:
- notificationdata: this is the data that the library will check to extract the maintenance notifications. It can be simple (only one data type and content, such as an iCalendar notification) or more complex (with multiple data parts of different types, such as from an email).
- provider identifier: used to select the proper Provider, which contains the processor logic to take the proper Parsers and use the data that they extract. By default, the GenericProvider (used when no other provider type is defined) will support parsing of iCalendar notifications using the recommended format.
The first step is to define the Provider that we will use to parse the notifications. As mentioned, there is a GenericProvider that implements the gold standard format and can be reused for any notification matching the expectations.
from circuit_maintenance_parser import init_provider
generic_provider = init_provider()
type(generic_provider)
<class 'circuit_maintenance_parser.provider.GenericProvider'>
However, some Providers don't fully implement the standard: some information may be missing (for example, the organizer email), or custom logic may be required to combine information. For these cases, we allow custom Providers:
ntt_provider = init_provider("ntt")
type(ntt_provider)
<class 'circuit_maintenance_parser.provider.NTT'>
Once we have the Provider ready, we need to initialize the data to process. We call it NotificationData, and it can be initialized from a simple content and type or from more complex structures, such as an email.
from circuit_maintenance_parser import NotificationData
raw_data = b"""BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Maint Note//https://github.com/maint-notification//
BEGIN:VEVENT
SUMMARY:Maint Note Example
DTSTART;VALUE=DATE-TIME:20151010T080000Z
DTEND;VALUE=DATE-TIME:20151010T100000Z
DTSTAMP;VALUE=DATE-TIME:20151010T001000Z
UID:42
SEQUENCE:1
X-MAINTNOTE-PROVIDER:example.com
X-MAINTNOTE-ACCOUNT:137.035999173
X-MAINTNOTE-MAINTENANCE-ID:WorkOrder-31415
X-MAINTNOTE-IMPACT:OUTAGE
X-MAINTNOTE-OBJECT-ID;X-MAINTNOTE-OBJECT-IMPACT=NO-IMPACT:acme-widgets-as-a-service
X-MAINTNOTE-OBJECT-ID;X-MAINTNOTE-OBJECT-IMPACT=OUTAGE:acme-widgets-as-a-service-2
X-MAINTNOTE-STATUS:TENTATIVE
ORGANIZER;CN="Example NOC":mailto:noone@example.com
END:VEVENT
END:VCALENDAR
"""
data_to_process = NotificationData.init_from_raw("ical", raw_data)
type(data_to_process)
<class 'circuit_maintenance_parser.data.NotificationData'>
Finally, we retrieve the maintenances (as a List, because a notification can contain multiple maintenances) from the data by calling the get_maintenances method of the Provider instance:
maintenances = generic_provider.get_maintenances(data_to_process)
print(maintenances[0].to_json())
{
"account": "137.035999173",
"circuits": [
{
"circuit_id": "acme-widgets-as-a-service",
"impact": "NO-IMPACT"
},
{
"circuit_id": "acme-widgets-as-a-service-2",
"impact": "OUTAGE"
}
],
"end": 1444471200,
"maintenance_id": "WorkOrder-31415",
"organizer": "mailto:noone@example.com",
"provider": "example.com",
"sequence": 1,
"stamp": 1444435800,
"start": 1444464000,
"status": "TENTATIVE",
"summary": "Maint Note Example",
"uid": "42"
}
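The start, end and stamp values in this output are Unix epoch seconds. As a small illustration, independent of the library, they can be converted to timezone-aware datetimes with the standard library. The maintenance_json string below copies only the three timestamp fields from the output above.

```python
import json
from datetime import datetime, timezone

# Only the timestamp fields from the example output are reproduced here.
maintenance_json = '{"start": 1444464000, "end": 1444471200, "stamp": 1444435800}'


def to_utc(epoch_seconds: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


fields = json.loads(maintenance_json)
start = to_utc(fields["start"])
end = to_utc(fields["end"])

print(start.isoformat())  # 2015-10-10T08:00:00+00:00
print(end - start)        # 2:00:00
```

The converted values match the DTSTART and DTEND entries of the iCalendar input.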
Notice that, with either the GenericProvider or the NTT provider, we get the same result from the same data, because they are using exactly the same Processor and Parser. The only difference is that NTT notifications come without organizer and provider in the notification, and this information is filled in with default values for the Provider. In this case, the original notification contains all the necessary information, so the defaults are not used.
ntt_maintenances = ntt_provider.get_maintenances(data_to_process)
assert ntt_maintenances == maintenances
Every maintenance contains the metadata attribute to understand how it has been parsed:
print(maintenances[0].metadata)
provider='genericprovider' processor="SimpleProcessor" parsers=["ICal"], generated_by_llm=False
There is also a CLI entry point, circuit-maintenance-parser, which offers easy access to the library using a few arguments:
- data-file: file storing the notification.
- data-type: ical, html or email, depending on the data type.
- provider-type: to choose the right Provider. If empty, the GenericProvider is used.
circuit-maintenance-parser --data-file "/tmp/___ZAYO TTN-00000000 Planned MAINTENANCE NOTIFICATION___.eml" --data-type email --provider-type zayo
Circuit Maintenance Notification #0
{
"account": "some account",
"circuits": [
{
"circuit_id": "/OGYX/000000/ /ZYO /",
"impact": "OUTAGE"
}
],
"end": 1601035200,
"maintenance_id": "TTN-00000000",
"organizer": "mr@zayo.com",
"provider": "zayo",
"sequence": 1,
"stamp": 1599436800,
"start": 1601017200,
"status": "CONFIRMED",
"summary": "Zayo will implement planned maintenance to troubleshoot and restore degraded span",
"uid": "0"
}
Even though the library aims to include support for as many providers as possible, it's likely that not all of the thousands of NSPs are supported, and you may need to add support for a new one. Adding a new Provider is quite straightforward, and in the following example we are adding support for an imaginary provider, ABCDE, that uses HTML notifications.
The first step is creating a new file: circuit_maintenance_parser/parsers/abcde.py. This file will contain all the custom parsers needed for the provider, and it will import the base classes for each parser type from circuit_maintenance_parser.parser. In the example, we only need to import Html and, in the child class, implement the methods required by the class, in this case parse_html(), which will return a dict with all the data that this Parser can extract. In this case, we have two helper methods, _parse_bs and _parse_tables, that implement the logic to navigate the notification data.
from typing import Dict
import bs4 # type: ignore
from bs4.element import ResultSet # type: ignore
from circuit_maintenance_parser.parser import Html

class HtmlParserABCDE1(Html):
    def parse_html(self, soup: ResultSet) -> Dict:
        data = {}
        self._parse_bs(soup.find_all("b"), data)
        self._parse_tables(soup.find_all("table"), data)
        return [data]

    def _parse_bs(self, btags: ResultSet, data: Dict):
        ...

    def _parse_tables(self, tables: ResultSet, data: Dict):
        ...
The next step is to create the new Provider by defining a new class in circuit_maintenance_parser/provider.py. This class, which inherits from GenericProvider, only needs to define two attributes:
- _processors: a list of Processor instances that use several data Parsers. In this example, we don't need to create a new custom Processor because the combined logic serves well (the most likely case), and we only need to use the newly defined HtmlParserABCDE1 and also the generic EmailDateParser that extracts the email date. Also notice that you could have multiple Processors with different Parsers in this list, supporting several formats.
- _default_organizer: a default helper to fill the organizer attribute in the Maintenance if the information is not part of the original notification.
class ABCDE(GenericProvider):
    _processors: List[GenericProcessor] = [
        CombinedProcessor(data_parsers=[EmailDateParser, HtmlParserABCDE1]),
    ]
    _default_organizer = "noc@abcde.com"
And expose the new Provider in circuit_maintenance_parser/__init__.py:
from .provider import (
    GenericProvider,
    ABCDE,
    ...
)

SUPPORTED_PROVIDERS = (
    GenericProvider,
    ABCDE,
    ...
)
Last, but not least, you should update the tests!
- Test the new Parser in tests/unit/test_parsers.py
- Test the new Provider logic in tests/unit/test_e2e.py
... adding the necessary data samples in tests/unit/data/abcde/.
You can anonymize your IPv4 and IPv6 addresses using the invoke anonymize-ips --local command. Keep in mind that only the IPv4 addresses reserved for documentation purposes (RFC5737: "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24") are preserved, in case you need to check these IPs in your test output (unlikely).
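The idea of preserving only the RFC5737 documentation ranges can be sketched as follows. This is not the implementation behind the invoke task: the function anonymize_ipv4, the regular expression, and the fixed PLACEHOLDER replacement are assumptions made for illustration, and the sketch covers IPv4 only.

```python
import ipaddress
import re

# RFC5737 documentation ranges, which are left untouched.
DOCUMENTATION_NETS = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24")
]
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Hypothetical replacement value chosen for this sketch.
PLACEHOLDER = "192.0.2.1"


def anonymize_ipv4(text: str) -> str:
    """Replace every IPv4 address outside the documentation ranges."""

    def replace(match: re.Match) -> str:
        candidate = match.group(0)
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            # Looks like an address but is not valid (for example 999.1.1.1).
            return candidate
        if any(address in net for net in DOCUMENTATION_NETS):
            return candidate
        return PLACEHOLDER

    return IPV4_PATTERN.sub(replace, text)


print(anonymize_ipv4("peer 8.8.8.8 and 198.51.100.7"))
```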
Pull requests are welcomed and automatically built and tested against multiple versions of Python through Travis CI.
The project is following Network to Code software development guidelines and is leveraging:
- Install poetry
- Install the dependencies and the library locally: poetry install
- Run the CI tests locally: invoke tests --local
1. Define the Parsers (inheriting from some of the generic Parsers or a new one) that will extract the data from the notification, which could contain multiple DataParts. The data_type of the Parser and the DataPart have to match. The custom Parsers will be placed in the parsers folder.
2. Update unit/test_parsers.py with the new parsers, providing some data to test and validate the extracted data.
3. Define a new Provider inheriting from the GenericProvider, defining the Processors and the respective Parsers to be used. Maybe you can reuse some of the generic Processors, or maybe you will need to create a custom one. If this is the case, place it in the processors folder.
   - The Provider also supports the definition of an _include_filter and an _exclude_filter to limit the notifications that are actually processed, avoiding false positive errors for notifications that are not relevant.
4. Update unit/test_e2e.py with the new provider, providing some data to test and validate the final Maintenances created.
5. Expose the new Provider class by updating the map SUPPORTED_PROVIDERS in circuit_maintenance_parser/__init__.py to officially expose the Provider.
6. Run the tests with pytest --log-cli-level=DEBUG --capture=tee-sys. You can narrow down the tests that you want to execute with the -k flag. If successful, your results should look similar to the following:
-> % pytest --log-cli-level=DEBUG --capture=tee-sys -k test_parsers
...omitted debug logs...
====================================================== 99 passed, 174 deselected, 17 warnings in 10.35s ======================================================
Then run the linting and formatting checks locally with invoke tests --local:
-> % invoke tests --local
LOCAL - Running command black --check --diff .
All done! ✨ 🍰 ✨
41 files would be left unchanged.
LOCAL - Running command flake8 .
LOCAL - Running command find . -name "*.py" | xargs pylint
************* Module tasks
tasks.py:4:0: W0402: Uses of a deprecated module 'distutils.util' (deprecated-module)
--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
- poetry install updates the library and its dependencies locally.
- circuit-maintenance-parser is now built with your recent local changes.
If you were to add loggers or debuggers to one of the classes:
class HtmlParserZayo1(Html):
    def parse_bs(self, btags: ResultSet, data: dict):
        """Parse B tag."""
        raise Exception('Debugging exception')
After running poetry install:
-> % circuit-maintenance-parser --data-file ~/Downloads/zayo.eml --data-type email --provider-type zayo
Provider processing failed: Failed creating Maintenance notification for Zayo.
Details:
- Processor CombinedProcessor from Zayo failed due to: Debugging exception
Note: invoke build will result in an error due to no Dockerfile. This is expected as the library runs simple pytest testing without a container.
-> % invoke build
Building image circuit-maintenance-parser:2.2.2-py3.8
#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 2B done
#1 DONE 0.0s
WARNING: failed to get git remote url: fatal: No remote configured to list refs from.
ERROR: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to read dockerfile: open /var/lib/docker/tmp/buildkit-mount1243547759/Dockerfile: no such file or directory
For any questions or comments, please check the FAQ first and feel free to swing by the Network to Code Slack (channel #networktocode).
This library uses a Basic World Cities Database by Pareto Software, LLC, the owner of Simplemaps.com: The Provider offers a Basic World Cities Database free of charge. This database is licensed under the Creative Commons Attribution 4.0 license as described at: https://creativecommons.org/licenses/by/4.0/.