Several packages provide handlers for the standard library logging module that send application logs to Graylog over TCP, UDP, or HTTP (py-gelf is a good example). These can be useful, but it is not ideal to make an application's performance depend on network requests just to deliver logs.
Alternatively, one can simply log to a file or stdout and have a collector (like Fluentd) process and send those logs asynchronously to a remote server (and not just to Graylog, as GELF can be used as a generic log format). This is a common pattern for containerized applications. In a scenario like this, all we need is a GELF logging formatter.
logging.LogRecord attributes can be included as additional fields.
Install with pip:
$ pip install gelf-formatter
Or from a source checkout:
$ python setup.py install
Simply create a gelfformatter.GelfFormatter instance and pass it as an argument to logging.Handler.setFormatter:
import sys
import logging
from gelfformatter import GelfFormatter
formatter = GelfFormatter()
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(formatter)
Apply it globally with logging.basicConfig to automatically format log records from third-party packages as well:
logging.basicConfig(level=logging.DEBUG, handlers=[handler])
Alternatively, you can configure a local logging.Logger instance through logging.Logger.addHandler:
logger = logging.getLogger('my-app')
logger.addHandler(handler)
That's it. You can now use the logging module as usual; all records will be formatted as GELF messages.
The formatter will output all (non-deprecated) fields described in the GELF Payload Specification (version 1.1):
version: String, always set to 1.1;
host: String, the output of socket.gethostname at initialization;
short_message: String, log record message;
full_message (optional): String, formatted exception traceback (if any);
timestamp: Number, time in seconds since the epoch as a floating point;
level: Integer, syslog severity level.
None of these fields can be ignored, renamed or overridden.
logging.info("Some message")
{"version":"1.1","host":"my-server","short_message":"Some message","timestamp":1557342545.1067393,"level":6}
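To make the field list above concrete, here is a minimal sketch of how a formatter could assemble those fields using only the standard library. It is not the gelfformatter implementation: the class name SketchGelfFormatter, the SYSLOG_LEVELS table, and the fallback for custom levels are assumptions made for illustration.

```python
import json
import logging
import socket

# Syslog severities for the standard logging levels (assumed mapping for this sketch).
SYSLOG_LEVELS = {
    logging.DEBUG: 7,
    logging.INFO: 6,
    logging.WARNING: 4,
    logging.ERROR: 3,
    logging.CRITICAL: 2,
}


class SketchGelfFormatter(logging.Formatter):
    """Illustrative only; not the gelfformatter implementation."""

    def __init__(self):
        super().__init__()
        # Resolved once, mirroring "at initialization" in the field list.
        self.host = socket.gethostname()

    def format(self, record):
        payload = {
            "version": "1.1",
            "host": self.host,
            "short_message": record.getMessage(),
            "timestamp": record.created,
            # Custom levels fall back to informational (6) in this sketch.
            "level": SYSLOG_LEVELS.get(record.levelno, 6),
        }
        # full_message is optional and only present when an exception was logged.
        if record.exc_info:
            payload["full_message"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

Attaching an instance with handler.setFormatter(SketchGelfFormatter()) would produce one JSON object per log record, which is the shape a collector such as Fluentd can forward.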
The full_message field is used to store the traceback of exceptions. You just need to log them with logging.exception.
import urllib.error
import urllib.request
req = urllib.request.Request('http://www.pythonnn.org')
try:
    urllib.request.urlopen(req)
except urllib.error.URLError as e:
    logging.exception(e.reason)
{"version": "1.1", "short_message": "[Errno -2] Name or service not known", "timestamp": 1557342714.0695107, "level": 3, "host": "my-server", "full_message": "Traceback (most recent call last):\n ...(truncated)... raise URLError(err)\nurllib.error.URLError: <urlopen error [Errno -2] Name or service not known>"}
The GELF specification allows arbitrary additional fields, with keys prefixed with an underscore.
To include additional fields, use the standard logging extra keyword. Keys will be automatically prefixed with an underscore (if not already prefixed).
logging.info("request received", extra={"path": "/orders/1", "method": "GET"})
{"version": "1.1", "short_message": "request received", "timestamp": 1557343604.5892842, "level": 6, "host": "my-server", "_path": "/orders/1", "_method": "GET"}
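The prefixing rule described above can be sketched in a few lines. The helper name to_additional_fields is hypothetical and this is not the library's code; it only illustrates that keys which already start with an underscore are left as they are.

```python
def to_additional_fields(extra):
    """Prefix each key with an underscore unless it already has one."""
    return {
        key if key.startswith("_") else f"_{key}": value
        for key, value in extra.items()
    }


fields = to_additional_fields({"path": "/orders/1", "_method": "GET"})
print(fields)  # {'_path': '/orders/1', '_method': 'GET'}
```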
By default the formatter ignores all logging.LogRecord attributes. You can however opt to include them as additional fields. This can be used to display useful information like the current module, filename, line number, etc.
To do so, simply pass a list of LogRecord attribute names as the value of the allowed_reserved_attrs keyword when initializing a GelfFormatter. You can also modify the allowed_reserved_attrs instance variable of an already initialized formatter.
attrs = ["lineno", "module", "filename"]
formatter = GelfFormatter(allowed_reserved_attrs=attrs)
# or
formatter.allowed_reserved_attrs = attrs
logging.debug("starting application...")
{"version": "1.1", "short_message": "starting application...", "timestamp": 1557346554.989846, "level": 7, "host": "my-server", "_lineno": 175, "_module": "myapp", "_filename": "app.py"}
You can optionally customize the name of these additional fields using a logging.Filter (see below).
Similarly, you can choose to ignore additional attributes passed via the extra keyword argument. This can be useful, for example, to avoid logging keys named secret or password.
To do so, pass a list of names to the ignored_attrs keyword when initializing a GelfFormatter. You can also modify the ignored_attrs instance variable of an already initialized formatter.
Be aware that nested fields will still be printed: ignored_attrs only filters keys at the root level of extra.
attrs = ["secret", "password"]
formatter = GelfFormatter(ignored_attrs=attrs)
# or
formatter.ignored_attrs = attrs
logging.debug("app config", extra={"connection": "local", "secret": "verySecret!", "mysql": {"user": "test", "password": "will_be_logged"}})
{"version": "1.1", "short_message": "app config", "timestamp": 1557346554.989846, "level": 7, "host": "my-server", "_connection": "local", "_mysql": {"user": "test", "password": "will_be_logged"}}
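Since nested keys pass through untouched, one possible way to close that gap is a standard logging.Filter that masks sensitive keys at every nesting level before the formatter sees the record. This is a sketch under assumptions, not a feature of gelfformatter: SENSITIVE_KEYS, redact and RedactNestedFilter are hypothetical names, and the "***" mask is an arbitrary choice.

```python
import logging

SENSITIVE_KEYS = {"secret", "password"}  # hypothetical list for this sketch


def redact(value):
    """Return a copy of value with sensitive keys masked at every nesting level."""
    if isinstance(value, dict):
        return {
            k: "***" if k in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(v) for v in value]
    return value


class RedactNestedFilter(logging.Filter):
    """Masks sensitive keys inside dict-valued record attributes."""

    def filter(self, record):
        for name, value in list(vars(record).items()):
            # Leave record.args alone so message formatting is not affected.
            if name != "args" and isinstance(value, dict):
                setattr(record, name, redact(value))
        return True
```

Attached with handler.addFilter(RedactNestedFilter()), this would complement ignored_attrs: the formatter option drops root-level keys, while the filter masks the nested ones.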
Having the ability to define a set of additional fields once and have them included in all log messages can be useful to avoid repetitive extra key/value pairs and enable contextual logging.
Python's logging module provides several options to add context to a logger, among which we highlight the logging.LoggerAdapter and logging.Filter.
Between these we recommend a logging.Filter, which is simpler and can be attached directly to a logging.Handler. A logging.Filter can therefore be used locally (on a logging.Logger) or globally (through logging.basicConfig). If you opt for a LoggerAdapter you'll need a logging.Logger to wrap.
You can also use a logging.Filter to reuse/rename any of the reserved logging.LogRecord attributes.
import os
class ContextFilter(logging.Filter):
    def filter(self, record):
        # Add any number of arbitrary additional fields
        record.app = "my-app"
        record.app_version = "1.2.3"
        record.environment = os.environ.get("APP_ENV")
        # Reuse any reserved `logging.LogRecord` attributes
        record.file = record.filename
        record.line = record.lineno
        return True
formatter = GelfFormatter()
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(formatter)
handler.addFilter(ContextFilter())
logging.basicConfig(level=logging.DEBUG, handlers=[handler])
logging.info("hi", extra=dict(foo="bar"))
{"version": "1.1", "short_message": "hi", "timestamp": 1557431642.189755, "level": 6, "host": "my-server", "_foo": "bar", "_app": "my-app", "_app_version": "1.2.3", "_environment": "development", "_file": "app.py", "_line": 159}
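For comparison, the LoggerAdapter route mentioned earlier could look like the sketch below. It uses only the standard library; the ListHandler class is a hypothetical helper that collects records in memory so the attached context can be inspected without a formatter. Note that in the default LoggerAdapter.process, the adapter's dict takes the place of any extra passed at the call site, which is one reason a logging.Filter tends to be the simpler option.

```python
import logging


class ListHandler(logging.Handler):
    """Collects records in memory so the attached context can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("my-app")
logger.setLevel(logging.INFO)
capture = ListHandler()
logger.addHandler(capture)

# Context defined once and attached to every record logged through the adapter.
adapter = logging.LoggerAdapter(logger, {"app": "my-app", "app_version": "1.2.3"})
adapter.info("hi")

record = capture.records[0]
print(record.app, record.app_version)  # my-app 1.2.3
```

The context arrives as plain record attributes, the same way values passed through extra do, so a formatter that emits extra values as additional fields should pick them up as well.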
Looking for a GELF log pretty-printer? If so, have a look at gelf-pretty.
This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please refer to our contributing guide for further information.