giving is a simple, magical library that lets you log or "give" arbitrary data throughout a program and then process it as an event stream. You can use it to log to the terminal, to wandb or mlflow, to compute minimums, maximums, rolling means, etc., separate from your program's core logic.
give() every object or datum that you may want to log or compute metrics about.
Use given() and define pipelines to map, filter and reduce the data you gave.
Code | Output |
---|---|
Simple logging | |
Extract values into a list | |
Reductions (min, max, count, etc.) | |
Using the … | |
The … | |
The … | |
The above examples only show a small number of all the available operators.
There are multiple ways you can use give. give returns None unless it is given a single positional argument, in which case it returns the value of that argument.
give(key=value)
This is the most straightforward way to use give: you write out both the key and the value associated.
Returns: None
x = give(value)
When no key is given, but the result of give is assigned to a variable, the key is the name of that variable. In other words, the above is equivalent to give(x=value).
Returns: The value
give(x)
When no key is given and the result is not assigned to a variable, give(x) is equivalent to give(x=x). If the argument is an expression like x * x, the key will be the string "x * x".
Returns: The value
give(x, y, z)
Multiple arguments can be given. The above is equivalent to give(x=x, y=y, z=z).
Returns: None
x = value; give()
If give has no arguments at all, it will look at the immediately previous statement and infer what you mean. The above is equivalent to x = value; give(x=value).
Returns: None
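The return-value rule stated at the top of this section can be checked with a tiny stand-in. The sketch below is plain Python, not giving's implementation: give_sketch is a hypothetical name, and it ignores key inference entirely (the real give reads variable names and argument strings from the source). It models only what is returned.

```python
def give_sketch(*args, **kwargs):
    """Hypothetical stand-in for give(): models only the return value.

    Mirrors the rule in the text: return the argument when exactly one
    positional argument is passed, and None otherwise.
    """
    if len(args) == 1:
        return args[0]
    return None


value = 42
x = give_sketch(value)                 # like x = give(value): returns the value
assert x == 42
assert give_sketch(key=value) is None  # like give(key=value)
assert give_sketch(1, 2, 3) is None    # like give(x, y, z)
assert give_sketch() is None           # like a bare give()
```

Returning the value in the single-argument case is what makes the x = give(value) form possible: the call can be wrapped around an existing expression without changing what the assignment receives.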
gv["key"], gv["?key"]: filter based on keys
Not all operators are listed here. See here for the complete list.
kmerge
Most of these reductions can be called with the scan argument set to True to use scan instead of reduce. scan can also be set to an integer, in which case roll is used.
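The difference between the three modes can be illustrated without the library. The sketch below uses only the standard library and assumes that a rolling reduction emits one value per input, computed over at most the last n items (so the first few windows are shorter). The helper name rolling is hypothetical, and this is not how giving implements scan or roll.

```python
from collections import deque
from functools import reduce
from itertools import accumulate

data = [5, 3, 8, 9, 7, 2]

# reduce: a single value once the stream is finished
final_min = reduce(min, data)

# scan (scan=True in the text): one running value per input
running_min = list(accumulate(data, min))


def rolling(values, n, fn):
    """Apply fn to a sliding window holding at most the last n values."""
    window = deque(maxlen=n)
    results = []
    for value in values:
        window.append(value)
        results.append(fn(window))
    return results


# roll (scan=3 in the text): minimum over the last 3 values only
rolling_min = rolling(data, 3, min)

print(final_min)    # 2
print(running_min)  # [5, 3, 3, 3, 3, 2]
print(rolling_min)  # [5, 3, 3, 3, 7, 2]
```

Note how the rolling result rises back to 7 once the early small values fall out of the window, while the running minimum can never increase.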
give.wrap
breakword
.tag
, using the BREAKWORD environment variable.
Here are some ideas for using giving in a machine learning model training context:
import math

import wandb
from giving import give, given

# Model and niters are assumed to be defined elsewhere in your program.


def main():
    model = Model()

    for i in range(niters):
        # Give the model. give looks at the argument string, so
        # give(model) is equivalent to give(model=model)
        give(model)

        loss = model.step()

        # Give the iteration number and the loss (equivalent to give(i=i, loss=loss))
        give(i, loss)

    # Give the final model. The final=True key is there so we can filter on it.
    give(model, final=True)


if __name__ == "__main__":
    with given() as gv:
        # ===========================================================
        # Define our pipeline **before** running main()
        # ===========================================================

        # Filter all the lines that have the "loss" key
        # NOTE: Same as gv.filter(lambda values: "loss" in values)
        losses = gv.where("loss")

        # Print the losses on stdout
        losses.display()                 # always
        losses.throttle(1).display()     # OR: once every second
        losses.slice(step=10).display()  # OR: every 10th loss

        # Log the losses (and indexes i) with wandb
        # >> is shorthand for .subscribe()
        losses >> wandb.log

        # Print the minimum loss at the end
        losses["loss"].min().print("Minimum loss: {}")

        # Print the mean of the last 100 losses
        # * affix adds columns, so we will display i, loss and meanloss together
        # * The scan argument outputs the mean incrementally
        # * It's important that each affixed column has the same length as
        #   the losses stream (or "table")
        losses.affix(meanloss=losses["loss"].mean(scan=100)).display()

        # Store all the losses in a list
        losslist = losses["loss"].accum()

        # Set a breakpoint whenever the loss is nan or infinite
        losses["loss"].filter(lambda loss: not math.isfinite(loss)).breakpoint()

        # Filter all the lines that have the "model" key:
        models = gv.where("model")

        # Write a checkpoint of the model at most once every 30 minutes
        models["model"].throttle(30 * 60).subscribe(
            lambda model: model.checkpoint()
        )

        # Watch with wandb, but only once at the very beginning
        models["model"].first() >> wandb.watch

        # Write the final model (you could also use models.last())
        models.where(final=True)["model"].subscribe(
            lambda model: model.save()
        )

        # ===========================================================
        # Finally, execute the code. All the pipelines we defined above
        # will proceed as we give data.
        # ===========================================================
        main()
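To make the "table" picture in the comments concrete, here is a plain-Python sketch of what a few of these pipeline steps compute, applied to a list of dicts standing in for the event stream. The events list and its values are made up for illustration, the window is 2 rather than 100 to keep the output short, and none of this is giving's own code.

```python
from collections import deque

# Made-up events, shaped like what give(model) and give(i, loss) would emit
events = [
    {"model": "m"},
    {"i": 0, "loss": 4.0},
    {"model": "m"},
    {"i": 1, "loss": 2.0},
    {"model": "m"},
    {"i": 2, "loss": 3.0},
]

# Like gv.where("loss"): keep only entries that have a "loss" key
losses = [entry for entry in events if "loss" in entry]

# Like losses["loss"]: extract that key's value from each entry
loss_values = [entry["loss"] for entry in losses]

# Like losses["loss"].min(): a single value at the end
min_loss = min(loss_values)

# Like losses["loss"].mean(scan=2): rolling mean, one value per entry
window = deque(maxlen=2)
mean_losses = []
for value in loss_values:
    window.append(value)
    mean_losses.append(sum(window) / len(window))

# Like losses.affix(meanloss=...): add a column, row by row. This only
# lines up because mean_losses has exactly one value per entry in losses.
table = [dict(entry, meanloss=m) for entry, m in zip(losses, mean_losses)]

print(min_loss)   # 2.0
print(table[-1])  # {'i': 2, 'loss': 3.0, 'meanloss': 2.5}
```

The main difference from the real pipelines is timing: the lists here are built after the fact, whereas the pipelines defined under given() process each entry as it is given.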
FAQs
Reactive logging
We found that giving demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.