torchtraining-nightly
So you want to train neural nets with PyTorch? Here are your options:
- … `for` loops

Enter `torchtraining` - we try to get the best of both worlds while adding explicitness, a functional approach, easy extensions, and freedom to structure your code!
All of that using a single `**` piping operator!
See the tutorials to get a grasp of what the fuss is all about:
- … `tensorboard`.

See the documentation for the full list of extras (e.g. installation with integrations like `horovod`).

To get started, install via `pip`:

    pip install --user torchtraining
Why `torchtraining`?

There are a lot of training libraries around for a lot of frameworks. Why would you choose this one?

`torchtraining` fits you, not the other way around

We think it's impossible to squeeze a user's code into an overly strict API.
We are not trying to fit everything into a single `.fit()` method (or a `Trainer` god class; see the 40! arguments in the PyTorch-Lightning trainer).
This approach has shown time and time again that it does not work for more complicated use cases, as one cannot foresee the endless possibilities of neural network training and data generation a user might require.
`torchtraining` gives you building blocks to calculate metrics, log results, and distribute training instead.
`forward` instead of 40 methods

Implementing `forward` with a `data` argument is all you will ever need (okay, accumulators also need `calculate`, but that's it); we add a thin `__call__`.
Compare that to `PyTorch-Lightning`'s `LightningModule` (source code here):
- `training_step`
- `training_step_end`
- `training_epoch_end` (repeat all the above for `validation` and `test`)
- `validation_end`, `test_end`
- `configure_sync_batchnorm`
- `configure_ddp`
- `init_ddp_connection`
- `configure_apex`
- `configure_optimizers`
- `optimizer_step`
- `optimizer_zero_grad`
- `tbptt_split_batch` (?)
- `prepare_data`
- `train_dataloader`
- `tng_dataloader`
- `test_dataloader`
- `val_dataloader`
This list could go on (and will probably grow even bigger as time passes).
We believe in a functional approach and in using only what you need (a lot of decoupled building blocks instead of gigantic god classes trying to do everything). Once again: we can't foresee the future and won't squash everything into a single `class`.
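To make the single-`forward` idea concrete, here is a minimal sketch in plain Python. `Operation` below is a hypothetical stand-in written for illustration, not `torchtraining.Operation` itself. The `Scale` and `Mean` classes, and the way `calculate` is called by hand, are assumptions about what such building blocks could look like.

```python
class Operation:
    """Hypothetical stand-in for a building block (NOT the library class)."""

    def forward(self, data):
        raise NotImplementedError

    # Thin wrapper: calling the object simply delegates to `forward`.
    def __call__(self, data):
        return self.forward(data)


class Scale(Operation):
    """Plain operation: only `forward` is implemented."""

    def __init__(self, factor):
        self.factor = factor

    def forward(self, data):
        return [x * self.factor for x in data]


class Mean(Operation):
    """Assumed accumulator shape: `forward` collects, `calculate` summarizes."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def forward(self, data):
        self.total += sum(data)
        self.count += len(data)
        return data  # pass the data through unchanged

    def calculate(self):
        return self.total / self.count


scale = Scale(2)
mean = Mean()
for batch in ([1, 2], [3, 4]):
    mean(scale(batch))

print(mean.calculate())  # 5.0
```

Each block exposes one method that matters, so adding a new behaviour means writing one small class rather than overriding hooks on a large base class.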
You are offered building blocks, and it's up to you which ones you use. Still, you are explicit about everything going on in your code, for example:
- … `tensorboard`
- … `neural network(s)` go into what step
- … `loguru`
- … `stdout` and `file` or maybe over the web?)

See the introduction tutorial to see how it's done.
We don't think your neural network source code should be polluted with training.
We think it's better to have `data` preparation in a `data.py` module, `optimizers` in `optimizers.py`, and so on. With `torchtraining` you don't have to cram any functionality into a single god `class`.
~3000 lines of code (including `comet-ml`, `neptune` and `horovod` integration) and short functions/classes allow you to quickly dig into the source if you find something odd or not working. It leverages what exists instead of reinventing the wheel.
We don't force you to jump to and from `numpy`, as most tasks can already be done in `PyTorch`. We are `pytorch` first.
Unless we have to integrate a third-party tool... In that case you don't pay for the feature if you don't use it!
If we don't provide an integration out of the box, you can request it via `issues` or make your own `PR`. Any code you want can almost always be integrated via the following steps:
- Create a new module (e.g. `amazing.py`)
- Create classes inheriting from `torchtraining.Operation`
- Implement `forward` for each operation; it takes a single argument `data`, which can be anything (`Tuple`, `List`, `torch.Tensor`, `str`, whatever really)
- Process the data in `forward` and return results
- Pipe your operations with `**`!

Other tools integrate components by trying to squash them into their predefined APIs and/or trying to be smart and guess what the user does (which often fails). Here's how we do it:

Example of integrating `neptune` image logging:
    import torchtraining as tt


    class Image(tt.Operation):
        """Log incoming data as an image to a neptune experiment."""

        def __init__(
            self,
            experiment,
            log_name: str,
            image_name: str = None,
            description: str = None,
            timestamp=None,
        ):
            super().__init__()
            self.experiment = experiment
            self.log_name = log_name
            self.image_name = image_name
            self.description = description
            self.timestamp = timestamp

        # Always forward some data so it can be reused
        def forward(self, data):
            self.experiment.log_image(
                self.log_name, data, self.image_name, self.description, self.timestamp
            )
            return data
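How such an operation behaves inside a `**` pipeline can be sketched without `neptune` or `torchtraining` installed. Everything below is a stand-in written for illustration: `Operation`, `Pipe`, `FakeExperiment`, `LogImage` and `Double` are hypothetical names. The `__pow__` composition is an assumption about how `**` piping could be implemented, not the library's actual code.

```python
class Operation:
    """Hypothetical stand-in base class (NOT torchtraining.Operation)."""

    def forward(self, data):
        raise NotImplementedError

    def __call__(self, data):
        return self.forward(data)

    # `a ** b` builds a new operation that runs `a`, then `b`.
    def __pow__(self, other):
        return Pipe(self, other)


class Pipe(Operation):
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def forward(self, data):
        return self.second(self.first(data))


class FakeExperiment:
    """Records calls instead of sending anything to a tracking service."""

    def __init__(self):
        self.logged = []

    def log_image(self, log_name, image, image_name=None, description=None, timestamp=None):
        self.logged.append((log_name, image))


class LogImage(Operation):
    def __init__(self, experiment, log_name):
        self.experiment = experiment
        self.log_name = log_name

    def forward(self, data):
        self.experiment.log_image(self.log_name, data)
        return data  # pass data through so the next step can reuse it


class Double(Operation):
    def forward(self, data):
        return [2 * x for x in data]


experiment = FakeExperiment()
pipeline = Double() ** LogImage(experiment, "doubled") ** Double()

result = pipeline([1, 2, 3])
print(result)             # [4, 8, 12]
print(experiment.logged)  # [('doubled', [2, 4, 6])]
```

Because the logging step returns its input unchanged, it can be dropped into the middle of a pipeline without affecting what the later steps receive.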
This project is currently in its infancy and we would love to get some help from you!
You can find current ideas inside `issues` tagged with `[DISCUSSION]` (see here).
- `accelerators.py` module for distributed training
- `callbacks.py` third-party integrations (experiment handlers like `comet-ml` or `neptune`)

Also feel free to make your own feature requests and give us your thoughts in `issues`!

Remember: it's only version `0.0.1`. The direction is there, but you can be sure to encounter a lot of bugs along the way at the moment.
Why `**` as an operator?

Indeed, operators like `|`, `>>` or `>` would be way more intuitive, but:
- `>` cannot be piped as easily
- … `>>` or `|`

Currently `**` seems like a reasonable trade-off; still, it may be subject to change in the future.
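Part of the trade-off comes from how Python itself parses these operators: `**` groups right to left, `|` and `>>` group left to right, and a chain of `>` is treated as a chained comparison rather than a pipeline. The sketch below only demonstrates these language rules. `Node` is a hypothetical helper, and whether these rules were the authors' actual motivation is an assumption.

```python
class Node:
    """Hypothetical helper that records how Python groups an expression."""

    def __init__(self, name):
        self.name = name

    def _join(self, other, symbol):
        return Node(f"({self.name} {symbol} {other.name})")

    def __pow__(self, other):
        return self._join(other, "**")

    def __or__(self, other):
        return self._join(other, "|")

    def __rshift__(self, other):
        return self._join(other, ">>")

    def __gt__(self, other):
        return self._join(other, ">")


a, b, c = Node("a"), Node("b"), Node("c")

print((a ** b ** c).name)  # (a ** (b ** c))
print((a | b | c).name)    # ((a | b) | c)
print((a >> b >> c).name)  # ((a >> b) >> c)

# `a > b > c` means `(a > b) and (b > c)`: the first comparison's result
# is only tested for truthiness, so it is dropped from the final value.
print((a > b > c).name)    # (b > c)
```

The last line shows why chaining `>` is awkward for pipelines: the result of the first step never reaches the final expression.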