Command-line interface (CLI) to select lines of a text file.
create: create a dataset based on a text file
export-statistics: export statistics to a CSV
add: add subsets
remove: remove subsets
rename: rename subsets
select-all: select all lines
select-fifo: select lines FIFO-style
select-greedily: select lines greedily regarding units
select-greedily-ep: select lines greedily regarding units (epoch-based)
select-uniformly: select lines with units uniformly distributed
select-randomly: select lines randomly
filter-duplicates: filter duplicate lines
filter-by-regex: filter lines by regex
filter-by-text: filter lines by text
filter-by-weight: filter lines by weight
filter-by-vocabulary: filter lines by unit vocabulary
filter-by-count: filter lines by global unit frequencies
filter-by-unit-freq: filter lines by unit frequencies per line
filter-by-line-nr: filter lines by line number
sort-by-line-nr: sort lines by line number
sort-by-text: sort lines by text
sort-by-weight: sort lines by weights
sort-by-shuffle: shuffle lines
reverse: reverse lines
export: export lines
create-from-file: create weights from file
create-uniform: create uniform weights
create-from-count: create weights from unit count
divide: divide weights

pip install text-selection --user
usage: text-selection-cli [-h] [-v] {dataset,subsets,weights} ...
CLI to select lines of a text file.
positional arguments:
{dataset,subsets,weights} description
dataset dataset commands
subsets subsets commands
weights weights commands
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
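The feature list mentions select-greedily, which selects lines greedily regarding units, but it does not describe the algorithm. As a rough illustration only, the following Python sketch shows one common reading of greedy selection: repeatedly pick the line that contributes the most units not yet covered. It assumes that a unit is a whitespace-separated token. The function name `select_greedily` and its `limit` parameter are hypothetical and are not part of the text-selection API, and the tool's real behavior may differ.

```python
from typing import List, Set


def select_greedily(lines: List[str], limit: int) -> List[int]:
    """Return indices of up to `limit` lines, chosen greedily by new-unit coverage.

    Hypothetical helper for illustration; not the text-selection implementation.
    """
    # Assumption: a "unit" is a whitespace-separated token.
    units_per_line: List[Set[str]] = [set(line.split()) for line in lines]
    covered: Set[str] = set()
    selected: List[int] = []
    remaining = set(range(len(lines)))

    while remaining and len(selected) < limit:
        # Pick the line adding the most uncovered units; ties go to the lowest index.
        best = min(remaining, key=lambda i: (-len(units_per_line[i] - covered), i))
        if not units_per_line[best] - covered:
            break  # no remaining line contributes anything new
        selected.append(best)
        covered |= units_per_line[best]
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    example = ["a b", "a b c d", "e", "c d"]
    print(select_greedily(example, limit=3))  # prints [1, 2]
```

In this example the second line is picked first because it covers four units, then the line containing only "e" adds the one remaining unit, and the loop stops early because the other lines add nothing new.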
tqdm
numpy
scipy
pandas
ordered_set>=4.1.0
If you notice an error, please don't hesitate to open an issue.
# update
sudo apt update
# install Python 3.8, 3.9, 3.10 & 3.11 for ensuring that tests can be run
sudo apt install python3-pip \
python3.8 python3.8-dev python3.8-distutils python3.8-venv \
python3.9 python3.9-dev python3.9-distutils python3.9-venv \
python3.10 python3.10-dev python3.10-distutils python3.10-venv \
python3.11 python3.11-dev python3.11-distutils python3.11-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/text-selection.git
cd text-selection
# create virtual environment
python3.8 -m pipenv install --dev
# first install the tool like in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd text-selection
# activate environment
python3.8 -m pipenv shell
# run tests
tox
Final lines of test result output:
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
congratulations :)
MIT License
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
If you want to cite this repo, you can use the BibTeX entry generated by GitHub (see About => Cite this repository).
subsets select-randomly
subsets sort-by-shuffle
subsets add: option --skip-existing
subsets remove: didn't work
--limit to select duplicates
--limit positional where applicable
numpy on KLD selection