
uniqseq
Stream-based deduplication for repeating sequences
uniqseq identifies and removes repeated multi-record patterns from streaming data. Unlike traditional line-by-line deduplication tools, it detects when sequences of records repeat, where a record can be a line, a byte sequence, or any delimiter-separated unit.
It works with text streams (line-delimited, null-delimited, and so on) and binary streams (records separated by any byte delimiter), processes data in a single pass, and keeps memory usage bounded.
# Input with repeated 3-line sequence
$ cat app.log
Starting process...
Loading config
Connecting to DB
Starting process...
Loading config
Connecting to DB
Done
# Remove duplicates (specify window size to match pattern length)
$ uniqseq --window-size 3 app.log
Starting process...
Loading config
Connecting to DB
Done
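To make the "any delimiter-separated unit" idea concrete, here is a minimal sketch of reading records from a binary stream in a single pass. It is an illustration only, not uniqseq's own reader: the function name `iter_records` and its `chunk_size` parameter are hypothetical and chosen for this example.

```python
import io
from typing import BinaryIO, Iterator


def iter_records(stream: BinaryIO, delimiter: bytes = b"\n",
                 chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield delimiter-separated records from a binary stream in one pass.

    Hypothetical helper for illustration; not part of the uniqseq API.
    Memory use is bounded by the chunk size plus the longest record.
    """
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last delimiter is complete; keep the rest.
        *complete, pending = pending.split(delimiter)
        yield from complete
    if pending:
        # Final record that was not followed by a delimiter.
        yield pending


if __name__ == "__main__":
    data = io.BytesIO(b"first\0second\0third")
    print(list(iter_records(data, delimiter=b"\0")))
```

The same loop handles newline-delimited text, null-delimited output such as `find -print0`, or any other byte delimiter, because the delimiter is just a parameter.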
brew tap jeffreyurban/uniqseq && brew install uniqseq
Homebrew manages the Python dependency and provides easy updates via brew upgrade.
pipx install uniqseq
pipx installs in an isolated environment with global CLI access. Works on macOS, Linux, and Windows. Update with pipx upgrade uniqseq.
pip install uniqseq
Use pip if you want to use uniqseq as a library in your Python projects.
# Development installation
git clone https://github.com/JeffreyUrban/uniqseq
cd uniqseq
pip install -e ".[dev]"
Requirements: Python 3.9+
# Basic usage (deduplicate 10-line sequences by default)
uniqseq app.log > clean.log
# Adjust window size for your data
uniqseq --window-size 3 build.log # 3-line patterns
uniqseq --window-size 5 errors.log # 5-line patterns
# Stream processing
tail -f app.log | uniqseq --window-size 5
# Ignore timestamps when comparing
uniqseq --skip-chars 24 timestamped.log
# Only deduplicate ERROR lines
uniqseq --track "^ERROR" app.log
# See what was removed
uniqseq --annotate app.log
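The idea behind skipping a fixed-width prefix and tracking only some lines can be sketched in a few lines of Python. This is an assumption about the general technique, not uniqseq's actual code: `comparison_key` and `is_tracked` are hypothetical names, and whether the real `--track` option uses search or full-match semantics is not stated here.

```python
import re
from typing import Optional


def comparison_key(line: str, skip_chars: int = 0) -> str:
    """Return the part of the line that takes part in comparisons.

    Hypothetical helper: drops a fixed-width prefix such as a timestamp.
    """
    return line[skip_chars:]


def is_tracked(line: str, pattern: Optional[re.Pattern] = None) -> bool:
    """Decide whether a line is a deduplication candidate.

    Assumption: untracked lines would simply pass through unchanged.
    """
    return pattern is None or pattern.search(line) is not None


if __name__ == "__main__":
    a = "2024-01-15 10:30:01.123 Connection failed"
    b = "2024-01-15 10:30:07.456 Connection failed"
    # The two lines differ only in their 24-character timestamp prefix.
    print(comparison_key(a, 24) == comparison_key(b, 24))
    print(is_tracked("ERROR disk full", re.compile(r"^ERROR")))
```

Comparing keys rather than raw lines is what lets two log entries with different timestamps count as the same record.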
from uniqseq import UniqSeq

# Initialize with configuration
deduplicator = UniqSeq(
    window_size=3,
    skip_chars=0,
    max_history=100000,
)

# Process stream
with open("app.log") as infile, open("clean.log", "w") as outfile:
    for line in infile:
        deduplicator.process_line(line.rstrip("\n"), outfile)
    # Emit any records still held in the window buffer
    deduplicator.flush(outfile)
uniqseq uses a sliding window with hash-based pattern detection.
Output is produced with minimal delay. When a window doesn't match any known pattern, the oldest buffered record is immediately emitted.
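The sliding-window behaviour described above can be sketched as follows. This is a simplified illustration under stated assumptions, not uniqseq's implementation: the function names are hypothetical, `max_history` here simply caps the number of remembered window hashes, and a window is only remembered once all of its records have been emitted, so that a window can never match an overlapping copy of itself.

```python
import hashlib
from collections import OrderedDict, deque
from typing import Iterable, Iterator


def window_digest(records: Iterable[str]) -> bytes:
    """Hash a sequence of records; length prefixes keep boundaries distinct."""
    h = hashlib.blake2b(digest_size=16)
    for record in records:
        data = record.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.digest()


def dedupe_sequences(records: Iterable[str], window_size: int = 10,
                     max_history: int = 100_000) -> Iterator[str]:
    """Drop windows of `window_size` records that repeat an emitted window."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    buffer = deque()                  # records read but not yet emitted
    tail = deque(maxlen=window_size)  # the most recently emitted records
    seen = OrderedDict()              # digests of fully emitted windows

    def emit(record: str) -> str:
        tail.append(record)
        if len(tail) == window_size:
            seen[window_digest(tail)] = None
            if len(seen) > max_history:
                seen.popitem(last=False)  # forget the oldest digest
        return record

    for record in records:
        buffer.append(record)
        if len(buffer) < window_size:
            continue
        if window_digest(buffer) in seen:
            buffer.clear()            # whole window repeats: drop it
        else:
            yield emit(buffer.popleft())  # no match: release oldest record
    while buffer:                     # end of input: flush what remains
        yield emit(buffer.popleft())


if __name__ == "__main__":
    log = ["Starting process...", "Loading config", "Connecting to DB"] * 2
    log.append("Done")
    for line in dedupe_sequences(log, window_size=3):
        print(line)
```

In this sketch, output lags input by at most `window_size - 1` records, and memory stays bounded by the window size plus `max_history` digests.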
Read the full documentation at uniqseq.readthedocs.io
# Clone repository
git clone https://github.com/JeffreyUrban/uniqseq.git
cd uniqseq
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=uniqseq --cov-report=html
MIT License - See LICENSE file for details