
truthound
Zero-Configuration Data Quality Framework Powered by Polars
Sniffs out bad data.
Truthound 3.1.0 is a layered data quality system built around a Polars-first validation kernel, with first-party orchestration adapters, an additive AI review surface, and an operational console built on top of the same core runtime contract.
Truthound 3.1.0 is a layered data quality system. At the center is a small, durable, Polars-first validation kernel. Around that core sit an additive truthound.ai review surface, Truthound Orchestration for host-native execution inside schedulers and workflow systems, and Truthound Dashboard for operating Truthound through an installation-managed control-plane UI.
The point of the 3.x reset is not to hide the broader product line. It is to make the system boundary honest. The core validation kernel is the most rigorously validated contract in the ecosystem, while the AI review layer, orchestration adapters, and dashboard build on top of that contract instead of redefining it.
Documentation: truthound.netlify.app
Truthound 3.1.0 keeps the 3.0 kernel boundary and adds the first complete public AI review surface.
- `truthound.ai` is now the canonical optional namespace for proposal generation, run analysis, approval history, and controlled apply
- `has_ai_support()` and `get_ai_support_status()` make it safe for downstream integrations to feature-gate AI functionality
- `suggest_suite(...)`, `explain_run(...)`, `approve_proposal(...)`, `reject_proposal(...)`, `apply_proposal(...)`
- … Truthound AI directly and keeps the dashboard at a boundary-level overview instead of a mirrored manual

| Layer | Repository | Responsibility | Start Here |
|---|---|---|---|
| Truthound Core | `truthound` | Validation kernel and data-plane: `th.check()`, `ValidationRunResult`, planner/runtime, zero-config workspace, reporters, checkpoints, Data Docs | Core docs |
| Truthound AI | `truthound.ai` | Optional review-layer APIs for prompt-to-proposal compilation, run analysis, approval history, and controlled apply | /ai/ |
| Truthound Orchestration | `truthound-orchestration` | First-party execution integration layer for Airflow, Dagster, Prefect, dbt, Mage, and Kestra | /orchestration/ |
| Truthound Dashboard | separately distributed operational console | First-party control-plane for RBAC, sources, artifacts, incidents, secrets, observability, and AI review workflows | /dashboard/ |
Truthound is therefore not a monolithic platform with one flat feature surface. It is a layered system in which the core validation contract stays central, while the AI namespace, orchestration adapters, and dashboard expose first-party operational layers on top of it.
- `th.check(data)` creates and reuses a local `.truthound/` workspace automatically
- … `ValidationRunResult` shared by checkpoints, reporters, validation docs, and plugins

The latest fixed-runner release-grade benchmark artifact set shows Truthound ahead of Great Expectations on every comparable workload in the current comparison catalog while preserving correctness parity.
| Workload | Truthound Warm (s) | GX Warm (s) | Speedup | Memory Ratio |
|---|---|---|---|---|
| local-mixed-core-suite | 0.028240 | 0.075232 | 2.66x | 44.29% |
| local-null | 0.016487 | 0.024964 | 1.51x | 43.62% |
| local-range | 0.002470 | 0.013219 | 5.35x | 43.84% |
| local-schema | 0.001479 | 0.017303 | 11.70x | 35.88% |
| local-unique | 0.002023 | 0.013785 | 6.81x | 42.28% |
| sqlite-null | 0.007370 | 0.032909 | 4.47x | 48.16% |
| sqlite-range | 0.006053 | 0.022355 | 3.69x | 43.80% |
| sqlite-unique | 0.002066 | 0.015655 | 7.58x | 42.12% |
The practical reasons behind that result are straightforward and core-specific:
This comparison is intentionally bounded. It covers comparable deterministic core checks and SQLite pushdown workloads. It is not a blanket claim about orchestration layers, dashboard operations, or every Great Expectations feature area.
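As a quick sanity check on the table above, the Speedup column is consistent with dividing the GX warm time by the Truthound warm time. The sketch below recomputes it from the published timings. The `speedup` helper and the dictionary layout are illustrative only, and the benchmark harness may define the metric with more nuance than this.

```python
# Warm timings in seconds, copied from the benchmark table: (Truthound, GX).
WARM_SECONDS = {
    "local-mixed-core-suite": (0.028240, 0.075232),
    "local-null": (0.016487, 0.024964),
    "local-range": (0.002470, 0.013219),
    "local-schema": (0.001479, 0.017303),
    "local-unique": (0.002023, 0.013785),
    "sqlite-null": (0.007370, 0.032909),
    "sqlite-range": (0.006053, 0.022355),
    "sqlite-unique": (0.002066, 0.015655),
}


def speedup(truthound_s: float, gx_s: float) -> float:
    """Return how many times faster the Truthound warm run is than the GX one."""
    return gx_s / truthound_s


# Round to two decimals, matching the precision used in the table.
SPEEDUPS = {
    name: round(speedup(truthound_s, gx_s), 2)
    for name, (truthound_s, gx_s) in WARM_SECONDS.items()
}

for name, value in SPEEDUPS.items():
    print(f"{name}: {value:.2f}x")
```

Recomputing the column this way makes it easy to spot transcription errors when the table is refreshed from a new artifact set.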
Read the published evidence in Latest Verified Benchmark Summary.
Truthound Core 3.x centers the public contract around a smaller and more durable kernel:
| Layer | Responsibility |
|---|---|
| `TruthoundContext` | Auto-discovered project workspace, baselines, run history, docs artifacts, plugin runtime, and resolved defaults |
| `contracts` | Stable ports such as `DataAsset`, `ExecutionBackend`, `MetricRepository`, `ArtifactStore`, and plugin capabilities |
| `suite` | Immutable validation intent via `ValidationSuite`, `CheckSpec`, `SchemaSpec`, evidence policy, and severity policy |
| `planning` | Scan planning, backend routing, metric deduplication, and pushdown eligibility |
| `runtime` | Session lifecycle, retries, timeout-safe execution, exception isolation, and evidence capture |
| `results` | `CheckResult`, `ValidationRunResult`, and `ExecutionIssue` as the canonical output model |
Truthound Orchestration and Truthound Dashboard build on these contracts instead of replacing them. That is the key layered-system boundary.
The design is grounded in proven ideas from Great Expectations, Soda, Deequ, and Pandera, but optimized for a simpler zero-config starting point and a Polars-first execution path.
The practical 3.x kernel changes are:
- `th.check()` returns `ValidationRunResult` directly
- `.truthound/` workspace is auto-created and reused
- `validators=None` now means deterministic `AutoSuiteBuilder`, not "run every built-in validator"
- `compare` moved to `truthound.drift.compare`
- … `CheckpointResult.validation_run` and `CheckpointResult.validation_view`
- … `ValidationRunResult` directly through reporter contract v3

The practical 3.1.0 additions on top of that kernel are:

- … `truthound[ai]`

```bash
pip install truthound
# Optional AI review surface (quoted so shells such as zsh do not expand the brackets)
pip install "truthound[ai]"
# Development and docs workflows in this repository
uv sync --extra dev --extra docs
```
```python
import truthound as th
from truthound.datadocs import generate_validation_report
from truthound.reporters import get_reporter
from truthound.drift import compare

# Zero-config validation of an in-memory dataset.
run = th.check(
    {"customer_id": [1, 2, 2], "email": ["a@example.com", None, "c@example.com"]},
)
print(run.execution_mode)
print([check.name for check in run.checks])
print(run.metadata["context_root"])

# Render the same run result as JSON and as validation docs.
json_report = get_reporter("json").render(run)
validation_docs = generate_validation_report(run, title="Customer Quality Overview")

# Workspace context, schema learning, masking, and drift comparison.
context = th.get_context()
schema = th.learn({"id": [1, 2], "status": ["active", "inactive"]})
masked = th.mask(
    {"email": ["a@example.com", "b@example.com"]},
    columns=["email"],
    strategy="hash",
)
drift = compare({"score": [0.1, 0.2]}, {"score": [0.1, 0.8]})
```
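To make the sample data in the quick start concrete, here is a plain-Python sketch of what null and uniqueness checks would be expected to flag on those three rows. This is not Truthound's implementation, which runs on Polars; the helper names `null_count` and `duplicated_values` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Same sample rows as the quick start.
data = {
    "customer_id": [1, 2, 2],
    "email": ["a@example.com", None, "c@example.com"],
}


def null_count(values):
    """Count entries that are None."""
    return sum(value is None for value in values)


def duplicated_values(values):
    """Return the non-null values that occur more than once, sorted."""
    counts = Counter(value for value in values if value is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Per-column summary of the two problems hidden in the sample.
summary = {
    column: {"nulls": null_count(values), "duplicates": duplicated_values(values)}
    for column, values in data.items()
}
print(summary)
```

The sample was evidently chosen to contain one duplicated `customer_id` and one missing `email`, so a run over it should surface at least those two findings.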
```bash
truthound check data.csv --validators null,unique
truthound check --connection "sqlite:///warehouse.db" --table users --pushdown
truthound scan pii.csv
truthound profile data.csv
truthound doctor . --migrate-2to3
truthound doctor . --workspace
truthound plugins list --json
# Optional AI review workflow
truthound ai suggest-suite data.csv --prompt "Require customer_id to be unique"
truthound ai proposals list
truthound ai explain-run --run-id <run_id>
```
The root package intentionally exports a smaller API:
- `check`, `scan`, `mask`, `profile`, `learn`, `read`, `get_context`
- `TruthoundContext`, `ValidationSuite`, `CheckSpec`, `SchemaSpec`, `ValidationRunResult`, `CheckResult`
- `th.check()` returns `ValidationRunResult` directly
- `CheckpointResult.validation_run` is canonical and `CheckpointResult.validation_view` is the compatibility projection for legacy action formatting
- `truthound.reporters.RunPresentation`, `truthound.reporters.ReporterContext`
- `truthound.datadocs.ValidationDocsBuilder`, `truthound.datadocs.generate_validation_report`
- `truthound.drift.compare`
- … `truthound.ml`, `truthound.lineage`, `truthound.realtime`, or `truthound.datadocs`
- … `truthound.ai` after installing `truthound[ai]`

Truthound now ships an additive `truthound.ai` namespace that preserves the core hot path and zero-config workflow while exposing a reviewable AI layer.
- `suggest_suite(...)` compiles prompts into persisted suite proposal artifacts
- `explain_run(...)` compiles run evidence into persisted analysis artifacts
- `approve_proposal(...)`, `reject_proposal(...)`, and `apply_proposal(...)` keep approval and mutation in explicit human-reviewed steps
- `has_ai_support()` and `get_ai_support_status()` let downstream integrations feature-gate the AI surface cleanly

Read the technical docs in docs/ai/index.md.
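For downstream integrations, feature-gating might look like the minimal sketch below. It assumes that `has_ai_support()` is importable from the `truthound.ai` namespace, takes no arguments, and returns a truthy value when the AI surface is usable; none of that is confirmed here, so check docs/ai/index.md for the actual import path and return type. The wrapper name `ai_review_available` is hypothetical.

```python
import importlib


def ai_review_available() -> bool:
    """Return True only when the optional AI surface reports itself usable."""
    try:
        # Assumption: the gating helpers live in the truthound.ai namespace.
        ai = importlib.import_module("truthound.ai")
    except ImportError:
        # truthound, or the truthound[ai] extra, is not installed.
        return False
    has_support = getattr(ai, "has_ai_support", None)
    if not callable(has_support):
        return False
    # Assumption: has_ai_support() takes no arguments and returns a truthy value.
    return bool(has_support())


AI_ENABLED = ai_review_available()
print("AI review enabled:", AI_ENABLED)
```

Resolving the module lazily keeps the core install path free of any hard dependency on the AI extra, so an integration can fall back to core-only behavior when the check returns `False`.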
The public CLI surface is additive as well:
- `truthound ai suggest-suite`
- `truthound ai explain-run`
- `truthound ai proposals list/show/approve/reject/apply/history`
- `truthound ai analyses list/show`
- `truthound ai smoke openai`
- `truthound ai smoke openai-explain-run`

The experimental `use_engine` and `--use-engine` switches remain removed.
Truthound 3.0 auto-creates a .truthound/ workspace at your project root. By default it manages:
- `.truthound/config.yaml`: resolved project defaults
- `.truthound/catalog/`: asset fingerprints and source signatures
- `.truthound/baselines/`: learned schemas and metric history
- `.truthound/runs/`: persisted `ValidationRunResult` metadata
- `.truthound/docs/`: generated validation docs
- `.truthound/plugins/`: resolved plugin manifest and trust metadata

If you do nothing except call `th.check(data)`, Truthound will:

- … `TruthoundContext`

Use `truthound doctor . --workspace` to verify that the local `.truthound/` layout, indexes, baselines, and persisted run artifacts are still structurally healthy.
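For scripts that only need to know whether the default layout is present, a small standalone check can be sketched with `pathlib`. This is not what `truthound doctor` does internally, and it says nothing about indexes or artifact integrity; it merely tests for the entries listed above. The function name `missing_workspace_entries` is hypothetical.

```python
from pathlib import Path
import tempfile

# Entries listed for the default workspace (a trailing slash marks a directory).
EXPECTED_ENTRIES = ["config.yaml", "catalog/", "baselines/", "runs/", "docs/", "plugins/"]


def missing_workspace_entries(project_root):
    """Return the expected .truthound/ entries that are absent under project_root."""
    workspace = Path(project_root) / ".truthound"
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = workspace / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


# Demonstration against a throwaway directory rather than a real project.
with tempfile.TemporaryDirectory() as root:
    workspace = Path(root) / ".truthound"
    (workspace / "runs").mkdir(parents=True)
    (workspace / "config.yaml").write_text("# placeholder\n")
    DEMO_MISSING = missing_workspace_entries(root)

print(DEMO_MISSING)
```

A check like this is only a coarse presence test; the `doctor` command remains the supported way to validate workspace health.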
Truthound now uses one lifecycle runtime:
- `PluginManager` is the canonical plugin manager
- `EnterprisePluginManager` is an async, capability-driven facade over the same runtime
- … `register_check_factory`, `register_data_asset_provider`, `register_reporter`, `register_hook`, and `register_capability`
- `ValidationRunResult` is the canonical render input and `RunPresentation` is the shared render projection

```bash
uv run --frozen --extra dev python -m pytest -q
uv run --frozen --extra dev python -m pytest --collect-only -q tests
uv run --frozen --extra dev python -m pytest -q -m "contract or fault or e2e" -p no:cacheprovider
uv run --frozen --extra dev python -m pytest -q -m "contract or fault or integration or soak or stress or scale_100m or e2e" --run-integration --run-expensive --run-soak -p no:cacheprovider
uv run --frozen --extra dev python -m pytest -q tests/test_truthound_3_0_contract.py tests/test_api.py tests/test_public_surface.py tests/test_checkpoint.py -p no:cacheprovider
uv run --frozen --extra benchmarks python -m truthound.cli benchmark parity --suite pr-fast --frameworks truthound --backend local --strict
uv run --frozen --extra benchmarks python -m truthound.cli benchmark parity --suite nightly-core --frameworks both --backend local --strict
uv run --frozen --extra benchmarks python -m truthound.cli benchmark parity --suite nightly-sql --frameworks both --backend sqlite --strict
uv run --frozen --extra benchmarks python -m truthound.cli benchmark parity --suite release-ga --frameworks both --strict
python docs/scripts/prepare_public_docs.py --mode full
python docs/scripts/prepare_public_docs.py --mode public
uv run --frozen --extra dev python docs/scripts/check_links.py --mkdocs mkdocs.yml README.md CLAUDE.md build/full-docs
uv run --frozen --extra dev --extra docs mkdocs build --strict
uv run --frozen --extra dev --extra docs mkdocs build --strict -f mkdocs.public.yml
truthound doctor . --migrate-2to3
```
Official benchmark comparisons should cite the published fixed-runner artifact set: release-ga.json, env-manifest.json, and latest-benchmark-summary.md.
Tests now follow a failure-first lane model:
- `contract`: stable public API and compatibility boundaries
- `fault`: deterministic failure injection, timeout, corruption, and concurrency scenarios
- `integration`: opt-in backend and external-service coverage
- `soak` and `stress`: nightly-only load and chaos coverage

The default local run is intentionally fast. Manual verification artifacts live under verification/phase6 and are intentionally kept out of pytest discovery.
Official performance claims should come only from the verified release-grade parity artifacts under .truthound/benchmarks/release/. Nightly outputs are for trend visibility, not public benchmark positioning.
When adding tests, prefer scenarios that protect public contracts or operational failure modes. Avoid adding default-value, getter/setter, enum-literal, to_dict() round-trip, or CSS-string existence tests unless they prove a compatibility boundary that has failed before.