KProvEngine

V1 (scope-locked)

Deterministic provenance for AI-assisted workflows that require explicit human review.

Local-first by design: reproducible runs, captured review decisions, and evidence artifacts you can audit, diff, and defend.

  • Python
  • Determinism
  • Provenance
  • Human-in-the-loop
  • Audit-ready outputs

What you get (V1)

  • Reproducible run directory with deterministic stages: normalize → parse → extract → render
  • Evidence artifacts (manifest + hashes + provenance + toolchain disclosure)
  • Explicit review record captured as an artifact (human accountability)
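
One way to picture a review record captured as an artifact is shown below. This is a minimal sketch, not KProvEngine's actual schema: the `ReviewRecord` class, its field names, and the allowed decision values are all hypothetical. The point it illustrates is that the decision is bound to the hash of what was reviewed and serialized deterministically.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical decision vocabulary; the real project may use different values.
ALLOWED_DECISIONS = ("approved", "rejected", "needs_changes")


@dataclass(frozen=True)
class ReviewRecord:
    """Illustrative review artifact (field names are assumptions)."""

    reviewer: str
    decision: str
    reviewed_sha256: str  # hash of the artifact the reviewer actually looked at
    reviewed_at: str      # ISO 8601 timestamp, supplied by the caller
    notes: str = ""

    def __post_init__(self) -> None:
        if not self.reviewer.strip():
            raise ValueError("a named reviewer is required")
        if self.decision not in ALLOWED_DECISIONS:
            raise ValueError(f"unknown decision: {self.decision!r}")


def review_to_json(record: ReviewRecord) -> str:
    # Sorted keys and a fixed indent keep the serialized bytes stable.
    return json.dumps(asdict(record), sort_keys=True, indent=2) + "\n"
```

Passing the timestamp in, rather than reading the clock inside the function, keeps the serialization itself deterministic and easy to test.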

Why this exists

Many AI-assisted workflows produce useful output but weak evidence. When provenance is missing, you can't reliably reproduce results, explain what happened, or separate human judgment from automation.

  • "What produced this output?" must be answerable with artifacts, not narrative.
  • Runs should be reproducible end-to-end (no hidden state, no surprise network calls).
  • Human responsibility must be captured explicitly, not implied.
  • Audit defense should be evidence-backed and reviewable.

Design constraints (V1)

  • Local-first execution (no required external services)
  • Deterministic, reproducible pipelines
  • Explicit human-in-the-loop review
  • No implied certification or automated validation
  • Minimal and defensible dependency surface

Architecture

Pipeline
  • Core stages: normalize → parse → extract → render
  • Deterministic + side-effect constrained
  • Clear separation between core logic and optional adapters
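
The four stages can be sketched as a chain of pure functions. The stage names come from the list above; the function bodies are toy stand-ins (a word count) chosen only to show the shape, and are not KProvEngine's implementation.

```python
import json


def normalize(text: str) -> str:
    # Canonical newlines and no trailing whitespace, so equivalent
    # inputs produce identical bytes downstream.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines).strip("\n") + "\n"


def parse(text: str) -> list[str]:
    # Toy parser: one record per non-empty line.
    return [line for line in text.split("\n") if line]


def extract(lines: list[str]) -> dict[str, int]:
    # Toy extraction: lowercase word counts.
    counts: dict[str, int] = {}
    for line in lines:
        for word in line.lower().split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def render(counts: dict[str, int]) -> str:
    # Sorted keys make the rendered output independent of insertion order.
    return json.dumps(counts, sort_keys=True, indent=2) + "\n"


def run_pipeline(raw: str) -> str:
    # No clock, randomness, environment, or network access in any stage.
    return render(extract(parse(normalize(raw))))
```

Because every stage depends only on its argument, `run_pipeline("Hello provenance\r\n")` and `run_pipeline("Hello provenance  \n")` return the same string.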

Adapters + evidence layer
  • Adapters (optional): OCR and LLM integrations
  • Non-authoritative by design: adapter output can assist extraction but is never treated as the source of truth
  • Evidence artifacts: manifest, hashes, provenance, toolchain disclosure, review artifacts
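
A rough illustration of the non-authoritative rule: adapter output is kept alongside the core result as suggestions for the reviewer, and never overwrites it. The `Suggestion` type, the `merge_for_review` function, and the `pending_review` key are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """Illustrative adapter output (names are assumptions)."""

    adapter: str  # e.g. "ocr" or "llm"
    key: str      # which extracted field the suggestion is about
    value: str


def merge_for_review(core: dict[str, str], suggestions: list[Suggestion]) -> dict:
    # Core values are copied through untouched. Adapter output is attached
    # separately, in a stable order, for a human to accept or reject.
    ordered = sorted(suggestions, key=lambda s: (s.key, s.adapter, s.value))
    return {
        "fields": dict(core),
        "pending_review": [
            {"adapter": s.adapter, "key": s.key, "value": s.value}
            for s in ordered
        ],
    }
```

Even when a suggestion disagrees with a core field, the core value stays in `fields`; the disagreement is surfaced to the reviewer instead of being resolved automatically.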
Full architecture diagrams and governance rules live in the repository.

Evidence outputs (what's actually captured)

  • manifest.json — file inventory + expected outputs
  • hashes — content hashes for inputs/outputs
  • provenance — execution metadata (what ran, when, with what versions)
  • toolchain disclosure — dependency/tool versions that impact reproducibility
  • human review artifact — explicit reviewer decision record
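
As a concrete picture of the manifest and hashes, the sketch below walks a run directory and records a SHA-256 digest per file. The JSON layout (`files`, `path`, `sha256`, `bytes`) and the function names are assumptions made for illustration; the actual `manifest.json` format is defined in the repository.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    # Stream the file in chunks so large inputs do not need to fit in memory.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(run_dir: Path) -> dict:
    # Sort by POSIX-style relative path so the listing is stable across
    # platforms and filesystem enumeration orders.
    files = sorted(
        (p for p in run_dir.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(run_dir).as_posix(),
    )
    return {
        "files": [
            {
                "path": p.relative_to(run_dir).as_posix(),
                "sha256": sha256_file(p),
                "bytes": p.stat().st_size,
            }
            for p in files
        ]
    }


def manifest_to_json(manifest: dict) -> str:
    return json.dumps(manifest, sort_keys=True, indent=2) + "\n"
```

Recording relative paths rather than absolute ones keeps the manifest identical when the same run is reproduced in a different location.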

Intentionally out of scope (V1)

  • Hosted or SaaS deployment
  • Autonomous or agent-driven behavior
  • Claims of compliance certification
  • Workflow orchestration beyond a single deterministic run

Demo

Run the public demo script from the repository root:

# From the KProvEngine repo:
./demo.sh

# Or (editable install):
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[dev]"
echo "Hello provenance" > input.txt
python -m kprovengine.cli input.txt --out runs
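
To check reproducibility yourself, one option is to run the command twice with different `--out` directories and compare the results file by file. The helper below is a generic sketch that makes no assumption about the files KProvEngine writes; the directory names `runs-a` and `runs-b` in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hashes(root: Path) -> dict[str, str]:
    # Map each file's relative path to the SHA-256 of its contents.
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def diff_runs(first: Path, second: Path) -> list[str]:
    # Paths that are missing on one side or whose contents differ.
    a, b = file_hashes(first), file_hashes(second)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

An empty result from `diff_runs(Path("runs-a"), Path("runs-b"))` means the two trees are byte-identical. Any file that records wall-clock time or a per-run identifier would show up in the list and would need to be compared separately.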