The complete guide

AI code provenance

AI code provenance is the practice of recording where AI-generated code came from — which agent and model produced each line, under what prompt and context — and preserving that record as durable, verifiable evidence. It restores the authorship trail that AI coding agents otherwise erase, resolved down to the individual line.

Last updated June 4, 2026

Why AI code provenance matters now

For decades, authorship was implicit: a human wrote every line, and git blame pointed to a person who could explain it. AI coding agents broke that. A single commit can now contain lines from a developer, from Claude Code, from Cursor, and from Copilot — all attributed to one human committer. The signal that mattered on review and audit quietly disappears.

84% of developers use or plan to use AI tools in their workflow. (Stack Overflow Developer Survey, 2025)

Adoption is near-universal, but trust is not — and the volume of AI-authored code is growing faster than teams can review it. Provenance is the layer that makes that volume governable: it tells you which changes are AI-authored so scrutiny, policy, and evidence can be applied precisely where the risk is.

+23.5% production incidents per pull request, Dec 2025 to early 2026. (Cycode, 2026)

Why git blame can't provide it

git blame attributes a line to the last commit that touched it, and therefore to the committer. When an AI agent's output is committed under a developer's name, blame names the developer and drops everything that provenance is about: which agent, which model, what prompt, and what context the agent was missing.

git blame answers: which commit and committer last changed this line.
AI code provenance answers: which agent or human authored it, with which model, under what intent.
Provenance sits beside git history — it complements blame rather than replacing it.

How AI code provenance is captured

Reliable provenance has to be captured at the moment of authorship, not reconstructed afterward. AgentDiff hooks into each agent's native callback, records the agent, model, and line ranges it changed, then reconciles that session against the committed diff. The result is signed and appended to a git ref beside your code.

01 Install once: register hooks for every agent your team uses.
02 Commit normally: pre-commit reconciles the agent session against the diff by line-range overlap.
03 Sign and store: the attribution record is signed with ed25519 and appended to refs/agentdiff/meta.
04 Audit anywhere: query from the CLI, a dashboard, or straight from git.

$ agentdiff configure
✓ hooks installed · 8 agents registered
$ git commit -m 'feat: add billing webhook'
→ claude-code → apps/github-app/src/billing.ts  (+88 -12)
✓ signed · appended to refs/agentdiff/meta
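
The reconciliation step can be pictured as a range intersection. The following is a minimal sketch, not AgentDiff's implementation: the `LineRange` type, the `reconcile` function, and the sample paths and ranges are all hypothetical. It also assumes line numbers do not shift between the agent's edit and the commit, which a real reconciler would have to handle.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LineRange:
    start: int  # first line, 1-indexed, inclusive
    end: int    # last line, inclusive


def overlap(a: LineRange, b: LineRange) -> Optional[LineRange]:
    """Return the intersection of two inclusive ranges, or None if disjoint."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return LineRange(start, end) if start <= end else None


def reconcile(session: dict[str, list[LineRange]],
              diff: dict[str, list[LineRange]]) -> dict[str, list[LineRange]]:
    """Keep only the agent-edited ranges that also appear in the committed diff."""
    attributed: dict[str, list[LineRange]] = {}
    for path, agent_ranges in session.items():
        hits = []
        for agent_range in agent_ranges:
            for diff_range in diff.get(path, []):
                hit = overlap(agent_range, diff_range)
                if hit is not None:
                    hits.append(hit)
        if hits:
            attributed[path] = hits
    return attributed


# Hypothetical session: the agent touched lines 10-40 and 90-95 of one file.
session = {"src/billing.ts": [LineRange(10, 40), LineRange(90, 95)]}
# Hypothetical committed diff: lines 20-60 of the same file were added.
diff = {"src/billing.ts": [LineRange(20, 60)]}

result = reconcile(session, diff)
print(result)
```

Only lines 20-40 are attributed to the agent here: lines 10-19 and 90-95 never reached the commit, and lines 41-60 were committed without a matching agent edit.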

Because the record is metadata — agent, model, file paths, line ranges, signature — and never file contents, provenance can live in git and travel with the repository without creating a new path for source code to leave your infrastructure.
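
One way to make the metadata-only property checkable is an allowlist of record fields. The sketch below is illustrative only: the field names, the `build_record` and `assert_metadata_only` helpers, and the model name are assumptions, not AgentDiff's schema.

```python
from __future__ import annotations

# Fields taken from the description above: agent, model, file paths with
# line ranges, and a signature. The exact key names are hypothetical.
ALLOWED_FIELDS = {"agent", "model", "files", "signature"}


def build_record(agent: str, model: str,
                 files: dict[str, list[tuple[int, int]]]) -> dict:
    """Assemble an attribution record from metadata only."""
    return {
        "agent": agent,
        "model": model,
        "files": {path: [list(r) for r in ranges] for path, ranges in files.items()},
        "signature": None,  # filled in by a separate signing step
    }


def assert_metadata_only(record: dict) -> None:
    """Reject any record that carries fields beyond the allowlist."""
    extra = set(record) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields in record: {sorted(extra)}")


record = build_record(
    agent="claude-code",
    model="example-model",  # placeholder, not a real model identifier
    files={"apps/github-app/src/billing.ts": [(1, 88)]},
)
assert_metadata_only(record)  # passes: no file contents present
```

A record that tried to carry a `content` or `prompt` field would fail the check before it was ever stored.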

“Signed line-level provenance reads like the right primitive. The hard part is making it auditor-legible and low-friction for devs — auditor-legible + low-friction is the whole game.”

Kenith Biju Philip · Lead GRC Engineer, Fivetran

How an AI code provenance platform relates to existing standards

AI code provenance is a distinct layer from the supply-chain provenance most teams already know. Standards like SLSA provenance, SBOM, and Sigstore attestations describe how a build artifact was produced and assembled. They operate at the artifact and build layer — not at the source line. AI code provenance sits earlier, at authorship: which agent wrote which line, before the build ever runs.

SLSA / SBOM / Sigstore — provenance of the build and its dependencies; complementary, not the same job.
AIBOM — an emerging bill of materials for AI components; aligns with recording AI's role in software.
Code lineage — the broader practice of tracing how code came to exist; line-level AI provenance is its most granular form.

An AI code provenance platform produces the authorship evidence those frameworks increasingly expect. AgentDiff signs each attribution in the same spirit as Sigstore-style attestation, but about who authored a line rather than how an artifact was built — so the two stack cleanly in one pipeline.

Provenance is not AI detection

AI content detectors estimate whether code looks AI-generated, after the fact and probabilistically. Provenance records which agent actually wrote each line, at the moment it was written, as signed evidence. Detection guesses; provenance proves. For source code, detection is unreliable — provenance is the dependable layer.

Provenance across every agent

Most teams use more than one AI tool, and each vendor only sees its own activity. AgentDiff captures every major agent into one vendor-neutral record using each tool's native hook — so cross-agent attribution is consistent regardless of which tool wrote the code.

Claude Code: PostToolUse hook
Cursor: afterFileEdit / afterTabFileEdit hooks
GitHub Copilot: VS Code extension integration
Windsurf: post_write_code hook
OpenCode: tool.execute.after plugin
Codex CLI: notify hook
Gemini / Antigravity: BeforeTool / AfterTool hooks
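
To illustrate what a vendor-neutral record could look like, here is a rough sketch that maps the hook names listed above onto one common shape. The hook names come from the list; everything else is assumed for the example, including the agent labels other than claude-code, the payload keys (`model`, `path`, `start_line`, `end_line`), and the `normalize` function. Real hook payloads differ per tool.

```python
from __future__ import annotations

# Hook name -> agent label. Copilot is omitted because the list above names
# an extension integration for it rather than a specific hook.
HOOK_TO_AGENT = {
    "PostToolUse": "claude-code",
    "afterFileEdit": "cursor",
    "afterTabFileEdit": "cursor",
    "post_write_code": "windsurf",
    "tool.execute.after": "opencode",
    "notify": "codex-cli",
    "AfterTool": "gemini",
}


def normalize(hook: str, payload: dict) -> dict:
    """Turn a tool-specific hook event into one common attribution shape."""
    agent = HOOK_TO_AGENT.get(hook)
    if agent is None:
        raise KeyError(f"no agent registered for hook {hook!r}")
    return {
        "agent": agent,
        "model": payload.get("model", "unknown"),
        "path": payload["path"],
        "lines": (payload["start_line"], payload["end_line"]),
    }


# Two events from different tools end up with identical structure.
a = normalize("PostToolUse", {"model": "example-model", "path": "src/a.ts",
                              "start_line": 1, "end_line": 12})
b = normalize("post_write_code", {"path": "src/b.ts",
                                  "start_line": 5, "end_line": 9})
print(a)
print(b)
```

Keeping the mapping in a single table means a newly supported tool needs one new entry rather than a new record format.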

Frequently asked questions

What is AI code provenance?

AI code provenance is a record of where AI-generated code came from — which agent and model produced each line, under what context — preserved as durable, verifiable evidence. It resolves authorship down to the line, which matters when a single file mixes human and multiple AI authors.

How do I track which AI wrote a line of code?

Capture the agent and model at authorship time via each tool's hooks, reconcile that against the committed diff by line range, and store a signed record next to git history. AgentDiff automates this across Claude Code, Cursor, Copilot, Codex, Windsurf, OpenCode, and Gemini.
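
Once records exist, answering "who wrote this line?" is a range lookup. The sketch below assumes a simple in-memory list of records ordered oldest first. The `Attribution` type, the `who_wrote` function, and the sample data are hypothetical, and the example ignores line shifts caused by later commits.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribution:
    agent: str
    model: str
    path: str
    start: int  # first attributed line, inclusive
    end: int    # last attributed line, inclusive


def who_wrote(records: list[Attribution], path: str, line: int) -> Optional[Attribution]:
    """Return the most recent record covering path:line, or None if no agent is recorded."""
    for record in reversed(records):  # newest record wins
        if record.path == path and record.start <= line <= record.end:
            return record
    return None


records = [
    Attribution("cursor", "example-model-a", "src/billing.ts", 1, 50),
    Attribution("claude-code", "example-model-b", "src/billing.ts", 30, 40),
]

hit = who_wrote(records, "src/billing.ts", 35)
print(hit.agent if hit else "no agent recorded")
```

Line 35 resolves to the later claude-code record, line 10 to the earlier cursor record, and a line with no covering record falls through to `None`, which a caller might treat as human-authored or unknown.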

Does AI code provenance send my source code anywhere?

It should not. AgentDiff records only metadata — agent, model, file paths, and line ranges — signed and stored in your own git remote. No file contents or full prompts are transmitted.

Is AI code provenance required for compliance?

No framework explicitly mandates it yet, but ISO 42001 and the EU AI Act are moving toward stronger traceability requirements, and auditors are beginning to ask whether AI generated a change and under what controls. Signed provenance is positioned as a supporting control.

How is provenance kept tamper-evident?

Each attribution record is signed with an ed25519 key your organization controls. Any change to the record invalidates its signature, so tampering is detectable and the evidence is independently verifiable against your public keys.
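
The tamper-evidence property can be demonstrated with a small sketch. Note the substitution: Python's standard library has no ed25519, so this example uses HMAC-SHA256 as a stand-in. HMAC is symmetric (the same secret signs and verifies), whereas ed25519 is asymmetric and lets anyone verify with only the public key. The record fields, key, and function names are hypothetical. What carries over is the mechanism: sign a canonical serialization, and any later edit fails verification.

```python
import hashlib
import hmac
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize deterministically so the same record always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(record: dict, key: bytes) -> str:
    # Stand-in for an ed25519 signature; see the note above.
    return hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()


def verify(record: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(record, key), signature)


key = b"demo-key-not-for-production"
record = {"agent": "claude-code", "model": "example-model",
          "files": {"src/billing.ts": [[1, 88]]}}

signature = sign(record, key)
tampered = dict(record, agent="human")  # rewrite the authorship claim

print(verify(record, signature, key))    # the untouched record verifies
print(verify(tampered, signature, key))  # the edited record does not
```

Canonical serialization matters as much as the signature: without sorted keys and fixed separators, two equivalent records could produce different bytes and fail verification for no real reason.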

