Evidence judge (v7)

Did the test actually prove the task?

arkheionx evidence-judge reads local Foundry tests and evidence and grades each on a transparent rubric: does it call the target, set up, act, assert, check pre/post state, exercise the negative or boundary path, avoid mocking away the risk, and avoid relying on a trusted-role mistake or a known issue. It does not confirm vulnerabilities.

strong / medium

The test sets up, acts, asserts, checks pre/post state, and exercises a negative or boundary path.

weak / insufficient

The test acts and asserts but is shallow, or barely exercises the target — it cannot support a conclusion.

invalid-test

The test does not call the target, has no assertion, or only shows the call did not revert. Rewrite it.

candidate-with-evidence

A human should review the candidate. This is not a confirmed vulnerability.

rejected-with-evidence

The guard appears to hold. This is not proof the protocol has no bugs.

likely-known / accepted / trusted-role / out-of-scope / low-only

Filtered by the scope rules before it reaches a human.

Run it

Scope-aware, local, and static.

arkheionx evidence-judge . --scope-file scope.md
arkheionx evidence-judge . --scope-file scope.md --json
arkheionx evidence-judge . --tasks-file .arkheionx/scope-pack/scope-tasks.json

Boundary

Evidence quality, not vulnerability validity.

Evidence quality is a heuristic. Candidate-with-evidence is not a confirmed vulnerability; rejected-with-evidence is not proof the protocol has no bugs. ArkheionX does not confirm vulnerabilities. Human review is required. See docs/EVIDENCE_JUDGE.md.

See the report filter