
Certification

ssg certify runs synthetic scenario files through the governance engine to verify that your ruleset correctly handles Claude Code tool calls. A passing certification run can be submitted to the Hub to earn a Certified badge on your ruleset card.

What certification tests

Scenarios use Claude Code wire format (tool_name, tool_input) — the exact format sent by Claude Code's PreToolUse hook — and assert the expected decision for each tool call. The same transformation performed by ssg hook eval is applied in-process, exercising the full adapter path.

Each scenario verifies:

  • Decision correctness: did the engine return the expected decision (allow, block, log, shadow, ask, or force)?
  • Rule attribution: did the correct rule fire? (Checked only when the optional expected_rule_id is set.)
  • No state leakage: a decision in one step does not affect subsequent steps.
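The first two checks above can be sketched as a small comparison function. This is only an illustration of the assertions, not the engine's actual code: check_step and its arguments (actual_decision, actual_rule_id) are hypothetical names for whatever the engine returned for a step.

```python
from typing import Optional


def check_step(step: dict, actual_decision: str, actual_rule_id: Optional[str]) -> list[str]:
    """Return failure messages for one step; an empty list means the step passed.

    Hypothetical helper: `actual_decision` and `actual_rule_id` stand in for
    the engine's result for this tool call.
    """
    failures = []

    # Decision correctness: the returned decision must equal the expected one.
    if actual_decision != step["expected_decision"]:
        failures.append(
            f"decision: expected {step['expected_decision']!r}, got {actual_decision!r}"
        )

    # Rule attribution: expected_rule_id is optional, so only assert when present.
    expected_rule = step.get("expected_rule_id")
    if expected_rule is not None and actual_rule_id != expected_rule:
        failures.append(f"rule: expected {expected_rule!r}, got {actual_rule_id!r}")

    return failures
```

A step without expected_rule_id passes on the decision alone, whichever rule matched.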

Scenario file format

Scenario files are JSONL (one JSON object per line). The first line has "kind": "meta" and every subsequent line has "kind": "step".

Meta line

{
  "kind": "meta",
  "name": "destructive-ops",
  "description": "Verify that destructive commands are blocked",
  "claude_code_version": "current",
  "tags": ["security", "execute"]
}

| Field | Type | Description |
| --- | --- | --- |
| kind | "meta" | Must be the literal string "meta" |
| name | string | Unique scenario name (used in reports) |
| description | string | Human-readable description |
| claude_code_version | string | Target version, e.g. "current" or "1.0.34" |
| tags | string[] | Categorization tags |

Step lines

{
  "kind": "step",
  "tool_name": "Bash",
  "tool_input": { "command": "rm -rf /" },
  "expected_decision": "block",
  "expected_rule_id": "no-destructive-ops",
  "description": "rm -rf on root must be blocked"
}

| Field | Type | Description |
| --- | --- | --- |
| kind | "step" | Must be the literal string "step" |
| tool_name | string | Claude Code tool name (e.g. "Bash", "Read", "Agent") |
| tool_input | object | Claude Code tool input object |
| expected_decision | string | allow, block, log, shadow, ask, or force |
| expected_rule_id | string | If set, asserts that this specific rule matched |
| description | string | Human comment shown in reports |

Complete example

{"kind":"meta","name":"destructive-ops","description":"Block rm -rf and force pushes","claude_code_version":"current","tags":["security","execute"]}
{"kind":"step","tool_name":"Bash","tool_input":{"command":"rm -rf /"},"expected_decision":"block","expected_rule_id":"no-destructive-ops","description":"rm -rf root must block"}
{"kind":"step","tool_name":"Bash","tool_input":{"command":"git push --force"},"expected_decision":"block","expected_rule_id":"no-destructive-ops","description":"force push must block"}
{"kind":"step","tool_name":"Bash","tool_input":{"command":"git status"},"expected_decision":"allow","description":"safe git command must pass"}
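
Before handing a file to ssg certify, it can help to check its structure locally. The following is a minimal sketch, not part of ssg: validate_scenario is a hypothetical helper, and it assumes that blank lines may be skipped, that tool_name and expected_decision are required on every step, and that tool_input must be an object when present.

```python
import json

# Decision values listed in the step field table.
ALLOWED_DECISIONS = {"allow", "block", "log", "shadow", "ask", "force"}


def validate_scenario(text: str) -> list[str]:
    """Return structural problems found in a scenario file's contents."""
    # Assumption: blank lines are ignored; numbering counts non-blank lines.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ["file is empty"]

    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue

        # The first record must be the meta line; all others must be steps.
        expected_kind = "meta" if number == 1 else "step"
        if record.get("kind") != expected_kind:
            problems.append(f"line {number}: kind must be {expected_kind!r}")
            continue

        if expected_kind == "step":
            if not isinstance(record.get("tool_name"), str):
                problems.append(f"line {number}: tool_name must be a string")
            if "tool_input" in record and not isinstance(record["tool_input"], dict):
                problems.append(f"line {number}: tool_input must be an object")
            if record.get("expected_decision") not in ALLOWED_DECISIONS:
                problems.append(f"line {number}: unknown expected_decision")
    return problems
```

An empty result means the file matches the layout described above; it says nothing about whether the ruleset will actually produce the expected decisions.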

Running ssg certify

# Run all scenarios in .sigmashake/scenarios/ (default)
ssg certify

# Specify a scenario directory
ssg certify --dir=path/to/scenarios

# Verbose: show every step result
ssg certify --verbose

# Machine-readable JSON report
ssg certify --json

# Override the rules directory
ssg certify --rules-dir=path/to/rules

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | All scenarios passed |
| 1 | One or more scenarios failed |

Interpreting reports

Human-readable output:

ssg certify — Claude Code current — ssg v0.2.0
✓ PASS: 8/8 scenarios passed

✓ destructive-ops (5/5 steps)
✓ safe-operations (5/5 steps)
✓ secret-file-access (3/3 steps)

All scenarios passed. This ruleset is certified for Claude Code current.

JSON report (with --json):

{
  "version": "0.2.0",
  "claude_code_version": "current",
  "timestamp": "2026-04-04T12:00:00.000Z",
  "scenarios_total": 8,
  "scenarios_passed": 8,
  "scenarios_failed": 0,
  "scenarios": [
    {
      "name": "destructive-ops",
      "passed": true,
      "steps_total": 5,
      "steps_passed": 5,
      "steps_failed": 0,
      "steps": [...]
    }
  ]
}
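
The JSON report is convenient to post-process in scripts. As a rough sketch, the hypothetical helper below lists the scenarios that failed, using only the fields shown in the report above (name, passed, steps_total, steps_failed); the contents of each steps entry are not documented here, so the sketch does not read them.

```python
import json


def failed_scenarios(report_json: str) -> list[str]:
    """Return one summary line per failed scenario in a certify JSON report."""
    report = json.loads(report_json)
    summaries = []
    for scenario in report.get("scenarios", []):
        # A scenario is reported as failed when its "passed" flag is false.
        if not scenario.get("passed", False):
            summaries.append(
                f"{scenario['name']}: {scenario['steps_failed']} of "
                f"{scenario['steps_total']} steps failed"
            )
    return summaries
```

For example, after ssg certify --json > report.json, pass the file's text to failed_scenarios and print each returned line.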

Built-in scenarios

SigmaShake ships 8 built-in scenarios in .sigmashake/scenarios/:

| Scenario | Steps | What it tests |
| --- | --- | --- |
| destructive-ops | 5 | rm -rf, git push --force, git reset --hard → block |
| safe-operations | 5 | ls, git status, Read, Glob, Grep → allow |
| secret-file-access | 3 | Read .env, .env.local, credentials → block |
| capability-mapping | 5 | Agent, WebFetch, Write, Edit, TaskCreate → correct capability |
| force-substitution | 2 | grep for symbols → force redirect to oracle |
| mixed-sequence | 6 | Interleaved safe/dangerous ops; no state leakage |
| malformed-inputs | 3 | Empty command, unknown tool, no input fields → allow |
| network-tools | 3 | WebFetch, WebSearch, curl → allow |

Writing custom scenarios

Place .scenario.jsonl files in .sigmashake/scenarios/ (or any directory passed to --dir). A scenario should:

  1. Cover one coherent theme (e.g. "AWS operations", "database writes")
  2. Include both expected-block and expected-allow steps to prove specificity
  3. Pin expected_rule_id where the specific rule is important
  4. Use the exact Claude Code tool_name values your users will encounter

Pro tip: Run with --verbose to see actual decisions and reasons for every step, making it easy to debug unexpected outcomes.
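
For larger suites, scenario files can be generated rather than written by hand. The sketch below is one possible approach, not an ssg feature: build_scenario is a hypothetical helper, and the "database writes" steps and the rule ID no-db-drops are made up for illustration.

```python
import json


def build_scenario(name: str, description: str, tags: list[str], steps: list[dict]) -> str:
    """Serialize one meta record followed by step records as JSONL text."""
    meta = {
        "kind": "meta",
        "name": name,
        "description": description,
        "claude_code_version": "current",
        "tags": tags,
    }
    # Each step dict supplies tool_name, tool_input, expected_decision, etc.
    records = [meta] + [{"kind": "step", **step} for step in steps]
    # JSONL: exactly one compact JSON object per line.
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records) + "\n"


# Illustrative theme with both an expected-block and an expected-allow step.
# The commands and the rule ID "no-db-drops" are hypothetical.
database_writes = build_scenario(
    name="database-writes",
    description="Block destructive SQL, allow read-only queries",
    tags=["database"],
    steps=[
        {
            "tool_name": "Bash",
            "tool_input": {"command": "psql -c 'DROP TABLE users'"},
            "expected_decision": "block",
            "expected_rule_id": "no-db-drops",
        },
        {
            "tool_name": "Bash",
            "tool_input": {"command": "psql -c 'SELECT 1'"},
            "expected_decision": "allow",
        },
    ],
)
```

Writing database_writes to .sigmashake/scenarios/database-writes.scenario.jsonl would make it available to the next ssg certify run.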

Hub certification badge

Once your scenarios all pass, submit your report to the Hub:

# Run and capture report
ssg certify --json > report.json

# Submit to hub (requires GITHUB_TOKEN with repo scope)
curl -X POST https://hub.sigmashake.com/api/rulesets/<id>/certify \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d @report.json

A Certified badge will appear on your ruleset card, showing users that the rules have been verified against Claude Code tool calls.

CI integration

Add to your GitHub Actions workflow:

- name: Certify governance rules
  run: |
    pnpm add -g @sigmashake/ssg
    ssg certify --dir=.sigmashake/scenarios

A non-zero exit code will fail the workflow, preventing you from publishing uncertified rule changes.