Workflow Engine and Quality Systems
Workflow Engine
The DAG Model
The workflow is encoded as a machine-readable directed acyclic graph in
workflow-graph.json:
```mermaid
flowchart TD
    S1["step-1: Requirements"]
    G1{{"gate-1: Approval"}}:::gate
    S2["step-2: Architecture"]
    G2{{"gate-2: Approval"}}:::gate
    S3["step-3: Design"]
    S35["step-3.5: Governance"]
    G25{{"gate-2.5: Approval"}}:::gate
    S4B["step-4b: Bicep Plan"]
    S4T["step-4t: TF Plan"]
    G3{{"gate-3: Approval"}}:::gate
    S5B["step-5b: Bicep Code"]
    S5T["step-5t: TF Code"]
    G4{{"gate-4: Validation"}}:::gate
    S6B["step-6b: Bicep Deploy"]
    S6T["step-6t: TF Deploy"]
    G5{{"gate-5: Approval"}}:::gate
    S7["step-7: As-Built"]:::endNode
    S1 --> G1 --> S2 --> G2
    G2 --> S3
    S3 --> S35
    S35 --> G25
    G25 --> S4B & S4T
    S4B & S4T --> G3
    G3 --> S5B & S5T
    S5B & S5T --> G4
    G4 --> S6B & S6T
    S6B & S6T --> G5
    G5 --> S7
```
Each node has a type (agent-step, gate, subagent-fan-out, validation), and each
edge has a condition (on_complete, on_skip, on_fail). Conditional routing at IaC
nodes is governed by the decisions.iac_tool field.
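One property a check such as validate:workflow-graph could plausibly enforce is that the graph stays acyclic. The sketch below is an illustration under assumptions, not the project's validator: the edges are transcribed from the diagram as plain tuples (the actual workflow-graph.json schema is not shown here), and the function name is hypothetical. It uses Kahn's topological sort, which fails exactly when a cycle exists.

```python
from collections import defaultdict, deque

# Edges transcribed from the flowchart; the tuple layout is illustrative,
# not the real workflow-graph.json schema.
EDGES = [
    ("step-1", "gate-1"), ("gate-1", "step-2"), ("step-2", "gate-2"),
    ("gate-2", "step-3"), ("step-3", "step-3.5"), ("step-3.5", "gate-2.5"),
    ("gate-2.5", "step-4b"), ("gate-2.5", "step-4t"),
    ("step-4b", "gate-3"), ("step-4t", "gate-3"),
    ("gate-3", "step-5b"), ("gate-3", "step-5t"),
    ("step-5b", "gate-4"), ("step-5t", "gate-4"),
    ("gate-4", "step-6b"), ("gate-4", "step-6t"),
    ("step-6b", "gate-5"), ("step-6t", "gate-5"),
    ("gate-5", "step-7"),
]


def topological_order(edges):
    """Return the nodes in topological order; raise ValueError on a cycle."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    # Start from nodes with no incoming edges (sorted for a stable result).
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any node never released still has an unmet dependency, so it sits on a cycle.
    if len(order) != len(nodes):
        raise ValueError("workflow graph contains a cycle")
    return order
```

Calling topological_order(EDGES) yields step-1 first and step-7 last; adding an edge from step-7 back to step-1 makes it raise.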
Gates and Approval Points
Five mandatory gates block the workflow until their condition is met. Gates 1, 2, 3, and 5 require explicit human approval; Gate 4 requires automated validation to pass:
| Gate | After | Blocks Until |
|---|---|---|
| 1 | Step 1 | User approves requirements |
| 2 | Step 2 | User approves architecture and cost estimate |
| 3 | Step 4 | User approves implementation plan |
| 4 | Step 5 | Automated validation passes (lint, build, review) |
| 5 | Step 6 | User approves deployment and verifies resources |
IaC Routing
The iac_tool field in 01-requirements.md determines which track is activated.
Steps 4b, 5b, 6b form the Bicep track; steps 4t, 5t, 6t form the Terraform track.
Only one track is active for a given project.
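Track selection can be pictured as a simple lookup. This is a minimal sketch, assuming iac_tool holds the string bicep or terraform (the literal values are not stated here); the TRACKS table and active_track function are hypothetical names used only for illustration.

```python
from __future__ import annotations

# Step IDs come from the diagram; the accepted iac_tool values are an assumption.
TRACKS = {
    "bicep": ["step-4b", "step-5b", "step-6b"],
    "terraform": ["step-4t", "step-5t", "step-6t"],
}


def active_track(iac_tool: str) -> list[str]:
    """Return the plan/code/deploy steps for the chosen IaC tool."""
    key = iac_tool.strip().lower()
    if key not in TRACKS:
        raise ValueError(f"unsupported iac_tool: {iac_tool!r}")
    # Return a copy so callers cannot mutate the routing table.
    return list(TRACKS[key])
```

Rejecting unknown values instead of defaulting keeps a typo in the requirements file from silently activating the wrong track.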
Session State and Resume
The 00-session-state.json file (schema v2.0) provides atomic state tracking:
```json
{
  "schema_version": "2.0",
  "project": "my-project",
  "current_step": 2, // (1)!
  "lock": {
    "owner_id": "copilot-session-abc123", // (2)!
    "heartbeat": "2026-03-04T10:15:00Z",
    "attempt_token": "550e8400-e29b-41d4-a716-446655440000" // (3)!
  },
  "steps": {
    "2": {
      "status": "in_progress",
      "sub_step": "phase_2_waf",
      "claim": {
        "owner_id": "copilot-session-abc123",
        "heartbeat": "2026-03-04T10:15:00Z",
        "attempt_token": "550e8400-e29b-41d4-a716-446655440000",
        "retry_count": 0,
        "event_log": []
      }
    }
  }
}
```
1. Tracks which step is active; the Conductor uses this for resume
2. Claim-based locking prevents concurrent sessions from corrupting state
3. Unique token per attempt; stale heartbeats are auto-recovered
The claim model prevents concurrent sessions from corrupting state. Stale heartbeats
(older than stale_threshold_ms, default 5 minutes) are automatically recovered.
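The stale-heartbeat rule can be sketched as follows. This is an illustration under assumptions rather than the project's locking code: the helper names are hypothetical, and only the owner_id and heartbeat fields and the 5-minute default are taken from the text.

```python
from datetime import datetime, timedelta, timezone

STALE_THRESHOLD_MS = 5 * 60 * 1000  # default from the text: 5 minutes


def parse_heartbeat(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-03-04T10:15:00Z."""
    # Older Python versions of fromisoformat() reject a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_stale(claim: dict, now: datetime,
             threshold_ms: int = STALE_THRESHOLD_MS) -> bool:
    """A claim is stale when its heartbeat is older than the threshold."""
    age = now - parse_heartbeat(claim["heartbeat"])
    return age > timedelta(milliseconds=threshold_ms)


def can_take_over(claim: dict, session_id: str, now: datetime) -> bool:
    """Allow the current owner, or any session once the claim has gone stale."""
    return claim["owner_id"] == session_id or is_stale(claim, now)
```

Passing now in explicitly, instead of reading the clock inside the function, keeps the check deterministic and easy to test.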
Session Break Protocol
At Gates 2 and 3, the Conductor recommends starting a fresh VS Code Copilot Chat session. Long-running sessions (3+ hours) experience forced context summarisations that lose critical decision context. The Session Break Protocol:
- Conductor writes current state to 00-session-state.json
- Conductor writes 00-handoff.md with human-readable summary
- Conductor prints a “SESSION BREAK RECOMMENDED” message
- User starts a new chat, invokes Conductor again
- Conductor reads 00-session-state.json, finds the next pending step, and resumes
This was driven by real-world observation: the nordic-fresh-foods end-to-end test experienced 5 forced context summarisations in a single 3h39m session.
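The final step of the protocol, finding the next pending step, might look like the sketch below. It rests on several assumptions: the step order is passed in by the caller, the status names complete and skipped are invented for illustration (only in_progress appears in the example state file), and a step missing from the file is treated as pending.

```python
from typing import Optional

# Assumed status names; only "in_progress" is shown in the example state file.
DONE_STATUSES = {"complete", "skipped"}


def next_step(state: dict, step_order: list) -> Optional[str]:
    """Return the first step in step_order that is not finished, or None."""
    steps = state.get("steps", {})
    for step_id in step_order:
        # A step with no entry has not started yet, so treat it as pending.
        status = steps.get(step_id, {}).get("status", "pending")
        if status not in DONE_STATUSES:
            return step_id
    return None
```

With step 1 complete and step 2 in_progress, next_step returns "2", so an interrupted step is picked up again before later steps are considered.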
Quality and Safety Systems
Validation Scripts
Every convention is backed by a machine-enforceable check. The validation suite runs
via two parallel groups: validate:_node (Node.js
validators) and validate:_external (external tool validators):
| Category | Validators |
|---|---|
| Markdown | lint:md, lint:links:docs |
| Artefact format | lint:artifact-templates, lint:h2-sync, fix:artifact-h2 |
| Agent quality | lint:agent-frontmatter, lint:agent-body-size |
| Skill quality | lint:skills-format, lint:skill-size, lint:skill-references, lint:orphaned-content |
| Instruction quality | lint:instruction-frontmatter, validate:instruction-refs |
| Governance | lint:governance-refs, lint:mcp-config |
| Infrastructure | lint:terraform-fmt, validate:terraform |
| Session state | validate:session-state, validate:session-lock |
| Registry/config | validate:workflow-graph, validate:agent-registry, validate:skill-affinity |
| Code quality | lint:json, lint:python, lint:yaml |
| VS Code config | validate:vscode |
| Meta | lint:version-sync, lint:deprecated-refs, lint:docs-freshness, lint:glob-audit |
All validators run via npm run validate:all.
Git Hooks (Pre-Commit and Pre-Push)
Pre-commit (sequential, via lefthook): Validates staged files only, running markdown lint, link checks, H2 sync, artefact templates, agent frontmatter, instruction frontmatter, Python lint, and Terraform format and validate.
Pre-push (parallel, via lefthook): Diff-based domain routing. The diff-based-push-check.sh
script categorises changed files and runs only matching validators:
- *.bicep → Bicep build + lint
- *.tf → Terraform fmt + validate
- *.agent.md → Agent frontmatter + body size
- *.instructions.md → Instruction frontmatter
- SKILL.md → Skills format + skill size
- *.json → JSON syntax
- *.py → Ruff lint
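The routing idea can be illustrated in a few lines. The real logic lives in the diff-based-push-check.sh shell script, which is not reproduced here; this Python sketch only mirrors the pattern list above, and the check labels are descriptive strings, not actual script or npm task names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Pattern-to-check mapping taken from the list above; labels are illustrative.
ROUTES = [
    ("*.bicep", ["bicep build", "bicep lint"]),
    ("*.tf", ["terraform fmt", "terraform validate"]),
    ("*.agent.md", ["agent frontmatter", "agent body size"]),
    ("*.instructions.md", ["instruction frontmatter"]),
    ("SKILL.md", ["skills format", "skill size"]),
    ("*.json", ["json syntax"]),
    ("*.py", ["ruff lint"]),
]


def validators_for(changed_files):
    """Return the de-duplicated checks matching the changed file names."""
    selected = []
    for path in changed_files:
        name = PurePosixPath(path).name  # match on the file name only
        for pattern, checks in ROUTES:
            if fnmatchcase(name, pattern):
                for check in checks:
                    if check not in selected:
                        selected.append(check)
    return selected
```

A push that touches only infra/main.bicep would select the two Bicep checks and skip everything else, which is the point of diff-based routing.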
Circuit Breaker
The circuit breaker pattern protects against runaway agent loops during deployment:
| Anomaly Pattern | Detection Threshold | Action |
|---|---|---|
| Error repetition | 3 consecutive | Halt, write blocked finding |
| Empty response loop | 3 consecutive | Halt, escalate to human |
| Timeout cascade | 3 consecutive | Halt, check auth |
| What-if oscillation | 2 cycles | Halt, flag resource conflict |
| Auth failure loop | 2 consecutive | Halt, prompt re-authentication |
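The thresholds in the table amount to counting consecutive occurrences of the same anomaly. The sketch below is one possible reading, not the project's implementation: the anomaly keys and the CircuitBreaker class are hypothetical, and a what-if oscillation cycle is simplified to a single counted event.

```python
# Thresholds from the table above; the key names are illustrative.
THRESHOLDS = {
    "error_repetition": 3,
    "empty_response": 3,
    "timeout": 3,
    "what_if_oscillation": 2,
    "auth_failure": 2,
}


class CircuitBreaker:
    """Trips when one anomaly kind repeats consecutively up to its threshold."""

    def __init__(self, thresholds=None):
        self.thresholds = dict(thresholds or THRESHOLDS)
        self.last_kind = None
        self.streak = 0

    def record(self, kind):
        """Record one outcome (None means success); return True to halt."""
        if kind is None:
            # A successful call breaks any running streak.
            self.last_kind, self.streak = None, 0
            return False
        if kind == self.last_kind:
            self.streak += 1
        else:
            self.last_kind, self.streak = kind, 1
        return self.streak >= self.thresholds[kind]
```

The caller would map a True result to the action in the table, for example prompting re-authentication after two consecutive auth_failure events.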
Context Compression
The context-shredding system defines three compression tiers for artefact loading:
| Tier | Trigger (context used) | Strategy |
|---|---|---|
| full | < 60% | Load entire artefact |
| summarized | 60–80% | Key H2 sections only (tables preserved) |
| minimal | > 80% | Decision summaries only (< 500 characters) |
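Tier selection reduces to two threshold comparisons. A minimal sketch, assuming context usage is available as a percentage; the function name is hypothetical, and both 60% and 80% are treated as belonging to the summarized tier, following the ranges in the table.

```python
def compression_tier(context_used_pct: float) -> str:
    """Map context usage (0 to 100) to a compression tier name."""
    if not 0 <= context_used_pct <= 100:
        raise ValueError("context_used_pct must be between 0 and 100")
    if context_used_pct < 60:
        return "full"
    if context_used_pct <= 80:
        return "summarized"
    return "minimal"
```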
When the challenger-review-subagent loads predecessor artefacts for review, it is
instructed to apply the same three-tier compression. At the summarized tier it keeps
only the resource list, SKUs, WAF scores, compliance matrix, and budget sections; at
minimal it uses only the decisions field from 00-session-state.json plus the
resource list. The LLM does not follow these instructions consistently;
the compact_for_parent carry-forward between passes is the part that works reliably.
Copilot Hooks
The project uses five Copilot hooks (.github/hooks/) that intercept agent actions
at runtime:
| Hook | Trigger | Purpose |
|---|---|---|
| tool-guardian | preToolUse | Blocks dangerous commands (destructive ops, force pushes, DB drops) |
| secrets-scanner | sessionEnd | Scans modified files for leaked secrets and credentials |
| session-logger | sessionStart | Logs session lifecycle and injects project context |
| governance-audit | userPromptSubmitted | Scans prompts for threat signals with governance levels |
| post-edit-format | postToolUse | Auto-formats files after agent edits (whitespace, trailing newlines) |
Hooks are defined in hooks.json files with type (command), path to shell script,
and timeout. They run automatically — agents do not invoke them explicitly.