Workflow Engine and Quality Systems
Workflow Engine
The DAG Model
The workflow is encoded as a machine-readable directed acyclic graph in
workflow-graph.json:
flowchart TD
S1["step-1: Requirements"]
G1{{"gate-1: Approval"}}:::gate
S2["step-2: Architecture"]
G2{{"gate-2: Approval"}}:::gate
S3["step-3: Design"]
S35["step-3.5: Governance"]
G25{{"gate-2.5: Approval"}}:::gate
S4B["step-4b: Bicep Plan"]
S4T["step-4t: TF Plan"]
G3{{"gate-3: Approval"}}:::gate
S5B["step-5b: Bicep Code"]
S5T["step-5t: TF Code"]
G4{{"gate-4: Validation"}}:::gate
S6B["step-6b: Bicep Deploy"]
S6T["step-6t: TF Deploy"]
G5{{"gate-5: Approval"}}:::gate
S7["step-7: As-Built"]:::endNode
S1 --> G1 --> S2 --> G2
G2 --> S3
S3 --> S35
S35 --> G25
G25 --> S4B & S4T
S4B & S4T --> G3
G3 --> S5B & S5T
S5B & S5T --> G4
G4 --> S6B & S6T
S6B & S6T --> G5
G5 --> S7
Each node has a type (agent-step, gate, subagent-fan-out, validation), and each
edge has a condition (on_complete, on_skip, on_fail). Conditional routing at IaC
nodes is governed by the decisions.iac_tool field.
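To make the node and edge model concrete, the sketch below represents a small slice of the graph in Python and checks that it is acyclic using Kahn's algorithm. The dictionary layout (`nodes`, `edges`, `from`, `to`, `condition`) is an assumption for illustration, not the actual schema of workflow-graph.json.

```python
from collections import deque

# Hypothetical, simplified slice of the workflow graph (not the real schema).
GRAPH = {
    "nodes": {
        "step-1": {"type": "agent-step"},
        "gate-1": {"type": "gate"},
        "step-2": {"type": "agent-step"},
        "gate-2": {"type": "gate"},
    },
    "edges": [
        {"from": "step-1", "to": "gate-1", "condition": "on_complete"},
        {"from": "gate-1", "to": "step-2", "condition": "on_complete"},
        {"from": "step-2", "to": "gate-2", "condition": "on_complete"},
    ],
}


def topological_order(graph):
    """Return node names in dependency order, or raise ValueError on a cycle."""
    indegree = {name: 0 for name in graph["nodes"]}
    successors = {name: [] for name in graph["nodes"]}
    for edge in graph["edges"]:
        successors[edge["from"]].append(edge["to"])
        indegree[edge["to"]] += 1

    # Start from nodes that have no incoming edges.
    ready = deque(name for name, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any node left unvisited sits on a cycle.
    if len(order) != len(graph["nodes"]):
        raise ValueError("workflow graph contains a cycle")
    return order
```

A check of this kind is one plausible responsibility for a validator such as validate:workflow-graph, although the source does not describe what that validator actually verifies.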
Gates and Approval Points
Five mandatory gates require explicit human confirmation before the workflow advances:
| Gate | After | Blocks Until |
|---|---|---|
| 1 | Step 1 | User approves requirements |
| 2 | Step 2 | User approves architecture and cost estimate |
| 3 | Step 4 | User approves implementation plan |
| 4 | Step 5 | Automated validation passes (lint, build, review) |
| 5 | Step 6 | User approves deployment and verifies resources |
IaC Routing
The iac_tool field in 01-requirements.md determines which track is activated.
Steps 4b, 5b, 6b form the Bicep track; steps 4t, 5t, 6t form the Terraform track.
Only one track is active for a given project.
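The routing rule can be sketched as a simple lookup. The step identifiers come from the graph above; the accepted values `"bicep"` and `"terraform"` and the `decisions` dictionary shape are assumptions, since the source does not list the exact values of iac_tool.

```python
# Step identifiers are taken from the workflow graph; the keys are assumed values.
TRACKS = {
    "bicep": ["step-4b", "step-5b", "step-6b"],
    "terraform": ["step-4t", "step-5t", "step-6t"],
}


def active_track(decisions):
    """Return the steps of the single active IaC track for a project."""
    tool = str(decisions.get("iac_tool", "")).strip().lower()
    if tool not in TRACKS:
        raise ValueError(f"unsupported or missing iac_tool: {tool!r}")
    return TRACKS[tool]
```

Raising on an unknown value, rather than falling back to a default track, keeps a misconfigured project from silently running the wrong toolchain.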
Session State and Resume
The 00-session-state.json file (schema v3.0) provides atomic state tracking:

    {
      "schema_version": "3.0",
      "project": "my-project",
      "current_step": 2,
      "steps": {
        "2": {
          "status": "in_progress",
          "sub_step": "phase_2_waf",
          "started": "2026-03-04T10:05:00Z",
          "artifacts": ["agent-output/my-project/02-architecture-assessment.md"]
        }
      }
    }

VS Code Copilot executes agents serially: only one agent runs at a time.
The v3.0 schema removed the lock/claim protocol (previously in v2.0) since
concurrent agent execution does not occur. Atomic writes (.tmp → rename
→ .bak) prevent file corruption.
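One way to read the atomic write sequence is sketched below: write the new state to a `.tmp` file, copy the current file to `.bak`, then rename the temporary file over the original. The exact ordering used by the project is not spelled out in the source, so this ordering and the function name `write_state_atomically` are assumptions.

```python
import json
import os
import shutil


def write_state_atomically(path, state):
    """Write JSON state so readers never observe a half-written file."""
    tmp_path = path + ".tmp"
    bak_path = path + ".bak"

    # 1. Write the full new content to a temporary sibling file.
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
        handle.flush()
        os.fsync(handle.fileno())

    # 2. Keep a copy of the previous state, if there is one.
    if os.path.exists(path):
        shutil.copy2(path, bak_path)

    # 3. Swap the new file into place in a single rename.
    os.replace(tmp_path, path)
```

`os.replace` overwrites the destination in one step when both paths are on the same filesystem, which is why the temporary file is created next to the target rather than in a separate temp directory.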
Session Break Protocol
At Gates 2 and 3, the Orchestrator recommends starting a fresh VS Code Copilot Chat session. Long-running sessions (3+ hours) experience forced context summarisations that lose critical decision context. The Session Break Protocol:
- Orchestrator writes current state to 00-session-state.json
- Orchestrator writes 00-handoff.md with human-readable summary
- Orchestrator prints a “SESSION BREAK RECOMMENDED” message
- User starts a new chat, invokes Orchestrator again
- Orchestrator reads 00-session-state.json, finds the next pending step, and resumes
This was driven by real-world observation: the malta-catering end-to-end test experienced 5 forced context summarisations in a single 3h39m session.
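The resume step of the protocol can be sketched as a scan over the `steps` map from the session state. Only the `in_progress` status appears in the example state; the `complete` status value, the numeric step keys, and the function name are assumptions made for illustration.

```python
def next_pending_step(state):
    """Return the key of the first step that is not complete, or None."""
    steps = state.get("steps", {})
    # Sort numerically so that "10" comes after "2" (assumes numeric keys).
    for key in sorted(steps, key=float):
        if steps[key].get("status") != "complete":
            return key
    return None


# Example state loosely modelled on the schema v3.0 snippet.
example_state = {
    "schema_version": "3.0",
    "current_step": 2,
    "steps": {
        "1": {"status": "complete"},
        "2": {"status": "in_progress", "sub_step": "phase_2_waf"},
    },
}
resume_at = next_pending_step(example_state)
```

Here `resume_at` is `"2"`, so a fresh session would pick up the architecture step where the previous one stopped.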
Quality and Safety Systems
Validation Scripts
Every convention is backed by a machine-enforceable check. The validation suite runs
via two parallel groups: validate:_node (Node.js
validators) and validate:_external (external tool validators):
| Category | Validators |
|---|---|
| Markdown | lint:md, lint:links:docs |
| Artefact format | validate:artifacts, lint:artifact-templates, lint:h2-sync |
| Agent quality | validate:agents |
| Skill quality | validate:skills, validate:skill-checks, lint:skill-references, lint:orphaned-content |
| Instruction quality | validate:instruction-checks |
| Governance | lint:governance-refs, lint:mcp-config |
| Infrastructure | lint:terraform-fmt, validate:terraform, validate:iac-security-baseline |
| Session state | validate:session-state (also covers deprecated lock/claim field detection) |
| Registry/config | validate:workflow-graph, validate:agent-registry |
| Code quality | lint:json, lint:python, lint:yaml |
| VS Code config | validate:vscode |
| Explorer graph | validate:explorer-graph |
| Meta | lint:version-sync, lint:deprecated-refs, lint:docs-freshness, lint:glob-audit, validate:no-hardcoded-counts, validate:terminology |
See reference/validation-reference
for the full authoritative list — it is generated from package.json.
All validators run via npm run validate:all.
Git Hooks (Pre-Commit and Pre-Push)
Pre-commit (sequential, via lefthook): Validates staged files only: markdown lint, link checks, H2 sync, artefact templates, agent frontmatter, instruction frontmatter, Python lint, Terraform format and validate.

Pre-push (parallel, via lefthook): Diff-based domain routing. The diff-based-push-check.sh
script categorises changed files and runs only matching validators:
- *.bicep → Bicep build + lint
- *.tf → Terraform fmt + validate
- *.agent.md → Agent frontmatter + body size
- *.instructions.md → Instruction frontmatter
- SKILL.md → Skills format + skill size
- *.json → JSON syntax
- *.py → Ruff lint
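The routing idea can be illustrated with a small pattern table. The real implementation is the shell script diff-based-push-check.sh, whose internals are not shown here; this Python sketch only mirrors the mapping listed above, and the check labels are descriptive strings rather than actual command names.

```python
import posixpath
from fnmatch import fnmatchcase

# Pattern -> checks, mirroring the list above (labels are descriptive only).
ROUTES = [
    ("*.bicep", ["Bicep build", "Bicep lint"]),
    ("*.tf", ["Terraform fmt", "Terraform validate"]),
    ("*.agent.md", ["Agent frontmatter", "Agent body size"]),
    ("*.instructions.md", ["Instruction frontmatter"]),
    ("SKILL.md", ["Skills format", "Skill size"]),
    ("*.json", ["JSON syntax"]),
    ("*.py", ["Ruff lint"]),
]


def validators_for(changed_files):
    """Return the de-duplicated checks matching a list of changed paths."""
    selected = []
    for path in changed_files:
        name = posixpath.basename(path)
        for pattern, checks in ROUTES:
            if fnmatchcase(name, pattern):
                for check in checks:
                    if check not in selected:
                        selected.append(check)
    return selected
```

A push that touches only `infra/main.bicep` and `docs/readme.md` would select just the two Bicep checks, which is the point of diff-based routing: unrelated validators are skipped.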
Circuit Breaker
The circuit breaker pattern protects against runaway agent loops during deployment:
| Anomaly Pattern | Detection Threshold | Action |
|---|---|---|
| Error repetition | 3 consecutive | Halt, write blocked finding |
| Empty response loop | 3 consecutive | Halt, escalate to human |
| Timeout cascade | 3 consecutive | Halt, check auth |
| What-if oscillation | 2 cycles | Halt, flag resource conflict |
| Auth failure loop | 2 consecutive | Halt, prompt re-authentication |
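A minimal sketch of the counting logic behind the table follows. The thresholds are taken from the table; the pattern keys, the class name, and the treatment of each what-if cycle as one recorded event are assumptions, and the halt actions themselves are left out.

```python
# Thresholds from the table above; the keys are hypothetical identifiers.
THRESHOLDS = {
    "error_repetition": 3,
    "empty_response": 3,
    "timeout": 3,
    "what_if_oscillation": 2,
    "auth_failure": 2,
}


class CircuitBreaker:
    """Trips when the same anomaly repeats consecutively up to its threshold."""

    def __init__(self, thresholds=None):
        self.thresholds = dict(thresholds or THRESHOLDS)
        self.last_pattern = None
        self.count = 0

    def record(self, pattern):
        """Record one event; pass None for a healthy result. True means halt."""
        if pattern is None:
            # A healthy result breaks the streak.
            self.last_pattern, self.count = None, 0
            return False
        if pattern == self.last_pattern:
            self.count += 1
        else:
            self.last_pattern, self.count = pattern, 1
        return self.count >= self.thresholds[pattern]
```

Resetting on any healthy result is what makes the thresholds "consecutive": two timeouts, a success, and another timeout do not trip the breaker.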
Context Compression
The context-shredding system defines three compression tiers for artefact loading:
| Tier | Trigger | Strategy |
|---|---|---|
| full | < 60% used | Load entire artefact |
| summarized | 60–80% | Key H2 sections only (tables preserved) |
| minimal | > 80% | Decision summaries only (< 500 characters) |
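The tier boundaries in the table translate into a short selection function. The function name and the idea of passing context usage as a percentage are assumptions; how the system actually measures usage is not described here.

```python
def compression_tier(context_used_percent):
    """Map context usage (0-100) to a compression tier from the table above."""
    if context_used_percent < 60:
        return "full"
    if context_used_percent <= 80:
        return "summarized"
    return "minimal"
```

Both 60 and 80 fall into the `summarized` tier, matching the table's `< 60%` and `> 80%` bounds for the other two tiers.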
When the challenger-review-subagent loads predecessor artefacts for review, it is
instructed to apply the same 3-tier compression: at the summarized tier, preserving
only resource list, SKUs, WAF scores, compliance matrix, and budget sections; at
minimal, using only the decisions field from 00-session-state.json plus the
resource list. Whether the LLM follows these instructions consistently varies —
the compact_for_parent carry-forward between passes is the part that reliably works.
Copilot Hooks
Copilot hooks in .github/hooks/ intercept agent actions at runtime. See the
Hooks guide for the authoritative list; the current
set covers:
| Hook | Trigger | Purpose |
|---|---|---|
| tool-guardian | PreToolUse | Blocks dangerous commands (destructive ops, force pushes, DB drops) |
| secrets-scanner | Stop | Scans modified files for leaked secrets and credentials |
| session-telemetry | SessionStart, Stop, UserPromptSubmit | Merged session lifecycle logging and governance audit |
| subagent-validation | SubagentStop | Validates subagent invocation and outputs |
| tool-audit | PostToolUse | Logs tool usage metadata (name, status) |
Hooks are defined in hooks.json files with type (command), path to shell script,
and timeout. They run automatically — agents do not invoke them explicitly.