Architecture Decision Records (ADRs) and the design rationale behind the Fendix hybrid scanner.
Fendix is a hybrid scanner. As of v0.9 (Phase 17b), the default scan path is a single Go binary — secrets and dependency-CVE checks run in-process; Semgrep shells out to the host binary if installed; the Python whitebox engine for auth / injection / AST checks is opt-in via --python-engine. The cross-engine communication contract (newline-delimited JSON, ADR-002 below) still applies for plugins and the opt-in Python path.
         User CLI Command
                 |
                 v
+----------------------------------+
| Go Binary (v0.9+)                |
|  - CLI (cobra)                   |
|  - HTTP Scanner (black-box)      |  Sends real HTTP requests
|  - Orchestrator                  |
|  - Correlator                    |
|  - Reporters                     |
|                                  |
| In-process Go scanners:          |
|  - Secrets (TASK-115)            |  16 patterns + .env handling
|  - Dep CVE (TASK-119)            |  govulncheck / pip / npm
|  - Plugin runtime (TASK-113)     |  NDJSON IPC subprocess
+----------------+-----------------+
                 |
                 | shells out to host binary
                 v
       +-------------------+
       | semgrep (host)    |  TASK-116: graceful absence
       +-------------------+
                 |
                 | --python-engine OPT-IN only
                 v
       +-------------------+
       | Python Engine     |  TASK-118: no longer bundled
       |  - Spec Parser    |  Requires local python/ tree
       |  - AST Analyzer   |  or FENDIX_ENGINE pointing at one
       |  - Auth checks    |
       +-------------------+

Fendix needs two fundamentally different capabilities: black-box HTTP scanning (high concurrency, low latency, single binary distribution) and white-box static analysis (originally written in Python because the security tooling ecosystem there, including Semgrep, Bandit, and detect-secrets, was a year ahead of any Go equivalent). No single language excelled at both.
//go:embed and extracted to ~/.fendix/engine/ on first run. Spawned as a subprocess. Carried Semgrep, secrets, AST, and dependency CVE checks.

Five years on, the original ADR-001 has paid for itself. The Go scanner ecosystem has matured (govulncheck has call-graph reachability; regex matching is adequate for secrets), and the cost of the embedded Python (install footprint, cold-start latency, “do I have Python 3.9+?” support questions) outgrew the benefit. Phase 17b ports the secrets and Semgrep paths to native Go (internal/scanner/secrets/; internal/scanner/semgrep/ shells out to the host binary), drops the embedded Python distribution from the binary, and makes Python whitebox spawning opt-in via --python-engine. The IPC contract from ADR-002 is preserved for plugins and the opt-in Python path; nothing about the wire shape changed.
SEC-* IDs, severities, references all match byte-for-byte; existing ingest pipelines absorb the new findings unchanged.

--python-engine + provide a python tree

The Go orchestrator needs to communicate with the Python engine. The protocol must be simple, debuggable, streamable, and reliable.
Options considered: gRPC (too complex), Unix socket + JSON-RPC (socket management overhead), newline-delimited JSON over stdin/stdout (simplest).
ScanRequest (Go → Python stdin)
{
  "mode": "whitebox",
  "spec": "./openapi.yaml",
  "code_path": "./src/",
  "language": "python",
  "checks": ["secrets", "auth", "injection", "semgrep", "deps"],
  "verbose": false
}

Finding (Python → Go stdout, one per line)
{
  "id": "SEC-001",
  "title": "Hardcoded API key detected",
  "severity": "CRITICAL",
  "source": "whitebox",
  "category": "secrets",
  "endpoint": "src/config.py:14",
  "evidence": "API_KEY = 'sk-live-abc...' [truncated]",
  "fix": "Move to environment variable. Rotate the exposed key immediately.",
  "references": ["CWE-798"],
  "confidence": "HIGH",
  "line": "src/config.py:14"
}

Stream terminator (final line)
{"done": true, "total": 12}

Every finding is scored based on impact category, detection confidence, and whether multiple detection methods agree (a correlated source gets a 1.1x multiplier).
Score = ImpactBase[category] x ConfidenceMult[confidence] x SourceMult[source]

CRITICAL >= 9.0  |  ImpactBase:             ConfidenceMult:   SourceMult:
HIGH     >= 7.0  |  auth_bypass:     10.0   HIGH:    1.0      correlated: 1.1
MEDIUM   >= 4.0  |  injection:        9.5   MEDIUM:  0.75     blackbox:   1.0
LOW      >= 1.0  |  secrets:          9.0   LOW:     0.5      whitebox:   0.9
INFO     <  1.0  |  idor:             8.5
                 |  data_exposure:    7.0
                 |  cors:             6.5
                 |  headers:          4.0
                 |  info_disclosure:  2.0