Architecture

Architecture Decision Records (ADRs) and the design rationale behind the Fendix hybrid scanner.

System Overview

Fendix is a hybrid scanner. As of v0.9 (Phase 17b), the default scan path is a single Go binary: secrets and dependency-CVE checks run in-process, Semgrep is invoked by shelling out to the host binary if one is installed, and the Python whitebox engine for auth, injection, and AST checks is opt-in via --python-engine. The cross-engine communication contract (newline-delimited JSON, ADR-002 below) still applies to plugins and to the opt-in Python path.

architecture
User CLI Command
       |
       v
+----------------------------------+
|         Go Binary  (v0.9+)       |
|  - CLI (cobra)                   |
|  - HTTP Scanner    (black-box)   |  Sends real HTTP requests
|  - Orchestrator                  |
|  - Correlator                    |
|  - Reporters                     |
|                                  |
|  In-process Go scanners:         |
|  - Secrets         (TASK-115)    |  16 patterns + .env handling
|  - Dep CVE         (TASK-119)    |  govulncheck / pip / npm
|  - Plugin runtime  (TASK-113)    |  NDJSON IPC subprocess
+----------------+-----------------+
                 |
                 | shells out to host binary
                 v
       +-------------------+
       |  semgrep (host)   |  TASK-116 — graceful absence
       +-------------------+
                 |
                 | --python-engine OPT-IN only
                 v
       +-------------------+
       |  Python Engine    |  TASK-118 — no longer bundled
       |  - Spec Parser    |  Requires local python/ tree
       |  - AST Analyzer   |  or FENDIX_ENGINE pointing at one
       |  - Auth checks    |
       +-------------------+
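The optional stages in the diagram can be sketched as a small decision function. This is a minimal illustration, not Fendix's actual code: the `Plan`, `Host`, and `BuildPlan` names are hypothetical, and the lookup order (FENDIX_ENGINE first, then a local python/ directory) is an assumption, since the diagram only says that one of the two is required.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// ErrNoPythonTree is returned when the Python engine is requested but cannot be located.
var ErrNoPythonTree = errors.New("python engine requested but no python tree found")

// Plan records which optional engines a scan would use (hypothetical type).
type Plan struct {
	SemgrepPath string // empty: semgrep is absent and its checks are skipped
	PythonTree  string // empty: Python engine not requested
}

// Host abstracts host lookups so the decision logic can be tested in isolation.
type Host struct {
	LookPath  func(name string) (string, error)
	Getenv    func(key string) string
	DirExists func(path string) bool
}

// BuildPlan applies the rules from the diagram: semgrep is optional, Python is opt-in.
func BuildPlan(pythonEngine bool, h Host) (Plan, error) {
	var p Plan
	// Graceful absence: a missing semgrep binary is not an error.
	if path, err := h.LookPath("semgrep"); err == nil {
		p.SemgrepPath = path
	}
	if !pythonEngine {
		return p, nil
	}
	// Assumed precedence: FENDIX_ENGINE first, then a local python/ directory.
	for _, dir := range []string{h.Getenv("FENDIX_ENGINE"), "python"} {
		if dir != "" && h.DirExists(dir) {
			p.PythonTree = dir
			return p, nil
		}
	}
	return p, ErrNoPythonTree
}

func main() {
	host := Host{
		LookPath: exec.LookPath,
		Getenv:   os.Getenv,
		DirExists: func(path string) bool {
			info, err := os.Stat(path)
			return err == nil && info.IsDir()
		},
	}
	// Default scan path: no Python requested, semgrep used only if present.
	plan, err := BuildPlan(false, host)
	fmt.Printf("plan=%+v err=%v\n", plan, err)
}
```

Injecting the lookups through `Host` keeps the rule set testable on machines that have neither semgrep nor Python installed.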

ADR-001: Go + Python Hybrid Architecture

Accepted (Phase 0); evolved in v0.9 / Phase 17b

Context

Fendix needs two fundamentally different capabilities: black-box HTTP scanning (high concurrency, low latency, single binary distribution) and white-box static analysis (originally written in Python because the security tooling ecosystem there — Semgrep, Bandit, detect-secrets — was a year ahead of any Go equivalent). No single language excelled at both.

Decision (Phase 0 — v0.1 through v0.8)

  • Go for the CLI interface, HTTP scanner, orchestrator, correlator, and report renderer. Compiles to a single binary with excellent concurrency primitives.
  • Python for the static analysis engine, embedded in the binary via //go:embed and extracted to ~/.fendix/engine/ on first run. Spawned as a subprocess. Carried Semgrep, secrets, AST, and dependency CVE checks.
  • Communication via newline-delimited JSON over stdin/stdout (see ADR-002 below).

Evolution in v0.9 / Phase 17b — default no longer carries Python

Five years on, the original ADR-001 has paid for itself. The Go scanner ecosystem has matured (govulncheck has call-graph reachability; regular expressions are adequate for secret detection), and the cost of embedding Python (install footprint, cold-start latency, “do I have Python 3.9+?” support questions) outgrew the benefit. Phase 17b ports the secrets and Semgrep paths to native Go packages (internal/scanner/secrets/ and internal/scanner/semgrep/, the latter shelling out to the host binary), drops the embedded Python distribution from the binary, and makes Python whitebox spawning opt-in via --python-engine. The IPC contract from ADR-002 is preserved for plugins and the opt-in Python path; nothing about the wire shape changed.

  • Default cold start dropped to ~5.6 ms p50 (was ~7.3 ms on v0.8).
  • No Python interpreter requirement in the default scan path — fendix runs on machines without Python installed.
  • Same finding shape across the transition — SEC-* IDs, severities, references all match byte-for-byte; existing ingest pipelines absorb the new findings unchanged.

Positive Consequences (still hold)

  • Best tool for each job — Go for networking, Python remains an option for AST-heavy analysis
  • Single binary distribution for the CLI (Python no longer bundled)
  • Optional Python engine remains independently runnable for debugging
  • Clean separation of concerns between engines via the NDJSON IPC contract (still load-bearing for plugins)
  • Each engine can still be tested independently

Trade-offs & Mitigations

  • Two language ecosystems to maintain (Go modules + pip when Python is opted in)
  • Python subprocess startup adds ~18 ms to opt-in scans (measured)
  • Users who relied on the implicit auth/injection Python checks must now pass --python-engine and provide a local python/ tree (or point FENDIX_ENGINE at one)
  • Mitigation: the IPC contract is documented and tested end-to-end in CI

ADR-002: Newline-Delimited JSON IPC Contract

Accepted

Context

The Go orchestrator needs to communicate with the Python engine. The protocol must be simple, debuggable, streamable, and reliable.

Options considered: gRPC (too complex), Unix socket + JSON-RPC (socket management overhead), newline-delimited JSON over stdin/stdout (simplest).

IPC Schema

ScanRequest (Go → Python stdin)

ScanRequest
{
  "mode": "whitebox",
  "spec": "./openapi.yaml",
  "code_path": "./src/",
  "language": "python",
  "checks": ["secrets", "auth", "injection", "semgrep", "deps"],
  "verbose": false
}
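On the Go side, a request like the one above could be modelled as a struct and written as a single line. The sketch below is illustrative only: the struct and the `EncodeLine` helper are hypothetical names, with JSON tags taken from the example object. It relies on `json.Encoder.Encode`, which emits compact JSON followed by a newline, which is the framing the contract requires.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// ScanRequest mirrors the fields of the example request (hypothetical Go type).
type ScanRequest struct {
	Mode     string   `json:"mode"`
	Spec     string   `json:"spec"`
	CodePath string   `json:"code_path"`
	Language string   `json:"language"`
	Checks   []string `json:"checks"`
	Verbose  bool     `json:"verbose"`
}

// EncodeLine serializes req as exactly one newline-terminated JSON line.
func EncodeLine(req ScanRequest) ([]byte, error) {
	var buf bytes.Buffer
	// Encode writes compact JSON and appends a single '\n'.
	if err := json.NewEncoder(&buf).Encode(req); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	line, err := EncodeLine(ScanRequest{
		Mode:     "whitebox",
		Spec:     "./openapi.yaml",
		CodePath: "./src/",
		Language: "python",
		Checks:   []string{"secrets", "auth", "injection", "semgrep", "deps"},
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// In a real run this line would be written to the engine's stdin.
	fmt.Print(string(line))
}
```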

Finding (Python → Go stdout, one per line)

Finding
{
  "id": "SEC-001",
  "title": "Hardcoded API key detected",
  "severity": "CRITICAL",
  "source": "whitebox",
  "category": "secrets",
  "endpoint": "src/config.py:14",
  "evidence": "API_KEY = 'sk-live-abc...' [truncated]",
  "fix": "Move to environment variable. Rotate the exposed key immediately.",
  "references": ["CWE-798"],
  "confidence": "HIGH",
  "line": "src/config.py:14"
}

Stream terminator (final line)

terminator
{"done": true, "total": 12}

Positive Consequences

  • Zero dependencies — JSON and stdin/stdout exist in every language
  • Streamable — findings appear in Go as Python discovers them
  • Debuggable — pipe to jq or cat for inspection
  • Python engine independently testable
  • No network ports, sockets, or connection management

Trade-offs & Mitigations

  • No schema validation at protocol level (mitigated by tests)
  • No bidirectional communication mid-scan
  • Evidence fields truncated to 200 characters maximum
  • Mitigation: end-to-end contract tests run in CI
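The 200-character evidence cap listed above could be enforced with a helper like the following. This is a sketch under stated assumptions: the name `TruncateEvidence` is hypothetical, and counting Unicode code points rather than bytes is a guess, since the text does not say which unit the limit uses.

```go
package main

import (
	"fmt"
	"strings"
)

// MaxEvidenceChars is the 200-character cap stated in the trade-offs list.
const MaxEvidenceChars = 200

// TruncateEvidence caps s at MaxEvidenceChars code points.
// Assumption: the limit counts characters (runes), not bytes.
func TruncateEvidence(s string) string {
	r := []rune(s)
	if len(r) <= MaxEvidenceChars {
		return s
	}
	// Slicing runes avoids cutting a multi-byte character in half.
	return string(r[:MaxEvidenceChars])
}

func main() {
	long := strings.Repeat("A", 500)
	fmt.Println(len([]rune(TruncateEvidence(long))))
}
```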

Severity Scoring Model

Every finding is scored based on impact category, detection confidence, and whether multiple detection methods agree (correlated source gets a 1.1x multiplier).

scoring-model
Score = ImpactBase[category] x ConfidenceMult[confidence] x SourceMult[source]

CRITICAL  >= 9.0    |  ImpactBase:              ConfidenceMult:   SourceMult:
HIGH      >= 7.0    |    auth_bypass:     10.0    HIGH:   1.0       correlated: 1.1
MEDIUM    >= 4.0    |    injection:        9.5    MEDIUM: 0.75      blackbox:   1.0
LOW       >= 1.0    |    secrets:          9.0    LOW:    0.5       whitebox:   0.9
INFO      <  1.0    |    idor:             8.5
                    |    data_exposure:    7.0
                    |    cors:             6.5
                    |    headers:          4.0
                    |    info_disclosure:  2.0
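The formula and thresholds can be expressed directly as lookup tables. The sketch below uses only the values printed in the table; the `Score` and `Severity` function names are hypothetical, and rejecting unknown keys with an error is an assumption about how missing entries would be handled.

```go
package main

import "fmt"

// Lookup tables copied from the scoring model.
var impactBase = map[string]float64{
	"auth_bypass": 10.0, "injection": 9.5, "secrets": 9.0, "idor": 8.5,
	"data_exposure": 7.0, "cors": 6.5, "headers": 4.0, "info_disclosure": 2.0,
}
var confidenceMult = map[string]float64{"HIGH": 1.0, "MEDIUM": 0.75, "LOW": 0.5}
var sourceMult = map[string]float64{"correlated": 1.1, "blackbox": 1.0, "whitebox": 0.9}

// Score computes ImpactBase x ConfidenceMult x SourceMult.
// Assumption: unknown keys are rejected rather than defaulted.
func Score(category, confidence, source string) (float64, error) {
	base, ok := impactBase[category]
	if !ok {
		return 0, fmt.Errorf("unknown category %q", category)
	}
	conf, ok := confidenceMult[confidence]
	if !ok {
		return 0, fmt.Errorf("unknown confidence %q", confidence)
	}
	src, ok := sourceMult[source]
	if !ok {
		return 0, fmt.Errorf("unknown source %q", source)
	}
	return base * conf * src, nil
}

// Severity maps a score onto the bands from the table.
func Severity(score float64) string {
	switch {
	case score >= 9.0:
		return "CRITICAL"
	case score >= 7.0:
		return "HIGH"
	case score >= 4.0:
		return "MEDIUM"
	case score >= 1.0:
		return "LOW"
	default:
		return "INFO"
	}
}

func main() {
	s, err := Score("secrets", "HIGH", "whitebox")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%.2f %s\n", s, Severity(s))
}
```

With the table as printed, a whitebox-only secrets finding at HIGH confidence scores 9.0 x 1.0 x 0.9 = 8.1 and falls in the HIGH band, while the same finding confirmed by both engines (correlated) scores 9.0 x 1.0 x 1.1 = 9.9 and reaches CRITICAL.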