Brutalist Security AI Threat Modeling Process
Blaine W. sent this question over the weekend: How would you apply the principles you talk about in your blog to AI and AI-enabled systems threat modeling?
So, after some coffee and stuff, here’s a Brutalist Security threat modeling process for AI and AI-enabled systems, rooted in the Security Brutalism principles: honest, simple, visible, and resilient.
Follow the steps, ask the questions, and write down the answers.
Step 1: Name What’s Real
- What the AI model is actually doing: classification, generation, ranking, decisions, etc.
- Where is the AI model running: cloud API, on-prem GPU, client-side model.
- What connects to it: inputs, outputs, downstream consumers, retraining cycles.
- What it can do: the tools available to it, can it reach out to the Internet, can it touch sensitive data, etc.
Brutalist Security Rule: If you can't draw the architecture on a napkin, you don't understand well enough.
Step 2: Define What You’re Defending
- Model integrity (can it be stolen, poisoned, modified?)
- Input sanity (can prompts or data be weaponized?)
- Output safety (can AI generate bad, dangerous, or manipulated results?)
- Pipeline trust (can the training or deployment system be hijacked?)
- Decision impact (what happens when the AI is wrong?)
Brutalist Rule: Defend what matters, not what’s easy.
Step 3: Expose All Attack Surfaces
- Prompt injection
- Prompt or sensitive information in the model leaking
- Data poisoning (in training or finetuning)
- Model inversion or extraction
- Output manipulation or abuse
- Supply chain compromise (open source models and ML/LLM libraries)
- Overreliance by humans (automation bias)
Brutalist Rule: No hidden corners. No magic boxes.
Step 4: Assume the AI Will Be Attacked
- What if attackers control input data?
- What if attackers observe or replay outputs?
- What if attackers trick downstream users?
- What if attackers poison training data subtly?
Brutalist Rule: Plan for failure. Build to survive, not just to function.
Step 5: Kill the Complexity
- Remove unnecessary layers, wrappers, and orchestration.
- Use simple controls: logging, input validation, version control, minimal access.
- Prefer "boring" solutions over clever ML/LLM pipelines when security is critical.
Brutalist Rule: Complexity is a liability, not a feature.
Step 6: Build Visible Defenses
- Log everything: inputs, outputs, model versions, data sources.
- Use early warning detection to flag weird behavior.
- Monitor for misuse, not just failures.
- If you can't log it, it didn't happen: Make sure not to have AI make "transparent" decisions.
Brutalist Rule: If you can't see it, you can’t secure it.
Step 7: Acknowledge What You Can’t Protect
- Document known risks and failure modes. Calculate worst-case scenario impact.
- Disclose trade-offs: speed vs. accuracy, performance vs. safety.
- Don’t pretend AI is “secure” if you don’t fully control its environment.
Brutalist Rule: Honesty over false assurance.
Output: A Brutalist AI Threat Model
- One clear diagram: showing components and data flows.
- One-page table: mapping real threats to real defenses.
- One hard truth: what the AI system cannot do safely.
Note: The output requires review and adjustment, as the technology evolves too rapidly for this current approach to remain effective. That’s the on-the-ground reality. If you have any suggestions, I’d appreciate hearing them.
AI Threat Modeling Template
1. System Snapshot
Describe in plain language what the AI system does and how it fits into the architecture.
- System Name / Scope:
- AI Function(s):
- Deplyment Location:
- Data Sources:
- Downstream Consumers:
- Retraining Frequency and Trigger:
2. What Are We Defending?
List the critical assets in this system that must not be compromised. Below are some examples.
Asset | Why It Matters | What Happens If It Fails |
---|---|---|
Model Logic | ||
Training Data | ||
API | ||
Input Validation Logic |
3. Threat Surface Map
List potential attack vectors. Don't sanitize or over categorize.
Attack Surface | Possible Threats | Notes |
---|---|---|
Input Pipeline | Prompt Injection | |
Training Pipeline | Data poisoning, supply chain manipulation | |
Dependencies | Compromised packages | |
Human Factors | Overtrust, misused |
4. Assume It Gets Attacked
What's your answer when this system is attacked?
Threat Scenario | Detection? | Containment? | Recovery? |
---|---|---|---|
Adversarial Inputs | |||
Prompt Injection | |||
Data Poisoning |
5. Kill The Complexity
Where can we cut clutter, remove fargile connections, or simplify trust boundaries?
- Eliminate unnecessary components.
- Collapse redundant layers.
- Prefer static over dynamic when possible.
- Simplify permissions and access rights.
- Isolate model artifacts from production logic.
6. Make It Visible
What's being logged? What's being monitored? What's being watched in real time?
- Logs collected:
- Input/output logs
- Model version
- Anomaly detection
- Etc
- Alerts triggered by:
- Toxic output
- Unusual access
- Behavior change
- Others
7. One Hard Truth
What can this AI system do safely?
- Known Limits:
- Disallowed Use Cases:
- Security Concerns Still Under Review:
Summary Artifact
Attach to the security document:
- The architecture diagram
- This filled template
- Any critical documents or policies to reference