Applying Security Brutalism: A Playbook for Leadership and Practitioners

Part 1: For Security and IT Leadership

The Frame

The goal of a security program is not to eliminate risk. That goal is unachievable, and pursuing it leads to programs that spend enormous resources on low-consequence problems while leaving critical systems exposed. The actual goal is survivability: reducing how bad it gets when something goes wrong, and reducing how long you stay failed.

Three questions drive every decision in this approach. Does this reduce susceptibility, meaning the realistic attack paths into systems that matter? Does it limit damage, meaning the blast radius if compromise happens? Does it reduce recovery time, meaning how fast you detect, contain, and restore? If a control, tool, policy, or process cannot answer yes to at least one of these questions with evidence, it is consuming resources and adding complexity without improving your security position.

Complexity is not free. Every additional tool, integration, and process that cannot justify its survivability contribution creates attack surface, consumes staff time, and adds failure modes. Removing things that do not meet the bar is itself a security improvement.

Mapping Consequence: Where You Start

Before evaluating any control or investment, you need a consequence map. This is the most important thing a security program can produce, and it is frequently skipped because organizations assume they already know it.

The consequence map is a ranked list of your systems and capabilities, ordered by what failure actually costs. For each system, you answer: what does this system exist to do, what does the business actually lose if it stops or is corrupted, and is that loss recoverable or existential.

Recoverable means expensive but survivable: revenue loss for days or weeks, customer churn, incident response costs, reputational damage that fades over time. Existential means the organization does not recover: regulatory action that shuts down operations, permanent loss of the data the business is built on, liability exposure that exceeds organizational capacity, or reputational damage that permanently destroys customer trust.

This list is not a risk register. It does not need probability scores, maturity ratings, or CVSS numbers. It needs honest answers to those questions, for each system that matters.

The way to run this conversation with your team is to ask them to walk through each major system and describe the worst realistic scenario. Not the theoretical worst case, but a realistic scenario where a competent attacker with time and skill targets that system. What do they take? What do they break? What does the business look like on day three of that incident? What does recovery look like, and how long does it actually take?

Systems that produce existential scenarios go to the top of the list and receive the most attention in everything that follows. This list drives all prioritization decisions that come after it.
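A consequence map can live in a spreadsheet, but a small data structure makes the ordering rule explicit. The sketch below is illustrative only: the field names, the example systems, and the use of estimated recovery time as a tie-breaker within a loss class are assumptions, not part of the method described above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Loss(IntEnum):
    """Loss class for a failure; the higher value sorts first."""
    RECOVERABLE = 1
    EXISTENTIAL = 2


@dataclass(frozen=True)
class SystemEntry:
    name: str
    purpose: str          # what the system exists to do
    business_loss: str    # what the business loses if it stops or is corrupted
    loss: Loss            # recoverable or existential
    restore_days: float   # honest recovery estimate (assumed tie-breaker field)


def rank(entries):
    """Order by loss class first, then by longest estimated recovery."""
    return sorted(entries, key=lambda e: (e.loss, e.restore_days), reverse=True)


# Hypothetical entries for illustration only.
consequence_map = rank([
    SystemEntry("marketing-site", "lead generation",
                "a few days of inbound leads", Loss.RECOVERABLE, 2),
    SystemEntry("customer-ledger", "system of record for balances",
                "permanent loss of core data", Loss.EXISTENTIAL, 30),
    SystemEntry("build-pipeline", "ships product releases",
                "release freeze and incident response cost", Loss.RECOVERABLE, 10),
])

for position, entry in enumerate(consequence_map, start=1):
    print(position, entry.name, entry.loss.name)
```

Note what is absent: no probability score, no maturity rating, no CVSS number. The only inputs are the answers to the three questions asked of each system.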

Evaluating Your Current Program

Once you have a consequence map, you can evaluate whether your security program is actually protecting the things that matter. Most programs are not organized this way. They tend to be structured around compliance frameworks, vendor capabilities, or historical precedent rather than consequence.

The questions to ask your team are direct. For each item on your consequence map, can they describe the realistic attack paths that lead to compromise of that system? If they cannot, you have a visibility problem that has to be fixed before anything else matters.

For each consequential system, do they know what a realistic attacker can do after gaining access? What data can be reached? What actions can be taken? What else can the attacker move to from that position? If these answers are not known, your blast radius is undefined and almost certainly larger than you think.

For each consequential system, how long does it take to detect that it has been compromised? Not how long before an alert theoretically fires, but how long before a human is looking at the right data and understands what is happening. If the honest answer is measured in weeks rather than hours, your detection is not working.

For each consequential system, how long does it take to fully restore it from a known-good state? Has that been tested in the last 90 days, under realistic conditions, with the result measured and documented? If it has not been tested, you do not know the answer. Plans that exist on paper but have never run under pressure fail in ways that are not apparent until the incident.

A program that scores well on compliance audits and has a substantial tooling budget can fail all four of these questions. The score and the budget describe inputs; these questions describe outcomes.
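The four questions can be recorded per system and turned into a gap list. This is a minimal sketch under stated assumptions: the record layout is hypothetical, and the 24-hour detection threshold is an illustrative stand-in for "hours rather than weeks". Only the 90-day restore-test window comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assessment:
    system: str
    attack_paths_known: bool                 # can the team describe realistic paths in?
    blast_radius_known: bool                 # is post-access reach actually known?
    detect_hours: Optional[float]            # time until a human understands; None = unknown
    days_since_restore_test: Optional[int]   # None = restoration never tested


def gaps(a, max_detect_hours=24.0, max_test_age_days=90):
    """Return the questions this system currently fails.

    max_detect_hours is an assumed threshold; tune it to your environment.
    """
    found = []
    if not a.attack_paths_known:
        found.append("visibility: attack paths not described")
    if not a.blast_radius_known:
        found.append("blast radius: post-access reach undefined")
    if a.detect_hours is None or a.detect_hours > max_detect_hours:
        found.append("detection: time to human understanding unknown or too long")
    if a.days_since_restore_test is None or a.days_since_restore_test > max_test_age_days:
        found.append("recovery: restore untested or test is stale")
    return found


# Hypothetical assessment for illustration only.
ledger = Assessment("customer-ledger", True, False, None, 200)
for gap in gaps(ledger):
    print(ledger.system, "->", gap)
```

An unknown answer is treated as a failure on purpose: if nobody can state the detection time or the last restore test, the honest reading is that the capability is unproven.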

Evaluating Security Investments

When your team proposes a new tool, control, or initiative, the evaluation is the same three questions. Does this reduce susceptibility to realistic attacks on your consequential systems? Does it limit blast radius if compromise occurs? Does it reduce detection or recovery time? And does it do so for the systems that are highest on your consequence map?

If the answer is no, the proposal should not move forward regardless of how it is framed. Compliance requirements are a separate consideration and should be evaluated separately.
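The evaluation can be written down as a simple gate. The sketch below is one reading of the test, not a definitive rule: the `Proposal` fields, the example tools, and the decision strings are hypothetical, and it treats "no evidence" and "does not reach the top systems" as reasons to stop in the same way as a plain no.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    name: str
    reduces_susceptibility: bool
    limits_blast_radius: bool
    reduces_detect_or_recover_time: bool
    evidence: str            # what supports the claimed yes; empty means none
    systems_covered: set     # names of systems the proposal actually affects


def evaluate(p, top_systems):
    """Apply the three questions, then check coverage of top-ranked systems."""
    answers_yes = any([
        p.reduces_susceptibility,
        p.limits_blast_radius,
        p.reduces_detect_or_recover_time,
    ])
    if not answers_yes:
        return "reject: no survivability contribution"
    if not p.evidence.strip():
        return "reject: claimed benefit has no evidence"
    if not (p.systems_covered & top_systems):
        return "reject: does not reach highest-consequence systems"
    return "proceed"


# Hypothetical proposals for illustration only.
top = {"customer-ledger"}
dashboard = Proposal("risk-dashboard", False, False, False, "", {"customer-ledger"})
segmentation = Proposal("ledger-segmentation", False, True, False,
                        "lateral movement test blocked in staging", {"customer-ledger"})

print(dashboard.name, "->", evaluate(dashboard, top))
print(segmentation.name, "->", evaluate(segmentation, top))
```

A proposal that exists to satisfy a compliance requirement does not belong in this gate at all; it is evaluated on its own track.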

The cost of adding tooling is worth understanding clearly. Every new tool is a new attack surface, a new integration that can be misconfigured, a new thing that requires staffing and maintenance, and a new dependency in your supply chain. Security tooling has been a primary attack vector in high-profile supply chain compromises. The burden of proof for adding complexity should be high, and removing things that cannot meet that burden is a legitimate security action.

What to Expect From Your Team

A security team operating this way should be able to produce the following without months of preparation:

A ranked list of the organization's most consequential systems, and the top two or three realistic attack paths against each one
Current detection coverage for each of those attack paths: which ones would produce a signal if executed today, and which ones would not
Recovery status for each consequential system: last tested restoration, measured time-to-restore, and documented gaps
A list of standing access grants and long-lived credentials touching consequential systems, and when they were last reviewed

If the team cannot produce these on short notice, the program is organized around something other than survivability, and reorienting it is the first task.
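The last item on that list, standing access and long-lived credentials, is the easiest to automate. The sketch below assumes a hypothetical grant inventory and a 90-day review window; neither the record format nor the window is prescribed above, so treat both as placeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Grant:
    principal: str                  # person, service account, or key identifier
    system: str                     # system the grant touches
    kind: str                       # e.g. "standing-access" or "long-lived-credential"
    last_reviewed: Optional[date]   # None means no review is on record


def overdue(grants, consequential, today, max_age=timedelta(days=90)):
    """Return grants on consequential systems whose review is missing or stale.

    Never-reviewed grants sort first, followed by the oldest reviews.
    The 90-day default is an assumption, not a requirement from the text.
    """
    stale = [
        g for g in grants
        if g.system in consequential
        and (g.last_reviewed is None or today - g.last_reviewed > max_age)
    ]
    return sorted(
        stale,
        key=lambda g: (g.last_reviewed is not None, g.last_reviewed or date.min),
    )


# Hypothetical inventory for illustration only.
grants = [
    Grant("svc-backup", "customer-ledger", "long-lived-credential", None),
    Grant("alice", "customer-ledger", "standing-access", date(2024, 1, 10)),
    Grant("bob", "marketing-site", "standing-access", None),
    Grant("carol", "build-pipeline", "standing-access", date(2024, 5, 20)),
]

for g in overdue(grants, {"customer-ledger", "build-pipeline"}, today=date(2024, 6, 1)):
    print(g.principal, g.system, g.last_reviewed or "never reviewed")
```

Grants on systems outside the consequence map are ignored here by design; the point is to spend review effort where compromise costs the most.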

Compliance and Security

Compliance requirements are real and have to be met. Regulatory exposure, insurance requirements, and contractual obligations are legitimate organizational constraints. The problem is treating compliance work as equivalent to security work, because they are solving different problems.

Compliance satisfies external requirements: auditors, frameworks, regulators, and customers. Security reduces how bad it gets and how long you stay failed when something goes wrong. These overlap sometimes, and when they do, that is efficient. When they do not overlap, they need to be managed and funded separately.

A useful test for any control: does it reduce how bad it gets or how long you stay failed, or does it primarily satisfy an auditor? Both are valid reasons to do something. The mistake is funding one under the pretense of the other, or allowing compliance work to crowd out the resources that should be going toward survivability.

Metrics

The metrics most commonly reported describe the existence of controls, not their effectiveness. Tool coverage percentages, vulnerability counts, and compliance scores tell you what was purchased and deployed. They do not tell you whether the program is working.

Time to detect for high-consequence systems, measured from an event occurring to a human understanding what is happening, tells you whether detection is functioning.
Time to contain and time to restore, measured from actual incidents and test exercises rather than runbook estimates, tell you whether recovery capability is real.
Blast radius per system, tested rather than assumed, tells you the actual scope of a compromise.
Alert signal quality, meaning the proportion of alerts that represent real activity requiring attention, tells you whether detection is producing signal or noise.
Honeytoken activation, meaning canary credentials placed in locations only an attacker would find, provides near-certain evidence of active intrusion when triggered.

These metrics require testing and operational discipline to produce, but they are the ones that tell you whether the program is doing what it is supposed to do.
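Two of these metrics reduce to simple arithmetic once the underlying timestamps and counts are recorded honestly. The sketch below assumes a hypothetical incident log of (event occurred, human understood) pairs and a count of alerts triaged as real; the function names and the choice of the median as the summary are illustrative.

```python
from datetime import datetime
from statistics import median


def hours_between(start, end):
    """Elapsed time in hours between two datetimes."""
    return (end - start).total_seconds() / 3600


def median_time_to_detect(incidents):
    """incidents: list of (occurred, understood_by_human) datetime pairs."""
    return median([hours_between(occurred, understood)
                   for occurred, understood in incidents])


def alert_signal_quality(alerts_total, alerts_requiring_attention):
    """Proportion of alerts that represented real activity requiring attention."""
    if alerts_total == 0:
        return 0.0
    return alerts_requiring_attention / alerts_total


# Hypothetical records from incidents and test exercises, for illustration only.
incidents = [
    (datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 6, 0)),
    (datetime(2024, 1, 2, 0, 0), datetime(2024, 1, 3, 0, 0)),
    (datetime(2024, 1, 5, 0, 0), datetime(2024, 1, 5, 12, 0)),
]

print("median time to detect (hours):", median_time_to_detect(incidents))
print("alert signal quality:", alert_signal_quality(400, 30))
```

The end timestamp is the moment a person understood what was happening, not the moment an alert fired. Recording the alert time instead would flatter the number and defeat the purpose of the metric.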