THE SECURITY BRUTALIST

Security Brutalism Under Real Conditions, Part 6: Starting from Zero

This series has spent four parts on how to apply Security Brutalism and Survivability Engineering to programs that already exist. This post is for the organization that does not have a security program yet, or has one in name only. It covers what a solid baseline looks like, how to know when you actually have it, and why these foundations never stop mattering, even for teams that have been doing this for years.

Start With the Baseline

A baseline security program is not a compliance certification and it is not a tool stack. It is a set of capabilities that answer the survivability questions for the systems that matter most. Nothing more than that, and nothing less.

Here is what it covers:

A consequence map for at least five systems. Authentication, customer data storage, payment processing if applicable, production deployment pipeline, and source code. These five, for most organizations, account for the scenarios that would actually end the business. Map them before touching anything else. Know what failure costs for each one. Know whether that cost is recoverable or existential. Everything else flows from that list.
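The consequence map can start as something as small as a structured list. A minimal sketch, with illustrative entries and field names (your own systems, costs, and existential judgments will differ):

```python
from dataclasses import dataclass

@dataclass
class SystemEntry:
    name: str
    failure_cost: str   # what failure costs, in plain language
    existential: bool   # True if that cost could end the business

# Illustrative starting list; replace with your own judgments.
consequence_map = [
    SystemEntry("authentication", "every user locked out or impersonated", True),
    SystemEntry("customer data storage", "breach disclosure, regulatory exposure", True),
    SystemEntry("payment processing", "revenue stops, card-brand penalties", True),
    SystemEntry("deployment pipeline", "attacker-controlled code reaches production", True),
    SystemEntry("source code", "IP loss; painful but recoverable", False),
]

# Everything else flows from the existential subset.
existential = [s.name for s in consequence_map if s.existential]
```

The point is not the tooling; it is that the list exists, is written down, and can be queried when prioritizing everything that follows.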

A working inventory of identities. Query the identity provider for every active account, human and non-human. Pull IAM roles and service accounts from every cloud environment. Run secret scanning against code repositories. For every non-human identity: what does it have access to, when was it last used, does it have a current owner? Anything with no recent use and no documented owner gets revoked now. This is not optional and it is not a future-quarter project. Unused credentials with no owner are the path an automated attack will find and use.
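Once the inventory exists, the revocation rule above is mechanical. A sketch of the filter, assuming hypothetical inventory rows assembled from your identity provider and cloud IAM exports (field names are illustrative):

```python
from datetime import datetime, timedelta

# Hypothetical rows from an identity inventory export.
identities = [
    {"id": "svc-deploy", "human": False, "owner": "alice",
     "last_used": datetime(2025, 1, 10)},
    {"id": "svc-legacy-etl", "human": False, "owner": None,
     "last_used": datetime(2023, 6, 1)},
    {"id": "bob", "human": True, "owner": "bob",
     "last_used": datetime(2025, 1, 12)},
]

def revocation_candidates(identities, now, stale_after=timedelta(days=90)):
    """Non-human identities with no documented owner and no recent use."""
    return [
        i["id"] for i in identities
        if not i["human"]
        and i["owner"] is None
        and now - i["last_used"] > stale_after
    ]

candidates = revocation_candidates(identities, datetime(2025, 1, 15))
# svc-legacy-etl is flagged; svc-deploy has an owner and recent use.
```

The 90-day staleness window is an assumption for the sketch; pick a threshold that matches how your non-human identities are actually used.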

MFA on everything. Every authentication path that reaches from the internet into any system should require a second factor. This single control removes a significant portion of credential-based attack paths. It is the highest-leverage control available per hour of effort for a program starting from zero.

No long-lived credentials to consequential systems. Standing API keys, service accounts with permanent production access, credentials that have not been rotated in months: these are the access paths that a breach converts into persistent intrusion. Rotate or revoke them. If a system breaks when a credential is revoked, that is useful information about an undocumented dependency, not a reason to keep the credential.
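Flagging credentials that have outlived a rotation policy is the same kind of mechanical check. A minimal sketch, with a hypothetical 90-day policy and illustrative credential records:

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=90)  # illustrative rotation policy

credentials = [
    {"id": "api-key-payments", "created": datetime(2023, 2, 1)},
    {"id": "api-key-reporting", "created": datetime(2025, 1, 1)},
]

def needs_rotation(cred, now, max_age=MAX_AGE):
    """True if the credential is older than the rotation policy allows."""
    return now - cred["created"] > max_age

overdue = [c["id"] for c in credentials
           if needs_rotation(c, datetime(2025, 1, 15))]
```

Anything in the overdue list either gets rotated or revoked; if something breaks when it is revoked, record the dependency you just discovered.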

Honeytokens deployed in at least three locations. A canary credential in a configuration file that is no longer in active use. A fake API key in an internal wiki or documentation page. A canary database credential in an old backup directory. These cost almost nothing to deploy and nothing to maintain, and they produce near-zero false positives. When one activates, something is actively looking through your environment. That is the first layer of detection, and it is available to an organization with a two-person security team and no SIEM.
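The detection side of a honeytoken is nothing more than a lookup at the point where credentials are checked. A sketch of the idea, with made-up canary values and locations:

```python
# Canary credentials planted in unused locations. Any authentication
# attempt with one of these values means something is reading files
# it has no legitimate reason to read.
CANARIES = {
    "AKIAEXAMPLECANARY01": "old config file in repo",
    "wiki-api-key-7f3a": "internal wiki page",
    "backup-db-pass-2019": "old backup directory",
}

def check_auth_attempt(credential):
    """Return an alert string if the credential is a canary, else None."""
    location = CANARIES.get(credential)
    if location:
        return f"CANARY TRIPPED: planted in {location}"
    return None
```

In practice a hosted canary service handles the alerting for you; the value of the sketch is showing why the false-positive rate is near zero: no legitimate process ever presents these values.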

Logging on consequential systems, with a human who reads it. Not comprehensive SIEM coverage of everything. Logging on the five systems from the consequence map, configured to alert on the anomalies that matter: first-time access from any identity, access outside normal hours, changes to access controls, unusual data volumes. And critically: someone who actually reads those alerts within a reasonable window. A log that nobody reads is not detection. It is storage costs.
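The four alert conditions above can be expressed as one small rule function. A sketch, assuming a simplified event shape and illustrative thresholds (working hours, volume limit):

```python
from datetime import datetime

# Identities previously observed accessing this system.
seen_identities = {"alice", "deploy-bot"}

def alert_reasons(event, seen, work_hours=(8, 18), volume_limit_mb=500):
    """Evaluate one access event against the baseline alert rules."""
    reasons = []
    if event["identity"] not in seen:
        reasons.append("first-time access")
    if not (work_hours[0] <= event["time"].hour < work_hours[1]):
        reasons.append("outside normal hours")
    if event.get("acl_change"):
        reasons.append("access-control change")
    if event.get("bytes_mb", 0) > volume_limit_mb:
        reasons.append("unusual data volume")
    return reasons

event = {"identity": "svc-new", "time": datetime(2025, 1, 15, 3, 0),
         "bytes_mb": 900}
reasons = alert_reasons(event, seen_identities)
```

The rules are deliberately few and consequential. Each reason on the list is something a human can act on, which is what keeps the alerts readable.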

One tested restoration. Pick the most consequential system on the list. Run an actual restoration from backup to a test environment. Time it end-to-end. Document what broke or was missing. This single exercise will tell you more about the actual survivability of that system than any other activity. If the restoration fails, that is the most important thing you can learn, and finding out during a scheduled test costs far less than finding out during an incident.
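The drill only produces value if the two outputs, elapsed time and what broke, are captured. A sketch of a harness for that, where `fake_restore` is a hypothetical stand-in for your real restore procedure:

```python
import time

def restore_drill(restore_fn):
    """Run a restoration into a test environment and record what the
    exercise is actually for: elapsed time and what broke."""
    start = time.monotonic()
    problems = restore_fn()          # returns a list of things that broke
    elapsed = time.monotonic() - start
    return {"elapsed_seconds": elapsed,
            "problems": problems,
            "passed": not problems}

# Hypothetical stand-in for the real restore procedure.
def fake_restore():
    return ["missing TLS certs in backup",
            "schema migration script not archived"]

result = restore_drill(fake_restore)
```

A failed drill with a documented problem list is a successful exercise; the same list produced during an incident is a very expensive one.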

Named owners for each of the above. Not a team. A person. If every item on this list has a named human who owns it and is accountable for it, the baseline survives turnover and growth. If ownership is diffuse or assumed, it decays the moment the person who built it leaves.

How to Test That It Is Working

A baseline that exists on paper and a baseline that actually works are different things. Testing is not optional, and testing does not mean reviewing documentation.

Run the survivability test on the most consequential system. Assume it is compromised right now. Walk through what a realistic attacker can do with current access. Then measure: how long before someone on your team notices something is wrong? How long to revoke all access to that system? How long to restore it to a known-good state from the last backup? Write those three numbers down. Compare them to what you thought they were before you ran the test. The gap is what you are actually working with.
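Writing the three numbers down is the whole exercise. A sketch of the comparison, with illustrative values in minutes:

```python
# Measured during the test vs. assumed beforehand, in minutes.
# Values are illustrative.
measured = {"time_to_notice": 480, "time_to_revoke": 120, "time_to_restore": 360}
assumed  = {"time_to_notice": 30,  "time_to_revoke": 15,  "time_to_restore": 60}

# The gap is what you are actually working with.
gaps = {metric: measured[metric] - assumed[metric] for metric in measured}
```

The gaps, not the measured numbers alone, are what drive the next quarter's priorities: a large gap means the program's self-model is wrong, which is more dangerous than a known slow number.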

Trigger one honeytoken manually. Access one of the canary credentials you deployed. See how long it takes for the alert to reach a human, how long for that human to understand what it means, and how long to begin an investigation. If the alert takes hours to surface or nobody knows what to do with it, the detection capability is not actually working.

Revoke your own access to a production system. Pick an account that should not have standing access. Revoke it. See whether any legitimate process breaks, and how long it takes before someone raises it as a problem. If it breaks something undocumented, that is a mapping gap to close. If nobody notices for days, that is a signal about how much standing access in your environment is actually being used versus sitting idle.

Try to answer these three questions from evidence, not memory. For your most consequential system: what identities have access to it right now? If it were compromised at this moment, what else could an attacker reach from it? When was the last time a backup was successfully restored? If any of these takes more than ten minutes to answer, or if the honest answer is "I don't know", those are the gaps to close first.

The testing above is not theoretical. AI-assisted attack tools run continuously against exposed services, find misconfigurations before anyone reports them, and identify which credentials in which repositories are active and what they have access to. This is not a sophisticated nation-state capability. It is available to a moderately resourced attacker running automated tooling. An organization with no documented non-human identity inventory, no tested recovery procedure, and no detection on its most consequential systems is not in a stable waiting position. It is in a state that automated scanning will find, characterize, and potentially act on while the security team is building its program. The urgency of the baseline is not about achieving perfection. It is about removing the easiest paths while the harder work continues.

The specific things AI-assisted attacks do well, and that the baseline directly addresses:

Automated credential discovery. Secret scanning finds committed tokens fast. Rotating credentials proactively removes the value of what is found. Revoking unused credentials removes the access before it is discovered and used.

Scale. AI-assisted phishing and credential attacks operate at a volume that makes low-friction access paths reliable targets even when per-attempt success rates are low. MFA removes the value of mass-scale credential attacks because a valid password alone is not sufficient.

Speed of exploitation. The window between a vulnerability becoming known and being actively exploited has shrunk. A system where compromise of one component immediately grants access to all others is a much faster path to the objective than a system where the blast radius is bounded. Segmentation and blast-radius design do not slow the initial compromise. They slow everything after it.

Mature Programs Need This Too

The baseline described here is not a beginner program that mature teams have graduated past. It is the foundation that a mature program needs to maintain.

Mature programs decay. They accumulate tools and access grants and service accounts faster than they review them. Permissions granted for a project that ended two years ago are still active. Detection baselines were calibrated in a different environment and have not been reviewed as the environment changed. The last tested restoration is eighteen months old. The consequence map was built in a workshop that nobody has revisited since the last acquisition.

A sixty-person security team can have weaker foundations than a three-person team that has maintained the discipline, because the sixty-person team has more complexity to manage and more opportunities for the foundations to decay. The survivability test applies equally. The quarterly entitlement review applies equally. The tested restoration applies equally.

The most common failure mode in mature programs is not a missing tool or an unfunded control. It is that the foundations were built and then assumed to be stable. They are not stable. They require the same ongoing maintenance as any other security control, and that maintenance is what the cadence in Part 3 is designed to provide.

As Part 5 discussed, the active layer of a security program, the specialist cell operating outside the walls with deception technology and continuous war-gaming, requires a clean environment to be effective. A mature program that has let its foundations decay will not get the value from that layer that it should. Deception intelligence produced in a noisy environment gets buried. War-gaming results against a poorly understood environment are not actionable. The active layer makes the foundation more powerful. It does not substitute for it.

So, what to focus on in a mature program?

For a team that has the foundations in place and is asking where to go next, the priorities organize around three questions.

Is the detection still calibrated to the actual environment? Environments change faster than detection rules. A behavioral baseline built two years ago reflects a different system, different team, and different access patterns. Review it. If alert volumes have grown to the point where investigation rates have dropped, that is the signal that recalibration is overdue.

Has the consequence map kept up with the business? Acquisitions, new products, new regulated markets, and architectural changes all shift what the existential list looks like. A consequence map that predates a significant business change is a fictional map being used to make real prioritization decisions.

Is recovery capability still real? Test it. Not a tabletop. An actual restoration, timed. If the most recent tested restoration is more than six months old, the recovery capability is assumed, not known.
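The known-versus-assumed distinction is sharp enough to encode as a check. A minimal sketch, using the six-month threshold from above:

```python
from datetime import date, timedelta

def recovery_is_known(last_tested_restore, today,
                      max_age=timedelta(days=182)):
    """A recovery capability is 'known' only if a real restoration was
    tested recently; older than the threshold, it is merely assumed."""
    return today - last_tested_restore <= max_age

status = recovery_is_known(date(2024, 1, 1), date(2025, 1, 1))
```

Running a check like this as part of a quarterly review turns "we think backups work" into a dated fact with an expiry.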

Beyond the foundations: the active layer described in Part 5 is the natural extension for a mature program that has the foundations solid and is looking to raise the cost of reaching what it has hardened. And the agent security work described in Part 4 is not optional for any organization that has deployed autonomous AI systems in production. Those agents are running with the same access assumptions that service accounts ran with before organizations learned to audit them, and they will produce the same class of incident if left unexamined.

The foundation never stops being the foundation. What changes as the program matures is depth of coverage, accuracy of testing, and the extensions that build on top of it. Not whether it needs to be maintained.


If you’re interested in building a stronger security program along these lines, you can reach out at Black Arrows.