Red Team, Blue Team

TL;DR for executives

Before you take a recommendation to a board, a partner, or a room that will challenge it, red teaming lets you challenge it yourself first, going after the assumptions it rests on, finding the scenarios where it breaks, identifying what a smart adversary would exploit. The result is reasoning you can defend because you’ve already faced the hardest questions and know exactly where it holds and where it doesn’t.

Most frameworks focus on thinking through a problem from your own perspective. Even when you stress-test your own hypothesis or run a pre-mortem, you are still you, looking at the problem from your angle, with your assumptions, through your lens.

Red Team / Blue Team splits you in two. One side builds the case. The other side tears it apart. On purpose. Systematically. With full permission to be adversarial.

The structure: Blue Team develops the strategy, the plan, the recommendation. They make the strongest possible case for why this is the right move. Red Team then attacks it. But instead of vague skepticism, they use specific, targeted challenges designed to find the weakest points. The goal isn’t to win but to make the strategy stronger by exposing its vulnerabilities before reality does.

Where it comes from:

  • The practice originates in military and intelligence communities. The term “red team” comes from Cold War military exercises where one team played the enemy (red) and the other played friendly forces (blue). The red team’s job was to think like the adversary and to attack the blue team’s plan using the adversary’s tactics, resources, and mindset.
  • Later the CIA formalized red teaming. The problem wasn’t a lack of information but that everyone was looking at the information through the same lens. Groupthink. Confirmation bias. Shared assumptions that nobody challenged because challenging felt disloyal and pessimistic.
  • In business, red teaming is used by Amazon (where teams must write a “press release” for a product and then a separate team tries to destroy their argument), by cybersecurity firms (where red teams literally try to hack into their own system), and by consulting firms during the quality review process before presenting to clients.

Why this matters:

  • Executives trust recommendations that have survived adversarial testing. If you can say “I tried to destroy this recommendation and here’s what I found,” that’s a fundamentally different level of credibility than “here’s what I think.”
  • More practically: every executive has people around her who will challenge her decisions. Board members, investors, competitors, skeptical direct reports. If your recommendation falls apart under their first hard question, you’ve damaged her positioning. If it has already been stress-tested, you can tell her “they’ll probably push back on X, and here’s why that pushback won’t hold.”
  • Red teaming forces you to systematically attack your own best thinking. It’s the discipline of treating your own ideas as hypotheses to be tested, not conclusions to be defended.

The key disciplines:

  • Separation is essential. The blue team and red team must think independently. If the same person tries to build and attack simultaneously, the attack is always weaker because the builder’s ego protects the idea. In a team setting, different people play each role. When you’re working alone or with one thinking partner, you need to create temporal separation: build the case first, then explicitly switch modes and attack it. The switch needs to be deliberate and complete.
  • The red team must be genuinely adversarial. Not “devil’s advocate” in the casual sense of raising mild objections. The red team should be trying to destroy the strategy. What’s the fatal flaw? What assumption, if wrong, collapses the entire case? What has the blue team ignored, downplayed, or assumed away? The red team’s job is to find the thing that kills the strategy.
  • Attack assumptions, not conclusions. The most effective red team move isn’t arguing with the recommendation but identifying the hidden assumptions that the recommendation rests on and asking: what if these aren’t true? Every strategy is built on assumptions about the market, the customer, the competition, the team, the timeline. Most of these assumptions are invisible to the people who made them. The red team makes them visible and tests them.
  • The red team must propose alternatives. Pure criticism is easy and unhelpful. The red team should not only identify weaknesses but propose what a better strategy would look like given those weaknesses. This prevents the exercise from becoming nihilistic, tearing everything down without building anything up.

Variations: 

  1. Pre-mortem plus red team. Run a pre-mortem first to generate failure scenarios, then use the red team to attack mitigation strategies. Layered stress-testing.
  2. Competitive red team. The red team doesn’t just attack your strategy; it also plays as your competitor. “If I were your competitor and I saw you making this move, here’s exactly how I’d respond to neutralize or exploit it.” This surfaces competitive dynamics that internal analysis misses.
  3. Customer red team. The red team plays as the customer. “If I were your target buyer and you pitched me this, here’s why I’d say no.” This surfaces positioning and messaging weaknesses.
  4. Assumption audit. A lighter version where you skip the full team build and go straight to listing every assumption underlying a strategy then rating each one on how confident you are and how catastrophic it would be if wrong. The assumptions that are low confidence AND high impact are your vulnerabilities.
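The assumption audit lends itself to a simple scoring exercise: rate each assumption on confidence and impact, then rank by vulnerability. A minimal sketch in Python; the assumptions and scores below are invented placeholders, not data from the source.

```python
# Assumption audit sketch: rate each assumption on confidence
# (0-1, how sure we are it holds) and impact (0-1, how bad it is
# if wrong), then rank by vulnerability = (1 - confidence) * impact.
# Low confidence AND high impact floats to the top.
# All claims and numbers are illustrative placeholders.

assumptions = [
    {"claim": "Enterprise buyers only purchase through trust networks",
     "confidence": 0.4, "impact": 0.9},
    {"claim": "We can pick the right industry niche from the outside",
     "confidence": 0.3, "impact": 0.9},
    {"claim": "Certifications meaningfully reduce buyer hesitation",
     "confidence": 0.6, "impact": 0.5},
    {"claim": "Law firms will see the tool as empowering, not threatening",
     "confidence": 0.5, "impact": 0.7},
]

def vulnerability(a):
    # The score grows as confidence falls and impact rises.
    return (1 - a["confidence"]) * a["impact"]

for a in sorted(assumptions, key=vulnerability, reverse=True):
    print(f"{vulnerability(a):.2f}  {a['claim']}")
```

The ranked list tells you where the red team should attack first: the top entries are exactly the low-confidence, high-impact assumptions the lighter audit is designed to surface.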

The common pitfalls:

  1. Red team that’s too polite. If the red team’s challenges are soft, “well, we might want to consider,” the exercise fails. The red team needs permission and encouragement to be harsh. The best red team challenges make the blue team uncomfortable.
  2. Blue team that’s defensive. If the blue team treats red team challenges as personal attacks rather than structural tests, the integration phase fails. The blue team’s job is to listen, evaluate honestly, and update. Not to defend their ego.
  3. Red team that only finds small problems. Nitpicking parameters while missing fundamental strategic flaws. The red team should start with the biggest assumptions and work down, not start with details and never reach the foundation.
  4. Running it once. The best red team exercises are iterative. Blue team builds. Red team attacks. Blue team revises. Red team attacks the revision. Each round produces a stronger strategy.
  5. Confusing red teaming with pessimism. A good red team isn’t negative, but rigorous. They want the strategy to succeed, they’re testing it so it can survive the real world. The spirit is “let’s make this bulletproof” not “let’s prove this won’t work.”

How to go about it:

  • Phase one: Blue Team builds the case. State the strategy clearly. List the key assumptions it rests on. Present the evidence supporting it. Make it as strong as possible. Don’t hedge. Don’t preemptively address weaknesses. Build the strongest case you can.
  • Phase two: Red Team attacks. Examine each assumption. Where’s the evidence weakest? What’s been assumed without verification? What alternative explanations exist for the same evidence? What would a smart, motivated adversary do to exploit the strategy’s weaknesses? What’s the scenario where this strategy fails catastrophically?
  • Phase three: Integration. The blue team responds to the red team’s challenges. Some attacks will be valid: the strategy needs to change. Some will be addressable: the strategy holds but needs additional mitigation. Some will be irrelevant: the red team overreached. The final strategy is stronger than either the blue team or red team could produce alone.

Exercise

A new AI company targeting enterprise clients. Strong technical co-founder. They’ve built a genuinely good AI-powered contract analysis tool. But they have zero clients, zero brand recognition, and they’re targeting inherently conservative enterprise buyers. The co-founder says: “Our product is strong. But nobody will talk to us. How do we break through?” Build the GTM strategy (blue team), attack it (red team), integrate.

Answer

  • Blue Team: the GTM strategy
    • Can we go mass GTM right now? No. As an enterprise-only, enterprise-native product, casting a wide net would dilute our efforts and resources. These buyers operate on trust and recommendations, not ads and content funnels. So I’d start by narrowing.
    • Choose one or two industries where contract analysis carries the highest stakes and the most specialized requirements: cargo/transportation, or financial and other regulated industries. Start by talking to people: interview professionals, identify what makes these industries different in terms of contract analysis, list the real risks they face, assess their trust levels around AI adoption, and build messaging that removes their specific worries. Meanwhile, signal trustworthiness by obtaining and publicizing relevant certifications.
    • After the analysis, we might discover specific use cases: bounded scenarios with unpredictable actors that these industries face. We build AI workflows for those specific scenarios and design messaging around them. This becomes our positioning and communication kit.
    • We also need to consider the lawyer ecosystem. In these industries, enterprises work with law firms. We can’t position the tool as threatening lawyers. It needs to empower them. But some enterprises have in-house legal departments, which creates different dynamics.
    • In terms of GTM, based on the research, I’d focus on personal networking rather than massive online GTM ads, which burn cash without results when you have zero logos or case studies. We need one or two first clients because they open the doors to the rest.
    • Two original entry mechanisms for the zero-trust problem:
      • Intel products for the industry: Create analysis of contract errors and their impact as a gift that demonstrates expertise and opens doors without selling.
      • A niche podcast or TV series: Invite industry professionals to discuss contract challenges. This is a legitimate, non-sales way of building relationships with decision-makers. The value isn’t the audience, but the guest relationship.
  • Red Team: the attack
    • Time, time, time. How long will all this analysis take? How long to create the podcast and land the first guest? We’re taking the sophisticated route when we could start running ads TODAY and testing TODAY. Small law firms with under ten people, understaffed and overloaded with cases, could be the perfect target for ads. They move almost like B2C buyers: they need results fast. We can position ourselves against the generic AI tools they already use to read contracts. Gaining access to small law firms gives us access to their enterprise clients, so we test the market through action, not by waiting for the perfect analysis.
    • Assumption one (fatal): We can pick the right niche from the outside. Based on what criteria would we choose those industries? Each industry has different particularities. We have no clue how they're different. The fact that they’re “complex” doesn’t mean we can cover their use cases. And industries that seem underserved might actually be the best-served already. The real opportunity might be in industries with less legal work but more dependence on gatekeepers, industries we'd never consider from a desk.
    • Assumption two (seriously damaging): Enterprise buying is purely trust-based and relationship-driven. That’s a generalization. Enterprise companies are messy. Not every decision goes through curated trust networks. Cold outreach with AI tools costs almost nothing and produces real-world signal even if it doesn’t convert: we learn how they respond, what they care about, how they talk. We don’t lose time.
  • Integration
    • We do both. Go into the world with ads aimed at small law firms and cold emails to different enterprises, but with clear tests defined upfront. Pick two or three enterprise niches. Test highly regulated versus lightly regulated segments to see which respond. Do the same with law firms. Define assumptions, adapt the ads accordingly, and see how reality responds. Even zero results is still signal that we need to change our approach. If we do get results, the structure allows us to draw the right conclusions and adapt.
    • At the same time, test a podcast pilot. Season 1 dedicated to a specific topic, eight episodes. Hand-pick guests. Build content that compounds in quality and in networking. This is a long-term investment that builds the brand over time.
    • The key design principle: structured experimentation where every outcome, including failure, is informative (think effectuation). We don’t need to be right before moving. We need to move in ways that make us less wrong.
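The integration’s design principle, structured experimentation where every outcome is informative, can be sketched as an explicit experiment grid: each test cell names the assumption it probes, so even zero results falsify something. A minimal illustration in Python; the segments, channels, assumptions, and numbers are invented for the example, not from the source.

```python
# Structured GTM experiment grid: every cell pairs a segment/channel
# with the explicit assumption it tests, so any outcome (including
# zero meetings) is informative signal rather than mere failure.
# All segments, channels, assumptions, and numbers are illustrative.

experiments = [
    {"segment": "small law firms", "channel": "ads",
     "assumption": "overloaded small firms buy fast, like B2C"},
    {"segment": "highly regulated enterprise", "channel": "cold email",
     "assumption": "regulation raises contract stakes enough to earn replies"},
    {"segment": "lightly regulated enterprise", "channel": "cold email",
     "assumption": "less gatekeeping makes outreach land more often"},
]

def record(exp, replies, meetings):
    # Zero meetings doesn't mean "no data": it challenges the stated
    # assumption, which is exactly the signal the design is after.
    exp["replies"], exp["meetings"] = replies, meetings
    exp["verdict"] = ("assumption survives" if meetings > 0
                      else "assumption challenged")
    return exp

record(experiments[0], replies=4, meetings=1)
record(experiments[1], replies=0, meetings=0)

for e in experiments:
    print(e["segment"], "->", e.get("verdict", "pending"))
```

The point of the grid is that "we learned nothing" is impossible by construction: every cell resolves to either a surviving or a challenged assumption, which is what "move in ways that make us less wrong" looks like operationally.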