Threat Modeling

Threat modeling is the practice of systematically identifying what can go wrong with your system's security before it goes wrong. It shifts security left, into the design phase, where fixing issues is cheap, rather than leaving vulnerabilities to be discovered in production, where fixing them is expensive and damage may already be done. A 30-minute threat modeling session during design can prevent weeks of remediation after deployment.

What Threat Modeling Is

Threat modeling answers four questions:

What are we building? Understand the system's architecture, data flows, and trust boundaries.
What can go wrong? Identify threats that could compromise the system.
What are we going to do about it? Plan mitigations for the highest-risk threats.
Did we do a good enough job? Validate that mitigations are effective.

It is not a one-time exercise. Threat models should be revisited when the architecture changes, new features are added, or new threats emerge.

The STRIDE Framework

STRIDE is a threat classification framework developed at Microsoft. It provides six categories that cover the major ways systems can be attacked, and each category maps to the security property it violates.

Spoofing

Pretending to be someone or something else. Spoofing attacks violate authentication.

# Examples
- Forged authentication tokens
- IP address spoofing
- Email sender spoofing
- DNS spoofing redirecting traffic to attacker's server
- Stolen session cookies used to impersonate a user

# Mitigations
- Strong authentication (MFA)
- Certificate-based identity verification
- DKIM/SPF/DMARC for email
- Session tokens with integrity checks
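
The "session tokens with integrity checks" mitigation can be sketched with an HMAC tag appended to the session ID. This is a minimal illustration, not a complete session scheme: the names `SECRET_KEY`, `sign_session`, and `verify_session` are hypothetical, and the token format (`id.tag`) is an assumption made for the example.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical server-side secret; a real service would load it from a secrets manager.
SECRET_KEY = secrets.token_bytes(32)


def sign_session(session_id: str, key: bytes = SECRET_KEY) -> str:
    """Return 'session_id.tag', where tag is an HMAC-SHA256 over the session ID."""
    tag = hmac.new(key, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{session_id}.{tag}"


def verify_session(token: str, key: bytes = SECRET_KEY) -> Optional[str]:
    """Return the session ID if the tag is valid, otherwise None."""
    session_id, sep, tag = token.rpartition(".")
    if not sep:
        return None  # no separator, so the token cannot carry a tag
    expected = hmac.new(key, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so the comparison leaks no timing hints.
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return session_id
    return None
```

An integrity check of this kind detects forged or altered tokens. It does not help against a stolen, unmodified cookie, which is why it is usually combined with the other mitigations in the list.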

Tampering

Modifying data or code without authorization. Tampering attacks violate integrity.

# Examples
- Man-in-the-middle attacks modifying network traffic
- SQL injection altering database records
- Unauthorized modification of configuration files
- Tampered software updates (supply chain attack)
- Modified audit logs to cover attacker activity

# Mitigations
- TLS for data in transit
- Input validation and parameterized queries
- File integrity monitoring
- Code signing for software distribution
- Append-only audit logs
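
As an illustration of the parameterized-query mitigation, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The `accounts` table and the `balance_for` helper are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database with a hypothetical accounts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
)
conn.execute("INSERT INTO accounts (owner, balance) VALUES (?, ?)", ("alice", 100))
conn.execute("INSERT INTO accounts (owner, balance) VALUES (?, ?)", ("bob", 50))


def balance_for(owner: str):
    """Look up a balance using a bound parameter instead of string formatting."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE owner = ?", (owner,)
    ).fetchone()
    return row[0] if row else None


# A classic injection payload is treated as a literal owner name and matches nothing.
malicious = "alice' OR '1'='1"
print(balance_for("alice"))
print(balance_for(malicious))
```

Because the driver passes the value separately from the SQL text, the payload never changes the structure of the query.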

Repudiation

Denying that an action was performed. Repudiation attacks violate accountability (also called non-repudiation).

# Examples
- User denies making a purchase
- Admin denies deleting records
- Attacker covers tracks by deleting logs
- Employee denies accessing sensitive data

# Mitigations
- Comprehensive audit logging
- Tamper-evident log storage
- Digital signatures on transactions
- Non-repudiation controls (signed receipts, confirmations) for financial operations
- Centralized log aggregation
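
One common way to make log storage tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below is one possible approach, not the only one. The entry layout and the helper names `append_entry` and `verify_chain` are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)}
    )


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

An attacker who can rewrite the entire log can also recompute the chain, so the latest hash should be stored somewhere the attacker cannot reach, such as a separate centralized log service.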

Information Disclosure

Exposing data to unauthorized parties. Information disclosure attacks violate confidentiality.

# Examples
- Error messages revealing database structure
- API returning more data than the user should see
- Unencrypted data at rest accessible to backup operators
- Directory listing enabled on web servers
- Sensitive data in application logs

# Mitigations
- Encryption at rest and in transit
- Principle of least privilege for data access
- Generic error messages for external users
- Data classification and handling policies
- Log scrubbing for sensitive fields
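
Log scrubbing can be sketched as a filter attached to a standard `logging` logger. The field names in the pattern (`password`, `token`, `card_number`) and the `key=value` message format are assumptions. A real application would match its own field names and log formats.

```python
import logging
import re

# Hypothetical sensitive field names in key=value form.
SENSITIVE = re.compile(r"(password|token|card_number)=([^\s,;]+)", re.IGNORECASE)


def scrub(message: str) -> str:
    """Replace the value of any sensitive key=value pair with a placeholder."""
    return SENSITIVE.sub(r"\1=[REDACTED]", message)


class ScrubFilter(logging.Filter):
    """Logging filter that redacts sensitive values before a record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as arguments are scrubbed too.
        record.msg = scrub(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted


logger = logging.getLogger("payments")
logger.addFilter(ScrubFilter())
```

Pattern-based scrubbing is a safety net. It is more reliable not to pass sensitive values to the logger in the first place.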

Denial of Service

Making a system unavailable to legitimate users. DoS attacks violate availability.

# Examples
- Volumetric DDoS flooding network bandwidth
- Application-layer attacks exhausting CPU/memory
- Resource exhaustion through unthrottled API calls
- Algorithmic complexity attacks (regex DoS)
- Database lock contention from crafted queries

# Mitigations
- Rate limiting and throttling
- CDN and DDoS protection services
- Auto-scaling infrastructure
- Input validation (reject oversized payloads)
- Circuit breakers and timeouts
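
Rate limiting is often implemented with a token bucket. The following is a minimal single-process sketch. The class name `TokenBucket` and its parameters are hypothetical, and a production limiter would typically keep one bucket per client and share state across servers.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self._now = now                # injectable clock, useful for testing
        self._last = now()

    def allow(self) -> bool:
        """Return True and consume a token if the request may proceed."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would reject or delay the request when `allow()` returns False, for example by responding with HTTP 429.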

Elevation of Privilege

Gaining capabilities beyond what was authorized. Elevation of privilege attacks violate authorization.

# Examples
- Exploiting a buffer overflow to gain root access
- IDOR allowing access to other users' resources
- JWT manipulation to change role from "user" to "admin"
- Container escape to access the host system
- Privilege escalation through misconfigured sudo rules

# Mitigations
- Principle of least privilege
- Signature and claim validation on authorization tokens (for example, JWTs)
- Proper access control checks on every request
- Container security hardening
- Regular access reviews
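
The "access control checks on every request" mitigation is what prevents IDOR. The sketch below shows an ownership check performed at the point where the resource is loaded. The `User`, `Invoice`, and `get_invoice` names and the in-memory `INVOICES` store are hypothetical stand-ins for a real data layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    role: str  # assumed to come from server-side records, not from the client


@dataclass(frozen=True)
class Invoice:
    id: int
    owner_id: int


class Forbidden(Exception):
    """Raised when the current user may not access the requested resource."""


# Hypothetical in-memory store standing in for a database.
INVOICES = {1: Invoice(1, owner_id=10), 2: Invoice(2, owner_id=20)}


def get_invoice(current_user: User, invoice_id: int) -> Invoice:
    """Return the invoice only if the caller owns it or is an admin."""
    invoice = INVOICES.get(invoice_id)
    # Use the same error for "missing" and "not yours" so IDs cannot be enumerated.
    if invoice is None:
        raise Forbidden(f"invoice {invoice_id} is not accessible")
    if invoice.owner_id != current_user.id and current_user.role != "admin":
        raise Forbidden(f"invoice {invoice_id} is not accessible")
    return invoice
```

Putting the check inside the data access function, rather than in each route handler, makes it harder to forget on a new endpoint.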

How to Threat Model

Step 1: Identify Assets

Start by listing what you are protecting. Not everything has the same value — customer PII, financial data, and authentication credentials are higher-value targets than public marketing content.

# Asset inventory example
High value:
  - Customer database (PII, payment info)
  - Authentication service (credentials, tokens)
  - Encryption keys and secrets

Medium value:
  - Application source code
  - Internal documentation
  - Employee directory

Low value:
  - Public website content
  - Marketing materials
  - Open-source dependencies

Step 2: Map the Architecture

Draw the system's components, data flows, and trust boundaries. A trust boundary is any point where the level of trust changes — between the internet and your network, between your application and the database, between microservices with different permission levels.

# Architecture mapping elements
Components:        Web server, API, database, cache, queue
Data flows:        User requests, API calls, database queries
Trust boundaries:  Internet/DMZ, DMZ/internal, app/database
Entry points:      Public APIs, admin panels, file uploads
External deps:     Third-party APIs, cloud services, CDNs
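
The architecture map can also be kept as data, which makes trust boundary crossings easy to list. This is a rough sketch: the component names, zone names, and the `boundary_crossings` helper are illustrative assumptions, not part of any standard tool.

```python
from dataclasses import dataclass

# Hypothetical assignment of components to trust zones.
ZONES = {
    "browser": "internet",
    "web_server": "dmz",
    "api": "internal",
    "cache": "internal",
    "database": "data",
}


@dataclass(frozen=True)
class DataFlow:
    source: str
    destination: str
    description: str


FLOWS = [
    DataFlow("browser", "web_server", "user requests"),
    DataFlow("web_server", "api", "API calls"),
    DataFlow("api", "cache", "cache lookups"),
    DataFlow("api", "database", "database queries"),
]


def boundary_crossings(flows, zones):
    """Return the flows whose source and destination sit in different trust zones."""
    return [f for f in flows if zones[f.source] != zones[f.destination]]


for flow in boundary_crossings(FLOWS, ZONES):
    print(f"{flow.source} -> {flow.destination}: {flow.description}")
```

Flows that cross a boundary are the natural starting points for threat identification, since that is where data moves between levels of trust.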

Step 3: Identify Threats

Walk through each component and data flow, applying STRIDE categories. For each component, ask: "Can this be spoofed? Can the data be tampered with? Can actions be repudiated?" and so on through all six categories.

# Threat identification for a payment API
Component: POST /api/payments

Spoofing:    Can an attacker forge payment requests?
Tampering:   Can payment amounts be modified in transit?
Repudiation: Can a user deny making a payment?
Info Disc:   Can payment details leak through error messages?
DoS:         Can the payment endpoint be overwhelmed?
EoP:         Can a user process payments for other accounts?
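
To make sure no category is skipped, the walk-through can be generated as a checklist with one row per component and STRIDE category. The sketch below is only a worksheet generator. The `stride_checklist` helper and the row layout are assumptions for illustration.

```python
# Each STRIDE category and the security property it violates.
STRIDE = {
    "Spoofing": "authentication",
    "Tampering": "integrity",
    "Repudiation": "accountability",
    "Information Disclosure": "confidentiality",
    "Denial of Service": "availability",
    "Elevation of Privilege": "authorization",
}


def stride_checklist(components):
    """Build one empty worksheet row per (component, category) pair."""
    return [
        {"component": component, "category": category, "property": prop, "threats": []}
        for component in components
        for category, prop in STRIDE.items()
    ]


rows = stride_checklist(["POST /api/payments", "database"])
for row in rows[:3]:
    print(f"{row['component']}: {row['category']} (violates {row['property']})")
```

The team then fills in the `threats` list for each row during the session. An empty list should be a deliberate answer rather than an omission.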

Step 4: Rate Risk

Not all threats are equal. Use a simple risk matrix combining likelihood and impact.

# Risk rating matrix
                   Low Impact   Medium Impact   High Impact
High Likelihood:   Medium       High            Critical
Med Likelihood:    Low          Medium          High
Low Likelihood:    Low          Low             Medium

Focus mitigation efforts on high and critical risks. Low risks are documented but may not require immediate action.
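
The matrix above translates directly into a lookup table. The sketch below encodes the same nine cells. The function names `rate_risk` and `needs_mitigation` are hypothetical.

```python
# (likelihood, impact) -> risk rating, copied from the matrix above.
RISK_MATRIX = {
    ("high", "low"): "medium",
    ("high", "medium"): "high",
    ("high", "high"): "critical",
    ("medium", "low"): "low",
    ("medium", "medium"): "medium",
    ("medium", "high"): "high",
    ("low", "low"): "low",
    ("low", "medium"): "low",
    ("low", "high"): "medium",
}


def rate_risk(likelihood: str, impact: str) -> str:
    """Look up the risk rating for a likelihood/impact pair (case-insensitive)."""
    key = (likelihood.lower(), impact.lower())
    if key not in RISK_MATRIX:
        raise ValueError(f"unknown likelihood/impact pair: {key}")
    return RISK_MATRIX[key]


def needs_mitigation(rating: str) -> bool:
    """High and critical risks get a mitigation plan; the rest are documented."""
    return rating in {"high", "critical"}


print(rate_risk("medium", "high"))
```

Note that a low-likelihood, high-impact threat rates as medium under this matrix, so teams may want to review those entries by hand rather than rely on the threshold alone.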

Step 5: Plan Mitigations

For each high-risk threat, define a specific mitigation with an owner and timeline.

# Mitigation plan example
Threat: Payment amount tampering in API request
Risk: High (medium likelihood, high impact)
Mitigation: Server-side price calculation from product catalog;
  client-submitted prices are ignored
Owner: Payment team
Timeline: Before launch
Status: Implemented
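
The mitigation in this plan, server-side price calculation, might look like the following sketch. The `CATALOG` contents, the request item layout, and the `order_total` helper are hypothetical. The point is only that any client-supplied `price` field is never read.

```python
from decimal import Decimal

# Hypothetical product catalog; a real service would query its own price source.
CATALOG = {"sku-1": Decimal("19.99"), "sku-2": Decimal("5.00")}


def order_total(items) -> Decimal:
    """Compute the total from catalog prices, ignoring any price sent by the client."""
    total = Decimal("0")
    for item in items:
        sku = item["sku"]
        quantity = item["quantity"]
        if sku not in CATALOG:
            raise ValueError(f"unknown SKU: {sku}")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity <= 0:
            raise ValueError(f"invalid quantity for {sku}: {quantity!r}")
        total += CATALOG[sku] * quantity
    return total


# The tampered "price" field has no effect on the result.
request_items = [
    {"sku": "sku-1", "quantity": 2, "price": "0.01"},
    {"sku": "sku-2", "quantity": 1},
]
print(order_total(request_items))
```

Validating the quantity matters as much as ignoring the price, since a negative quantity would otherwise reduce the total.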

When to Threat Model

During design. The highest-value time to threat model is before code is written. Architecture decisions are cheap to change during design and expensive to change after deployment.

Before major changes. Adding a new payment provider, migrating databases, introducing a new API, or changing authentication flows all warrant a threat model update.

For critical systems. Systems handling payments, authentication, personal data, or infrastructure access should always have a current threat model.

After a security incident. Incidents reveal blind spots in existing threat models. Update the model to include the attack vector that was missed.

The 30-Minute Threat Model

Comprehensive threat modeling can take days, but a lightweight 30-minute session is far better than no threat modeling at all. Use this format when time is limited.

# 30-minute threat model format
5 min:  Draw the system on a whiteboard
        (boxes for components, arrows for data flows)
5 min:  Mark trust boundaries
        (where does trust level change?)
10 min: Walk through STRIDE for the highest-value
        data flow (usually the main user action)
5 min:  Rate the top threats by risk
5 min:  Assign owners for the top 3 mitigations

This lightweight approach works for feature-level changes. Save comprehensive multi-day sessions for system-level architecture reviews.

Real-World Example

Microsoft threat-modeled the Xbox Live authentication system before launch and identified that session token theft could allow account takeover. The mitigation, binding session tokens to the console hardware ID, prevented a class of attacks that plagued competing services. The threat modeling session cost a few hours. Discovering the vulnerability after launch could instead have cost millions in incident response, customer compensation, and reputation damage.

Common Pitfalls

Analysis paralysis. Spending weeks on a perfect threat model instead of producing a good-enough model quickly. Start lightweight and iterate.
Threat modeling only at the start. Systems evolve. A threat model from two years ago does not cover features added last month. Review regularly.
Not involving the development team. Threat models produced only by security teams miss implementation details. Developers know their system's actual behavior, edge cases, and shortcuts.
Ignoring low-probability, high-impact threats. Rare events with catastrophic consequences (like supply chain compromise) deserve mitigation even if the probability seems low.
No follow-through on mitigations. Identifying threats without implementing mitigations provides zero security benefit. Track mitigations like any other engineering work.
Only modeling external attackers. Insider threats, supply chain compromises, and misconfiguration are often higher-risk than direct external attacks.

Key Takeaways

Threat modeling identifies security issues during design, when fixes are cheap.
STRIDE provides six threat categories: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege.
The process: identify assets, map architecture, identify threats, rate risks, plan mitigations.
Threat model during design, before major changes, for critical systems, and after incidents.
A 30-minute lightweight threat model is dramatically better than no threat model.
Involve developers in threat modeling — they know the system's actual behavior and edge cases.
Follow through on mitigations with owners, deadlines, and tracking.