This is Part 2 of “The Centaur’s Toolkit” series. In Part 1, we covered the four collaboration modes for AI pair programming. Now we apply that framework to higher-stakes territory: security.


You’ve embraced AI pair programming. You’re using Strategist mode for architecture, Editor mode for refinement, and you feel like a genuine Centaur. Then your manager asks you to build a security tool.

Suddenly, the stakes feel different.

In regular coding, an AI-suggested bug might waste a few hours of debugging. In security, an AI-suggested bug might become a vulnerability that sits in production for months. The cost of being wrong isn’t just time. It’s trust, data, and potentially your users’ safety.

So can you still use AI for security work? Absolutely. But the how changes significantly.

The Human-in-the-Loop Imperative

When building security tools with AI assistance, one principle must guide every decision: AI accelerates; humans validate.

This isn’t about distrusting AI. It’s about understanding what AI is good at (pattern matching, processing volume, suggesting possibilities) versus what humans must own (judgment calls, context awareness, final decisions).

In security, we call this the human-in-the-loop model. The AI does the heavy lifting of analysis, but a human always makes the final call on anything consequential.

Let me show you what this looks like in practice.

Building an AI-Augmented Log Analyzer

We’re going to build a security log analyzer that uses AI to identify suspicious patterns, then surfaces them for human review. This is a common real-world use case: security teams are drowning in logs, and AI can help separate signal from noise.

The Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Log Files  │────▶│  Parser     │────▶│  AI Analysis│────▶│  Human      │
│  (raw)      │     │  (structure)│     │  (patterns) │     │  Review     │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                              │
                                              ▼
                                        ┌─────────────┐
                                        │  Confidence │
                                        │  Score      │
                                        └─────────────┘

The key insight: AI provides analysis and a confidence score. Low-confidence findings get human review. High-confidence findings still get human review for anything actionable. The AI helps prioritize; it doesn’t decide.

Step 1: Log Parsing

First, we need structured data. Here’s a simple parser for auth logs:

import re
from dataclasses import dataclass
from datetime import datetime
from typing import Iterator

@dataclass
class AuthEvent:
    timestamp: datetime
    event_type: str  # 'success', 'failure', 'lockout'
    username: str
    source_ip: str
    details: str

def parse_auth_logs(log_file: str) -> Iterator[AuthEvent]:
    """Parse authentication log file into structured events."""

    # Pattern for common auth log format
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) AUTH (\w+): user=(\S+) src=(\S+) (.+)'

    with open(log_file, 'r') as f:
        for line in f:
            match = re.match(pattern, line.strip())
            if match:
                yield AuthEvent(
                    timestamp=datetime.fromisoformat(match.group(1)),
                    event_type=match.group(2).lower(),
                    username=match.group(3),
                    source_ip=match.group(4),
                    details=match.group(5)
                )

Nothing fancy here. We’re just converting unstructured logs into structured data that’s easier to analyze.
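
As a quick sanity check, here’s what parsing one hypothetical line looks like. The line format below is inferred from the regex above, not taken from any particular system:

import tempfile

# One hypothetical log line matching the regex in parse_auth_logs
sample = "2024-03-01 02:14:07 AUTH FAILURE: user=alice src=203.0.113.9 invalid password\n"

with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
    f.write(sample)
    sample_path = f.name

for event in parse_auth_logs(sample_path):
    print(event.event_type, event.username, event.source_ip)
    # -> failure alice 203.0.113.9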

Step 2: AI-Powered Pattern Detection

Now the interesting part. We’ll use an LLM to analyze batches of events and identify suspicious patterns:

import json
from openai import OpenAI

client = OpenAI()

def analyze_events_batch(events: list[AuthEvent]) -> dict:
    """Use AI to analyze a batch of auth events for suspicious patterns."""

    # Format events for the prompt
    events_text = "\n".join([
        f"{e.timestamp} | {e.event_type} | {e.username} | {e.source_ip} | {e.details}"
        for e in events
    ])

    prompt = f"""Analyze these authentication events for security concerns.

Events:
{events_text}

For each concern found, provide:
1. A brief description of the pattern
2. The specific events involved (by timestamp)
3. A confidence score (0.0 to 1.0) based on how clearly this indicates a threat
4. Recommended priority (critical, high, medium, low)

Consider patterns like:
- Brute force attempts (many failures from same IP or for same user)
- Credential stuffing (failures across many users from same source)
- Impossible travel (same user from distant IPs in short time)
- Off-hours access (successful logins at unusual times)
- Account enumeration (systematic username probing)

Return your analysis as JSON with this structure:
{{
    "findings": [
        {{
            "description": "...",
            "events": ["timestamp1", "timestamp2"],
            "confidence": 0.85,
            "priority": "high",
            "reasoning": "..."
        }}
    ],
    "summary": "Brief overall assessment"
}}

If no suspicious patterns are found, return an empty findings array.
Be conservative. It's better to surface a false positive for human review
than to miss a real threat."""

    response = client.chat.completions.create(
        # JSON mode requires a model that supports response_format, e.g. gpt-4o;
        # the original gpt-4 model rejects the json_object parameter.
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "You are a security analyst assistant. Analyze logs for threats but always defer final judgment to human reviewers."
            },
            {"role": "user", "content": prompt}
        ],
        response_format={"type": "json_object"}
    )

    return json.loads(response.choices[0].message.content)

Notice a few important things about this implementation:

The prompt is explicit about deferring to humans. The system message emphasizes that final judgment belongs to human reviewers. This isn’t just politeness; it shapes how the AI frames its output.

We ask for confidence scores. This helps humans prioritize review. A 0.9 confidence finding gets looked at before a 0.3.

We ask for reasoning. The AI must explain why something is suspicious. This helps humans quickly validate or dismiss findings.

We bias toward false positives. The prompt explicitly says it’s better to surface something for review than to miss a threat. In security, false negatives are far more costly than false positives.
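
One practical addition: the model’s JSON is still untrusted input, so it’s worth a light normalization pass before anything reaches a reviewer. A minimal sketch, with field names following the prompt above:

def sanitize_findings(result: dict) -> list[dict]:
    """Normalize AI findings so the review interface always gets the fields
    it expects. Nothing is silently discarded, only filled in or clamped."""
    clean = []
    for finding in result.get("findings", []):
        if not isinstance(finding, dict):
            # Unparseable entry: surface it for review rather than drop it.
            finding = {"description": f"Malformed finding: {finding!r}"}
        try:
            confidence = float(finding.get("confidence", 0.0))
        except (TypeError, ValueError):
            confidence = 0.0
        finding["confidence"] = min(max(confidence, 0.0), 1.0)
        finding.setdefault("description", "N/A")
        finding.setdefault("priority", "low")
        finding.setdefault("events", [])
        finding.setdefault("reasoning", "")
        clean.append(finding)
    return clean

Nothing substantive gets filtered out here; the goal is just to keep a malformed response from crashing the review loop.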

Step 3: Human Review Interface

The AI’s findings need to be presented for human review. Here’s a simple terminal-based interface:

from dataclasses import dataclass
from enum import Enum

class ReviewDecision(Enum):
    CONFIRMED_THREAT = "confirmed"
    FALSE_POSITIVE = "false_positive"
    NEEDS_INVESTIGATION = "investigate"
    SKIP = "skip"

@dataclass
class ReviewedFinding:
    original: dict
    decision: ReviewDecision
    reviewer_notes: str

def present_for_review(findings: list[dict]) -> list[ReviewedFinding]:
    """Present AI findings to human reviewer for decisions."""

    reviewed = []

    # Sort by confidence, highest first
    sorted_findings = sorted(
        findings,
        key=lambda x: x.get('confidence', 0),
        reverse=True
    )

    for i, finding in enumerate(sorted_findings, 1):
        print(f"\n{'='*60}")
        print(f"Finding {i}/{len(sorted_findings)}")
        print(f"{'='*60}")
        print(f"Priority: {finding.get('priority', 'unknown').upper()}")
        print(f"Confidence: {finding.get('confidence', 0):.0%}")
        print(f"\nDescription: {finding.get('description', 'N/A')}")
        print(f"\nAI Reasoning: {finding.get('reasoning', 'N/A')}")
        print(f"\nEvents involved: {', '.join(finding.get('events', []))}")

        print("\nYour decision:")
        print("  [c] Confirmed threat - escalate")
        print("  [f] False positive - dismiss")
        print("  [i] Needs investigation - queue for deeper review")
        print("  [s] Skip for now")

        while True:
            choice = input("\nDecision: ").strip().lower()
            if choice in ('c', 'f', 'i', 's'):
                break
            print("Invalid choice. Enter c, f, i, or s.")

        decision_map = {
            'c': ReviewDecision.CONFIRMED_THREAT,
            'f': ReviewDecision.FALSE_POSITIVE,
            'i': ReviewDecision.NEEDS_INVESTIGATION,
            's': ReviewDecision.SKIP
        }

        notes = ""
        if choice in ('c', 'i'):
            notes = input("Notes (optional): ").strip()

        reviewed.append(ReviewedFinding(
            original=finding,
            decision=decision_map[choice],
            reviewer_notes=notes
        ))

    return reviewed

This interface does several things right:

It shows the AI’s reasoning. The reviewer can quickly assess whether the AI’s logic makes sense.

It requires explicit decisions. No finding gets silently dropped. Every item gets a human decision.

It captures reviewer notes. This creates an audit trail and helps improve the system over time.

It has friction on purpose. Security review shouldn’t be a rubber stamp. A bit of friction ensures attention.
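
On the audit-trail point above: one low-tech option is appending every decision to a JSONL file. A minimal sketch, where the filename and record shape are illustrative:

import json
from datetime import datetime, timezone

def log_review(reviewed: list[ReviewedFinding], path: str = "review_audit.jsonl") -> None:
    """Append reviewer decisions to an append-only JSONL audit log."""
    with open(path, "a") as f:
        for r in reviewed:
            record = {
                "reviewed_at": datetime.now(timezone.utc).isoformat(),
                "decision": r.decision.value,
                "notes": r.reviewer_notes,
                "finding": r.original,
            }
            f.write(json.dumps(record) + "\n")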

Putting It Together

def run_security_analysis(log_file: str, batch_size: int = 100):
    """Main analysis pipeline."""

    print(f"Parsing logs from {log_file}...")
    events = list(parse_auth_logs(log_file))
    print(f"Found {len(events)} events")

    all_findings = []

    # Process in batches
    for i in range(0, len(events), batch_size):
        batch = events[i:i + batch_size]
        print(f"\nAnalyzing batch {i//batch_size + 1}...")

        result = analyze_events_batch(batch)
        findings = result.get('findings', [])

        if findings:
            print(f"  Found {len(findings)} potential issues")
            all_findings.extend(findings)
        else:
            print(f"  No issues detected")

    if not all_findings:
        print("\nNo security concerns identified.")
        return

    print(f"\n{'#'*60}")
    print(f"HUMAN REVIEW REQUIRED: {len(all_findings)} findings")
    print(f"{'#'*60}")

    reviewed = present_for_review(all_findings)

    # Summary
    confirmed = sum(1 for r in reviewed if r.decision == ReviewDecision.CONFIRMED_THREAT)
    false_pos = sum(1 for r in reviewed if r.decision == ReviewDecision.FALSE_POSITIVE)
    investigate = sum(1 for r in reviewed if r.decision == ReviewDecision.NEEDS_INVESTIGATION)

    print(f"\n{'='*60}")
    print("Review Summary")
    print(f"{'='*60}")
    print(f"Confirmed threats: {confirmed}")
    print(f"False positives: {false_pos}")
    print(f"Needs investigation: {investigate}")

    # This data should be logged for improving the AI prompts over time
    return reviewed
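
If you want to run this end to end, the entry point is a single call. The log path here is illustrative:

if __name__ == "__main__":
    # Point this at your own auth log export.
    run_security_analysis("auth.log", batch_size=100)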

The Four Modes in Security Context

Let’s revisit our Centaur collaboration modes from Part 1, applied specifically to security work:

Strategist Mode: Threat Modeling

Use AI to explore your threat landscape:

You: We're building an API that handles payment information.
     Help me think through the threat model. What attack vectors
     should we prioritize defending against?

AI: [Proposes STRIDE analysis, identifies top risks, suggests
    which threats need immediate vs. future mitigation...]

The AI helps you think broadly. You decide which threats matter most given your actual users, infrastructure, and risk tolerance.

Editor Mode: Refining Detection Rules

AI can draft detection logic, but you refine it with domain knowledge:

You: Write a detection rule for identifying potential
     credential stuffing attacks in our auth logs.

AI: [Generates initial rule]

You: Good start, but we have a mobile app that legitimately
     retries on network failures. Add logic to exclude
     rapid retries from the same device ID within 30 seconds.

AI: [Refined rule with your context]
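
The kind of rule being refined here might look roughly like this. It’s a hypothetical sketch: the event shape, thresholds, and device ID field are assumptions, not part of the parser we built above.

from collections import defaultdict
from datetime import timedelta

def flag_credential_stuffing(failures, window=timedelta(minutes=10),
                             min_distinct_users=10,
                             retry_grace=timedelta(seconds=30)):
    """Flag source IPs whose login failures span many distinct users in a short
    window, ignoring rapid same-device retries (the mobile-app exception above).

    `failures` is a list of (timestamp, username, source_ip, device_id) tuples.
    """
    last_attempt = {}           # (device_id, username) -> last failure time
    by_ip = defaultdict(list)   # source_ip -> [(timestamp, username)]

    for ts, user, ip, device in sorted(failures):
        key = (device, user)
        if key in last_attempt and ts - last_attempt[key] < retry_grace:
            last_attempt[key] = ts
            continue            # treat as an app-level retry, not a new attempt
        last_attempt[key] = ts
        by_ip[ip].append((ts, user))

    flagged = []
    for ip, attempts in by_ip.items():
        for i, (start, _) in enumerate(attempts):
            users_in_window = {u for t, u in attempts[i:] if t - start <= window}
            if len(users_in_window) >= min_distinct_users:
                flagged.append(ip)
                break
    return flagged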

Debugger Mode: Investigating Incidents

When something triggers an alert, use AI to accelerate investigation:

You: We got an alert for unusual data access patterns from
     user account X. Here's the last 24 hours of their
     activity logs. Help me understand what happened.

AI: [Identifies timeline, highlights anomalies, suggests
    what to check next...]

You: The access spike correlates with their timezone's
     end-of-quarter reporting. Looks like legitimate
     business activity. But flag the API endpoint they
     hammered for rate limiting review.

Learner Mode: Understanding New Threats

Security evolves constantly. Use AI to get up to speed:

You: I keep seeing references to "prompt injection attacks"
     in security feeds. Explain this threat class to me like
     I'm a senior security engineer who just hasn't worked
     with LLMs yet. Include real examples.

AI: [Detailed explanation with attack examples, defense
    strategies, and links to further reading...]

When NOT to Trust AI in Security

The Centaur model works well for analysis and detection. But some security decisions should never be automated, no matter how confident the AI seems:

Automated remediation without approval. AI can suggest blocking an IP or disabling an account. A human must approve it; a minimal approval gate is sketched after this list. The cost of a false positive (blocking a legitimate user or customer) is too high.

Access control decisions. “Should this user have access to this resource?” is a judgment call that requires understanding business context, relationships, and intent. AI can flag anomalies; humans grant or deny access.

Cryptographic implementations. Never let AI write your crypto code without expert review. The failure modes are subtle and catastrophic.

Anything touching customer data. If an AI-suggested action would access, modify, or expose customer data, a human must explicitly approve it.

Incident communication. AI can draft an incident report. A human must review it before it goes to customers, executives, or regulators.

The pattern: AI handles volume and pattern matching. Humans handle judgment, accountability, and anything with significant blast radius.
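
To make the first of those points concrete, the approval gate can be as simple as an object that refuses to act until a named human signs off. A hypothetical sketch, where the action names and the enforcement call are placeholders:

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class RemediationRequest:
    """A proposed action that stays queued until a named human approves it."""
    action: str                        # e.g. "block_ip" (illustrative)
    target: str                        # e.g. "203.0.113.9"
    proposed_by: str = "log-analyzer"
    approved_by: Optional[str] = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def approve(self, reviewer: str) -> None:
        self.approved_by = reviewer

    def execute(self) -> None:
        if self.approved_by is None:
            raise PermissionError("Remediation requires explicit human approval")
        # The real enforcement call (firewall API, IdP, etc.) would go here.
        print(f"{self.action} on {self.target} approved by {self.approved_by}")

The class itself isn’t the point. The point is that execute() is structurally impossible without a human identity attached.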

Measuring Success

How do you know if your AI-assisted security tooling is working? Track these metrics:

False positive rate. What percentage of AI findings do humans dismiss? If it’s above 50%, your prompts need tuning.

Time to review. How long does human review take? AI should be reducing this, not adding busywork.

Catch rate. Are you finding real issues? Compare against known-good test cases or red team exercises.

Reviewer feedback. Are your security analysts finding the AI helpful? Their qualitative input matters.

Over time, use the human decisions to improve your prompts. If certain patterns are consistently marked as false positives, teach the AI to deprioritize them.
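
If you keep the JSONL audit log sketched in the review section, the false positive rate falls out of a few lines (the decision strings follow the ReviewDecision enum):

import json

def false_positive_rate(audit_path: str = "review_audit.jsonl") -> float:
    """Fraction of non-skipped reviews that humans marked as false positives."""
    decisions = []
    with open(audit_path) as f:
        for line in f:
            decisions.append(json.loads(line)["decision"])
    reviewed = [d for d in decisions if d != "skip"]
    if not reviewed:
        return 0.0
    return sum(d == "false_positive" for d in reviewed) / len(reviewed)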

The Security Centaur Advantage

AI doesn’t replace security expertise. It amplifies it.

A security analyst reviewing logs manually might catch 10 incidents per hour. The same analyst with AI assistance might review 100 potential incidents per hour, while maintaining (or improving) accuracy.

The key is keeping the human in control. AI surfaces possibilities. Humans make decisions. This isn’t just good practice; it’s often a compliance requirement. Many security frameworks explicitly require human oversight of automated systems.

When you build security tools this way, you get the best of both worlds: AI’s speed and pattern recognition, combined with human judgment and accountability.


What’s Next

We’ve covered how to apply AI collaboration to security tooling, with humans firmly in the loop. But we’ve glossed over a crucial question: how do you calibrate your trust?

AI is confidently wrong on a regular basis. Sometimes it misses obvious threats. Sometimes it hallucinates patterns that don’t exist. How do you develop the intuition for when to trust AI output and when to dig deeper?

That’s exactly what we’ll cover in Part 3.

Next in the series: When to Trust (and Verify) AI Output →


Building AI-assisted security tools requires balancing automation with human judgment. This balance is part of the Centaur framework I detail in my book, The Centaur’s Edge: A Practical Guide to Thriving in the Age of AI, which includes exercises for calibrating trust and building effective human-AI workflows.