Executive Summary

AI is being deployed at scale across security operations. Vendors are integrating large language models into SIEM platforms, XDR consoles, and threat intelligence tools. Security analysts are using AI to summarize incidents, draft reports, and query their data. The productivity gains are real.

The risk is also real, and it is being dramatically underacknowledged. AI systems in security contexts are generating unverified, uncited, hallucination-prone outputs that analysts are using as inputs to high-stakes decisions. The problem is not AI itself. The problem is AI without evidence discipline: the gap between what the model says and what the data actually shows.


The Hallucination Problem in Security Context

Large language models hallucinate. This is not a bug: it is a fundamental property of how they generate text. They predict statistically likely next tokens given training data and context. When that context is insufficient or ambiguous, they generate plausible-sounding text that is factually incorrect.

In consumer applications, hallucination is annoying. In security operations, it can be catastrophic.

False attribution is the first failure mode. An AI system analyzes a set of events and concludes that the activity is consistent with APT29 tradecraft based on OAuth token theft and lateral movement techniques. The analyst, presented with this attribution, escalates to the nation-state threat response playbook. The actual attacker is a financially motivated ransomware affiliate using commodity tooling. The response is entirely wrong. The breach expands while the analyst pursues a false threat model.

Fabricated evidence is the second. An AI system, asked what evidence supports the conclusion that an access event was malicious, returns a list: some items reflect actual events in the data; others are confabulated from statistical associations in training data. The analyst cannot distinguish real from fabricated without reviewing all primary evidence, which defeats the purpose of AI assistance.

False negative confidence is the third. An AI system concludes with high confidence that an alert is a false positive because the activity pattern is consistent with normal administrative behavior. The confidence is based on pattern similarity, not on investigation of whether the specific events in question are actually authorized. The analyst clears the alert. The breach continues.

Each of these failure modes is occurring today in AI-augmented SOCs. They are predictable consequences of deploying AI without evidence discipline.


What Evidence Discipline Means

Evidence discipline in AI security operations means one thing: every AI output must be tied to specific, verifiable, retrievable evidence from the actual data under analysis.

Not "this pattern is associated with lateral movement." Instead: events EVT-2024-0847, EVT-2024-0848, and EVT-2024-0851 constitute a lateral movement sequence via T1021.002, with direct process lineage establishing PROVABLE causation between EVT-2024-0847 and EVT-2024-0848, and MIXED-grade evidence connecting EVT-2024-0848 to EVT-2024-0851 via shared credential artifact ART-7723.

Every event ID cited must be retrievable. Every edge cited must exist in the causal graph. Every artifact reference must correspond to a real artifact in the evidence store. AI outputs that cannot cite their claims must not reach the analyst.

This sounds obvious. It is almost universally ignored in current AI security tooling.


The Architecture of Evidence-Locked AI

Building AI that cannot hallucinate evidence requires separating the AI's reasoning from its evidence retrieval: what researchers call a tool-use or agentic architecture, applied with security-specific constraints.

The tool registry is the foundation. A defined set of tools through which the AI can access actual data, and only through which it can do so. Tools include: get_chain (retrieve a specific causal chain by ID), get_edges (retrieve edges for a given chain), get_event (retrieve specific event details), match_iocs (query the IOC database against a specific artifact), get_vuln_posture (retrieve current vulnerability exposure for a host), fetch_feed_item (retrieve a specific threat intelligence item), get_enrichments (retrieve enrichments for a specific entity).
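The registry concept above can be sketched in a few lines. This is an illustrative model, not TRA-CE's implementation: the `ToolRegistry` class and the in-memory `events` store are assumptions; only the tool name `get_event` and the event ID come from the text.

```python
# Hypothetical sketch: the AI reaches data only through registered tools.
events = {"EVT-2024-0847": {"technique": "T1021.002"}}

class ToolRegistry:
    """Only registered tools can touch backing stores; nothing else."""
    def __init__(self):
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn

    def call(self, name, **kwargs):
        if name not in self._tools:
            # An unregistered tool name fails hard: no invented data paths.
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)

registry = ToolRegistry()
registry.register("get_event", lambda event_id: events.get(event_id))

registry.call("get_event", event_id="EVT-2024-0847")  # -> {'technique': 'T1021.002'}
registry.call("get_event", event_id="EVT-9999-0000")  # -> None: no such event
```

A lookup that returns nothing is itself evidence: the AI learns the event does not exist, rather than being free to assert that it does.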

The AI cannot invent events. It cannot reference data that does not exist in the tool registry's backing stores. An attempt to cite an event ID that does not exist fails citation validation and requires regeneration.

The citation validator runs on every AI output before it reaches an analyst. Every claimed event_id, edge_id, feed_item_id, and entity reference is checked against actual data stores. Uncorroborated claims are flagged. Outputs with uncorroborated claims above a threshold are rejected entirely and the AI is required to try again.
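A minimal sketch of that validator, assuming cited IDs have already been extracted from the AI output; the function signature and the 20% rejection threshold are illustrative choices, not values from the text.

```python
# Sketch: check every claimed ID against real backing stores.
def validate_citations(cited_ids, stores, reject_threshold=0.2):
    """Return (uncorroborated_ids, passed). An output passes only if the
    fraction of uncorroborated citations stays at or below the threshold."""
    uncorroborated = [
        cid for cid in cited_ids
        if not any(cid in store for store in stores.values())
    ]
    fail_ratio = len(uncorroborated) / max(len(cited_ids), 1)
    return uncorroborated, fail_ratio <= reject_threshold

stores = {
    "events": {"EVT-2024-0847", "EVT-2024-0848"},
    "artifacts": {"ART-7723"},
}

validate_citations(["EVT-2024-0847", "ART-7723"], stores)      # -> ([], True)
validate_citations(["EVT-2024-0847", "EVT-0000-0000"], stores) # -> (['EVT-0000-0000'], False)
```

In the failing case the output is rejected and regenerated; the fabricated ID never reaches the analyst.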

The AI_Run audit trail logs every reasoning session: inputs, tool calls made, tool responses received, intermediate reasoning, final output, citations claimed, citation validation result. The analyst can review not just the conclusion but the full reasoning chain that produced it. This is what separates AI assistance from AI authority.
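The shape of an AI_Run record might look like the following. The field names mirror the text; the dataclass itself is an assumption about structure, not a documented schema.

```python
# Illustrative AI_Run record: one per reasoning session.
from dataclasses import dataclass, field

@dataclass
class AIRun:
    inputs: dict                                      # what the session was asked
    tool_calls: list = field(default_factory=list)    # (tool, args, response) triples
    reasoning: list = field(default_factory=list)     # intermediate reasoning steps
    output: str = ""                                  # final conclusion
    citations: list = field(default_factory=list)     # IDs the output claims
    citation_validation_passed: bool = False          # validator verdict

run = AIRun(inputs={"question": "Is this alert a true positive?"})
```

Because the record carries tool responses alongside the conclusion, a reviewer can replay the chain of evidence rather than trusting the summary.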

Tool call limits bound each reasoning session to a maximum number of tool calls (20 per run in TRA-CE's implementation). This prevents runaway inference loops, creates a natural audit scope, and ensures AI conclusions are bounded by evidence that can be gathered and reviewed in a defined investigation window.
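A budget-bounded session can be sketched as follows. The 20-call limit comes from the text; the `ReasoningSession` class and its dict-of-callables tool interface are illustrative.

```python
# Sketch: every tool call is logged, and the budget is enforced.
class ReasoningSession:
    MAX_TOOL_CALLS = 20  # per-run limit, per the TRA-CE description

    def __init__(self, tools):
        self.tools = tools   # name -> callable
        self.calls = []      # audit trail: (name, kwargs, result)

    def call_tool(self, name, **kwargs):
        if len(self.calls) >= self.MAX_TOOL_CALLS:
            raise RuntimeError("tool-call budget exhausted; end the run")
        result = self.tools[name](**kwargs)
        self.calls.append((name, kwargs, result))
        return result

session = ReasoningSession({"ping": lambda: "pong"})
for _ in range(ReasoningSession.MAX_TOOL_CALLS):
    session.call_tool("ping")
# the 21st call raises RuntimeError
```

The same list that enforces the budget doubles as the audit scope: the evidence gathered in a run is exactly what `session.calls` records.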


Chain Grading and AI's Role

One of the most important applications of evidence-locked AI in causal security intelligence is chain grading: determining whether a chain's evidence meets the threshold for PROVABLE, MIXED, or INFERRED classification.

AI can legitimately contribute under specific constraints.

AI can upgrade INFERRED to MIXED. A reasoning session that uses tool calls to retrieve corroborating evidence, evidence that exists in the actual data and is cited with valid event or feed item references, can recommend an upgrade to MIXED. The upgrade is conditional on citation validation passing.

AI cannot upgrade MIXED to PROVABLE. PROVABLE classification requires explicit, verifiable mechanism evidence: direct process lineage, direct API causation. This is a deterministic determination and should not be subject to AI inference.

AI cannot downgrade. A chain graded PROVABLE by the deterministic causal analysis engine cannot be downgraded by AI reasoning. Evidence-based grades are one-directional.
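The three constraints above reduce to a single guard function. This is a sketch of the policy, not TRA-CE's code; the function name and string-valued grades are illustrative.

```python
# Sketch: the only regrade AI may ever perform is INFERRED -> MIXED,
# and only when citation validation has passed.
def ai_may_regrade(current: str, proposed: str, citations_valid: bool) -> bool:
    if current == "INFERRED" and proposed == "MIXED":
        return citations_valid
    # No upgrades to PROVABLE, no downgrades, no other transitions.
    return False

ai_may_regrade("INFERRED", "MIXED", citations_valid=True)    # True
ai_may_regrade("INFERRED", "MIXED", citations_valid=False)   # False
ai_may_regrade("MIXED", "PROVABLE", citations_valid=True)    # False
ai_may_regrade("PROVABLE", "MIXED", citations_valid=True)    # False
```

Encoding the policy as a pure function makes it trivially testable, which matters for a control that guards the integrity of the evidence base.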

This constraint architecture ensures AI enhances investigation without undermining the integrity of the evidence base.


PII and the Routing Problem

Security event data frequently contains personally identifiable information: usernames, email addresses, IP addresses associated with individuals, access logs that reveal behavior patterns. Routing this data to cloud AI providers creates regulatory exposure under GDPR, CCPA, HIPAA, and sector-specific regulations. The AI provider's training pipeline, data retention policies, and breach surface all become vectors for regulatory risk.

The appropriate response is PII-aware AI routing. Every payload submitted to an AI reasoning session is analyzed for PII before routing. Payloads containing PII are routed exclusively to locally deployed models, regardless of the quality differential between local and cloud models.

This creates a two-tier routing architecture: cloud models (Anthropic, OpenAI, Google) for aggregated, de-identified analysis; local models for any analysis involving specific identity data, specific user behavior, or any other PII-containing payload. The routing decision is automatic and policy-enforced. Analysts should not have to evaluate whether a given query contains PII before deciding where to route it.
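The routing policy can be sketched as below. The regex-based `contains_pii` detector is a deliberately crude assumption; a real deployment would use a proper PII scanner. The tier names are placeholders for the two tiers the text describes.

```python
# Sketch: policy-enforced two-tier routing. PII never leaves the local tier.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def contains_pii(payload: str) -> bool:
    # Toy detector: emails and IPv4 addresses only. Real scanners cover
    # usernames, names, tokens, and sector-specific identifiers.
    return bool(EMAIL.search(payload) or IPV4.search(payload))

def route(payload: str) -> str:
    # The decision is automatic; the analyst never chooses the tier.
    return "local-model" if contains_pii(payload) else "cloud-model"

route("failed logins for jsmith@example.com from 10.0.0.5")  # -> "local-model"
route("summarize T1021.002 detections by hour")              # -> "cloud-model"
```

Because the check runs on every payload, a query that unexpectedly picks up identity data is silently redirected to the local tier rather than leaking to a cloud provider.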


The Cost of Getting This Wrong

The 2022 Uber breach involved an attacker who sent repeated MFA push notifications to a compromised employee until the employee, fatigued by the repeated prompts, approved one. The technical controls were correct. The human was the failure point, and the failure was induced by the system design.

AI that generates confident-sounding but uncorroborated conclusions creates the same dynamic for analyst decision-making. Analysts are busy, under pressure, processing high volumes of information. Confident AI output creates authority bias: a tendency to accept conclusions without scrutinizing the underlying evidence.

When that conclusion is wrong, when it is built on pattern association rather than actual evidence from the data under analysis, the analyst's acceptance is not a failure of the analyst. It is a failure of the system that presented uncorroborated output with the same visual weight as verified evidence.

Evidence-locked AI removes the authority bias problem by making the evidence the authority. The AI's conclusion is only as confident as its citations. Thin citations produce low confidence, and that is reflected in both the chain grade and the analyst-facing presentation. The AI is not an oracle. It is a reasoning assistant operating within the bounds of verifiable evidence.


Conclusion

AI will transform security operations. The transformation should produce better-evidenced analysis, faster investigation, and more consistent reasoning, not confident-sounding output that bypasses analyst scrutiny because it came from a model.

Evidence discipline is the prerequisite for AI being genuinely useful in a security context, not a constraint on it. Without mandatory citation, tool-bounded reasoning, and citation validation, AI in the SOC is plausible-sounding noise with authority bias at scale.

The bar for AI in security operations is not whether it sounds right. It is whether it can prove it.


TRA-CE.ai | Causal Security Intelligence | tra-ce.ai