[COMPLIANCE & ENTERPRISE]

2/24/26

SOC 2 for AI Agents: What Auditors Are Going to Start Asking About

Amjad Fatmi

Your SOC 2 Type II was fine last year. Same controls, same evidence, same auditor. You renewed it, you put the badge on your website, procurement moved on.

Then your engineering team shipped agents.

Now those agents are calling your payments API, reading customer records, sending emails, running database queries, and in some cases executing shell commands. All autonomously. At machine speed. Without a human in the loop.

Your SOC 2 auditor is going to notice. And when they do, they are going to ask questions your current control environment cannot answer.

This post is about exactly those questions: what they are, why your existing logging and observability tools don't cover them, and what a control environment that actually does cover them looks like.

Why Traditional SOC 2 Controls Don't Map to Agents

SOC 2 was designed for systems where humans make decisions and systems execute them. The Trust Services Criteria (Security, Availability, Processing Integrity, Confidentiality, Privacy) were written with that assumption baked in.

The Security criteria (CC6, CC7, CC8, CC9) ask about logical access controls, change management, monitoring, and incident response. They assume that a human being was authorized to perform an action, that a system recorded what the human did, and that another human can review those records.

Agents break all three assumptions at once.

The actor is not a human. It is a model producing outputs that vary based on prompt, temperature, context window, and the specific version of the model running when the request came in.

The action is not preceded by a human authorization decision. It is preceded by an inference step whose reasoning is probabilistic and whose output is not guaranteed to be stable across runs.

The record, in most current deployments, captures what happened: the tool that was called, the parameters, the return value. It does not capture why it was authorized, under which policy version, or whether any human was involved in the authorization decision.

SOC 2 auditors will ask for evidence of your testing methodology, and audit logs are what proves compliance when regulators come asking. The problem is that most agent audit logs cannot prove what auditors are increasingly asking about: not just what ran, but what authorized it to run.

The Five Questions Auditors Will Ask

These are not hypothetical. They are the natural extension of existing SOC 2 control criteria applied to agentic systems. Some auditors are asking them now. All of them will be asking them within eighteen months.

1. "Who authorized this action?"

For human-initiated actions, authorization is traceable. A user logged in, was verified against an access control list, and performed an action that was logged with their identity attached.

For agent-initiated actions, the answer is currently: "the model decided to." That is not an authorization record. That is an inference output.

The question auditors will ask is more specific: what policy was in effect at the time this action executed? Was that policy explicitly approved? Is it version-controlled? Can you show me the policy that governed this specific action, the version hash of that policy at decision time, and proof that the policy was not modified between the decision and the execution?

Most current agent deployments cannot answer any of those questions. Instructions live in prompts. Prompts are not versioned as policy, are not hashed and bound to individual decisions, and cannot be replayed deterministically.
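By contrast, a policy stored as code can be hashed and bound to every decision. A toy sketch of the idea (field names hypothetical, not the Faramesh schema): hash the exact policy bytes with SHA-256 and record that hash alongside each decision.

```python
import hashlib
import json

def policy_hash(policy_bytes: bytes) -> str:
    """SHA-256 of the exact policy bytes in effect at decision time."""
    return hashlib.sha256(policy_bytes).hexdigest()

# The policy is a version-controlled file; its hash identifies the version.
policy = b"rules:\n  - match: {tool: stripe, operation: refund}\n    deny: true\n"

# Hypothetical decision record binding one action to that policy version.
record = {
    "action_id": "act_001",
    "policy_hash": policy_hash(policy),
    "decision": "DENY",
}
print(json.dumps(record, indent=2))
```

Any later change to the policy file produces a different hash, so the record proves which version governed the action.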

2. "What would have happened if you had applied today's policy to that action three months ago?"

This is the counterfactual question. It comes up in incident response, in policy change reviews, and increasingly in SOC 2 evidence requests when auditors are evaluating the operating effectiveness of controls over the Type II observation period.

For traditional systems, this question is hard but answerable. You have logs of what happened and you can evaluate them against historical policy versions.

For agents, this question is currently unanswerable. If your policy lives in a prompt, there is no mechanism to deterministically re-evaluate past actions under a different policy. The model is not deterministic. The prompt is not versioned. The logs record effects, not authorization inputs.

The Faramesh Core Specification calls this capability deterministic replay:

"Replay MUST satisfy: No execution (no external side effects). No state mutation. Deterministic decision output for identical (CAR, PolicyProgram, EvalState, ProfileBytes)."

Without this capability, you cannot answer the counterfactual question. Without being able to answer the counterfactual question, your controls cannot be assessed for operating effectiveness over time. That is a SOC 2 Type II problem.
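The property is easy to see in toy form. Assuming a simplified rule format rather than the spec's actual CAR/PolicyProgram types, replay is nothing more than calling a pure decision function again with historical inputs:

```python
def evaluate(car: dict, policy: dict) -> str:
    """Pure decision function: identical inputs always yield an identical
    output, and nothing executes -- evaluation has no external side effects."""
    for rule in policy["rules"]:
        if all(car.get(k) == v for k, v in rule["match"].items()):
            return "ALLOW" if rule.get("allow") else "HALT"
    return "HALT"  # no rule matched: default deny

def replay(car: dict, policy: dict) -> str:
    """Replay is re-evaluation: no tool call, no state mutation."""
    return evaluate(car, policy)

# The counterfactual: what would today's policy have said three months ago?
car = {"tool": "stripe", "operation": "refund"}
policy_then = {"rules": [{"match": {"tool": "stripe"}, "allow": True}]}
policy_now = {"rules": [{"match": {"tool": "stripe"}, "deny": True}]}
print(replay(car, policy_then), "->", replay(car, policy_now))  # ALLOW -> HALT
```

Because the function is pure, re-evaluating a past action under any policy version is safe and deterministic. A prompt-based agent has no equivalent of this function.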

3. "Can you prove this audit record has not been modified?"

SOC 2 CC7.2 requires monitoring for unauthorized access and modification. For agent audit records, this means the auditor will ask whether your agent action logs are tamper-evident: whether you can prove that a record has not been modified, deleted, or reordered since it was created.

Standard logging systems like Datadog, Splunk, CloudWatch, and OpenTelemetry are not tamper-evident by design. Logs can be deleted. Log retention policies can be changed. Log records can be modified if an attacker gains access to the logging infrastructure.

The Faramesh Core Specification defines a hash-chained Decision Provenance Record structure in which each record's hash is an input to the next record's hash. Deleting, modifying, or reordering any record breaks the chain, and the auditor can verify chain integrity without trusting the log operator. That is tamper evidence. Standard logging is not.
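A minimal sketch of the chaining and verification logic (illustrative only, not the spec's actual record format):

```python
import hashlib
import json

def chain_hash(prev_hash: str, record: dict) -> str:
    """Each record's hash covers the previous hash, so deleting, modifying,
    or reordering any record breaks every hash after it."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append(log: list, record: dict) -> None:
    prev = log[-1]["hash"] if log else "0" * 64
    log.append({"record": record, "hash": chain_hash(prev, record)})

def verify(log: list) -> bool:
    """Anyone holding the log can re-derive every hash themselves,
    without trusting the operator that stored it."""
    prev = "0" * 64
    for entry in log:
        if chain_hash(prev, entry["record"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = []
append(log, {"action": "stripe.refund", "decision": "ALLOW"})
append(log, {"action": "db.query", "decision": "HALT"})
assert verify(log)
log[0]["record"]["decision"] = "HALT"   # tamper with history
assert not verify(log)                  # the chain detects it
```

Contrast this with a mutable log table, where the same edit leaves no trace.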

4. "What credentials did this agent have access to, and when?"

CC6.1 requires that logical access to systems and data is restricted to authorized users. For agents, this maps to a straightforward question about credentials: what API keys, tokens, and secrets does the agent process hold, and for how long?

The honest answer for most current agent deployments is: all of them, always. Keys loaded at initialization are held in process memory for the lifetime of the agent run. The agent has ambient access to every credential it was given, whether it needs that credential for a specific action or not.

This creates two SOC 2 problems. First, the principle of least privilege (CC6.3) requires that access rights are limited to what is necessary for the function. An agent holding a full Stripe API key when it only needs read access violates this principle. Second, the credential exposure surface extends to the entire agent runtime. If the agent is compromised through prompt injection, the credentials are in the process context.

The control that addresses this is ephemeral credential injection: fetch the credential from your secrets manager at execution time, inject it for the duration of one action, then discard it. Nothing credential-shaped sits in the agent process or in any persistent storage. The auditor can verify this because the credential broker's logs show fetch events tied to specific action IDs, not ambient credential loads at startup.
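A sketch of the pattern, with a stand-in for the secrets-manager call (not a real Vault or Faramesh API):

```python
import contextlib

def fetch_secret(name: str, action_id: str) -> str:
    """Stand-in for a secrets-manager fetch (Vault, AWS Secrets Manager...).
    A real broker would also log the (name, action_id) pair as a fetch event."""
    return f"secret-for-{name}-{action_id}"

@contextlib.contextmanager
def ephemeral_credential(name: str, action_id: str):
    secret = fetch_secret(name, action_id)   # fetched at execution time
    try:
        yield secret                         # live for this one action only
    finally:
        del secret                           # dropped, not stored anywhere

with ephemeral_credential("stripe_api_key", "act_001") as key:
    pass  # make the one permitted API call with `key` here
# After the block, the agent process holds no reference to the credential.
```

The fetch event is tied to a specific action ID, which is exactly the evidence shape an auditor can check against the decision log.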

5. "What happens when the agent encounters a situation its policy doesn't cover?"

This is the fail-closed question. CC9.1 requires that risk mitigation activities include controls to address identified risks. For agents, the risk of an unhandled action (one that falls outside the defined policy) is real and should be explicitly controlled.

Most agent systems fail open by default. If no guardrail matches, the action proceeds. If the model decides to do something the policy didn't anticipate, it does it.

SOC 2 auditors will ask for evidence of your testing methodology. When they test your agent's behavior on edge cases (inputs that fall outside your defined policy), what is the documented, verifiable outcome?

The correct answer for a controlled system is: DENY. The Faramesh Core Specification is explicit: "If no policy rule matches, the server MUST produce HALT with reason 'default_deny'." That is a verifiable, testable, auditable control. "The model usually doesn't do things outside its instructions" is not.
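A fail-closed evaluator is small enough to sketch directly. Assuming a simplified rule format (not the Faramesh policy language), the load-bearing line is the final return:

```python
def decide(action: dict, rules: list) -> tuple:
    """Fail-closed evaluation: silence from the policy means HALT."""
    for rule in rules:
        if all(action.get(k) == v for k, v in rule["match"].items()):
            if rule.get("allow"):
                return ("ALLOW", "matched_allow_rule")
            return ("HALT", rule.get("reason", "matched_deny_rule"))
    # No rule matched: the action is outside the defined policy.
    return ("HALT", "default_deny")

rules = [{"match": {"tool": "stripe", "operation": "refund"}, "allow": True}]

# An action the policy never anticipated is denied, not waved through.
print(decide({"tool": "shell", "operation": "exec"}, rules))
# -> ('HALT', 'default_deny')
```

This behavior is trivially testable: feed the evaluator any out-of-policy action and assert the outcome is a deny, every time.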

What a SOC 2-Ready Agent Control Environment Looks Like

Mapping to the Trust Services Criteria directly:

CC6.1 — Logical and Physical Access Controls

What is required: access to information assets is restricted to authorized users and processes.

What agents need: policy-based action authorization that explicitly permits specific tools and operations for specific agent identities. Not ambient access. Not "the agent was given this API key so it can use it however it wants." Per-action authorization evaluated at execution time.

What this looks like in practice:

rules:
  - match:
      tool: stripe
      operation: refund
      agent_id: support-agent-prod
      amount_lte: 500
    allow: true

  - match:
      tool: stripe
      operation: refund
    deny: true
    reason: "Only prod support agent may issue refunds, max $500"

This policy is version-controlled, hashed, and evaluated before every action. The auditor can read it. The auditor can verify the hash. The auditor can confirm it was the policy in effect for any given historical action.

CC6.3 — Least Privilege

What is required: access rights are limited to the minimum necessary.

What agents need: ephemeral credential injection scoped to specific actions, not ambient credential access for the lifetime of the agent process.

What this looks like: the Stripe credential is not loaded at agent initialization. It lives in Vault, AWS Secrets Manager, or Azure Key Vault. When a permitted Stripe action is about to execute, Faramesh fetches the credential via workload identity, injects it for that one call, and discards it. The agent process never holds a live Stripe key. A compromise of the agent process yields no credentials.

CC7.2 — Monitoring of System Components

What is required: the organization monitors system components and the operation of controls to detect anomalies.

What agents need: immutable, tamper-evident records of every authorization decision. Not just what happened, but what authorized it to happen.

What this looks like: a hash-chained Decision Provenance Record for every agent action, containing the canonical action hash, the policy version hash, the decision outcome, the risk score, and the timestamp. Chain integrity verifiable without trusting the storage operator.

CC8.1 — Change Management

What is required: changes to infrastructure and software are authorized, tested, and monitored.

What agents need: policy-as-code with version control, change review, and hash-binding of policy versions to individual action decisions. When your agent policy changes, that change is a commit. It goes through pull request review. The hash of the new policy version is recorded in every subsequent DPR. You can prove to the auditor exactly when your policy changed and which version governed which historical actions.

Processing Integrity — PI1

What is required: system processing is complete, valid, accurate, timely, and authorized.

The word "authorized" in PI1 is where agents create compliance exposure. Processing integrity for autonomous agents requires demonstrating that each action was authorized by an explicit policy decision, not just permitted by the absence of a blocking control.

The Evidence Package an Auditor Will Accept

If your auditor asks about agent controls today, here is what a complete evidence package looks like:

Policy documentation: The YAML policy files that govern each agent, stored in version control, with a commit history showing every change and who approved it.

Policy hash binding: Evidence that every action in the audit period was evaluated against a specific, hashed policy version. The DPR for each action contains policy_hash, the SHA-256 of the exact policy that governed that decision.

Decision records: The complete DPR log for the audit period, with chain integrity verified. Every action: tool, operation, parameters, decision outcome, risk score, whether human approval was involved, timestamp.

Credential access records: Logs showing that credentials were fetched ephemerally at execution time, tied to specific action IDs, with no ambient credential loads.

Fail-closed evidence: Test results demonstrating that actions outside defined policy are denied by default, not permitted. This is testable and reproducible.

Human approval records: For actions that required human review, the approval workflow records showing who approved, when, and the re-evaluation decision after approval.

Replay capability: Demonstration that any historical action can be re-evaluated deterministically under any policy version without re-executing the action. This answers the counterfactual question and supports operating effectiveness assessment over the Type II period.

That is a complete SOC 2 evidence package for agent controls. Most companies deploying agents today can produce none of it because the infrastructure to generate it doesn't exist in their current stack.

The Timing Question

Cyber insurance carriers are beginning to require documented evidence of these controls. Some insurers offer "AI Security Riders" that mandate adversarial red-teaming and model-level risk assessments as prerequisites for coverage.

The SOC 2 audit cycle is annual. The observation period for Type II is six to twelve months. If your agents ship to production in Q1 without an authorization control environment, your next Type II observation period is already running. The evidence gap is accumulating.

The question is not whether your auditor will ask these questions. The question is whether they ask at your next renewal or the one after that, and whether you have the control environment to answer them when they do.

The companies that get ahead of this will have a compliance advantage that is genuinely hard to catch up to. Not because the technology is hard to implement (twelve minutes of integration time, as the quickstart post shows), but because the DPR chain has to run for the duration of the observation period. You cannot retroactively create a six-month tamper-evident audit trail. You can only start it now.

Faramesh's Decision Provenance Records, policy-as-code, and ephemeral credential broker are designed to produce exactly the evidence package described in this post. The core is open source at github.com/faramesh/faramesh-core. The managed platform with multi-tenant audit trail and compliance reporting is at faramesh.dev.


[GET STARTED IN MINUTES]

Ready to give Faramesh a try?

The execution boundary your agents are missing.
Start free. No credit card required.
