Securing what you cannot see: The agent behaviour problem

ISJ hear exclusively from Chris Hughes, Vice President of Strategy at Zenity, about understanding the purpose of AI agents.

AI agents are moving into production faster than security teams can observe them.

The real risk isn’t permissions, it’s the inability to understand what agents are doing and why, in real time.

AI agents are moving into production

The experimentation phase is over. AI agents are moving into production across the enterprise.

These autonomous systems make decisions, invoke tools, access data and chain actions together without waiting for human approval.

They are embedded in SaaS platforms as co-pilots and workflow automations.

They run inside developer environments writing and executing code, and they are increasingly used in internal workflows across functions such as security, HR, finance and legal.

Gartner’s recent Innovation Guide for AI Agents reflects what practitioners are already seeing.

Agents are becoming the next operational layer of enterprise technology and adoption is accelerating faster than governance.

The industry momentum is real.

Every major platform vendor is shipping agentic capabilities and organisations are deploying them into environments that interact with sensitive data, critical systems and real-world business processes.

Many existing governance frameworks, including NIST AI RMF, ISO 42001 and the EU AI Act, don’t mention agents.

Organisations are beginning to develop new governance to fill the gap.

However, a more immediate operational problem is emerging that governance frameworks alone will not solve: security teams often cannot see what these agents are actually doing.

Why traditional security models break down

The security architecture most organisations rely on was designed for a fundamentally different world. It assumes predictable systems executing defined code paths.

It assumes human-triggered actions with clear intent.

It assumes that logging what happened after the fact is sufficient for accountability and forensics.

None of those assumptions hold when you introduce autonomous agents that can take actions, inherit permissions and impact running production environments.

The mismatch is structural, not just cosmetic.

Traditional applications are static: they execute the same logic given the same inputs.

Agents, on the other hand, are adaptive: they reason about context, select tools dynamically and adjust their approach based on intermediate results.

Post-event logging captures what happened, but agents operate in real-time decision loops where the gap between action and consequence can be milliseconds. Perhaps most fundamentally, security models are built around human identity.

When an agent acts, who is the principal? Is it the user who deployed it, the developer who built it, the platform hosting it or the model provider whose weights generated the reasoning?

The industry is still working through these questions and traditional identity models do not cleanly map to agent behaviour.

This isn’t a tooling gap you can close by adding another integration to your SIEM.

It’s a structural shift in how applications behave, and neither security architectures nor industry and organisational governance frameworks have caught up.

The behavioural blind spot

Here’s a reality that most organisations haven’t fully internalised: you can see what an agent does and still have no idea why it did it.

Security teams can often observe the surface-level artifacts.

API calls get logged, data access events show up in audit trails and tool invocations leave traces. But the reasoning chain that led to those actions (the sequence of decisions, context evaluations and intermediate steps that produced the behaviour) is largely invisible, and that reasoning chain is exactly where the risk lives.

Drift is a particular danger because an agent doesn’t need to be compromised to become a problem.

It just needs to gradually shift its behaviour in ways that diverge from what was intended.

Maybe it starts accessing data sources that weren’t part of its original scope. Maybe it chains tool calls in sequences that produce unintended side effects. Or maybe it interprets ambiguous instructions in ways that expand its operational footprint beyond what anyone authorised.

None of these behaviours look like attacks.

They look like agents doing their job slightly differently than expected, in ways that compound over time and often without direct visibility, oversight or governance by security teams that rely on legacy tooling.
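One way to make this kind of drift visible is to compare what an agent was approved to touch with what it is observed touching. The sketch below is illustrative only: `AgentBaseline`, `find_scope_drift` and the event fields (`tool`, `data_source`) are hypothetical names, not part of any particular product or standard, and it assumes runtime events are already being collected somewhere.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentBaseline:
    """Hypothetical record of what an agent was approved to use."""
    agent_id: str
    allowed_tools: frozenset[str]
    allowed_data_sources: frozenset[str]


def find_scope_drift(baseline: AgentBaseline,
                     observed_events: list[dict]) -> dict[str, list[str]]:
    """Return tools and data sources seen at runtime but absent from the baseline."""
    seen_tools = {e["tool"] for e in observed_events}
    # Not every tool call touches a data source, so the field is optional.
    seen_sources = {e["data_source"] for e in observed_events if e.get("data_source")}
    return {
        "unexpected_tools": sorted(seen_tools - baseline.allowed_tools),
        "unexpected_data_sources": sorted(seen_sources - baseline.allowed_data_sources),
    }
```

A set difference like this only catches scope expansion. It says nothing about why the agent widened its footprint, which still requires visibility into the reasoning behind each action.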

Multi-step tool chains make this worse. When an agent executes a multi-step workflow, it may query a database, call an external API, process the response, generate a document, send it to a recipient and log the result.

In these situations, accountability fragments across the chain. If the output is wrong, harmful, or unauthorised, tracing the failure back to the specific decision point requires visibility that most organisations simply don’t have.
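To illustrate what that visibility could look like, the sketch below records each step of a workflow with a shared run identifier and a pointer to the step that led to it, so a bad output can be walked back to the decision that produced it. The `StepRecord` fields and the `trace_back` helper are assumptions made for illustration, and the `rationale` field presumes the agent platform exposes some stated reason for each step, which many do not.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StepRecord:
    """Hypothetical per-step record emitted by an agent runtime."""
    run_id: str
    step_id: int
    parent_step_id: Optional[int]  # None for the first step in a run
    action: str
    rationale: str  # assumed: the agent's stated reason for taking the step


def trace_back(records: list[StepRecord], run_id: str,
               failing_step_id: int) -> list[StepRecord]:
    """Return the chain of steps, oldest first, that led to the failing step."""
    by_id = {r.step_id: r for r in records if r.run_id == run_id}
    chain: list[StepRecord] = []
    seen: set[int] = set()
    current = by_id.get(failing_step_id)
    while current is not None and current.step_id not in seen:
        seen.add(current.step_id)  # guard against malformed, cyclic records
        chain.append(current)
        parent = current.parent_step_id
        current = by_id.get(parent) if parent is not None else None
    chain.reverse()
    return chain
```

The design choice that matters here is the explicit parent link. Without it, steps logged by different systems can only be ordered by timestamp, which does not show which decision caused which action.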

The challenge isn’t blocking access, it’s understanding behaviour in context, in real time, at the speed agents operate.

Early patterns emerging in practice

The agents-in-production era is young, but the failure patterns are already emerging and they’re consistent enough across organisations to be instructive.

Over-permissioned agents are the most common.

Organisations deploy agents with broad tool access because scoping permissions precisely is hard and slows deployment.

The result is agents invoking tools outside their intended role, not because they’re malicious, but because their permission boundaries are too wide and nothing prevents them from exploring the full scope of what they can do.

The industry has struggled for years to implement least-privilege access for human users.

The rapid growth in the number of agents is poised to amplify this challenge at an unprecedented scale.

Weak separation between agent and human identities is another recurring pattern.

When an agent acts using a human user’s credentials or a shared service account, the audit trail becomes meaningless.

You can’t distinguish between a human-initiated action and an agent-initiated action, which means you can’t apply different governance standards to each, and you absolutely need to.
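As a rough illustration of that separation, the sketch below gives every audit event an explicit actor type and, for agents, a separate field for the human on whose behalf the agent acted. The `AuditEvent` schema and the `requires_agent_review` rule are hypothetical, chosen only to show how a different governance standard could be applied once the two kinds of actor are distinguishable.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActorType(Enum):
    HUMAN = "human"
    AGENT = "agent"


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record that keeps the agent and the human distinct."""
    actor_type: ActorType
    actor_id: str                # the agent's own identity, not a borrowed login
    on_behalf_of: Optional[str]  # the delegating human, if any
    action: str
    resource: str


def requires_agent_review(event: AuditEvent, sensitive_resources: set[str]) -> bool:
    """Example rule: agent access to sensitive resources is flagged for review."""
    return event.actor_type is ActorType.AGENT and event.resource in sensitive_resources
```

With shared credentials, the `actor_type` field could not be populated reliably, and a rule like this one would have nothing to act on.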

Incomplete or fragmented audit trails compound both problems.

Agents often interact across multiple systems, each with its own logging format and retention policy.

The result is a patchwork of partial records that don’t tell a coherent story about what an agent did, why or what the downstream impact was.
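A minimal sketch of stitching that patchwork together follows. It assumes two invented log formats (one with ISO 8601 timestamps, one with epoch seconds) and assumes the agent appears under the same identifier in both, which in practice depends on the identity separation described above. All field and function names are hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timezone


def from_saas_log(rec: dict) -> dict:
    """Hypothetical format A: ISO 8601 timestamp with offset, 'user', 'event'."""
    return {
        "time": datetime.fromisoformat(rec["timestamp"]).astimezone(timezone.utc),
        "agent_id": rec["user"],
        "action": rec["event"],
        "source": "saas",
    }


def from_db_log(rec: dict) -> dict:
    """Hypothetical format B: epoch seconds, 'principal', 'statement_type'."""
    return {
        "time": datetime.fromtimestamp(rec["epoch"], tz=timezone.utc),
        "agent_id": rec["principal"],
        "action": rec["statement_type"],
        "source": "database",
    }


def build_timeline(agent_id: str, saas_records: list[dict],
                   db_records: list[dict]) -> list[dict]:
    """Merge both sources into one UTC-ordered timeline for a single agent."""
    events = [from_saas_log(r) for r in saas_records]
    events += [from_db_log(r) for r in db_records]
    return sorted((e for e in events if e["agent_id"] == agent_id),
                  key=lambda e: e["time"])
```

Normalising timestamps and identifiers gives a coherent account of what the agent did. It does not recover the why or the downstream impact, and it cannot restore records that a short retention policy has already discarded.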

These aren’t edge cases. They’re the default state of most early agent deployments, and they’re early warning signals for what happens when these deployments scale.

What runtime governance actually requires

If the problem is behavioural visibility, the solution must operate where behaviour occurs: at runtime.

Reviewing agent configurations at deployment time and checking permissions quarterly isn’t governance; it’s a snapshot of intent that tells you nothing about operational reality.

That snapshot-in-time approach didn’t work in earlier systems, and it is even more brittle and problematic in the agentic era.

Runtime governance for agentic AI requires three capabilities: continuous visibility into agent behaviour, policy enforcement at execution time and ongoing posture management across agents and their tool access as environments, models and tools evolve.
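To make the second capability concrete, the sketch below places a small gateway between an agent and its tools, so that every call is checked against policy and recorded at the moment of execution. `ToolGateway`, `PolicyViolation` and the allow-list policy format are hypothetical and deliberately simplistic. A real deployment would likely evaluate richer context than a tool name.

```python
from __future__ import annotations

from typing import Any, Callable


class PolicyViolation(Exception):
    """Raised when an agent attempts a tool call its policy does not permit."""


class ToolGateway:
    """Hypothetical chokepoint that every agent tool call passes through."""

    def __init__(self, tools: dict[str, Callable[..., Any]],
                 allowed: dict[str, set[str]]) -> None:
        self._tools = tools
        self._allowed = allowed  # agent_id -> names of tools it may invoke
        self.decisions: list[tuple[str, str, str]] = []

    def invoke(self, agent_id: str, tool_name: str, **kwargs: Any) -> Any:
        permitted = tool_name in self._allowed.get(agent_id, set())
        # Record the decision before acting, so denied attempts are visible too.
        self.decisions.append((agent_id, tool_name, "allow" if permitted else "deny"))
        if not permitted:
            raise PolicyViolation(f"{agent_id} may not call {tool_name}")
        return self._tools[tool_name](**kwargs)
```

Because the check runs on every invocation rather than at deployment time, changing the `allowed` mapping takes effect on the agent's next call, and the `decisions` list doubles as a continuous record of attempted behaviour.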

This is why I’ve argued that the compliance model needs to shift from periodic assessment to continuous assurance.

Standards like AIUC-1 are moving in this direction, focusing on continuous assurance rather than periodic compliance reviews.

The operational layer beneath the standard is what matters.

Can your security team actually observe, understand and govern what agents are doing right now? Without that capability, you’re unable to truly determine organisational risks and potential impacts.

From AI risk to AI operations security

The industry conversation around agentic AI has largely been framed as a risk management challenge. That framing is accurate, but the challenge is becoming something more immediate than that.

AI agents are becoming operational infrastructure.

They’re not experimental tools sitting in sandboxes; they’re making decisions that affect customers, data, systems and business outcomes in real time.

When infrastructure operates autonomously, security models must evolve accordingly.

The organisations that get this right won’t be the ones with the best policies or the most comprehensive frameworks.

They’ll be the ones that can observe and understand how agents behave in real time. If you can’t see how an agent behaves, you can’t secure it.

When security controls operate without visibility, they rely on assumptions rather than evidence. The blind spot isn’t theoretical.

It is operational, growing and already present in production environments.

The question is whether security teams will close it before the agents outrun them.
