Agentic AI Security: Wrong Context, Wrong Decisions at Machine Speed


Context is the central plank of AI in general, and agentic AI in particular. If an AI system doesn’t have the correct context, it cannot make the correct decisions.

Security is moving toward reliance on the autonomous and automatic action of agentic AI. It has little choice. The increasing speed, volume and efficiency of attacks automated by adversarial use of both generative and agentic AI can only be matched by defensive AI that runs with as little slow human intervention (the proverbial human in the loop) as possible.

But defensive agentic AI can get it wrong and make bad decisions through lack of context. We’re not yet ready for fully autonomous AI.


“The problem that keeps me up at night is simple: an agent is only as good as the context it operates on,” explains Emanuel Salmona, CEO and co-founder at Nagomi Security. “Give it an accurate, correlated view of your environment – your assets, your controls, your exposures, your threat landscape – and it can make decisions that genuinely reduce risk. Give it incomplete data and it will still act. Confidently. Quickly. Incorrectly. Automation without verified context is just a faster way to be wrong at scale.”

Confidence is provided by the LLM used by the agentic system (it’s what LLMs are designed and trained to do). Speed comes from the machine-speed performance of artificial intelligence. Accuracy depends on the accuracy of the context the agent uses. Context is king. Inadequate context can lead to bad decisions that are made confidently, reached quickly, and implemented automatically.

This reliance on context applies to all agentic AI used in business, including customer service automation, software development, financial operations, sales operations, and personal assistants – and autonomous SOC applications. Give them the wrong context and they will give you bad decisions.

Context

Context plays a much smaller role for a bare LLM. There, context is fundamentally the user’s prompt, to which the LLM responds in accordance with its training. The LLM’s context is this prompt window, comprising both query and response, and it is stateless.


Agentic AI has a goal. Its context is stateful and includes anything and everything it is allowed to see and use to achieve its goal. If the context it is given does not include the relevance of a specific device to business continuity, the response it provides will not take that into consideration – it could make immediate shutdown its conclusion, unaware of the catastrophic business effect of shutting down that device at this moment.
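The shutdown scenario can be made concrete with a minimal Python sketch. It assumes a hypothetical `DeviceContext` record and a made-up severity threshold of 7; neither comes from any real product. The point it illustrates is narrow: when business impact is absent from the context, the safer default is to hand the decision to a human rather than treat silence as permission.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceContext:
    """What the agent knows about a device (all field names are hypothetical)."""
    device_id: str
    alert_severity: int                       # 0-10, as reported by the alert source
    business_critical: Optional[bool] = None  # None means "not in the agent's context"


def choose_action(ctx: DeviceContext) -> str:
    """Pick a response; never shut down when business impact is unknown or high."""
    if ctx.alert_severity < 7:  # illustrative threshold, not a recommendation
        return "monitor"
    if ctx.business_critical is None:
        # The context is silent on business impact, so a human decides.
        return "escalate_to_human"
    if ctx.business_critical:
        return "contain_without_shutdown"
    return "shutdown"


print(choose_action(DeviceContext("db-01", alert_severity=9)))
print(choose_action(DeviceContext("kiosk-7", alert_severity=9, business_critical=False)))
```

An agent without the `None` check would collapse "unknown" into "not critical" and shut the device down, which is exactly the failure described above.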

Agentic AI does not stop until it achieves its goal. Put simply, the agent uses its context to draft a possible response to a received alert and presents that proposal to an LLM in the form of a prompt. If the LLM does not agree with the validity of the proposal, its reply is added to the agent’s context, and a new proposal/prompt is issued based on the new context.

Eventually, the proposal and the LLM’s response will agree, and the agent will, if so designed, enact the proposal automatically and, where allowed, autonomously. Since the end could in theory allow device isolation or shutdown, autonomous automatic shutdown could be the result (the end) governed by the context (the means). In agentic AI, the end must not justify the means; the means must justify the end. If the context is lacking, the decision of the AI will almost certainly also be lacking.
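The propose-and-review loop can be sketched in a few lines of Python. This is a simplified illustration, not how any particular agent framework works: `run_agent`, `stub_propose` and `stub_review` are hypothetical names, the reviewer is a hard-coded stand-in for an LLM call, and the `max_rounds` cap is an added assumption so that the loop cannot run forever.

```python
from typing import Callable, List, Tuple

Proposer = Callable[[List[str]], str]
# A reviewer takes (proposal, context) and returns (approved, feedback).
Reviewer = Callable[[str, List[str]], Tuple[bool, str]]


def run_agent(propose: Proposer, review: Reviewer,
              context: List[str], max_rounds: int = 5) -> Tuple[str, int, List[str]]:
    """Loop propose -> review, feeding each objection back into the context."""
    context = list(context)  # work on a copy so the caller's list is untouched
    for round_no in range(1, max_rounds + 1):
        proposal = propose(context)
        approved, feedback = review(proposal, context)
        if approved:
            return proposal, round_no, context
        context.append(feedback)  # the objection becomes part of the context
    return "escalate_to_human", max_rounds, context


def stub_propose(context: List[str]) -> str:
    """Stand-in for the agent's planner."""
    if any("business-critical" in item for item in context):
        return "isolate network segment, keep device running"
    return "shut down device"


def stub_review(proposal: str, context: List[str]) -> Tuple[bool, str]:
    """Stand-in for the LLM reviewing the proposal."""
    if proposal == "shut down device":
        return False, "device is business-critical; shutdown not acceptable"
    return True, ""


proposal, rounds, final_context = run_agent(
    stub_propose, stub_review, ["alert: malware beacon from device-17"])
print(proposal, rounds)
```

In this run the first proposal is rejected, the objection enters the context, and the second proposal is approved. Note that the reviewer only helps if it knows something the original context lacked; if both sides share the same blind spot, they will agree on the wrong action.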

If the agent designer and developer get the context right, agentic AI can be a massive boon to the security of the user. If the context is wrong or inadequate, any autonomous action could be catastrophic. The precise context must be defined by the agent’s goal. But getting it right is very difficult.

Too much context for an agent is similar to sensory overload for a human: slower reasoning and degraded performance, goal drift and loss of focus, oscillation between incompatible actions (the agent may get stuck in a never-ending loop), and potential hallucinations as it attempts to connect loosely related bits of data.


Too little context is even more problematic. Just as humans might guess the answer to a problem by assuming bits of data that seem logical, so an agent that is instructed to achieve a goal might invent data to bridge the gap in its contextual knowledge. Operational accuracy and reliability may be lost through more hallucination. That hallucination could be a very bad decision delivered confidently.

The real world is constantly changing, so an agent’s context must continually be updated. Here, its ability to learn and adapt its own context can help. For example, a professional assistant in the US could be instructed to initiate a video meeting with an engineer in Europe. If it does so using its US timezone, it could be out of sync with Europe. The engineer’s personal assistant might reject this and reply, ‘I can only accept calls within this (UTC) timeframe’. The US assistant receives this, and the knowledge could become part of its context for future reference.
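The meeting example can be sketched as follows. Everything here is illustrative: `learned_windows`, `remember_window` and `fits_window` are hypothetical names, the 08:00 to 16:00 UTC window is invented, and a fixed UTC-5 offset stands in for a US timezone (a real assistant would need proper timezone data that handles daylight saving time).

```python
from datetime import datetime, time, timedelta, timezone
from typing import Dict, Tuple

# Fixed offset keeps the sketch dependency-free; it ignores daylight saving time.
US_EASTERN = timezone(timedelta(hours=-5))

# Constraints the assistant has learned from replies, keyed by contact.
learned_windows: Dict[str, Tuple[time, time]] = {}


def remember_window(contact: str, start_utc: time, end_utc: time) -> None:
    """Store a contact's acceptable call window (in UTC) for future use."""
    learned_windows[contact] = (start_utc, end_utc)


def fits_window(contact: str, proposed: datetime) -> bool:
    """True if the proposed start falls inside the contact's known UTC window."""
    if proposed.tzinfo is None:
        raise ValueError("proposed time must be timezone-aware")
    if contact not in learned_windows:
        return True  # nothing learned yet, so nothing to violate
    start, end = learned_windows[contact]
    return start <= proposed.astimezone(timezone.utc).time() < end


# The European assistant rejects a call and states its window; the US side keeps it.
remember_window("engineer", time(8, 0), time(16, 0))

late = datetime(2025, 1, 15, 16, 0, tzinfo=US_EASTERN)   # 21:00 UTC
early = datetime(2025, 1, 15, 9, 0, tzinfo=US_EASTERN)   # 14:00 UTC
print(fits_window("engineer", late), fits_window("engineer", early))
```

The first attempt fails only once: after the rejection is stored, later proposals are checked against the learned window before they are sent.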

The ease with which context can be improved and expanded offers hope that the use of agentic AI will improve. It will make bad decisions to begin with and get better with usage, but the ability to do so must be built into the system.

The problem for agentic security

Using AI to automate the work of the SOC provides an example of potential agentic issues.

The primary purpose of the original SOC is to manually triage alerts and find and respond to those that are most urgent and dangerous to the business and its IT infrastructure. This is costly and time-consuming while the time-to-disaster is collapsing. The appeal of using AI to increase the speed of triaging and reduce its cost is obvious.

SOC analysts already receive an abundance of alerts from multiple sources: EDR, NDR and XDR, SIEM and SOAR, IAM and threat information platforms. And we should include the SBOMs that should be provided with all new software and should provide vulnerability details. Getting data is not the problem. Interpreting and using data is the problem.

The difficulty for agentic AI in security is twofold. Firstly, it can only operate within the data it is given (which is its context). The conclusion it reaches while analyzing an alert within the confines of its context is entirely dependent on the adequacy of that context. To make it more difficult, adequate context is continually changing since business and infrastructure are continually changing.

Secondly, even if the context is good, the recommendation from the agent is usually poor – its reasoning is not competently explained to the user. Even with a human in the loop, the information provided by the AI may simply be, ‘this alert means there is a critical issue with this device, act now’, or perhaps ‘critical’ or ‘mild’, or ‘8 out of 10’ or ‘3 out of 10’. 

The attraction of feeding alerts into an agentic system to perform machine-speed autonomous triaging is obvious. But the process comes with a major flaw. “No board would accept a set of numbers without an audit trail, yet many accept intelligence that shapes approvals and decisions with no method of visibility,” says Adam Irwin, managing partner at Heligan Strategic Advisory.

Agentic automation feeds raw alerts to the agent without the benefit of SOC expert triaging and then makes a decision on those alerts that is accepted by management without the benefit of visible reasoning. We question what we see on paper but automatically assume that our AI is correct. We are likely to assume that an autonomous SOC is accurate, but we have no proof that it is.

One alternative approach

Obbe Knoop, founder and CEO at Lanxit, has a different approach – his Security Decision Intelligence Layer uses artificial intelligence, but is not an agentic AI system. He believes that agentic AI is not sufficiently mature to be trusted with autonomous action; decisions and actions should currently be left in the hands of human experts. But those experts are being hamstrung by receiving too much data, too little reasoning, and little or no context.

“I take an alert and I pull in all the context that I need to make a decision,” he explains. “I go to the VPN gateway, I go to the identity solution, for example Okta, I go to Active Directory, I go to CrowdStrike, I pull in threat intel, I look at the target’s CMDB, and I look at the business structure, purpose and employees.”


Knoop gathers the context, fresh every time at the time of use. His product analyzes alerts in that current context and makes a recommendation within minutes. But it doesn’t simply say, this is critical or this is not critical; it explains why it has made its conclusion, and what the user should do about the situation. It will even say, “I don’t have enough context to make a clear decision on this alert” rather than hallucinate a recommended but ultimately guessed action.
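The pattern described here, explaining a conclusion and declining to guess when context is missing, can be illustrated with a small Python sketch. It is not Lanxit's implementation, which the article does not detail; the source names in `REQUIRED_SOURCES`, the `Recommendation` type and the toy decision rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical context sources that must all be present before deciding.
REQUIRED_SOURCES = ("identity", "endpoint", "threat_intel", "cmdb")


@dataclass
class Recommendation:
    verdict: str
    reasons: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)


def recommend(context: Dict[str, Optional[dict]]) -> Recommendation:
    """Return a verdict with reasons, or abstain if any source is missing or empty."""
    missing = [name for name in REQUIRED_SOURCES if not context.get(name)]
    if missing:
        return Recommendation(
            verdict="insufficient_context",
            reasons=["no data from: " + ", ".join(missing)],
            next_steps=["collect the missing sources, then re-run the analysis"],
        )
    known_bad = context["threat_intel"].get("known_bad", False)
    critical_asset = context["cmdb"].get("business_critical", False)
    reasons = []
    if known_bad:
        reasons.append("source address appears in threat intelligence")
    if critical_asset:
        reasons.append("CMDB marks the target as business-critical")
    if known_bad and critical_asset:  # toy rule, for illustration only
        return Recommendation("critical", reasons, ["review the session and consider isolation"])
    if known_bad:
        return Recommendation("investigate", reasons, ["review the session logs"])
    reasons.append("no threat-intelligence match")
    return Recommendation("low", reasons, ["no immediate action"])


partial = {"identity": {"user": "a.smith"}, "endpoint": {"host": "lt-042"},
           "threat_intel": {"known_bad": True}, "cmdb": None}
print(recommend(partial).verdict, recommend(partial).reasons)
```

The function returns a recommendation only; nothing in it acts on the environment, which mirrors the decision to leave the final action to the user.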

What it will not do is take any autonomous automated action on behalf of the user. The final decision on what action to take in response to a detected issue is left to the user, but with more understanding of what is happening, why it is an issue, and a recommendation on how it could be solved – all delivered in plain English.

Does he believe AI will eventually have the maturity to be allowed autonomous action? “Probably,” he says. “But we’re not there yet.” He offers autonomous vehicles as an example. They are largely but not completely trusted. In some regions, users are still required by law to keep hands on the steering wheel, just in case. His solution is to give as much accurate and current information as possible on a potential vulnerability with recommendations on how to solve it, but to allow the user to keep hands on the wheel.

This context-based decision-making is good in many ways, but still has one potential drawback. While CMDBs are mostly accurate, that accuracy is not guaranteed. Without constant (and itself fallible) human oversight and manual maintenance wherever and whenever necessary, they can drift. An automatically interrogated CMDB may not always provide ground truth for the system’s current context.
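One partial mitigation is to treat CMDB records as dated claims rather than facts. The sketch below assumes each record carries a hypothetical `last_verified` timestamp and uses an arbitrary 30-day limit; real CMDBs differ in what they record, so this shows the idea only. Freshness is also not the same as correctness: a recently verified record can still be wrong.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple


def split_by_freshness(records: List[Dict], now: datetime,
                       max_age: timedelta = timedelta(days=30)
                       ) -> Tuple[List[Dict], List[Dict]]:
    """Separate records verified within max_age from those that may have drifted."""
    fresh, stale = [], []
    for record in records:
        age = now - record["last_verified"]
        (stale if age > max_age else fresh).append(record)
    return fresh, stale


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
records = [
    {"asset": "vpn-gw-1", "last_verified": datetime(2025, 5, 20, tzinfo=timezone.utc)},
    {"asset": "hr-db-2", "last_verified": datetime(2024, 11, 3, tzinfo=timezone.utc)},
]
fresh, stale = split_by_freshness(records, now)
print([r["asset"] for r in fresh], [r["asset"] for r in stale])
```

Stale records need not be discarded; flagging them lets the analysis state that a conclusion rests on data that has not been checked recently.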

Context to AI is like batter to a cake. If you don’t have the right ingredients mixed in the right amounts, you almost certainly won’t like the outcome.

Current state of AI decision-making

Our descriptions here are simplified, while AI and its use are evolving rapidly. LLMs are being given short-term memories, so they can have their own (limited) context, if only for the current session. Agentic AI concepts are also advancing, with better context gathering, better decisions, and fewer hallucinations. Furthermore, it is now more widely accepted that accurate interpretation of data is more important than sheer volume of data. Nevertheless, accurate and relevant context is the axis upon which all else revolves.

AI has been around for many years; but the current state of accessible AI is only a few years old. We should not expect it to behave as a mature technology, and yet we do. We don’t know how it will evolve, in either design or use, over the next few years. What is already clear, however, is that the current state of agentic AI can offer huge benefits or surprising failures depending on how we develop and manage it. Getting right the context within which it operates is essential for beneficial performance. This is possible, but as we have seen, it is very difficult to achieve because of all the pitfalls discussed above.

What is needed going forward are more efficient and reliable methods for gathering and aligning relevant context to each agentic goal, and better descriptions delivered by the AI of how and why decisions are made, with detailed explanations of the users’ options for next steps. There are signs that this is happening. We have security intelligence systems. We have autonomous agentic AI. If we combine the two in a single product and monitor its performance for a few years, we will be in a better position to understand how much and where we can allow autonomous decision making to become autonomous action taking.


https://www.securityweek.com/agentic-ai-security-wrong-context-wrong-decisions-at-machine-speed/