Security of 100 AI Agents Tested and Ranked – What You Need to Know

AI is our new leader. We just accept and do what it tells us. Maybe we should be a bit more circumspect.

Concern over the performance of AI agents has been constant, ranging from ‘leaky’ to just plain wrong decision-making. With supercharged AI-assisted attacks creating steady pressure to use more agents more autonomously, Adversa AI’s decision to measure and compare the performance and security of 100 agents across ten categories is welcome.

But the results are not. Of the 100 agents tested, and positioned within a new AI Risk Quadrant, only 11 are categorized as ‘capable well-defended’. 

The root problem is the AI agent ‘lethal trifecta’, which Adversa describes as ‘private data access + exposure to untrusted content + ability for outbound actions’. This translates directly into the standard lethal trifecta of ‘too much power + too much trust + too little control’.

Since all three parts of this trifecta are necessary for an AI agent to achieve its goal, capability and security will always be a big ask. Ninety-eight percent of the agents have this trifecta, so it is no surprise to learn – but still shocking to hear – that so few are both capable (useful) and defendable (secure).
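As a rough illustration of how the trifecta test works, the sketch below flags an agent only when all three legs are present at once. The `AgentProfile` fields and the two sample agents are hypothetical and are not taken from Adversa's report or scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical capability summary for one agent (not Adversa's schema)."""
    name: str
    private_data_access: bool
    untrusted_content_exposure: bool
    outbound_actions: bool


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    """True only when all three legs of the trifecta are present together."""
    return (
        agent.private_data_access
        and agent.untrusted_content_exposure
        and agent.outbound_actions
    )


# Invented sample agents, for illustration only.
agents = [
    AgentProfile("inbox-helper", True, True, True),
    AgentProfile("offline-summarizer", True, True, False),
]
flagged = [a.name for a in agents if has_lethal_trifecta(a)]
print(flagged)
```

Removing any one leg clears the flag, which is why the report treats the three properties as a combination rather than as separate risks.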

Capability and security verge on mutual exclusion. “The same vendors shipping the most capable agents ship the widest attack surface – a structural feature of the market, not a handful of outliers,” states Adversa’s analysis in its AI Risk Quadrant for Agent Security report. It calls this a ‘power-protection inversion’ and adds that it appears in all ten agent categories.

The agent categories with the greatest power-protection inversion, however, are ‘computer agents’ followed by ‘coding agents’.

Computer agents are designed to perform a specific task, such as make a decision or perform an action for a user. Since agents can only operate with what they know (the context problem, where poor context leads to bad decisions in all agents), computer agents are given wide access rights, effectively the complete operating system. “A compromise hands the attacker the user’s entire machine, not just one application or tab,” warns Adversa.

Such agents also suffer from an issue that affects all agents: the user has little, if any, visibility into or control over what the agent actually does. It is given an input (the task), and it generates an output (the completed task). But with computer agents, the user doesn’t know the route it takes between input and output, nor what specific actions within the operating system it takes along that route.

“The deeper issue is that the desktop confirmation step looks like a control while being unreliable in practice,” warns the analysis. “The human and the model reason over different abstractions (windows and labels vs. screenshots and accessibility trees). That gap produces confirmation mismatch: the human approves the appearance of the action, not what the agent is about to do, because nothing in the interface surfaces the difference.”

The second-worst offender in the exposed giants quadrant is coding agents. This is concerning since ‘vibe-coding’ applications are becoming the future of software, and ‘vibe-coded’ in-house applications may live with us for many years.

The analysis sub-divides coding agents into three types: “coding copilots (human reviews each suggestion), autonomous coding agents (goal-in, repo-out), and app builders (prompt-to-deployed-app).” The first might appear to be the least dangerous, but the user still doesn’t know what the agent does between input and output. “Coding agents don’t just write code – they touch shell, dependencies, and tokens long before a diff lands in review,” comments Adversa.

“This is the class where compromise most directly becomes production compromise. The danger is not bad code suggestions; it is high-trust operation inside the software supply chain. Non-determinism makes code review an incomplete defense: even if a human reviews the final diff, the agent may already have traversed secrets, run tests against production-like services, modified configs, or selected risky dependencies. Review catches outputs; it does not catch the full action trail.”

Coding agents figure so highly among the exposed giants because they have a wide attack surface, an extensive blast radius, and poor defense controls. The attack surface is wide because they run shell commands, load MCP servers, and auto-load rules files. The blast radius comes from sitting inside the software supply chain with access to secrets, signing keys, and deployment pipelines. And their primary defense is a code review of the output, which doesn’t consider either the attack surface or the blast radius.

We’ve glanced at just two of the ten agent types included in Adversa’s agent analysis and AI Risk Quadrant. The other eight categories are general assistant, work copilot, browser, conversational, custom workflow, business process, platform operations, and data engineering. None come out squeaky clean. Ninety-eight percent of the tested agents are subject to the lethal trifecta; the only exceptions are one agent in the general assistant category and one in data engineering.

General comments from Adversa include: agent defaults favor velocity over safety; agents with the most power have the least protection, while the agents with the most protection have the least power; only 11% qualify for the capable and defended quadrant; tool execution accounts for 76% of blast radius; 37% of the market is audited more than defended; and 83% of claimed AI agent defenses are not publicly verifiable.

Agents are effectively black boxes – it’s a take it or leave it scenario. Business economics is forcing us to take it. Since we cannot control what the agent does while it is running, our only option is to be careful about what we input and to control, where possible, the output.

Here, Adversa recommends concentration on controlling the output since there is little that can be done on the input prompts. “Defend the legs you can own, not the one you can’t,” it suggests. “Prompt injection has no deterministic fix – no classifier reliably separates the agent’s data from its instructions, and vendors concede it. Concede the input boundary and spend the defensive budget on the trifecta legs the operator does control: egress, identity, and irreversible actions.”
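A minimal sketch of what output-side control might look like, assuming a wrapper that sees every tool call before it runs. It covers two of the three legs named in the quote (egress and irreversible actions) and leaves identity out. The host allowlist, the action names and the `ToolCall` shape are hypothetical, not drawn from Adversa's report.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlparse

# Hypothetical policy values; a real deployment would load these from config.
ALLOWED_EGRESS_HOSTS = {"api.internal.example", "tickets.example.com"}
IRREVERSIBLE_ACTIONS = {"delete_record", "send_payment", "send_email"}


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical description of one action the agent wants to take."""
    action: str
    url: Optional[str] = None
    human_approved: bool = False


def authorize(call: ToolCall) -> Tuple[bool, str]:
    """Deterministic checks on the agent's output, independent of its prompt."""
    if call.url is not None:
        host = urlparse(call.url).hostname
        if host not in ALLOWED_EGRESS_HOSTS:
            return False, f"egress to {host!r} is not allowlisted"
    if call.action in IRREVERSIBLE_ACTIONS and not call.human_approved:
        return False, f"{call.action!r} needs explicit human approval"
    return True, "ok"


print(authorize(ToolCall("fetch", url="https://evil.example/upload")))
print(authorize(ToolCall("send_payment")))
print(authorize(ToolCall("fetch", url="https://tickets.example.com/t/1")))
```

The checks never inspect the prompt, so an injected instruction can still reach the model, but an unlisted destination or an unapproved irreversible action is refused regardless of what the model was told.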

This is where we are today. The headlong rush into agentic AI solutions is irreversible but concerning. We will only match adversarial AI-assisted attacks by using AI-assisted defense. Businesses will remain competitive only if they are faster and more efficient than the competition. In business, all roads lead to AI. We must hope, and can probably expect, that AI will improve in all areas in the future. To what extent and when that may happen is another unknown.

But in the meantime, the ultimate message from Adversa’s massive and detailed analysis is clear: “Let’s be careful out there.”

https://www.securityweek.com/security-of-100-ai-agents-tested-and-ranked-what-you-need-to-know/




AI: Trump Wants to Test the Most Powerful Models Before They Reach the Market

Testing and evaluating the most powerful AI models to strengthen national security: the executive order signed by Trump

It will not be an obligation imposed on Big Tech companies such as OpenAI, Google or Anthropic, but a voluntary mechanism for collaboration with the US Government. That is what the latest executive order signed by President Donald Trump provides for, laying the groundwork for a new phase of testing and evaluation of the most advanced artificial intelligence models at the federal level.

Officially, according to the document published by the White House, the initiative aims to strengthen coordination among the main federal departments, from the Department of Defense to the Treasury, by way of National Security and Homeland Security, in order to bolster the cyber protection of the country and its critical infrastructure.

Before the most powerful AI models are deployed and released to the market, the Trump administration has therefore expressed its intention to assess their security aspects in advance, with the stated goal of protecting national interests.

“Advanced artificial intelligence capabilities make our nation stronger, but they also introduce new national security implications that require coordinated action across executive departments and agencies,” the executive order reads. The document also stresses that the Trump administration “will work closely with industry to ensure that the most advanced and secure technologies are deployed rapidly to counter any threat to the country.”

Signing postponed by 10 days so as not to harm Big Tech?

According to NBC News, the order was to have been signed on May 21 at a public ceremony attended by the chief executives of the leading US technology companies. At the last moment, however, Trump reportedly canceled the event and postponed the signing, which then took place yesterday behind closed doors.

According to the US broadcaster, the decision was motivated by the fear of penalizing Big Tech itself and undermining the competitiveness of the US technology industry against China.

That is not all. The delay may have allowed the administration to make substantial changes to the text. The order states that the voluntary testing program will give the federal Government access to frontier artificial intelligence models up to 30 days before they are distributed to other trusted partners, compared with the 90 days envisaged in the initial draft.

The document also directs attorneys general to give priority to cases involving the use of artificial intelligence, with particular attention to AI agents and autonomous systems used in cybercrime.

In this context, the case of Anthropic’s Mythos may have helped accelerate an internal process of greater attention by the federal Government to national cybersecurity issues.

If AI ends up in the wrong hands …

On the one hand, the model could be a useful tool for helping companies and organizations identify vulnerabilities in their own cybersecurity systems. On the other, various experts, as well as numerous members of the administration, fear that such capabilities could be exploited by malicious actors to identify and strike weaknesses in software and digital infrastructure.

It is relatively rare for the Trump administration to step in with measures that introduce forms of oversight of the most advanced technologies. Until now, its approach has mostly pointed in the opposite direction, with strong opposition both to federal regulatory proposals and to legislative initiatives by individual states that, according to the White House, would risk slowing American innovation in artificial intelligence.

From voluntary to mandatory participation?

There is, however, no shortage of criticism of the White House’s approach, as reported by Politico. Caleb Knapp, senior policy manager at the Alliance for Secure AI, called the order “a good starting point for building the institutional capacity needed for effective oversight of advanced artificial intelligence models.”
According to Knapp, however, a system based exclusively on the voluntary participation of companies would not be enough to ensure adequate control of the most powerful technologies. For this reason he urged Congress to step in with stricter legislation, requiring developers to submit their models to government review before release. It is a position that highlights the ongoing debate in the United States between those who consider it necessary to strengthen oversight tools and those who fear that excessive regulation could slow innovation.

Caleb Max, president and CEO of the National Artificial Intelligence Association, likewise stressed that voluntary initiatives often represent only a first phase of the regulatory process: “The government rarely adopts measures that are destined to remain exclusively voluntary. As a rule, regulations tend to become more restrictive over time, not more permissive.”

https://www.key4biz.it/ai-trump-vuole-testare-i-modelli-piu-potenti-prima-del-lancio-sul-mercato/574024/




Microsoft’s Project Solara is an Android OS designed for agents instead of apps

However, Microsoft is clear that this is still just a concept. None of it works, but the company is committed to spending money on it as part of its massive AI expansion plans.

Agentic concepts

Microsoft has shown off two concept devices that illustrate where it hopes to go with Project Solara. The more conventional is the Desk Concept, which looks like a typical smart display. It’s got a touchscreen, microphones, and a camera. While you sit at your desk, this gadget would keep you apprised of what your theoretical AI agents are doing on your behalf. It can act as a secondary monitor or become a standalone Windows PC with Windows 365 cloud computing. This concept is built around MediaTek IoT chips.

The other Solara concept skews weirder. What if the work badge at the end of your lanyard had a touchscreen, 5G connectivity, a camera, microphones, and a fingerprint scanner? That’s the Badge Concept. It would have the same Solara software, piping in generative interfaces from your preferred AI agent. Microsoft envisions this Qualcomm-based device providing biometric-authenticated access to your agents—just tap the sensor and start telling your personal robot what to do. It could also record and summarize meetings and use the camera to “take action on the environment,” whatever that means.

You can’t even get in line to buy either of these devices. Microsoft’s next step is to demo its agent-first devices with industry partners, including AccuWeather, Best Buy, CVS Health, Levi’s, and Target.

Microsoft has struggled to branch out beyond traditional computing and enterprise services, having tried and failed on numerous occasions to gain a foothold in mobile computing. With AI, Microsoft was uncharacteristically at the forefront of change. With its OpenAI deal sputtering, the company is now looking to the future, and this is it: agents instead of apps.

This is an interesting pitch for how we might actually use AI agents, and it’s not coming totally out of left field. Google is also pursuing agentic interfaces in its search products. At I/O, Google previewed new agent-first search tools that can instantly build dashboards and mini-apps based on your search queries.

As vague and pie-in-the-sky as Project Solara may be, Microsoft is pretty in tune with the rest of big tech’s AI plans. If any of it works, we can only hope it doesn’t lead to a new generation of touchscreen millstones around our necks.

https://arstechnica.com/gadgets/2026/06/microsofts-project-solara-is-an-android-os-designed-for-agents-instead-of-apps/




Mathematicians warn of AI threats to profession as industry encroaches

Recommendations for humans

So what is a human mathematician to do during the AI boom? The Leiden Declaration recommends that individual mathematicians transparently disclose their use of AI tools, retain responsibility for the correctness of their mathematical work, continue crediting human authors while properly attributing work even if AI tools make that difficult, and consider using only AI tools that align with the values articulated in the declaration.

The declaration also reminds mathematicians that mathematics has “applications in the development of technology for use in warfare, oppression, mass surveillance, and the undermining of democracy,” and so mathematicians should make ethical decisions accordingly when choosing external partnerships with tech companies.

Professional mathematical organizations can develop guidelines for the use of AI and other automated tools in publication and review, protect the rights of researchers as authors through licensing agreements that prevent their work from being used as training data without consent, and support the role of peer-reviewed publications. The declaration also suggests such organizations “actively prepare to become involved if major mathematical results are claimed using unconventional means.”

The authors of the declaration also offer straightforward recommendations for policymakers, including “protect the rights of authors,” “regulate the artificial intelligence industry,” and “invest in public computational infrastructure.” Under “don’t believe the hype,” the declaration warns about how “there is currently a strong commercial incentive on the part of the technology industry to overstate the capabilities of their products.”

Lastly, the declaration acknowledges that the tech industry “has offered lucrative jobs, monetary rewards, computing resources, and intellectually stimulating opportunities that some mathematicians have found attractive… in an era of underfunding of higher education and precarious academic employment.” It calls on such collaborations between mathematicians and the tech industry to abide by the standards laid out in the declaration.

“By endorsing the declaration, the IMU affirms that the future of mathematical research must be guided by human judgment, fair and transparent practices, and the shared values of the global mathematical community,” said Ulrike Tillmann, vice president of the International Mathematical Union, in a statement. “Mathematics is, and should always remain, a profoundly human endeavor.”

https://arstechnica.com/tech-policy/2026/06/mathematicians-warn-of-ai-threats-to-profession-as-industry-encroaches/




Android phones will soon be able to detect spoofed calls and impersonation scams

We’re expecting Android 17 to begin rolling out later this month, but first, Google has a batch of updates for the wider Android device ecosystem. As usual, some of the new features are limited to specific devices, and others require using Google’s apps. But if you don’t mind the latter, you can get automated protection from the growing threat of deepfake phone scams.

According to Google, “impersonation fraud” is one of the most common types of financial scams. The FTC tracked almost $3 billion in losses from such scams during 2024, and the improvements in AI voice cloning tools more recently are making the schemes easier to pull off. The voice models are becoming so capable that it can be difficult to identify a fake caller even when an AI is imitating someone you talk to every day.

Google’s solution is an expansion of the system it debuted last month for verified financial calls. Now, a similar feature will work with anyone in your contacts. Many of the most effective deepfake scams involve spoofing a contact’s number, which makes the call look more legitimate when your phone lights up. Victims of these scams are then greeted by an accurate re-creation of the person’s voice spinning a yarn that involves an urgent need for cash.

Google’s scam call detection feature will be available on all phones running Android 12 or higher, but it does require you to have three Google apps installed: Phone by Google, Contacts, and Google Messages. Depending on your device, you may already have these. They’re the preloaded options on Pixel and Motorola phones, and Samsung has now switched over fully to Google Messages. Google claims that Phone by Google is the most widely used dialer, but that doesn’t seem right—Samsung has its own phone app, and it’s the largest Android OEM by far.

https://arstechnica.com/gadgets/2026/06/google-announces-deepfake-call-detection-for-android-new-airdrop-device-support/




Two New Reports Offer Competing Explanations for Cybersecurity’s Growing Crisis

Two reports offer differing viewpoints. One suggests a failure of tools to provide what security teams really need. The other suggests the tools exist but are not properly managed.

The industrialization of cybercrime threatens to overwhelm cyber defense. It’s a process that started before the arrival of ChatGPT, was supercharged by the age of AI, and is now typified as the post-Mythos era. It’s a time when defenders must improve their performance or cede the battleground to the adversary. Applications are the battlefield. The speed, scale and sophistication of AI-assisted attacks are difficult to contain.

“AI is not just creating more vulnerabilities. It is exposing the fact that companies cannot fix known vulnerabilities fast enough,” explains Daniel Shechter, CEO and co-founder at Miggo Security. “For years, security programs have been measured by how well they find risk before software goes live. Frontier AI like Mythos changes the question. If attackers can move from disclosure to exploit in hours, boards and CISOs need to understand how long the business remains exposed, and what can be done to mitigate quickly and efficiently.”

The Cloud Security Alliance (CSA) State of Modern Application and AI Security report (PDF), commissioned by Miggo and published on June 2, 2026, confirms and explains this new reality. CSA surveyed more than 900 cybersecurity leaders and found that vulnerabilities in this post-Mythos era are evading the pre-production phase while 82% of organizations lack effective runtime visibility.

“The real challenge begins once applications are in production, where security teams must rapidly determine which exposures are truly exploitable, prioritize the risks that matter most, and respond before attackers can take advantage,” suggests Shechter.

Most breaches are driven by known vulnerabilities. Eighty percent of the companies surveyed have suffered at least one incident involving a known vulnerability in the last year. If it is known, it is almost certainly patchable; but in the post-Mythos era there are too many patches to handle. The biggest problem is knowing which of those vulnerabilities are exploitable and most urgently need patching.

Only 9% remediate critical vulnerabilities within 24 hours, while 74% take one to seven days. Patch time is important: organizations taking four or more days had a 97% incident rate. Those taking three or fewer had a 67% rate. The implication is that patch rates must be increased or exploitable vulnerabilities better understood – preferably both.

It gets more complicated, and more urgent, at runtime, which is described as the breach battlefield. Most organizations learn what happened only by reconstructing the event after the horse has bolted. Most (73%) would adopt virtual patching if they had better confidence in minimal false positives; but only 17% configure WAFs for automatic blocking, with 56% citing a lack of application context as the reason.

Because of these runtime difficulties, 42% of organizations intend to increase investment in runtime monitoring and protection over the next few years. But since prevention is always better than cure, the bulk of investment (52%) remains in pre-production, such as CI/CD build protection.

The potential solutions are clear. Improved visibility into vulnerability exploitability together with better all-round contextual understanding of the application concerned – and its effect on business stability – would allow autonomous patching for many vulnerabilities and confidence in increased automated blocking.

A separate FireMon Insights report, also published June 2, 2026, suggests that concern over the automated use of firewalls as a security barrier is unsurprising but at least partially due to a lack of human oversight. FireMon discusses firewalls in general, but the same principles will apply to WAFs.

“Firewall complexity is no longer just an operational problem. It is a control problem,” says Jody Brazil, CEO at FireMon. “Security teams have massive investments in firewalls, cloud, and segmentation platforms, but without control of policy those environments become difficult to manage securely. The problem is no longer lack of tools. It is lack of operational control.”

The report concludes that manual policy management is inefficient and lets risk across the attack surface keep expanding rapidly, primarily because high-severity policy failures persist for extended periods and are exacerbated by unused and redundant rules.

FireMon suggests a failure in human management rather than firewall capability. For example, 45% of firewall rules lack an owner or documentation, 17% are redundant or shadowed, and 69% are unused.
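To make the ‘redundant or shadowed’ category concrete, here is a simplified sketch of how such rules might be detected in a first-match rule list. It is not FireMon’s method. The `Rule` shape, the sample policy, and the working definitions are assumptions: a later rule is called shadowed when an earlier rule with a different action already matches all of its traffic, and redundant when the earlier rule has the same action. Detecting unused rules would need hit counters from live traffic, which this sketch does not model.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """Simplified, hypothetical rule shape; real policies carry more fields."""
    name: str
    src: str
    dst: str
    port: Optional[int]  # None means any port
    action: str          # "allow" or "deny"


def covers(earlier: Rule, later: Rule) -> bool:
    """True if every packet matching `later` would already match `earlier`."""
    src_ok = ip_network(later.src).subnet_of(ip_network(earlier.src))
    dst_ok = ip_network(later.dst).subnet_of(ip_network(earlier.dst))
    port_ok = earlier.port is None or earlier.port == later.port
    return src_ok and dst_ok and port_ok


def find_unreachable(rules: List[Rule]) -> List[Tuple[str, str, str]]:
    """Return (rule, covering_rule, kind) for rules that can never fire."""
    findings = []
    for i, later in enumerate(rules):
        for earlier in rules[:i]:
            if covers(earlier, later):
                kind = "redundant" if earlier.action == later.action else "shadowed"
                findings.append((later.name, earlier.name, kind))
                break  # first-match semantics: the earliest covering rule wins
    return findings


# Invented sample policy (IPv4 only), evaluated top to bottom.
rules = [
    Rule("allow-corp-web", "10.0.0.0/8", "192.168.1.0/24", 443, "allow"),
    Rule("deny-lab-web", "10.1.2.0/24", "192.168.1.0/24", 443, "deny"),
    Rule("allow-lab-web-dup", "10.1.0.0/16", "192.168.1.0/24", 443, "allow"),
    Rule("allow-ssh", "10.0.0.0/8", "192.168.1.0/24", 22, "allow"),
]
findings = find_unreachable(rules)
for finding in findings:
    print(finding)
```

In this sample the deny rule never takes effect because the broader allow rule above it matches first, which is the kind of silent policy failure that an unowned, undocumented rule base makes hard to spot.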

While this suggests a route toward better usage of firewalls, it doesn’t discuss or explain the fear that contextually incorrect blocking rules might adversely affect business operations – which lies at the heart of improving application security.

The two reports are, however, slightly at odds. The CSA report suggests the problem is a failure of security tools to provide the solutions really necessary, while the FireMon report suggests the tools exist, but are not being properly managed.

https://www.securityweek.com/two-new-reports-offer-competing-explanations-for-cybersecuritys-growing-crisis/




Anthropic Expanding Mythos Access to 150 New Organizations

Anthropic announced on Tuesday that it is expanding Project Glasswing, its collaborative program aimed at securing critical software using AI. 

The initiative, launched with roughly 50 initial partners in early April, granted them access to Claude Mythos Preview. Those partners have since used Mythos to scan codebases and identified thousands of vulnerabilities.

The expansion adds roughly 150 new organizations, each required to meet Anthropic’s standards before gaining access. These partners are based in more than 15 countries and include providers of critical infrastructure in sectors such as power, water, healthcare, communications, and hardware. 

Many are vendors and maintainers of widely used codebases relied upon by governments and other organizations worldwide.

A common factor among the new partners is the potential impact of a successful cyberattack targeting their products, which could affect more than 100 million people for most participants and carry significant national and global security implications. 

The expansion of Project Glasswing follows collaboration with existing partners, the security industry, open source software maintainers, and the US government.

Anthropic has not shared the expanded list of partners, but the Financial Times reported that the newly added organizations include Okta, Samsung, the EU cybersecurity agency ENISA, and NATO. 

The AI giant reported recently that Mythos identified more than 23,000 potential vulnerabilities, with the company estimating that more than 6,000 will be confirmed as severe flaws.

Organizations such as Mozilla, Palo Alto Networks, and Cloudflare saw good results when turning Mythos against their own products. 

With Mythos and other AI tools rapidly discovering vulnerabilities, the problem now shifts to verifying and patching them. For instance, of the thousands of security bugs found by Mythos, only 75 critical and high-severity issues have been patched. 

Anthropic says Mythos can also help with verification and patching, and the company is working with others to “substantially scale up the reviewing and patching of vulnerabilities in open-source software”.

“We’re also working on sharing ideas and best practices for disclosing vulnerabilities to open-source maintainers, with the intent of making these reports easier to triage and to act upon,” Anthropic said.

https://www.securityweek.com/anthropic-expanding-mythos-access-to-150-new-organizations/




Meta AI Hands Over High-Profile Instagram Accounts to Hackers

Threat actors compromised multiple high-profile Instagram accounts last week by simply asking Meta’s AI-powered account recovery assistant to hand them over.

The attackers exploited a logic flaw in the AI assistant, a classic ‘confused deputy’ issue, to have their own email addresses linked to the targeted accounts and take them over.

Confused deputy weaknesses have been known to security researchers for decades and involve tricking a deputy that has elevated privileges into performing specific actions on the attacker’s behalf.

In this case, the Meta AI assistant had API access to account management systems, being deployed to help users re-link email addresses, reset passwords, and verify they are the owners of specific accounts.

Due to the logic flaw, hackers were able to simply ask the chatbot to link a targeted account to a new email address, under the pretense that they had been hacked or that they had lost access to the previously linked email address.

To bypass Meta’s fraud detection protections, they used VPNs to appear as if they were in the target’s geographic location.

The AI assistant happily linked the new email address and then sent a code that allowed the attackers to reset the password for the targeted account, locking the rightful owner out.

In the event that the chatbot asked for a selfie to verify account ownership, the attackers reportedly modified victims’ photos using AI tools and submitted the altered images.

Inexplicably, the attack also bypassed two-factor authentication (2FA) protections for the targeted accounts, and some victims say they were never notified of the password reset attempts.

Hundreds of high-profile accounts were reportedly compromised and immediately sold on the dark web. Some miscreants were seen sharing videos and instructions on how the account takeover is performed.

Using the trick, the hackers gained access to the Obama White House handle and to the accounts of Sephora and John Bentivegna, the Chief Master Sergeant of the Space Force.

Instagram parent company Meta has resolved the issue, and the exploit no longer works, but it’s unclear how many accounts might have been affected. SecurityWeek has emailed the company for a statement and will update this article if it responds.

“This is a great illustration of why AI agent authorization is the harder, and more critical, problem than authentication. Meta’s bot verified nothing about who was asking; it just helpfully did what it was told to do, up to and including sending the attacker a confirmation code to make sure the new email address was valid. The industry is pretty focused on keeping AI from saying bad things. That’s fine, as long as we don’t completely overlook whether AI should be allowed to do what it’s trying to do,” FusionAuth senior director Dan Moore commented.

https://www.securityweek.com/meta-ai-hands-over-high-profile-instagram-accounts-to-hackers/




AI costs how much? GitHub Copilot users react to new usage-based pricing system.

“Even though I was super cautious on the first day, trying it out with a limited number of uses, it still consumed 840 credits,” one user wrote of testing Claude Sonnet 4.6 through Copilot today. “I haven’t even done any really complex work yet,” another user complained after reporting usage representing 21 percent of their monthly Pro Copilot subscription’s credit allotment in a single day. “I have a feeling I’ll be going somewhere else pretty soon.”

Using all 8,000 of your org’s monthly AI credits in a single day is… probably not sustainable. Credit: gxjo / X

Amid the pricing change, plenty of GitHub Copilot users are predictably and publicly threatening to cancel their subscriptions or looking for other AI coding options. But others say they have been able to adjust to the new world of usage-based pricing. Coder Henri Kinnunen writes that they burned only 161 credits in a “productive day” of using GPT 5.3-Codex through Copilot, thanks to limiting themselves to “very focused and deliberate changes with AI.” Over on Bluesky, coder Neil Hewitt noted that continuing a three-day-old chat session on Copilot is probably no longer wise, since it means sending “the entire chat history as context every time… hey, input tokens use credits… it’s not rocket science.”
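To see why a long-running session gets expensive, here is a rough Python sketch of input-token growth when every request resends the full history. The per-turn token count is an invented round number, and the model ignores output tokens, any caching, and Copilot's actual credit conversion, none of which are given here.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens sent when each request resends the whole history."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn  # the new message joins the history
        total += history            # the entire history goes out as input
    return total


# One 10-turn session versus ten fresh single-turn sessions
# (500 tokens per turn is an assumption for illustration).
long_session = cumulative_input_tokens(10, 500)
fresh_sessions = 10 * cumulative_input_tokens(1, 500)
print(long_session, fresh_sessions)  # 27500 5000
```

Under these assumptions the total grows roughly with the square of the number of turns, which is why starting a new session for a new task costs less than extending an old one.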

While some Copilot users are jumping ship for other services with more generous usage limits, that kind of subsidized customer acquisition may soon give way to Copilot-style usage-based pricing across the industry. If that happens, LLMs that are more efficient with their tokens may win the economic battle; on Reddit, one user is already discussing how they’ve integrated DeepSeek into their GitHub VS Code environment at a cost of only “about 7 cents for 15 million tokens.” While you might say “you get what you pay for,” some AI users are now contemplating a world where they also have to pay for what they get.

https://arstechnica.com/ai/2026/06/ai-costs-how-much-github-copilot-users-react-to-new-usage-based-pricing-system/




Hackers duped Meta AI support chatbot to steal celebrity Instagram accounts

Both ZachXBT and Dark Web Informer also confirmed how hackers had targeted and resold particularly valuable Instagram accounts, including the short handles @hey and @jowo with a “combined gray-market valuation estimated above $1 million,” according to the CyberSec Guru. Such accounts can be valuable even if hackers hold them for just a few days because of “clout, resale or brand impersonation,” the security blog reported.

The wide security hole

The CyberSec Guru also described the exploit as representing the classic “confused deputy” problem from computer security, in which a program with elevated permissions is tricked into misusing those permissions on behalf of a less privileged third party. But in this case, the “deputy” was a large language model with a “probabilistic response model you can nudge with words” instead of a “deterministic program” with “hard-coded conditionals you’d need to bypass with code.”

It’s worth keeping in mind that users had simple security solutions available, even while the Meta AI support chatbot was being exploited. The hackers reported their exploit failing against any accounts that had enabled multifactor authentication (MFA), including the “least robust form of MFA that Instagram offers” in the form of one-time codes sent through SMS, according to KrebsOnSecurity.

But the exploit still highlights the broader risk of tech companies and other organizations rushing to deploy AI agents with elevated permissions that allow them to modify, create, or delete critical data. Meta had launched its Meta AI support assistant in March 2026 with the promise that it could “provide reliable, 24/7 support for nearly any support issue at any time.”

The “minimum” architecture required to do this more safely, according to the CyberSec Guru, would include “out-of-band verification before any account modification… rate limiting on AI-initiated reset flows keyed to account risk signals, action logging with anomaly detection for unusual AI-driven account modifications, and a hard deterministic gate.”
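A simplified Python sketch of what such a gate could look like follows. It is an illustration under stated assumptions, not the CyberSec Guru's design or anything Meta runs: the `ResetGate` class, its thresholds, and the `out_of_band_confirmed` flag are hypothetical, the rate limit is keyed to the account alone rather than to risk signals, and the log only records decisions for later anomaly review.

```python
import time
from collections import defaultdict, deque


class ResetGate:
    """Deterministic checks that run outside the language model (hypothetical)."""

    def __init__(self, max_attempts: int = 3, window_seconds: float = 3600.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.attempts = defaultdict(deque)  # account -> recent attempt times
        self.audit_log = []                 # (time, account, decision) tuples

    def authorize(self, account: str, out_of_band_confirmed: bool, now=None) -> bool:
        now = time.time() if now is None else now
        recent = self.attempts[account]
        # Forget attempts that have aged out of the rate-limit window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(now)

        if len(recent) > self.max_attempts:
            decision = "deny: rate limit"
        elif not out_of_band_confirmed:
            # Confirmation is assumed to arrive over a channel the chat
            # session cannot influence, such as the address already on file.
            decision = "deny: no out-of-band confirmation"
        else:
            decision = "allow"

        # Every decision is logged, allowed or not.
        self.audit_log.append((now, account, decision))
        return decision == "allow"
```

The point of the design is that the model can request an account change but cannot grant one. The outcome depends only on the gate's hard-coded conditions, so no wording in the conversation changes it.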

https://arstechnica.com/ai/2026/06/meta-ai-support-chatbot-gave-hackers-access-to-notable-instagram-accounts/