When AI Agents Go Rogue: Cybersecurity Experts Warn of ‘Query Injection’ Risks

As artificial intelligence agents begin taking over everyday digital tasks, researchers fear that the same autonomy that makes them useful could also turn them into tools for hackers.

Contents

The Next Frontier—and the Next Threat From Chatbots to Commanders Big Tech’s Uneasy Defenses When Convenience Meets Compromise

The Next Frontier—and the Next Threat

Artificial intelligence agents—programs that use chatbots to perform human-like online tasks such as booking travel or managing calendars—are being hailed as the next frontier in the AI revolution. But cybersecurity experts are sounding alarms: the very autonomy that allows these systems to act on behalf of users may also make them dangerously easy to hijack.

“These agents are like digital employees that can browse the web, handle payments, or manage data,” said Eli Smadja of the Israeli cybersecurity firm Check Point. “The number one security problem now is query injection.”

This emerging technique allows hostile actors to manipulate AI systems by subtly altering their instructions—sometimes in real time—to make them execute unintended commands. A harmless prompt like “book me a hotel room” could be rewritten midstream into “transfer $100 to this account.”

“Centre for Police Technology” Launched as Common Platform for Police, OEMs, and Vendors to Drive Smart Policing

From Chatbots to Commanders

AI tools have rapidly evolved from text and image generators into self-directed agents capable of taking action online. This leap, experts say, has opened a new attack surface. Johann Rehberger, a researcher known in cybersecurity circles as “wunderwuzzi,” described the shift as a “fundamental change” in the threat landscape.

“For the first time in decades, we’re seeing novel attack vectors that can come from anywhere,” Rehberger told AFP. “They only get better,” he added, referring to hacker tactics that adapt as fast as AI itself.

What once required complex coding can now be achieved through cleverly written language prompts hidden within web pages or data files. When an AI agent interacts with compromised material—say, scraping a website for information—it can unwittingly trigger malicious instructions buried within the text.

Big Tech’s Uneasy Defenses

Major technology firms are rushing to contain the threat. Microsoft has integrated detection tools that flag malicious instructions based on their origin, while OpenAI has begun alerting users when their agents attempt to access sensitive websites—halting operations until a human supervises in real time.

Meta, which calls query injection a “vulnerability,” and OpenAI’s own security chief, Dane Stuckey, have both acknowledged that the issue remains “unresolved.” Industry insiders warn that the rush to build powerful autonomous systems may have outpaced efforts to secure them.

“The biggest mistake,” said Smadja, “is giving one AI agent all the power to do everything.”

When Convenience Meets Compromise

Part of the challenge, researchers say, lies in balancing ease of use with security. Users crave convenience—AI that books tickets, manages emails, or shops online without oversight. But that convenience may come at the cost of control.

“AI agents aren’t yet mature enough to be trusted with important missions or data,” Rehberger cautioned. Some experts now recommend that all sensitive actions—like exporting data or accessing bank accounts—require explicit user approval. The risk is compounded by the democratization of power.

“The ability to order AI agents around with plain language makes it possible for even the non-technical to do mischief,” noted software engineer Marti Jorda Roca of NeuralTrust.

As AI startup Perplexity put it in a recent post:

“We’re entering an era where cybersecurity is no longer about protecting users from hackers with technical skill—but from anyone with the right words.”

📲 Join Our WhatsApp Channel