AI Agent Security
Resilience testing for autonomous systems with tool access
Why AgentGuard?
An AI agent that executes code, sends emails, or accesses databases amplifies the consequences of every vulnerability. A malicious instruction no longer produces just a bad response; it can trigger an irreversible action. AgentGuard tests your autonomous agents against known attack techniques that target tool-using systems.
Supported agent types
Customer Support Agents
Assistants with access to user accounts, CRM, knowledge base
Development Agents
Agents capable of modifying code, executing commands, creating PRs
Research & Analysis Agents
Agents that browse the web, read documents, produce reports
Orchestration Agents
Agents coordinating multiple sub-agents or automated workflows
Commerce Agents
Agents with access to payments, orders, inventory management
Data & Analytics Agents
Agents with database access, SQL generation, query execution
Communication Agents
Agents sending emails, messages, automated notifications
System Agents
DevOps, SRE, cloud infrastructure management agents
Is your agent not listed?
If your framework or agent type isn't listed, contact us. We develop test modules on demand.
Contact us →
Attack categories tested
Tool hijacking
Manipulating the agent to call tools unexpectedly or with malicious parameters.
Goal drift
Gradually diverting the agent from its initial mission through a series of intermediate instructions.
Privilege escalation
Attempts to make the agent access resources or tools beyond its authorized scope.
Infinite loops & exhaustion
Causing the agent to loop or consume its resources until exhaustion.
Memory poisoning
Injection of malicious information into the agent's long-term memory.
Backdoor detection
Identifying abnormal behavior activated by hidden triggers in inputs.
Cross-agent contamination
Tests for multi-agent architectures: can a compromised agent infect the others?
Data exfiltration
Attempts to make the agent transmit sensitive data via its tools (email, external API, logs).
Human approval bypass
Techniques to bypass human validation steps required for critical actions.
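To make a few of these categories concrete, here is a minimal sketch of the kind of guard that such tests try to get around. It is not AgentGuard's implementation: `ToolPolicy`, `PolicyViolation`, `run_attack`, and the tool names are all hypothetical, invented for illustration. It assumes an agent whose proposed tool calls pass through a single check before execution, and it covers three ideas from the list above: an allowlist (tool hijacking, privilege escalation), a step budget (loops and exhaustion), and a sign-off flag (human approval bypass).

```python
from dataclasses import dataclass, field


class PolicyViolation(Exception):
    """Raised when a proposed tool call breaks the agent's policy."""


@dataclass
class ToolPolicy:
    # Hypothetical policy object; every name here is illustrative only.
    allowed_tools: set[str]
    needs_approval: set[str] = field(default_factory=set)
    max_steps: int = 20
    steps_used: int = 0

    def check(self, tool: str, approved_by_human: bool = False) -> None:
        """Validate one proposed tool call before it is executed."""
        # Exhaustion: a hard step budget stops runaway loops.
        if self.steps_used >= self.max_steps:
            raise PolicyViolation("step budget exhausted")
        # Tool hijacking / privilege escalation: allowlisted tools only.
        if tool not in self.allowed_tools:
            raise PolicyViolation(f"tool not allowed: {tool}")
        # Human approval bypass: critical tools need explicit sign-off.
        if tool in self.needs_approval and not approved_by_human:
            raise PolicyViolation(f"human approval required: {tool}")
        self.steps_used += 1


def run_attack(policy: ToolPolicy, calls: list[tuple[str, bool]]) -> list[str]:
    """Replay a sequence of (tool, approved) calls and record each outcome."""
    outcomes = []
    for tool, approved in calls:
        try:
            policy.check(tool, approved)
            outcomes.append("allowed")
        except PolicyViolation as exc:
            outcomes.append(f"blocked: {exc}")
    return outcomes


policy = ToolPolicy(
    allowed_tools={"search_kb", "send_email"},
    needs_approval={"send_email"},
    max_steps=3,
)
results = run_attack(policy, [
    ("search_kb", False),   # legitimate call
    ("run_shell", False),   # tool outside the authorized scope
    ("send_email", False),  # critical action without approval
    ("send_email", True),   # critical action with approval
    ("search_kb", False),   # uses the last step of the budget
    ("search_kb", False),   # budget of 3 already spent
])
for outcome in results:
    print(outcome)
```

A real test would drive the agent with adversarial prompts rather than replaying fixed calls. The sketch only shows what outcome each attack category should produce when the guard holds.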
Get started in a few commands
# Install
pip install rednblue
# Test an autonomous agent
rnb llm --file my_agent.py --attacks AGT
# Full suite (LLM + Agent + RAG)
rnb llm --file my_agent.py --all
# Autonomous mode (multi-tool agent)
rnb llm --file my_agent.py --mode autonomous --max-steps 20