
5 Common Mistakes When Writing AI Agent Definitions

Avoid these five mistakes when writing AI agent definitions. Each includes a bad example and a fixed version so you can improve your agents immediately.

By Agent Shelf Team · April 8, 2026 · 5 min read

Writing agents is easy. Writing good agents takes practice.

After reviewing thousands of agent definitions published to Agent Shelf, we see the same mistakes again and again. These aren't obscure edge cases. They're fundamental issues that make agents unreliable, inconsistent, or just not very useful.

Here are the five most common problems and how to fix them.

1. Being too vague about the agent's role

The most common mistake is writing instructions that could apply to any AI assistant. If your agent definition doesn't make the AI behave noticeably differently from a default chat, it's not doing its job.

Bad example:

You are a helpful coding assistant. Help users with their code.
Write clean code and explain your reasoning.

This tells the AI nothing it doesn't already know. There's no specific expertise, no defined workflow, and no clear scope.

Fixed version:

You are a Python backend engineer specializing in FastAPI applications
and PostgreSQL databases. You follow the repository's existing patterns
for error handling, use Pydantic v2 models for all request/response
schemas, and write pytest tests for every new endpoint.

When asked to add a feature:
1. Check existing route patterns in the `app/routes/` directory
2. Create the Pydantic models first
3. Write the route handler
4. Add a test in `tests/` that covers the happy path and one error case

The fixed version gives the AI a specific identity, a technology scope, and a repeatable process. It will produce noticeably different output from a generic assistant.

2. Writing instructions that assume the wrong context

Agents run inside AI coding tools like Claude Code, Cursor, and Windsurf. These tools operate in your terminal and file system. They don't have a web browser. They can't open GUI applications. They can't click buttons on websites.

Bad example:

When reviewing a pull request:
1. Open the PR in GitHub's web interface
2. Click on each changed file
3. Browse the repository to understand the codebase
4. Leave inline comments on the PR

This entire workflow assumes browser access that the AI doesn't have.

Fixed version:

When reviewing a pull request:
1. Run `git diff main...HEAD` to see all changes
2. Read the changed files to understand context
3. Check related files for potential side effects
4. Output your review as a structured markdown document with:
   - File path and line numbers for each comment
   - Severity level (critical, warning, suggestion)
   - Explanation and suggested fix

The fixed version uses commands and file operations that AI coding tools can actually execute. Before writing instructions, think about what your agent's runtime environment can and cannot do. Our documentation covers the capabilities available to agents in different tools.
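The review workflow above ends with a structured markdown document. As a rough illustration of what that output step could look like when assembled programmatically, here is a sketch; the `Finding` fields and the exact layout are assumptions, not a format any tool mandates:

```python
# Sketch: assembling one structured review comment as described in step 4.
# Field names and layout are illustrative.
from dataclasses import dataclass

@dataclass
class Finding:
    path: str
    start_line: int
    end_line: int
    severity: str          # e.g. "critical", "warning", or "suggestion"
    explanation: str
    suggested_fix: str

def render_finding(f: Finding) -> str:
    """Render one review comment with file path, line range, and severity."""
    return (
        f"- **{f.path}** (lines {f.start_line}-{f.end_line})\n"
        f"  - Severity: {f.severity}\n"
        f"  - {f.explanation}\n"
        f"  - Suggested fix: {f.suggested_fix}"
    )

print(render_finding(Finding(
    "app/routes/users.py", 42, 48, "warning",
    "Unvalidated query parameter is passed straight to the ORM filter.",
    "Validate the parameter with a Pydantic model first.",
)))
```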

3. Skipping output format rules

Without explicit formatting instructions, agents produce different output structures every time. One run gives you a bulleted list, the next gives you prose, and the third gives you a markdown table. This makes agent output unpredictable and hard to integrate into workflows.

Bad example:

You are a security auditor. Review code for vulnerabilities and
report what you find.

Sometimes this produces a numbered list. Sometimes a paragraph. Sometimes it mentions severity, sometimes it doesn't. You never know what you're going to get.

Fixed version:

You are a security auditor. Review code for vulnerabilities.

Always format findings as:

### [SEVERITY: critical|high|medium|low] Finding title

**File:** `path/to/file.ext` (line X-Y)
**Category:** [injection|auth|crypto|config|data-exposure]
**Description:** One paragraph explaining the vulnerability.
**Fix:** Code block showing the corrected version.

If no vulnerabilities are found, output:
"No security issues detected in the reviewed files."

The fixed version defines the exact output structure. Every run produces the same format, making it easy to scan results, compare across reviews, and build automation on top of the agent's output.
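To make the automation point concrete, here is a minimal parsing sketch that pulls the severity and title out of each finding heading. The regex mirrors the heading template above; a full parser would also read the `File`, `Category`, and other fields:

```python
# Sketch: extracting structured records from the fixed-format audit output.
import re

# Matches headings of the form: ### [SEVERITY: high] Finding title
HEADING = re.compile(
    r"^### \[SEVERITY: (critical|high|medium|low)\] (.+)$",
    re.MULTILINE,
)

def parse_findings(report: str) -> list[dict]:
    """Extract severity/title pairs from an audit report."""
    return [
        {"severity": severity, "title": title}
        for severity, title in HEADING.findall(report)
    ]

report = """### [SEVERITY: high] SQL built with string concatenation

**File:** `app/db.py` (line 10-14)
"""
print(parse_findings(report))
# → [{'severity': 'high', 'title': 'SQL built with string concatenation'}]
```

Because the format is fixed, this kind of tooling keeps working across runs.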

4. Cramming too many responsibilities into one agent

It's tempting to build one agent that handles everything. Code reviews, documentation, testing, deployment. But agents with too many responsibilities produce shallow results because the instructions for each task compete for the AI's attention.

Bad example:

id: super-dev-assistant
name: Super Dev Assistant
description: Handles all development tasks
version: 1.0.0
category: coding

You can:
- Review pull requests
- Write unit tests
- Generate API documentation
- Set up CI/CD pipelines
- Optimize database queries
- Write release notes
- Manage dependency updates
- Create architecture diagrams
Eight different jobs, none of them well-defined. The agent will do a mediocre job at all of them.

Fixed version: Split this into focused agents, each with one clear responsibility.

id: code-reviewer
name: Code Reviewer
description: Reviews pull requests for correctness, security, and style
version: 1.0.0
category: coding

id: test-writer
name: Test Writer
description: Generates unit and integration tests following project conventions
version: 1.0.0
category: coding

Each agent gets deep instructions for its specific task. The code reviewer has detailed review criteria. The test writer knows your testing framework and coverage requirements. You can read more about structuring agents effectively in our guide to writing agent definitions.
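As a hypothetical illustration of keeping definitions consistent, a publisher could check the metadata fields shown in these examples before uploading. The required-field list and rules below are assumptions for the sketch, not Agent Shelf's actual validation:

```python
# Sketch: a minimal metadata check for an agent definition.
# The required fields mirror the examples above; the rules are assumptions.
import re

REQUIRED = ("id", "name", "description", "version", "category")
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")

def validate_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata passes."""
    problems = [f"missing field: {field}" for field in REQUIRED if not meta.get(field)]
    version = meta.get("version", "")
    if version and not SEMVER.match(version):
        problems.append(f"version {version!r} is not MAJOR.MINOR.PATCH")
    return problems

print(validate_metadata({
    "id": "code-reviewer",
    "name": "Code Reviewer",
    "description": "Reviews pull requests for correctness, security, and style",
    "version": "1.0.0",
    "category": "coding",
}))
# → []
```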

A good rule of thumb: if you can't describe what your agent does in one sentence without using "and," it's doing too much.

5. Not versioning properly

Agent definitions follow semantic versioning. This isn't just a formality. When you change an agent's behavior, the people using it need to know whether the update is safe to adopt or if it might change their workflow.

Bad example:

An agent at version 1.0.0 that gets updated with completely different instructions, a new output format, and removed capabilities, but stays at 1.0.0. Or jumps to 1.0.1.

Users who update get a completely different agent from the one they installed. Their automation breaks, and their expectations no longer match the agent's behavior.

Fixed version:

  • 1.0.0 to 1.0.1: Fixed a typo in the instructions. No behavior change.
  • 1.0.1 to 1.1.0: Added a new capability (e.g., the agent can now check for accessibility issues in addition to security). Existing behavior unchanged.
  • 1.1.0 to 2.0.0: Changed the output format from markdown to JSON. Existing users must update their workflows.

The version number communicates intent. Patch versions are safe to adopt without thinking. Minor versions add things but don't break anything. Major versions require attention.
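Those rules can be sketched as a small version-comparison helper; the function name and its labels are illustrative, not part of any tooling:

```python
# Sketch: classifying how risky an agent update is from its semver bump.
def update_risk(installed: str, published: str) -> str:
    """Compare two MAJOR.MINOR.PATCH versions and label the kind of change."""
    old = [int(part) for part in installed.split(".")]
    new = [int(part) for part in published.split(".")]
    if new[0] > old[0]:
        return "breaking"   # major bump: review before updating
    if new[1] > old[1]:
        return "additive"   # minor bump: new capabilities, same behavior
    if new[2] > old[2]:
        return "safe"       # patch bump: no behavior change
    return "no-change"

print(update_risk("1.1.0", "2.0.0"))  # → breaking
```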

When publishing updates on Agent Shelf, include a changelog that explains what changed and why. Users can then check the Versions tab on your agent's page and decide whether to update.

Start fixing your agents now

Pick one of your existing agents and check it against these five mistakes. The most impactful fix is usually the first one: making the agent's role specific enough that it actually changes the AI's behavior.

If you're writing your first agent, the documentation walks you through the format step by step. And once you've built something useful, publish it so others can learn from your approach.

Tags: agent-design, best-practices, mistakes, writing, tips

Written by Agent Shelf Team

The Agent Shelf team builds open infrastructure for AI agent discovery and distribution. We maintain the Agent Shelf registry, MCP server, and publish skill.
