
How to Version Your AI Agents with SemVer

Semantic versioning for AI agents lets teams pin stable versions, track changes, and upgrade safely. Learn when to bump major, minor, and patch versions for agent definitions.

Agent Shelf Team · April 10, 2026 · 6 min read

Why version agent definitions?

Agent definitions change. You refine the persona, add new checks to a code review agent, fix a false positive, or restructure the workflow. Without versioning, every change is silent. Your team doesn't know when an agent's behavior changed, why it changed, or how to go back to the version that was working.

Versioning solves this. It gives every release a number, creates an immutable snapshot, and lets users decide when to upgrade. When you publish version 1.2.0 of your code review agent to Agent Shelf, that version is preserved forever. Users running 1.1.0 keep getting 1.1.0 behavior until they explicitly choose to upgrade.

This is the same principle behind package.json and npm. Your agent definition is a dependency — and dependencies should be versioned.

How SemVer works for agents

Semantic Versioning (SemVer) uses three numbers: MAJOR.MINOR.PATCH. Each communicates a different type of change.

2.1.3
│ │ │
│ │ └─ PATCH: fixes and corrections
│ └─── MINOR: new capabilities, backward-compatible
└───── MAJOR: breaking changes

For software libraries, "breaking change" means an API contract changed. For agent definitions, the concept translates to behavior changes that affect users who depend on specific output or workflow patterns.
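As a minimal sketch of the three-part structure, the snippet below splits a version string into named fields. The `Version` type and `parse_version` helper are hypothetical names used for illustration, not part of Agent Shelf, and pre-release suffixes are deliberately left out.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int  # breaking changes
    minor: int  # new, backward-compatible capabilities
    patch: int  # fixes and corrections


# Core MAJOR.MINOR.PATCH only: non-negative integers without leading zeros.
_CORE = re.compile(r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")


def parse_version(text: str) -> Version:
    """Split a string such as "2.1.3" into its three numeric parts."""
    match = _CORE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))
```

Because `Version` is a tuple of integers, comparing two parsed versions orders them numerically, so 1.10.0 sorts after 1.9.0 rather than before it as a plain string comparison would.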

When to bump PATCH

Bump the patch version (1.0.0 → 1.0.1) when you fix something without changing the agent's capabilities or intended behavior.

Examples:

  • Fix a typo in instructions. The agent said "check for SQL injection" but misspelled "injection." The behavior doesn't change — the AI understood the intent — but the definition is now cleaner.
  • Fix a false positive pattern. Your security agent was flagging all uses of eval() in Node.js, including safe uses in build tools. You refined the rule to only flag eval() with user input.
  • Improve wording for clarity. The agent's instructions said "review the code" but you clarified to "review the diff in the current PR." Same intent, clearer instruction.
  • Fix an incorrect example. A code example in the agent's instructions had a syntax error. The AI could work around it, but fixing it improves reliability.

The principle: patch changes should be safe to adopt automatically. The agent does what it was already meant to do, and the only difference users might notice is that a mistake is gone.

When to bump MINOR

Bump the minor version (1.0.0 → 1.1.0) when you add new capabilities without changing existing behavior.

Examples:

  • Add a new check category. Your code review agent already checks for security and performance. You add a new section for accessibility auditing. Existing checks are unchanged.
  • Add output sections. Your documentation agent now includes a "Related Functions" section in its output. Existing sections are the same.
  • Expand framework support. Your testing agent supported Jest. You added instructions for Vitest. Existing Jest behavior is unchanged.
  • Add optional workflow steps. Your DevOps agent now optionally checks for deprecated Kubernetes API versions. The core workflow is the same.
  • Improve output quality. You added more detailed instructions for generating code examples in findings. Existing findings are still produced, but they are richer.

The principle: minor changes add value without disrupting existing workflows. A team using 1.0.x can upgrade to 1.1.0 and get new capabilities without anything breaking.

When to bump MAJOR

Bump the major version (1.0.0 → 2.0.0) when the agent's behavior, output format, or core workflow changes in a way that would surprise or disrupt existing users.

Examples:

  • Change the output format. Your code review agent produced findings as a bullet list. You restructured it to use a table format with severity, location, and description columns. Automation built around the old format would break.
  • Change the persona. Your agent was a "senior engineer" focused on correctness. You redesigned it as a "security specialist" focused on vulnerabilities. The same code produces different findings.
  • Remove capabilities. Your agent used to check for both style and correctness. You removed style checking to focus exclusively on bugs. Users who relied on style feedback lose it.
  • Restructure the workflow. Your agent used to review the entire file. You changed it to only review the diff. Teams that pointed the agent at full files get different behavior.
  • Change severity classifications. What was "HIGH" is now "MEDIUM." Teams that filter on severity to block PRs might let previously blocked issues through.

The principle: major changes require users to read the changelog, understand what changed, and potentially adjust their workflow. Don't make these changes silently.
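The patch, minor, and major rules can be sketched as a small helper that picks the next version from a list of changes. The change labels in `CHANGE_LEVELS` are hypothetical, chosen to mirror the examples in this guide. Treat the mapping as a starting point for your own judgment, not a fixed rulebook.

```python
# Hypothetical change labels mapped to the bump level each one calls for.
CHANGE_LEVELS = {
    "typo_fix": "patch",
    "false_positive_fix": "patch",
    "new_check_category": "minor",
    "new_framework_support": "minor",
    "output_format_change": "major",
    "capability_removed": "major",
    "severity_reclassified": "major",
}

_RANK = {"patch": 0, "minor": 1, "major": 2}


def bump(version: str, level: str) -> str:
    """Return the next version for a patch, minor, or major bump."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {level!r}")


def next_version(current: str, changes: list[str]) -> str:
    """Apply the highest bump level required by any change in the release."""
    if not changes:
        raise ValueError("a release needs at least one change")
    level = max((CHANGE_LEVELS[change] for change in changes), key=_RANK.__getitem__)
    return bump(current, level)
```

A release that mixes a typo fix with a new check category gets a minor bump, because the highest level among its changes wins and the lower parts reset to zero.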

Version numbering in practice

Starting a new agent

Start at 1.0.0 for your first public release. This signals the agent is stable and ready for use.

Use 0.x.y for experimental agents where you expect frequent breaking changes. The 0.x prefix signals to users: "This agent is in development. Expect changes." In SemVer, any 0.x release can include breaking changes without a major bump.

# Experimental
version: "0.1.0"  # First release, expect changes
version: "0.2.0"  # Restructured the workflow
version: "0.3.0"  # Changed the output format

# Stable
version: "1.0.0"  # First stable release
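One way to encode the 0.x rule in tooling is a check that treats every change between 0.x versions as potentially breaking. This is a conservative sketch, and `upgrade_may_break` is a hypothetical helper name.

```python
def parse(version: str) -> tuple[int, int, int]:
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def upgrade_may_break(current: str, candidate: str) -> bool:
    """Return True when moving from current to candidate could include breaking changes."""
    cur, new = parse(current), parse(candidate)
    if cur[0] == 0 or new[0] == 0:
        # 0.x carries no stability promise, so any different version may break.
        return cur != new
    # From 1.0.0 onward, only a different MAJOR signals a breaking change.
    return cur[0] != new[0]
```

With this check, moving from 0.1.0 to 0.2.0 is flagged for review, while moving from 1.0.0 to 1.4.2 is not.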

Pre-release versions

SemVer supports pre-release suffixes for beta testing:

version: "2.0.0-beta.1"   # First beta of major rewrite
version: "2.0.0-beta.2"   # Second beta with fixes
version: "2.0.0-rc.1"     # Release candidate
version: "2.0.0"           # Stable release

Use these when making large changes that you want early feedback on before committing to a stable release.
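Pre-release versions sort before the stable release they lead up to. The sort key below is a sketch of the SemVer precedence rules. `precedence_key` is a hypothetical helper, and build metadata (a `+` suffix) is not handled.

```python
def precedence_key(version: str):
    """Sort key that orders pre-releases before their stable release."""
    core, _, prerelease = version.partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    if not prerelease:
        # A stable release outranks every pre-release of the same version.
        return (numbers, 1, ())
    identifiers = tuple(
        # Numeric identifiers compare as numbers and rank below text identifiers.
        (0, int(ident), "") if ident.isdecimal() else (1, 0, ident)
        for ident in prerelease.split(".")
    )
    return (numbers, 0, identifiers)


releases = ["2.0.0", "2.0.0-rc.1", "2.0.0-beta.2", "2.0.0-beta.1"]
ordered = sorted(releases, key=precedence_key)
# ordered == ["2.0.0-beta.1", "2.0.0-beta.2", "2.0.0-rc.1", "2.0.0"]
```

Comparing the numeric identifiers as integers matters once a beta series passes nine: `2.0.0-beta.10` sorts after `2.0.0-beta.2`, which a plain string sort would get wrong.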

Writing changelogs

Every version bump should include a changelog entry that explains what changed and why. When publishing to Agent Shelf, you can include a changelog with each version.

Good changelog entries:

## 1.2.0

### Added
- Accessibility audit section: checks for ARIA attributes, color contrast, 
  and keyboard navigation
- Support for Vitest in addition to Jest

### Fixed  
- Security checker no longer flags eval() in webpack configs

Bad changelog entries:

## 1.2.0

- Updated agent
- Fixed things
- Added stuff

The first tells users exactly what changed. The second tells them nothing.
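If you keep release notes as structured data, a small renderer can produce entries in the format of the good example. This is a sketch under that assumption. `render_changelog` is a hypothetical helper, and the section names are whatever you pass in.

```python
def render_changelog(version: str, sections: dict[str, list[str]]) -> str:
    """Render one changelog entry with a heading per non-empty section."""
    lines = [f"## {version}", ""]
    for heading, entries in sections.items():
        if not entries:
            continue  # skip empty sections instead of printing a bare heading
        lines.append(f"### {heading}")
        lines.extend(f"- {entry}" for entry in entries)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


entry = render_changelog(
    "1.2.0",
    {
        "Added": ["Support for Vitest in addition to Jest"],
        "Fixed": ["Security checker no longer flags eval() in webpack configs"],
    },
)
```

The renderer only fixes the layout. Each line still has to say specifically what changed.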

Versioning strategies for teams

Pin exact versions

For production CI pipelines or automated code review, pin the exact version:

code-reviewer v1.2.0

This ensures the agent's behavior never changes unexpectedly. Upgrade deliberately by changing the pinned version after reviewing the changelog.

Pin the major version

For day-to-day development, accept patch and minor updates:

code-reviewer v1.x

This lets you automatically benefit from fixes and new capabilities while avoiding breaking changes.

Always use latest

For personal use or exploration, always use the latest version. This gives you the newest features but means behavior can change without notice.
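The three strategies can be compared with a small resolver that picks a version from the published list. This is an illustrative sketch: `resolve` is a hypothetical function, the pin syntax simply reuses the `v1.2.0` and `v1.x` notation from this guide, and it does not describe how Agent Shelf resolves versions.

```python
def _key(version: str) -> tuple[int, int, int]:
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def resolve(pin: str, published: list[str]) -> str:
    """Pick a version for "latest", an exact pin ("v1.2.0"), or a major range ("v1.x")."""
    if pin == "latest":
        candidates = published
    elif pin.endswith(".x"):
        major = int(pin.removeprefix("v").removesuffix(".x"))
        candidates = [v for v in published if _key(v)[0] == major]
    else:
        exact = pin.removeprefix("v")
        candidates = [v for v in published if v == exact]
    if not candidates:
        raise LookupError(f"no published version satisfies {pin!r}")
    # Compare numerically so that 1.10.0 outranks 1.2.0.
    return max(candidates, key=_key)


published = ["1.0.0", "1.1.0", "1.2.0", "1.10.0", "2.0.0"]
```

With this list, `resolve("v1.2.0", published)` returns `"1.2.0"`, `resolve("v1.x", published)` returns `"1.10.0"`, and `resolve("latest", published)` returns `"2.0.0"`. Only the last one crosses a major version boundary.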

Team version policy

For teams, consider a simple policy:

  1. Development environments use the latest version
  2. CI pipelines pin exact versions
  3. Version upgrades are reviewed in team standups or PRs
  4. Major version upgrades require someone to read the changelog and test

Versioning on Agent Shelf

When you publish an agent to Agent Shelf, each version is stored as an immutable snapshot. Here's how it works:

  • Each publish requires a version bump. You can't overwrite an existing version.
  • All versions are preserved. Users can access any previously published version.
  • The detail page shows version history. The Versions tab lists all releases with dates and changelogs.
  • Downloads can specify a version. The MCP server and API support downloading specific versions.
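The immutability rules in this list can be modeled with a toy in-memory registry. This is only a sketch of the behavior described here. `AgentRegistry` and its methods are hypothetical names and are not the Agent Shelf API.

```python
class VersionExistsError(Exception):
    """Raised when a publish would overwrite an existing version."""


class AgentRegistry:
    def __init__(self) -> None:
        # agent name -> {version -> definition text}, in publish order
        self._versions: dict[str, dict[str, str]] = {}

    def publish(self, name: str, version: str, definition: str) -> None:
        versions = self._versions.setdefault(name, {})
        if version in versions:
            # Published snapshots are immutable; a new release needs a new number.
            raise VersionExistsError(f"{name} {version} is already published")
        versions[version] = definition

    def download(self, name: str, version: str) -> str:
        return self._versions[name][version]

    def history(self, name: str) -> list[str]:
        return list(self._versions.get(name, {}))


registry = AgentRegistry()
registry.publish("code-reviewer", "1.1.0", "Review the diff for bugs.")
registry.publish("code-reviewer", "1.2.0", "Review the diff for bugs and accessibility.")
```

Publishing `1.2.0` a second time raises `VersionExistsError`, and `registry.download("code-reviewer", "1.1.0")` still returns the original 1.1.0 text after later releases.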

The AgentShelf skill helps manage versioning when publishing. It detects the current version on the registry, asks whether you want a patch, minor, or major bump, and prompts for a changelog entry.

Common versioning mistakes

Never bumping the version

Some authors keep their agent at 1.0.0 and just update the content. This means users can't tell whether the agent they downloaded last month is the same as the current version. Always bump.

Major-bumping everything

Not every change is a breaking change. If you fixed a typo, that's a patch. If you added a new check, that's a minor bump. Reserve major bumps for genuine breaking changes. Frequent major bumps erode user trust — people stop upgrading because they assume every update will break something.

Skipping version numbers

Go from 1.0.0 to 1.1.0, not from 1.0.0 to 1.5.0. Skipping numbers doesn't communicate anything useful and makes the version history harder to follow.
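A pre-publish check can catch skipped numbers by listing the only three versions that may follow the current one. This is a minimal sketch with hypothetical helper names. It covers stable versions only and leaves out pre-release suffixes.

```python
def allowed_next_versions(current: str) -> set[str]:
    """Return the patch, minor, and major successors of the current version."""
    major, minor, patch = (int(part) for part in current.split("."))
    return {
        f"{major}.{minor}.{patch + 1}",  # patch bump
        f"{major}.{minor + 1}.0",        # minor bump resets patch
        f"{major + 1}.0.0",              # major bump resets minor and patch
    }


def is_sequential(current: str, proposed: str) -> bool:
    """True when proposed is exactly one bump away from current."""
    return proposed in allowed_next_versions(current)
```

Here `is_sequential("1.0.0", "1.1.0")` is true and `is_sequential("1.0.0", "1.5.0")` is false, matching the rule above.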

No changelog

A version number without a changelog is a number without meaning. Even a one-line changelog ("Added Python support to testing workflow") is better than nothing.


Tags: versioning, semver, best-practices, agent-design, teams

Written by Agent Shelf Team

The Agent Shelf team builds open infrastructure for AI agent discovery and distribution. We maintain the Agent Shelf registry, MCP server, and publish skill.
