
How to Deploy Claude Code Agents into Your Codebase

A step-by-step guide to deploying Claude Code agents: from prerequisites to fleet architecture. Learn what works, what breaks, and when to bring in help.

agent-patterns, methodology
[Image: dark monitor displaying lines of code against a black background. Photo by Godfrey Nyangechi on Unsplash]

Claude Code agents are autonomous programs that run Anthropic's Claude Code CLI on a schedule, commit their work to a git repository, and report what they did to a dashboard. They are not chatbots. They are not copilots waiting for a human to type a prompt. Each agent holds a defined role (SEO analyst, content writer, sales lead researcher) and executes that role against your codebase on a recurring cadence, unsupervised. Armada Works deploys fleets of these agents into client codebases as its core service, and the deployment pattern described here is the same one we run against our own product.

The concept is straightforward: take the repetitive, context-sensitive work that consumes a founder's week (keyword research, blog drafts, outbound prospect lists, lead triage) and hand each function to a dedicated agent that commits its output to main every morning. The hard part is not the technology. The hard part is the architecture that makes the system trustworthy enough to run without you watching.

What Claude Code Agents Are

Claude Code is Anthropic's command-line interface for Claude. It reads files, writes files, runs shell commands, and interacts with your codebase through a set of constrained tools: Read, Edit, Write, Glob, Grep, and Bash. When you sit at a terminal and use Claude Code interactively, you are the human in the loop. When you schedule it to run autonomously (on a cron job, a scheduled task, or a CI trigger), it becomes an agent.
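A scheduled run is just the CLI invoked non-interactively with a prompt file. As a minimal sketch of the wrapper a scheduler might call (this assumes the `claude` binary accepts a prompt via its `-p` print flag; verify the flags of your installed version before relying on this):

```python
from pathlib import Path

def build_agent_command(prompt_file: str) -> list[str]:
    """Build the argv for one non-interactive agent run.

    Assumption: the `claude` CLI accepts a prompt string via -p
    (print mode). Check your installed version's flags first.
    """
    prompt = Path(prompt_file).read_text()
    return ["claude", "-p", prompt]
```

A cron job or CI step would run this command from the repository root, so the agent's Read/Write/Bash tools operate against the right working tree.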

An agent in this context is not a general-purpose AI doing whatever it thinks is best. It is a Claude Code session pointed at a specific prompt file that defines:

  • Role: what the agent is responsible for (e.g., "you are the SEO agent; you track keyword rankings and dispatch content briefs")
  • Scope: which files it can read and write, which commands it can run
  • Constraints: what it must never do (deploy to production, modify environment variables, install packages, send external messages)
  • Cadence: how often it runs (daily, three times a week, every thirty minutes)
  • Output: where it puts its work (state files, briefs, draft content, git commits)

Robert Cowherd, founder of Armada Works, describes the design principle: "The best agents are the most constrained. An agent that can do anything will eventually do the wrong thing. An agent that can only read three directories, write to one state file, and post to one endpoint: that agent you can trust to run at 6 AM while you sleep."

Prerequisites for Deployment

Not every codebase is ready for agents. Before you deploy your first one, you need four things in place:

  • A git repository as the source of truth. Agents commit their work to git. If your team doesn't use git, or uses it loosely, agents will create noise instead of value. You need a clean main branch, pull-request discipline or at least regular reviews, and comfort reading diffs.

  • A codebase, not just a collection of tools. Agents work best in monorepos or well-structured repos where content, configuration, and state live alongside code. If your marketing content lives in Google Docs and your SEO config lives in a SaaS dashboard, there is nothing for the agent to commit to.

  • A team comfortable reading markdown and git logs. The founder or team lead will review agent output as diffs and briefs. You don't need to write code, but you need to read git log --oneline without flinching.

  • A defined bottleneck. Agents are not a general productivity boost. They solve a specific problem: repetitive, context-sensitive work that is too fragmented for a single hire and too nuanced for a SaaS tool. If you cannot name the bottleneck ("I spend ten hours a week on X"), you are not ready.

Step-by-Step Deployment Walkthrough

The following sequence is how Armada Works deploys a fleet for a new engagement. If you are doing this yourself, the steps are the same; the difference is who writes the prompts and who tunes them over the first two weeks.

1. Name the bottleneck and scope the fleet.

List every repetitive task the founder or team lead does each week. Group them by function: content, SEO, outbound, sales ops, support. Each group becomes a potential agent role. Start with three to five agents, not nine. More agents mean more briefs to read and more coordination surface area. You can add roles later.

2. Set up the coordination layer.

Agents coordinate through git-committed state files, not message queues or APIs. Create a directory structure:

docs/agents/
  state/          # daily briefs, state files, queues
  cmo-agent-prompt.md
  seo-agent-prompt.md
  content-agent-prompt.md

Each agent writes its daily brief to docs/agents/state/{agent}-brief-YYYY-MM-DD.md. A reporting endpoint (a simple POST route in your app) stores briefs in a database for dashboard display. The CMO agent reads all sub-agent briefs and writes a single synthesis for the founder.
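The brief-path convention is simple enough to generate programmatically when wiring up the reporting endpoint or the scheduler. A sketch (the helper name is ours, purely illustrative):

```python
from datetime import date

def brief_path(agent: str, day: date) -> str:
    """Return the git-committed state-file path for one agent's daily
    brief, following the docs/agents/state/{agent}-brief-YYYY-MM-DD.md
    convention described above."""
    return f"docs/agents/state/{agent}-brief-{day.isoformat()}.md"

# brief_path("seo", date(2026, 1, 5))
# → "docs/agents/state/seo-brief-2026-01-05.md"
```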

3. Write the agent prompts.

Each agent gets a prompt file that serves as its complete operating manual. A good prompt includes:

  • Role and responsibilities in two to three sentences
  • Explicit list of files and directories the agent may read and write
  • Hard constraints: commands it must never run, actions it must never take
  • Output format: where to write briefs, how to structure them
  • Collaboration rules: which other agents' state files to read, which to ignore

The prompt is not a suggestion. It is the agent's entire world. If something is not in the prompt, the agent does not know about it.
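As a concrete starting point, a skeleton prompt file might look like the following. The section contents here are placeholders assembled from the examples in this article; adapt the paths and rules to your own fleet:

```markdown
# SEO Agent

## Role
You are the SEO agent. You track keyword rankings and dispatch
content briefs to the content queue.

## Scope
Read: docs/agents/state/, docs/content/
Write: docs/agents/state/, docs/agents/state/content-queue.md

## Hard rules
- No command substitution ($(...) or backticks)
- No edits to environment files
- No package installation without explicit approval

## Output
Write today's brief to docs/agents/state/seo-brief-YYYY-MM-DD.md.

## Collaboration
Read the content agent's queue. Ignore sales and outbound state files.
```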

4. Define permissions and constraints.

This is where most deployments go wrong. Claude Code has a permission system that gates dangerous operations: shell commands, file writes outside allowed paths, network requests. Use it. Define an allowlist of commands each agent can run. Block everything else.

At Armada Works, every agent prompt includes a "Hard rules" section:

  • No command substitution ($(...) or backticks)
  • No edits to environment files
  • No package installation without explicit approval
  • No deployment without founder confirmation in the current session

These rules exist because an unconstrained agent will eventually run a command you did not expect. Constraints are not limitations; they are what make the system safe to run unattended.
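The command-substitution rule can also be enforced in code before a command ever reaches the shell. A minimal allowlist gate, as a sketch (this function is illustrative, not part of Claude Code's built-in permission system):

```python
import shlex

# Per-agent allowlist; everything not listed is blocked.
ALLOWED_COMMANDS = {"git", "grep", "ls", "cat"}

def is_allowed(command: str) -> bool:
    """Enforce the hard rules: reject command substitution outright,
    then check the first token against the allowlist."""
    if "$(" in command or "`" in command:
        return False
    try:
        parts = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar parse errors
        return False
    return bool(parts) and parts[0] in ALLOWED_COMMANDS
```

In practice you would pair a gate like this with Claude Code's own permission configuration rather than replace it; defense in depth is the point.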

5. Schedule the agents.

Each agent runs on a defined cadence. For most marketing fleets, three times a week (Monday, Wednesday, Friday) is sufficient. Avoid running all agents at the same time; stagger them so the CMO agent runs last and can read everyone else's output.

A typical schedule:

  • 6:00 AM: SEO agent (rankings, technical audits)
  • 9:00 AM: Content agent (drafts from the content queue)
  • 9:00 AM: Sales Lead agent (pipeline triage)
  • 9:00 AM: Outbound agent (prospect research)
  • 9:05 AM: CMO agent (reads all briefs, writes founder synthesis)
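On a Unix host, that stagger maps directly onto crontab entries. The paths, log files, and `claude -p "$(cat …)"` invocation below are illustrative; substitute however you run agents non-interactively:

```
# min hour dom mon dow  command           (Mon/Wed/Fri fleet)
0     6    *   *   1,3,5  cd /srv/repo && claude -p "$(cat docs/agents/seo-agent-prompt.md)"     >> logs/seo.log 2>&1
0     9    *   *   1,3,5  cd /srv/repo && claude -p "$(cat docs/agents/content-agent-prompt.md)" >> logs/content.log 2>&1
5     9    *   *   1,3,5  cd /srv/repo && claude -p "$(cat docs/agents/cmo-agent-prompt.md)"     >> logs/cmo.log 2>&1
```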

6. Run the synthesizer.

The CMO agent is the most important agent in the fleet. Without it, the founder reads five briefs every morning. With it, the founder reads one message that highlights the three to five things that need human attention. If you deploy a fleet without a synthesizer, you will drown in output within a week. This was one of the earliest and most painful lessons from running a nine-agent fleet: the founder's attention is the real bottleneck, and a synthesizer agent is non-negotiable.
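Mechanically, the synthesizer's first step is just gathering every sub-agent brief for the day from the state directory. A sketch of that collection step (the function name is ours):

```python
from pathlib import Path

def collect_briefs(state_dir: str, day: str) -> str:
    """Concatenate every sub-agent brief for one ISO date (YYYY-MM-DD)
    so the CMO agent can read the whole fleet's output in one pass.
    Relies on the {agent}-brief-YYYY-MM-DD.md naming convention."""
    briefs = sorted(Path(state_dir).glob(f"*-brief-{day}.md"))
    return "\n\n---\n\n".join(p.read_text() for p in briefs)
```

The synthesis itself (picking out the three to five items that need human attention) is the CMO agent's prompt-driven job, not something a script can do.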

7. Review and iterate for two weeks.

The first two weeks are tuning, not production. Read every brief. Read every diff. When an agent produces something off-target, fix the prompt, not the output. Prompt tuning compounds: a correction made on day three prevents the same mistake on days four through ninety.

Common Pitfalls and How to Avoid Them

After deploying agent fleets for our own product and for clients, these are the failure modes we see most often:

Starting with too many agents. Nine agents sounds impressive. Nine agents also produce nine daily briefs, and the coordination failures multiply. Start with three to five. Add more only when the existing fleet is stable and the founder is not skipping briefs.

Writing vague prompts. A prompt that says "you handle SEO" will produce generic, unhelpful output. A prompt that says "you track keyword rankings for these twelve terms, run Lighthouse audits on these five pages, and dispatch content briefs to docs/agents/state/content-queue.md using this exact format" will produce work you can actually use.

Skipping the synthesizer. Every team that tries "I'll just read all the briefs myself" stops reading them within a week. The CMO agent is not optional. It is load-bearing infrastructure.

Not constraining file access. An agent that can write anywhere in your codebase will eventually write somewhere it should not. Define explicit paths. If the content agent only needs to write to docs/content/blog/ and docs/agents/state/, say so in the prompt, and gate everything else.

Treating agents like employees instead of systems. You do not motivate an agent or give it feedback in a one-on-one. You tune its prompt, adjust its constraints, and review its output in diffs. The mental model is closer to CI/CD pipeline configuration than people management.

When to Run Agents Yourself vs Hire a Consultancy

If you have strong git discipline, a technical founder or lead who can write and tune prompts, and the patience for two weeks of daily iteration, you can deploy a fleet yourself. The tooling is available. Claude Code is a commercial product. The architecture is not secret: state files in git, a reporting endpoint, a synthesizer agent, and per-role prompts.

The case for hiring a consultancy like Armada Works is speed and scar tissue. We have already made the mistakes: the too-permissive prompt that let an agent run a command it should not have, the missing synthesizer that buried a founder in unread briefs, the agent that committed to the wrong branch. Our prompt templates, permission gates, and fleet architecture reflect months of iteration. A Transfer engagement (two to four weeks, $10,000 to $20,000, everything handed to you at the end) buys you that iteration without living through it.

The honest answer is that most founders who try to deploy a fleet solo will get one or two agents working well and then stall. The architecture is the hard part, not the individual agent. If you want the system, not just an agent, a consultancy engagement is the faster path. If you want to learn by building, start with a single agent on a single function and expand from there.

You can read the full methodology to see how Armada Works runs the first four weeks of a fleet deployment.

Frequently Asked Questions

What is a Claude Code agent?

A Claude Code agent is an instance of Anthropic's Claude Code CLI running autonomously on a schedule against a codebase. It reads files, writes output, commits to git, and posts structured reports, all without a human in the loop. Each agent is configured with a prompt file that defines its role, scope, constraints, and output format.

How many agents do I need to start?

Three to five. A minimal fleet covers the founder's biggest bottleneck (often content and SEO) plus a CMO synthesizer to consolidate output. You can add agents for outbound, sales ops, or support as the system stabilizes. Starting with more than five agents before the architecture is proven creates coordination overhead that slows you down.

Do agents replace my marketing team?

No. Agents handle repetitive, structured tasks: drafting blog posts from a content queue, tracking keyword rankings, researching outbound prospects, triaging inbound leads. Strategy, judgment calls, and anything that requires human relationships still belong to humans. The agents free up time so the humans can focus on the work that only humans can do.

What happens if an agent makes a mistake?

Every agent commits its work to git with a bylined commit message. If an agent produces bad output, you see it in the diff, and you can roll it back with a single git revert. This is why git is the coordination layer: every decision is traceable, every action is reversible.

How much does it cost to run a fleet?

The primary ongoing cost is API usage: each agent session consumes Claude API tokens. For a five-agent fleet running three times a week, expect API costs in the range of a few hundred dollars per month, depending on the complexity of each agent's tasks. If you hire Armada Works to deploy and operate the fleet, engagement fees range from $5,000 to $12,000 per month for a managed Operate engagement, or $10,000 to $20,000 one-time for a Transfer engagement where you take over operation.

Can I see a working example of this architecture?

Armada Works runs this exact architecture against its own marketing. Five agents (CMO, SEO, Content, Sales Lead, and Outbound) commit to the same repo, post briefs to a dashboard, and coordinate through git-committed state files. The Dream Event case study describes how the system was originally developed with a nine-agent fleet before being packaged into the consultancy offering.


Want help deploying a fleet into your codebase? Book a discovery call and we will scope it in thirty minutes.

Written by
Robert Cowherd
Book a call