What Happens After the AI Consultancy Leaves

A practical owner's manual for running your AI agent fleet after the handoff. What breaks, what doesn't, and the three habits that keep the system running.

Vintage hand tools on a dark surface. Photo by Fer Troulik on Unsplash.

The engagement ends on a Friday. You get a final commit, a clean runbook, and a recorded walkthrough. Monday morning, the agents run on schedule and nobody from Armada Works is watching. That is the whole point.

Our transfer engagement is built around a specific promise: the system is yours when we leave. Not "yours in the sense that you can log in." Yours in the sense that every prompt, every state file, every dashboard component lives in your repo, committed to main, and you can modify any of it with a text editor. But owning a system and operating one are different things. This is the agent handoff playbook for the second part.

The First Monday Morning

Your agents have been running for two to three weeks by the time we exit. You have already seen the pattern: briefs land in the dashboard before you open your laptop. The CMO agent has read the sub-agent output and written you a synthesis. Your job is the same as it was during the shadow week. Read the synthesis. Decide if anything needs a human response. Move on.

The difference is that nobody is watching over your shoulder. If a prompt needs tuning, you tune it. If an agent's output drifts, you catch it in the daily brief or you don't. The system doesn't degrade suddenly. It drifts. And drift is fixable if you're paying attention for five minutes a day.

That five-minute read is the single habit that keeps everything running. When we ran our own fleet against a sibling product, the founder's daily review was the one non-negotiable. Skip it for a day and nothing breaks. Skip it for a week and you start finding stale briefs, missed keyword shifts, and outbound drafts that reference last month's positioning. The system doesn't crash. It gets quietly dumber.

What Actually Breaks

After months of running our own agent fleet and building the architecture that became Armada Works, we can tell you what we found: almost nothing breaks mechanically. The agents run on scheduled tasks. They commit to main. They post briefs. The infrastructure is boring on purpose.

What does go wrong falls into three categories.

Prompt drift. You launched with prompts tuned to your current positioning, your current ICP, your current product. Three months later, at least one of those has shifted. The Content agent is still writing for last quarter's audience. The SEO agent is still targeting keywords you have since deprioritized. This is the most common failure mode, and it is completely fixable. Open the prompt file, edit the relevant section, commit. The agent picks up the change on its next run.

State file staleness. Agents coordinate through state files in git. If you change something upstream (a new pricing page, a renamed product tier, a deprecated feature) and forget to update the relevant state files or prompt references, the agents will keep operating on old information. The fix is the same: edit the file, commit, push.
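The fix loop for both prompt drift and state staleness is the same few commands: find the stale reference, edit it, commit. A minimal sketch (the `agents/` layout, file names, and tier names below are invented for illustration; it runs in a throwaway directory so nothing touches a real repo):

```shell
# Work in a disposable sandbox repo.
tmp=$(mktemp -d)
cd "$tmp"
git init -q

# Hypothetical layout: one state file still referencing a renamed tier.
mkdir -p agents/seo
cat > agents/seo/state.md <<'EOF'
Target tier: Starter (legacy name)
EOF

# 1. Find every file that still mentions the stale term.
grep -rl "Starter (legacy name)" agents/

# 2. Update it in place, then commit so agents pick it up next run.
sed -i.bak 's/Starter (legacy name)/Launch/' agents/seo/state.md
rm agents/seo/state.md.bak
git add agents/seo/state.md
git -c user.name=demo -c user.email=demo@example.com \
    commit -qm "Update SEO state file to new tier name"
```

The `.bak` suffix on `sed -i` keeps the command portable between GNU and BSD sed; the backup is deleted once the edit is confirmed.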

Credential expiration. API keys, service role tokens, OAuth credentials. These expire. When they do, the agent's run fails silently or posts a brief that says "credentials invalid." The runbook we hand off includes a checklist of every credential the fleet uses and its renewal cadence. Keep the checklist current.
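One way to keep that checklist actionable is to record each credential's renewal date in a machine-readable file and check it on a schedule. A sketch, assuming a simple tab-separated format (the file name, columns, and credential names are invented, not part of the handoff runbook):

```shell
# Sample checklist: credential name <TAB> renewal date (YYYY-MM-DD).
cat > credentials.tsv <<'EOF'
openai_api_key	2099-12-31
supabase_service_role	2020-01-01
EOF

today=$(date +%Y-%m-%d)

# Flag any credential whose renewal date has passed.
# Plain string comparison works because YYYY-MM-DD sorts chronologically.
expired=""
while IFS=$'\t' read -r name due; do
  if [[ "$due" < "$today" ]]; then
    echo "EXPIRED: $name (due $due)"
    expired="$expired $name"
  fi
done < credentials.tsv
```

Run it from the same scheduler that runs the agents, and an expired key surfaces as a loud line in a log instead of a silently failed run.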

None of these are emergencies. All of them are maintenance. The question is whether you build the habit of doing that maintenance regularly or let it accumulate until the fleet's output stops being useful.

Three Habits That Keep It Running

You do not need a technical team to maintain an agent fleet. You need someone who reads diffs, edits markdown, and checks in on the system regularly. Here is what that looks like in practice.

Habit one: the daily five-minute read. Open the CMO synthesis. Read it. If something looks off, open the sub-agent brief that feeds it. If the sub-agent brief is wrong, either fix the prompt or flag it for the next tuning session. Most days, the answer is "everything looks fine" and you close the tab.

Habit two: the monthly prompt review. Block 30 minutes once a month. Open each agent's prompt file. Read the positioning section, the ICP description, the hard rules. Ask yourself: is this still true? If your pricing changed, update it. If you added a new product, add it. If your target customer shifted, rewrite the ICP block. This is the single highest-leverage maintenance task for any agent fleet.

Habit three: the quarterly retrospective. Look at the last 90 days of agent output. Which agents are producing work you actually use? Which are generating briefs you skip? If an agent is consistently producing output nobody reads, either retune its prompt or pause it. A four-agent fleet that produces useful work beats a six-agent fleet where two agents are noise.

Expanding the Fleet

After you have the habits down, you will start seeing opportunities the original fleet was not designed to cover. Maybe you need a support triage agent. Maybe the outbound agent should be split into two: one for cold research, one for warm follow-up. Maybe you want an agent that monitors competitor pricing pages weekly.

The architecture we hand off is designed for this. Adding an agent means writing a prompt file, setting a cadence, and wiring it into the CMO agent's read list so it gets synthesized. The documentation covers the process step by step. When we built the system that became Armada Works, we started with five agents and expanded to nine as the operation matured. Not every addition stuck. Two agents got paused because their output didn't justify the daily review time. That is fine. The system is cheap to experiment with because the only cost is runtime, and pausing an agent costs nothing.
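Concretely, each of those three steps (prompt file, cadence, CMO read list) can be a one-line change. A sketch with invented file names and a made-up schedule format, since the actual layout depends on how your fleet was set up:

```shell
# Disposable sandbox standing in for your repo root.
tmp=$(mktemp -d)
cd "$tmp"
mkdir -p agents/competitor-watch agents/cmo

# 1. Write the new agent's prompt file.
cat > agents/competitor-watch/prompt.md <<'EOF'
# Competitor Watch
Check competitor pricing pages weekly and summarize any changes.
EOF

# 2. Set a cadence (a plain schedule file here; your scheduler may differ).
echo "competitor-watch: weekly mon 07:00" >> schedule.txt

# 3. Wire its brief into the CMO agent's read list so it gets synthesized.
echo "agents/competitor-watch/brief.md" >> agents/cmo/read-list.txt
```

Pausing the agent later is the reverse: comment out the schedule line and drop the read-list entry, and the prompt file stays in the repo for when you want it back.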

When to Call for Help

The transfer engagement includes an optional $1,500/month support tier (see pricing). Most founders use it for the first two to three months, then drop it. Here is a rough guide for when it is worth having.

You probably need support if you are the only technical person on the team and your agents touch integrations you did not build (API connections, webhook handlers, database queries). Having someone to call when a credential expires or an integration breaks saves real time in the first quarter.

You probably do not need support if you have a developer on the team who is comfortable with git, your fleet is content and SEO only (no complex integrations), and you did the shadow week attentively. The system is markdown and scheduled tasks. If you can edit a text file and push to main, you can maintain it.

If you are unsure, start with support and cancel when you stop using it. There is no lock-in and no minimum term. The goal, as with everything in the transfer model, is to make yourself independent as fast as possible.

The Metric That Matters

Six months after the handoff, ask yourself one question: would you rebuild this system from scratch if it disappeared? If the answer is yes, the engagement worked. You internalized the architecture, adapted it to your operation, and made it yours. If the answer is no, something went wrong during the transfer, and that is worth a conversation.

The entire premise of Armada Works is that the system we build for you should outlast the engagement. Not because we built something so complex that only we can maintain it. Because we built something simple enough that you can. A fleet of agents that commit to main, post briefs to a dashboard, and coordinate through state files. No proprietary platform. No vendor dependency. Just your repo, your prompts, and the habits to keep them current.

If you want to see whether your codebase is ready for a fleet, book a 30-minute discovery call. Or read how we engage to understand the full methodology before you reach out.

Written by
Robert Cowherd