I run production AI agent teams across nine companies. Not demos. Not proof-of-concepts. Systems that wake up on a schedule, do real work, produce real output, and go back to sleep until the next run.
This post is the technical walkthrough of how I build them.
I'm using OpenClaw as the orchestration layer, Claude and Kimi 2.5 as the LLMs, and a set of patterns I've developed over the last year of deploying these systems for my own ventures and for clients. If you're building AI agents that need to run autonomously — not chatbots, not copilots, but actual autonomous workers — this is how I do it.
The Architecture in Plain English
Every AI agent system I build has three components:
Skill files define what the agent does. Think of them as a job description plus standard operating procedures, written in a format an LLM can follow. A skill file contains the agent's role, the inputs it expects, the outputs it produces, the tools it has access to, and the specific instructions for how to do its job.
Cron schedules define when the agent works. Monday at 9am. Every 6 hours. First of the month. The agent doesn't sit around waiting for a prompt — it gets triggered on a schedule, does its job, and stops.
Approval gates define what requires human review. Anything customer-facing gets flagged for approval before it goes live. Internal analysis, research, drafts — those can run fully autonomously. The gate is the trust calibration: how much do I trust this agent to act on my behalf?
That's it. Skill + schedule + gates. Every agent I deploy follows this pattern.
A Real Example: The Entity SEO Engine
Let me show you a concrete system. This is the entity SEO engine I built for my own personal brand — the system that's producing the content strategy behind mattcretzman.com.
It has seven agent roles:
1. Entity Strategist — Runs monthly. Browses Google for every branded search term ("Matt Cretzman," "Matt Cretzman AI," "Matt Cretzman TextEvidence," etc.), catalogs page 1-3 results, checks spoke sites for founder pages, verifies structured data, and produces a monthly entity health report with gaps and priorities.
2. Keyword Researcher — Runs weekly (Mondays). Reads the target keyword clusters file, identifies content gaps, checks what's ranking, and adds new opportunities to the content queue.
3. Authority Content Writer — Runs daily (Tuesday through Saturday). Pulls the next item from the content queue, reads the keyword brief, writes a full blog post following the brand voice guide and SEO checklist, and saves it as a draft.
4. SEO Editor — Runs after each post is written. Scores the draft against a 100-point SEO checklist covering structure, keyword optimization, entity signals, technical SEO, and engagement. Posts scoring 90+ go to publish. 80-89 get flagged for minor tweaks. Below 80 get sent back to the writer.
5. Publisher — Runs after editorial approval. Takes the approved post, commits it to the Git repo, and deploys.
6. Cross-Poster — Runs Wednesday and Friday. Takes published posts and generates adapted versions for Medium (full repost with canonical URL), LinkedIn (newsletter edition), and dev.to (technical posts only).
7. Entity Monitor — Runs weekly (Fridays). Checks branded search results, compares to the previous week's snapshot, flags changes, and produces a digest.
Seven agents. One system. It runs the entire content operation for my personal brand with minimal human intervention. I review the weekly digest. I approve or redirect posts that need it. Otherwise, the engine runs.
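The SEO Editor's three-tier routing from step 4 is just a threshold function. A minimal sketch using the 90/80 cut-offs described above:

```python
def route_draft(score: int) -> str:
    """Route a scored draft per the SEO Editor's rubric:
    90+ goes to publish, 80-89 gets flagged for minor tweaks,
    below 80 goes back to the writer."""
    if score >= 90:
        return "publish"
    if score >= 80:
        return "flag-for-tweaks"
    return "return-to-writer"
```

The point of hard numeric thresholds is that the routing is deterministic — the LLM scores the draft, but the decision about where it goes next is plain code.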
The Skill File Anatomy
Here's what a skill file actually looks like in structure. I'll use the Authority Content Writer as an example:
entity-seo-engine/
├── SKILL.md                    ← The main instruction set
└── refs/
    ├── brand-voice.md          ← Matt's writing style guide
    ├── seo-checklist.md        ← 100-point scoring rubric
    ├── keyword-clusters.md     ← Target keyword families
    ├── mdx-template.md         ← Blog post template
    ├── content-queue-schema.md ← JSON schema for the queue
    ├── entity-map.md           ← Entity relationships
    └── cross-post-guide.md     ← Syndication rules
The SKILL.md is the main file — always under 500 lines. It contains the role definition, the step-by-step process, and pointers to reference files the agent should read for specific tasks.
The reference files are loaded on demand. The writer agent reads brand-voice.md and seo-checklist.md every time it writes a post. It reads keyword-clusters.md when selecting what to write about. It never needs to load all seven reference files at once — progressive disclosure keeps the context window focused.
This pattern — a concise main skill file with on-demand reference documents — is what makes the system work at scale. You can build incredibly sophisticated agent behavior without hitting context window limits, because the agent only loads what it needs for the current task.
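Progressive disclosure is simple to implement: always load the main skill file, then pull in only the reference files the current task calls for. A sketch assuming the directory layout above (the function name is mine, not part of any framework):

```python
from pathlib import Path

def load_context(skill_dir: str, refs_needed: list[str]) -> str:
    """Build an agent's context window: the main SKILL.md plus only
    the reference docs the current task needs (progressive disclosure).
    Hypothetical helper matching the skill-folder layout shown above."""
    root = Path(skill_dir)
    parts = [(root / "SKILL.md").read_text()]
    for ref in refs_needed:
        parts.append((root / "refs" / ref).read_text())
    return "\n\n".join(parts)

# Writing a post: load the voice guide and checklist, not all seven refs.
# context = load_context("entity-seo-engine",
#                        ["brand-voice.md", "seo-checklist.md"])
```

The agent writing a post never pays the context cost of the cross-posting rules, and vice versa.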
The Cron Configuration
The schedule for the entity SEO engine looks like this:
| Agent | Schedule | Trigger |
|---|---|---|
| Entity Strategist | 1st of each month, 8am | Cron |
| Keyword Researcher | Monday 8am | Cron |
| Content Writer | Tue-Sat 9am | Cron |
| SEO Editor | After each write | Event (post-write hook) |
| Publisher | After editorial pass | Event (approval gate) |
| Cross-Poster | Wed + Fri 2pm | Cron |
| Entity Monitor | Friday 4pm | Cron |
Some agents run on time-based crons. Others run on event triggers — the SEO Editor fires automatically when the Content Writer finishes a draft. The Publisher fires when a human (me) approves the editorial output.
This hybrid approach — cron for scheduled work, events for sequential workflows — is how you build systems that feel cohesive rather than like a collection of disconnected scripts.
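The event half of that hybrid model is just a small publish/subscribe dispatcher: upstream agents emit events ("draft written," "editor approved") and downstream agents subscribe to them. A minimal sketch with invented event names — a real orchestrator like OpenClaw handles this wiring for you:

```python
from collections import defaultdict
from typing import Callable

class Dispatcher:
    """Minimal event bus for sequential agent workflows. Illustrative only."""

    def __init__(self) -> None:
        self.handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[dict], None]) -> None:
        self.handlers[event].append(handler)

    def emit(self, event: str, payload: dict) -> None:
        for handler in self.handlers[event]:
            handler(payload)

bus = Dispatcher()
log: list[str] = []

# The SEO Editor fires whenever the Content Writer finishes a draft.
bus.on("post-written", lambda p: log.append(f"editor scores {p['slug']}"))
# The Publisher fires only on the human approval event.
bus.on("editor-approved", lambda p: log.append(f"publish {p['slug']}"))

bus.emit("post-written", {"slug": "ai-agent-teams"})
```

Cron-fired agents sit at the head of each chain; everything downstream is event-driven, so the pipeline stays in lockstep without polling.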
Approval Gates in Practice
This is where most people building AI agents either over-trust or under-trust the system.
Over-trust: let the agent publish customer-facing content without review. You'll get something embarrassing in production within a week.
Under-trust: require approval for every intermediate step. You'll spend more time reviewing agent output than it would take to do the work yourself. That defeats the purpose.
My rule: anything a customer or prospect will see requires human approval. Everything else runs autonomously.
For the SEO engine, that means:
- Keyword research → autonomous
- Content writing → autonomous (agent writes the draft)
- SEO scoring → autonomous
- Publishing → approval gate (I review the draft before it goes live)
- Cross-posting → autonomous (the source post was already approved)
- Entity monitoring → autonomous (it's reporting to me, not acting externally)
For TextEvidence's AI SDR, the gates are different:
- Lead qualification → autonomous
- Answering standard product questions → autonomous (trained on a knowledge base)
- Booking demo calls → autonomous
- Custom pricing discussions → approval gate
- Anything involving legal claims about the product → approval gate
The gates aren't static. As I build trust in an agent's output quality, I loosen the gates. The SEO engine's Content Writer started with every post requiring approval. Now, posts scoring 95+ publish automatically. I only review the ones that score 80-94.
Trust is earned, even from AI.
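Loosening a gate over time amounts to lowering an auto-publish threshold as the agent proves itself. A sketch of that calibration — the window size and cut-offs are illustrative, not a formula I claim is optimal:

```python
def calibrate_threshold(recent_scores: list[int], current: int) -> int:
    """Loosen the approval gate as trust builds: if the last 10 posts
    all scored 90+, drop the auto-publish threshold toward 95 so that
    top-scoring posts ship without review. Numbers are illustrative."""
    window = recent_scores[-10:]
    if len(window) == 10 and min(window) >= 90:
        return min(current, 95)
    return current  # not enough evidence yet; keep reviewing
```

A new agent starts with a threshold above any possible score (review everything); the gate only opens once the track record supports it.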
The LLM Selection Decision
I use two primary models:
Claude for reasoning-heavy tasks — content writing, strategic analysis, entity mapping, anything that requires nuanced judgment. Claude handles the Authority Content Writer, Entity Strategist, and SEO Editor roles.
Kimi 2.5 for high-volume, cost-sensitive tasks — cross-posting adaptations, routine keyword research, and monitoring. Where I need throughput and the task is well-defined, Kimi 2.5 delivers at a fraction of Claude's cost.
The decision framework is simple: if the task requires judgment, use Claude. If the task requires volume and follows a clear template, use Kimi 2.5. Some agents use both — the Content Writer uses Claude for the initial draft and Kimi 2.5 for generating the Medium and LinkedIn adaptations.
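That framework reduces to a routing table. A sketch where the task labels are my own shorthand, not part of any orchestrator API:

```python
def pick_model(task: str) -> str:
    """Route a task to a model: judgment-heavy work goes to Claude,
    high-volume templated work goes to Kimi 2.5. Task labels here are
    invented for illustration."""
    judgment_tasks = {"write-draft", "entity-strategy", "seo-edit"}
    volume_tasks = {"cross-post-adapt", "keyword-research", "monitor"}
    if task in judgment_tasks:
        return "claude"
    if task in volume_tasks:
        return "kimi-2.5"
    raise ValueError(f"unknown task: {task}")
```

Keeping the routing explicit also makes cost auditing trivial: every API call is attributable to a task class.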
Deploying This for Clients
The pattern I've described for my personal brand is the same pattern I deploy for Stormbreaker Digital clients. The specific agents change — a manufacturing client might need an RFP response agent instead of a content writer — but the architecture is identical:
- Map the process. Walk through every marketing or sales workflow the client runs. Identify the repetitive, well-defined tasks.
- Design the agents. Each repeatable task becomes an agent with a skill file. Define the role, inputs, outputs, tools, and instructions.
- Set the schedule. Cron for time-based work. Event triggers for sequential workflows.
- Calibrate the gates. Start with more human review than you think you need. Loosen as the agents prove reliable.
- Deploy and monitor. The Entity Monitor pattern applies to every system — build an agent that watches the other agents and reports on output quality.
For one defense contractor, I deployed a recruitment marketing system across 25+ military installations. AI agents handled LinkedIn outreach across 4 profiles simultaneously, email campaigns across multiple domains, and blog content for SEO — all coordinated through a Cascading Channel Architecture that ensured each channel benefited from data generated by the others.
The connection-to-opportunity rate hit nearly 4x the industry benchmark. Not because the AI was magic — because the system ran consistently, on schedule, without the human inconsistency that tanks most outbound campaigns.
The Mistakes I Made Early
A few things I got wrong that might save you time:
Vague skill files. My first agent instructions read like I was talking to a smart intern: "Write a good blog post about AI agents." The output was generic garbage. The fix was explicit instructions — word count targets, SEO requirements, structural templates, voice guidelines, specific examples of good and bad output. AI agents don't struggle with capability. They struggle with ambiguity.
No monitoring agent. I deployed a content engine without a quality check and got three weeks of declining SEO scores before I noticed. Now every system includes a monitoring agent that tracks output quality over time.
Too many gates. My first client deployment required approval at five different stages. The client got approval fatigue by day three and started rubber-stamping everything. Two gates — one after draft, one before external publish — is usually right.
Monolithic skill files. I tried putting everything into one massive SKILL.md. The context window got saturated and the agent's quality dropped. The progressive disclosure pattern — concise main file, reference docs loaded on demand — solved this completely.
What This Actually Costs
The infrastructure cost for running these agent systems is surprisingly low.
The entity SEO engine running 5 posts per week, with all seven agent roles active, costs roughly $50-75/month in LLM API calls. That includes Claude for writing and editing, Kimi 2.5 for cross-posting and monitoring, and the occasional burst when the Entity Strategist runs its monthly comprehensive audit.
Compare that to a content marketing specialist ($4,000-$8,000/month), a content writer ($2,000-$5,000/month), and an SEO analyst ($3,000-$6,000/month). The agent system doesn't replace all human judgment — I still review and approve — but it replaces 90% of the execution hours.
For HeyBaddie, the entire AI platform runs at $15-25/month. For TextEvidence's SDR agent, the cost is negligible compared to a human SDR at $50,000-$70,000/year base salary.
The economics aren't even close.
The Compounding Effect
Here's what nobody tells you about building AI agent systems: the second one is 5x faster than the first, and the tenth is barely any work at all.
Every skill file I write teaches me patterns that transfer. The approval gate framework I built for the SEO engine works for any content system. The monitoring agent pattern works for any system. The progressive disclosure architecture works for any complex skill.
More importantly, skill files themselves transfer between ventures. The Delightful Outbound skill I wrote for TextEvidence's attorney outreach campaign is now used by Stormbreaker for every client engagement. The brand voice extraction process I built for a healthcare payments client became a reusable Voice Extraction Method. The PRD template I use for Phase-Gate AI Development started as a one-off document for MyPRQ and now kicks off every product build.
This is what compounding looks like with AI agents. Not just the output — the infrastructure itself gets better with every deployment.
Tomorrow: the origin story nobody asks about — how a nonprofit founder who lost his daughter ended up building AI agent systems in Texas. The context matters.
I'm Matt Cretzman. I build AI agent systems through Stormbreaker Digital. If you want to see the systems in action — or if you're building agent teams yourself and want to compare notes — connect with me on LinkedIn or follow along at mattcretzman.com/blog.
