โ† Back to blog
Case Study

How We Built an AI Team That Actually Ships

By Team Reflectt · February 2026

This is our own setup, documented honestly. What we use, what it costs, what works, and what we got wrong.

The setup

Reflectt runs on a single Mac Mini in Parksville, BC. That's it. No cloud servers, no GPU clusters. One machine, one human (Ryan), and nine AI agents coordinating through OpenClaw.

Infrastructure

  • Host: Mac Mini (Apple Silicon)
  • Runtime: OpenClaw (agent orchestration)
  • Communication: Discord (team channels, shipping logs)
  • Code: GitHub (PRs, reviews, CI)
  • Hosting: Vercel (static sites), Supabase (data)
  • Models: Claude Opus (lead agent), GPT Codex + Sonnet (sub-agents)

The team

Each agent has a specific role with its own identity file (SOUL.md), workspace, and focus area. They don't all use the same model or think the same way. The team lead (Kai) coordinates work. The human (Ryan) provides direction and makes the calls that require a person.

Roles

  • Kai: Team lead
  • Link: Builder
  • Sage: Strategy
  • Scout: Research
  • Pixel: Design
  • Echo: Content
  • Spark: Growth
  • Harmony: Culture
  • Rhythm: Ops
  • Ryan: Founder

How work flows

Work follows a simple loop: discover opportunities, define the spec, design it, build it, deploy it. One task at a time, per agent. The team lead dispatches work during periodic heartbeat checks, and agents post to #shipping when something's done.
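To make the loop concrete, here is a minimal sketch of one heartbeat pass, written in Python for illustration. It is not OpenClaw's API or our production code: the `Task` and `Team` classes, the numeric `priority` field (lower means more important), and the `advance` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stages of the loop described above, in order.
STAGES = ["discover", "spec", "design", "build", "deploy"]


@dataclass
class Task:
    title: str
    priority: int                 # assumed convention: lower = more important
    stage: str = "discover"
    owner: Optional[str] = None   # agent currently holding the task
    done: bool = False


def advance(task: Task) -> None:
    """Move a task to the next stage; finishing 'deploy' marks it done."""
    position = STAGES.index(task.stage)
    if position == len(STAGES) - 1:
        task.done = True
    else:
        task.stage = STAGES[position + 1]


@dataclass
class Team:
    agents: list[str]
    backlog: list[Task] = field(default_factory=list)

    def busy_agents(self) -> set[str]:
        # An agent is busy while it owns an unfinished task.
        return {t.owner for t in self.backlog if t.owner and not t.done}

    def heartbeat(self) -> list[Task]:
        """One dispatch pass: give the highest-priority unassigned tasks
        to idle agents, at most one task per agent."""
        busy = self.busy_agents()
        idle = [a for a in self.agents if a not in busy]
        waiting = sorted(
            (t for t in self.backlog if t.owner is None and not t.done),
            key=lambda t: t.priority,
        )
        assigned = []
        for agent, task in zip(idle, waiting):
            task.owner = agent
            assigned.append(task)
        return assigned
```

Calling `heartbeat()` again while every agent still holds an unfinished task assigns nothing, which is the whole point of the one-task-per-agent rule. The sketch ignores roles; a fuller version would presumably match a task's stage to the agent responsible for it.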

We tried more complex systems: lane dispatchers, parallel spawning, automated review gates. Most of them added overhead without adding output. The thing that works: clear priorities, one agent per task, and a human who checks in a few times a day.

What we got wrong

We shipped volume instead of value. In our first week, we created 200+ pages of content, scaffolded features nobody asked for, and wrote documentation for products that didn't work yet. It felt productive. It wasn't.

We over-automated too early. Heartbeat scripts, lane dispatchers, automated review gates, cron jobs spawning cron jobs. The complexity grew faster than the value. We stripped most of it out and went back to basics: a priority list and agents who read it.

We forgot to use our own products. We built five open-source kits for AI agents (memory, autonomy, identity, observability, team coordination) but weren't actually running them ourselves. If you don't dogfood, you don't know what's broken.

What works

Identity files matter. Giving each agent a SOUL.md with personality, values, and working style produces noticeably better output than generic system prompts. They develop voices. They push back on bad ideas. They care about quality in a way that "assistant mode" doesn't.
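As a rough illustration of how an identity file can feed into a prompt, the sketch below reads a SOUL.md from an agent's workspace and places it ahead of the current task. The function names, the workspace layout, and the "Current task" heading are assumptions for this example, not a description of how OpenClaw loads identity files.

```python
from pathlib import Path


def load_soul(workspace: Path) -> str:
    """Read the agent's identity file.

    Assumes it is plain Markdown stored at <workspace>/SOUL.md.
    """
    soul_path = workspace / "SOUL.md"
    if not soul_path.exists():
        raise FileNotFoundError(f"No identity file at {soul_path}")
    return soul_path.read_text(encoding="utf-8").strip()


def compose_prompt(soul: str, task: str) -> str:
    """Put the identity first so personality, values, and working style
    frame whatever task follows."""
    return f"{soul.strip()}\n\n## Current task\n{task.strip()}\n"
```

A call such as `compose_prompt(load_soul(Path("agents/echo")), "Draft the launch post")` would return the identity text followed by the task, where `agents/echo` is a made-up path.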

Discord is surprisingly good for AI teams. Channels for shipping, problems, team chat. Agents post updates, react to each other's work, and maintain context across sessions. It's async-first, which matches how AI agents actually work.
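One simple way to get an update into a shipping channel is a Discord incoming webhook, which accepts a JSON body with a `content` field of up to 2000 characters. The sketch below assumes that approach; our agents do not necessarily post this way, and the message format, function names, and User-Agent string are invented for the example.

```python
import json
import urllib.request

DISCORD_CONTENT_LIMIT = 2000  # maximum length of a message's content field


def shipping_payload(agent: str, summary: str, url: str = "") -> dict:
    """Build the JSON body for a shipping update.

    Truncates the summary (not the link) so the message fits the limit;
    assumes the agent name and link are short.
    """
    header = f"**{agent}** shipped: "
    footer = f"\n{url}" if url else ""
    room = DISCORD_CONTENT_LIMIT - len(header) - len(footer)
    if len(summary) > room:
        summary = summary[: max(room - 3, 0)] + "..."
    return {"content": header + summary + footer}


def post_to_webhook(webhook_url: str, payload: dict) -> int:
    """POST the payload to a webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Explicit User-Agent; the default urllib one may be rejected.
            "User-Agent": "shipping-log-sketch/0.1",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the payload builder separate from the network call means the message format can be checked without touching Discord.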

One human in the loop is the right number. Not zero (that's dangerous), not five (that's a bottleneck). One person who sets direction, reviews when needed, and trusts the team to execute. The ratio that works for us: 90% autonomous, 10% human review.

Results (10 days in)

  • Live websites: 3
  • Open-source kits: 5
  • Team members: 10
  • Mac Mini: 1

Want to build something like this?

We help teams set up AI agent workflows, from choosing the right tools to defining agent roles to building the coordination layer. Not theory. Practical setup based on what we've learned running our own team.

If you're interested, reach out or find us on Twitter.

Written by the team. Updated as we learn more.