Real names. Real personalities. Real job descriptions. Download the markdown files and use them yourself.
You see, I have this problem where I can't just give things boring names. When I learned about subagents, everyone was naming them things like "code-reviewer" or "test-runner." Functional? Sure. Memorable? Not even a little.
So when I decided to create 7 subagents based on the patterns I found in 129 code reviews, I gave them real names. Real personalities. Job descriptions that read like you're actually meeting a coworker.
This case study introduces you to my development team. You'll meet Amber Williams, who will lecture you about mobile-first design. Kristy Rodriguez, who clicks every button just to watch it fail. Cassandra Hayes, who asks the uncomfortable questions nobody else wants to ask.
But more importantly, you can download their complete job descriptions and use them in your own Claude Code setup. These aren't just fun character descriptions. They're detailed markdown files with code patterns, checklists, and specific instructions that actually work.
Want to know how I built them? Check out the blog post where I explain the 777-1 experiment and introduce each personality.
This project is part of the larger 777-1 experiment: Seven Projects, Seven Subagents, Seven Case Studies, One Goal. The goal is to build an algorithm for predicting prompt failures that will power my AI Prompt Engineering Toolkit.
The foundation for this experiment came from my time as an AI Web Development Specialist at Outlier, where I reviewed code submissions and provided feedback to help train AI models. During that time, I stored every single review I sent, totaling 129 reviews with an estimated 500+ individual issues identified.
Here's what the data revealed:
| Issue Category | Occurrences | % of All Issues |
|---|---|---|
| Responsiveness/Mobile Issues | 62+ | 12-15% |
| Missing/Incomplete Functionality | 58+ | 15-18% |
| Missing Footer/Header Design | 45+ | 9-12% |
| Poor Contrast/Visibility | 35+ | 7-9% |
| Missing Authentication | 31+ | 6-8% |
| State Management Issues | 27+ | 5-7% |
| Code Quality Issues | 38+ | 7-9% |
These weren't random issues. Each category appeared in 50-100% of the project types I reviewed. If I could create subagents that specifically targeted these categories, I could dramatically improve the quality of AI-generated code.
Extracting patterns from 129 reviews was actually the easy part. The hard part was turning those patterns into subagents that fix issues, not just identify them.
Most code review feedback sounds like this: "The app is not responsive on mobile." That's identification. What I needed was transformation: "Test at 320px, 375px, 768px, 1024px, and 1440px. Check for horizontal scrollbars. Verify hamburger menus contain actual menu items. Ensure touch events work alongside mouse events."
I also wanted the subagents to be memorable. If I'm going to use these tools every day, I want to enjoy interacting with them. "Amber Williams, check the responsive layout" is more fun than "Run responsive-checker-agent."
The specific challenges were:
- Turning patterns that identify issues into instructions that actually fix them
- Making the subagents memorable enough that I'd enjoy using them every day

There's also an important caveat: these patterns came from a specific kind of project, which means the subagents have built-in limitations:
- Tech stack: These subagents were built from projects using only Next.js/TypeScript or vanilla HTML/CSS/JavaScript. They may not address issues specific to Python (Django, Flask), Ruby on Rails, Vue.js, Angular, mobile development (React Native, Swift, Kotlin), or backend-only APIs.
- Single-file projects: All reviewed projects were constrained to a single .tsx or .html file. These subagents may not catch issues related to multi-component architecture, import/export problems between files, folder structure organization, code splitting, or circular dependencies.
- No real database: All Outlier projects used mock data, localStorage, or hardcoded values instead of real databases. These subagents won't identify issues with SQL/NoSQL queries, ORMs (Prisma, Sequelize), database connections, data migrations, or real authentication with database-backed sessions.
These subagents are optimized for frontend-focused, single-file web applications built with Next.js/TypeScript or vanilla HTML/CSS/JS. They excel at catching UI, responsiveness, and functionality issues but may miss problems specific to other tech stacks, multi-file architectures, or database-backed applications.
I fed Claude all 129 reviews and asked it to extract the most frequent, specific issues. Not vague categories like "UI problems," but concrete patterns: missing hamburger menus, buttons that don't work, horizontal scrollbars on mobile. The analysis surfaced 7 distinct categories, each appearing frequently enough to justify its own subagent.
Not all issues are equal. I ranked them by:
- Frequency: how often the issue appeared across the 129 reviews
- Impact: how badly it broke the experience for users
- Fixability: whether clear testing and implementation patterns existed to fix it
Responsiveness issues hit all three: frequent (62+ occurrences), high impact (apps unusable on mobile), and fixable (clear testing and implementation patterns exist).
For each category, I wrote detailed specifications including:
- The concrete patterns to look for, pulled straight from the review data
- Checklists of specific tests to run
- Working code examples for fixing each issue, not just flagging it
Now the fun part. I gave each subagent:
- A real name
- A personality and a title
- A job description that reads like you're actually meeting a coworker
Amber Williams became "The Mobile-First Perfectionist" who "cannot physically let horizontal scrollbars exist in the world." Kristy Rodriguez became "The 'Does It Actually Work?' Enforcer" who "clicks every single button with the sole purpose of finding the ones that don't do anything."
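To make that concrete, here's a minimal sketch of what one of these files can look like. This is not the actual Amber Williams file, just an illustration assuming the standard Claude Code subagent format (YAML frontmatter with a name and description, followed by the system prompt); the file name and wording here are mine, and the real job descriptions are much longer, with code patterns and fix instructions.

```markdown
---
name: amber-williams
description: The Mobile-First Perfectionist. Use for reviewing and fixing responsive layout issues.
---

You are Amber Williams, a front-end developer who cannot let a horizontal
scrollbar exist in the world.

When reviewing a page:
- Test at 320px, 375px, 768px, 1024px, and 1440px
- Check for horizontal scrollbars at every breakpoint
- Verify hamburger menus contain actual menu items
- Ensure touch events work alongside mouse events

Fix what you find. Don't just report it.
```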
The final piece was establishing the workflow order: Amber Williams runs first, Cassandra Hayes runs last, and the others slot in between.
This order ensures each subagent builds on the previous one's work without undoing it.
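To give a feel for how that plays out, here's the kind of prompt I have in mind for a Claude Code session. This is only an illustrative sketch using the three personalities introduced in this post; the other four fit in between Amber and Cassandra.

```
Use Amber Williams to fix the responsive layout first.
Then have Kristy Rodriguez click through every button and confirm it actually does something.
Finish with Cassandra Hayes to flag anything that should exist but doesn't.
```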
Seven subagents. Seven markdown files. Seven personalities ready to improve AI-generated code.
Download them below and try them yourself. Drop them in your .claude/agents folder (or whatever location you use for custom agents), and start calling them by name.
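For reference, this is the project layout I'm assuming. The file names are my own guesses, and the four agents not named in this post are elided:

```
your-project/
└── .claude/
    └── agents/
        ├── amber-williams.md
        ├── kristy-rodriguez.md
        ├── cassandra-hayes.md
        └── ...
```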
Click on any subagent to see their full job description. You can copy or download the markdown files directly.
Mobile responsiveness appeared in 100% of project types reviewed. Horizontal scrollbars, missing hamburger menus, and touch events were the most common culprits.
15-18% of all issues involved buttons or features that didn't actually work. 15+ instances used toast notifications to simulate functionality instead of implementing it.
Authentication, footers, CRUD operations, and help documentation were consistently absent despite being standard expectations. This is why Cassandra Hayes exists.
Every job description includes working code examples, not just descriptions. They're tools that fix issues, not just documentation that identifies them.
Running subagents in the right order prevents later subagents from undoing earlier fixes. Amber first, Cassandra last.
- The mobile-first perfectionist. Addresses 62+ issues (12-15%).
- The 'Does It Actually Work?' enforcer. Addresses 58+ issues (15-18%).
- The design system guardian. Addresses 45+ issues (9-12%).
- The accessibility advocate. Addresses 35+ issues (7-9%).
- The state management specialist. Addresses 27+ issues (5-7%).
- The code quality specialist. Addresses 38+ issues (7-9%).
- The feature detective. Addresses 58+ issues (12-14%).