
777-1: Seven Projects, Seven Subagents, Seven Case Studies, One Goal

129 code reviews, 7 recurring issues, 7 custom subagents. Here's my ambitious plan to build an algorithm that predicts prompt failures before they happen.

Tags: 777-1, AI, Prompt Engineering, Subagents, Case Studies, Building in Public, Outlier, Portfolio Development


Published: November 23, 2025 • 8 min read

I've been cooking these past few days. Not just yellow potatoes, but ideas.

You see, I have a confession to make.

The Project I Put on Pause

In this blog post, I mentioned that I was going to work on something I nicknamed the "Project of Projects." That project is the AI Prompt Engineering Toolkit, and I even outlined a plan for it. It is meant to be the project that demonstrates the intuition I have built over my years of working extensively with AI.

However, since the 23rd of October, this project has been sitting at 30% completion.

And the reason is that, for a long time, I wasn't clear on exactly how I wanted this application to work or come together. So far, I have made good use of the application, even at 30% completion: I used it in this case study to see how well Claude Code could redesign the application, moving its design from Neobrutalism to Glassmorphism. However, that is not where I want the application to end.

30 Days of Clarity

Over the past 30 days (wow, it's actually quite shocking to me that I'm writing this exactly 30 days after I put that project on pause), this app has been sitting on the back burner of my brain. Given the work I have done since, I think I finally have some clarity on how I am going to use this application to demonstrate the intuition I have built while working with AI.

Here is my plan.

The Foundation: 129 Code Reviews

When I worked at Outlier reviewing applications built by other attempters, with the goal of producing golden-standard applications, I stored every review I sent. As I described in the review workflow, the same task could come back to me multiple times for review, and keeping track of the reviews I had previously sent for a specific task let me confirm that the attempter had fixed all the issues highlighted in my reviews.

Here are 3 examples of what the reviews I sent looked like:

Review Example 1

Amazing Job on the UI! Well done, unfortunately, I will have to send this back as a lot of the features are not functional, and one of the crucial baselines is that "THE UI MUST BE FULLY FUNCTIONAL AND WORK CLOSELY AS A PRODUCTION UI WOULD". You might want to start by implementing a login and sign-up authentication system for the application. I would also suggest implementing the search functionality and allowing users to leave comments as well. I noticed the main header at the top becomes transparent as you scroll through. While this looks "cool", it is also weird to see text on the header over the text on the main page as you scroll through; you might want to change that. Good job adding a footer, but consider improving it to make it more commercial-like. It also doesn't make sense that there is a follow button on the profile section. Does this mean I can follow myself? How would that work? You should also implement the like functionality. When you are done with these, if there are other functionalities/buttons that have no function, consider adding a toast message that announces that that feature is coming soon when it is clicked on.

Review Example 2

This is a good start to the task but unfortunately there are a number of issues I have to bring to your attention. First, the nature of this application makes it one that by default requires an authentication system so please include sign in and sign up features. I see that in settings I am able to edit the profile information but this does not change the profile icon which has the text "AS" on it so it makes the update inconsistent. The footer looks nice but on really large screens, it has extra spacing on the left and right side which does not look good. You want to make sure that the content fills up the footer area without leaving any unnecessary white space. Between dimensions 770-1316, the application, particularly the header/navigation is not responsive. As I make the width of the screen smaller, the footer acts responsively and divides its content between 2 instead of 4 columns. The issue with this though is that the right side of the footer then has an excessive amount of space compared to the left side. When I start a study session, then pause it, I expect the pause button to change its text to "Continue" and the "Start" button to be greyed out. Please correct this to make that feature more intuitive. The "Browse Courses" button on the Dashboard page does not open the "Courses" section as I would expect. In the "My Courses", "Progress", "Recommendations" and "Discussions" sections, there isn't enough padding around the content in the main content area so they appear to be too close to the tags at the top on those pages. The toast notifications that start with "Opening..." can be a bit confusing. It might be a good idea to explicitly say in the toast message that the feature is yet to be implemented. When I unfollow a discussion, the UI does not update immediately until I switch filters. Please fix this as well. If the icons in the footer are meant to be social media icons, I am pretty sure you can find better more intuitive icons to use there. 
When you change the icons, put the actual social media website urls in the href tag. Please fix these issues and any other ones you notice as you do so.

Review Example 3

Good start to the task, but there is a little issue with the code. It appears that you have JSX mixed inside a string template literal, which is invalid JavaScript syntax. Look at the div tag below where you have the comment "{/* Start & End Time */}". I believe this should be resolved by moving the JSX outside the className. Also, there is a part of the application that doesn't seem very intuitive. Upon first glance, I wonder what the difference between the Start and Focus buttons are. The eye icon on the focus button suggests that it allows you to view or hide something, which in this case would be the focus modal. If that is the intention, then the exit button on the focus modal should not end the focus session. Maybe add two buttons, Stop session and Exit modal? Sorry, it's just not very intuitive when I have to think about which of the buttons I am supposed to click on, so I suggest you work on improving that. Good luck making these changes!

The Data

Now here is the thing: sitting on my computer right now are exactly 129 reviews like the ones above.

Now you may be wondering: doesn't your website say 65+ applications built? Well, 65 is a rough estimate of the number of applications I built myself, either from scratch or through review tasks that required modifying a previous attempter's work. The + in 65+ signifies the additional applications that I only reviewed.

Check out this blog post for more details on what that workflow looked like.

I have always wondered what to do with all these reviews sitting on my computer. Well, after learning about subagents, a light bulb went off in my head.

The Extraction: 7 Recurring Issues, 7 Custom Subagents

So, two days ago, I had Claude read every single review on my computer and produce a comprehensive document detailing the top 7 issues I pointed out most consistently across reviews.

Then, using those issues, I manually built 7 subagents for Claude Code. By manually, I mean that I did not use the /agents functionality in Claude Code to define these agents; instead, I created the markdown files with each agent's specifications myself.
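For context, a Claude Code subagent is just a markdown file in `.claude/agents/` with a bit of YAML frontmatter on top. Here's an illustrative sketch of what one of these files can look like. The name, tools, and wording below are my own placeholders for this post, not the actual agents I wrote, but they target the kind of issue my reviews kept flagging:

```markdown
---
name: functionality-reviewer
description: Reviews a Next.js/TypeScript app for non-functional UI elements such as dead buttons, missing auth flows, and unimplemented features.
tools: Read, Grep, Glob
---

You are a strict UI functionality reviewer. Walk through every interactive
element in the codebase (buttons, links, forms) and verify that each one
either performs a real action or shows a "coming soon" toast. Flag any
element that silently does nothing, and list concrete fixes in your report.
```

The frontmatter tells Claude Code when to delegate to the agent; the body becomes that agent's system prompt.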

The Plan: 7 Projects, 7 Subagents, 7 Case Studies

Now here is my plan.

I am going to build 7 projects, inspired by a range of projects I worked on at Outlier. I'll use the general-purpose subagent to build each project and then apply each of my 7 custom subagents to the general-purpose subagent's work.

I will document extensively the entire process and even create multiple GitHub branches to track changes made by each subagent. I will share my documentation for each project in a case study.

So you see now why this is titled "Seven Projects, Seven Subagents, Seven Case Studies, One Goal."

But what is that One Goal?

The Goal: An Algorithm for Predicting Prompt Failures

The goal is to define an algorithm, with a scoring mechanism, that evaluates how good a prompt is. This algorithm will power a key part of the Prompt Engineering Toolkit: the playground.

If you look at the version of the Prompt Engineering Toolkit used in the meta-prompting case study, the algorithm in the playground was defined by Claude itself. Now, I want to create mine.

You see, I want people to be able to provide or construct their prompts in my Prompt Engineering Toolkit's playground, and I want to be able to predict what issues might arise if the user goes ahead and executes that prompt.

These predictions will be backed by the 7 projects and their case study documentation. The goal is to show that I've built real intuition for working with LLMs.

The Prompt Engineering Toolkit will look at your prompt, predict the areas where the model is likely to fail during execution, reference one or more case studies that demonstrate those failures, and then provide testing suggestions to confirm the failures don't occur, along with strategies for improving the prompt.
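To make the idea concrete, here is a minimal TypeScript sketch of the kind of scoring mechanism I have in mind. Everything here is a placeholder: the issue names, the regex heuristics, the penalty weights, and the case study labels are all invented for illustration. The real weights and predictions are exactly what the 7 case studies are meant to produce:

```typescript
// Hypothetical sketch of a prompt-failure predictor.
// The checks mirror the kinds of problems my reviews kept flagging;
// the heuristics and weights are placeholders, not the real algorithm.

interface Prediction {
  issue: string;      // predicted failure area
  caseStudy: string;  // case study that would back the prediction
}

interface PromptReport {
  score: number;      // 0-100, higher = fewer predicted failures
  predictions: Prediction[];
}

const CHECKS: { pattern: RegExp; issue: string; caseStudy: string; penalty: number }[] = [
  {
    // Prompt never demands working features -> likely dead buttons
    pattern: /functional|working|clickable/i,
    issue: "Non-functional UI elements",
    caseStudy: "777-1 Case Study 1 (placeholder)",
    penalty: 25,
  },
  {
    pattern: /responsive|mobile|breakpoint/i,
    issue: "Broken responsiveness at mid-range widths",
    caseStudy: "777-1 Case Study 2 (placeholder)",
    penalty: 20,
  },
  {
    pattern: /auth|sign ?in|sign ?up|login/i,
    issue: "Missing authentication flow",
    caseStudy: "777-1 Case Study 3 (placeholder)",
    penalty: 15,
  },
];

function scorePrompt(prompt: string): PromptReport {
  const predictions: Prediction[] = [];
  let score = 100;
  for (const check of CHECKS) {
    // If the prompt never mentions the concern, predict the model will miss it.
    if (!check.pattern.test(prompt)) {
      predictions.push({ issue: check.issue, caseStudy: check.caseStudy });
      score -= check.penalty;
    }
  }
  return { score: Math.max(score, 0), predictions };
}
```

Under this toy scoring, a bare prompt like "Build a recipe app" would trip all three checks, while one that spells out authentication, responsiveness, and fully working buttons would score 100. The real version will obviously need to go beyond keyword matching, but the shape of the output, a score plus case-study-backed predictions, is what the playground will surface.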

The Project That'll Never End

Now, I have to acknowledge that there are real limitations to the work I am about to do.

For starters, I am only using 7 projects. That will never be enough data to define a truly comprehensive algorithm.

Also, all the applications will be built with Next.js and TypeScript, so how would this be useful to other developers working with a different tech stack?

These and many more questions are probably running through your head as they are running through mine, and that is why this project, the Prompt Engineering Toolkit, now gets a second nickname: "The Project That'll Never End".

I will continue to iterate, adding more projects and refining the methodology, but to keep this blog post from getting any longer, I'll save those details for the future.

For now, just know that every blog post I write about this project will have "777-1" in its title and as a tag so that you can easily search for them.

The Emotions

I'm scared and excited about the outcomes of this project. Ever felt those two emotions at the same time?

Anyways, as always, thanks for reading!
