Eight tools. Four days. Zero new spend.

Written with the editorial assistance of Claude (Anthropic).

In Part 1, I laid out the trap: every new feature was a risk to the 95% adoption rate we had earned with the MVP. Add too much, too fast, in front of a transient volunteer base, and we risked losing the very thing that made the product work.

I had already built eight prototypes around our core user types. Testing the UX and socializing new designs with coordinators and repeat volunteers would give us critical insight into the next version of the app.

But prototypes weren’t the bottleneck.

The reframe

I didn’t have a design problem.

I had a design validation problem.

Budget constraints around new software still existed, but the larger issue was that we had no consistent workflow for observing volunteers and collecting behavioral insights early enough to protect adoption.

The real question became:

How do you validate design changes against real volunteer behavior, quickly, with a transient user base that can’t participate in formal research?

The existing tools didn’t fit.

Neither did the budget.

An ad about building websites in 15 minutes kept replaying in my head. I didn’t expect to build a complete solution that quickly, but it pushed me toward a different question:

Was building our own solution actually more feasible and more cost-effective than buying one, and how quickly could I do it?

I started pressure-testing ideas with Claude.

The questions that picked the tools

Before I chose any tools, I had to work through a practical problem.

I needed to test eight interfaces and user journeys before launch.

The volunteers using the platform were already comfortable with the current experience. The July release would expose them to entirely new workflows and layouts. If those changes created confusion, I wanted to learn that before launch, not after.

The challenge quickly became overwhelming.

I needed a way to deploy interfaces anywhere, at any time. I needed to observe behavior across multiple user types. I needed to compare reactions across eight different experiences. I needed basic demographic information to provide context for the results. And I needed all of it to happen without creating a complicated process for volunteers or coordinators.

So I broke the problem into smaller questions.

First: How do I make each prototype accessible from anywhere?

My initial thought was to send links directly to volunteers. That immediately led to another question: where would those interfaces live?

Netlify became the answer.

It gave me a simple way to deploy HTML experiences and make them available from any device with an internet connection.
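As a rough illustration of the "send links directly to volunteers" idea, here is a minimal sketch that builds one shareable link per prototype and user type. Everything named in it is an assumption for the sake of the example: the site address, the prototype slugs, the `role` query parameter, and the two user types are hypothetical and not taken from the actual project.

```typescript
// Hypothetical names throughout: the site URL, slugs, and the "role" query
// parameter are illustrative assumptions, not details from the real build.

type UserType = "volunteer" | "coordinator";

interface PrototypeLink {
  slug: string;
  userType: UserType;
  url: string;
}

// Build one shareable link for every prototype and user type combination.
function buildPrototypeLinks(
  baseUrl: string,
  slugs: string[],
  userTypes: UserType[],
): PrototypeLink[] {
  const links: PrototypeLink[] = [];
  for (const slug of slugs) {
    for (const userType of userTypes) {
      // Resolve the prototype path against the deployed site address.
      const url = new URL(`/${slug}/`, baseUrl);
      url.searchParams.set("role", userType);
      links.push({ slug, userType, url: url.toString() });
    }
  }
  return links;
}

const links = buildPrototypeLinks(
  "https://example-playground.netlify.app",
  ["check-in-a", "check-in-b"],
  ["volunteer", "coordinator"],
);

for (const link of links) {
  console.log(`${link.slug} (${link.userType}): ${link.url}`);
}
```

Encoding the user type in the link is one possible way to keep context attached to each visit without asking volunteers to fill in anything first.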

Second: How do I observe behavior?

I didn’t just need opinions.

I needed to see where people clicked, where they hesitated, what they ignored, and where they got lost.

Microsoft Clarity solved that problem.

Heatmaps, session recordings, and behavioral analytics gave me a way to compare interactions across interfaces without introducing additional cost.
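One way to make that comparison practical is to label each session with the prototype being viewed, so recordings and heatmaps can be filtered afterwards. The sketch below assumes the standard Clarity tracking snippet is already on the page and uses Clarity's custom tag call, `clarity("set", key, value)`. The tag names and values are hypothetical, and a stand-in function replaces the browser global so the example runs anywhere.

```typescript
// Assumption: the Clarity tracking snippet is already installed and exposes a
// global `clarity` function. Tag names and values here are hypothetical.

type ClarityFn = (command: "set", key: string, value: string) => void;

interface SessionContext {
  prototype: string; // which interface is being viewed
  userType: string; // for example "volunteer" or "coordinator"
}

// Attach custom tags so sessions can later be filtered per prototype.
function tagSession(clarity: ClarityFn, context: SessionContext): void {
  clarity("set", "prototype", context.prototype);
  clarity("set", "user_type", context.userType);
}

// Stand-in for the browser global so the sketch runs outside a page.
const recorded: Array<[string, string]> = [];
const fakeClarity: ClarityFn = (_command, key, value) => {
  recorded.push([key, value]);
};

tagSession(fakeClarity, { prototype: "check-in-a", userType: "volunteer" });
console.log(recorded);
```

On a real page, the call would pass the global `clarity` function in place of the stand-in.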

Third: How do I create the experiences themselves?

By that point, Claude had become part of my daily workflow. I was already using it to think through product requirements and implementation details.

Because the rest of the team had their own responsibilities, I took ownership of the initial interface concepts myself. I pulled ideas from Nielsen Norman Group research, Medium articles, established UX patterns, and existing application templates.

Once Claude and I worked through the details, the concepts moved into design refinement using Claude Design.

I also pressure-tested the plan with ChatGPT. AI models can sometimes get locked into a particular line of thinking, and bringing in another model can completely reframe the problem. ChatGPT agreed that Claude’s initial plan and timeline were over-engineered, so we simplified.

I anchored the whole thing on AstroKit, a lightweight framework that Shawn Sandy, Chief Technologist at Say Hello Neighbor, built on top of Astro. Even while using AI heavily, I still rely on the judgment of subject matter experts before making final decisions.

Every tool ended up with a specific job.

The stack wasn’t assembled because the tools were popular.

It was assembled because each one solved a problem that stood between me and meaningful user feedback.

The full stack

Claude for PRD drafting, refinement, and code generation
ChatGPT for model triangulation and simplification
Claude Design for initial webpage concepts
AstroKit for the design system and framework
GitHub for version control
Netlify for hosting and deployment
Microsoft Clarity for behavioral analytics
VS Code for refinement and assembly

What it actually cost

Tool	Cost	Notes
Claude	$100/mo	Existing subscription already used in our workflow
Claude Design	Included	Generated initial webpage concepts
ChatGPT	$20/mo	Existing subscription already in use
GitHub	Included	Sponsored by Shawn Sandy
AstroKit	Free	Open framework built on Astro
Netlify	Free	Free tier covered deployment
Microsoft Clarity	Free	Free analytics platform
VS Code	Free	Editor
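The arithmetic behind the table can be checked with a short sketch. The figures are copied from the rows above. Treating "Included" and "Free" as $0 to the project, and flagging only Claude and ChatGPT as existing subscriptions, are assumptions of this example.

```typescript
// Figures copied from the cost table. Assumption: "Included" and "Free"
// both mean $0 charged to this project.

interface ToolCost {
  tool: string;
  monthlyUsd: number;
  existingSubscription: boolean; // true when it was already being paid for
}

const stack: ToolCost[] = [
  { tool: "Claude", monthlyUsd: 100, existingSubscription: true },
  { tool: "Claude Design", monthlyUsd: 0, existingSubscription: false },
  { tool: "ChatGPT", monthlyUsd: 20, existingSubscription: true },
  { tool: "GitHub", monthlyUsd: 0, existingSubscription: false },
  { tool: "AstroKit", monthlyUsd: 0, existingSubscription: false },
  { tool: "Netlify", monthlyUsd: 0, existingSubscription: false },
  { tool: "Microsoft Clarity", monthlyUsd: 0, existingSubscription: false },
  { tool: "VS Code", monthlyUsd: 0, existingSubscription: false },
];

// Total monthly cost of the whole stack.
const totalMonthly = stack.reduce((sum, t) => sum + t.monthlyUsd, 0);

// New spend counts only tools that were not already being paid for.
const newSpend = stack
  .filter((t) => !t.existingSubscription)
  .reduce((sum, t) => sum + t.monthlyUsd, 0);

console.log(`Total monthly: $${totalMonthly}`); // Total monthly: $120
console.log(`New spend: $${newSpend}`); // New spend: $0
```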

No additional software costs were introduced for this project.

Every paid tool was already part of the existing workflow, and the rest were free or sponsored.

That’s the part I want nonprofit leaders and PMs to hear clearly:

The barrier wasn’t budget.

The barrier was realizing a stack like this could support a real project from idea to deployment.

The constant question

Before I committed to the build, the constant question was:

Did I actually need to build this?

Honestly, if I’d had a dedicated UX researcher on the team, I could have justified purchasing Maze or one of the other UX research platforms.

But that wasn’t our reality.

While the application had been designed with support from a UI designer, we still needed to validate whether the interfaces and workflows were actually effective for volunteers in real-world use.

I understood some UX research methodologies, and I could interpret the reports and findings coming out of them.

But designing and executing effective UX experiments is an entire professional discipline I wasn’t formally trained in.

Most of the platforms were built for dedicated UX researchers and designers, and I knew we would only use a fraction of the functionality they offered.

At the same time, the soft launch was approaching quickly, and I needed validation now.

I needed volunteers to see the interfaces, click through workflows, react naturally, and tell us where friction existed.

That would give me actionable insight faster than spending weeks learning tools we weren’t operationally prepared to maximize.

Those instincts turned out to be more accurate than I realized.

The build

It wasn’t effortless.

The push to get Playground, the testing environment for the prototypes, live was intense.

I wasn’t writing code directly, but I spent hours reviewing changes, testing localhost development servers, catching ideas I had missed, and removing features that drifted beyond scope.

Even with highly specific artifacts and requirements drafted ahead of time, I still didn’t fully trust Claude inside VS Code to execute the complete MVP cleanly.

My instincts weren’t wrong.

I constantly had to reel Claude’s ambitions back in.

The AI often wanted to expand the solution beyond what the moment required. Sometimes the suggestions were genuinely strong. Other times, they introduced unnecessary complexity that would have hurt velocity and adoption.

At one point I realized the biggest challenge wasn’t generating ideas.

It was protecting the scope.

Still, the process was worth it.

Four days later, Playground was live, well ahead of the timelines both Claude and ChatGPT initially suggested.

It took multiple 10-hour days of iteration, refinement, and repeatedly simplifying the experience until it felt right.

After taking a short break to reset, I came back on the fourth day to finalize the remaining details.

By then, Playground was ready.

Ready to test every UI planned for the soft launch.

What I actually built

By the time Playground went live, I felt I had created something that solved a genuine problem.

Not a replacement for UX research.

A bridge.

A way for a small team to gather meaningful feedback before committing to a direction.

Most importantly, it gave us a chance to protect the trust we’d already earned from volunteers.

Four days after asking whether building our own solution was possible, Playground was live.

Not partially built.

Not sitting in a backlog.

Live.

It took multiple 10-hour days, constant review, scope management, testing, refinement, and hundreds of decisions about what belonged in the product and what didn’t.

AI accelerated the build.

It didn’t eliminate the judgment.

Playground was the venue. But a venue is only as useful as what plays in it.

Eight prototypes were already waiting. Eight new app designs, ready to face real volunteers.

Building those prototypes had taken a month, a designer, and a different kind of work entirely.

That’s the story of Part 3.