Building an AI app: when design goes too far šŸ˜… (+ Choosing an IDE)

Exploring different approaches to AI-assisted programming and the challenges of over-engineering


Here’s the continuation of my experiment building an app with AI tools.

Last week I showed you how I went from an idea to initial requirements (a product requirements document, or PRD for short) for the ā€œMind Fitness Nowā€ app. I promised to continue with the practical part — and here it is!

But let me remind you, this isn’t just another app. Even though I’m building it for demonstration purposes, I wanted to create something truly useful. Something that will make users better at being aware of their thoughts — because that’s where everything starts, right? šŸ˜‰

To quickly summarize: thoughts lead to actions and shape how we feel. If we can be conscious of our thoughts most of the time, we’ve almost won. šŸ’Ŗ If they wander, we just redirect them where we want.

The ā€œJust One More Featureā€ Loop

This was the most fascinating part during the PRD preparation: several times I caught myself in a loop of wanting to add features that seemed essential right from the start. For instance, I listened to two YouTube videos about self-development, got some great ideas, and immediately rushed to Gemini, with whom I was creating the PRD. I continued the conversation with something like: ā€œYou know what, it would really benefit the user if we added this and thatā€¦ā€

And what happened? Gemini (whom I initially prompted with: Act as the best psychologist and game designer) started ā€œthinking.ā€ He explained the advantages and disadvantages of my suggestion, and concluded: ā€œSounds good, but start with the minimum.ā€ This happened several more times! His persistence in keeping me focused on the MVP (Minimum Viable Product) convinced me. I said to myself, okay, I’ll trust him. Let’s give him a chance šŸ™‚

So, I have a PRD document with clear requirements.

Choosing a Development Environment (IDE)

Great, the plan is ready. Now I need an environment where I can ā€œinputā€ this plan and get an actual mobile application out. As I mentioned in previous newsletters, for this we need an IDE (Integrated Development Environment) or a code editor that often comes with a built-in AI assistant nowadays.

The most popular options for working with AI today are Cursor, Windsurf, and VS Code with an AI extension (such as Cline or Roo Code).

For an easier start, I recommend Cursor or Windsurf. Both have a free trial month, so you can see which one suits you better; between the two, you’re covered for two months. The main advantage is that you don’t need additional setup for AI. Later, if you’re interested in cost optimization, I can show you how to set up access through VS Code.

Installation instructions: download the installer from each tool’s official website and follow the setup wizard.
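If you’re on macOS and use Homebrew, you can also install them from the terminal. A minimal sketch, assuming the `cursor` and `windsurf` cask names exist in your Homebrew version (verify with `brew search` first, or just use the official installers):

```bash
# Cask name assumed; check with: brew search cursor
brew install --cask cursor

# Cask name assumed; check with: brew search windsurf
brew install --cask windsurf
```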

PS: If you get stuck, don’t forget about ChatGPT or YouTube guides — they’re always helpful!

Techniques for Working with AI Programmers

Once we have the tool (IDE with AI), we need a good workflow. How do we give tasks to the AI assistant to minimize errors and produce high-quality code?

There are several approaches; I tried two that are interesting for higher-quality programming with AI: Claude Task Master and Cursor Memory Bank.

(By the way, Cline and Roo Code have similar functionality, like Boomerang Tasks, already built in.)

Let’s first look at Claude Task Master.

What is Claude Task Master?

The idea is that the system automatically creates all the necessary tasks from your requirements. If you wish, you can run a complexity analysis and, for the more difficult tasks, generate smaller, more manageable subtasks.
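In command form, the core loop looks roughly like this (the file name is a placeholder; the detailed, step-by-step version follows below):

```bash
task-master parse-prd my_prd.txt    # turn the PRD into a task list
task-master analyze-complexity      # optional: score how hard each task is
task-master expand --id=5 --num=6   # split a difficult task into subtasks
task-master list --with-subtasks    # review the full plan
```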

My Process with Task Master:

(If you’re not interested in technical details, you can jump straight to the results below. Let me tell you that things didn’t go as expected.)

  1. Installing the tool (globally): `npm install -g task-master-ai`

  2. Initializing in the project: `task-master init`

  3. Configuration (a sketch of the finished .env follows after this list):

    • Rename .env.example to .env
    • Add your ANTHROPIC_API_KEY (for the Claude model)
    • Add your PERPLEXITY_API_KEY (for the research function; optional)

  4. Adding the PRD:

    • I added my PRD document to the project folder.

  5. PRD analysis and task creation: `task-master parse-prd your_prd_filename.txt`

  6. Listing the basic tasks: `task-master list`

  7. Complexity analysis (optional): `task-master analyze-complexity`, then `task-master complexity-report`

  8. **Breaking tasks into subtasks (with research):** I used Perplexity to help define detailed steps for each task. For example, for the task with ID 5, where I want 6 subtasks:

    `task-master expand --id=5 --num=6 --research --prompt="Break this task into detailed steps for implementation."`

    *(Of course, you replace `--id`, `--num`, and `--prompt` with your own values.)*

  9. Final list of tasks with subtasks: `task-master list --with-subtasks`
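To make step 3 concrete, here’s a minimal sketch of what the finished .env might look like (these key names are the ones from my setup; the values are placeholders, not real keys):

```bash
# .env in the project root (renamed from .env.example)
ANTHROPIC_API_KEY=sk-ant-your-key-here    # required: lets Task Master call the Claude model
PERPLEXITY_API_KEY=pplx-your-key-here     # optional: enables the --research flag
```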

Results of My Task Master Experience

On paper it looks great, right? Detailed instructions for the AI assistant. But the result for me… wasn’t optimal.

Why?

To give the AI programmer a very clear goal, I thoroughly researched each task and divided it into micro-subtasks. Generally, the rule is: clearer instructions -> better results. But in my case…

…there was ā€œover-engineeringā€ of the plan. Too much detailed planning.

The AI did follow the instructions, but the entire process became too rigid and slow. Instead of leveraging the AI’s ability to understand broader context, I limited it to executing very small, specific steps. This took a lot of preparation time, and the end result wasn’t necessarily better than if I had given it slightly more general instructions.
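To illustrate the difference (a hypothetical sketch, not my actual tasks), compare how much room these two styles of expansion leave the AI:

```bash
# Over-engineered: force a task into many micro-subtasks, so the
# assistant ends up executing tiny steps with no room to plan
task-master expand --id=5 --num=12 --research \
  --prompt="Break this into the smallest possible implementation steps."

# Looser alternative: fewer, broader subtasks that leave the assistant
# space to apply its understanding of the wider context
task-master expand --id=5 --num=3 \
  --prompt="Outline the main implementation steps and keep them high-level."
```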

Here’s an example of one task and how Task Master (with Perplexity’s help) broke it down into detailed subtasks:

EXAMPLE OF COMPLEX SUBTASKS
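(The linked breakdown is long, so here’s just an illustrative sketch of the shape of one expanded task in the generated tasks.json. The field names reflect my run, the exact schema may differ between versions, and the content below is invented for illustration:)

```json
{
  "id": 5,
  "title": "Implement the thought-logging screen",
  "description": "Screen where the user records and reviews thoughts",
  "status": "pending",
  "dependencies": [2, 3],
  "priority": "high",
  "subtasks": [
    { "id": 1, "title": "Create the screen layout", "status": "pending" },
    { "id": 2, "title": "Wire up local storage for entries", "status": "pending" },
    { "id": 3, "title": "Add validation and error states", "status": "pending" }
  ]
}
```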

I spent 34 cents on the Perplexity API to create subtasks.

For the final execution of all tasks (with all subtasks), 137 calls were made in Cursor using the Claude 3.7 Sonnet model.

Before starting

At the end

262 files and 44k lines of code were created 🫣

The experiment took approximately 6 hours.

Conclusion

The more code there is, the more room there is for errors, even if you’re running tests along the way.

Next time I’ll show you the other approach (Cursor Memory Bank) and how I took the best from each and simplified the process: far fewer calls, better results, and all achieved in less than an hour.

To ā€œtease you a bit,ā€ here’s the first result of four different tests, which I’ll describe next time:

First result of 4 different models/workflows

Interested in the final result? Let’s stay in touch.

Talk to you soon, Primož