Treat Your AI Assistant Like an Overconfident Junior Developer
From finishing your sentences in emails to finishing entire blocks of code, AI has come a long way. It’s like having a hyper-eager junior developer on your team – fast, capable, and sometimes overconfident.
But speed isn’t everything. These tools still need guidance, context, and careful oversight.
In this article, Birgitta Böckeler (Distinguished Engineer, Thoughtworks) shares practical strategies for using AI responsibly, helping developers harness its power without sacrificing quality or maintainability.
Clean code makes AI shine
In their early days, tools like GitHub Copilot mostly acted as advanced autocomplete assistants, predicting the next few lines of code. Today, AI has leveled up to agents that can tackle multi-step tasks – refactoring files, running tests, or even updating entire repositories.
AI agents can now fix failing tests, optimize dependencies, and even propose small architecture tweaks. Still, as Birgitta points out, these time-saving powers come with their own set of headaches:
Developers now need to give clearer context, define their goals more precisely, and double-check AI outputs with extra care.
Because these systems lack persistent memory, developers keep session notes or hand-offs to track project state. Birgitta Böckeler notes that AI assistants work best in modular, well-structured codebases where context and dependencies are clear.
In contrast, legacy or tightly entangled systems often cause the AI to misinterpret relationships or overlook hidden dependencies. As a result, productivity gains depend heavily on the codebase and the context in which the tools are used.
Claims of 80% faster development rarely hold up in practice. AI speeds up small, well-defined tasks, but architectural decisions, integrations, and testing still require human expertise.
AI can produce code fast, but it needs human oversight
Böckeler also addressed the growing gap between the hype surrounding AI and what it can actually do.
Many online demonstrations show AI building games or applications in mere minutes, but these impressive-looking outputs often exaggerate reality. In most cases, they produce only basic scaffolding or boilerplate code rather than fully functional, production-ready solutions, reminding developers that human oversight and refinement are still essential.
The quality of AI-generated code still depends on professional oversight, since trade-offs, compatibility concerns, and maintainability are inherently contextual and beyond the AI’s current reasoning capacity.
For example, an AI might correctly adjust a memory limit when a process fails, but it can miss deeper dependency conflicts. It may also merge methods incorrectly if compatibility rules are unclear or generate rigid test cases that complicate debugging instead of simplifying it.
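To illustrate that last point, here is a minimal, hypothetical sketch in Python (the format_price helper and the tests are invented for this article, not taken from Böckeler's talk). The first test pins one exact output string, the kind of rigid assertion an assistant often produces; the two that follow each check a single behaviour, so a failure points at a specific property instead of a frozen string:

```python
# Hypothetical example: a small formatting helper and two styles of tests for it.

def format_price(amount: float, currency: str = "USD") -> str:
    """Format an amount as a price string, e.g. '$1,234.50'."""
    symbols = {"USD": "$", "EUR": "€"}
    return f"{symbols.get(currency, currency)}{amount:,.2f}"


# Rigid, assistant-style test: it freezes one exact output string.
# Any harmless change (a new currency, a different separator) breaks it,
# and the failure message says little about which behaviour regressed.
def test_format_price_exact_string():
    assert format_price(1234.5) == "$1,234.50"


# Behaviour-focused tests: each checks one property, so a failure
# points directly at what went wrong.
def test_format_price_uses_currency_symbol():
    assert format_price(10, "EUR").startswith("€")


def test_format_price_rounds_to_two_decimals():
    assert format_price(10).endswith(".00")
```

The rigid version is not wrong, but when it fails, a developer has to re-derive what the intended behaviour was; the behaviour-focused versions document it directly.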
Don’t blindly trust AI-generated code
To help developers navigate these realities, Böckeler proposed a useful mental model:
AI assistants should be treated like junior developers. They are fast, capable, and eager to help, but they can also be overconfident and prone to mistakes.
Understanding their limits is key: as with mentoring a new team member, trust must be earned and remain context-dependent. Blindly accepting AI-generated code can lead to subtle bugs and long-term maintainability issues.

Drawing from her own experience, Böckeler emphasized several recurring pitfalls:
- Superficial fixes: AI often suggests quick solutions that don’t address deeper architectural problems.
- Problematic test cases: Generated tests can be too brittle or too vague, sometimes requiring as much debugging as the original code.
- Reinforcing poor design: In messy or poorly structured systems, AI may perpetuate suboptimal design choices, increasing future maintenance costs.
- Increased code churn: Studies show that AI-generated commits are more likely to be reworked, often within weeks of being merged.
- Unexpected debugging effort: Developers frequently spend more time fixing AI outputs than they initially anticipated, underscoring the need for careful oversight.
This is why Böckeler recommends a proactive, disciplined approach: AI-generated code should never be accepted at face value; it should be reviewed and thoroughly tested. Checkpoints and version control make it easy to roll back unwanted changes, and breaking complex tasks into smaller steps improves AI accuracy.
At the team level, quality control should remain a shared responsibility – automated tests and pull requests aren’t enough.
Monitoring quality metrics and integrating AI gradually helps prevent long-term risks to maintainability and security. Above all, expectations must remain realistic: AI cannot guarantee fixed productivity gains or eliminate the need for experienced developers.
The key lies in responsible use
In closing, Böckeler said that AI coding tools have become a permanent fixture in software development. They are robust, adaptable, and increasingly embedded in professional workflows, but their actual value depends on how responsibly they are used.
Developers must learn not only how to operate these tools, but also how to supervise, evaluate, and sustainably integrate them.
The challenge ahead lies not in automation itself, but in ensuring that it enhances productivity without compromising quality, maintainability, or team cohesion.