AI Coding Agents Will Fail — Unless We Learn To Guide Them

What Solution Partners should know before recommending “autonomous” dev to clients

TL;DR
Gen‑AI coding agents are force multipliers: they boost clarity or chaos, depending on what you feed them. This post maps the common failure modes, shares a lightweight framework you can implement in Confluence today, and shows how an AI‑powered add‑on can automate the heavy lifting.

The Prototype That Made Me Feel Like a Fraud

Pairing ChatGPT with a code‑generation plug‑in, I shipped a clickable demo in four hours. This used to cost two sprints, six stand‑ups, and dozens of Slack pings.
 

Magic… until the next morning, when I was:

  • Debugging undefined states
  • Explaining edge cases one prompt at a time
  • Chasing down implicit assumptions I never documented

Lesson #1: If you give sloppy requirements to a human, you lose days. Give them to a swarm of LLM agents, and you lose days at machine speed.

 

Where “Autonomous Dev” Breaks Down

Well‑known agents (Devin, Lovable, V0) promise self‑driving software. Reality bites on three fronts:

  • Undefined goal: the agent optimises for the wrong outcome, then proudly ships it.

  • Scope creep: every clarifying prompt shifts the target; scope expands, complexity grows.

  • Missing validation: “works for me” hides regressions that explode in prod.

So what do AI coding agents need from us to succeed?

The same things junior engineers do: 

✅ Clear goals

✅ Well-scoped tasks

✅ Thorough validation

And just like junior engineers, when they don’t have this context, you get code that technically “works”… but breaks the moment you scale, tweak the scope, or hit an edge case.

 

With my interns, I solved this manually. After one of them spent days implementing entirely the wrong functionality, I established guardrails to prevent a repeat:

📍 Daily meetings and detailed documents shared the context
📍 Interns were encouraged to ask questions whenever they were unsure
📍 Small, easy-to-review PRs surfaced gaps early

From that point on, I was confident my interns were always on the right track.

But it doesn’t scale.
You can mentor 2 interns.
You can’t mentor 200 AI agents the same way.

 

Embedding Context Inside Confluence (Framework)

Step 1 — Headline Test
“We’ll know this works when metric X moves from A → B.”

Step 2 — Reveal Context
Ask the agent to list information gaps; route unanswered ones to the PM/Tech Lead.

Step 3 — Failure Modes
Write three ways it could break. Prompt the LLM for two more.

Step 4 — Self‑Review
Before a PR, the agent answers: “Which assumption am I least sure about, and what test would prove it wrong?”
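If it helps, here is one way the four steps could be condensed into a single paste-able block. The headings, placeholders, and prompt wording below are illustrative, not a canonical template:

```
== Headline Test ==
We’ll know this works when <metric X> moves from <A> to <B>.

== Reveal Context ==
Agent prompt: “List the information gaps in this page.”
Unanswered gaps are routed to: <PM / Tech Lead>

== Failure Modes ==
1. <way it could break #1>
2. <way it could break #2>
3. <way it could break #3>
Agent prompt: “List two more ways this could break.”

== Self-Review (before every PR) ==
Which assumption am I least sure about,
and what test would prove it wrong?
```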

 

Partners can paste this block into any Confluence template today. One healthcare company cut implementation time by 52% simply by asking the right questions.

 

Automating the Heavy Lifting (Optional Add‑On)

If you’d rather not police templates manually, an AI coach inside Confluence can:

  • Interrogate new pages for missing goals, scope or validation checkpoints

  • Flag untested assumptions and suggest automated checks

  • Package the resolved context for engineers and coding agents—no extra Slack threads
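To make the first bullet concrete, here is a minimal sketch of an automated check for missing goals, scope, and validation. It assumes the page body has already been fetched as plain text (for example via the Confluence REST API); the section names and keyword patterns are illustrative placeholders, not any product’s actual implementation:

```python
# Hypothetical "context coach" check: flag framework sections a page is missing.
# Patterns mirror the 4-step framework above and are illustrative only.
import re

REQUIRED_SECTIONS = {
    "goal": r"we'?ll know this works when",    # Step 1: Headline Test
    "context": r"information gaps?",           # Step 2: Reveal Context
    "failure modes": r"ways? it could break",  # Step 3: Failure Modes
    "self-review": r"least sure about",        # Step 4: Self-Review
}

def audit_page(body: str) -> list[str]:
    """Return the names of framework sections the page body is missing."""
    text = body.lower()
    return [name for name, pattern in REQUIRED_SECTIONS.items()
            if not re.search(pattern, text)]

page = """
We'll know this works when signup conversion moves from 3% to 5%.
Known information gaps: none yet.
"""
print(audit_page(page))  # → ['failure modes', 'self-review']
```

A real coach would post these gaps back to the page as comments or route them to the PM, but the core loop is exactly this: scan, flag, escalate.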

Wisary is one example. It doesn’t write code; it creates clarity, making any agent, script, or human contributor far more predictable.

 

Before vs. After Fixing the Thinking Layer

  • Before: the POC ships fast, then dies in production. After: the POC evolves into stable production code.

  • Before: every standup surfaces more new tasks than it closes. After: 80% of ambiguity is resolved upfront, reducing scope creep.

  • Before: trust erodes with every rollback. After: a context-driven loop means faster cycles and higher trust.

Next Steps for Solution Partners

  1. Add the 4‑step framework to a test space in your internal Confluence.

  2. Run one upcoming ticket through it and measure clarifying‑question count.

  3. Decide whether manual templates or an automated coach make more sense for your clients.

Curious to see a live demo of the automated approach—or want to share your own experiments? Drop a comment below. Always happy to trade notes.

 

Closing Thought

AI agents amplify whatever you already have. If that’s clarity, you’ll ship faster than ever. If it’s confusion, you’ll reach chaos sooner. The thinking layer is ours to fix.


About the author

Ala Stolpnik is a former Google engineering leader turned founder, whose team builds Wisary, an AI‑powered Confluence app that helps product and engineering teams think clearly at scale.
