Keeping AI Coding Agents in Their Lane with Test Driven Development (TDD)

AI coding agents can generate code in seconds. But without guardrails, they often produce brittle “AI-slop”. In this post I show how a simple TDD rule can guide an agent to produce higher-quality code.

AI Agents are Everywhere

On the AI timescale, my first post about code generation with ChatGPT is now ancient history. We’ve evolved at breakneck speed. In 2026, we ask agents to plan, write, test and refactor entire products. Every major coding tool (e.g. Claude Code, Cursor, Antigravity) ships with an agent mode that supports planning as well as semi-autonomous and fully autonomous task execution.

Like with Human Developers, Guidance is Key

As you would expect, these agents are still stochastic systems. You might have seen that they are often eager to generate loads of code, occasionally get stuck in loops and sometimes overengineer the solution. With rules, commands, skills and workflows, you can guide these agents to work within your guardrails, guidelines and preferences. Rules, in particular, are non-negotiable, hardened instructions that an agent must follow, making them ideal for enforcing Test Driven Development (TDD).

Let’s Try Agentic TDD

These days, I alternate between Claude Code, Antigravity and Cursor depending on whether I’m working on work or personal projects. The good news is that agent instructions such as rules are largely portable across these tools. For this exercise, I’m using Antigravity. I dislike the usual order-management or todo-app examples. Instead, I’m going to extend an existing piece of functionality using agentic TDD. Let’s assume we have a simple PiiDetector Java class that can detect PII (Personally Identifiable Information) in a given text. Claude Opus 4.6 generated this code for me.

Extend existing functionality

package dev.laksitha;

import java.util.List;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PiiDetector {

    // Regex for 13 to 19 digits with optional spaces or hyphens in between
    private static final Pattern BANK_CARD_PATTERN = Pattern.compile("\\b\\d(?:[\\s-]?\\d){12,18}\\b");

    // Regex for UK phones: e.g., +44 7911 123456 or 020 7946 0991
    private static final Pattern UK_PHONE_PATTERN = Pattern.compile("(?:\\+44|0)[\\s-]?\\d{2,4}[\\s-]?\\d{3,4}[\\s-]?\\d{3,4}");

    // Regex for Names: First Name and Last Name (optionally hyphenated)
    private static final Pattern NAME_PATTERN = Pattern.compile("\\b[A-Z][a-z]+(?:[\\s-]+[A-Z][a-z]+)+\\b");

    public List<PiiMatch> detect(String text) {
        List<PiiMatch> matches = new ArrayList<>();

        Matcher cardMatcher = BANK_CARD_PATTERN.matcher(text);
        while (cardMatcher.find()) {
            String value = cardMatcher.group();
            if (isLuhnValid(value)) {
                matches.add(new PiiMatch(PiiType.BANK_CARD, value, cardMatcher.start(), cardMatcher.end()));
            }
        }

        Matcher phoneMatcher = UK_PHONE_PATTERN.matcher(text);
        while (phoneMatcher.find()) {
            matches.add(new PiiMatch(PiiType.UK_PHONE, phoneMatcher.group(), phoneMatcher.start(), phoneMatcher.end()));
        }

        Matcher nameMatcher = NAME_PATTERN.matcher(text);
        while (nameMatcher.find()) {
            matches.add(new PiiMatch(PiiType.NAME, nameMatcher.group(), nameMatcher.start(), nameMatcher.end()));
        }

        return matches;
    }

    private boolean isLuhnValid(String num) {
        String cleanNum = num.replaceAll("[^0-9]", "");
        if (cleanNum.length() < 13 || cleanNum.length() > 19) return false;
        int sum = 0;
        boolean alternate = false;
        for (int i = cleanNum.length() - 1; i >= 0; i--) {
            int n = cleanNum.charAt(i) - '0';
            if (alternate) {
                n *= 2;
                if (n > 9) {
                    n = (n % 10) + 1;
                }
            }
            sum += n;
            alternate = !alternate;
        }
        return (sum % 10 == 0);
    }
}
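
The listing above depends on two types, PiiMatch and PiiType, that are not shown in this post. As a rough sketch only, they could look like the following. The record form (Java 16+), the field names and the field order are assumptions, chosen to line up with the constructor calls in detect and with the accessor calls used later in the post.

```java
// Hypothetical supporting types: the post does not show them, so the
// names and field order below are inferred, not the original source.
enum PiiType { BANK_CARD, UK_PHONE, NAME }

// A record (Java 16+) generates the type(), value(), startIndex()
// and endIndex() accessors automatically.
record PiiMatch(PiiType type, String value, int startIndex, int endIndex) { }

class PiiMatchDemo {
    public static void main(String[] args) {
        PiiMatch match = new PiiMatch(PiiType.NAME, "John Doe", 0, 8);
        // Build a placeholder from the enum constant name.
        System.out.println("[" + match.type().name() + "] covers "
                + match.startIndex() + ".." + match.endIndex());
    }
}
```

A record keeps the match immutable, which is convenient when the same list of matches is passed between a detector and other components.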

A Reusable and Practical TDD Rule

Now let’s try to redact the detected PII. Frankly, you could just vibe-code or one-shot this entire functionality, including the tests. It’s almost effortless with current AI agents.

However, that’s where the danger lies: you can easily cross the fine line between “AI-slop” and “AI-empowered engineering”.

Let’s pause and think about what a professional software engineer would do. Before jumping into code, the most important thing they’d do is problem discovery and planning the solution. I usually sketch or whiteboard my approach before writing even a single line of code.

TDD formalises this kind of discipline. It is a learned and practised workflow. In my experience, when an AI agent is guided to follow TDD, the quality of the generated code improves noticeably.

An effective way to do this is by pointing the agent to a rule that enforces the methodology. A rule is a concise, reusable set of instructions that the agent applies consistently.

I created the following.

# TDD Rule
Follow this disciplined Test-Driven Development (TDD) cycle to ensure code reliability and design clarity.
---
## 1. Plan: The Test List
Before writing any code, create a list of the specific behaviours and scenarios you want to cover.
* **Action:** Write these as a `TODO` list in a comment or a dedicated file.
* **Constraint:** Focus solely on **outcomes**. Do not mix implementation design decisions into this list.
## 2. Red: Write Exactly One Test
Select the next item from your list and turn it into a concrete, runnable test.
* **Requirement:** The test must have specific assertions. Never write "empty" tests just for coverage.
* **Constraint:** Do not convert the entire list into tests at once. Focus exclusively on one failing test.
* **Verification:** Run the test to confirm it fails as expected.
## 3. Green: Make it Pass
Write the minimal amount of code necessary to make the current test—and all previous tests—pass.
* **Rule of Discovery:** If you discover the need for a *new* test during this phase, add it to your Test List; do not write it yet.
* **Prohibited:**
    * Do **not** delete assertions to "force" a pass.
    * Do **not** copy-paste computed values from failure output into your test's expected results.
    * Do **not** refactor during this phase.
## 4. Refactor: Clean the Design
Once the test is green, optionally improve the implementation design.
* **Action:** Remove duplication and increase clarity.
* **Constraint:** Do not abstract too soon. Only refactor what is necessary for the current session.
## 5. Repeat
Check the Test List. If items remain, return to **Step 2**. If the list is empty, the task is complete.

You’ll notice that this rule is not purely textbook TDD. In addition to the standard TDD steps, I added a planning step to ensure the agent’s plan is more structured and easier to follow. Once such rules are polished, they can be reused across multiple projects and teams. Let’s try it out.
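
To make steps 1 to 3 of the rule concrete, here is a rough sketch of a single cycle. It is written in plain Java without a test framework so that it runs on its own. The class RedGreenSketch, the helper redactsBankCard and both lambdas are hypothetical and only illustrate the shape of the cycle; the regex is borrowed from the PiiDetector listing earlier in the post.

```java
import java.util.function.UnaryOperator;

// Step 1 (Plan): the test list, expressed as outcomes only.
// TODO redacts a bank card number -> [BANK_CARD]
// TODO redacts a UK phone number  -> [UK_PHONE]
// TODO redacts a name             -> [NAME]
// TODO leaves text without PII unchanged
class RedGreenSketch {

    // Step 2 (Red): exactly one test with a specific assertion.
    static boolean redactsBankCard(UnaryOperator<String> redact) {
        String result = redact.apply("My card is 4111 1111 1111 1111 ok");
        return "My card is [BANK_CARD] ok".equals(result);
    }

    public static void main(String[] args) {
        // No implementation yet: a stub returns the input unchanged,
        // so the test fails as expected.
        UnaryOperator<String> stub = text -> text;
        System.out.println("stub passes: " + redactsBankCard(stub));

        // Step 3 (Green): just enough code to satisfy this one test.
        UnaryOperator<String> minimal =
                text -> text.replaceAll("\\b\\d(?:[\\s-]?\\d){12,18}\\b", "[BANK_CARD]");
        System.out.println("minimal passes: " + redactsBankCard(minimal));
    }
}
```

Only after this test is green would the next item on the list be turned into a test.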

Stop! Plan It First

I asked Antigravity to create a plan for implementing the PII redaction functionality using TDD. This is what I got.

Figure: Antigravity Implementation Plan

I reviewed the plan and asked the agent to follow the Single Responsibility Principle (SRP) in the implementation. In this agentic workflow, the highest-leverage activity is reviewing and shaping the plan, not typing the implementation.

Run the TDD Cycle

It’s time to write some code. Let’s see how the agent writes the first test.

Figure: Start Writing First Test Case

Figure: First Test Case is Red

Figure: First Test Case is Green

The agent struggled with the terminal environment and couldn’t run the tests. I stepped in to execute the commands manually and let the agent continue the cycle. Remember, the agent is my friendly teammate and I’m the human in the loop.

Figure: Agent Struggled with Terminal

All tests went green and the plan was completed.

Figure: Antigravity Walkthrough

Let’s look at the generated code.

package dev.laksitha;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

class PiiRedactorTest {

    private final PiiRedactor redactor = new PiiRedactor(new PiiDetector());

    @Test
    void testRedactBankCard() {
        String input = "My card is 4111 1111 1111 1111 ok";
        String result = redactor.redact(input);
        assertEquals("My card is [BANK_CARD] ok", result);
    }

    @Test
    void testRedactUkPhone() {
        String input = "Call +44 7911 123456 please";
        String result = redactor.redact(input);
        assertEquals("Call [UK_PHONE] please", result);
    }

    @Test
    void testRedactName() {
        String input = "Please reach out to John Doe for details";
        String result = redactor.redact(input);
        assertEquals("Please reach out to [NAME] for details", result);
    }

    @Test
    void testRedactMixedPii() {
        String input = "John Doe card 4111 1111 1111 1111 phone +44 7911 123456";
        String result = redactor.redact(input);
        assertEquals("[NAME] card [BANK_CARD] phone [UK_PHONE]", result);
    }

    @Test
    void testRedactNoPii() {
        String input = "No sensitive data here.";
        String result = redactor.redact(input);
        assertEquals("No sensitive data here.", result);
    }
}

And the production code:

package dev.laksitha;

import java.util.Comparator;
import java.util.List;

/**
 * Redacts PII detected by a {@link PiiDetector} by replacing matched
 * text regions with type-specific placeholder masks (e.g. {@code [BANK_CARD]}).
 */
public class PiiRedactor {

    private final PiiDetector detector;

    public PiiRedactor(PiiDetector detector) {
        this.detector = detector;
    }

    /**
     * Returns a copy of {@code text} with all detected PII replaced
     * by {@code [TYPE_NAME]} placeholders.
     *
     * @param text the input text to redact
     * @return the redacted text
     */
    public String redact(String text) {
        List<PiiMatch> matches = detector.detect(text);
        // Sort descending by startIndex so replacements don't shift earlier indices.
        matches.sort(Comparator.comparingInt(PiiMatch::startIndex).reversed());
        StringBuilder sb = new StringBuilder(text);
        for (PiiMatch match : matches) {
            sb.replace(match.startIndex(), match.endIndex(),
                    "[" + match.type().name() + "]");
        }
        return sb.toString();
    }
}
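
The comment about sorting in descending order is worth a closer look. The sketch below isolates that one idea, assuming non-overlapping matches. Span is a hypothetical stand-in for PiiMatch, and applyReplacements is an illustrative helper, not part of the generated code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReplacementOrderSketch {

    // Hypothetical stand-in for PiiMatch: a label plus [start, end) offsets.
    record Span(String label, int start, int end) { }

    // Applies every replacement in the given order against one shared buffer.
    static String applyReplacements(String text, List<Span> spans, Comparator<Span> order) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(order);
        StringBuilder sb = new StringBuilder(text);
        for (Span span : sorted) {
            sb.replace(span.start(), span.end(), "[" + span.label() + "]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "John Doe phone +44 7911 123456";
        List<Span> spans = List.of(
                new Span("NAME", 0, 8),
                new Span("UK_PHONE", 15, 30));

        Comparator<Span> ascending = Comparator.comparingInt(Span::start);

        // Right to left: earlier offsets are still valid after each replacement.
        System.out.println(applyReplacements(text, spans, ascending.reversed()));

        // Left to right: the first replacement shortens the text, so the
        // second span's offsets are stale and part of the number leaks.
        System.out.println(applyReplacements(text, spans, ascending));
    }
}
```

Working from the end of the text backwards avoids having to track how much each replacement has shifted the remaining offsets.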

Key Takeaways

As you can see, the agent can now produce better-quality code in minutes. In essence, the workflow mirrors what experienced software engineers already do:

  • Define and apply clear guidelines like TDD
  • Spend time on problem discovery and planning
  • Write focused tests
  • Implement the minimal solution
  • Refactor and continue refining the design

In this model, the agent acts as a fast and properly onboarded coworker, while I remain responsible for shaping the plan and maintaining engineering discipline.

Good engineering practices don’t disappear in the age of AI agents; they become even more important.
