Chapter 7

Tests: Making Sure We're Getting What We Want

Tests as the ultimate communication tool

In This Chapter:

  • Using tests to define success before writing code
  • Treating tests as the ultimate communication tool for humans and AI
  • Letting AI generate comprehensive test scenarios
  • How testing transforms from verification into requirements definition

“I don’t need tests, my code works fine,” said every developer before their code spectacularly broke in production. We’ve all been there—convinced our latest masterpiece is flawless until reality delivers a humbling reminder about Murphy’s Law. But in our new AI-augmented world, tests aren’t just safety nets; they’re the secret to making sure your AI partner understands what you actually want.

When you’re writing code yourself, bugs happen because you misunderstood something or made a mistake. When AI is writing code for you, bugs happen because the AI misunderstood your intent—or because you failed to clearly define what success looks like. Tests become your primary way to communicate requirements, both to the AI and to your future self.

Think about it: when you ask an AI to “build a user authentication system,” you’re essentially handing it a blank canvas and saying “paint something nice.” But when you provide tests that specify “the system should verify passwords match the stored hash and prevent login after three failed attempts,” you’ve given it a precise blueprint. You’re speaking in code rather than English, and code is a language with far less ambiguity.

As your AI-human collaboration tackles software projects, you’ll discover tests aren’t just quality checks—they’re your primary tool for defining, communicating, and validating what you actually want. Let’s explore how testing changes in this new landscape, and why the old “test after coding” approach needs to be flipped on its head.

Tests First: Defining Success Before Writing a Single Line of Code

In traditional development, testing often feels like the vegetables on your dinner plate—important for health, but frequently pushed aside in favour of the dessert (writing new features). We’ve all been guilty of skipping the green stuff, either rationalising that we’ll “add tests later” or convincing ourselves that manual testing is “good enough for now.”

With AI in the mix, this attitude isn’t just questionable—it’s a recipe for disaster. When you’re working with a partner that can generate hundreds of lines of code in seconds, you need guardrails in place before you begin. Without clear success criteria, you’re essentially asking your AI colleague to build a bridge without telling it which river needs crossing.

Concept in Practice: Test-First Development with AI

Let’s see how test-first development transforms the collaboration with AI. Imagine you’re building a payment processing feature for an e-commerce application:

**You**: We need to implement a payment processing function that handles credit card payments. It should validate the card information, process the payment through our payment gateway, and return appropriate success or error responses.

**AI**: I'd be happy to help implement that payment processing function. Before we dive into the implementation, would you like to define some test cases to clarify the requirements and expected behavior?

**You**: Good idea. Let's write some test scenarios first to clarify what we need.

**AI**: Great approach! Here are some test scenarios we could start with:

**Scenario 1: Valid Payment Processing**
- Given: A valid credit card with sufficient funds
- When: The user makes a payment of $100 USD
- Then: The system should return a success response
- And: The response should include a transaction ID
- And: The payment status should be "completed"

**Scenario 2: Invalid Card Number**
- Given: A credit card with an invalid number
- When: The user attempts to make a payment
- Then: The system should return an error response
- And: The error code should be "invalid_card_number"
- And: The error message should indicate the card number is invalid

**Scenario 3: Expired Card**
- Given: A credit card that has expired
- When: The user attempts to make a payment
- Then: The system should return an error response
- And: The error code should be "expired_card"

Would these scenarios cover the main cases you want to handle? Should we add any others?

**You**: These look good. Let's add scenarios for:
1. Zero or negative payment amounts
2. Missing required fields
3. Payment gateway timeout
4. Successful payment but with pending status (for certain payment methods)

**AI**: Perfect, here are the additional test scenarios:

**Scenario 4: Zero or Negative Payment Amounts**
- Given: A valid credit card
- When: The user attempts to make a payment of $0.00
- Then: The system should return an error response
- And: The error code should be "invalid_amount"

- Given: A valid credit card
- When: The user attempts to make a payment of -$50.00
- Then: The system should return an error response
- And: The error code should be "invalid_amount"

**Scenario 5: Missing Required Fields**
- Given: A payment request with the card number field missing
- When: The payment is submitted
- Then: The system should return an error response
- And: The error code should be "missing_required_field"
- And: The error message should indicate which field is missing

**Scenario 6: Payment Gateway Timeout**
- Given: A valid payment request
- When: The payment gateway takes too long to respond
- Then: The system should return an error response
- And: The error code should be "gateway_timeout"
- And: The response should indicate the payment should be retried

**Scenario 7: Pending Payment Status**
- Given: A valid payment with a payment method that requires additional processing (like bank transfers)
- When: The payment is submitted
- Then: The system should return a success response
- And: The response should include a transaction ID
- And: The payment status should be "pending"

Now we have a comprehensive set of test scenarios covering various success and failure cases. Would you like me to implement the payment processing function based on these requirements?

This dialogue demonstrates how test-first development radically changes the AI collaboration. Notice several key aspects:

  1. The tests define precisely what “success” means for the payment processing function
  2. Edge cases (expired cards, missing fields, etc.) are identified before writing any implementation
  3. The AI immediately understands the expected behaviour and response format
  4. The conversation naturally identifies additional scenarios that might have been missed in a vague description

When you ask the AI to implement the payment processing function now, it has a clear specification to work from—not just a general description. The AI knows exactly what input parameters are expected, what validation to perform, what error codes to return, and how to structure the response.

This is a fundamental shift from the traditional “implement first, test later” approach. By defining success through tests before implementation begins, you provide much clearer guidance to your AI collaborator, resulting in code that’s more likely to meet your actual needs from the first attempt.
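Scenarios written in Given/When/Then form translate almost mechanically into executable tests. Here is a minimal pytest-style sketch of the first few payment scenarios, with a toy stand-in for the real gateway call. The function name `process_payment`, its arguments, and the error codes are illustrative assumptions, not a real payment API:

```python
# Toy stand-in for a real gateway call; validates inputs only.
# All names and error codes here are illustrative assumptions.
import uuid

def process_payment(card_number, amount, currency="USD"):
    if not card_number:
        return {"status": "error", "error_code": "missing_required_field"}
    if not card_number.isdigit() or len(card_number) != 16:
        return {"status": "error", "error_code": "invalid_card_number"}
    if amount <= 0:
        return {"status": "error", "error_code": "invalid_amount"}
    return {"status": "completed", "transaction_id": str(uuid.uuid4())}

def test_valid_payment_succeeds():
    # Scenario 1: valid card, $100 payment
    result = process_payment("4242424242424242", 100)
    assert result["status"] == "completed"
    assert "transaction_id" in result

def test_invalid_card_number_rejected():
    # Scenario 2: invalid card number
    result = process_payment("not-a-card", 100)
    assert result["error_code"] == "invalid_card_number"

def test_zero_amount_rejected():
    # Scenario 4: zero payment amount
    result = process_payment("4242424242424242", 0)
    assert result["error_code"] == "invalid_amount"
```

Notice that each test is a direct transcription of a scenario: the Given becomes the test inputs, the When becomes the function call, and the Then becomes the assertions.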

Tests as the Ultimate Communication Tool

“I thought you wanted it to work that way” might be the most expensive phrase in software development. Misunderstandings between what stakeholders want, what product managers specify, and what developers build cost the industry billions annually. The fundamental problem is language—humans are notoriously bad at precisely communicating complex requirements in natural language.

Tests solve this communication problem by replacing ambiguous descriptions with clear input-output expectations. They transform vague statements like “users should be able to find products easily” into precise scenarios: “When a user searches for ‘blue shoes’, the system should return results including blue shoes within 0.5 seconds.”

Illustrative Dialogue: Tests as Communication

In this example, we’ll see how tests serve as a communication tool between stakeholders with different perspectives:

**Product Manager**: We need to add a feature that allows premium users to access exclusive content.

**You**: I'll work on that with our AI assistant. To make sure we're aligned, let's define some test scenarios that would verify this feature works correctly.

**Product Manager**: Good idea. What would those look like?

**You**: Let's ask our AI to help us draft some test scenarios to clarify the premium content access feature.

**AI**: Here are some test scenarios that could help clarify the premium content access feature:

**Scenario 1: Premium User Access to Exclusive Content**
- Given: A user with an active premium subscription
- When: They attempt to access exclusive content
- Then: They should be granted access to view the content

**Scenario 2: Free User Access to Exclusive Content**
- Given: A user with a free subscription
- When: They attempt to access exclusive content
- Then: They should be denied access
- And: They should see a message about upgrading to premium

**Scenario 3: Free User Access to Regular Content**
- Given: A user with a free subscription
- When: They attempt to access non-exclusive content
- Then: They should be granted access to view the content

**Scenario 4: Premium Content in Premium User's Feed**
- Given: A user with an active premium subscription
- When: They view their content feed
- Then: Both exclusive and regular content should appear in their feed

**Scenario 5: Premium Content in Free User's Feed**
- Given: A user with a free subscription
- When: They view their content feed
- Then: Only regular content should appear in their feed
- And: No exclusive content should be visible

These scenarios clarify who can access what content and how content discovery works for different user types. Would these help clarify the requirements?

**Product Manager**: These are helpful, but they're raising some questions I hadn't considered. What happens when a premium user downgrades to free? Do they immediately lose access to exclusive content they were previously viewing?

**AI**: That's an excellent question! Let's add scenarios for subscription changes:

**Scenario 6: User Downgrades from Premium to Free**
- Given: A user with an active premium subscription who is currently accessing exclusive content
- When: They downgrade to a free subscription
- Then: They should immediately lose access to the exclusive content they were viewing
- And: They should be redirected to a page explaining the change

**Scenario 7: User Upgrades from Free to Premium**
- Given: A user with a free subscription who was previously denied access to exclusive content
- When: They upgrade to a premium subscription
- Then: They should immediately gain access to all exclusive content

This raises an important product decision: Should access be immediately revoked upon downgrade, or should there be a grace period? The scenarios as written assume immediate revocation, but we could modify them to reflect a different policy if preferred.

**Product Manager**: I think we should add a 24-hour grace period after downgrading. And let's also consider what happens with content that users have saved or bookmarked.

**You**: Great points. Let's add scenarios for those cases too.

This dialogue demonstrates how tests serve as a powerful communication tool:

  1. The tests transformed a vague feature request (“premium users access exclusive content”) into specific, testable behaviours
  2. The test scenarios exposed unstated assumptions and edge cases (what happens upon subscription changes)
  3. The conversation naturally evolved to address important product decisions (grace periods, saved content)
  4. All participants gained a clearer understanding of the feature’s implications

Tests serve as a shared language that bridges the gap between business requirements and technical implementation. They’re concrete enough to guide development but readable enough for non-technical stakeholders to understand and validate.

When working with AI, this shared language becomes even more valuable. The AI can help generate test scenarios that expose assumptions and edge cases, leading to more thoughtful product decisions before a single line of implementation code is written.
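Once the stakeholders agree on the scenarios, the access rules collapse into a small, testable function. The sketch below uses an assumed `User` dataclass and `can_access` helper; a real system would check subscription state in a database, but the test shape stays the same:

```python
# Illustrative access-control sketch; User and can_access are assumptions,
# not a real API.
from dataclasses import dataclass

@dataclass
class User:
    tier: str  # "premium" or "free"

def can_access(user, content_is_exclusive):
    # Premium users see everything; free users see only regular content.
    return user.tier == "premium" or not content_is_exclusive

def test_premium_user_sees_exclusive_content():   # Scenario 1
    assert can_access(User(tier="premium"), content_is_exclusive=True)

def test_free_user_denied_exclusive_content():    # Scenario 2
    assert not can_access(User(tier="free"), content_is_exclusive=True)

def test_free_user_sees_regular_content():        # Scenario 3
    assert can_access(User(tier="free"), content_is_exclusive=False)
```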

Letting AI Generate Comprehensive Test Cases

Now that we’ve established tests as our primary communication tool, let’s explore how AI can help create the most comprehensive set of test scenarios you’ve ever worked with—scenarios that might never have been considered if you had to identify every case manually.

Testing thoroughness often falls victim to human nature. We start with good intentions, defining the main scenarios and a few edge cases. Then fatigue sets in. Thinking of the twentieth variation of a similar test feels redundant. We convince ourselves we’ve covered the important cases. Six months later, a production bug emerges from that untested edge case we thought was too unlikely to matter.

Concept in Practice: AI-Generated Test Coverage

Let’s see how AI can help create comprehensive test coverage for a feature. In this example, we’re implementing a user registration system:

**You**: We need to implement a user registration system with email verification. I want to make sure we have comprehensive tests for this feature.

**AI**: I'd be happy to help create comprehensive test scenarios for a user registration system with email verification. Let's start by outlining the main categories of tests we should cover:

1. **Basic Registration Flow**
   - Successful registration with valid inputs
   - Email verification process
   - Login after verification

2. **Validation Tests**
   - Email format validation
   - Password strength requirements
   - Username requirements
   - Required fields

3. **Edge Cases**
   - Duplicate email/username attempts
   - Timing issues (verification link expiration)
   - Account state transitions

4. **Security Concerns**
   - Password storage
   - Rate limiting
   - Account enumeration prevention

Would you like me to generate detailed test scenarios for any specific category to start with?

**You**: Let's start with validation tests and edge cases. Those seem like areas where we could miss important scenarios.

**AI**: Great choice! Here are comprehensive test scenarios for validation and edge cases:

### Validation Tests

**Email Validation Scenarios:**

1. **Valid Email Formats**
   - Standard format: user@example.com
   - Subdomain format: user@subdomain.example.com
   - With numbers: user123@example.com
   - With special characters: first.last@example.com, first-last@example.com
   - With plus addressing: user+tag@example.com

2. **Invalid Email Formats**
   - Empty email
   - Missing @ symbol: userexample.com
   - Missing username: @example.com
   - Missing domain: user@
   - Missing TLD: user@domain
   - Double dots: user@domain..com
   - Invalid characters: user*name@example.com

**Password Strength Scenarios:**

1. **Invalid Passwords**
   - Empty password
   - Too short (less than 8 characters)
   - No uppercase letters
   - No lowercase letters
   - No numbers
   - No special characters
   - Common passwords (e.g., "password", "12345678")
   - Password matches username or email

2. **Valid Passwords**
   - Meets minimum length with mixed case, numbers, and special characters
   - Long passwords (32+ characters)
   - Passwords with spaces and unicode characters (if supported)

**Username Validation Scenarios:**

1. **Invalid Usernames**
   - Empty username
   - Too short (less than 3 characters)
   - Too long (more than 50 characters)
   - Contains spaces or invalid special characters
   - Reserved words or system usernames

2. **Valid Usernames**
   - Alphanumeric
   - With allowed special characters (e.g., underscores)
   - Mixed case

**Required Fields Scenarios:**

- Missing email
- Missing password
- Missing username
- Missing consent to terms (if required)

### Edge Cases

**Duplicate Registration Scenarios:**

1. **Attempt to register with existing email**
   - Attempt with same email, different username
   - Attempt with same email, same username
   - Attempt with email differing only by case (e.g., User@example.com vs user@example.com)

2. **Attempt to register with existing username**
   - Attempt with same username, different email
   - Attempt with username differing only by case (e.g., UserName vs username)

**Verification Link Scenarios:**

1. **Verification timing**
   - Using link immediately after registration
   - Using link near expiration time (e.g., 23 hours if 24-hour expiry)
   - Using link after expiration time
   - Using link after user has already verified

2. **Verification link integrity**
   - Using malformed verification token
   - Using valid token for wrong user
   - Attempting to verify deleted account

**Account State Transition Scenarios:**

1. **Before verification**
   - Attempt to log in before verification
   - Request password reset before verification
   - Delete account before verification

2. **Re-sending verification**
   - Request new verification email
   - Using old verification link after requesting new one
   - Requesting multiple verification emails in quick succession

Would you like me to elaborate on any of these scenarios? Or shall we continue with the security concerns category?

**You**: These are excellent. I hadn't thought about case sensitivity in emails and usernames, or what happens if someone requests multiple verification emails. Let's add the security concerns as well.
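The verification-timing scenarios above are a good candidate for tests with an injectable clock, so expiry can be exercised without waiting 24 hours. A sketch, assuming a hypothetical `VerificationToken` class with a 24-hour lifetime:

```python
from datetime import datetime, timedelta

class VerificationToken:
    """Illustrative token with a 24-hour expiry; not a real library class."""
    TTL = timedelta(hours=24)

    def __init__(self, issued_at):
        self.issued_at = issued_at

    def is_valid(self, now):
        # Passing `now` explicitly lets tests simulate any point in time.
        return now < self.issued_at + self.TTL

issued = datetime(2024, 1, 1, 12, 0)
token = VerificationToken(issued)

def test_token_valid_immediately():
    assert token.is_valid(issued)

def test_token_valid_near_expiry():       # "23 hours if 24-hour expiry"
    assert token.is_valid(issued + timedelta(hours=23))

def test_token_rejected_after_expiry():
    assert not token.is_valid(issued + timedelta(hours=25))
```

Designing the token to accept `now` as a parameter, rather than calling the system clock internally, is what makes the timing scenarios testable at all.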

This example demonstrates how AI can generate comprehensive test coverage that goes far beyond what most humans would think to include:

  1. The AI systematically categorises different types of tests (validation, edge cases, security)
  2. Within each category, it considers numerous variations that humans often overlook
  3. It identifies subtle edge cases like case sensitivity and timing issues
  4. The scenarios are presented in a clear, non-technical format that stakeholders can understand

This comprehensive approach discovers potential issues before they reach production. Instead of learning about edge cases through customer support tickets, you identify and handle them during development.

The key advantage is that AI never gets tired of generating test scenarios. While a human might stop after identifying 5-10 cases, AI can easily generate dozens of scenarios covering virtually every possible situation. This thoroughness translates directly to more robust, reliable software.
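Long scenario lists like the email-format cases above map naturally onto table-driven tests, where adding a newly discovered case is a one-line change. A sketch follows; the regex is a deliberately simple illustration, not a full RFC 5322 validator:

```python
# Table-driven sketch of the email-format scenarios. The validation rule
# is an assumption for illustration, not a production-grade validator.
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")

def is_valid_email(email):
    return bool(email) and ".." not in email and EMAIL_RE.match(email) is not None

VALID = ["user@example.com", "user@subdomain.example.com",
         "user123@example.com", "first.last@example.com",
         "user+tag@example.com"]

INVALID = ["", "userexample.com", "@example.com", "user@",
           "user@domain", "user@domain..com", "user*name@example.com"]

def test_valid_emails_accepted():
    for email in VALID:
        assert is_valid_email(email), email

def test_invalid_emails_rejected():
    for email in INVALID:
        assert not is_valid_email(email), email
```

When the AI suggests a case you had not considered, such as plus addressing or double dots, it simply becomes another row in the table.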

DEEPEN YOUR UNDERSTANDING

Creating comprehensive test scenarios requires thinking beyond the happy path. Consider a feature you’re currently working on—what edge cases might you have overlooked? How could tests serve as communication tools for your team? The Companion AI can help generate test scenarios for your specific project. Try: “Can you help me create test cases for my user authentication system following the patterns in Chapter 7?”

Conclusion

Throughout this chapter, we’ve explored how AI transforms the testing process from a verification activity into a primary communication and definition tool. The days of writing code first and testing later are behind us—in AI-augmented development, tests become the blueprint that guides implementation.

The story of testing has always been one of trade-offs between thoroughness and effort. Despite knowing that better testing produces better code, teams often shortchange it under time pressure.

AI fundamentally changes this equation. By collaborating with an AI partner, you create more comprehensive test suites with less effort, shifting focus from writing test code to strategising what to test.

More importantly, tests now serve dual purposes—they’re both verification and communication tools. They define software requirements in precise terms that humans and AI both understand, becoming the ultimate source of truth.

This transformation has been dramatic. We defined goals through test scenarios, which clarified stakeholder requirements and guided our AI collaborator during implementation. The question wasn’t vaguely “Can you build a user authentication system?” but specifically “Can you implement a function that passes these tests?”

This tests-first, AI-augmented approach combines the best of human and artificial intelligence: your strategic thinking paired with AI’s thoroughness in exploring edge cases. Together, you create software that’s more robust, more thoroughly tested, and better aligned with user needs.

As you tackle your own projects, keep this tests-first mindset central. Define success before writing code, use AI to identify comprehensive test scenarios, and let tests be your primary communication tool. The result will be software built both faster and better.

MAINTAINING CONTEXT

When moving from Testing to Coding, bring forward:
• Your Ideation Summary for solution context
• Your Requirements Summary for functional specifications
• Complete test scenarios that define expected behaviour
• Edge cases and failure modes identified during test creation
• The success criteria for each component

TIP: Begin implementation with a comprehensive context package: ideation concepts, requirements specifications, and test definitions. This three-layered context creates a clear path from "why" through "what" to "how" for your AI partner.

TL;DR

Tests become even more valuable in an AI workflow as they define what success looks like in concrete terms. By focusing on test scenarios (inputs and expected outputs) before implementation, you create clear communication that both stakeholders and AI can understand. AI excels at generating comprehensive test cases, considering edge conditions that humans might overlook. This collaboration changes testing from a costly verification activity into an economical requirements definition process. The result is more thoroughly tested software with less effort, making what was previously a trade-off between speed and quality into a win-win proposition.