STRATEGY / EXECUTION · 20 November 2025 · 7 min read · Lee Leckenby // System Builder

Why AI demands a new playbook

AI does not behave like traditional software: it produces probabilistic outputs, not deterministic ones. This post explains why roadmaps must shift from feature delivery to learning, and how Product Managers can plan, measure, and communicate in non-deterministic development.

// FOCUS

Planning for probabilistic vs deterministic development.

// AUDIENCE

Product Strategists, Technical Leaders.

// FORMAT

Playbook.

Product planning in a post-predictable world

For years, product planning relied on predictability.

You defined a feature.
Engineering built it.
The output matched the specification.

Waterfall assumed certainty.
Agile reduced batch size but preserved forecastability.

Even iterative delivery depended on a stable assumption.

If we build X, we get Y.

AI breaks that assumption.

Deterministic product management had clear causality

Traditional software is deterministic.

Given the same input, it produces the same output.

A calculator returns the same number.
A rule engine applies the same logic.
A feature behaves as coded.

Roadmaps reflected that structure.

Scope mapped to output.
Output mapped to impact.
Impact mapped to forecast.

AI does not preserve that chain.

Non-deterministic development changes the ground rules

AI systems are probabilistic.

Given the same prompt, a large language model can generate different outputs. Try this:

“Write a short story about a robot who learns to paint.”

Run it three times in any LLM-powered assistant.

The tone shifts.
The structure shifts.
The ending shifts.

That variability is not a defect.

It is how the model works.

As the Nielsen Norman Group explains, LLMs are probabilistic systems that generate responses based on patterns and probabilities.
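That pattern-and-probability behaviour can be sketched in a few lines. This is a toy illustration, not a real model: the candidate continuations and their probabilities are invented for the example, but the mechanism — sampling from a weighted distribution rather than returning a fixed answer — is the same reason three runs of the same prompt diverge.

```python
import random

# Toy "next continuation" distribution. A real LLM scores thousands of
# candidate tokens; these three openings and their weights are illustrative.
next_opening_probs = {
    "The robot mixed its first colour.": 0.40,
    "Brush clamped in one claw, it hesitated.": 0.35,
    "It had never seen paint before.": 0.25,
}

def sample_opening(rng: random.Random) -> str:
    """Draw one continuation according to its assigned probability."""
    openings = list(next_opening_probs)
    weights = list(next_opening_probs.values())
    return rng.choices(openings, weights=weights, k=1)[0]

rng = random.Random()  # unseeded: each run of this script may differ
for _ in range(3):
    print(sample_opening(rng))
```

Running the loop three times is the script-level equivalent of re-sending the robot-painter prompt: same input, a distribution of outputs.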

This means AI features are not static code.

They are dynamic systems.

Planning must shift from delivery to learning

In a deterministic world, delivery was the unit of progress.

In a probabilistic world, learning is.

Instead of saying:

We will build Feature A and deliver Outcome B.

You say:

We hypothesise that Feature A will improve Metric B. We will test, measure, and adapt.

The roadmap becomes a learning agenda.

Not a promise of fixed outputs, but a sequence of structured experiments.

Not all AI features behave the same

Some AI features are tightly constrained.

Others are open-ended and generative.

A classification model has bounded outputs.
A recommendation engine adapts over time.
A generative assistant produces variable responses.

Knowing the feature type shapes the strategy.

Constraint level determines risk.
Risk determines testing depth.
Testing depth determines evaluation design.

This is product judgement, not novelty.

Variability requires new communication

Stakeholders expect certainty.

AI offers ranges.

That gap must be managed explicitly.

Product Managers need to explain:

  • What is controllable

  • What is probabilistic

  • What success means under variability

Silence creates fear.
Clarity creates alignment.

Metrics must expand beyond usage and conversion

Traditional product metrics remain necessary.

They are no longer sufficient.

AI products require structured evaluation.

Evals provide that structure.

Evals are systematic tests that measure model performance against defined tasks. They act like user stories for models, with measurable outcomes.

They test:

  • Response accuracy

  • Relevance

  • Bias and fairness

  • Hallucination rates

  • Factual consistency

  • Output format compliance

Leading teams treat evals like automated tests.

They build evaluation pipelines directly into deployment.

This is not experimentation theatre.

It is quality control for probabilistic systems.
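Treating evals like automated tests can look like ordinary test code. The sketch below is a minimal, hypothetical harness: the model stub, the cases, and the labels are invented for illustration, but the shape — a prompt, a pass/fail check, and an aggregated result per quality dimension — is what an eval pipeline automates at scale.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """One eval: a prompt, a pass/fail check, and the quality it measures."""
    prompt: str
    check: Callable[[str], bool]
    label: str


# Hypothetical stand-in; in practice this would call your deployed model.
def model(prompt: str) -> str:
    return "Paris is the capital of France."


cases: List[EvalCase] = [
    EvalCase(
        prompt="What is the capital of France?",
        check=lambda out: "Paris" in out,
        label="factual consistency",
    ),
    EvalCase(
        prompt="Answer in exactly one sentence: capital of France?",
        check=lambda out: out.count(".") <= 1,
        label="format compliance",
    ),
]


def run_evals(model: Callable[[str], str],
              cases: List[EvalCase]) -> Dict[str, bool]:
    """Run every case against the model and report pass/fail per label."""
    return {case.label: case.check(model(case.prompt)) for case in cases}


print(run_evals(model, cases))
```

Wiring a harness like this into deployment — failing the release when pass rates drop — is what turns evals from experimentation theatre into quality control.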

Tooling for evaluation is maturing

Several platforms now support structured evaluation workflows.

The tooling is evolving.

The discipline must evolve faster.

Product Managers must move closer to model behaviour

You do not need to train models.

You do need to understand them.

That means understanding:

  • Data dependencies

  • Model metrics

  • Failure modes

  • Edge cases

  • Legal and ethical constraints

AI products sit at the intersection of engineering, data science, and governance.

Product must sit there too.

Old playbook versus new reality

Old playbook:

Define feature.
Build feature.
Ship feature.
Measure usage.

New reality:

Define hypothesis.
Deploy experiment.
Evaluate behaviour.
Refine system.

The shift is not cosmetic.

It is structural.

This is not the end of product management

AI does not eliminate product thinking.

It intensifies it.

Deterministic delivery required coordination.

Non-deterministic systems require judgement.

The best Product Managers will still obsess over users.

They will still balance business constraints.

They will still lead cross-functional teams.

But now they must design learning systems, not feature pipelines.

The old playbook assumed certainty.

The new playbook designs for variability.

And designing for variability is the new competence.