Building a recommendation engine that doesn't trust the LLM

This is the engineering companion to the production architecture piece. Instead of re-arguing why open-ended agents are risky in commerce, it walks through the actual implementation choices in `ai-florist`: FastAPI boundaries, LangGraph orchestration, pgvector retrieval, learned scoring weights, deterministic fallbacks, and runtime observability.

April 12, 2026

The easiest way to build an AI recommendation system is to let the model browse the catalog, reason in a loop, and pick products.

When I built `ai-florist`, I wanted the opposite property: if a recommendation is wrong, I should be able to tell which layer was wrong. Was intent parsing bad? Did delivery filtering remove too much? Did vector retrieval miss the right candidates? Were the scoring weights off? Did the rationale overstate the fit?
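One way to make "which layer was wrong" answerable is to record what each stage received and returned. The following is a minimal sketch of that idea, not the actual `ai-florist` code: the `Trace` class, the two filter functions, the 40-unit price cap, and the toy catalog are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StageRecord:
    """What one pipeline stage received and returned (counts only)."""
    stage: str
    n_in: int
    n_out: int


@dataclass
class Trace:
    """Hypothetical per-request trace: one record per stage, in order."""
    records: list = field(default_factory=list)

    def run(self, stage: str, fn: Callable[[list], list], items: list) -> list:
        # Run the stage and log how many candidates went in and came out.
        out = fn(items)
        self.records.append(StageRecord(stage, len(items), len(out)))
        return out

    def first_empty_stage(self) -> Optional[str]:
        # The first stage that turned a non-empty input into nothing.
        for record in self.records:
            if record.n_in > 0 and record.n_out == 0:
                return record.stage
        return None


# Illustrative stages; the real filters are assumed to be more involved.
def delivery_filter(items: list) -> list:
    return [p for p in items if p["deliverable"]]


def budget_filter(items: list) -> list:
    return [p for p in items if p["price"] <= 40]


catalog = [
    {"id": "roses", "price": 55, "deliverable": True},
    {"id": "tulips", "price": 30, "deliverable": False},
]

trace = Trace()
candidates = trace.run("delivery_filter", delivery_filter, catalog)
candidates = trace.run("budget_filter", budget_filter, candidates)

for record in trace.records:
    print(record)
print("emptied by:", trace.first_empty_stage())
```

With a record like this attached to every request, an empty or odd result can be traced to the stage that produced it instead of being attributed to "the model".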

That requirement leads to a very different design from the usual "agent with tools" pattern. The LLM is still there, but it is treated like a narrow component with typed inputs and typed outputs. The engine that decides what to show is deterministic, observable, and debuggable.
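To make the "narrow component" idea concrete, here is a hedged sketch of a typed boundary around the model. It assumes the LLM is only asked to return a small JSON object, that anything outside the schema triggers a deterministic fallback, and that ranking is plain code. The `Intent` fields, `ALLOWED_OCCASIONS`, the keyword fallback, and the sort key are illustrative assumptions, not the scoring logic used in `ai-florist`.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed closed vocabulary; the real schema is not specified in this article.
ALLOWED_OCCASIONS = {"birthday", "sympathy", "anniversary", "unknown"}


@dataclass(frozen=True)
class Intent:
    """Typed output of the intent-parsing step."""
    occasion: str
    max_price: Optional[float]


def fallback_intent(message: str) -> Intent:
    # Deterministic keyword match used when the model output is unusable.
    text = message.lower()
    for occasion in sorted(ALLOWED_OCCASIONS - {"unknown"}):
        if occasion in text:
            return Intent(occasion, None)
    return Intent("unknown", None)


def parse_intent(message: str, llm: Callable[[str], str]) -> Intent:
    """Ask the model for JSON, validate it, and fall back on any schema error.

    `llm` is a stand-in for whatever client is used; exceptions raised by the
    call itself (timeouts, network errors) are deliberately not handled here.
    """
    try:
        raw = json.loads(llm(message))
        occasion = raw["occasion"]
        if occasion not in ALLOWED_OCCASIONS:
            raise ValueError("occasion outside the allowed vocabulary")
        max_price = raw.get("max_price")
        if max_price is not None:
            max_price = float(max_price)
        return Intent(occasion, max_price)
    except (KeyError, TypeError, ValueError):
        return fallback_intent(message)


def rank(products: list, intent: Intent) -> list:
    # Deterministic: filter by budget, then occasion matches first,
    # then cheapest, with the product id as a stable tie-breaker.
    eligible = [
        p for p in products
        if intent.max_price is None or p["price"] <= intent.max_price
    ]
    return sorted(
        eligible,
        key=lambda p: (intent.occasion not in p["occasions"], p["price"], p["id"]),
    )


def good_llm(message: str) -> str:
    return '{"occasion": "birthday", "max_price": 40}'


def chatty_llm(message: str) -> str:
    return "Sure! I would suggest roses."


products = [
    {"id": "lilies", "price": 35, "occasions": ["sympathy"]},
    {"id": "roses", "price": 55, "occasions": ["birthday", "anniversary"]},
    {"id": "tulips", "price": 30, "occasions": ["birthday"]},
]

message = "Something for my mum's birthday, under 40"
intent = parse_intent(message, good_llm)
print(intent, [p["id"] for p in rank(products, intent)])
print(parse_intent(message, chatty_llm))
```

The point of the design is that the model never selects products. It fills in a small typed record, and the same record always produces the same ranking, so a bad recommendation can be reproduced and inspected.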

This article is the engineering companion to "What agentic commerce actually requires in production". That piece makes the architectural argument. This one walks through the implementation.

About the author

Cyril Noirot

Lead Data Scientist

Freelance data scientist. I design and ship decision systems — forecasting, pricing, marketing measurement, optimization.
