Generative AI · 2026
Premium Florist (reference implementation)

AI-guided recommendation engine for premium floral e-commerce

A production-oriented recommendation system that guides customers through emotionally loaded floral purchases — using a deterministic state machine with LLM components constrained to intent parsing and rationale generation only.

Key result

Full guided purchase flow: delivery validation, persona selection, budget extraction, curated recommendations with auditable scoring

The challenge

Premium floral e-commerce has a specific conversion problem: customers arrive for an emotionally loaded, time-sensitive purchase (anniversary, funeral, apology) and face a catalog of 400-600 bouquets with no guidance beyond category filters and a search bar. The cognitive load is too high for a decision that needs to happen in under two minutes.

The standard GenAI approach — giving an LLM access to the catalog and letting it pick products — introduces hallucinated SKUs, unpredictable latency, no audit trail, and no way for the merchandising team to influence what gets surfaced. The challenge was building a system where the AI handles language fluently while every business-critical decision remains deterministic, auditable, and controllable.

The solution

I designed and built a LangGraph-based state machine with ~27 conversation nodes that guides the customer through delivery context, persona selection, intent extraction, and budget profiling before producing 2-3 curated recommendations with per-card rationale.

The LLM is used in exactly three places: parsing customer intent into a structured Pydantic object, classifying query scope to block out-of-domain requests, and generating short rationale text constrained to 18 words per card. Everything else — retrieval, ranking, business rules, delivery validation, fallback logic — is deterministic.
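The intent-parsing boundary can be sketched as follows — a minimal stand-in that validates the LLM's structured output deterministically before anything downstream runs. A plain dataclass is used here where the real system uses a Pydantic model; the field names and occasion list are illustrative assumptions, not the production schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the structured intent object the LLM fills in.
@dataclass
class CustomerIntent:
    occasion: str                  # e.g. "anniversary", "funeral", "apology"
    tone: Optional[str] = None     # e.g. "romantic", "sober"
    budget_eur: Optional[float] = None
    in_scope: bool = True          # scope-guard flag: False blocks retrieval

# Illustrative whitelist; the real taxonomy has 16 occasion categories.
ALLOWED_OCCASIONS = {"anniversary", "funeral", "apology", "birthday"}

def validate_intent(raw: dict) -> CustomerIntent:
    """Deterministically validate the LLM's structured output before it
    reaches the retrieval pipeline; unknown occasions fall out of scope."""
    occasion = str(raw.get("occasion", "")).lower()
    return CustomerIntent(
        occasion=occasion,
        tone=raw.get("tone"),
        budget_eur=float(raw["budget_eur"]) if raw.get("budget_eur") else None,
        in_scope=occasion in ALLOWED_OCCASIONS,
    )
```

The point of the pattern: the LLM proposes, the deterministic layer disposes — a malformed or off-domain parse never reaches retrieval.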

The retrieval pipeline uses a four-layer approach: hard filters first (price, occasion, delivery, season), then pgvector similarity search, then multi-signal scoring (semantic + occasion + tone + price fit + business rules), then fallback chains for edge cases. Business rules are hot-reloadable from a JSON config file — the merchandising team can change brand boosts and campaign priorities without engineering involvement.
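The multi-signal scoring layer can be sketched as a weighted blend; the weights, field names, and boost schema below are illustrative assumptions, not the production values.

```python
def score_product(product: dict, query: dict, rules: dict) -> float:
    """Hypothetical multi-signal scorer: semantic similarity + occasion
    match + tone match + price fit + business-rule boost. Weights are
    illustrative, not the deployed configuration."""
    semantic = product.get("similarity", 0.0)   # from pgvector, in [0, 1]
    occasion = 1.0 if query["occasion"] in product.get("occasions", []) else 0.0
    tone = 1.0 if query.get("tone") in product.get("tones", []) else 0.0
    # Price fit: 1.0 at the target budget, decaying linearly with distance.
    budget = query.get("budget_eur")
    if budget:
        price_fit = max(0.0, 1.0 - abs(product["price_eur"] - budget) / budget)
    else:
        price_fit = 0.5
    boost = rules.get("brand_boosts", {}).get(product.get("brand"), 0.0)
    return 0.4 * semantic + 0.25 * occasion + 0.1 * tone + 0.25 * price_fit + boost
```

Because every term is a plain arithmetic signal, any recommendation can be decomposed and audited after the fact — the property the scraped-catalog-plus-LLM approach cannot offer.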


Technical approach

  • LangGraph state machine for conversation orchestration (~27 nodes)
  • Constrained LLM usage: intent parsing (structured outputs), scope classification, rationale generation only
  • Four-layer deterministic retrieval: hard filters → pgvector → multi-signal scoring → fallback chains
  • Hot-reloadable business rules (JSON config, auto-reload per request, time-bounded campaigns)
  • Production observability: TurnMonitor recording every LLM call, retrieval step, and ranking decision
  • Phoenix Evals: 5 evaluator types (PII, quality, safety, hallucination, task completion) on scheduled runs
  • Prompt versioning with input hashing for quality regression correlation
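The hot-reload mechanism listed above can be sketched as an mtime-checked loader that re-reads the rules file on the next request after a change; the file path and schema here are assumptions.

```python
import json
import os

class HotReloadRules:
    """Illustrative hot-reload pattern: re-read the JSON business-rules
    file whenever its mtime changes, so merchandising edits take effect
    on the next request without a deploy."""
    def __init__(self, path: str):
        self.path = path
        self._mtime = 0.0
        self._rules: dict = {}

    def get(self) -> dict:
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:       # reload only when the file changed
            with open(self.path) as f:
                self._rules = json.load(f)
            self._mtime = mtime
        return self._rules
```

Checking the mtime per request is cheap (a single stat call) and keeps the operator loop simple: edit the JSON, and the next customer turn already uses the new boosts.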

Implementation

The system runs as a monorepo with three services: a FastAPI backend (~2,300 lines of core logic), a Next.js frontend for the guided chat interface, and a Playwright-based data collection pipeline that scrapes and normalizes the product catalog.

The backend exposes two API paths: a direct recommendation endpoint and a conversational turn endpoint. The conversation state is persisted to Redis between turns. Rate limiting is Redis-backed and config-driven. The scope guard blocks out-of-domain queries before they reach the retrieval pipeline.
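The rate limiter's shape can be sketched with a fixed-window counter — a plain dict stands in for Redis here (production would use per-key INCR with an EXPIRE), and the limits are illustrative, not the deployed configuration.

```python
import time
from typing import Optional

class FixedWindowLimiter:
    """Sketch of a config-driven, fixed-window rate limiter. An in-memory
    dict stands in for the Redis backend used in production."""
    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window = window_seconds
        self._counters: dict = {}   # (client_id, window_start) -> count

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        window_start = int(now // self.window) * self.window
        key = (client_id, window_start)
        count = self._counters.get(key, 0) + 1
        self._counters[key] = count
        return count <= self.max_requests
```

Keeping the limits in config (rather than code) matches the rest of the system's operator-control philosophy: throttles can be tuned without a deploy.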

The catalog uses real data from a premium French florist (~600 products across 16 occasion categories), with structured fields for flowers, colors, occasions, tones, and delivery constraints. Each product is embedded with OpenAI text-embedding-3-small and stored in pgvector.
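The filter-then-similarity step can be sketched in plain Python — in production the same shape runs as a SQL WHERE clause plus an ORDER BY on pgvector's cosine-distance operator (`<=>`); the product fields below are illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors (pgvector's `<=>`
    operator returns the corresponding distance, 1 - similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, products, occasion, max_price, k=3):
    """Illustrative two-layer retrieval: hard filters first (occasion,
    price), then similarity ranking over the surviving candidates."""
    eligible = [p for p in products
                if occasion in p["occasions"] and p["price_eur"] <= max_price]
    eligible.sort(key=lambda p: cosine(query_vec, p["embedding"]), reverse=True)
    return eligible[:k]
```

Running the hard filters before the vector search keeps ineligible products out of the candidate set entirely, so a high similarity score can never override a delivery or price constraint.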

Results

  • Architecture — deterministic core; LLM in 3 narrow places only
  • Catalog — ~600 products; real data, 16 occasion categories
  • Observability — 5 eval types; PII, quality, safety, hallucination, completion
  • Operator control — hot-reload; business rules via JSON, no deploy needed

Overall impact

The implementation demonstrates that production GenAI commerce does not require giving the LLM control over business decisions. The deterministic-first architecture provides auditable recommendations, predictable costs, and a control surface for the merchandising team — while the LLM handles the narrow language tasks it is genuinely good at. The same architectural pattern applies to any commerce or decision domain where hallucinations have business consequences and non-engineers need to control system behavior.

Key lessons

  1. The hardest part is not building the LLM integration. It is resisting the temptation to let the LLM creep into the decision-making layer.
  2. Business rules as hot-reloadable config is the single feature that determines whether the system survives its first quarter in production.
  3. Observability is not a nice-to-have. If you cannot replay why a specific recommendation was made, you cannot improve the system.
  4. Data quality dominates architecture quality. Rich, structured product metadata makes every layer of the pipeline sharper.

Tech stack

Python · FastAPI · LangGraph · OpenAI · pgvector · PostgreSQL · Redis · Next.js · Docker · Arize Phoenix

Similar project?

Need help with a similar challenge? Let's discuss how I can help.