Real Latency vs Perceived Latency in GenAI Systems
Raw latency and perceived latency are different engineering problems. Production GenAI systems feel fast when they expose progress early, overlap backend work, and avoid silent waiting.
Independent AI & decision systems consultant
I design and deploy production-grade systems for revenue optimization, marketing effectiveness, demand forecasting, and commercial intelligence.
Independent consultant working across luxury, retail, healthcare, and consumer sectors.
What I help with
Demand forecasting, scenario planning, inventory strategy, and operational decisions under uncertainty.
Pricing optimization, elasticity modeling, portfolio trade-offs, and decision support for teams.
MMM, response curves, budget allocation, and shared finance-marketing decision systems.
AI workflows, recommendation systems, and auditable business-rule-driven decision logic.
Decisions are the product. Software is the tool.
Forecasting, pricing, MMM, and AI decision workflows designed for operations.
Selected work
A few examples across forecasting, MMM, and operational risk.
A production-oriented recommendation system that guides customers through emotionally loaded floral purchases — using a deterministic state machine with LLM components constrained to intent parsing and rationale generation only.
Key result: full guided purchase flow (delivery validation, persona selection, budget extraction, curated recommendations with auditable scoring)
Multi-SKU demand forecasting pipeline for 30+ products across 100+ duty-free locations with automated monthly updates.
Key result: Forecast error 38% → 24%
Proprietary Marketing Mix Model with budget optimization, replacing intuition-based allocation with data-driven decisions across multiple countries and touchpoints.
Key result: Strategic budget reallocation based on incremental response curves
Writing
The writing is there as proof of depth, not as a substitute for the offer.
Raw latency and perceived latency are different engineering problems. Production GenAI systems feel fast when they expose progress early, overlap backend work, and avoid silent waiting.
Enterprise AI systems often work at the first level of granularity, then become fragile when the business asks for more precision. A field lesson from pharmaceutical supply optimization on why incremental architectures matter.
Intelligence-native systems need agent access to decision artefacts and feedback loops. Why context, not models, is the differentiator — and how MCP, traditional ML, and versioned artefacts fit together.
Markdown planning explodes when weeks, events, discount levels, and phases are modeled as separate dimensions. Collapsing each week into a single state turns it into a clean optimization problem.
Seasons, campaigns, and weekly events — retail runs on overlapping cycles, and the AI recommender has to keep up with all of them. Notes on the business-rules control surface that lets merchandising teams steer a conversational recommender without editing prompts, filing tickets, or waiting for a deploy.
Macy's reports that customers using its AI assistant spend 4.75x more. Sephora has just launched an app in ChatGPT. Zalando is rolling out its assistant in 25 markets. For every other retailer, the question is no longer 'should we do it?' but 'how do we architect it so that it holds up in production?'
Tools
Interactive tools and calculators for practitioners — built to explore ideas, not to sell software.
Marketing saturation curves
Marketing carryover effects
Budget optimization
Customer lifetime value
Marketing efficiency zones
Market share & loyalty
The most useful early step is usually clarifying the decision, the constraints, and what actually needs to be productionized.