There is a precise moment in every GenAI project where the architect wakes up at night.
It is not when the model hallucinates. It is not when the embeddings lack precision. It is when they realize that the entire pipeline depends on the LLM returning valid JSON. Every call. Without exception. In production.
This article is a direct field report: RAG on PowerPoint, an LLM-as-ETL pipeline for normalizing KPIs, an input formalization agent for an analytics SaaS. In every case, the same lesson emerged: the LLM is a probabilistic generator, not a deterministic parser. Confusing the two is expensive.
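To make the distinction concrete, here is a minimal sketch of what treating the model as a probabilistic generator can look like in code: the reply is handled as untrusted text, parsed, checked for shape, and retried on failure. Everything named here is an assumption for illustration: `call_llm` is a hypothetical callable standing in for whichever client you use, and `parse_llm_json`, `extract_with_retry`, and `InvalidLLMOutput` are invented names, not part of any library or of the projects described above.

```python
import json
from typing import Callable


class InvalidLLMOutput(ValueError):
    """Raised when the model's reply cannot be used as structured data."""


def parse_llm_json(raw: str, required_keys: set[str]) -> dict:
    """Treat the reply as untrusted text: parse it, then check its shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidLLMOutput(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidLLMOutput("expected a JSON object")
    missing = required_keys - set(data)
    if missing:
        raise InvalidLLMOutput(f"missing keys: {sorted(missing)}")
    return data


def extract_with_retry(
    call_llm: Callable[[str], str],  # hypothetical: prompt in, raw text out
    prompt: str,
    required_keys: set[str],
    max_attempts: int = 3,
) -> dict:
    """Call the model until one reply validates, or give up loudly."""
    last_error: Exception | None = None
    for _ in range(max_attempts):
        try:
            return parse_llm_json(call_llm(prompt), required_keys)
        except InvalidLLMOutput as exc:
            last_error = exc  # keep the reason, then try again
    raise InvalidLLMOutput(
        f"no valid reply after {max_attempts} attempts"
    ) from last_error
```

The point of the sketch is the failure path rather than the happy path: an invalid reply becomes an explicit, typed error that the pipeline can count, log, and route, instead of an exception surfacing somewhere downstream. A bounded retry is only one possible policy, chosen here for brevity.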
When you discover the parsing capabilities of modern LLMs, the enthusiasm is understandable. You can ask Claude, GPT-4 or Gemini Flash 3 to extract structured entities from any text, normalize heterogeneous formats, and understand complex semi-structured documents. All zero-shot. No regex. No manual rules.
About the author

Cyril Noirot
Lead Data Scientist
Freelance data scientist. I design and ship decision systems — forecasting, pricing, marketing measurement, optimization.