Call volume forecasting per operator
Bayesian hierarchical model for predicting call volumes at the operator level, enabling data-driven workforce planning and staffing decisions.
The challenge
A large international banking group operates a multi-channel customer service center handling thousands of inbound calls daily. Accurate call volume estimation at the operator level is critical to ensure adequate staffing and service levels, optimize workforce planning, avoid over- or under-utilization of agents, and control operational costs while maintaining customer satisfaction.
The challenge was to predict the number of calls handled per operator over a given time period, accounting for differences in schedules, experience, and operating conditions.
As a first baseline, a linear regression model was considered for its simplicity and interpretability. However, exploratory data analysis quickly revealed structural issues: call volumes are discrete counts (0, 1, 2, …), not continuous values; linear regression can produce negative or non-integer predictions, which are meaningless for counts; and the variance increased with the mean, violating the constant-variance assumption of linear regression. These observations pointed to a mismatch between the model and the data.
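The mismatch is easy to reproduce on a toy example. The sketch below fits ordinary least squares to a handful of made-up (workload, calls) pairs; the data and the names `workload`, `calls`, and `fit_ols` are hypothetical and not taken from the project.

```python
# Hypothetical toy data: a workload indicator and the calls handled.
workload = [1, 2, 3, 4, 5, 6]
calls = [0, 0, 1, 2, 6, 12]


def fit_ols(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_ols(workload, calls)
predictions = [intercept + slope * xi for xi in workload]

# The fitted line dips below zero at low workload and returns fractional
# values everywhere, neither of which is a valid call count.
print([round(p, 2) for p in predictions])
```

On this toy data the prediction for the lowest workload is negative, which is exactly the kind of output that cannot be turned into a staffing number.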
The solution
Given the nature of the target variable (call counts), the problem was reframed using a count-based probabilistic model. The approach started with Poisson regression as a natural framework for count data, using a log-link function to ensure strictly positive predictions. An exposure offset accounted for different working durations (e.g., hours worked per operator), and a hierarchical structure modeled systematic differences between operators while sharing information across the population.
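To make the role of the log link and the exposure offset concrete, here is a minimal sketch of how the expected count could be computed for one operator. The function name `expected_calls` and the coefficient values are illustrative assumptions, not the fitted model.

```python
import math


def expected_calls(hours_worked, linear_predictor):
    """Poisson mean under a log link with an exposure offset.

    log(mu) = log(hours_worked) + linear_predictor
    """
    return math.exp(math.log(hours_worked) + linear_predictor)


# Hypothetical coefficients: a baseline log-rate per hour and a shift effect.
intercept = math.log(4.0)    # about 4 calls per hour at baseline
night_shift_effect = -0.5    # fewer calls per hour on night shifts

day_4h = expected_calls(4, intercept)
day_8h = expected_calls(8, intercept)
night_8h = expected_calls(8, intercept + night_shift_effect)

print(day_4h, day_8h, night_8h)
```

Because the offset enters with a fixed coefficient of 1, doubling the hours worked doubles the expected count, and the exponential keeps the mean positive whatever the covariates are.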
The final solution was implemented as a Bayesian hierarchical count model using PyMC, allowing for operator-level random effects (experience, efficiency, behavioral differences), global effects from contextual variables (shift type, workload indicators, tenure), and full uncertainty quantification via posterior distributions.
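One way to write down such a model is shown below. The notation is an assumption made for illustration: $y_i$ is the call count for observation $i$, $E_i$ the exposure (hours worked), $j[i]$ the operator that observation $i$ belongs to, $x_i$ the contextual covariates, and $u_j$ the operator-level effect. The source does not state the priors on $\alpha$, $\beta$, and $\sigma_u$, so they are left out here.

```latex
\begin{align}
y_i &\sim \mathrm{Poisson}(\mu_i) \\
\log \mu_i &= \log E_i + \alpha + u_{j[i]} + x_i^{\top}\beta \\
u_j &\sim \mathcal{N}(0, \sigma_u^2)
\end{align}
```

The shared scale $\sigma_u$ is what pools information: operators with little data are pulled toward the population baseline $\alpha$, while operators with many observations keep estimates close to their own history.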
When over-dispersion was detected (variance exceeding the mean, which a Poisson model cannot accommodate), the model was extended to a Negative Binomial formulation, improving robustness and predictive performance. This aligns the model's statistical assumptions with the real-world data-generating process.
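A quick, crude way to spot over-dispersion is to compare the sample variance with the sample mean. The sketch below does this on made-up counts and ignores covariates; in a regression setting the same idea would be applied to residuals or to posterior predictive draws. The parameter `alpha_hat` follows the common parameterization Var(y) = mu + alpha * mu^2, which is an assumption about the formulation rather than a detail from the project.

```python
from statistics import mean, variance

# Hypothetical calls handled by a group of operators on comparable shifts.
calls = [3, 0, 12, 1, 25, 2, 7, 0, 18, 4]

m = mean(calls)
v = variance(calls)           # sample variance (n - 1 in the denominator)
dispersion_index = v / m      # close to 1 when a Poisson model is adequate

# Method-of-moments estimate of the Negative Binomial dispersion,
# obtained by solving Var(y) = mu + alpha * mu**2 for alpha.
alpha_hat = (v - m) / m**2

print(f"mean={m:.2f} variance={v:.2f} index={dispersion_index:.2f} alpha={alpha_hat:.2f}")
```

An index well above 1, as in this toy data, suggests that Poisson intervals would be too narrow and that a Negative Binomial likelihood is worth trying.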
Technical approach
- Poisson regression as the foundation for count data, with log-link for strictly positive predictions
- Exposure offset to account for varying operator working hours
- Hierarchical structure to model operator-level effects while pooling information across the population
- Bayesian inference via PyMC for full uncertainty quantification
- Extension to Negative Binomial when over-dispersion was detected
- Posterior predictive checks for model validation
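The idea behind a predictive check can be sketched with the standard library alone. The version below is a simplified plug-in check, not a full posterior predictive check: it simulates replicated datasets from a Poisson with the fitted mean held fixed, whereas a Bayesian check would draw the mean from the posterior for each replicate. The data and the helper `sample_poisson` are hypothetical.

```python
import math
import random
from statistics import mean, variance


def sample_poisson(lam, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(0)
observed = [3, 0, 12, 1, 25, 2, 7, 0, 18, 4]   # hypothetical counts
fitted_mean = mean(observed)
observed_var = variance(observed)

# Simulate replicated datasets and compare a test statistic (the variance).
n_reps = 2000
exceed = 0
for _ in range(n_reps):
    replicate = [sample_poisson(fitted_mean, rng) for _ in observed]
    if variance(replicate) >= observed_var:
        exceed += 1

tail_prob = exceed / n_reps
print(f"share of replicates at least as variable as the data: {tail_prob:.3f}")
```

When almost no replicate is as variable as the observed data, the Poisson model is failing to reproduce the spread, which is the signal that motivates the Negative Binomial extension.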
Results
Overall impact
The model enabled operations and workforce teams to move from heuristic staffing rules to data-driven forecasts, simulate staffing scenarios under different schedules or demand assumptions, and better anticipate peak-load risk while avoiding unnecessary overstaffing. By matching the statistical model to the operational reality, the project demonstrated how sound modeling fundamentals can materially improve decision quality.
Key takeaways
1. Sometimes the difference between a misleading model and a useful decision tool is not more complexity but choosing the right statistical framework for the data.
2. Count data requires count models: forcing continuous methods on discrete outcomes leads to invalid predictions.
3. Hierarchical modeling provides natural regularization for sparse data segments while still capturing individual differences.
4. Uncertainty quantification is not optional for operational decisions: it enables risk-aware planning.
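The regularization and uncertainty points can be illustrated with a conjugate Gamma-Poisson update, used here as a fixed-prior stand-in for the hierarchical model (in the full model the population-level parameters are learned from the data rather than set by hand). The operators, the prior values, and the function `posterior_rate` are all hypothetical.

```python
# Hypothetical operators: (calls handled, hours worked).
operators = {
    "op_a": (410, 100.0),   # plenty of data
    "op_b": (9, 1.0),       # a single busy hour
    "op_c": (0, 2.0),       # new operator, no calls yet
}

# Assumed Gamma(shape, rate) prior on calls per hour; mean = shape / rate = 4.
prior_shape, prior_rate = 8.0, 2.0


def posterior_rate(calls, hours):
    """Gamma-Poisson conjugate update; returns (mean, sd) of the hourly rate."""
    shape = prior_shape + calls
    rate = prior_rate + hours
    return shape / rate, shape ** 0.5 / rate


results = {}
for name, (calls, hours) in operators.items():
    raw = calls / hours
    post_mean, post_sd = posterior_rate(calls, hours)
    results[name] = (raw, post_mean, post_sd)
    print(f"{name}: raw={raw:.2f}  pooled={post_mean:.2f} +/- {post_sd:.2f}")
```

The operator with 100 hours of history keeps an estimate close to the raw rate with a narrow spread, while the two sparse operators are pulled toward the population mean and carry a visibly wider uncertainty, which is the information a planner needs to size a safety margin.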