Call volume forecasting per operator
Bayesian hierarchical model for predicting call volumes at the operator level, enabling data-driven workforce planning and staffing decisions.
Key result
From heuristic rules to uncertainty-aware forecasts
The challenge
A large international banking group operates a multi-channel customer service center handling thousands of inbound calls daily. Accurate call volume estimation at the operator level is critical to ensure adequate staffing and service levels, optimize workforce planning, avoid over- or under-utilization of agents, and control operational costs while maintaining customer satisfaction.
The challenge was to predict the number of calls handled per operator over a given time period, accounting for differences in schedules, experience, and operating conditions.
As a first baseline, a linear regression model was considered for its simplicity and interpretability. However, exploratory data analysis quickly revealed structural issues: call volumes are discrete counts (0, 1, 2, …), not continuous values; linear regression can produce negative or non-integer predictions, which are not meaningful in this context; and the variance increased with the mean, violating the constant-variance (homoscedasticity) assumption of ordinary least squares. These observations indicated a model–data mismatch.
The solution
Given the nature of the target variable (call counts), the problem was reframed using a count-based probabilistic model. The approach started with Poisson regression as a natural framework for count data, using a log link so that the predicted mean is always strictly positive. An exposure offset (the log of the exposure, e.g. hours worked per operator, added to the linear predictor with its coefficient fixed at 1) accounted for different working durations, and a hierarchical structure modeled systematic differences between operators while sharing information across the population.
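As a minimal sketch of how the log link and the exposure offset fit together, the snippet below computes a Poisson mean from a linear predictor plus log(hours). The function names, the coefficient values, and the single night-shift feature are illustrative assumptions, not the project's actual covariates.

```python
import math

def expected_calls(hours, intercept, coefs, features):
    """Poisson mean with a log link and an exposure offset:
    log(mu) = log(hours) + intercept + sum(coef * feature)."""
    eta = intercept + sum(c * x for c, x in zip(coefs, features))
    return math.exp(math.log(hours) + eta)

def poisson_logpmf(k, mu):
    """Log-probability of observing k calls when the Poisson mean is mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

# Hypothetical values: a baseline of 5 calls per hour and a night-shift effect.
BASE = math.log(5.0)
NIGHT_EFFECT = [-0.3]

mu_full = expected_calls(hours=8, intercept=BASE, coefs=NIGHT_EFFECT, features=[0])
mu_half = expected_calls(hours=4, intercept=BASE, coefs=NIGHT_EFFECT, features=[0])
mu_night = expected_calls(hours=8, intercept=BASE, coefs=NIGHT_EFFECT, features=[1])
```

Because the offset enters with a fixed coefficient of 1, halving the hours halves the expected count, and the exponential keeps the mean positive whatever the value of the linear predictor.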
The final solution was implemented as a Bayesian hierarchical count model using PyMC, allowing for operator-level random effects (experience, efficiency, behavioral differences), global effects from contextual variables (shift type, workload indicators, tenure), and full uncertainty quantification via posterior distributions.
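One way to write such a model down, for observation i belonging to operator j[i], is sketched below. The symbols and the specific priors are assumptions made for illustration; the source does not state which priors were used.

```latex
\begin{aligned}
y_i &\sim \mathrm{Poisson}(\mu_i) \\
\log \mu_i &= \log(\mathrm{hours}_i) + \beta_0 + u_{j[i]} + \mathbf{x}_i^{\top}\boldsymbol{\beta} \\
u_j &\sim \mathcal{N}(0, \sigma_u^2) \qquad \text{(operator-level random effect)} \\
\beta_0,\ \beta_k &\sim \mathcal{N}(0, 1), \qquad \sigma_u \sim \mathrm{HalfNormal}(1) \qquad \text{(assumed priors)}
\end{aligned}
```

Here x_i collects the contextual variables (shift type, workload indicators, tenure), and the shared scale σ_u is what lets operators with little data borrow strength from the rest of the population.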
When over-dispersion was detected, the model was extended to a Negative Binomial formulation, improving robustness and predictive performance. This approach aligns the statistical assumptions of the model with the real-world data-generating process.
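A rough first check for over-dispersion is the variance-to-mean ratio of the counts, which is close to 1 under a Poisson model. The sketch below uses made-up counts for shifts of equal length; it ignores covariates, so it is only a screening diagnostic, and the parameter name `alpha` follows the mean/dispersion parameterization of the Negative Binomial (variance = mu + mu²/alpha).

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; roughly 1 for Poisson data."""
    return variance(counts) / mean(counts)

def negbin_variance(mu, alpha):
    """Variance of a Negative Binomial with mean mu and dispersion alpha.
    Smaller alpha means more extra-Poisson variation."""
    return mu + mu ** 2 / alpha

# Hypothetical call counts for shifts of equal length (not project data).
calls = [12, 30, 7, 45, 18, 3, 60, 25]

ratio = dispersion_ratio(calls)
# A ratio well above 1 suggests the Poisson variance assumption is too tight.
print(ratio > 1)
```

As alpha grows, the extra term mu²/alpha vanishes and the Negative Binomial approaches the Poisson, so the extension nests the original model rather than replacing it.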
Technical approach
- Poisson regression as the foundation for count data, with log-link for strictly positive predictions
- Exposure offset to account for varying operator working hours
- Hierarchical structure to model operator-level effects while pooling information across the population
- Bayesian inference via PyMC for full uncertainty quantification
- Extension to Negative Binomial when over-dispersion was detected
- Posterior predictive checks for model validation
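The pooling effect of the hierarchical structure can be illustrated with a much simpler conjugate stand-in: a Gamma prior on each operator's call rate combined with a Poisson likelihood. This is not the PyMC model described above, and the prior values are hypothetical and fixed by hand, whereas a full hierarchical model would learn them from the data.

```python
def shrunken_rate(calls, hours, prior_shape, prior_rate):
    """Posterior mean calls per hour under a Gamma(prior_shape, prior_rate)
    prior on the rate and a Poisson likelihood with exposure `hours`."""
    return (prior_shape + calls) / (prior_rate + hours)

# Hypothetical population-level prior with mean 20 / 4 = 5 calls per hour.
PRIOR_SHAPE, PRIOR_RATE = 20.0, 4.0

# Two operators with the same raw rate (2 calls per hour) but very
# different amounts of data.
sparse = shrunken_rate(calls=2, hours=1, prior_shape=PRIOR_SHAPE, prior_rate=PRIOR_RATE)
dense = shrunken_rate(calls=400, hours=200, prior_shape=PRIOR_SHAPE, prior_rate=PRIOR_RATE)
```

The operator with one hour of data is pulled strongly toward the population mean, while the operator with 200 hours keeps an estimate close to the raw rate.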
Results
- Prediction quality (no invalid outputs): eliminated negative or fractional call predictions
- Distribution fit (improved alignment): better match with observed call distributions
- Low-volume stability (hierarchical shrinkage): more stable forecasts for operators with sparse data
- Decision support (uncertainty intervals): transparent confidence bounds for planning
Overall impact
The model enabled operations and workforce teams to move from heuristic staffing rules to data-driven forecasts, simulate staffing scenarios under different schedules or demand assumptions, and better anticipate peak-load risk while avoiding unnecessary overstaffing. By matching the statistical model to the operational reality, the project demonstrated how sound modeling fundamentals can materially improve decision quality.
Key lessons
- Sometimes the difference between a misleading model and a useful decision tool is not more complexity but choosing the right statistical framework for the data.
- Count data requires count models: forcing continuous methods on discrete outcomes leads to invalid predictions.
- Hierarchical modeling provides natural regularization for sparse data segments while still capturing individual differences.
- Uncertainty quantification is not optional for operational decisions: it enables risk-aware planning.