Learn how hierarchical Bayesian models are used in sports analytics. This webinar walks through a Bayesian workflow for model development in PyMC, using baseball data to show how these models can inform data-driven decisions.
Decision-making in sports has become increasingly data-driven, with GPS, cameras, and other sensors providing streams of information at high spatial and temporal resolution. While machine learning is a popular approach for turning these data streams into actionable information, Bayesian statistical methods offer a robust alternative. They make it possible to combine multiple data sources, provide a natural means of imputing missing data, and fully account for the various uncertainties in the system. In particular, hierarchical models provide a means of integrating information at multiple scales and adjusting for biases associated with small sample sizes. I will demonstrate a Bayesian workflow for model development using PyMC version 5, from data preparation through to the summarization of estimates and predictions, using baseball data.
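To make the small-sample point concrete, here is a minimal sketch of partial pooling for home run rates. It is not the model from the talk: it uses plain Python rather than PyMC, the player data are hypothetical, and the Beta prior is fixed by hand, whereas a full hierarchical model would estimate it from the data through hyperpriors.

```python
# Hypothetical (home_runs, plate_appearances) pairs; not real player data.
players = {
    "A": (2, 10),    # tiny sample, raw rate 0.200
    "B": (20, 500),  # large sample, raw rate 0.040
    "C": (0, 15),    # tiny sample, raw rate 0.000
}

# Assumed Beta(ALPHA, BETA) prior on the home run rate.
# Prior mean = 3 / (3 + 97) = 0.03, and ALPHA + BETA = 100 behaves
# like 100 plate appearances of prior information.
ALPHA, BETA = 3.0, 97.0


def raw_rate(hr, pa):
    """Unpooled estimate: the observed home run rate."""
    return hr / pa


def pooled_rate(hr, pa, alpha=ALPHA, beta=BETA):
    """Posterior mean under a Beta-Binomial model.

    The estimate is pulled toward the prior mean, and the pull is
    stronger when the player has few plate appearances.
    """
    return (alpha + hr) / (alpha + beta + pa)


estimates = {
    name: (raw_rate(hr, pa), pooled_rate(hr, pa))
    for name, (hr, pa) in players.items()
}

for name, (raw, pooled) in estimates.items():
    print(f"{name}: raw={raw:.3f}  partially pooled={pooled:.3f}")
```

Player A's estimate moves a long way from the raw 0.200 toward the prior mean, while player B's barely changes, because 500 plate appearances outweigh the prior.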
Chris is the Principal Quantitative Analyst in Baseball Research & Development for the Philadelphia Phillies. He is interested in computational statistics, machine learning, Bayesian methods, and applied decision analysis. He hails from Vancouver, Canada, and received his Ph.D. from the University of Georgia.
00:00:00 Welcome
00:07:24 Presentation begins
00:09:11 Data Science in Baseball
00:09:36 Sabermetrics
00:10:33 Canonical Baseball statistics
00:12:02 Advanced metrics
00:13:03 Ball Tracking technology
00:13:44 Trackman
00:14:08 Hawkeye
00:17:36 Bayesian inference
00:18:58 PyMC
00:19:59 Home run rate estimation
00:23:37 Prior predictive checks
00:25:00 Nuts about MCMC
00:28:14 Posterior predictive sampling
00:28:48 Informative priors
00:31:18 Unpooled Model
00:31:40 Hierarchical Model
00:32:16 Partial pooling
00:32:40 Hyperpriors
00:32:56 Partial Pooling Model
00:34:06 Group Covariate Model
00:36:12 Park Effects
00:38:24 Model Comparison with Expected Log Predictive Density
00:39:08 Leave One Out Cross Validation
00:40:18 Individual covariates
00:42:03 Variable interactions
00:42:27 Gaussian processes
00:43:55 Accelerated Sampling
00:45:13 Out-Of-Sample Prediction
00:47:05 Prediction Model
00:48:38 Workflow steps
00:50:51 Q/A Could you explain the kernel function ...?
00:52:30 Q/A What is the advantage of ...?
00:54:23 Q/A How would you handle categorical variables in the individual ...?
00:56:37 Q/A How Bayesian analytics is bringing value to ...?
01:00:26 Q/A Can you give insights into how you interact ...?
01:01:40 Q/A Do you have recommended ...?
01:03:32 Q/A Any advice if I'm new and want to improve?
01:04:28 Q/A Does it happen that a selected model is not good at ...?
01:06:13 Q/A Could you comment on the usage of Bayesian decision-making...?
01:08:10 Webinar Ends
Modeling spatial data with Gaussian processes in PyMC
Using Bayesian decision making
If you are interested in seeing what we at PyMC Labs can do for you, then please email info@pymc-labs.com. We work with companies at a variety of scales and with varying levels of existing modeling capacity. We also run corporate workshop training events and can provide sessions ranging from introduction to Bayes to more advanced topics.