pymc
Bayesian modeling and probabilistic programming with PyMC. Build hierarchical models, run MCMC sampling (NUTS) and variational inference, and compare models rigorously using LOO and WAIC.
Introduction
PyMC is a specialized skill for Bayesian modeling, designed for researchers and data scientists who need to perform probabilistic programming and inference. It uses the modern PyMC 5.x API to help users construct, fit, and validate complex statistical models. It is particularly well suited to problems that require uncertainty quantification, hierarchical data analysis, or principled handling of measurement error and missing data. Through its integration with ArviZ for diagnostics and visualization, it helps users check that a fitted model has converged and is consistent with the data.
- Perform Bayesian inference using the No-U-Turn Sampler (NUTS) and variational inference (ADVI).
- Build hierarchical and multi-level models that account for group-level variation.
- Conduct model selection and assessment using Leave-One-Out (LOO) cross-validation and the WAIC information criterion.
- Run prior and posterior predictive checks to validate model assumptions and identify potential misspecification.
- Diagnose sampling performance by examining R-hat convergence statistics, Effective Sample Size (ESS), and divergent transitions.
- Define linear regression, logistic regression, and custom probabilistic structures through flexible model definitions.
- Standardize continuous predictors to improve Hamiltonian Monte Carlo sampling efficiency.
- Use weakly informative priors instead of flat priors to guide the model toward physically plausible parameter ranges.
- Explicitly define model coordinates and dimensions to make code more readable and simplify complex data indexing.
- Raise target_accept above its default of 0.8 (to 0.9–0.99) when the posterior is complex or sampling produces divergences.
- Inputs are typically numerical arrays or pandas DataFrames; outputs are InferenceData objects containing posterior traces, diagnostics, and summary statistics.
- Use enough tuning samples and multiple chains so that convergence can be assessed. Agreement between chains is evidence, not a guarantee, that the sampler has explored the parameter space.
Repository Stats
- Stars: 19,798
- Forks: 2,209
- Open Issues: 41
- Language: Python
- Default Branch: main
- Sync Status: Idle
- Last Synced: Apr 30, 2026, 04:08 PM