Data Scientist TJO in Tokyo

Data science, statistics or machine learning in broken English

Bayesian modeling with R and Stan (3): Simple hierarchical Bayesian model

In the two previous posts, you learned what Bayesian modeling and Stan are and how to install them. Now you are ready to try them on a typically Bayesian problem that many people love: the hierarchical Bayesian model.

Definition of hierarchical Bayesian models

Before tackling a practical example, let's review what a hierarchical Bayesian model is and how it works. A famous book on Bayesian modeling with MCMC, written by Toshiro Tango and Taeko Becque and published in Japan, describes it as follows*1.

ベイジアン統計解析の実際 (医学統計学シリーズ) [Bayesian Statistical Analysis in Practice (Medical Statistics Series)]


In a frequentist fixed-effects model, every result is assumed to share a common mean \theta.

y_i \sim N(\theta, s^2_i)

On the other hand, in a random-effects model, each result is assumed to have its own mean \theta_i, and these means are distributed around a global mean \mu.

y_i \sim N(\theta_i, s^2_i)

\theta_i \sim N(\mu, \sigma^2_\theta)

Bayesian hierarchical models additionally place prior distributions on the parameters \mu and \sigma^2_\theta of the distribution of \theta_i in the random-effects model, such as

\mu \sim N(0,100)

1/\sigma^2_\theta \sim Gamma(0.001, 0.001)

Such models are said to have a hierarchical structure with two levels:

  • 1st level: a probability distribution is assumed for \theta_i
  • 2nd level: a further probability distribution is assumed for the 1st-level parameters \mu and \sigma^2_\theta
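
The two-level structure can be read as a recipe for generating data. The following is a minimal sketch in Python (standard library only), not the analysis from the book. The function name simulate_hierarchy, the hyperparameter values and the s_i values are all hypothetical, and \mu and \sigma_\theta are fixed by hand here rather than drawn from their priors.

```python
import random


def simulate_hierarchy(mu, sigma_theta, s, seed=0):
    """Draw one data set from the two-level model.

    theta_i ~ N(mu, sigma_theta^2)   (means of the individual results)
    y_i     ~ N(theta_i, s_i^2)      (observed results)

    Note: random.gauss takes a standard deviation, not a variance.
    """
    rng = random.Random(seed)
    # Each result gets its own mean around mu.
    theta = [rng.gauss(mu, sigma_theta) for _ in s]
    # Each observation scatters around its own mean with known s_i.
    y = [rng.gauss(t, s_i) for t, s_i in zip(theta, s)]
    return theta, y


# Hypothetical values, chosen only for illustration.
mu = 1.0
sigma_theta = 0.5
s = [0.2, 0.4, 0.8, 1.6]

theta, y = simulate_hierarchy(mu, sigma_theta, s)
for i, (t, obs, s_i) in enumerate(zip(theta, y, s), start=1):
    print(f"result {i}: theta={t:.3f}  y={obs:.3f}  s={s_i}")
```

Setting sigma_theta to 0 collapses every \theta_i onto \mu, which recovers the fixed-effects model with a common mean.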

This is the textbook definition of hierarchical models, but I think it can be understood more intuitively. Hierarchical Bayesian models often have to handle fluctuations, such as nonlinear effects, that are larger than usual frequentist models expect. The priors used in such models can be seen as an "absorber" that soaks up the various kinds of fluctuations distributed around the true parameters.
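
The "absorber" intuition can be made concrete in a simplified case. Assuming \mu and \sigma^2_\theta are held fixed (an assumption made here only for illustration; the full hierarchical model places priors on them), the standard normal-normal result gives the conditional posterior mean of each \theta_i as a weighted average of its own observation and the global mean:

E[\theta_i \mid y_i, \mu, \sigma^2_\theta] = w_i y_i + (1 - w_i) \mu

w_i = \sigma^2_\theta / (\sigma^2_\theta + s^2_i)

A noisy result (large s^2_i) gets a small weight w_i and is pulled strongly toward \mu, while a precise result stays close to its own y_i.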

*1: The original text is of course in Japanese, so this is just my own interpretation.


Bayesian modeling with R and Stan (2): Installation and an easy example

The previous post gave an overview of what Stan on R is and how it works.

Are you ready now? OK, this post walks through how to install Stan. Let's start here! :) In principle this post just follows the content of "RStan Getting Started", but adds some tips for fixing lesser-known problems.

Warning: this post assumes you are a Windows user. If you use Mac OS or Linux, please see the notes for your OS.


Bayesian modeling with R and Stan (1): Overview

Although I've written a series of posts titled "Machine Learning for package uses in R", I don't usually run machine learning in my daily analytic work, because my current role covers so-called ad-hoc analysis.

Instead of machine learning, ad-hoc analysts often use statistical modeling such as linear models (generally called "multiple regression"), generalized linear models (GLM) and/or econometric time series analysis. But in some situations linear models and their variants do not work because of nonlinear components and/or individual variance, called "random effects".

In general, random effects can be handled well by generalized linear mixed models (GLMM), and CRAN, for example, has several related packages. But in some cases random effects cannot be formulated concisely and explicitly. If so, we have a strong alternative: Bayesian modeling with the Markov chain Monte Carlo (MCMC) method.


This series of posts will discuss Bayesian modeling with MCMC, one of the strongest methods for ad-hoc analysis, and its applications. To begin, this post gives an overview.
