I’m trying to do some multilevel modeling and am not sure how to actually specify the multiple levels in my case.

The existing tutorials don’t seem to address what I’m try to do (A Primer on Bayesian Methods for Multilevel Modeling, @OriolAbril 's PyMC3 with labeled coords and dims).

In the demo below, I have six groups, which have three replicates each.

The groups may have different number of observations, as show below.

There could be a case where some groups have different numbers of replicates, too.

## Generate Data

```
num_groups = 6
num_replicates = 3
mu_a_list = np.arange(num_groups)
sigma_a_list = np.arange(num_groups) * 0.25 + 0.25
sigma_b_list = np.arange(num_groups) * 0.1 + 0.3
df_data = pd.DataFrame()
mu_array = np.zeros((num_replicates, num_groups))
sigma_array = np.zeros((num_replicates, num_groups))
for i in range(num_groups):
mu_a = mu_a_list[i]
sigma_a = sigma_a_list[i]
for j in range(num_replicates):
sigma_b = sigma_b_list[j]
mu_ij = stats.norm.rvs(loc=mu_a, scale=sigma_a)
sigma_ij = stats.halfnorm.rvs(scale=sigma_b)
mu_array[j,i] = mu_ij
sigma_array[j,i] = sigma_ij
x = stats.norm.rvs(loc=mu_ij, scale=sigma_ij, size=(2000 + i + j))
df_temp = pd.DataFrame(data=x, columns=["height"])
df_temp["group"] = i
df_temp["replicate"] = j
df_data = pd.concat((df_data, df_temp), axis=0)
df_data = df_data.reset_index(drop=True)
fig, axes = plt.subplots(nrows=num_replicates, ncols=num_groups, figsize=(20, 10))
for i in range(num_groups):
for j in range(num_replicates):
plt.sca(axes[j, i])
plt.hist(
df_data.loc[
np.logical_and(df_data["group"] == i, df_data["replicate"] == j),
"height",
]
)
plt.xlim(-10, 10)
```

## Inference with PyMC3

I would like to recover `mu_a_list`

, `sigma_a_list`

, `sigma_b_list`

, `mu_array`

, `sigma_array`

.

This model doesn’t work, but is in the direction of what I think should work. In part, I don’t know how to specify the joint membership of an observation in both a `group`

and `replicate`

```
group_idx, groups = pd.factorize(df_data["group"], sort=True)
replicate_idx, replicates = pd.factorize(df_data["replicate"], sort=True)
coords = {
"group": groups,
"obs_id": np.arange(df_data.shape[0]),
"replicate":replicates,
}
model = pm.Model(coords=coords)
with model:
BoundHalfNormal = pm.Bound(pm.HalfNormal, lower=0)
mu_a = pm.Normal("μ_a", mu=0, sigma=10, dims="group")
sigma_a = BoundHalfNormal("σ_a", sigma=10, dims="group")
sigma_b = BoundHalfNormal("σ_b", sigma=10, dims="group")
mu = pm.Normal("μ", mu=mu_a[group_idx], sigma=sigma_a[group_idx], dims=("group", "replicate"))
sigma = BoundHalfNormal("σ", sigma=sigma_b[replicate_idx], dims=("group", "replicate"))
obs = pm.Normal(
"obs",
mu=mu[group_idx, replicate_idx],
sigma=sigma[group_idx, replicate_idx],
observed=df_data["height"],
dims="obs_id",
)
```