Note

This tutorial was generated from an IPython notebook that can be downloaded here.

PyMC3 extras

exoplanet comes bundled with a few utilities that can make it easier to use and debug PyMC3 models for fitting exoplanet data. This tutorial briefly describes these features and their use.

Custom tuning schedule

The main extra is the exoplanet.PyMC3Sampler class that wraps the PyMC3 sampling procedure to include support for learning off-diagonal elements of the mass matrix. This is very important for any problems where there are covariances between the parameters (this is true for pretty much all exoplanet models). A thorough discussion of this can be found elsewhere online, but here is a simple demo where we sample a covariant Gaussian using exoplanet.PyMC3Sampler.

First, we generate a random positive definite covariance matrix for the Gaussian:

import numpy as np

ndim = 5
np.random.seed(42)
L = np.random.randn(ndim, ndim)
L[np.diag_indices_from(L)] = 0.1*np.exp(L[np.diag_indices_from(L)])
L[np.triu_indices_from(L, 1)] = 0.0
cov = np.dot(L, L.T)

And then we can sample this using PyMC3 and exoplanet.PyMC3Sampler:

import pymc3 as pm
import exoplanet as xo

sampler = xo.PyMC3Sampler()

with pm.Model() as model:
    pm.MvNormal("x", mu=np.zeros(ndim), chol=L, shape=(ndim,))

    # Run the burn-in and learn the mass matrix
    step_kwargs = dict(target_accept=0.9)
    sampler.tune(tune=2000, step_kwargs=step_kwargs)

    # Run the production chain
    trace = sampler.sample(draws=2000)
Only 2 samples in chain.
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 154/154 [00:04<00:00, 37.27draws/s]
The chain reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.
The chain reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.
Only 2 samples in chain.
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 54/54 [00:01<00:00, 43.46draws/s]
The chain reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.
The chain reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.
Only 2 samples in chain.
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 104/104 [00:00<00:00, 274.50draws/s]
Only 2 samples in chain.
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 204/204 [00:00<00:00, 722.30draws/s]
Only 2 samples in chain.
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 404/404 [00:00<00:00, 742.09draws/s]
Only 2 samples in chain.
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 804/804 [00:01<00:00, 739.52draws/s]
Only 2 samples in chain.
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 2304/2304 [00:02<00:00, 972.94draws/s]
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 4100/4100 [00:03<00:00, 889.30draws/s]

This is a little more verbose than the standard use of PyMC3, but the performance is several orders of magnitude better than you would get without the mass matrix tuning. As you can see from the pymc3.summary, the autocorrelation time of this chain is about 1 as we would expect for a simple problem like this.

pm.summary(trace)
mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
x__0 -0.000770 0.163174 0.001844 -0.313189 0.326268 5308.199246 1.000289
x__1 0.009594 0.530463 0.006201 -0.966045 1.112490 5869.219661 0.999756
x__2 -0.008501 0.663295 0.008059 -1.262766 1.302731 5604.258930 1.000301
x__3 -0.021440 1.187398 0.014497 -2.399616 2.175922 5562.507765 1.000134
x__4 -0.007055 2.091880 0.025115 -4.017539 4.126919 6069.101651 1.000062

Evaluating model components for specific samples

I find that when I’m debugging a PyMC3 model, I often want to inspect the value of some part of the model for a given set of parameters. As far as I can tell, there isn’t a simple way to do this in PyMC3, so exoplanet comes with a hack for doing this: exoplanet.eval_in_model(). This function handles the mapping between named PyMC3 variables and the input required by the Theano function that can evaluate the requested variable or tensor.

As a demo, let’s say that we’re fitting a parabola to some data:

np.random.seed(42)
x = np.sort(np.random.uniform(-1, 1, 50))
with pm.Model() as model:
    logs = pm.Normal("logs", mu=-3.0, sd=1.0)
    a0 = pm.Normal("a0")
    a1 = pm.Normal("a1")
    a2 = pm.Normal("a2")
    mod = a0 + a1 * x + a2 * x**2

    # Sample from the prior
    prior_sample = pm.sample_prior_predictive(samples=1)
    y = xo.eval_in_model(mod, prior_sample)
    y += np.exp(prior_sample["logs"]) * np.random.randn(len(y))

    # Add the likelihood
    pm.Normal("obs", mu=mod, sd=pm.math.exp(logs), observed=y)

    # Fit the data
    map_soln = pm.find_MAP()
    trace = pm.sample()
logp = 42.615, ||grad|| = 17.722: 100%|██████████| 19/19 [00:00<00:00, 2140.81it/s]
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [a2, a1, a0, logs]
Sampling 2 chains: 100%|██████████| 2000/2000 [00:01<00:00, 1501.39draws/s]
The acceptance probability does not match the target. It is 0.8823602292815543, but should be close to 0.8. Try to increase the number of tuning steps.

After running the fit, it might be interesting to look at the predictions of the model. We could have added a pymc3.Deterministic node for eveything, but that can end up taking up a lot of memory and sometimes its useful to be able to experiement with different outputs. Using exoplanet.utils.eval_in_model() we can, for example, evaluate the maximum a posteriori (MAP) model prediction on a fine grid:

import matplotlib.pyplot as plt

x_grid = np.linspace(-1.1, 1.1, 5000)
with model:
    pred = xo.eval_in_model(a0 + a1 * x_grid + a2 * x_grid**2, map_soln)

plt.plot(x, y, ".k", label="data")
plt.plot(x_grid, pred, label="map")
plt.legend(fontsize=12)
plt.xlabel("x")
plt.ylabel("y")
plt.xlim(-1.1, 1.1);
../../_images/pymc3-extras_10_0.png

We can also combine this with exoplanet.get_samples_from_trace() to plot this prediction for a set of samples in the trace.

samples = np.empty((50, len(x_grid)))
with model:
    y_grid = a0 + a1 * x_grid + a2 * x_grid**2
    for i, sample in enumerate(xo.get_samples_from_trace(trace, size=50)):
        samples[i] = xo.eval_in_model(y_grid, sample)

plt.plot(x, y, ".k", label="data")
plt.plot(x_grid, pred, label="map")
plt.plot(x_grid, samples[0], color="C1", alpha=0.1, label="posterior")
plt.plot(x_grid, samples[1:].T, color="C1", alpha=0.1)
plt.legend(fontsize=12)
plt.xlabel("x")
plt.ylabel("y")
plt.xlim(-1.1, 1.1);
../../_images/pymc3-extras_12_0.png