ZigZagBoomerang.jl - parallel inference and variable selection

07/30/2021, 8:00–8:30 PM UTC
Purple

Abstract:

ZigZagBoomerang.jl provides piecewise deterministic Monte Carlo methods. They have the same goal as classical Markov chain Monte Carlo methods: to sample, for example, from the posterior distribution in a Bayesian model. The difference is that the distribution is explored through the continuous movement of a particle rather than one point at a time. This provides new angles of attack: I showcase a multithreaded sampler and a high-dimensional variable selection sampler.

Description:

ZigZagBoomerang.jl - parallel inference and variable selection

ZigZagBoomerang.jl provides piecewise deterministic Monte Carlo (PDMC) methods. They have the same goal as classical Markov chain Monte Carlo methods: to sample from a probability distribution, for example the posterior distribution in a Bayesian model. The difference is that the distribution is explored through the continuous movement of a particle rather than one point at a time. The particle changes direction at random times and otherwise moves along deterministic trajectories; for example, it may move with constant velocity along a line. The random direction changes are calibrated so that the trajectory of the particle samples the target distribution: in general, the particle is turned back (reflected) when it moves too far into the tails of the distribution. From the trajectory, quantities of interest such as the posterior mean and standard deviation can be estimated.
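The mechanics above can be sketched in a few lines for a one-dimensional standard normal target. This is a minimal illustrative Zig-Zag sampler written from scratch, not the ZigZagBoomerang.jl API: the particle moves with velocity θ ∈ {-1, +1} and reflects at the event times of a Poisson process whose rate grows when the particle runs into the tails.

```julia
using Random

# Minimal 1-d Zig-Zag sampler for a standard normal target (a sketch,
# not the package's API). With U(x) = x²/2, the reflection rate along
# the current line is λ(s) = max(0, θ·(x + θs)) = (a + s)⁺ with a = θx,
# so the next event time can be found by exact time inversion.
function zigzag_gaussian(T; x = 0.0, θ = 1.0, rng = Random.default_rng())
    t = 0.0
    ts, xs = [t], [x]                    # skeleton points of the trajectory
    while t < T
        a = θ * x
        E = randexp(rng)                 # Exp(1) draw for the inversion
        # solve ∫₀^τ (a + s)⁺ ds = E for the reflection time τ
        τ = a < 0 ? -a + sqrt(2E) : -a + sqrt(a^2 + 2E)
        t += τ
        x += θ * τ                       # deterministic linear motion
        θ = -θ                           # reflect the velocity
        push!(ts, t); push!(xs, x)
    end
    return ts, xs
end

# Estimate the posterior mean by integrating the piecewise linear path
# (the trapezoid rule is exact on linear segments).
ts, xs = zigzag_gaussian(10_000.0)
m = sum((ts[i+1] - ts[i]) * (xs[i] + xs[i+1]) / 2 for i in 1:length(ts)-1) / ts[end]
```

For the standard normal, `m` should be close to 0, and the accuracy improves as the time horizon `T` grows.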

The decision of whether to change direction in one coordinate requires only the evaluation of a partial derivative, which depends on few coordinates: the neighbourhood of the coordinate in the Markov blanket. That allows exploiting multiple processor cores using Julia's multithreaded parallelism (or other forms of parallel computing). The difference between threaded Gibbs sampling and threaded PDMC sampling is that in Gibbs sampling part of the state is held fixed while the other part is changed, whereas here the particle never ceases to move, and it is the decisions about direction changes that happen in parallel on subsets of coordinates. Metaphorically speaking, this is the difference between walking, where one foot is on the ground at all times, and running, where both feet are in the air between steps.
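The locality of these decisions can be illustrated for a Gaussian target with sparse precision matrix Γ, where the i-th partial derivative ∂ᵢU(x) = (Γx)ᵢ involves only the neighbours of coordinate i (the nonzeros of the i-th row, which for symmetric Γ equals the i-th column). The sketch below, which is illustrative and not the package's internals, computes the per-coordinate reflection rates λᵢ = max(0, θᵢ ∂ᵢU(x)) in parallel with `Threads.@threads`:

```julia
using SparseArrays

# Per-coordinate reflection rates for a Gaussian target with sparse
# precision Γ (symmetric). Each rate touches only the Markov blanket
# of its coordinate, so the loop parallelises without coordination.
function local_rates!(λ, Γ::SparseMatrixCSC, x, θ)
    rv, nz = rowvals(Γ), nonzeros(Γ)
    Threads.@threads for i in axes(Γ, 2)
        s = 0.0
        for k in nzrange(Γ, i)          # nonzero entries of column i
            s += nz[k] * x[rv[k]]
        end
        λ[i] = max(0.0, θ[i] * s)       # rate λᵢ = (θᵢ ∂ᵢU(x))⁺
    end
    return λ
end

# Example: a tridiagonal (chain-graph) precision matrix.
n = 1_000
Γ = spdiagm(-1 => fill(-0.3, n - 1), 0 => fill(1.0, n), 1 => fill(-0.3, n - 1))
λ = local_rates!(zeros(n), Γ, randn(n), rand((-1.0, 1.0), n))
```

Because each iteration reads only its own column of Γ and writes only `λ[i]`, the threads never contend, which is what makes the "running" picture possible.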

Because the particle moves on a deterministic trajectory between the times of random events, one can determine exactly when the process would leave an area of interest. That allows sampling distributions of bounded support, or spending additional time in a lower-dimensional subset of the space, which is the basis of variable selection with sticky PDMPs in high-dimensional sparse inference problems.
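For linear motion this exactness is elementary: a coordinate xᵢ(s) = xᵢ + θᵢs hits zero precisely at s = -xᵢ/θᵢ, provided it is moving towards zero. A simplified sketch of the sticky mechanism (hypothetical helper names, not the package's API) freezes the coordinate at zero for an Exp(κ)-distributed time before releasing it, which places point mass at zero and drives variable selection:

```julia
using Random

# Exact hitting time of zero for linear motion x(s) = x + θs:
# finite only when the coordinate is moving towards zero.
hit_zero(x, θ) = θ * x < 0 ? -x / θ : Inf

# Sticky mechanism (simplified): on hitting zero, the coordinate
# freezes there for an Exp(κ) time before moving off again.
sticky_wait(κ; rng = Random.default_rng()) = randexp(rng) / κ

# A coordinate at x = 1.5 with velocity θ = -1 reaches zero at s = 1.5:
s = hit_zero(1.5, -1.0)   # → 1.5
```

Larger κ means shorter freezes and hence less mass at zero; the stickiness parameter thus plays the role of a prior inclusion weight.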

In the presentation I showcase a multithreaded sampler and high-dimensional variable selection with sticky PDMPs.


Literature

  1. Joris Bierkens, Paul Fearnhead, Gareth Roberts: The Zig-Zag Process and Super-Efficient Sampling for Bayesian Analysis of Big Data. The Annals of Statistics, 2019, Vol. 47, No. 3, pp. 1288–1320. [https://arxiv.org/abs/1607.03188].
  2. Joris Bierkens, Sebastiano Grazzi, Kengo Kamatani, Gareth Roberts: The Boomerang Sampler. ICML 2020. [https://arxiv.org/abs/2006.13777].
  3. Joris Bierkens, Sebastiano Grazzi, Frank van der Meulen, Moritz Schauer: A piecewise deterministic Monte Carlo method for diffusion bridges. Statistics and Computing, 2021 (to appear). [https://arxiv.org/abs/2001.05889].
  4. Joris Bierkens, Sebastiano Grazzi, Frank van der Meulen, Moritz Schauer: Sticky PDMP samplers for sparse and local inference problems. 2020. [https://arxiv.org/abs/2103.08478].
