Pigeons.jl enables users to leverage distributed computation to obtain samples from complicated probability distributions, such as multimodal posteriors arising in Bayesian inference and high-dimensional distributions in statistical mechanics. Pigeons is easy to use single-threaded, multi-threaded and/or distributed over thousands of MPI-communicating machines. We demo Pigeons.jl and offer advice to Julia developers who wish to implement correct distributed and randomized algorithms.

In this talk we provide an overview of Pigeons.jl and describe how we addressed the challenges of implementing a distributed, parallelized, and randomized algorithm, exploiting a strong notion of “parallelism invariance” which we will exemplify and motivate. The talk appeals to practitioners who want to leverage distributed computation to perform challenging Bayesian inference tasks or sample from complex distributions such as those arising in statistical mechanics. We briefly describe how Pigeons uses a state-of-the-art method known as non-reversible parallel tempering to efficiently explore challenging posterior distributions. The talk also appeals to a broad array of Julia developers who may want to implement distributed randomized algorithms. The open-source code for Pigeons.jl is available at https://github.com/Julia-Tempering/Pigeons.jl.

Ensuring code correctness at the intersection of randomized, parallel, and distributed algorithms is a challenge. To address it, we designed Pigeons around a notion of “parallelism invariance”: the output for a given input should be **identical** regardless of which of the following four scenarios is used: (1) one machine running on one thread, (2) one machine running on several threads, (3) several machines, each using one thread (in our case, communicating via MPI.jl), and (4) several machines, each using several threads. Since (1) is significantly simpler to implement and debug than (2), (3), and (4), being able to compare the four outputs exactly and pointwise (instead of relying on distributional equality checks, which have nonzero false positive rates) is a powerful tool for detecting software defects.
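To make the idea of a pointwise check concrete, here is a minimal sketch in Python (standard library only). It is not Pigeons code: `run_replica`, `run_all`, and the seed-derivation rule are hypothetical names and choices made for illustration. The assumption is that each replica owns a random stream derived only from a master seed and the replica's index, never from the thread that happens to run it.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_replica(master_seed, replica_index, n_steps=1000):
    """Hypothetical stand-in for one chain: a Gaussian random walk."""
    # The stream depends only on (master_seed, replica_index), not on the
    # executing thread. This simple integer rule is an illustrative
    # assumption, not a true splittable random number generator.
    rng = random.Random(master_seed * 1_000_003 + replica_index)
    state = 0.0
    for _ in range(n_steps):
        state += rng.gauss(0.0, 1.0)
    return state


def run_all(master_seed, n_replicas, n_threads):
    """Run every replica on a pool of n_threads; results are in replica order."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda i: run_replica(master_seed, i),
                             range(n_replicas)))


serial = run_all(master_seed=1, n_replicas=8, n_threads=1)
threaded = run_all(master_seed=1, n_replicas=8, n_threads=4)

# Exact, pointwise comparison: no statistical test is involved.
assert serial == threaded
```

If the generator were instead tied to the executing thread, the final assertion could fail depending on scheduling, which is the kind of defect an exact comparison exposes immediately.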

Two factors tend to cause violations of parallelism invariance: (a) task-local and thread-local random number generators, and (b) non-associativity of floating-point operations. We discuss libraries we have developed to work around (a) and (b) while preserving the same running time complexity, including a Julia SplittableRandom stream library (https://github.com/UBC-Stat-ML/SplittableRandoms.jl) and a custom distributed MPI reduction in pure Julia.
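Factor (b) can be illustrated with a small Python sketch (standard library only). It is not the reduction used in Pigeons: `chunked_sum`, `tree_sum`, and `distributed_tree_sum` are hypothetical helpers, and the sketch assumes that both the number of values and the number of workers are powers of two. The first function mimics a naive reduction whose grouping of additions depends on the worker count; the other two always combine values along the same binary tree, however the leaves are assigned to workers.

```python
import math
from functools import reduce
from operator import add


def chunked_sum(values, n_workers):
    """Naive reduction: each worker left-folds a contiguous chunk,
    then the partial sums are left-folded. Grouping depends on n_workers."""
    size = math.ceil(len(values) / n_workers)
    partials = [reduce(add, values[i:i + size])
                for i in range(0, len(values), size)]
    return reduce(add, partials)


def tree_sum(values):
    """Pairwise reduction over a binary tree fixed by len(values) alone."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return tree_sum(values[:mid]) + tree_sum(values[mid:])


def distributed_tree_sum(values, n_workers):
    """Each worker reduces its own aligned subtree; the partial results are
    then combined along the top of the same global tree.
    Assumes len(values) and n_workers are powers of two."""
    size = len(values) // n_workers
    partials = [tree_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return tree_sum(partials)


values = [1e16, 1.0, -1e16, 1.0]

# Floating-point addition is not associative, so regrouping changes the result.
assert (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)
assert chunked_sum(values, 1) != chunked_sum(values, 2)

# With a fixed tree, the result is bitwise identical for every worker count.
assert len({distributed_tree_sum(values, n) for n in (1, 2, 4)}) == 1
```

The fixed tree buys reproducibility rather than accuracy: every worker count returns the same rounded value, which is what an exact comparison across the four scenarios requires.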

Joint work with: Alexandre Bouchard-Côté, Paul Tiede, Miguel Biron-Lattes, Trevor Campbell, and Saifuddin Syed.