Gillespie.jl: Stochastic Simulation Algorithm in Julia

Introduction

Gillespie.jl provides an implementation of Gillespie's direct method for performing stochastic simulations, which are widely used in many fields, including systems biology and epidemiology. It borrows the basic interface (although none of the code) from the R library GillespieSSA by Mario Pineda-Krch, although Gillespie.jl only implements the standard exact method at present, whereas GillespieSSA also includes tau-leaping, etc..

Installation

Gillespie.jl can be installed from the Julia read-eval-print-loop (REPL) as follows.

Pkg.add("Gillespie")

Example

Let's take the 'standard' susceptible-infected-recovered (SIR) model, commonly used in epidemiology. A deterministic version of this model is as follows.

Let's consider a stochastic version of the SIR model.

We first need to load the library.

using Gillespie;

We next need to define a function that given state variables x (type: Array{Int64,1}) and a vector of parameters (type: Vector{Float64}), returns a vector of rates of length k for different types of transitions. For this example, there are two transition functions, corresponding to infection and recovery.

function F(x,parms)
  (S,I,R) = x
  (beta,gamma) = parms
  infection = beta*S*I
  recovery = gamma*I
  [infection,recovery]
end;
F (generic function with 1 method)

We define the states of the system - a Vector{Int64} of length n with the number of susceptible, infected, and recovered individuals.

x0 = [9999,1,0];
3-element Array{Int64,1}:
 9999
    1
    0

To define the transitions, we define an Array{Int64,k,n} that denotes the changes to each of the n state variables for each of the k transitions. Infection results in a loss of 1 susceptible and a gain of one infected individual, while recovery is associated with a loss of one infected and a gain of one recovered.

nu = [[-1 1 0];[0 -1 1]];
2x3 Array{Int64,2}:
 -1   1  0
  0  -1  1

Finally, we define the parameter values (in the order required by the function F), and the time we want the simulation to finish. The simulation will finish early if the propensity rates are zero.

parms = [0.1/10000.0,0.05]
tf = 1000.0;
1000.0

Given the above, the simulation can be run using the function ssa. It's usually a good idea to set a random number seed prior to simulation first.

srand(1236)
result = ssa(x0,F,nu,parms,tf);
Gillespie.SSAResult([0.0,11.7409,12.1425,17.647,18.0987,21.2081,22.2022,22.6326,26.1136,27.201  …  413.431,413.962,414.973,420.548,422.656,424.17,424.486,430.513,433.936,482.663],15704x3 Array{Int64,2}:
 9998  2     0
 9998  2     0
 9997  3     0
 9996  4     0
 9995  5     0
 9994  6     0
 9994  5     1
 9994  4     2
 9994  3     3
 9993  4     3
    ⋮
 2149  6  7845
 2149  5  7846
 2149  4  7847
 2148  5  7847
 2148  4  7848
 2148  3  7849
 2148  2  7850
 2148  1  7851
 2148  0  7852,Gillespie.SSAStats("zeroprop",15703),Gillespie.SSAArgs([9999,1,0],ex-1.F,2x3 Array{Int64,2}:
 -1   1  0
  0  -1  1,[1.0e-5,0.05],1000.0))

This will return an object of type SSAresult. This can be converted to a DataFrame using the function ssa_data.

data = ssa_data(result);
15704×4 DataFrames.DataFrame
│ Row   │ time    │ x1   │ x2 │ x3   │
├───────┼─────────┼──────┼────┼──────┤
│ 1     │ 0.0     │ 9998 │ 2  │ 0    │
│ 2     │ 11.7409 │ 9998 │ 2  │ 0    │
│ 3     │ 12.1425 │ 9997 │ 3  │ 0    │
│ 4     │ 17.647  │ 9996 │ 4  │ 0    │
│ 5     │ 18.0987 │ 9995 │ 5  │ 0    │
│ 6     │ 21.2081 │ 9994 │ 6  │ 0    │
│ 7     │ 22.2022 │ 9994 │ 5  │ 1    │
│ 8     │ 22.6326 │ 9994 │ 4  │ 2    │
⋮
│ 15696 │ 413.962 │ 2149 │ 6  │ 7845 │
│ 15697 │ 414.973 │ 2149 │ 5  │ 7846 │
│ 15698 │ 420.548 │ 2149 │ 4  │ 7847 │
│ 15699 │ 422.656 │ 2148 │ 5  │ 7847 │
│ 15700 │ 424.17  │ 2148 │ 4  │ 7848 │
│ 15701 │ 424.486 │ 2148 │ 3  │ 7849 │
│ 15702 │ 430.513 │ 2148 │ 2  │ 7850 │
│ 15703 │ 433.936 │ 2148 │ 1  │ 7851 │
│ 15704 │ 482.663 │ 2148 │ 0  │ 7852 │

This makes it straightforward to plot e.g. using Gadfly.

using Gadfly
plot(data,
  layer(x="time",y="x1",Geom.step,Theme(default_color=colorant"red")),
  layer(x="time",y="x2",Geom.step,Theme(default_color=colorant"blue")),
  layer(x="time",y="x3",Geom.step,Theme(default_color=colorant"green")),
  Guide.xlabel("Time"),
  Guide.ylabel("Number"),
  Guide.manual_color_key("Population",
                            ["S", "I", "R"],
                            ["red", "blue", "green"]),
  Guide.title("SIR epidemiological model"))

Application programming interface

Types

# Gillespie.SSAStatsType.

A type storing the status at the end of a call to ssa:

  • termination_status : whether the simulation stops at the final time (finaltime) or early due to zero propensity function (zeroprop)
  • nsteps : the number of steps taken during the simulation.

# Gillespie.SSAArgsType.

A type storing the call to ssa:

  • x0 : a Vector of Int64, representing the initial states of the system.
  • F : a Function or a callable type, which itself takes two arguments; x, a Vector of Int64 representing the states, and parms, a Vector of Float64 representing the parameters of the system.
  • nu : a Matrix of Int64, representing the transitions of the system, organised by row.
  • parms : a Vector of Float64 representing the parameters of the system.
  • tf : the final simulation time (Float64)

# Gillespie.SSAResultType.

This type stores the output of ssa, and comprises of:

  • time : a Vector of Float64, containing the times of simulated events.
  • data : a Matrix of Int64, containing the simulated states.
  • stats : an instance of SSAStats.
  • args : arguments passed to ssa.

Functions

# Gillespie.ssaFunction.

This function performs Gillespie's stochastic simulation algorithm. It takes the following arguments:

  • x0 : a Vector of Int64, representing the initial states of the system.
  • F : a Function or a callable type, which itself takes two arguments; x, a Vector of Int64 representing the states, and parms, a Vector of Float64 representing the parameters of the system.
  • nu : a Matrix of Int64, representing the transitions of the system, organised by row.
  • parms : a Vector of Float64 representing the parameters of the system.
  • tf : the final simulation time (Float64)

# Gillespie.ssa_dataFunction.

This takes a single argument of type SSAResult and returns a DataFrame.

# Gillespie.pfsampleFunction.

This function is a substitute for StatsBase.sample(wv::WeightVec), which avoids recomputing the sum and size of the weight vector, as well as a type conversion of the propensity vector. It takes the following arguments:

  • w : an Array{Float64,1}, representing propensity function weights.
  • s : the sum of w.
  • n : the length of w.