library("tidyverse"); theme_set(theme_bw())
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
set.seed(20230307)
The exponential distribution is commonly used as a waiting time distribution, i.e. the time until something happens.
A random variable \(X\) has an exponential distribution if its density function is \[ f(x) = \lambda e^{-\lambda x} \quad \mbox{for }x>0 \] for rate \(\lambda>0\).
r <- 3 # rate
ggplot(data.frame(x = 0:(4/r))) +
stat_function(fun = dexp, args = list(rate = r))
The probability density function is strictly decreasing for any rate.
Note: Some parameterizations of the exponential use the mean rather than the rate as the parameter.
The cumulative distribution function is \[ F(x) = P(X \le x) = 1-e^{-\lambda x} \quad \mbox{for }x > 0. \]
ggplot(data.frame(x = 0:(4/r))) +
stat_function(fun = pexp, args = list(rate = r))
The probability of being greater than some value has a simple form for the exponential distribution. \[ P(X > x) = 1-P(X\le x) = e^{-\lambda x}. \]
The exponential distribution has a property called the memoryless property that states \[ P(X > x+c | X > c) = P(X > x) = e^{-\lambda x} \]
For example,
c <- 1
# Check probability
1-pexp(c, rate = r) # if this probability is too small, the simulation will take a while
## [1] 0.04978707
# Simulate exponential random variables that are greater than c
x <- rexp(1e4, rate = r)
for (i in seq_along(x)) {
while (x[i] < c)
x[i] <- rexp(1, rate = r)
}
ggplot(
data.frame(x=x),
aes(x-c) # performs the shift
) +
stat_ecdf() +
stat_function(fun = pexp, args = list(rate = r), color = "red")
Let \[ X_i \stackrel{ind}{\sim} Exp(\lambda_i) \] what is the distribution for \(T = \min\{X_1,\ldots,X_n\}\)?
It turns out that \[ T \sim Exp\left(\sum_{i=1}^n \lambda_i\right). \]
If you are only told that \(T=t\)
which \(X_i=t\)?
That is, which exponential was the minimum one?
We won’t know, but the probability is proportional to the rate. So, we know \[ P(X_i = t) \propto \lambda_i = \frac{\lambda_i}{\sum_{j=1}^n \lambda_j}. \]
These types of models are used in a wide variety of fields. Here we will motivate the module using a chemistry experiment called Michaelis-Menton kinetics. This experiment creates a product “P” from a substrate “S” in a reaction that is catalyzed by an enzyme “E”. In order to convert the substrate to the product, the enzyme must bind to the substrate to create the enzyme-substrate complex “ES”.
This system can be represented by this system of reactions \[ E + S \leftrightarrows ES \to P. \] This system of reactions actually has 3 reactions: \[ \begin{array}{rll} I:& E + S &\stackrel{\lambda_1}{\to} ES \\ II:& ES &\stackrel{\lambda_2}{\to} E + S \\ III:& ES &\stackrel{\lambda_3}{\to} E + P \\ \end{array} \] The \(\lambda\) indicate the rate of the reaction.
In this system, reaction I requires that an enzyme molecule come into contact with a substrate molecule.
If these molecules are in a volume, then the system might look like this
set.seed(20230309)
n <- 30
d <- data.frame(
x = runif(n),
y = runif(n),
species = sample(c("E","S","ES","P"), size = n, replace = TRUE,
prob = c(0.2,0.6,0.1,0.1)),
speed = runif(n,0.05, 0.1),
direction = runif(n, max = 2*pi)
) %>%
mutate(
xend = x + speed*cos(direction),
yend = y + speed*sin(direction),
species = factor(species, levels = c("E","S","ES","P"))
)
ggplot(d, aes(x = x, y = y, xend = xend, yend=yend,
color = species, shape = species)) +
geom_point(size = 5) +
geom_segment(arrow = arrow(length = unit(0.2, "cm")), color = "black") +
scale_color_manual(
values = c(
"E" = "red",
"S" = "blue",
"ES" = "purple",
"P" = "green"
)
) +
scale_shape_manual(
values = c(
"E" = 4,
"S" = 0,
"ES" = 7,
"P" = 19
)
)
These reactions can be coded by indicting two matrices: Pre and Post. These matrices indicate the reactants (Pre) and the products (Post).
Pre <- rbind(
# E S ES P
c(1, 1, 0, 0),
c(0, 0, 1, 0),
c(0, 0, 1, 0)
)
rownames(Pre) <- c("I","II","III")
colnames(Pre) <- c("E","S","ES","P")
Pre
## E S ES P
## I 1 1 0 0
## II 0 0 1 0
## III 0 0 1 0
Post <- rbind(
# E S ES P
c(0, 0, 1, 0),
c(1, 1, 0, 0),
c(1, 0, 0, 1)
)
rownames(Post) <- c("I","II","III")
colnames(Post) <- c("E","S","ES","P")
Post
## E S ES P
## I 0 0 1 0
## II 1 1 0 0
## III 1 0 0 1
The reaction matrix is constructed by subtracting these and the stoichiometry matrix by transposing the reaction matrix.
A <- Post - Pre # reaction
S <- t(A) # stoichiometry
A
## E S ES P
## I -1 -1 1 0
## II 1 1 -1 0
## III 1 0 -1 1
S
## I II III
## E -1 1 1
## S -1 1 0
## ES 1 -1 -1
## P 0 0 1
We can update the state of the system by either the stoichiometry or reaction matrix.
x0 <- c(5, 100, 1, 0)
x0 + S[,1]
## E S ES P
## 4 99 2 0
x0 + S[,2]
## E S ES P
## 6 101 0 0
x0 + S[,3]
## E S ES P
## 6 100 0 1
To determine which reaction occurs we need to calculate the propensity of the reaction. The propensity is the product of two components: the reaction rate and the number of combinations of the Pre molecules.
For this model we have the following propensities: \[ \begin{array}{rll} I:& \lambda_1 E \cdot S \\ II:& \lambda_2 ES \\ III:& \lambda_3 ES \\ \end{array} \]
To simulate this continuous time Markov process, we iterate through these steps
michaelis_menton_reaction_matrix <- rbind(
# E S ES P
c(-1, -1, 1, 0),
c( 1, 1, -1, 0),
c( 1, 0, -1, 1)
)
#' Function to simulate a single Michaelis-Menton transition
#'
#' This function will simulate a single predator-prety transition based on the
#' current `state` of the system and the reaction `rates`
simulate_michaelis_menton_transition <- function(state, rate, initial_time) {
propensity <- c(
rate[1] * state[1] * state[2],
rate[2] * state[3] ,
rate[3] * state[3]
)
time_increment <- rexp(1, rate = sum(propensity))
reaction <- sample(1:3, 1, prob = propensity)
state <- state + michaelis_menton_reaction_matrix[reaction,]
return(
list(
time = initial_time + time_increment,
state = state
))
}
simulate_michaelis_menton_system <- function(initial_state, rate, max_rxns = 1e3) {
# Create storage structures for state and time
n_possible_rxns <- max_rxns + 1
state <- matrix(initial_state, nrow = 4, ncol = n_possible_rxns)
n_rxns <- 1
time <- rep(0, n_possible_rxns)
# Simulate outbreak
while(any(state[2:3,n_rxns] > 0) & n_rxns <= max_rxns) {
tmp <- simulate_michaelis_menton_transition(state[,n_rxns], rate, time[n_rxns])
n_rxns <- n_rxns + 1
time[n_rxns] <- tmp$time
state[,n_rxns] <- tmp$state
}
d <- data.frame(
E = state[1, ],
S = state[2, ],
ES = state[3, ],
P = state[4, ],
time = time
) %>%
pivot_longer(-time, names_to = "state", values_to = "count") %>%
mutate(
state = factor(state, levels = c("E","S","ES","P"))
)
}
This
set.seed(1)
d <- simulate_michaelis_menton_system(c(5, 100, 0, 0), rate = c(1,1,.1), max_rxns = 1e3)
ggplot(d, aes(x = time, y = count, color = state)) +
geom_step()
This model is a continuous time Markov process because
The SIR model has three states:
There are two reactions that can occur \[ \begin{array}{r\quad{}ll} I:&S + I &\stackrel{\lambda}{\to} 2I \\ II:&I &\stackrel{\rho}{\to} R \end{array} \]
The propensity of these reactions are \[ \begin{array}{r\quad{}ll} I:& \lambda \, S\cdot I / N \\ II:& \rho \, I \end{array} \]
The system is updated by the reaction matrix
\[ \begin{array}{r|rrr} & S & I & R \\ \hline I & -1 & 1 & 0 \\ II & 0 & -1 & 1\\ \end{array} \]
Due to the two reactions converting a single individual from one state to another, this model has a constant population size.
sir_reaction_matrix <- rbind(
# S I R
c(-1, 1, 0),
c( 0, -1, 1)
)
#' Function to simulate a single SIR transition
#'
#' This function will simulate a single SIR transition based on the current
#' `state` of the system and the reaction `rates`
simulate_sir_transition <- function(state, rate, initial_time) {
propensity <- c(
rate[1] * state[1] * state[2] / sum(state),
rate[2] * state[2]
)
time_increment <- rexp(1, rate = sum(propensity))
reaction <- sample(1:2, 1, prob = propensity)
# Update state according to stoichiometry
state <- state + sir_reaction_matrix[reaction,]
return(
list(
time = initial_time + time_increment,
state = state
))
}
simulate_outbreak <- function(initial_state, rate, max_rxns = 5000) {
# Create storage structures for state and time
n_possible_rxns <- min(2*initial_state[1] + initial_state[2]+1, max_rxns + 1)
state <- matrix(initial_state, nrow = 3, ncol = n_possible_rxns)
n_rxns <- 1
time <- rep(0, n_possible_rxns)
# Simulate outbreak
while(state[2,n_rxns] > 0 & n_rxns <= max_rxns ) {
tmp <- simulate_sir_transition(state[,n_rxns], rate, time[n_rxns])
n_rxns <- n_rxns + 1
time[n_rxns] <- tmp$time
state[,n_rxns] <- tmp$state
}
# Clean up data frames
n <- which(state[2,] == 0)
d <- data.frame(
S = state[1, 1:n],
I = state[2, 1:n],
R = state[3, 1:n],
time = time[1:n]
) %>%
pivot_longer(-time, names_to = "state", values_to = "count") %>%
mutate(
state = factor(state, levels = c("S","I","R"))
)
}
Sometimes the outbreak will die out right away
set.seed(1)
d <- simulate_outbreak(c(1000, 1, 0), rate = c(2,1))
ggplot(d, aes(x = time, y = count, color = state)) +
geom_step()
Sometimes the outbreak will last a while
set.seed(4)
d <- simulate_outbreak(c(1000, 1, 0), rate = c(2,1))
ggplot(d, aes(x = time, y = count, color = state)) +
geom_step()
One question we might ask is: what is the probability the outbreak will die out quickly (before time 5) for a given initial state and reaction rates.
To answer this question, we can run a Monte Carlo study
X0 <- c(1000, 2, 0)
rate <- c(2,1)
die_out <- replicate(1e3, {
d <- simulate_outbreak(X0, rate)
return(max(d$time) < 5)
})
mean(die_out)
## [1] 0.258
binom.test(sum(die_out), length(die_out))$conf.int
## [1] 0.2311271 0.2863056
## attr(,"conf.level")
## [1] 0.95
Another question could be, given the outbreak lasts longer than 5, what is the expected number of people who will be infected?
X0 <- c(1000, 1, 0)
rate <- c(2,1)
infected <- replicate(1e3, {
d <- simulate_outbreak(X0, rate)
while(max(d$time) < 5)
d <- simulate_outbreak(X0, rate)
return(max(d$count[d$state == "R"]))
})
t.test(infected)$conf.int
## [1] 791.6672 797.3448
## attr(,"conf.level")
## [1] 0.95
A simple predator-prey model (stochastic version of the Lotka-Volterra Model) has two states:
There are two reactions that can occur \[ \begin{array}{r\quad{}ll} I:&R &\stackrel{\lambda}{\to} 2R \\ II:&F+R &\stackrel{\rho}{\to} 2F \\ II:&F &\stackrel{\delta}{\to} \emptyset \\ \end{array} \] The first reaction simply
The propensity of these reactions are \[ \begin{array}{r\quad{}ll} I:& \lambda \, R \\ II:& \rho \, F\cdot R\\ II:& \delta \, F \end{array} \]
The system is updated by the reaction matrix
\[ \begin{array}{r|rrr} & R & F \\ \hline I & 1 & 0 \\ II & -1 & 1 \\ III & 0 & -1 \end{array} \]
This model does not have a constant population size.
predator_prey_reaction_matrix <- rbind(
# R F
c( 1, 0),
c(-1, 1),
c( 0, -1)
)
#' Function to simulate a single predator-prey transition
#'
#' This function will simulate a single predator-prety transition based on the
#' current `state` of the system and the reaction `rates`
simulate_transition <- function(state, rate, initial_time) {
propensity <- c(
rate[1] * state[1],
rate[2] * state[1] * state[2],
rate[3] * state[2]
)
time_increment <- rexp(1, rate = sum(propensity))
reaction <- sample(1:3, 1, prob = propensity)
# Update state according to reaction matrix
state <- state + predator_prey_reaction_matrix[reaction,]
return(
list(
time = initial_time + time_increment,
state = state
))
}
simulate_predator_prey_system <- function(initial_state, rate, max_rxns = 1e4) {
# Create storage structures for state and time
n_possible_rxns <- max_rxns + 1
state <- matrix(initial_state, nrow = 2, ncol = n_possible_rxns)
n_rxns <- 1
time <- rep(0, n_possible_rxns)
# Simulate outbreak
while(state[2,n_rxns] > 0 & n_rxns <= max_rxns ) {
tmp <- simulate_transition(state[,n_rxns], rate, time[n_rxns])
n_rxns <- n_rxns + 1
time[n_rxns] <- tmp$time
state[,n_rxns] <- tmp$state
}
d <- data.frame(
R = state[1, ],
F = state[2, ],
time = time
) %>%
pivot_longer(-time, names_to = "state", values_to = "count") %>%
mutate(
state = factor(state, levels = c("R","F"))
)
}
Sometimes the foxes will eat all the rabbits
set.seed(1)
d <- simulate_predator_prey_system(c(20,20), rate = c(1,.01,1), max_rxns = 1e5)
ggplot(d, aes(x = time, y = count, color = state)) +
geom_step()