For Developers#
Models#
List of Tapqir models.
- class tapqir.models.model.Model(S=1, K=2, Q=None, device='cpu', dtype='double', priors=None)[source]#
Bases: object
Base class for tapqir models.
Derived models must implement the methods
- Parameters:
S (int) – Number of distinct molecular states for the binder molecules.
K (int) – Maximum number of spots that can be present in a single image.
device (str) – Computation device (cpu or gpu).
dtype (str) – Floating point precision.
priors (Optional[dict]) – Dictionary of parameters of prior distributions.
- run(num_iter=0, progress_bar=<class 'tqdm.std.tqdm'>)[source]#
Run the inference procedure for a specified number of iterations. If num_iter is zero, run until the model converges.
- save_checkpoint(writer=None)[source]#
Save checkpoint.
- Parameters:
writer (Optional[SummaryWriter]) – SummaryWriter object.
cosmos#
- class tapqir.models.cosmos.cosmos(S=1, K=2, Q=None, device='cpu', dtype='double', use_pykeops=True, priors={'background_mean_std': 1000.0, 'background_std_std': 100.0, 'gain_std': 50.0, 'height_std': 10000.0, 'lamda_rate': 1.0, 'proximity_rate': 1.0, 'width_max': 2.25, 'width_min': 0.75})[source]#
Bases: Model
Multi-Color Time-Independent Colocalization Model
Reference:
Ordabayev YA, Friedman LJ, Gelles J, Theobald DL. Bayesian machine learning analysis of single-molecule fluorescence colocalization images. eLife. 2022 March. doi: 10.7554/eLife.73860.
- Parameters:
- model()[source]#
Generative Model
Model parameters:
Parameter | Shape | Description
--- | --- | ---
gain - \(g\) | (1,) | camera gain
proximity - \(\sigma^{xy}\) | (1,) | proximity
lamda - \(\lambda\) | (1,) | average rate of target-nonspecific binding
pi - \(\pi\) | (1,) | average binding probability of target-specific binding
background - \(b\) | (N, F) | background intensity
z - \(z\) | (N, F) | target-specific spot presence
theta - \(\theta\) | (N, F) | target-specific spot index
m - \(m\) | (K, N, F) | spot presence indicator
height - \(h\) | (K, N, F) | spot intensity
width - \(w\) | (K, N, F) | spot width
x - \(x\) | (K, N, F) | spot position on x-axis
y - \(y\) | (K, N, F) | spot position on y-axis
data - \(D\) | (N, F, P, P) | observed images
Full joint distribution:
\[\begin{split}\begin{aligned} p(D, \phi) =~&p(g) p(\sigma^{xy}) p(\pi) p(\lambda) \prod_{\mathsf{AOI}} \left[ p(\mu^b) p(\sigma^b) \prod_{\mathsf{frame}} \left[ \vphantom{\prod_{F}} p(b | \mu^b, \sigma^b) p(z | \pi) p(\theta | z) \vphantom{\prod_{\substack{\mathsf{pixelX} \\ \mathsf{pixelY}}}} \cdot \right. \right. \\ &\prod_{\mathsf{spot}} \left[ \vphantom{\prod_{F}} p(m | \theta, \lambda) p(h) p(w) p(x | \sigma^{xy}, \theta) p(y | \sigma^{xy}, \theta) \right] \left. \left. \prod_{\substack{\mathsf{pixelX} \\ \mathsf{pixelY}}} \sum_{\delta} p(\delta) p(D | \mu^I, g, \delta) \right] \right] \end{aligned}\end{split}\]
\(z\) and \(\theta\) marginalized joint distribution:
\[\begin{split}\begin{aligned} \sum_{z, \theta} p(D, \phi) =~&p(g) p(\sigma^{xy}) p(\pi) p(\lambda) \prod_{\mathsf{AOI}} \left[ p(\mu^b) p(\sigma^b) \prod_{\mathsf{frame}} \left[ \vphantom{\prod_{F}} p(b | \mu^b, \sigma^b) \sum_{z} p(z | \pi) \sum_{\theta} p(\theta | z) \vphantom{\prod_{\substack{\mathsf{pixelX} \\ \mathsf{pixelY}}}} \cdot \right. \right. \\ &\prod_{\mathsf{spot}} \left[ \vphantom{\prod_{F}} p(m | \theta, \lambda) p(h) p(w) p(x | \sigma^{xy}, \theta) p(y | \sigma^{xy}, \theta) \right] \left. \left. \prod_{\substack{\mathsf{pixelX} \\ \mathsf{pixelY}}} \sum_{\delta} p(\delta) p(D | \mu^I, g, \delta) \right] \right] \end{aligned}\end{split}\]
- guide()[source]#
Variational Distribution
\[\begin{split}\begin{aligned} q(\phi \setminus \{z, \theta\}) =~&q(g) q(\sigma^{xy}) q(\pi) q(\lambda) \cdot \\ &\prod_{\mathsf{AOI}} \left[ q(\mu^b) q(\sigma^b) \prod_{\mathsf{frame}} \left[ \vphantom{\prod_{F}} q(b) \prod_{\mathsf{spot}} q(m) q(h | m) q(w | m) q(x | m) q(y | m) \right] \right] \end{aligned}\end{split}\]
- TraceELBO(jit=False)[source]#
A trace implementation of ELBO-based SVI that supports exhaustive enumeration over discrete sample sites and local parallel sampling over any sample site in the guide.
hmm#
- class tapqir.models.hmm.hmm(S=1, K=2, device='cpu', dtype='double', use_pykeops=True, vectorized=True, priors={'background_mean_std': 1000.0, 'background_std_std': 100.0, 'gain_std': 50.0, 'height_std': 10000.0, 'lamda_rate': 1.0, 'proximity_rate': 1.0, 'width_max': 2.25, 'width_min': 0.75})[source]#
Bases: cosmos
Multi-Color Hidden Markov Colocalization Model
EXPERIMENTAL: This model relies on the Funsor backend.
- Parameters:
K (int) – Maximum number of spots that can be present in a single image.
device (str) – Computation device (cpu or gpu).
dtype (str) – Floating point precision.
use_pykeops (bool) – Use pykeops as backend to marginalize out offset.
vectorized (bool) – Vectorize time-dimension.
priors (dict) – Dictionary of parameters of prior distributions.
- TraceELBO(jit=False)[source]#
A trace implementation of ELBO-based SVI that supports exhaustive enumeration over discrete sample sites and local parallel sampling over any sample site in the guide.
- property theta_probs: Tensor#
Posterior target-specific spot probability \(q(\theta = k, z=z_\mathsf{MAP})\).
Distributions#
KSMOGN#
- class tapqir.distributions.ksmogn.KSMOGN(height, width, x, y, target_locs, background, gain, offset_samples, offset_logits, P, m=None, alpha=None, use_pykeops=True, validate_args=None)[source]#
Bases: TorchDistribution
K-Spots Marginalized Offset Gamma Noise Image Distribution.
\[\mu^S_{\mathsf{pixelX}(i), \mathsf{pixelY}(j)} = \dfrac{m \cdot h}{2 \pi w^2} \exp{\left( -\dfrac{(i-x-x^\mathsf{target})^2 + (j-y-y^\mathsf{target})^2}{2 w^2} \right)}\]
\[\mu^I = b + \sum_{\mathsf{spot}} \mu^S\]
\[p(D|\mu^I, g) = \sum_\delta p(\delta) p(D|\mu^I, g, \delta) = \sum_\delta \delta_\mathsf{weights} \cdot \mathrm{Gamma}(D - \delta_\mathsf{samples} | \mu^I, g)\]
Reference:
Ordabayev YA, Friedman LJ, Gelles J, Theobald DL. Bayesian machine learning analysis of single-molecule fluorescence colocalization images. bioRxiv. 2021 Oct. doi: 10.1101/2021.09.30.462536.
- Parameters:
height (Tensor) – Integrated spot intensity. Should be broadcastable to batch_shape + (K,).
width (Tensor) – Spot width. Should be broadcastable to batch_shape + (K,).
x (Tensor) – Spot center on x-axis. Should be broadcastable to batch_shape + (K,).
y (Tensor) – Spot center on y-axis. Should be broadcastable to batch_shape + (K,).
target_locs (Tensor) – Target location. Should have rightmost size 2, corresponding to locations on x- and y-axes, and be broadcastable to batch_shape + (2,).
background (Tensor) – Background intensity. Should be broadcastable to batch_shape.
gain (Tensor) – Camera gain.
offset_samples (Tensor) – Offset samples from the empirical distribution.
offset_logits (Tensor) – Offset log weights corresponding to the offset samples.
P (int) – Number of pixels along the axis.
m (Optional[Tensor]) – Spot presence indicator. Should be broadcastable to batch_shape + (K,).
alpha (Optional[Tensor]) – Signal cross-talk coefficient matrix. Should be broadcastable to (Q, C).
use_pykeops (bool) – Use pykeops as backend to marginalize out offset.
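The offset-marginalized likelihood \(\sum_\delta p(\delta) p(D|\mu^I, g, \delta)\) can be sketched for a single pixel in plain Python. This is only an illustration: the Gamma parameterization (shape \(\mu^I / g\), rate \(1/g\), so that the mean is \(\mu^I\) and the variance is \(\mu^I g\)) is an assumption of the sketch, and `log_gamma_pdf` and `marginal_log_prob` are hypothetical helper names, not part of the tapqir API.

```python
import math


def log_gamma_pdf(value, mean, gain):
    """Log-density of a Gamma with the given mean and variance mean * gain.

    Assumed parameterization: shape = mean / gain, rate = 1 / gain.
    """
    if value <= 0:
        return -math.inf
    shape = mean / gain
    rate = 1.0 / gain
    return (
        shape * math.log(rate)
        + (shape - 1.0) * math.log(value)
        - rate * value
        - math.lgamma(shape)
    )


def marginal_log_prob(pixel, mean, gain, offset_samples, offset_logits):
    """log of sum_delta p(delta) * Gamma(pixel - delta | mean, gain)."""
    # Normalize the log weights so the offset probabilities sum to one.
    log_norm = math.log(sum(math.exp(logit) for logit in offset_logits))
    terms = [
        (logit - log_norm) + log_gamma_pdf(pixel - delta, mean, gain)
        for delta, logit in zip(offset_samples, offset_logits)
    ]
    # Log-sum-exp over the offsets for numerical stability.
    top = max(terms)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

With a single offset sample the mixture collapses to one Gamma term, which is a convenient sanity check for the marginalization.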
Distribution utility functions#
- tapqir.distributions.util.gaussian_spots(height, width, x, y, target_locs, P, m=None)[source]#
Calculates ideal shape of the 2D-Gaussian spots given spot parameters and target positions.
\[\mu^S_{\mathsf{pixelX}(i), \mathsf{pixelY}(j)} = \dfrac{m \cdot h}{2 \pi w^2} \exp{\left( -\dfrac{(i-x-x^\mathsf{target})^2 + (j-y-y^\mathsf{target})^2}{2 w^2} \right)}\]
- Parameters:
height (Tensor) – Integrated spot intensity. Should be broadcastable to batch_shape.
width (Tensor) – Spot width. Should be broadcastable to batch_shape.
x (Tensor) – Spot center on x-axis. Should be broadcastable to batch_shape.
y (Tensor) – Spot center on y-axis. Should be broadcastable to batch_shape.
target_locs (Tensor) – Target location. Should have rightmost size 2, corresponding to locations on x- and y-axes, and be broadcastable to batch_shape + (2,).
P (int) – Number of pixels along the axis.
m (Optional[Tensor]) – Spot presence indicator. Should be broadcastable to batch_shape.
- Return type:
Tensor
- Returns:
A tensor of shape batch_shape + (P, P) representing 2D-Gaussian spots.
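As a rough illustration of the spot formula, the sketch below evaluates it for one spot on a P x P grid using only the standard library. The function name `gaussian_spot`, the nested-list output, and the use of integer pixel indices as pixel coordinates are assumptions made for the example; the library function itself operates on batched tensors.

```python
import math


def gaussian_spot(height, width, x, y, target_xy, P, m=1.0):
    """Return a P x P nested list holding the ideal 2D-Gaussian spot image."""
    tx, ty = target_xy
    # Prefactor m * h / (2 * pi * w^2) from the formula above.
    norm = m * height / (2.0 * math.pi * width**2)
    image = []
    for i in range(P):  # pixel index along the x-axis
        row = []
        for j in range(P):  # pixel index along the y-axis
            r2 = (i - x - tx) ** 2 + (j - y - ty) ** 2
            row.append(norm * math.exp(-r2 / (2.0 * width**2)))
        image.append(row)
    return image


image = gaussian_spot(height=3000.0, width=1.5, x=0.0, y=0.0, target_xy=(10.0, 10.0), P=21)
total = sum(sum(row) for row in image)
```

For a spot that lies well inside the image, `total` is approximately equal to `height`, which matches the description of height as the integrated spot intensity.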
- tapqir.distributions.util.truncated_poisson_probs(lamda, K)[source]#
Probability of the number of non-specific spots.
\[\begin{split}\mathbf{TruncatedPoisson}(k; \lambda, K) = \begin{cases} 1 - e^{-\lambda} \sum_{i=0}^{K-1} \dfrac{\lambda^i}{i!} & \textrm{if } k = K \\ \dfrac{\lambda^k e^{-\lambda}}{k!} & \mathrm{otherwise} \end{cases}\end{split}\]
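A minimal plain-Python sketch of the truncated Poisson probabilities defined above, for counts 0 through K. The name `truncated_poisson` is a hypothetical helper used for illustration and returns a list rather than a tensor.

```python
import math


def truncated_poisson(lamda, K):
    """Return [p(0), ..., p(K)], with all mass for counts >= K placed on K."""
    # Ordinary Poisson probabilities for k = 0, ..., K - 1.
    probs = [lamda**k * math.exp(-lamda) / math.factorial(k) for k in range(K)]
    # The remaining mass goes to the last entry, k = K.
    probs.append(1.0 - sum(probs))
    return probs


probs = truncated_poisson(lamda=1.0, K=2)
```

By construction the entries sum to one, so the result can be used directly as a categorical distribution over the number of non-specific spots.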
- tapqir.distributions.util.probs_m(lamda, K)[source]#
Prior spot presence probability \(p(m | \theta, \lambda)\).
\[\begin{split}p(m_{\mathsf{spot}(k)} | \theta, \lambda) = \begin{cases} \mathbf{Bernoulli}(1) & \text{$\theta = k$} \\ \mathbf{Bernoulli} \left( \sum_{l=1}^K \dfrac{l \cdot \mathbf{TruncPoisson}(l; \lambda, K)}{K} \right) & \text{$\theta = 0$} \rule{0pt}{4ex} \\ \mathbf{Bernoulli} \left( \sum_{l=1}^{K-1} \dfrac{l \cdot \mathbf{TruncPoisson}(l; \lambda, K-1)}{K-1} \right) & \text{otherwise} \rule{0pt}{4ex} \end{cases}\end{split}\]
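The three cases above can be read as follows: the target-specific spot is always present, and the expected number of non-specific spots is shared evenly among the remaining spot slots. The sketch below follows that reading in plain Python; `truncated_poisson` and `spot_presence_prob` are hypothetical helper names, and spots are indexed from 1 so that \(\theta = 0\) means no target-specific spot.

```python
import math


def truncated_poisson(lamda, K):
    """Return [p(0), ..., p(K)], with all mass for counts >= K placed on K."""
    probs = [lamda**k * math.exp(-lamda) / math.factorial(k) for k in range(K)]
    probs.append(1.0 - sum(probs))
    return probs


def spot_presence_prob(k, theta, lamda, K):
    """Bernoulli probability that spot k (1-based) is present given theta."""
    if theta == k:
        return 1.0  # the target-specific spot is always present
    # Number of spot slots available to target-nonspecific binding.
    slots = K if theta == 0 else K - 1
    tp = truncated_poisson(lamda, slots)
    # Expected number of non-specific spots, divided evenly among the slots.
    return sum(l * tp[l] for l in range(1, slots + 1)) / slots
```

For example, with K = 2 and \(\lambda = 1\), `spot_presence_prob(2, 1, 1.0, 2)` evaluates the "otherwise" case with a single remaining slot.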
- tapqir.distributions.util.expand_offtarget(probs)[source]#
Expand state probability probs (e.g., \(\pi\) or \(A\)) to off-target AOIs.
\[\begin{split}p(\mathsf{state}) = \begin{cases} \mathbf{Categorical}\left( \mathsf{probs} \right) & \textrm{if on-target} \\ \mathbf{Categorical}\left( \left[ 1, 0, \dots, 0 \right] \right) & \textrm{if off-target} \end{cases}\end{split}\]
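A small list-based sketch of this expansion for a single probability vector. The name `expand_offtarget_sketch` is hypothetical, and the row order (off-target first, on-target second) is an assumption of the example.

```python
def expand_offtarget_sketch(probs):
    """Return [off-target probs, on-target probs] for one state distribution."""
    # Off-target AOIs are always in state 0 (no target-specific binding).
    off_target = [1.0] + [0.0] * (len(probs) - 1)
    return [off_target, list(probs)]


expanded = expand_offtarget_sketch([0.85, 0.15])
```

Indexing the result by a 0/1 on-target flag then selects the appropriate distribution for each AOI.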
- tapqir.distributions.util.probs_theta(K, device)[source]#
Prior probability for target-specific spot index \(p(\theta | z)\).
\[\begin{split}p(\theta | z) = \begin{cases} \mathbf{Categorical}\left( \begin{bmatrix} 0 & 1/K & \dots & 1/K \end{bmatrix} \right) & z > 0 \\ \mathbf{Categorical}\left( \begin{bmatrix} 1 & 0 & \dots & 0 \end{bmatrix} \right) & z = 0 \end{cases}\end{split}\]
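The two cases of \(p(\theta | z)\) can be written out as a small table with one row per case. This is a plain-Python sketch with a hypothetical name `theta_prior`; the library function takes a device argument and returns a tensor.

```python
def theta_prior(K):
    """Return rows [p(theta | z = 0), p(theta | z > 0)] over theta = 0..K."""
    # Without a target-specific spot, theta is fixed at 0.
    no_specific = [1.0] + [0.0] * K
    # With a target-specific spot, each of the K spots is equally likely.
    specific = [0.0] + [1.0 / K] * K
    return [no_specific, specific]


table = theta_prior(K=2)
```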
Utility functions#
- class tapqir.utils.dataset.CosmosDataset(images, xy, is_ontarget, mask=None, labels=None, offset_samples=None, offset_weights=None, device=device(type='cpu'), time1=None, ttb=None, name=None, channels=None)[source]#
Cosmos Dataset
- tapqir.utils.imscroll.count_intervals(labels)[source]#
Count binding interval data.
Co-localization and absent intervals are coded as -3 and -2, respectively, when they are the first (or only) interval in a record, as 3 and 2 when they are the last interval in a record, and as 1 and 0 elsewhere.
Reference:
Friedman LJ, Gelles J. Multi-wavelength single-molecule fluorescence analysis of transcription mechanisms. Methods. 2015;86:27-36.
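The interval coding described above can be illustrated for a single binary record. The sketch below is not the library implementation; `code_intervals` is a hypothetical helper that returns (code, length) pairs, assuming a label of 1 marks a co-localization frame and 0 an absent frame.

```python
from itertools import groupby


def code_intervals(labels):
    """Return (code, length) pairs for each run of identical binary labels."""
    runs = [(value, len(list(group))) for value, group in groupby(labels)]
    coded = []
    for index, (value, length) in enumerate(runs):
        if index == 0:  # first (or only) interval in the record
            code = -3 if value else -2
        elif index == len(runs) - 1:  # last interval in the record
            code = 3 if value else 2
        else:  # interior interval
            code = 1 if value else 0
        coded.append((code, length))
    return coded


intervals = code_intervals([0, 0, 1, 1, 1, 0, 1])
```

Only interior intervals (codes 1 and 0) are observed from start to finish; the first and last intervals are truncated by the edges of the record, which is why they receive separate codes.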
- tapqir.utils.imscroll.time_to_first_binding(labels)[source]#
Measure the time elapsed prior to the first binding.
Time to first binding for binary data:
\(\mathrm{ttfb} = \sum_{f=1}^{F-1} f z_{n,f} \prod_{f^\prime=0}^{f-1} (1 - z_{n,f^\prime}) + F \prod_{f^\prime=0}^{F-1} (1 - z_{n,f^\prime})\)
Expected value of the time-to-first binding:
\(\mathbb{E}[\mathrm{ttfb}] = \sum_{f=1}^{F-1} f q(z_{n,f}=1) \prod_{f^\prime=0}^{f-1} q(z_{n,f^\prime}=0) + F \prod_{f^\prime=0}^{F-1} q(z_{n,f^\prime}=0)\)
Reference:
Friedman LJ, Gelles J. Multi-wavelength single-molecule fluorescence analysis of transcription mechanisms. Methods. 2015;86:27-36.
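Both quantities can be sketched for a single AOI in plain Python. The helper names `ttfb_binary` and `ttfb_expected` are hypothetical, and the expected value assumes that frames are independent under \(q\) and that the product runs over all earlier frames \(f^\prime = 0, \dots, f-1\).

```python
def ttfb_binary(z):
    """Index of the first bound frame in a binary trace, or len(z) if none."""
    for f, bound in enumerate(z):
        if bound:
            return f
    return len(z)


def ttfb_expected(q):
    """Expected ttfb when frame f is bound independently with probability q[f]."""
    expected = 0.0
    none_before = 1.0  # probability that frames 0..f-1 are all unbound
    for f, q_f in enumerate(q):
        expected += f * q_f * none_before
        none_before *= 1.0 - q_f
    # If no frame is bound, the time to first binding is set to F = len(q).
    return expected + len(q) * none_before
```

When every probability in `q` is 0 or 1, `ttfb_expected` reduces to `ttfb_binary`, so the binary formula is a special case of the expected value.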
- tapqir.utils.imscroll.association_rate(labels)[source]#
Compute the on-rate from the binary data assuming a two-state hidden Markov model.
- tapqir.utils.imscroll.dissociation_rate(labels)[source]#
Compute the off-rate from the binary data assuming a two-state hidden Markov model.
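One common way to estimate per-frame rates for a two-state model is to count transitions between consecutive frames. The sketch below uses that estimator for a single binary trace; it is an assumption made for illustration and may differ from what the library computes, and `transition_rates` is a hypothetical name.

```python
def transition_rates(labels):
    """Return (on_rate, off_rate) per frame from one binary trace."""
    pairs = list(zip(labels[:-1], labels[1:]))
    # Frames (with a successor) spent in the unbound and bound states.
    off_frames = sum(1 for a, _ in pairs if a == 0)
    on_frames = sum(1 for a, _ in pairs if a == 1)
    # Transitions 0 -> 1 (binding) and 1 -> 0 (release).
    bindings = sum(1 for a, b in pairs if a == 0 and b == 1)
    releases = sum(1 for a, b in pairs if a == 1 and b == 0)
    on_rate = bindings / off_frames if off_frames else float("nan")
    off_rate = releases / on_frames if on_frames else float("nan")
    return on_rate, off_rate


on_rate, off_rate = transition_rates([0, 0, 1, 1, 1, 0, 0, 1])
```

The rates here are in units of inverse frames; converting to physical time would require dividing by the frame interval.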
- tapqir.utils.imscroll.bootstrap(samples, estimator, repetitions=1000, probs=0.68)[source]#
Estimate the confidence interval of an estimator by constructing approximating distributions using the bootstrap method (resampling with replacement).
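The resampling idea can be sketched with the standard library. The version below builds a percentile interval covering the central `probs` mass of the bootstrap estimates; the interval construction, the `seed` argument, and the name `bootstrap_interval` are assumptions of this example rather than a description of the library function.

```python
import random
import statistics


def bootstrap_interval(samples, estimator, repetitions=1000, probs=0.68, seed=0):
    """Percentile bootstrap interval covering the central `probs` mass."""
    rng = random.Random(seed)
    n = len(samples)
    # Re-estimate on `repetitions` datasets resampled with replacement.
    estimates = sorted(
        estimator([samples[rng.randrange(n)] for _ in range(n)])
        for _ in range(repetitions)
    )
    tail = (1.0 - probs) / 2.0
    lower = estimates[int(tail * (repetitions - 1))]
    upper = estimates[int((1.0 - tail) * (repetitions - 1))]
    return lower, upper


data = [2.0, 3.5, 1.0, 4.2, 2.8, 3.1, 5.0, 2.2]
low, high = bootstrap_interval(data, statistics.mean)
```

Any function that maps a list of samples to a number can be passed as `estimator`, for example a rate estimate computed from resampled traces.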