Part I: Intro (Generic)#

Models#

Tapqir is a modular program that uses a chosen probabilistic model to interpret experimental data. Currently there is only one model, cosmos, developed for the analysis of simple CoSMoS experiments. The cosmos model performs time-independent analysis of single-channel (i.e., one-binder) data sets. Our publication (Ordabayev et al., 2022) contains a comprehensive description of the cosmos model. In the future, we plan to add additional models to Tapqir, for example to integrate hidden-Markov kinetic analysis or to handle global analysis with multiple wavelength channels.

Tapqir uses Bayesian models; this means that each model parameter has an associated probability distribution (uncertainty). For those who are interested, Kinz-Thompson et al., 2021 is a nice read about Bayesian inference in the context of single-molecule data analysis.

As a consequence of Bayesian inference, Tapqir computes, for each frame of each area of interest (AOI), the probability \(p(\mathsf{specific})\) that a target-specific spot is present.
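As a minimal sketch of how such per-frame probabilities might be used downstream, the snippet below converts a small array of \(p(\mathsf{specific})\) values into binary spot-present calls. The array `p_specific`, its values, and the 0.5 cutoff are illustrative assumptions; they are not Tapqir output and this is not a Tapqir API.

```python
import numpy as np

# Hypothetical p(specific) values for N = 2 AOIs (rows) and F = 4 frames (columns).
p_specific = np.array([
    [0.02, 0.97, 0.91, 0.10],
    [0.45, 0.55, 0.03, 0.99],
])

# Assumed cutoff: call a target-specific spot "present" when p(specific) > 0.5.
threshold = 0.5
spot_present = p_specific > threshold

# Number of frames per AOI in which a target-specific spot is called present.
frames_with_spot = spot_present.sum(axis=1)

print(spot_present)
print(frames_with_spot)
```

Keeping the probabilities themselves, rather than only the thresholded calls, preserves the uncertainty information that the Bayesian analysis provides.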

cosmos is a physics-informed model, i.e., its parameters have a physical meaning. For N AOIs per frame, F frames, and a maximum of K spots in each AOI in each frame, Tapqir estimates the values of the cosmos model parameters:

Parameter	Shape	Description
`gain` - \(g\)	(1,)	camera gain
`proximity` - \(\sigma^{xy}\)	(1,)	proximity
`lamda` - \(\lambda\)	(1,)	average rate of target-nonspecific binding
`pi` - \(\pi\)	(1,)	average binding probability of target-specific binding
`background` - \(b\)	(N, F)	background intensity
`z` - \(z\)	(N, F)	target-specific spot presence
`theta` - \(\theta\)	(N, F)	target-specific spot index
`m` - \(m\)	(K, N, F)	spot presence indicator
`height` - \(h\)	(K, N, F)	spot intensity
`width` - \(w\)	(K, N, F)	spot width
`x` - \(x\)	(K, N, F)	spot position on x-axis
`y` - \(y\)	(K, N, F)	spot position on y-axis
`data` - \(D\)	(N, F, P, P)	observed images

where “shape” is the dimensionality of the parameters, e.g., (1,) shape means a scalar parameter, (K, N, F) shape means that each spot in each AOI in each frame has a separate value of the parameter, and (N, F, P, P) shape means one P x P pixel image for each AOI in each frame. Ordabayev et al., 2022 has a more detailed description of the parameters.
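To make the shape convention concrete, the following sketch allocates placeholder NumPy arrays with the shapes listed in the table and shows how individual values would be indexed. The sizes chosen for N, F, K, and P and the `params` dictionary are illustrative assumptions, not the way Tapqir stores its parameters internally.

```python
import numpy as np

# Illustrative sizes: AOIs, frames, max spots per AOI, AOI edge length in pixels.
N, F, K, P = 5, 100, 2, 14

# Shapes copied from the parameter table above.
shapes = {
    "gain": (1,),
    "proximity": (1,),
    "lamda": (1,),
    "pi": (1,),
    "background": (N, F),
    "z": (N, F),
    "theta": (N, F),
    "m": (K, N, F),
    "height": (K, N, F),
    "width": (K, N, F),
    "x": (K, N, F),
    "y": (K, N, F),
    "data": (N, F, P, P),
}

# Placeholder arrays (all zeros) that only demonstrate the layout.
params = {name: np.zeros(shape) for name, shape in shapes.items()}

# Zero-based indexing: intensity of spot 1 in AOI 3 at frame 10.
h = params["height"][1, 3, 10]

# The observed image for AOI 3 at frame 10 is a P x P array.
image = params["data"][3, 10]

for name, array in params.items():
    print(f"{name:10s} {array.shape}")
```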

Some basic Linux commands#

For a quick reference, some commonly used Linux commands:

pwd - Print the name of the current working directory.
ls - List files and folders.
cd - Change the working directory (e.g., cd Downloads)
mkdir - Create a folder (e.g., mkdir new_folder). Tip: try to avoid spaces in file and folder names, because each space must be escaped with a backslash (\).
rm - Delete files. Use rm -r to delete folders. Be careful: files deleted with the rm command do not go to the recycle bin and are permanently deleted!
cp - Copy files. Usage is cp <from> <to>.
mv - Move or rename files. Usage is mv <from> <to>.
Press [TAB] twice for command or filename completion.
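For readers who prefer to script these file operations, the sketch below performs rough equivalents of mkdir, cp, mv, ls, and rm using only the Python standard library. The file and folder names are hypothetical, and everything happens inside a temporary directory so that nothing of value can be deleted by accident.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory created just for this example.
workdir = Path(tempfile.mkdtemp())

new_folder = workdir / "new_folder"
new_folder.mkdir()                              # mkdir new_folder

notes = new_folder / "notes.txt"
notes.write_text("example\n")                   # create a small file to work with

copied = new_folder / "notes_copy.txt"
shutil.copy(notes, copied)                      # cp notes.txt notes_copy.txt

renamed = new_folder / "notes_renamed.txt"
copied.rename(renamed)                          # mv notes_copy.txt notes_renamed.txt

listing = sorted(p.name for p in new_folder.iterdir())  # ls new_folder
print(listing)

notes.unlink()                                  # rm notes.txt (permanent)
shutil.rmtree(new_folder)                       # rm -r new_folder (permanent)
```

As with rm, `unlink` and `shutil.rmtree` delete permanently and bypass the recycle bin.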

Input data#

Tapqir analyzes a small area of interest (AOI) around each target or off-target location. AOIs (usually 14x14 pixels) are extracted from raw input data. Currently Tapqir supports raw input images in Glimpse format and pre-processing information files from the imscroll program:

folder containing image data in Glimpse format and header files
driftlist file recording the stage movement that took place during the experiment
aoiinfo file designating target molecule locations in the binder channel
(optional) aoiinfo file designating off-target locations in the binder channel
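The idea of extracting an AOI around a drift-corrected location can be sketched as follows. This is a simplified illustration on a synthetic image, not Tapqir's actual extraction code: the function `crop_aoi`, the rounding convention, the coordinate values, and the assumption that rows correspond to y and columns to x are all hypothetical.

```python
import numpy as np

def crop_aoi(frame, center_x, center_y, size=14):
    """Return a size x size window roughly centered on (center_x, center_y)."""
    # Top-left corner of the window, rounded to whole pixels.
    left = int(round(center_x - (size - 1) / 2))
    top = int(round(center_y - (size - 1) / 2))
    height, width = frame.shape
    if left < 0 or top < 0 or left + size > width or top + size > height:
        raise ValueError("AOI extends beyond the image boundary")
    # Assumed convention: rows index y, columns index x.
    return frame[top:top + size, left:left + size]

# Synthetic 64 x 64 frame with Poisson-distributed pixel values.
rng = np.random.default_rng(0)
frame = rng.poisson(lam=100, size=(64, 64)).astype(float)

# Hypothetical target location (as an aoiinfo file might provide) and the
# stage drift for this frame (as a driftlist file might provide), in pixels.
target_x, target_y = 30.2, 25.7
drift_dx, drift_dy = 0.6, -0.3

aoi = crop_aoi(frame, target_x + drift_dx, target_y + drift_dy)
print(aoi.shape)
```

Repeating the crop for every target location and every frame would yield an array with the (N, F, P, P) layout.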

We plan to extend the support to other data formats. Please start a new issue if you would like to work with us to extend support to file formats used in your processing pipeline.

Workflow#

The following diagram shows the steps in a Tapqir data processing run (using the cosmos model), the Tapqir command used to run each step, and the input files used and output files produced (color highlights) in each step. All the Tapqir commands for a single processing run should be run in the same default working directory (new_folder in the diagram) in order to keep the files associated with the run organized in a single location.