
Synthetic data example

Overview

The goal of this example is to use merfish3d-analysis to retrieve and analyze simulated iterative RNA-FISH experiments generated by the statphysbio lab at ASU.

Try without installing: Google Colab Notebook

If you would like to try out merfish3d-analysis without installing, there is an example notebook that can be run on Google Colab here.

Preliminaries

Make sure you have a working Python environment with merfish3d-analysis properly installed and the synthetic dataset downloaded. The dataset is ~1 GB, and you will need roughly another 1 GB of temporary space to create the qi2labDataStore structure we use to perform tile registration, pixel decoding, filtering, and cell assignment.
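As a quick pre-flight check, the space requirement above can be verified from Python before downloading. This is a minimal sketch using only the standard library; the helper name `has_enough_space` and the 2 GiB default (about 1 GB for the dataset plus about 1 GB of temporary space) are assumptions for illustration, not part of merfish3d-analysis.

```python
import shutil

GIB = 1024 ** 3  # bytes per gibibyte


def has_enough_space(path: str, required_gb: float = 2.0) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GiB free.

    The 2.0 default is an assumption: ~1 GB for the dataset plus ~1 GB of
    temporary space for the datastore.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * GIB


if __name__ == "__main__":
    # "." is a placeholder; point this at the intended download location.
    print("Enough free space:", has_enough_space("."))
```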

Downloading the data

The Zenodo link contains four types of simulations: (1) MERFISH in cells; (2) MERFISH randomly distributed; (3) smFISH in cells; (4) smFISH randomly distributed. For each simulation type, there are three axial spacings: 0.315, 1.0, and 1.5 micrometers. The directory structure is as follows:

/path/to/download/
├── example_16bit_cells/
  ├── 0.315/
    ├── beads/
        ├── codebook.csv
        └── experiment_and_GT.h5
    ├── aligned_1.tif
    ├── bit_order.csv
    ├── codebook.csv
    ├── GT_spots.csv
    ├── experiment_and_GT.h5
    ├── norm_offset.json
    └── scan_metadata.csv
  ├── 1.0/
    ├── beads/
        ├── codebook.csv
        └── experiment_and_GT.h5
    ├── aligned_1.tif
    ├── bit_order.csv
    ├── codebook.csv
    ├── GT_spots.csv
    ├── experiment_and_GT.h5
    ├── norm_offset.json
    └── scan_metadata.csv
  ├── 1.5/
    ├── beads/
        ├── codebook.csv
        └── experiment_and_GT.h5
    ├── aligned_1.tif
    ├── bit_order.csv
    ├── codebook.csv
    ├── GT_spots.csv
    ├── experiment_and_GT.h5
    ├── norm_offset.json
    └── scan_metadata.csv
├── example_16bit_uniform/
...
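
To confirm that a download unpacked as expected, the layout shown above can be checked with a short script. This is a sketch only: the file list is copied from the tree above, and the helper name `missing_files` is hypothetical rather than part of merfish3d-analysis.

```python
from pathlib import Path

# Files listed under each axial-spacing directory in the tree above.
EXPECTED_FILES = (
    "aligned_1.tif",
    "bit_order.csv",
    "codebook.csv",
    "GT_spots.csv",
    "experiment_and_GT.h5",
    "norm_offset.json",
    "scan_metadata.csv",
    "beads/codebook.csv",
    "beads/experiment_and_GT.h5",
)
AXIAL_SPACINGS = ("0.315", "1.0", "1.5")


def missing_files(simulation_dir):
    """Return the expected files (relative to `simulation_dir`) that are absent."""
    root = Path(simulation_dir)
    missing = []
    for spacing in AXIAL_SPACINGS:
        for rel in EXPECTED_FILES:
            candidate = root / spacing / rel
            if not candidate.is_file():
                missing.append(candidate.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Placeholder path taken from the tree above.
    for name in missing_files("/path/to/download/example_16bit_cells"):
        print("missing:", name)
```

An empty result means every listed file was found for all three axial spacings of that simulation type.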

Processing non-qi2lab data

Because this is a simulated experiment, it does not follow the standard metadata or file structure of the microscope file format we internally use in qi2lab. Therefore, we first convert the simulation to the qi2lab experiment format and proceed from there.

Processing steps

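We provide a command line interface (CLI) to run the simulation analysis, with one command per processing step. Run the commands in the order shown. In the examples below, steps 1 and 5 take the simulation directory, while steps 2 through 4 take its sim_acquisition subdirectory.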

  1. Simulation conversion
    conda run -n merfish3d --live-stream bash -lc "sim-convert /path/to/simulation/example_16bit_cells/0.315"
    
  2. Simulation to qi2lab datastore conversion
    conda run -n merfish3d --live-stream bash -lc "sim-datastore /path/to/simulation/example_16bit_cells/0.315/sim_acquisition"
    
  3. Data pre-processing
    conda run -n merfish3d --live-stream bash -lc "sim-preprocess /path/to/simulation/example_16bit_cells/0.315/sim_acquisition"
    
  4. Pixel decoding and RNA calling
    conda run -n merfish3d --live-stream bash -lc "sim-decode /path/to/simulation/example_16bit_cells/0.315/sim_acquisition"
    
  5. Calculate F1-score
    conda run -n merfish3d --live-stream bash -lc "sim-f1score /path/to/simulation/example_16bit_cells/0.315"
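
The five commands above can also be driven from a single Python script, which is convenient when repeating the analysis for each axial spacing. The following is a sketch, not an official entry point: it only reassembles the exact command lines shown above, and the names `build_commands` and `run_pipeline` are hypothetical.

```python
import shlex
import subprocess
from pathlib import PurePosixPath

CONDA_ENV = "merfish3d"  # environment name used in the commands above

# (CLI tool, whether it takes the sim_acquisition subdirectory), in run order.
STEPS = (
    ("sim-convert", False),
    ("sim-datastore", True),
    ("sim-preprocess", True),
    ("sim-decode", True),
    ("sim-f1score", False),
)


def build_commands(simulation_dir):
    """Return the five `conda run` argument lists for one simulation directory."""
    root = PurePosixPath(simulation_dir)
    commands = []
    for tool, uses_acquisition in STEPS:
        target = root / "sim_acquisition" if uses_acquisition else root
        inner = f"{tool} {shlex.quote(str(target))}"
        commands.append(
            ["conda", "run", "-n", CONDA_ENV, "--live-stream", "bash", "-lc", inner]
        )
    return commands


def run_pipeline(simulation_dir):
    """Run every step in order, stopping at the first failing command."""
    for cmd in build_commands(simulation_dir):
        print("Running:", shlex.join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands only; call run_pipeline(...) to execute them.
    for cmd in build_commands("/path/to/simulation/example_16bit_cells/0.315"):
        print(shlex.join(cmd))
```

Using `check=True` stops the sequence as soon as one step fails, so later steps never run against an incomplete datastore.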
    

Ensuring a successful run

If the runs are successful, the precision, recall, and F1 scores should match the following values:

Simulation Type    Axial Spacing (µm)    Precision    Recall    F1 score
Uniform MERFISH    0.315                 1.00         1.00      1.00
Uniform MERFISH    1.0                   0.93         0.92      0.93
Uniform MERFISH    1.5                   0.66         0.65      0.66
Cells MERFISH      0.315                 1.00         1.00      1.00
Cells MERFISH      1.0                   0.96         1.00      0.98
Cells MERFISH      1.5                   0.70         0.62      0.66
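
For reference, the F1 score is the harmonic mean of precision and recall. The sketch below shows that relationship and a simple comparison against the reference values in the table. It does not reproduce the internals of sim-f1score: the function names, the dictionary keys, and the 0.01 tolerance are all assumptions made for illustration.

```python
# Reference F1 scores copied from the table above,
# keyed by (simulation type, axial spacing in µm).
EXPECTED_F1 = {
    ("uniform", "0.315"): 1.00,
    ("uniform", "1.0"): 0.93,
    ("uniform", "1.5"): 0.66,
    ("cells", "0.315"): 1.00,
    ("cells", "1.0"): 0.98,
    ("cells", "1.5"): 0.66,
}


def f1_from_counts(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def matches_expected(sim_type, spacing, f1, tol=0.01):
    """Return True if `f1` is within `tol` of the reference value (tolerance is an assumption)."""
    expected = EXPECTED_F1[(sim_type, spacing)]
    return abs(f1 - expected) <= tol + 1e-9  # small epsilon guards against float rounding
```

Because the table reports values rounded to two decimals, recomputing F1 from the rounded precision and recall can differ from the listed F1 in the last digit, which is why the comparison allows a small tolerance.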