StatPhysBio synthetic data example

Overview

The goal of this example is to retrieve a simulated 3D MERFISH experiment generated by the StatPhysBio lab at ASU and process it with merfish3d-analysis. The dataset is a synthetic 119-gene MERFISH experiment in a small field of view (FOV).

Preliminaries

Make sure you have a working Python environment with merfish3d-analysis properly installed and the synthetic dataset downloaded. The dataset is ~20 MB, and you will need roughly another 20 MB of temporary space to create the qi2labDataStore structure we use to perform tile registration, pixel decoding, filtering, and cell assignment.

Downloading the data

All of the files required for this example are in the Google Drive download. There should be one top-level directory in the downloaded folder, fixed. The directory structure is as follows:

/path/to/download/
└── fixed/
    ├── aligned_1.tif
    ├── bit_order.csv
    ├── codebook.csv
    ├── GT_spots.csv
    └── scan_metadata.csv
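
Before moving on, it can help to confirm that the download is complete. The snippet below is a minimal sketch, not part of merfish3d-analysis: the helper name find_missing_files is hypothetical, and the file list is copied from the directory listing above.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = (
    "aligned_1.tif",
    "bit_order.csv",
    "codebook.csv",
    "GT_spots.csv",
    "scan_metadata.csv",
)


def find_missing_files(download_path: Path) -> list[str]:
    """Return the expected files that are absent from download_path / 'fixed'."""
    fixed_dir = Path(download_path) / "fixed"
    return [name for name in EXPECTED_FILES if not (fixed_dir / name).is_file()]


if __name__ == "__main__":
    # Placeholder path; replace it with your actual download location.
    missing = find_missing_files(Path(r"/path/to/download/"))
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

If the check reports missing files, download the dataset again before running any of the processing scripts.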

Processing non-qi2lab data

Because this is a simulated experiment, it does not follow the standard metadata or file structure of a microscope acquisition. To account for this difference, we first simulate a qi2lab experiment from the synthetic data and proceed from there.

Processing steps

For each of the Python files in the examples/statphybio_synthetic directory, you will need to scroll to the bottom and replace the placeholder path with the correct path on your system. For example, in 01_convert_simulation_to_experiment.py you'll want to change this section:

if __name__ == "__main__":
    root_path = Path(r"/path/to/download/")

    convert_simulation(root_path=root_path)

In every file in the example, set root_path = Path(r"/path/to/download/raw_data"). The package automatically places the datastore within that directory.

Once that is done, you can run 01_convert_simulation_to_experiment.py, 02_convert_to_datastore.py, 03_register_and_deconvolve.py, and 04_pixeldecode.py without any further changes.

Depending on your hard disk and GPU configuration, you should expect ~1 minute for 01_convert_simulation_to_experiment.py, ~1 minute for 02_convert_to_datastore.py, <20 minutes for 03_register_and_deconvolve.py, and <20 minutes for 04_pixeldecode.py.
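
If you would rather launch the four scripts in one go and record how long each takes on your hardware, a small driver can do it. This is a hedged sketch using only the Python standard library. The run_scripts helper and its script_dir argument are hypothetical and not part of merfish3d-analysis, and the sketch assumes you have already edited the paths at the bottom of each script.

```python
import subprocess
import sys
import time
from pathlib import Path

# Script names as listed in the text above.
PIPELINE_SCRIPTS = (
    "01_convert_simulation_to_experiment.py",
    "02_convert_to_datastore.py",
    "03_register_and_deconvolve.py",
    "04_pixeldecode.py",
)


def run_scripts(script_dir: Path, scripts=PIPELINE_SCRIPTS) -> dict[str, float]:
    """Run each script in order and return the elapsed seconds per script.

    Stops at the first failure, because later steps depend on earlier ones.
    """
    timings = {}
    for name in scripts:
        start = time.perf_counter()
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run([sys.executable, str(Path(script_dir) / name)], check=True)
        timings[name] = time.perf_counter() - start
    return timings


# Example usage (script_dir is the folder holding your copies of the scripts):
# for name, seconds in run_scripts(script_dir).items():
#     print(f"{name}: {seconds / 60:.1f} min")
```

Using sys.executable keeps the scripts in the same Python environment as the driver, which matters if merfish3d-analysis is installed in a dedicated environment.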

Ensuring a successful run

We have included the ground truth spots with the synthetic data. You can run 05_calculate_F1.py to calculate the F1 score for the default decoder settings. If you want to explore how the various decoder parameters impact accuracy, you can run 06_sweep_F1.py to loop over many possible parameters and calculate the F1 score for each unique parameter set.
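
As a reminder of what an F1 score measures in this setting, here is a minimal sketch of one way to score decoded spots against ground truth. It is not necessarily how 05_calculate_F1.py works. It assumes each spot is a (gene, z, y, x) tuple and that a decoded spot counts as a true positive when it lies within search_radius of a not-yet-matched ground truth spot of the same gene. The function name f1_from_spots and the greedy matching strategy are our own illustrative choices.

```python
import math


def f1_from_spots(decoded, ground_truth, search_radius=0.75):
    """Greedy one-to-one matching of decoded spots to ground truth spots.

    Each spot is a (gene, z, y, x) tuple. Returns precision, recall, and F1.
    """
    unmatched = list(ground_truth)
    true_positives = 0
    for gene, *coords in decoded:
        best_index, best_distance = None, search_radius
        for index, (gt_gene, *gt_coords) in enumerate(unmatched):
            if gt_gene != gene:
                continue
            distance = math.dist(coords, gt_coords)
            if distance <= best_distance:
                best_index, best_distance = index, distance
        if best_index is not None:
            # Remove the matched spot so it cannot be matched twice.
            unmatched.pop(best_index)
            true_positives += 1
    precision = true_positives / len(decoded) if decoded else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up spots, for illustration only.
decoded = [("A", 0, 0, 0), ("B", 5, 5, 5), ("A", 10, 10, 10)]
ground_truth = [("A", 0, 0, 0.5), ("B", 5, 5, 5), ("C", 1, 1, 1), ("A", 20, 20, 20)]
print(f1_from_spots(decoded, ground_truth))
```

Precision penalizes decoded spots with no ground truth partner, recall penalizes ground truth spots that were never decoded, and F1 is their harmonic mean.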

To run 05_calculate_F1.py and 06_sweep_F1.py, make sure that the root path and the ground truth spots path are set at the end of each file.

For 05_calculate_F1.py,

if __name__ == "__main__":
    root_path = Path(r"/path/to/sim_acquisition")
    gt_path = Path(r"/path/to/GT_spots.csv")
    results = calculate_F1(root_path=root_path, gt_path=gt_path, search_radius=0.75)
    print(results)

and for 06_sweep_F1.py,

if __name__ == "__main__":
    root_path = Path(r"/path/to/sim_acquisition")
    gt_path = Path(r"/path/to/GT_spots.csv")
    sweep_decode_params(root_path=root_path, gt_path=gt_path)
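
Conceptually, a sweep evaluates every unique combination of parameter values and keeps the score for each one. The sketch below illustrates that idea with itertools.product. It is an assumption-laden illustration rather than the implementation of sweep_decode_params: the helpers parameter_grid and sweep, the toy scoring function, and the parameter names threshold_a and threshold_b are all hypothetical.

```python
from itertools import product


def parameter_grid(**param_values):
    """Yield one dict per unique combination of the supplied parameter values."""
    names = list(param_values)
    for combo in product(*(param_values[name] for name in names)):
        yield dict(zip(names, combo))


def sweep(score_fn, **param_values):
    """Score every parameter set; return (best_params, best_score, all_results)."""
    results = [
        (params, score_fn(**params)) for params in parameter_grid(**param_values)
    ]
    best_params, best_score = max(results, key=lambda item: item[1])
    return best_params, best_score, results


# Toy stand-in for "decode with these settings, then compute F1".
# The parameter names are hypothetical, not those used by sweep_decode_params.
def toy_score(threshold_a, threshold_b):
    return 1.0 - abs(threshold_a - 0.5) - abs(threshold_b - 2)


best_params, best_score, results = sweep(
    toy_score, threshold_a=[0.25, 0.5, 0.75], threshold_b=[1, 2, 3]
)
print(best_params, best_score, len(results))
```

The number of parameter sets is the product of the list lengths, so run time grows quickly as you add parameters or values.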