Workflow - Standard processing of 10X single cell ATAC-seq data with SnapATAC2


Author(s): Timon Schlegel
Version: 2
Last updated: Aug 9, 2024
License: CC-BY-4.0
Tags: scATAC-seq, epigenetics

Features
Tutorial: Single-cell ATAC-seq standard processing with SnapATAC2

Workflow Testing
Tests: ✅
Results: Not yet automated
FAIRness PURL: https://gxy.io/GTN:W00274
flowchart TD
  0["ℹ️ Input Dataset\nFragment_file"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nchromosome_sizes.tabular"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\ngene_annotation"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["ℹ️ Input Dataset\nBam-file"];
  style 3 stroke:#2c3143,stroke-width:4px;
  4["ℹ️ Input Parameter\nKeys for annotations of obs/cells or vars/genes"];
  style 4 fill:#ded,stroke:#393,stroke-width:4px;
  5["ℹ️ Input Dataset\nReplace_file"];
  style 5 stroke:#2c3143,stroke-width:4px;
  6["pp.import_data"];
  1 -->|output| 6;
  0 -->|output| 6;
  7["pp.make_fragment_file"];
  3 -->|output| 7;
  8["pl.frag_size_distr"];
  6 -->|anndata_out| 8;
  265d6020-44e2-408c-9eb6-02cbc9ac23a7["Output\nplot frag_size"];
  8 --> 265d6020-44e2-408c-9eb6-02cbc9ac23a7;
  style 265d6020-44e2-408c-9eb6-02cbc9ac23a7 stroke:#2c3143,stroke-width:4px;
  9["pl.frag_size_distr_log"];
  6 -->|anndata_out| 9;
  a37cd3c3-e07e-4bea-899f-ee44cee8370b["Output\nplot log frag_size"];
  9 --> a37cd3c3-e07e-4bea-899f-ee44cee8370b;
  style a37cd3c3-e07e-4bea-899f-ee44cee8370b stroke:#2c3143,stroke-width:4px;
  10["metrics.tsse"];
  6 -->|anndata_out| 10;
  2 -->|output| 10;
  233284e5-c184-4330-a1d3-2d03951c4ce7["Output\nanndata tsse"];
  10 --> 233284e5-c184-4330-a1d3-2d03951c4ce7;
  style 233284e5-c184-4330-a1d3-2d03951c4ce7 stroke:#2c3143,stroke-width:4px;
  11["pp.import_data-sorted_by_barcodes"];
  1 -->|output| 11;
  7 -->|fragments_out| 11;
  50ef9fe7-bb08-4957-bc45-079ed78f9ee8["Output\nanndata"];
  11 --> 50ef9fe7-bb08-4957-bc45-079ed78f9ee8;
  style 50ef9fe7-bb08-4957-bc45-079ed78f9ee8 stroke:#2c3143,stroke-width:4px;
  12["pl.tsse"];
  10 -->|anndata_out| 12;
  9a5c0f9c-0423-4822-871e-ba8317f2fa7b["Output\nplot tsse"];
  12 --> 9a5c0f9c-0423-4822-871e-ba8317f2fa7b;
  style 9a5c0f9c-0423-4822-871e-ba8317f2fa7b stroke:#2c3143,stroke-width:4px;
  13["pp.filter_cells"];
  10 -->|anndata_out| 13;
  b58dc4ad-6bf0-4f31-99f4-4f8deb13427c["Output\nanndata filter cells"];
  13 --> b58dc4ad-6bf0-4f31-99f4-4f8deb13427c;
  style b58dc4ad-6bf0-4f31-99f4-4f8deb13427c stroke:#2c3143,stroke-width:4px;
  14["pp.add_tile_matrix"];
  13 -->|anndata_out| 14;
  d4b64fc4-fd35-4781-b799-0d07fe8fe438["Output\nanndata tile matrix"];
  14 --> d4b64fc4-fd35-4781-b799-0d07fe8fe438;
  style d4b64fc4-fd35-4781-b799-0d07fe8fe438 stroke:#2c3143,stroke-width:4px;
  15["pp.select_features"];
  14 -->|anndata_out| 15;
  f531f6bd-3fc6-48b9-bb33-66de7928e670["Output\nanndata select features"];
  15 --> f531f6bd-3fc6-48b9-bb33-66de7928e670;
  style f531f6bd-3fc6-48b9-bb33-66de7928e670 stroke:#2c3143,stroke-width:4px;
  16["pp.scrublet"];
  15 -->|anndata_out| 16;
  5b17aa3a-3694-440f-9022-cf9761feb760["Output\nanndata scrublet"];
  16 --> 5b17aa3a-3694-440f-9022-cf9761feb760;
  style 5b17aa3a-3694-440f-9022-cf9761feb760 stroke:#2c3143,stroke-width:4px;
  17["pp.filter_doublets"];
  16 -->|anndata_out| 17;
  3838cf9d-a018-4aa7-8852-893b2f60ec2a["Output\nanndata filter doublets"];
  17 --> 3838cf9d-a018-4aa7-8852-893b2f60ec2a;
  style 3838cf9d-a018-4aa7-8852-893b2f60ec2a stroke:#2c3143,stroke-width:4px;
  18["tl.spectral"];
  17 -->|anndata_out| 18;
  77883516-8ac4-40b5-8b75-5750639d9fc4["Output\nanndata spectral"];
  18 --> 77883516-8ac4-40b5-8b75-5750639d9fc4;
  style 77883516-8ac4-40b5-8b75-5750639d9fc4 stroke:#2c3143,stroke-width:4px;
  19["tl.umap"];
  18 -->|anndata_out| 19;
  062a5471-7b7a-4d9b-bf6c-fe9bbbc5d2ad["Output\nanndata umap"];
  19 --> 062a5471-7b7a-4d9b-bf6c-fe9bbbc5d2ad;
  style 062a5471-7b7a-4d9b-bf6c-fe9bbbc5d2ad stroke:#2c3143,stroke-width:4px;
  20["pp.knn"];
  19 -->|anndata_out| 20;
  b7ebe580-9938-48e5-8718-4df197126615["Output\nanndata knn"];
  20 --> b7ebe580-9938-48e5-8718-4df197126615;
  style b7ebe580-9938-48e5-8718-4df197126615 stroke:#2c3143,stroke-width:4px;
  21["tl.leiden"];
  20 -->|anndata_out| 21;
  9970eb96-013b-4d97-bbbf-2d1f4663b73c["Output\nanndata_leiden_clustering"];
  21 --> 9970eb96-013b-4d97-bbbf-2d1f4663b73c;
  style 9970eb96-013b-4d97-bbbf-2d1f4663b73c stroke:#2c3143,stroke-width:4px;
  22["pl.umap"];
  21 -->|anndata_out| 22;
  4fff8ec8-7d6f-4871-9e42-d6c1791d6aa9["Output\numap_leiden-clusters"];
  22 --> 4fff8ec8-7d6f-4871-9e42-d6c1791d6aa9;
  style 4fff8ec8-7d6f-4871-9e42-d6c1791d6aa9 stroke:#2c3143,stroke-width:4px;
  23["make_gene_matrix"];
  21 -->|anndata_out| 23;
  2 -->|output| 23;
  9a5c2e23-5b59-41ae-aceb-0e47ce9e7b1a["Output\nanndata gene matrix"];
  23 --> 9a5c2e23-5b59-41ae-aceb-0e47ce9e7b1a;
  style 9a5c2e23-5b59-41ae-aceb-0e47ce9e7b1a stroke:#2c3143,stroke-width:4px;
  24["scanpy_filter_genes"];
  23 -->|anndata_out| 24;
  7a9b22df-e0e5-49d9-9fa1-2efe316d6d66["Output\nanndata filter genes"];
  24 --> 7a9b22df-e0e5-49d9-9fa1-2efe316d6d66;
  style 7a9b22df-e0e5-49d9-9fa1-2efe316d6d66 stroke:#2c3143,stroke-width:4px;
  25["Normalize"];
  24 -->|anndata_out| 25;
  7dddd4c8-1944-4f50-b331-857d03cc9631["Output\nanndata normalize"];
  25 --> 7dddd4c8-1944-4f50-b331-857d03cc9631;
  style 7dddd4c8-1944-4f50-b331-857d03cc9631 stroke:#2c3143,stroke-width:4px;
  26["pp.log1p"];
  25 -->|anndata_out| 26;
  40ace30a-21e1-4bf2-8e08-ad93b62fb2ee["Output\nanndata log1p"];
  26 --> 40ace30a-21e1-4bf2-8e08-ad93b62fb2ee;
  style 40ace30a-21e1-4bf2-8e08-ad93b62fb2ee stroke:#2c3143,stroke-width:4px;
  27["external.pp.magic"];
  26 -->|anndata_out| 27;
  8c1c7aa9-b9dd-467b-9715-73773aea093f["Output\nanndata_magic"];
  27 --> 8c1c7aa9-b9dd-467b-9715-73773aea093f;
  style 8c1c7aa9-b9dd-467b-9715-73773aea093f stroke:#2c3143,stroke-width:4px;
  28["Copy obsm"];
  21 -->|anndata_out| 28;
  27 -->|anndata_out| 28;
  2c3825d3-7151-4eef-97d0-c8a31ca6fd3e["Output\nanndata_gene-matrix_leiden"];
  28 --> 2c3825d3-7151-4eef-97d0-c8a31ca6fd3e;
  style 2c3825d3-7151-4eef-97d0-c8a31ca6fd3e stroke:#2c3143,stroke-width:4px;
  29["umap_plot_with_scanpy"];
  28 -->|output_h5ad| 29;
  4 -->|output| 29;
  4cfbbb70-156b-48d7-b14b-db21198d9b6d["Output\numap_marker-genes"];
  29 --> 4cfbbb70-156b-48d7-b14b-db21198d9b6d;
  style 4cfbbb70-156b-48d7-b14b-db21198d9b6d stroke:#2c3143,stroke-width:4px;
  30["Inspect observations"];
  28 -->|output_h5ad| 30;
  31["Cut leiden from table"];
  30 -->|obs| 31;
  11c22200-a54e-4585-82c4-8457dbbfb53c["Output\nleiden annotation"];
  31 --> 11c22200-a54e-4585-82c4-8457dbbfb53c;
  style 11c22200-a54e-4585-82c4-8457dbbfb53c stroke:#2c3143,stroke-width:4px;
  32["Replace leiden"];
  31 -->|out_file1| 32;
  5 -->|output| 32;
  467de061-79e6-4148-8a65-c06383a14012["Output\ncell type annotation"];
  32 --> 467de061-79e6-4148-8a65-c06383a14012;
  style 467de061-79e6-4148-8a65-c06383a14012 stroke:#2c3143,stroke-width:4px;
  33["Manipulate AnnData"];
  28 -->|output_h5ad| 33;
  32 -->|outfile_replace| 33;
  cbba6173-f439-4fd5-8423-946aa625112b["Output\nanndata_cell_type"];
  33 --> cbba6173-f439-4fd5-8423-946aa625112b;
  style cbba6173-f439-4fd5-8423-946aa625112b stroke:#2c3143,stroke-width:4px;
  34["Plot cell types"];
  33 -->|anndata| 34;
  c28c53ae-8ee7-47d7-8a00-352908bb7c1a["Output\numap_cell-type"];
  34 --> c28c53ae-8ee7-47d7-8a00-352908bb7c1a;
  style c28c53ae-8ee7-47d7-8a00-352908bb7c1a stroke:#2c3143,stroke-width:4px;
  35["Final Anndata general info"];
  33 -->|anndata| 35;
  379a85d1-a204-4b0b-8b33-86e5582fc51f["Output\ngeneral"];
  35 --> 379a85d1-a204-4b0b-8b33-86e5582fc51f;
  style 379a85d1-a204-4b0b-8b33-86e5582fc51f stroke:#2c3143,stroke-width:4px;
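
The flowchart above is a directed graph whose arrows give the data dependencies between steps. As an illustrative sketch (not part of the workflow itself), the Python below parses a few of those edge lines and derives one valid execution order with the standard-library `graphlib` module. The excerpt and the helper names `parse_edges` and `execution_order` are chosen for this example only.

```python
import re
from graphlib import TopologicalSorter

# A handful of edges copied from the flowchart above (not the full graph).
MERMAID_EXCERPT = """
  1 -->|output| 6;
  0 -->|output| 6;
  6 -->|anndata_out| 10;
  2 -->|output| 10;
  10 -->|anndata_out| 13;
  13 -->|anndata_out| 14;
"""

# Matches "SRC -->|label| DST;" as well as "SRC --> DST;".
EDGE_RE = re.compile(r"^\s*(\S+)\s+-->(?:\|[^|]*\|)?\s*(\S+?);\s*$")


def parse_edges(text):
    """Return (source, target) pairs for every edge line in the text."""
    edges = []
    for line in text.splitlines():
        match = EDGE_RE.match(line)
        if match:
            edges.append((match.group(1), match.group(2)))
    return edges


def execution_order(edges):
    """Return the step ids in an order where every source precedes its target."""
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)  # target depends on source
    return list(sorter.static_order())


order = execution_order(parse_edges(MERMAID_EXCERPT))
print(order)
```

Several orders can be valid for the same graph; the only guarantee is that each step appears after all the steps it depends on, for example step 6 (pp.import_data) before step 10 (metrics.tsse).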

Inputs

| Input | Label |
| Input dataset | Fragment_file |
| Input dataset | chromosome_sizes.tabular |
| Input dataset | gene_annotation |
| Input dataset | Bam-file |
| Input parameter | Keys for annotations of obs/cells or vars/genes |
| Input dataset | Replace_file |
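
In the flowchart, Replace_file feeds the "Replace leiden" step, which turns the Leiden cluster labels into the "cell type annotation" output. A minimal sketch of that lookup follows. It assumes Replace_file is a two-column, tab-separated table (cluster label, then cell type) and that labels without an entry are left unchanged; the cell type names and the function names are invented for illustration.

```python
def load_mapping(replace_text):
    """Build a {cluster_label: cell_type} dict from tab-separated lines."""
    mapping = {}
    for line in replace_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, value = line.split("\t", 1)
        mapping[key] = value
    return mapping


def annotate(leiden_column, mapping, header="cell_type"):
    """Replace each cluster label by its cell type; the first row is a header."""
    annotated = [header]
    for label in leiden_column[1:]:
        # Assumption: labels missing from the mapping are kept as they are.
        annotated.append(mapping.get(label, label))
    return annotated


# Hypothetical Replace_file content and a hypothetical leiden column.
replace_file = "0\tB cells\n1\tT cells\n"
leiden = ["leiden", "0", "1", "0", "7"]

print(annotate(leiden, load_mapping(replace_file)))
```

The actual Galaxy tool exposes its own options for headers and unmatched keys, so treat the behaviour above as one possible configuration rather than a description of the tool.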

Outputs

| From | Output | Label |
|  | Input parameter | Keys for annotations of obs/cells or vars/genes |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_plotting/snapatac2_plotting/2.6.4+galaxy1 | SnapATAC2 Plotting | pl.frag_size_distr |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_plotting/snapatac2_plotting/2.6.4+galaxy1 | SnapATAC2 Plotting | pl.frag_size_distr_log |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | metrics.tsse |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | pp.import_data-sorted_by_barcodes |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_plotting/snapatac2_plotting/2.6.4+galaxy1 | SnapATAC2 Plotting | pl.tsse |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | pp.filter_cells |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | pp.add_tile_matrix |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | pp.select_features |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | pp.scrublet |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | pp.filter_doublets |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_clustering/snapatac2_clustering/2.6.4+galaxy1 | SnapATAC2 Clustering | tl.spectral |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_clustering/snapatac2_clustering/2.6.4+galaxy1 | SnapATAC2 Clustering | tl.umap |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_clustering/snapatac2_clustering/2.6.4+galaxy1 | SnapATAC2 Clustering | pp.knn |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_clustering/snapatac2_clustering/2.6.4+galaxy1 | SnapATAC2 Clustering | tl.leiden |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_plotting/snapatac2_plotting/2.6.4+galaxy1 | SnapATAC2 Plotting | pl.umap |
| toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1 | SnapATAC2 Preprocessing | make_gene_matrix |
| toolshed.g2.bx.psu.edu/repos/iuc/scanpy_filter/scanpy_filter/1.9.6+galaxy3 | Filter | scanpy_filter_genes |
| toolshed.g2.bx.psu.edu/repos/iuc/scanpy_normalize/scanpy_normalize/1.9.6+galaxy3 | Normalize |  |
| toolshed.g2.bx.psu.edu/repos/iuc/scanpy_inspect/scanpy_inspect/1.9.6+galaxy3 | Inspect and manipulate | pp.log1p |
| toolshed.g2.bx.psu.edu/repos/iuc/scanpy_normalize/scanpy_normalize/1.9.6+galaxy3 | Normalize | external.pp.magic |
| toolshed.g2.bx.psu.edu/repos/ebi-gxa/anndata_ops/anndata_ops/1.9.3+galaxy0 | AnnData Operations | Copy obsm |
| toolshed.g2.bx.psu.edu/repos/iuc/scanpy_plot/scanpy_plot/1.9.6+galaxy3 | Plot | umap_plot_with_scanpy |
| Cut1 | Cut | Cut leiden from table |
| toolshed.g2.bx.psu.edu/repos/bgruening/replace_column_by_key_value_file/replace_column_with_key_value_file/0.2 | Replace column | Replace leiden |
| toolshed.g2.bx.psu.edu/repos/iuc/anndata_manipulate/anndata_manipulate/0.10.3+galaxy0 | Manipulate AnnData |  |
| toolshed.g2.bx.psu.edu/repos/iuc/scanpy_plot/scanpy_plot/1.9.6+galaxy3 | Plot | Plot cell types |
| toolshed.g2.bx.psu.edu/repos/iuc/anndata_inspect/anndata_inspect/0.10.3+galaxy0 | Inspect AnnData | Final Anndata general info |

Tools

Tool
Cut1
toolshed.g2.bx.psu.edu/repos/bgruening/replace_column_by_key_value_file/replace_column_with_key_value_file/0.2
toolshed.g2.bx.psu.edu/repos/ebi-gxa/anndata_ops/anndata_ops/1.9.3+galaxy0
toolshed.g2.bx.psu.edu/repos/iuc/anndata_inspect/anndata_inspect/0.10.3+galaxy0
toolshed.g2.bx.psu.edu/repos/iuc/anndata_manipulate/anndata_manipulate/0.10.3+galaxy0
toolshed.g2.bx.psu.edu/repos/iuc/scanpy_filter/scanpy_filter/1.9.6+galaxy3
toolshed.g2.bx.psu.edu/repos/iuc/scanpy_inspect/scanpy_inspect/1.9.6+galaxy3
toolshed.g2.bx.psu.edu/repos/iuc/scanpy_normalize/scanpy_normalize/1.9.6+galaxy3
toolshed.g2.bx.psu.edu/repos/iuc/scanpy_plot/scanpy_plot/1.9.6+galaxy3
toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_clustering/snapatac2_clustering/2.6.4+galaxy1
toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_plotting/snapatac2_plotting/2.6.4+galaxy1
toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.6.4+galaxy1

To use these workflows in Galaxy, either click the links to download them, or right-click a link and copy its address, which you can then paste into Galaxy's workflow import form.

Importing into Galaxy

The steps below import these workflows directly into your Galaxy server of choice.
Hands-on: Importing a workflow
  • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
  • Click on Import at the top-right of the screen
  • Provide your workflow
    • Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
    • Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
  • Click the Import workflow button

Below is a short video demonstrating how to import a workflow from GitHub using this procedure:

Video: Importing a workflow from URL

Version History

| Version | Commit | Time | Comments |
| 2 | 747cf76eb | 2024-08-08 16:59:41 | updated snapatac2 version to 2.6.4 |
| 1 | 9740cbbad | 2024-07-11 13:29:51 | Update workflow and add tests |

For Admins

Installing the workflow tools

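# Download the workflow definition
wget https://training.galaxyproject.org/training-material/topics/single-cell/tutorials/scatac-standard-processing-snapatac2/workflows/Standard-processing-of-10X-single-cell-ATAC-seq-data-with-SnapATAC2.ga -O workflow.ga
# Extract the required tools (this and the following commands come from Ephemeris)
workflow-to-tools -w workflow.ga -o tools.yaml
# Install the tools; replace GALAXY with your server URL and API_KEY with an admin API key
shed-tools install -g GALAXY -a API_KEY -t tools.yaml
# Install the workflow itself and publish it
workflow-install -g GALAXY -a API_KEY -w workflow.ga --publish-workflows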
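
The `workflow-to-tools` command above derives a tool list from the workflow file. As a rough illustration of the idea (not the tool's actual implementation), the sketch below reads a `.ga`-style JSON document and collects the unique ToolShed repositories it references. It assumes each ToolShed-backed step carries a `tool_shed_repository` entry with `tool_shed`, `owner`, `name` and `changeset_revision` keys; the sample document, its revision hash, and the output field names are made up for this example.

```python
import json


def tools_from_workflow(workflow):
    """Collect unique ToolShed repositories referenced by workflow steps."""
    repos = {}
    for step in workflow.get("steps", {}).values():
        repo = step.get("tool_shed_repository")
        if not repo:
            continue  # inputs and built-in tools (e.g. Cut1) have no repository
        key = (repo["tool_shed"], repo["owner"], repo["name"])
        repos.setdefault(key, set()).add(repo["changeset_revision"])
    return [
        {"tool_shed_url": shed, "owner": owner, "name": name,
         "revisions": sorted(revisions)}
        for (shed, owner, name), revisions in sorted(repos.items())
    ]


# Hypothetical, heavily trimmed workflow document; the revision is a placeholder.
SAMPLE = json.loads("""
{
  "steps": {
    "0": {"type": "data_input", "tool_id": null},
    "1": {"type": "tool", "tool_id": "Cut1"},
    "2": {"type": "tool",
          "tool_shed_repository": {"tool_shed": "toolshed.g2.bx.psu.edu",
                                   "owner": "iuc",
                                   "name": "snapatac2_preprocessing",
                                   "changeset_revision": "0000000000ab"}},
    "3": {"type": "tool",
          "tool_shed_repository": {"tool_shed": "toolshed.g2.bx.psu.edu",
                                   "owner": "iuc",
                                   "name": "snapatac2_preprocessing",
                                   "changeset_revision": "0000000000ab"}}
  }
}
""")

for tool in tools_from_workflow(SAMPLE):
    print(tool)
```

For a real file you would load `workflow.ga` with `json.load` instead of the inline sample. Steps that reuse the same repository collapse into a single entry, which is why many workflow steps map to only a few installed repositories.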