Using dataset collections

Overview

Questions:
  • How to manipulate large numbers of datasets at once?

Objectives:
  • Understand and master dataset collections

Time estimation: 30 minutes
Level: Intermediate
Last modification: Aug 8, 2021
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT.

Here we show Galaxy features designed to help with the analysis of large numbers of samples. When you have just a few samples, clicking through them is easy. Once you have hundreds, it becomes very tedious. Galaxy’s dataset collections let you combine numerous datasets into a single entity that can be easily manipulated.

Getting data

First, we need to upload datasets. Copy and paste the following URLs into the Galaxy upload tool (see the tip below on how to do this).

https://zenodo.org/record/5119008/files/M117-bl_1.fq.gz
https://zenodo.org/record/5119008/files/M117-bl_2.fq.gz
https://zenodo.org/record/5119008/files/M117-ch_1.fq.gz
https://zenodo.org/record/5119008/files/M117-ch_2.fq.gz
https://zenodo.org/record/5119008/files/M117C1-bl_1.fq.gz
https://zenodo.org/record/5119008/files/M117C1-bl_2.fq.gz
https://zenodo.org/record/5119008/files/M117C1-ch_1.fq.gz
https://zenodo.org/record/5119008/files/M117C1-ch_2.fq.gz

details Set format to fastqsanger.gz

The above datasets are in fastqsanger.gz format. The format must be set explicitly in Galaxy because there are several fastq format flavors and they are difficult to guess automatically. The steps below explain how to upload these data and set the correct format.

  1. Click on Upload Data on the top of the left panel:

    UploadDataButton

  2. Click on Paste/Fetch:

    PasteFetchButton

  3. Paste the URLs into the text box that appears:

    PasteFetchModal

  4. Set Type (set all) to fastqsanger or, if your data is compressed as in URLs above (they have .gz extensions), to fastqsanger.gz

    ChangeTypeDropDown:

About these datasets

These datasets represent genomic DNA (enriched for mitochondria via a long range PCR) isolated from blood (bl) and cheek (buccal swab, ch) of a mother (M117) and her child (M117C1). The DNA was sequenced on an Illumina MiSeq machine as a paired-read library (250-bp reads; see our 2014 manuscript for Methods):

  • M117-bl_1 - family 117, mother, forward (F) reads from blood
  • M117-bl_2 - family 117, mother, reverse (R) reads from blood
  • M117-ch_1 - family 117, mother, forward (F) reads from cheek
  • M117-ch_2 - family 117, mother, reverse (R) reads from cheek
  • M117C1-bl_1 - family 117, child, forward (F) reads from blood
  • M117C1-bl_2 - family 117, child, reverse (R) reads from blood
  • M117C1-ch_1 - family 117, child, forward (F) reads from cheek
  • M117C1-ch_2 - family 117, child, reverse (R) reads from cheek
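
The naming scheme above is regular enough to be parsed programmatically. The following is a minimal Python sketch, not part of Galaxy: the function name `describe` and the regular expression are illustrative assumptions based only on the eight names listed above.

```python
import re

# Hypothetical pattern for names such as "M117C1-ch_2":
# individual (M117 or M117C1), tissue (bl or ch), read direction (1 or 2).
NAME_PATTERN = re.compile(
    r"^(?P<individual>M\d+(?:C\d+)?)-(?P<tissue>bl|ch)_(?P<read>[12])$"
)

TISSUES = {"bl": "blood", "ch": "cheek"}
READS = {"1": "forward", "2": "reverse"}


def describe(name):
    """Split a dataset name into sample, individual, tissue, and read direction."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected dataset name: {name}")
    individual = match.group("individual")
    return {
        "sample": f"{individual}-{match.group('tissue')}",
        # In this naming scheme a "C" after the family number marks the child.
        "individual": "child" if "C" in individual[1:] else "mother",
        "tissue": TISSUES[match.group("tissue")],
        "read": READS[match.group("read")],
    }


print(describe("M117C1-ch_2"))
```

The `sample` value (for example `M117-bl`) is the part shared by the forward and reverse datasets of one pair.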

Creating a paired dataset collection

You can see that there are eight datasets forming four pairs. We could manipulate them one-by-one (e.g., start four mapping jobs, call variants four times and so on), but this would be unnecessarily tedious. Moreover, imagine that you have hundreds or thousands of pairs: it would be impractical to process them individually.

This is exactly why we developed collections. Dataset collections allow combining multiple datasets into a single entity. Thus instead of dealing with four, a hundred, or a thousand individual datasets you have only one item in the Galaxy history to deal with.

Because our data is paired we need to create a hierarchical collection called a Paired Dataset Collection or Paired Collection. Such a collection has two layers. The first layer corresponds to individual samples (e.g., M117-bl). The second layer represents the forward and reverse reads corresponding to each sample:


paired collection
Figure 1: The logic of Paired Collection. Here N datasets are bundled into a paired collection with two layers. The first layer corresponds to samples and the second to forward and reverse reads within each sample.

To begin creating a collection we need to select the datasets we would like to bundle. This is done using the checkbox button of Galaxy’s history menu. Fig. 2 below shows this process.


selecting multiple datasets
Figure 2: Selecting multiple datasets and creating a paired collection.

The above process ends with the Galaxy collection wizard appearing. In this case Galaxy automatically assigned pairs using the _1 and _2 endings of dataset names. Let’s, however, pretend that this did not happen. Click on the Unpair all link (highlighted in red in the figure above) and then on the Filters link (see animation in Fig. 3). The interface will change into its unpaired state.

Here datasets containing the first (forward) and the second (reverse) read are differentiated by having _1 and _2 in the filename. We can use this feature in the dataset collection wizard to pair our datasets. Type _1 in the left Filter text box and _2 in the right. You will see that the dataset collection wizard automatically filters the lists on each side of the interface. Now you can either click Auto pair if the pairs look good to you (proper combinations of datasets are listed in each line) or pair each forward/reverse group individually by pressing the Pair these datasets button separating each pair.
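
The pairing logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the wizard’s actual implementation: the function `auto_pair`, the dataset names with `.fq.gz` endings, and the way pair names are derived (by dropping the filter text) are all assumptions.

```python
def auto_pair(names, forward_filter="_1", reverse_filter="_2"):
    """Pair names that become identical once the filter text is removed."""
    # Index reverse datasets by their name without the reverse filter text.
    reverse = {
        name.replace(reverse_filter, "", 1): name
        for name in names
        if reverse_filter in name
    }
    pairs = {}
    for name in names:
        if forward_filter not in name:
            continue
        key = name.replace(forward_filter, "", 1)
        if key in reverse:
            pairs[key] = {"forward": name, "reverse": reverse[key]}
    return pairs


dataset_names = [
    "M117-bl_1.fq.gz", "M117-bl_2.fq.gz",
    "M117-ch_1.fq.gz", "M117-ch_2.fq.gz",
    "M117C1-bl_1.fq.gz", "M117C1-bl_2.fq.gz",
    "M117C1-ch_1.fq.gz", "M117C1-ch_2.fq.gz",
]
pairs = auto_pair(dataset_names)
for pair_name, members in pairs.items():
    print(pair_name, members["forward"], members["reverse"])
```

The resulting dictionary mirrors the two layers of a paired collection: the outer keys are samples, and each sample holds a forward and a reverse dataset.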

Now it is time to name the collection: type M117-collection in Name text box and create the collection by clicking Create collection. A new item will appear in the history.


using collection wizard
Figure 3: Working with collection wizard. Text above this figure explains each step.

Clicking on the collection will expand it to show the four pairs it contains (panel B). Clicking individual pairs will expand them further to reveal forward and reverse datasets (panel C). Expanding these further will enable one to see individual datasets (panel D).

expanding collection
Figure 4: To see what is inside a collection, just click on it.

Processing data organized as a collection

By now we see that a collection can be used to bundle a large number of items into a single history item. Galaxy tools can take a collection as input. Let’s map the reads contained in collection M117-collection against the human mitochondrial genome. Before we can do this we need to upload the mitochondrial genome using the following URL (see the tip below on how to do this):

https://zenodo.org/record/5119008/files/chrM.fa.gz

details Set format to fasta.gz

The above dataset is in fasta.gz format. The steps below explain how to upload these data and set the correct format.

  • Copy the link location
  • Open the Galaxy Upload Manager (galaxy-upload on the top-right of the tool panel)

  • Click Reset button at the bottom of the form. If the button is greyed out -> skip to the next step.

  • Select Paste/Fetch Data
  • Paste the link into the text field

    https://zenodo.org/record/5119008/files/chrM.fa.gz

  • Change Type (set all): from “Auto-detect” to fasta.gz

  • Press Start

  • Close the window

  • By default, Galaxy uses the URL as the name, so rename the files with a more useful name.

Mapping reads

The BWA-MEM tool is a widely used sequence aligner for short-read sequencing datasets such as those we are analysing in this tutorial. (You can find the tool by typing BWA MEM in the search box at the top left corner of the Galaxy interface.)

hands_on Hands-on: Map sequencing reads to reference genome

Run BWA-MEM Tool: toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.1 with the following parameters:

  • “Will you select a reference genome from your history or use a built-in index?”: Use a genome from history and build index
    • param-file “Use the following dataset as the reference sequence”: chrM.fa.gz (The mitochondrial genome we just uploaded)
  • “Single or Paired-end reads”: Paired Collection
    • param-file “Select a paired collection”: M117-collection (the collection we built at the beginning of this tutorial.)
  • “Set read groups information?”: Do not set
  • “Select analysis mode”: 1.Simple Illumina mode

The interface should look like this:


bwa_mem_interface


  • Click Execute button

You will see jobs being submitted and new datasets appearing in the history. Because our collection contains four paired datasets, Galaxy will actually run four separate BWA-MEM jobs. In the end this BWA-MEM run will produce a new collection containing four BAM datasets. Let’s look at this collection by clicking on it (panel A in the figure below). You can see that this collection is no longer paired (unlike the collection we created at the beginning of this tutorial). This is because BWA-MEM takes forward and reverse data as input, but produces only a single BAM dataset as the output. So the result is a list of four datasets (BAM files; panel B). If you click on any of the datasets you will see that it is indeed a BAM dataset (panel C).

bwa_memCollection_ABC
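
The change of structure described above (a paired collection in, a flat list out) can be illustrated with a small Python sketch. It does not run BWA-MEM: `placeholder_aligner` is a hypothetical stand-in that only returns a label, and the dictionary layout of the collection is an assumption made for illustration.

```python
def map_over_pairs(paired_collection, tool):
    """Run `tool` once per pair; the result is a flat list keyed by sample."""
    return {
        sample: tool(pair["forward"], pair["reverse"])
        for sample, pair in paired_collection.items()
    }


def placeholder_aligner(forward, reverse):
    # Stand-in for BWA-MEM: no alignment is done, only a label is returned.
    return f"BAM from {forward} + {reverse}"


m117_collection = {
    "M117-bl": {"forward": "M117-bl_1", "reverse": "M117-bl_2"},
    "M117-ch": {"forward": "M117-ch_1", "reverse": "M117-ch_2"},
    "M117C1-bl": {"forward": "M117C1-bl_1", "reverse": "M117C1-bl_2"},
    "M117C1-ch": {"forward": "M117C1-ch_1", "reverse": "M117C1-ch_2"},
}

bam_list = map_over_pairs(m117_collection, placeholder_aligner)
for sample, bam in bam_list.items():
    print(sample, "->", bam)
```

Each pair produces exactly one output, so the inner forward/reverse layer disappears while the sample identifiers are preserved.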

Calling variants

Having mapped the reads against the mitochondrial genome, we can now call variants. In this step the variant calling tool lofreq will take a collection of BAM datasets (the one produced by BWA-MEM), identify differences between the reads and the reference, and output these differences as a collection of VCF datasets.

hands_on Hands-on: Call variants

Run Call variants Tool: toolshed.g2.bx.psu.edu/repos/iuc/lofreq_call/lofreq_call/2.1.5+galaxy1 with the following parameters:

  • param-file “Input reads in BAM format”: Map with BWA-MEM... (output of BWA-MEM tool)
  • “Choose the source for the reference genome”: History
    • param-file “Reference”: chrM.fa.gz (as fasta) (Input dataset)
  • “Call variants across”: Whole reference
  • “Types of variants to call”: SNVs and indels

The interface should look like this:


lofreq_interface


  • Click Execute button

Create table of variants using SnpSift Extract Fields

We will now convert the VCF datasets into tab-delimited format, as it will be easier to work with. This will be done with SnpSift, a toolkit designed for filtering and manipulating VCF files.

hands_on Hands-on: Create table of variants

Run SnpSift Extract Fields Tool: toolshed.g2.bx.psu.edu/repos/iuc/snpsift/snpSift_extractFields/4.3+t.galaxy0 with the following parameters:

  • param-file “Variant input file in VCF format”: Call variants on collection... (output of Call variants with lofreq tool)
  • “Fields to extract”: CHROM POS REF ALT QUAL DP AF SB DP4
  • “One effect per line”: Yes

The interface should look like this:


snpsift_interface


  • Click Execute button

As a result of this operation we now have a collection of four tab-delimited files. Ultimately, however, we want to summarize these data as one final table. The next step does just that.

Collapse data into a single dataset

We have now extracted meaningful fields from the VCF datasets, but they still exist as a collection. To move towards secondary analysis we need to collapse this collection into a single dataset. For more information about collapsing collections see this video:

hands_on Hands-on: Collapse a collection

Run Collapse Collection Tool: toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/4.0 with the following parameters:

  • param-collection “Collection of files to collapse into single dataset”: SnpSift Extract Fields ... (output of SnpSift Extract Fields tool)
  • “Keep one header line”: Yes
  • “Prepend File name”: Yes
  • “Where to add dataset name”: Same line and each line in dataset

The interface should look like this:


snpsift_interface


  • Click Execute button

You can see that this tool takes lines from all collection elements (in our case we have four), adds the element name as the first column, and pastes everything together. So if we have a collection as an input:

code-in Input: A collection with four items

A collection element named M117-bl.fq:

chrM   152 T C  3707.0 1242 0.99 0 2,2,540,697
chrM 16519 T C 35149.0 1033 0.99 0 1,1,611,420

A collection element named M117-ch.fq:

chrM   152 T C  4098.0 1440 0.99 0 0,1,575,863
chrM 16519 T C 36574.0 1039 0.99 2 3,0,713,321

A collection element named M117C1-bl.fq:

chrM   152 T C  4888.0 1235 1.00 0 0,0,548,687
chrM 16519 T C 35220.0 1042 0.99 0 0,0,598,443

A collection element named M117C1-ch.fq:

chrM   152 T C  2757.0 1413 0.99 0 2,2,576,833
chrM 16455 G A    54.0  100 0.04 0 89,7,4,0
chrM 16519 T C 36363.0 1061 0.99 6 3,4,691,362

code-out Output: A single dataset

then the Collapse Collection tool will produce this:

M117-bl.fq   chrM   152 T C  3707.0 1242 0.99 0 2,2,540,697
M117-bl.fq   chrM 16519 T C 35149.0 1033 0.99 0 1,1,611,420
M117-ch.fq   chrM   152 T C  4098.0 1440 0.99 0 0,1,575,863
M117-ch.fq   chrM 16519 T C 36574.0 1039 0.99 2 3,0,713,321
M117C1-bl.fq chrM   152 T C  4888.0 1235 1.00 0 0,0,548,687
M117C1-bl.fq chrM 16519 T C 35220.0 1042 0.99 0 0,0,598,443
M117C1-ch.fq chrM   152 T C  2757.0 1413 0.99 0 2,2,576,833
M117C1-ch.fq chrM 16455 G A    54.0  100 0.04 0 89,7,4,0
M117C1-ch.fq chrM 16519 T C 36363.0 1061 0.99 6 3,4,691,362

You can see that the tool added a column with the dataset ID taken from the collection element name.
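
The transformation shown above can be mimicked with a short Python sketch. It is not the Collapse Collection tool’s code: the function `collapse` is hypothetical, header handling is left out, and only two of the four elements are included to keep the example short.

```python
def collapse(collection):
    """Concatenate elements in order, prefixing every line with its element name."""
    lines = []
    for element_name, rows in collection.items():
        for row in rows:
            lines.append(f"{element_name}\t{row}")
    return lines


# Rows copied from the first two collection elements shown above (tab-delimited).
collection = {
    "M117-bl.fq": [
        "chrM\t152\tT\tC\t3707.0\t1242\t0.99\t0\t2,2,540,697",
        "chrM\t16519\tT\tC\t35149.0\t1033\t0.99\t0\t1,1,611,420",
    ],
    "M117-ch.fq": [
        "chrM\t152\tT\tC\t4098.0\t1440\t0.99\t0\t0,1,575,863",
        "chrM\t16519\tT\tC\t36574.0\t1039\t0.99\t2\t3,0,713,321",
    ],
}

collapsed = collapse(collection)
print("\n".join(collapsed))
```

Because the element name travels with every line, rows from different samples remain distinguishable after they are merged into one table.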

We did not fake this:

The history described in this page is accessible directly from here:

From there you can import histories to make them your own.

Collection operations

In this brief analysis we took four paired datasets, created a collection, analyzed this collection and finally created a single report. This “lifecycle” is shown in the figure below. Here we started with eight fastq datasets representing four paired-end samples. A paired collection was reduced to a list of BAM datasets by BWA-MEM. Variant calling by lofreq and field extraction with SnpSift maintained the collection structure: these tools processed four individual datasets, changing their formats from BAM to VCF and from VCF to tab-delimited. Finally, we collapsed the collection by merging its content into a single dataset.

Collection lifecycle
Figure 5: Collection lifecycle. Arrows = individual fastq datasets; Four shades of yellow = four samples analyzed in this example.

The last step of our analysis, collapsing a collection, is an example of a collection operation. Galaxy contains an entire section of tools designed for handling of collection data. These can be classified as:

  • Tools that manipulate elements within a collection
  • Tools that change collection structure
  • Tools that combine elements of a collection

Let’s look at these categories in more detail:

Tools that manipulate elements within a collection

Extract dataset

tool Extract dataset extracts datasets from a collection based on either position or identifier.

The tool allows extracting datasets based on position (The first dataset and Select by index options) or name (Select by element identifier option). This tool effectively collapses the inner-most collection into a dataset. For nested collections (e.g. a list of lists of lists, outer:middle:inner, when extracting the inner dataset element) a new list is created where the selected element takes the position of the inner-most collection (so outer:middle, where middle is not a collection but the inner dataset element).

Filter empty

tool Filter empty removes empty elements from a collection.

This tool takes a dataset collection and filters out (removes) empty datasets. This is useful for continuing a multi-sample analysis when downstream tools require datasets to have content.

Filtering empty datasets

Filter failed datasets

tool Filter failed datasets removes datasets in error (red) from a collection.

This tool takes a dataset collection and filters out (removes) datasets in the failed (red) state. This is useful for continuing a multi-sample analysis when one or more of the samples fails at some point.

filter_failed
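
Both filters described above amount to keeping only the elements that pass a simple test. The following Python sketch illustrates this under stated assumptions: the `Element` class, its `state` values ("ok" and "error"), and the function names are hypothetical and do not reflect Galaxy’s internal data model.

```python
from dataclasses import dataclass


@dataclass
class Element:
    identifier: str
    state: str    # "ok" or "error" in this sketch
    content: str  # empty string stands for an empty dataset


def filter_empty(collection):
    """Keep only elements that have some content."""
    return [element for element in collection if element.content]


def filter_failed(collection):
    """Keep only elements that are not in the error state."""
    return [element for element in collection if element.state != "error"]


samples = [
    Element("M117-bl", "ok", "chrM\t152\tT\tC"),
    Element("M117-ch", "ok", ""),
    Element("M117C1-bl", "error", ""),
]

print([element.identifier for element in filter_empty(samples)])
print([element.identifier for element in filter_failed(samples)])
```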

Build list

tool Build list creates a new list collection from individual datasets or collections.

This tool combines individual datasets or collections into a new collection. The simplest scenario is building a new collection from individual datasets (case A in the image below). You can also merge a collection with individual dataset(s). In this case (see B in the image below) the individual dataset(s) will be merged with each element of the input collection to create a nested collection. Finally, two or more collections can be merged together, creating a nested collection (case C in the image below).

Build list

Filter collection

tool Filter collection removes elements from a collection using a list supplied in a file.

This tool allows filtering elements from a data collection. It takes an input collection and a text file with names (i.e. identifiers). The tool’s behaviour is controlled by the How should the elements to remove be determined? drop-down. It has the following options:

Remove if identifiers are ABSENT from file

Given a collection:

 Collection: [Dataset A] 
             [Dataset B] 
             [Dataset X]

and a text file:

             A
             B
             Z

the tool will return two collections:


 (filtered):  [Dataset A]
              [Dataset B]

 (discarded): [Dataset X]

Remove if identifiers are PRESENT in file

Given a collection:

 Collection: [Dataset A] 
             [Dataset B] 
             [Dataset X]

and a text file:

             A
             B
             Z

the tool will return two collections:

 (filtered):  [Dataset X]

 (discarded): [Dataset A]
              [Dataset B]
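
The two options above can be reproduced with a small Python sketch. The function `filter_collection` and its `remove_if` parameter are hypothetical names chosen for illustration; identifiers are represented as plain strings.

```python
def filter_collection(identifiers, names_in_file, remove_if="absent"):
    """Return (filtered, discarded) lists of identifiers.

    remove_if="absent":  remove identifiers that are NOT listed in the file.
    remove_if="present": remove identifiers that ARE listed in the file.
    """
    listed = set(names_in_file)
    if remove_if == "absent":
        filtered = [name for name in identifiers if name in listed]
    elif remove_if == "present":
        filtered = [name for name in identifiers if name not in listed]
    else:
        raise ValueError(f"unknown option: {remove_if}")
    discarded = [name for name in identifiers if name not in filtered]
    return filtered, discarded


collection = ["A", "B", "X"]
file_lines = ["A", "B", "Z"]

print(filter_collection(collection, file_lines, remove_if="absent"))
print(filter_collection(collection, file_lines, remove_if="present"))
```

Note that the name Z in the file matches nothing in the collection and has no effect in either mode.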

Relabel identifiers

tool Relabel identifiers changes identifiers of datasets within a collection using identifiers from a supplied file.

New identifiers can be supplied as either a simple list or a tab-delimited file mapping old identifiers to new ones. This is controlled using the How should the new identifiers be specified? drop-down:

Using lines in a simple text file

Given a collection:

 Collection: [Dataset A] 
             [Dataset B] 
             [Dataset X]

and a simple text file:

             Alpha
             Beta
             Gamma

the tool will return:

 Collection: [Dataset Alpha] 
             [Dataset Beta] 
             [Dataset Gamma]

Map original identifiers to new ones using a two column table

Given a collection:

 Collection: [Dataset A] 
             [Dataset B] 
             [Dataset X]

and a two-column tab-delimited file (you can see that entries do not have to be in order here):

             B Beta
             X Gamma
             A Alpha

the tool will return:

 Collection: [Dataset Alpha] 
             [Dataset Beta] 
             [Dataset Gamma]
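
Both relabeling modes can be sketched in Python as follows. The function names are hypothetical, and the error handling (raising when the list length differs or an identifier is missing from the table) is an assumption of this sketch rather than documented tool behaviour.

```python
def relabel_from_list(identifiers, new_names):
    """Assign new names by position: first line to first element, and so on."""
    if len(identifiers) != len(new_names):
        raise ValueError("the list must contain one new name per element")
    return list(new_names)


def relabel_from_table(identifiers, table_lines):
    """Assign new names using tab-delimited 'old<TAB>new' lines in any order."""
    mapping = dict(line.split("\t") for line in table_lines)
    # A missing identifier raises KeyError in this sketch.
    return [mapping[name] for name in identifiers]


collection = ["A", "B", "X"]

print(relabel_from_list(collection, ["Alpha", "Beta", "Gamma"]))
print(relabel_from_table(collection, ["B\tBeta", "X\tGamma", "A\tAlpha"]))
```

In the table mode the order of the collection is kept; only the labels change.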

Sort collection

tool Sort collection … well … sorts a dataset collection alphabetically, numerically, or using a predetermined order from a supplied file.

Numeric sort

The tool sorts in ascending order. When numeric sort is chosen, the tool ignores non-numeric characters. For example, if a collection contains the following elements:

 Collection: [Horse123] 
             [Donkey543] 
             [Mule176]

The tool will output:

 Collection: [Horse123]
             [Mule176] 
             [Donkey543] 

Sorting from file

Alternatively, one can supply a single-column text file containing element identifiers in the desired sort order. For example, suppose there is a collection:

 Collection: [Horse123] 
             [Donkey543] 
             [Mule176]

and a file specifying sort order:

 Donkey543
 Horse123 
 Mule176

the output will predictably look like this:

 Collection: [Donkey543] 
             [Horse123] 
             [Mule176]
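
The numeric and file-based sort modes can be illustrated with the Python sketch below. The function names are hypothetical; treating an identifier without digits as 0 and raising `KeyError` for identifiers missing from the order file are assumptions of this sketch.

```python
import re


def numeric_key(identifier):
    """Drop every non-digit character and interpret the rest as a number."""
    digits = re.sub(r"\D", "", identifier)
    return int(digits) if digits else 0


def sort_numeric(identifiers):
    """Ascending sort that ignores non-numeric characters."""
    return sorted(identifiers, key=numeric_key)


def sort_from_file(identifiers, order_lines):
    """Sort identifiers by their position in a single-column order file."""
    position = {name: index for index, name in enumerate(order_lines)}
    return sorted(identifiers, key=lambda name: position[name])


collection = ["Horse123", "Donkey543", "Mule176"]

print(sort_numeric(collection))
print(sort_from_file(collection, ["Donkey543", "Horse123", "Mule176"]))
```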

Tag collection

tool Tag collection adds tags (including name: and group: tags) to collection elements.

The relationship between element names and tags is specified in a two-column tab-delimited file. This file may contain fewer entries than there are elements in the collection. In that case only matching list identifiers will be tagged.

To create name: or group: tags prepend them with # (you can also use name:) or group:, respectively.

More about tags

tip Tip: More about tags

Galaxy allows tagging datasets to facilitate analyses. There are several types of tags including simple tags, name tags, and group tags. Simple tags allow you to attach an alternative label to a dataset, which will make it easier to find later. Name tags allow you to track propagation of a dataset through the analyses: all datasets derived from the initial dataset labeled with a name tag will inherit it. Finally, group tags allow you to label groups of datasets. This is useful, for example, for differential expression analysis where you can have two groups of datasets labeled as “treatment” and “control”.

To learn more about tags go to the training site.

Tools that change collection structure

Flatten collection

tool Flatten collection collapses nested collection into a simple list.

This tool takes nested collections such as a list of lists or a list of dataset pairs and produces a flat list from the inputs. It effectively “flattens” the hierarchy. The collection identifiers are merged together (using _ as default) to create new collection identifiers in the flattened result:

Flatten collection
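
The identifier merging described above can be sketched in Python. The function `flatten` and the nested-dictionary representation are assumptions made for illustration; the `_` separator follows the default mentioned in the text.

```python
def flatten(nested, separator="_"):
    """Turn a two-level collection into a flat one with joined identifiers."""
    flat = {}
    for outer_id, inner in nested.items():
        for inner_id, dataset in inner.items():
            flat[f"{outer_id}{separator}{inner_id}"] = dataset
    return flat


paired = {
    "M117-bl": {"forward": "reads F (blood)", "reverse": "reads R (blood)"},
    "M117-ch": {"forward": "reads F (cheek)", "reverse": "reads R (cheek)"},
}

for identifier, dataset in flatten(paired).items():
    print(identifier, "->", dataset)
```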

Merge collections

tool Merge collections takes two or more collections and creates a single collection from them.

By default the tool assumes that collections that are being merged have unique dataset names. If this is not the case, only one (the first) of the datasets with a repeated name will be included in the merged collection. For example, suppose you have two collections. Both contain datasets named “A” and “B”, and each also has one uniquely named dataset (“X” or “Y”):

 Collection 1: [Dataset A] 
               [Dataset B] 
               [Dataset X]
 Collection 2: [Dataset A] 
               [Dataset B] 
               [Dataset Y]

Merging them will produce a single collection with only one copy each of “A” and “B”:

 Merged Collection: [Dataset A] 
                    [Dataset B] 
                    [Dataset X] 
                    [Dataset Y]

This behavior can be changed by clicking on “Advanced Options” link. The following options are available:

Keep first instance (Default behavior)

Input:

 Collection 1: [Dataset A] 
               [Dataset B] 
               [Dataset X]
 Collection 2: [Dataset A] 
               [Dataset B] 
               [Dataset Y]

Output:

 Merged Collection: [Dataset A] 
                    [Dataset B] 
                    [Dataset X] 
                    [Dataset Y]

Here, if two collections have identical dataset names, the dataset is chosen from the first collection.


Keep last instance

Input:

 Collection 1: [Dataset A] 
               [Dataset B] 
               [Dataset X]
 Collection 2: [Dataset A] 
               [Dataset B] 
               [Dataset Y]

Output:

 Merged Collection: [Dataset A] 
                    [Dataset B] 
                    [Dataset X] 
                    [Dataset Y]

Here, if two collections have identical dataset names, the dataset is chosen from the last collection.


Append suffix to conflicted element identifiers

Input:

 Collection 1: [Dataset A] 
               [Dataset B] 
               [Dataset X]
 Collection 2: [Dataset A] 
               [Dataset B] 
               [Dataset Y]

Output:

 Merged Collection: [Dataset A_1] 
                    [Dataset B_1]
                    [Dataset A_2] 
                    [Dataset B_2]  
                    [Dataset X] 
                    [Dataset Y]

Append suffix to conflicted element identifiers after first one encountered

Input:


 Collection 1: [Dataset A] 
               [Dataset B] 
               [Dataset X]
 Collection 2: [Dataset A] 
               [Dataset B] 
               [Dataset Y]

Output:

 Merged Collection: [Dataset A] 
                    [Dataset B]
                    [Dataset A_2] 
                    [Dataset B_2]  
                    [Dataset X] 
                    [Dataset Y]

Append suffix to every element identifier

Input:

 Collection 1: [Dataset A] 
               [Dataset B] 
               [Dataset X]
 Collection 2: [Dataset A] 
               [Dataset B] 
               [Dataset Y]

Output:

 Merged Collection: [Dataset A_1] 
                    [Dataset B_1]
                    [Dataset A_2] 
                    [Dataset B_2]  
                    [Dataset X_1] 
                    [Dataset Y_2]

Fail collection creation

This option will simply trigger an error.
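
The conflict-handling options above can be compared side by side with a Python sketch. It is an illustration only: the function `merge`, the mode names, and the use of the collection number as suffix are assumptions modelled on the listings above, and the element order in the result follows the input order, which may differ from the order shown in those listings.

```python
from collections import Counter


def merge(collections, mode="keep_first"):
    """Merge lists of identifiers; values record (collection number, original name)."""
    counts = Counter(name for coll in collections for name in coll)
    merged = {}
    for number, coll in enumerate(collections, start=1):
        for name in coll:
            conflicted = counts[name] > 1
            if mode == "fail" and conflicted:
                raise ValueError(f"duplicate identifier: {name}")
            if mode == "keep_first":
                merged.setdefault(name, (number, name))
            elif mode == "keep_last":
                merged[name] = (number, name)
            elif mode == "suffix_conflict":
                key = f"{name}_{number}" if conflicted else name
                merged[key] = (number, name)
            elif mode == "suffix_conflict_rest":
                key = f"{name}_{number}" if name in merged else name
                merged[key] = (number, name)
            elif mode == "suffix_every":
                merged[f"{name}_{number}"] = (number, name)
            elif mode == "fail":
                merged[name] = (number, name)
            else:
                raise ValueError(f"unknown mode: {mode}")
    return merged


inputs = [["A", "B", "X"], ["A", "B", "Y"]]
for mode in ("keep_first", "keep_last", "suffix_conflict",
             "suffix_conflict_rest", "suffix_every"):
    print(mode, sorted(merge(inputs, mode)))
```

The sketch assumes identifiers are unique within each input collection, so conflicts can only arise between collections.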

Zip collection

tool Zip collection takes two collections and creates a paired collection from them.

If you have one collection containing only forward reads and one containing only reverse reads, this tool will “zip” them together into a simple paired collection. For example, given two collections with forward and reverse reads, they can be “zipped” into a single paired collection:

Zip collection

Unzip collection

tool Unzip collection takes a paired collection and “unzips” it into two simple dataset collections (lists of datasets).

Given a paired collection of forward and reverse reads this tool will “unzip” it into two collections containing forward and reverse reads, respectively:

Unzip collection
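
Zipping and unzipping are inverse operations, which the following Python sketch makes explicit. The function names and the dictionary layout are hypothetical; requiring both input lists to have the same identifiers in the same order is an assumption of this sketch.

```python
def zip_collections(forward, reverse):
    """Combine two lists keyed by sample into one paired collection."""
    if list(forward) != list(reverse):
        raise ValueError("both collections must list the same samples in order")
    return {
        sample: {"forward": forward[sample], "reverse": reverse[sample]}
        for sample in forward
    }


def unzip_collection(paired):
    """Split a paired collection into a forward list and a reverse list."""
    forward = {sample: pair["forward"] for sample, pair in paired.items()}
    reverse = {sample: pair["reverse"] for sample, pair in paired.items()}
    return forward, reverse


forward_reads = {"M117-bl": "M117-bl_1", "M117-ch": "M117-ch_1"}
reverse_reads = {"M117-bl": "M117-bl_2", "M117-ch": "M117-ch_2"}

paired = zip_collections(forward_reads, reverse_reads)
print(paired)
print(unzip_collection(paired))
```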

Tools that combine elements of a collection

Column join

tool Column join merges elements of a collection on a given column.

If you have a collection with three elements (image below), merging it on the first column will first produce a union of the values found in the first column of each element and then paste elements having the same value side-by-side:

Column join
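
The union-then-paste behaviour described above can be sketched in Python. This is not the Column join tool’s implementation: the function `column_join`, the representation of each element as a key-to-value dictionary, the sorted key order, and the `.` fill value for missing entries are all assumptions made for illustration.

```python
def column_join(collection, fill="."):
    """Join elements on their key column; missing values get a fill character."""
    # Union of all keys found in the first column of any element.
    keys = sorted({key for rows in collection.values() for key in rows})
    table = []
    for key in keys:
        table.append([key] + [rows.get(key, fill) for rows in collection.values()])
    return table


elements = {
    "sample1": {"a": "1", "b": "2"},
    "sample2": {"b": "3", "c": "4"},
    "sample3": {"a": "5", "c": "6"},
}

for row in column_join(elements):
    print("\t".join(row))
```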

Collapse collection

tool Collapse collection merges elements together (head-to-tail) in the order of the collection. Its power comes from the ability to add identifiers when it performs the merge. Identifiers can be added in a variety of ways specified by the Prepend File name option, as shown in the figure below (we used option A in the last step of this tutorial). A = Same line and each line in dataset; B = Same line and only once per dataset; C = Line above.

Collapse collection

Key points

  • Multiple datasets can be combined in a collection.

  • This significantly simplifies the analysis.

  • This tutorial showed how to (1) create a collection, (2) run tools on a collection, and (3) combine collection elements into a final analysis result.

  • A variety of collection operation tools is available for performing many kinds of transformations.

Frequently Asked Questions

Have questions about this tutorial? Check out the FAQ page for the Using Galaxy and Managing your Data topic to see if your question is listed there. If not, please ask your question on the GTN Gitter Channel or the Galaxy Help Forum.

Feedback

Did you use this material as an instructor? Feel free to give us feedback on how it went.


Citing this Tutorial

  1. Anton Nekrutenko, 2021 Using dataset collections (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/galaxy-interface/tutorials/collections/tutorial.html Online; accessed TODAY
  2. Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012

details BibTeX

@misc{galaxy-interface-collections,
author = "Anton Nekrutenko",
title = "Using dataset collections (Galaxy Training Materials)",
year = "2021",
month = "08",
day = "08",
url = "\url{https://training.galaxyproject.org/training-material/topics/galaxy-interface/tutorials/collections/tutorial.html}",
note = "[Online; accessed TODAY]"
}
@article{Batut_2018,
    doi = {10.1016/j.cels.2018.05.012},
    url = {https://doi.org/10.1016%2Fj.cels.2018.05.012},
    year = 2018,
    month = {jun},
    publisher = {Elsevier {BV}},
    volume = {6},
    number = {6},
    pages = {752--758.e1},
    author = {B{\'{e}}r{\'{e}}nice Batut and Saskia Hiltemann and Andrea Bagnacani and Dannon Baker and Vivek Bhardwaj and Clemens Blank and Anthony Bretaudeau and Loraine Brillet-Gu{\'{e}}guen and Martin {\v{C}}ech and John Chilton and Dave Clements and Olivia Doppelt-Azeroual and Anika Erxleben and Mallory Ann Freeberg and Simon Gladman and Youri Hoogstrate and Hans-Rudolf Hotz and Torsten Houwaart and Pratik Jagtap and Delphine Larivi{\`{e}}re and Gildas Le Corguill{\'{e}} and Thomas Manke and Fabien Mareuil and Fidel Ram{\'{\i}}rez and Devon Ryan and Florian Christoph Sigloch and Nicola Soranzo and Joachim Wolff and Pavankumar Videm and Markus Wolfien and Aisanjiang Wubuli and Dilmurat Yusuf and James Taylor and Rolf Backofen and Anton Nekrutenko and Björn Grüning},
    title = {Community-Driven Data Analysis Training for Biology},
    journal = {Cell Systems}
}
                

Congratulations on successfully completing this tutorial!