# Metabarcoding/eDNA through Obitools

### Overview

Questions:
• How can DNA metabarcoding / eDNA data produced on Illumina sequencers be analyzed using the OBITools?

Objectives:
• Deal with paired-end data to create consensus sequences

• Clean, filter and analyse data to obtain robust results

Time estimation: 1 hour
Last modification: Mar 18, 2022

# Introduction

Based on the official OBITools tutorial, you will learn here how to analyze DNA metabarcoding data produced on Illumina sequencers using:

• the OBITools on Galaxy
• some classical Galaxy tools

The data used in this tutorial correspond to the analysis of four wolf scats, using the protocol published in Shehzad et al. 2012 for assessing carnivore diet. After extracting DNA from the faeces, the DNA amplifications were carried out using the primers TTAGATACCCCACTATGC and TAGAACAGGCTCCTCTAG amplifying the 12S-V5 region (Riaz et al. 2011), together with a wolf blocking oligonucleotide.

It is always a good idea to look at the intermediate results and to evaluate the best parameter for each step. Some commands are designed for that purpose. For example, you can use:

• obicount to count the number of sequence records in a file
• obihead and obitail to view the first or last sequence records of a file
• obistat to get some basic statistics (count, mean, standard deviation) on the attributes (key=value combinations) in the header of each sequence record (see The extended OBITools fasta format in the fasta format description)
• any Galaxy tool corresponding to a classic Unix command such as less, awk, sort or wc to check your files.

The OBITools programs imitate standard Unix programs because they usually act as filters. The main difference from classic Unix programs is that text files are not analyzed line by line but sequence record by sequence record (see below for a detailed description of a sequence record). Compared to packages for similar purposes like mothur (Schloss et al. 2009) or QIIME (Caporaso et al. 2010), the OBITools mainly rely on filtering and sorting algorithms. This allows users to set up versatile data analysis pipelines.

Most of the OBITools commands read sequence records from a file or from the stdin, make some computations on the sequence records and output annotated sequence records. As inputs, the OBITools are able to automatically recognize the most common sequence file formats (i.e. FASTA, FASTQ, EMBL, and GenBank). They are also able to read ecoPCR (Ficetola et al. 2010) result files and ecoPCR/ecoPrimers formatted sequence databases (Riaz et al. 2011) as ordinary sequence files. File format outputs are more limited. By default, sequences without and with quality information are written in FASTA and FASTQ formats, respectively. However, dedicated options allow enforcing the output format, and the OBITools are also able to write sequences in the ecoPCR/ecoPrimers database format, to produce reference databases for these programs. In FASTA or FASTQ format, the attributes are written in the header line just after the id, following a key=value; format.

# Manage input data

The data needed to run the tutorial are the following:

• FASTQ files resulting from a GA IIx (Illumina) paired-end (2 x 108 bp) sequencing assay of DNA extracted and amplified from four wolf faeces:
• wolf_F.fastq
• wolf_R.fastq
• the file describing the primers and tags used for all samples sequenced:
• wolf_diet_ngsfilter.txt
• the tags correspond to short and specific sequences added on the 5’ end of each primer to distinguish the different samples
• the file containing the reference database in FASTA format:
• db_v05_r117.fasta
• this reference database has been extracted from release 117 of EMBL using ecoPCR

## Get data

1. Create a new history for this tutorial
2. Import the zip archive containing input files from Zenodo

https://zenodo.org/record/5932108/files/wolf_tutorial.zip

• Open the Galaxy Upload Manager (upload icon on the top-right of the tool panel)

• Select Paste/Fetch Data
• Paste the link into the text field

• Press Start

• Close the window

### Tip: Importing data from a data library

As an alternative to uploading the data from a URL or your computer, the files may also have been made available from a shared data library:

• Go into Shared data (top panel) then Data libraries
• Navigate to the correct folder as indicated by your instructor
• Select the desired files
• Click on the To History button near the top and select as Datasets from the dropdown menu
• In the pop-up window, select the history you want to import the files to (or create a new one)
• Click on Import
3. Rename the dataset, here a zip archive, if needed
4. Check that the datatype is zip

### Tip: Changing the datatype

• Click on the pencil icon for the dataset to edit its attributes
• In the central panel, click on the Datatypes tab on the top
• Select the desired datatype (here, zip)
• Click the Save button

### Hands-on: Unzip the downloaded .zip archive and prepare the unzipped files for use with OBITools

1. Unzip Tool: toolshed.g2.bx.psu.edu/repos/imgteam/unzip/unzip/0.2 with the following parameters:
• “Extract single file”: All files

### Comment

To work properly, this Unzip Galaxy tool expects a “simple” archive as input, that is, one without subdirectories.

2. Add to each datafile a tag and/or modify names (optional)

• Click on the dataset
• Click on Edit dataset tags
• Add a tag starting with #

Tags starting with # will be automatically propagated to the outputs of tools using this dataset.

• Check that the tag is appearing below the dataset name
3. Unhide all datasets from the resulting data collection so you can use these files independently.

4. Modify datatype from txt to tabular for the wolf_diet_ngsfilter dataset

### Questions

1. Why do we need to manually unhide the datasets from the data collection?

### Solution

1. A data collection is a feature often used to handle multiple datasets of the same format that can be analysed in batch mode. Here, the data collection is populated with heterogeneous data files coming from an archive. We therefore need to treat each dataset of the collection separately, and to do so we need to unhide the corresponding datasets in the history, because datasets inside collections behave like “symbolic links” to “classical” history datasets that are hidden by default.

# Use OBITools

OBITools (Boyer et al. 2015) is a set of programs specifically designed for analyzing NGS data in a DNA metabarcoding context, taking into account taxonomic information. It is distributed as an open source software available on the following website: http://metabarcoding.org/obitools.

The OBITools commands consider a sequence record as an entity composed of five distinct elements. Two of them are mandatory: the identifier (id) and the DNA or protein sequence itself. The id is a single word composed of characters, digits, and other symbols like dots or underscores, excluding spaces. Formally, the ids should be unique within a dataset and should identify each sequence record unambiguously, but only a few OBITools actually rely on this property. The three other elements composing a sequence record are optional: a sequence definition, a quality vector, and a set of attributes qualifying the sequence, each attribute being described by a key=value pair. The set of attributes is the central concept of the OBITools system. When an OBITools command is run on the sequence records included in a dataset, the result of the computation often consists of new attributes that complete the annotation of each sequence record. This strategy of sequence annotation allows the OBITools to return their results as a new sequence record file that can be used as the input of another OBITools program, ultimately creating complex pipelines (source: OBITools Welcome).
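To make the sequence record structure concrete, here is a minimal Python sketch that splits a FASTA header line into an identifier, a dictionary of key=value; attributes, and a free-text definition. The function name `parse_header` and the example header are illustrative assumptions, not part of the OBITools API, and real OBITools headers can hold richer values (such as dictionaries) that this sketch keeps as raw strings.

```python
import re

# Hypothetical helper (not an OBITools function): split a FASTA header into
# its identifier, its key=value; attributes, and the remaining definition.
ATTRIBUTE = re.compile(r"\s*(\w+)=([^;]*);")

def parse_header(header):
    header = header.lstrip(">").strip()
    seq_id, _, rest = header.partition(" ")
    attributes = {}
    pos = 0
    while True:
        match = ATTRIBUTE.match(rest, pos)
        if match is None:
            break
        attributes[match.group(1)] = match.group(2).strip()
        pos = match.end()
    # Whatever follows the last attribute is treated as the definition
    definition = rest[pos:].strip()
    return seq_id, attributes, definition

# Made-up header used only for illustration
seq_id, attributes, definition = parse_header(
    ">read_0001 mode=alignment; score=62.3; example definition"
)
print(seq_id, attributes, definition)
```

Values are left as strings on purpose: deciding which attributes are numbers or dictionaries is a separate step that depends on the tools that produced them.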

## Micro assembly of paired-end sequences with illuminapairedend

When using the result of a paired-end sequencing assay with supposedly overlapping forward and reverse reads, the first step is to recover the assembled sequence.

The forward and reverse reads of the same fragment are at the same line position in the two FASTQ files obtained after sequencing. Based on these two files, the assembly of the forward and reverse reads is done with the illuminapairedend utility that aligns the two reads and returns the reconstructed sequence.

### Hands-on: Recover consensus sequences from overlapping forward and reverse reads.

1. illuminapairedend Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_illumina_pairend/obi_illumina_pairend/1.2.13 with the following parameters:
• “Read from file”: wolf_F for the 3p file
• “Read from file”: wolf_R for the 5p file
• “minimum score for keeping aligment”: 40.0

### Comment

Sequence records corresponding to the same read pair must be in the same order in the two files!

If the alignment score is below the defined score, here 40, the forward and reverse reads are not aligned but concatenated, and the value of the mode attribute in the sequence header is set to joined instead of alignment.

## Remove unaligned sequence records with obigrep

In this step we use the value of the mode attribute in the sequence headers of the illuminapairedend output file to discard the records flagged as “joined” (reads that could not be assembled) and keep those flagged as “alignment” (see the explanation of this attribute in the previous step).

### Hands-on: Remove unaligned sequence records

1. obigrep Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_grep/obi_grep/1.2.13 with the following parameters:
• “Input sequences file”: illuminapairedend fastq groomer output file
• “Choose the sequence record selection option”: predicat
• “Python boolean expression to be evaluated for each sequence record.”: mode!="joined"

### Tip: Verifying FastQ format and converting it

• You can use FastQC Tool: toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.73+galaxy0 to check the quality score encoding of the file, then FASTQ Groomer Tool: toolshed.g2.bx.psu.edu/repos/devteam/fastq_groomer/fastq_groomer/1.1.5 to specify the detected encoding and create a fastqsanger file.

### Comment

The obigrep command is in some ways analogous to the standard Unix grep command. It selects a subset of sequence records from a sequence file.

A sequence record is a complex object composed of an identifier, a set of attributes (key=value), a definition, and the sequence itself.

Instead of working text line by text line as the standard Unix tool, selection is done sequence record by sequence record. A large set of options allows refining selection on any of the sequence record elements.

Moreover, obigrep allows several conditions (each evaluating to TRUE or FALSE) to be specified simultaneously, and only the sequence records that fulfill all the conditions are selected. You can refer to https://pythonhosted.org/OBITools/scripts/obigrep.html for more details.
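The record-by-record selection described above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming records are already loaded as dictionaries; `select_records` and the example records are hypothetical and do not reproduce how obigrep is implemented.

```python
# Hypothetical in-memory records: each record is a dict holding header
# attributes plus the sequence itself (all values below are made up).
records = [
    {"id": "r1", "mode": "alignment", "sequence": "ACGTACGT"},
    {"id": "r2", "mode": "joined", "sequence": "ACGTNNNNACGT"},
    {"id": "r3", "mode": "alignment", "sequence": "TTGACC"},
]

def select_records(records, predicates):
    """Keep the records for which every predicate returns True."""
    return [rec for rec in records if all(pred(rec) for pred in predicates)]

# Same condition as the expression mode!="joined" used in the hands-on
kept = select_records(records, [lambda rec: rec["mode"] != "joined"])
print([rec["id"] for rec in kept])
```

Passing several predicates in the list mirrors the behaviour described above: a record is kept only when all conditions are true.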

### Questions

1. How do you verify the operation is successful?
2. How many sequences are kept? Discarded?

### Solution

1. You can search for mode=joined in the content of the input file and then in the output file (for example by clicking the eye icon to display each file and pressing CTRL+F to search for mode=joined, or by using a regex Galaxy tool). You can also compare file sizes: an output file smaller than the input file is a first good indication.
2. You can use a Galaxy tool such as Line/Word/Character count of a dataset to count the lines of each dataset (input and output of obigrep) and divide by 4 (in a FASTQ file, each sequence is represented by a block of 4 lines). This gives 45 276 sequences for the input file and 44 717 for the output file, so 559 sequences were discarded.
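The line-count arithmetic from the solution can also be written as a short Python sketch. It assumes the usual layout of exactly four lines per FASTQ record; the function name and the tiny in-memory example are illustrative only.

```python
import io

def count_fastq_records(handle):
    """Count records assuming the common 4-lines-per-record FASTQ layout."""
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    return n_lines // 4

# Two made-up records held in memory instead of a real file
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
print(count_fastq_records(example))

# Arithmetic from the solution above: input minus output gives discarded
print(45276 - 44717)
```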

## Assign each sequence record to the corresponding sample/marker combination with NGSfilter

### Comment

Each sequence record is assigned to its corresponding sample and marker using the data provided in a text file (here wolf_diet_ngsfilter.txt). This text file contains one line per sample, with the name of the experiment (several experiments can be included in the same file), the name of the tags (for example: aattaac if the same tag has been used on each extremity of the PCR products, or aattaac:gaagtag if the tags were different), the sequence of the forward primer, the sequence of the reverse primer, the letter T or F for sample identification using the forward primer and tag only or using both primers and both tags, respectively.
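As a rough illustration of this file layout, the Python sketch below parses one description line and expands the tag field into a forward and a reverse tag. The column order used here (experiment, sample, tags, forward primer, reverse primer, T/F flag) includes a sample-name column that the description above does not list explicitly; treat it as an assumption and check it against your own wolf_diet_ngsfilter file. The function names are hypothetical.

```python
def split_tags(tag_field):
    """Return (forward_tag, reverse_tag) from 'tag' or 'tagF:tagR'."""
    forward, sep, reverse = tag_field.partition(":")
    return (forward, reverse) if sep else (forward, forward)

def parse_ngsfilter_line(line):
    """Parse one whitespace-separated sample description line.

    Assumed column order (verify against your own file):
    experiment, sample, tags, forward primer, reverse primer, T/F flag.
    """
    experiment, sample, tags, forward_primer, reverse_primer, flag = line.split()[:6]
    forward_tag, reverse_tag = split_tags(tags)
    return {
        "experiment": experiment,
        "sample": sample,
        "forward_tag": forward_tag,
        "reverse_tag": reverse_tag,
        "forward_primer": forward_primer,
        "reverse_primer": reverse_primer,
        "flag": flag,
    }

# Illustrative line assembled from values quoted in this tutorial
entry = parse_ngsfilter_line(
    "wolf_diet 13a_F730603 aattaac TTAGATACCCCACTATGC TAGAACAGGCTCCTCTAG F"
)
print(entry["sample"], entry["forward_tag"], entry["reverse_tag"])
```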

1. NGSfilter Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_ngsfilter/obi_ngsfilter/1.2.13 with the following parameters:
• “Parameter file”: wolf_diet_ngsfilter
• “Read from file”: obigrep output
• “Number of errors allowed for matching primers”: 2
• “Output data type”: fastq

### Tip: Be sure the text file is in tabular datatype

• If you are sure the format is compatible with the tabular datatype, as is the case here ;), you can change it manually by clicking on the pencil icon of the “wolf_diet_ngsfilter.txt” dataset, selecting the “Datatypes” section, choosing tabular, and saving the change.

### Questions

1. How many sequences are not assigned?

### Solution

1. 1391

## Dereplicate reads into unique sequences with obiuniq

### Comment

The same DNA molecule can be sequenced several times. To reduce both file size and computation time, and to get results that are easier to interpret, it is convenient to work with unique sequences instead of reads. To dereplicate such reads into unique sequences, we use the obiuniq command.

Definition: Dereplicate reads into unique sequences

• compare all the reads in a data set to each other
• group strictly identical reads together
• output the sequence for each group and its count in the original dataset (in this way, all duplicated reads are removed)

Definition adapted from Seguritan and Rohwer (2001)

1. obiuniq Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_uniq/obi_uniq/1.2.13 with the following parameters:
• “Input sequences file”: Trimmed and annotated file by NGSfilter
• “Attribute to merge”: sample
• “Use specific option”: merge
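The dereplication idea, including the per-sample bookkeeping that results from merging on the sample attribute, can be sketched as follows. This is a simplified illustration with made-up reads, not the obiuniq implementation; `dereplicate` is a hypothetical name.

```python
from collections import Counter, defaultdict

def dereplicate(reads):
    """Group strictly identical sequences and count reads per sample.

    reads: iterable of (sequence, sample) pairs.
    Returns {sequence: Counter({sample: number_of_reads})}.
    """
    groups = defaultdict(Counter)
    for sequence, sample in reads:
        groups[sequence][sample] += 1
    return groups

# Made-up reads from two hypothetical samples
reads = [("ACGT", "s1"), ("ACGT", "s1"), ("ACGT", "s2"), ("TTGA", "s2")]
for sequence, per_sample in dereplicate(reads).items():
    total = sum(per_sample.values())
    print(sequence, "count=%d;" % total, "merged_sample=%s;" % dict(per_sample))
```

The total per sequence plays the role of the count attribute, and the per-sample breakdown plays the role of merged_sample.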

### Questions

1. How many sequences did you have, and how many do you finally obtain?

### Solution

1. From 43 326 to 3 962

## Limit the amount of information with obiannotate

### Comment

obiannotate is the command that allows adding/modifying/removing annotation attributes attached to sequence records. Once such attributes are added, they can be used by the other OBITools commands for filtering purposes or for statistics computing.

Here, the goal is to keep only count and merged_sample key=value attributes!

1. obiannotate Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_annotate/obi_annotate/1.2.13 with the following parameters:
• “Input sequences file”: obiuniq output file
• In “Keep only attribute with key”:
• “key”: count
• “if you want to specify a second key”: merged_sample
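Conceptually, keeping only selected attributes amounts to filtering a dictionary of key=value pairs. The short Python sketch below illustrates this with a made-up attribute set; `keep_attributes` is a hypothetical helper, not an obiannotate function.

```python
def keep_attributes(attributes, keys=("count", "merged_sample")):
    """Return a copy of the attribute dict restricted to the given keys."""
    return {key: value for key, value in attributes.items() if key in keys}

# Made-up attribute set for one dereplicated record
attributes = {
    "count": 12,
    "merged_sample": {"29a_F260619": 12},
    "score": 62.3,
    "mode": "alignment",
}
print(keep_attributes(attributes))
```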

## Compute basic statistics for attribute values with obistat

### Comment

obistat computes basic statistics for attribute values of sequence records. The sequence records can be categorized or not using one or several -c options. By default, only the number of sequence records and the total count are computed for each category. Additional statistics can be computed for attribute values in each category, like:

• minimum value (-m option)
• maximum value (-M option)
• mean value (-a option)
• variance (-v option)
• standard deviation (-s option)

The result is a contingency table with the different categories in rows, and the computed statistics in columns.

1. obistat Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_stat/obi_stat/1.2.13 with the following parameters:
• “Input sequences file”: obiannotate output file
• In “Category attribute”:
• “Insert Category attribute”
• “How would you specify the category attribute key?”: simply by a key of an attribute
• “Attribute used to categorize the sequence records”: count
• “Use a specific option”: no
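To see what such a table contains when records are categorized by their count attribute, here is a minimal Python sketch. It assumes each record is a dictionary with a numeric count; the function name and data are illustrative and the output layout is not guaranteed to match obistat exactly.

```python
from collections import Counter

def stat_by_category(records, key="count"):
    """Return rows of (category value, number of records, total count)."""
    n_records = Counter(rec[key] for rec in records)
    totals = Counter()
    for rec in records:
        totals[rec[key]] += rec["count"]
    return [(value, n_records[value], totals[value]) for value in sorted(n_records)]

# Made-up records: three singletons, one sequence seen twice, one seen 10 times
records = [{"count": 1}, {"count": 1}, {"count": 1}, {"count": 2}, {"count": 10}]
for row in stat_by_category(records):
    print(*row, sep="\t")
```

In this toy table, the row for category 1 says how many sequences occur only once, which is the kind of information the question below asks for.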

### Questions

1. Can you use this result to say how many sequences occur only once? You will need Galaxy tools like Sort data in ascending or descending order and Select first lines from a dataset to answer the question.

### Solution

1. 3131 sequences occur only once.

## Filtering sequences by count and length with obigrep

In this step, we use obigrep to keep only the sequences having a count greater than or equal to 10 and a length of at least 80 bp.

### Hands-on: Filter sequences with obigrep

1. obigrep Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_grep/obi_grep/1.2.13 with the following parameters:
• “Input sequences file”: obiannotate output file
• “Choose the sequence record selection option”: predicat
• “Python boolean expression to be evaluated for each sequence record.”: count>=10
2. obigrep Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_grep/obi_grep/1.2.13 with the following parameters:
• “Input sequences file”: obigrep output file
• “Choose the sequence record selection option”: lmin
• “lmin”: 80

### Comment

Based on the previous observation, we set the cut-off for keeping sequences for further analysis to a count of 10. Based on previous knowledge, we also remove sequences shorter than 80 bp (option -l), as we know that the amplified 12S-V5 barcode for vertebrates must have a length of around 100 bp.
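Because the direction of each threshold is easy to get wrong, the two filters can be spelled out as a small Python sketch. The records are made up, and `passes_filters` is a hypothetical helper that only mirrors the thresholds used above (count of at least 10, length of at least 80 bp).

```python
def passes_filters(record, min_count=10, min_length=80):
    """True when the record is abundant enough and long enough to keep."""
    return record["count"] >= min_count and len(record["sequence"]) >= min_length

# Made-up records: only the first one satisfies both thresholds
records = [
    {"id": "a", "count": 25, "sequence": "A" * 100},
    {"id": "b", "count": 25, "sequence": "A" * 60},
    {"id": "c", "count": 3, "sequence": "A" * 100},
]
print([rec["id"] for rec in records if passes_filters(rec)])
```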

### Questions

1. How many sequences are kept following the “count” filter?
2. How many sequences are kept following the “length” filter?

### Solution

1. 178
2. 175

## Clean the sequences for PCR/sequencing errors (sequence variants) with obiclean

### Hands-on: Clean the sequences for PCR/sequencing errors (sequence variants)

As a final denoising step, using the obiclean program, we keep the head sequences, that is, sequences with no variants having a count greater than 5% of their own count.

1. obiclean Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_clean/obi_clean/1.2.13 with the following parameters:
• “Input sequences file”: obigrep output file
• “attribute containing sample definition”: merged_sample
• “Maximum numbers of differences between two variant sequences (default: 1)”: 1
• “Threshold ratio between counts (rare/abundant counts) of two sequence records so that the less abundant one is a variant of the more abundant (default: 1, i.e. all less abundant sequences are variants)”: 0.05
• “Do you want to select only sequences with the head status in a least one sample?”: Yes
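The roles of the two parameters above (maximum number of differences and count ratio) can be illustrated with a deliberately simplified Python sketch. It makes several assumptions that differ from obiclean itself: it handles a single sample, compares only sequences of equal length by counting mismatches (no insertions or deletions), and uses a strict "less than" comparison for the ratio. Function names and counts are made up.

```python
def hamming_distance(a, b):
    """Number of mismatches between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def head_sequences(counts, max_diff=1, ratio=0.05):
    """Return the sequences that are not variants of a more abundant one.

    counts: {sequence: read count} for one sample.
    A sequence is treated as a variant when another sequence lies within
    max_diff mismatches and the rare/abundant count ratio is below ratio.
    """
    heads = []
    for seq, n in counts.items():
        is_variant = any(
            len(other) == len(seq)
            and 0 < hamming_distance(seq, other) <= max_diff
            and n / m < ratio
            for other, m in counts.items()
        )
        if not is_variant:
            heads.append(seq)
    return heads

# Made-up counts: ACGTACGA (20 reads) is one mismatch away from
# ACGTACGT (1000 reads) and 20/1000 = 0.02 < 0.05, so it is a variant.
counts = {"ACGTACGT": 1000, "ACGTACGA": 20, "ACGTACGG": 200, "TTTTCCCC": 15}
print(head_sequences(counts))
```

In this toy example ACGTACGG is kept because 200/1000 = 0.2 is above the ratio, even though it is also one mismatch away from the most abundant sequence.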

## Taxonomic assignment of sequences with NCBI BLAST+ blastn

### Hands-on: Search nucleotide database with nucleotide query sequence(s) from OBITools treatments

Once denoising has been done, the next step in diet analysis is to assign the barcodes to the corresponding species in order to get the complete list of species associated to each sample. Taxonomic assignment of sequences requires a reference database compiling all possible species to be identified in the sample. Assignment is then done based on sequence comparison between sample sequences and reference sequences. We here propose to use BLAST+ blastn.

1. NCBI BLAST+ blastn Tool: toolshed.g2.bx.psu.edu/repos/devteam/ncbi_blast_plus/ncbi_blastn_wrapper/2.10.1+galaxy0 with the following parameters:
• “Nucleotide query sequence(s)”: obiclean output file
• “Subject database/sequences”: FASTA file from your history
• “Nucleotide FASTA subject file to use instead of a database”: db_v05_r117
• “Set expectation value cutoff”: 0.0001
• “Output format”: Tabular (extended 25 columns)
• “Maximum hits to consider/show”: 1

### Comment

Here we directly use the db_v05_r117 FASTA file provided with the original OBITools tutorial. Note that you can create such a FASTA file by applying the same kind of OBITools workflow described above (obigrep/obiuniq/obigrep/obiannotate) to downloaded EMBL databases and taxonomy processed with the ecoPCR tool.

## Filter database and query sequences by ID to re-associate information with Filter sequences by ID

### Comment

This tool allows you to re-associate all the information of the reference sequences, notably species_name, so you can see which species are potentially present in the samples. We will also use it to re-associate all the information of the query sequences, notably the merged_sample and obiclean_count attributes, so we can better evaluate the quality of the results.

1. Filter sequences by ID Tool: toolshed.g2.bx.psu.edu/repos/peterjc/seq_filter_by_id/seq_filter_by_id/0.2.7 with the following parameters:
• “Sequence file to be filtered”: db_v05_r117
• “Filter using the ID list from”: tabular file
• “Tabular file containing sequence identifiers”: megablast on obiclean output
• “Column(s) containing sequence identifiers”: Column 2
• “Output positive matches, negative matches, or both?”: just positive match
2. Filter sequences by ID Tool: toolshed.g2.bx.psu.edu/repos/peterjc/seq_filter_by_id/seq_filter_by_id/0.2.7 with the following parameters:
• “Sequence file to be filtered”: obiclean output data
• “Filter using the ID list from”: tabular file
• “Tabular file containing sequence identifiers”: megablast on obiclean output
• “Column(s) containing sequence identifiers”: Column 1
• “Output positive matches, negative matches, or both?”: just positive match
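The logic of these two steps (take identifiers from one column of the BLAST table, then keep only the matching FASTA records) can be sketched in Python as follows. The helper names and the tiny inputs are hypothetical; the sketch only assumes tab-separated BLAST lines and FASTA headers whose first word is the identifier.

```python
def ids_from_column(tabular_lines, column):
    """Collect identifiers from a 1-based column of tab-separated lines."""
    ids = set()
    for line in tabular_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= column and fields[column - 1]:
            ids.add(fields[column - 1])
    return ids

def filter_fasta(fasta_lines, wanted_ids):
    """Yield the lines of FASTA records whose identifier is in wanted_ids."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            keep = bool(words) and words[0] in wanted_ids
        if keep:
            yield line

# Made-up BLAST hits (query id, subject id, identity) and reference records
blast_lines = ["query1\tref7\t98.0\n", "query2\tref9\t95.0\n"]
reference = [">ref7 species_name=A;\n", "ACGT\n", ">ref8 species_name=B;\n", "TTGA\n"]
print("".join(filter_fasta(reference, ids_from_column(blast_lines, 2))), end="")
```

Using column 2 selects reference sequences that were hit, while column 1 would select the query sequences that produced a hit, as in the two steps above.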

## From FASTA to tabular with Obitab

### Hands-on: Convert the filtered FASTA files into tabular files

1. obitab Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_tab/obi_tab/1.2.13 with the following parameters:
• “Input sequences file”: db_v05_r117 with matched ID

### Comment

This tool allows you to convert a FASTA file into a tabular one so it is easier to read the sequence definitions.

2. obitab Tool: toolshed.g2.bx.psu.edu/repos/iuc/obi_tab/obi_tab/1.2.13 with the following parameters:
• “Input sequences file”: obiclean output with matched ID

## Create a final synthesis as a tabular file

### Hands-on: Join blast and obitab files, then cut relevant columns and apply filters

1. Join two datasets side by side on a specified field Tool: join1 with the following parameters:
• “Join”: obitab on obiclean output file
• “using column”: Column 1
• “with”: megablast output file
• “using column”: Column 1
• “Fill empty columns”: Yes
• “Fill Columns by”: Single fill value
• “Fill value”: NA
2. Join two datasets side by side on a specified field Tool: join1 with the following parameters:
• “Join”: last Join two Datasets output file
• “using column”: Column 26
• “with”: obitab on db_v05_r117 with matched ID output file
• “using column”: Column 1
• “Fill empty columns”: Yes
• “Fill Columns by”: Single fill value
• “Fill value”: NA

### Comment

To have something easier to read and understand, we create a tabular file containing only the columns with important information (c1: query sequence names / c3-7: query counts / c50: reference sequence names / c54: family / c59: genus / c51: reference annotations).

3. Cut columns from a table Tool: Cut1 with the following parameters:
• “Cut columns”: c1,c3,c4,c5,c6,c7,c50,c54,c59,c51
• “From”: last Join two Datasets output file
4. Filter data on any column using simple expressions Tool: Filter1 with the following parameters:
• “Filter”: Cut output file
• “With following condition”: c3>1000 or c4>1000 or c5>1000 or c6>1000
• “Number of header lines to skip”: 1

### Comment

To keep only data with significant counts.
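The join and filter logic of this section can be sketched in Python on a toy example. The sketch assumes that rows of the first table without a match are kept and padded with the fill value, which in Galaxy depends on the options chosen for the join tool; the function name, column positions and data are illustrative only.

```python
def join_on_column(left_rows, right_rows, left_col, right_col, fill="NA"):
    """Join two lists of rows (lists of strings) on 1-based key columns.

    Left rows without a match are kept and padded with the fill value.
    """
    index = {}
    for row in right_rows:
        index.setdefault(row[right_col - 1], []).append(row)
    width = max((len(row) for row in right_rows), default=0)
    joined = []
    for row in left_rows:
        for match in index.get(row[left_col - 1], [[fill] * width]):
            joined.append(row + match)
    return joined

# Made-up tables: (query id, count in sample 1, count in sample 2) and BLAST hits
query_table = [["seq1", "1200", "0"], ["seq2", "3", "5"]]
blast_table = [["seq1", "refA"]]
joined = join_on_column(query_table, blast_table, 1, 1)

# Same spirit as the condition c3>1000 or c4>1000 ..., on the toy count columns
significant = [row for row in joined if int(row[1]) > 1000 or int(row[2]) > 1000]
print(joined)
print(significant)
```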

### Questions

1. How many species are identified? You can use Cut columns from a table and unique occurrences of each record to isolate the species name column of the obitab results.
2. Can you deduce the diet of each sample? You can use tools like obitab and Join two Datasets side by side on a specified field to join the megablast results with the obigrep results and db_v05_r117 with matched ID.

### Solution

1. 10
2. If we ignore the human sequences ;), we find squirrel (sample:26a_F040644), deer (sample:15a_F730814 + sample:29a_F260619), stag (sample:29a_F260619 + sample:13a_F730603), marmot (sample:26a_F040644), doe (sample:29a_F260619), Grimm’s duiker (sample:29a_F260619).

# Conclusion

You just performed an ecological analysis, inferring diet from wolf faeces! You now know how to preprocess metabarcoding data on Galaxy, produce quantitative information with quality checks, filter the results to interpret them, and build a synthesis table you can share broadly!

### Key points

• From raw reads you can process, clean and filter data to obtain a list of species from environmental DNA (eDNA) samples.

# Useful literature

Further information, including links to documentation and original publications, regarding the tools, analysis techniques and the interpretation of results described in this tutorial can be found here.

# References

1. Schloss, P. D., S. L. Westcott, T. Ryabin, J. R. Hall, M. Hartmann et al., 2009 Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75: 7537–7541.
2. Caporaso, J. G., J. Kuczynski, J. Stombaugh, K. Bittinger, F. D. Bushman et al., 2010 QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7: 335–336.
3. Ficetola, G. F., E. Coissac, S. Zundel, T. Riaz, W. Shehzad et al., 2010 An in silico approach for the evaluation of DNA barcodes. BMC Genomics 11: 434. 10.1186/1471-2164-11-434
4. Riaz, T., W. Shehzad, A. Viari, F. Pompanon, P. Taberlet et al., 2011 ecoPrimers: inference of new DNA barcode markers from whole genome sequence analysis. Nucleic Acids Research 39: e145–e145. 10.1093/nar/gkr732
5. Shehzad, W., T. Riaz, M. A. Nawaz, C. Miquel, C. Poillot et al., 2012 Carnivore diet analysis based on next-generation sequencing: application to the leopard cat (Prionailurus bengalensis) in Pakistan. Molecular Ecology 21: 1951–1965. 10.1111/j.1365-294x.2011.05424.x
6. Boyer, F., C. Mercier, A. Bonin, Y. L. Bras, P. Taberlet et al., 2015 obitools: a unix-inspired software package for DNA metabarcoding. Molecular Ecology Resources 16: 176–182. 10.1111/1755-0998.12428

# Citing this Tutorial

1. Coline Royaux, Olivier Norvez, Eric Coissac, Frédéric Boyer, Yvan Le Bras, 2022 Metabarcoding/eDNA through Obitools (Galaxy Training Materials). https://training.galaxyproject.org/archive/2022-07-01/topics/ecology/tutorials/Obitools-metabarcoding/tutorial.html Online; accessed TODAY
2. Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012

### BibTeX

@misc{ecology-Obitools-metabarcoding,
author = "Coline Royaux and Olivier Norvez and Eric Coissac and Frédéric Boyer and Yvan Le Bras",
title = "Metabarcoding/eDNA through Obitools (Galaxy Training Materials)",
year = "2022",
month = "03",
day = "18",
url = "\url{https://training.galaxyproject.org/archive/2022-07-01/topics/ecology/tutorials/Obitools-metabarcoding/tutorial.html}",
note = "[Online; accessed TODAY]"
}
@article{Batut_2018,
doi = {10.1016/j.cels.2018.05.012},
url = {https://doi.org/10.1016%2Fj.cels.2018.05.012},
year = 2018,
month = {jun},
publisher = {Elsevier {BV}},
volume = {6},
number = {6},
pages = {752--758.e1},
author = {B{\'{e}}r{\'{e}}nice Batut and Saskia Hiltemann and Andrea Bagnacani and Dannon Baker and Vivek Bhardwaj and Clemens Blank and Anthony Bretaudeau and Loraine Brillet-Gu{\'{e}}guen and Martin {\v{C}}ech and John Chilton and Dave Clements and Olivia Doppelt-Azeroual and Anika Erxleben and Mallory Ann Freeberg and Simon Gladman and Youri Hoogstrate and Hans-Rudolf Hotz and Torsten Houwaart and Pratik Jagtap and Delphine Larivi{\`{e}}re and Gildas Le Corguill{\'{e}} and Thomas Manke and Fabien Mareuil and Fidel Ram{\'{\i}}rez and Devon Ryan and Florian Christoph Sigloch and Nicola Soranzo and Joachim Wolff and Pavankumar Videm and Markus Wolfien and Aisanjiang Wubuli and Dilmurat Yusuf and James Taylor and Rolf Backofen and Anton Nekrutenko and Björn Grüning},
title = {Community-Driven Data Analysis Training for Biology},
journal = {Cell Systems}
}