Gallantries Grant - Intellectual Output 3 - Data stewardship, federation, standardisation, and collaboration

PURL: https://gxy.io/GTN:P00014

Comment: What is a Learning Pathway?

We recommend you follow the tutorials in the order presented on this page. They have been selected to fit together and build up your knowledge step by step. If a lesson has both slides and a tutorial, we recommend you start with the slides, then proceed with the tutorial.

This Learning Pathway collects the results of Intellectual Output 3 of the Gallantries Project.

Success Criteria:

SC3.1) Data stewardship. This competency will provide learners with the necessary skills to evaluate data owned by their organisation, identify key metadata and content requirements, and establish controls and assurance metrics to ensure new data produced by their organisation is of sufficient quality.
SC3.2) Data federation. While hosting data internally is good, sharing data with teams across the Union and around the world is even better. Students need to understand how to achieve data federation and interchange with other organisations.
SC3.3) Data standardisation. A key component of stewardship and federation, standardisation of data allows it to be reused internally by common pipelines and, more importantly, to be submitted to external databases. This is a key requirement of many data-related projects.
SC3.4) Data collaboration. Many projects now scale beyond what an individual can work on alone. Students need to learn how to work together with large datasets.
SC3.5) FAIR Data. For high-quality, reproducible science, datasets should be FAIR: findable, accessible, interoperable, and reusable. This competency aids learners in ensuring their data is FAIR.

Year 1: Introduction to genomics and genome annotation

This module gives students a good basic knowledge of the application domain of this IO and their first taste of data management. [SC3.1, SC3.3, SC3.5]

Time estimation:

Learning Objectives

Lesson	Slides	Hands-on	Recordings
Introduction to Genome Annotation	Slides (plain text version available)

Year 1: Prokaryotic annotation

This module covers the background relevant to annotating prokaryotic genomes (one of the two main classes of genomes) in Galaxy, collaborative curation with Apollo, and further exploration of annotation from code. [SC1.5, SC3.1-4]

Time estimation: 4 hours

Learning Objectives

Load genome into Galaxy
Annotate genome with Prokka
View annotations in JBrowse
Learn how to load JBrowse data into Apollo
Learn how to manually refine genome annotations within Apollo
Export refined genome annotations

Lesson	Slides	Hands-on	Recordings
Genome annotation with Prokka (tags: gmod, prokaryote, microgalaxy, jbrowse1)	Slides (plain text version available)	Tutorial	Lecture (February 2021) - 3m; Tutorial (February 2021) - 20m
Refining Genome Annotations with Apollo (prokaryotes) (tags: gmod, prokaryote, microgalaxy, jbrowse1, biodiversity, apollo2)	Slides (plain text version available)	Tutorial	Lecture (February 2021) - 5m; Tutorial (February 2021) - 1h

Year 2: FAIR Data

This submodule focuses specifically on how learners can make their data more FAIR (findable, accessible, interoperable, and reusable). [SC3.5]

Time estimation: 3 hours 35 minutes

Learning Objectives

Learn the FAIR principles
Recognise the relationship between FAIR and Open data
Learn about metadata and findability
Learn how to support system and content curation
Learn best practices in data management
Learn how to introduce computational reproducibility in your research
Locate bioimage data repositories
Compare repositories to find which are suitable for your data
Find out what the requirements are for submitting
Construct an RO-Crate by hand using JSON
Describe each part of the Research Object
Learn basic JSON-LD to create FAIR metadata
Connect different parts of the Research Object using identifiers
Understand, view and create Galaxy Workflow Run Crates
Create a custom, annotated RO-Crate
Use ORCIDs and other linked data to annotate datasets contained within the crate
Generate a workflow test using Planemo
Understand how testing can be automated with GitHub Actions
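
As a rough illustration of the objectives "Construct an RO-Crate by hand using JSON" and "Connect different parts of the Research Object using identifiers", the sketch below builds a minimal ro-crate-metadata.json document with Python's standard json module. It assumes the RO-Crate 1.1 layout (a metadata descriptor plus a root dataset). The function name, crate name, file name, date, licence and ORCID are placeholders invented for this example and are not taken from the tutorials.

```python
import json

# Placeholder ORCID for illustration only; replace with a real identifier.
AUTHOR_ID = "https://orcid.org/0000-0000-0000-0000"


def build_crate_metadata(name, description, data_file):
    """Return a minimal RO-Crate 1.1 metadata document as a dict (hypothetical helper)."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: identifies this file and points at the root dataset.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: describes the crate as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "datePublished": "2024-01-01",  # placeholder date
                "license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},  # placeholder licence
                "author": {"@id": AUTHOR_ID},
                "hasPart": [{"@id": data_file}],
            },
            # Entities referenced above, connected through their @id values.
            {"@id": data_file, "@type": "File", "name": data_file},
            {"@id": AUTHOR_ID, "@type": "Person", "name": "Example Author"},
        ],
    }


metadata = build_crate_metadata(
    "Example crate",
    "A hypothetical annotated dataset",
    "annotations.gff3",
)
crate_json = json.dumps(metadata, indent=2)
```

Writing crate_json to a file named ro-crate-metadata.json next to the data file would give a directory laid out as a crate. The entities refer to one another only through "@id" values, which is how the parts of the Research Object are linked.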

Lesson	Hands-on	Recordings
FAIR in a nutshell (tags: fair, open, data stewardship)	Tutorial
FAIR Galaxy Training Material (tags: fair, gtn, training)	Tutorial
FAIR data management solutions (tags: fair, dmp, data stewardship)	Tutorial
FAIR Bioimage Metadata (tags: Data management, Metadata, Bioimaging)	Tutorial
RO-Crate - Introduction	Tutorial	Tutorial (April 2023) - 17m
Exporting Workflow Run RO-Crates from Galaxy (tags: ro-crate, workflows)	Tutorial
RO-Crate in Python (tags: ro-crate, jupyter-notebook)	Tutorial
Best practices for workflows in GitHub repositories (tags: ro-crate, jupyter-notebook)	Tutorial
Workflow Run RO-Crate Introduction (tags: ro-crate)	Tutorial

Year 2: Automatic Annotation

Building on the modules developed in the previous years, this module further automates the annotation process, giving students the tools required to scale genome annotation regardless of the size of their organism. [SC1.1, SC1.6, SC2.1, SC3.1, SC3.3]

Time estimation: 8 hours

Learning Objectives

Load genome into Galaxy
Annotate genome with Funannotate
Perform functional annotation using EggNOG-mapper and InterProScan
Evaluate annotation quality with BUSCO
View annotations in JBrowse

Lesson	Slides	Hands-on	Recordings
Genome annotation with Funannotate (tags: gmod, eukaryote, jbrowse1, biodiversity)		Tutorial	Tutorial (May 2023) - 1h10m; Tutorial (March 2022) - 1h10m

Year 3: Eukaryotic annotation

This module covers the background relevant to annotating eukaryotic genomes (the second of the two main genome classes) in Galaxy, and collaborative curation with Apollo. Additionally, students will learn about automating this annotation process using Galaxy and code. [SC1.5, SC2.1, SC3.1-4]

Time estimation: 6 hours

Learning Objectives

Use Red and RepeatMasker to soft-mask a newly assembled genome
Load data (genome assembly, annotation and mapped RNASeq) into Galaxy
Perform a transcriptome assembly with StringTie
Annotate lncRNAs with FEELnc
Classify lncRNAs according to their location
Update genome annotation with lncRNAs
Load a genome into Galaxy
View annotations in JBrowse
Learn how to load JBrowse data into Apollo
Learn how to manually refine genome annotations within Apollo
Export refined genome annotations

Lesson	Hands-on	Recordings
Masking repeats with RepeatMasker (tags: eukaryote, biodiversity)	Tutorial	Tutorial (May 2023) - 18m; Tutorial (March 2022) - 16m
Long non-coding RNAs (lncRNAs) annotation with FEELnc (tags: eukaryote)	Tutorial	Tutorial (September 2024) - 11m
Refining Genome Annotations with Apollo (eukaryotes) (tags: gmod, eukaryote, cyoa, jbrowse1, apollo2, biodiversity)	Tutorial

Year 3: Official Gene Set

One of the key tasks in annotation is producing an official gene set (OGS), and ensuring integrity and validation of all of the curated annotations. This will also further familiarise students with public databases and the process for submitting datasets. [SC3.1, SC3.5]

Time estimation: 30 minutes

Learning Objectives