Understanding Galaxy history system
Overview
Questions:
- How do Galaxy histories work?
Objectives:
- Gain understanding on navigating and manipulating histories
Time estimation: 30 minutes
Last modification: Mar 27, 2023
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT.
PURL: https://gxy.io/GTN:T00150
Warning: Compatible Versions of Galaxy
This tutorial has been tested to work with Galaxy 23.0.
- Galaxy’s Interface may be different to the Galaxy where you are following this tutorial.
- ✅ All tutorial steps will still be able to be followed (potentially with minor differences for moved buttons or changed icons.)
- ✅ Tools will all still work
When data is uploaded from your computer or analysis is done on existing data using Galaxy, each output from those steps generates a dataset. These datasets (and the output datasets from later analysis on them) are stored by Galaxy in Histories.
The History
All users have one ‘current’ history, which can be thought of as a workspace or a current working directory in bioinformatics terms. Your current history is displayed in the right hand side of the main ‘Analyze Data’ Galaxy page in what is called the history panel.
<figcaption>Figure 1: Galaxy History is simply the right panel of the interface</figcaption>
The history panel displays output datasets in the order in which they were created, with the oldest/first shown at the bottom. As new analyses are done and new output datasets are generated, the newest datasets are added to the top of the history panel. In this way, the history panel displays the history of your analysis over time.
Users that have registered an account and logged in can have many histories and the history panel allows switching between them and creating new ones. This can be useful to organize different analyses.
Anonymous users (if your Galaxy allows them) are users that have not registered an account. Anonymous users are only allowed one history. On our main, public Galaxy server, users are encouraged to register and log in with the benefit that they can work on many histories and switch between them.
Warning: Anonymous Users: Beware
The histories of anonymous users are only associated with you through your browser’s session. If you close the browser or clear your sessions, that history will be lost! We cannot recover it for you if that happens.
History controls
<figcaption>Figure 2: History Controls</figcaption>
Above the current history panel are three buttons: create a new history, history quick switcher, and the history options.
- The ‘create new history’ button will create an empty history.
- The ‘history quick switcher’ will open a dialog letting you easily swap to any of your other histories.
- The ‘history menu’ (formerly the “Gear menu”) gives you access to advanced options to work with your history.
History Information
Histories also store information in addition to the datasets they contain. They can be named/re-named, tagged, and annotated.
Renaming a history
All histories begin with the name ‘Unnamed history’. Non-anonymous users can rename the history as they see fit:
- Click the pencil icon next to the history’s name.
- Enter a new name or edit the existing one.
- Press Enter, or click “Save” to save the new name. The input field will disappear and the new history name will display.
- To cancel renaming, click the “Cancel” button.
Tagging a history
Tags are short pieces of text used to describe the thing they’re attached to, and many things in Galaxy can be tagged. Each item can have many tags and you can add new tags or remove them at any time. Tags can be another useful way to organize and search your data. For instance, you might tag a history with the type of analysis you did in it: assembly or variants. Or you may tag them according to data sources or some other metadata: long-term-care-facility or yellowstone park:2014.
Comment: Best Practices for Tagging
It is strongly recommended to replace spaces in tags with _ or -, as spaces will automatically be removed when the tag is saved.
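The recommendation above can be applied to a tag before you type it in. The following is a minimal sketch, assuming tags are handled as plain strings; `normalize_tag` is a hypothetical helper written for illustration and is not part of Galaxy.

```python
import re


def normalize_tag(raw: str, separator: str = "_") -> str:
    """Replace each run of whitespace inside a tag with a separator.

    Hypothetical helper for illustration only; not a Galaxy function.
    """
    if separator not in ("_", "-"):
        raise ValueError("separator must be '_' or '-'")
    # Trim the ends first, then collapse every internal run of whitespace.
    return re.sub(r"\s+", separator, raw.strip())


print(normalize_tag("yellowstone park:2014"))         # yellowstone_park:2014
print(normalize_tag("long term care facility", "-"))  # long-term-care-facility
```

Choosing the separator yourself keeps the words in a tag readable, instead of having them run together once the spaces are removed.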
To tag a history:
- Click the tag button at the top of the history panel. An input field showing existing tags (if any) will appear.
- Begin typing your new tag in the field. Any tags that you’ve used previously will show below your partial entry - allowing you to use this ‘autocomplete’ data to re-use your previous tags without typing them in full.
- Press enter or select one of the previous tags with your arrow keys or mouse.
- To remove an existing tag, click the small ‘X’ on the tag or use the backspace key while in the input field.
<figcaption>Figure 3: Tagging a history will help searching for it later on.</figcaption>
Annotating a history
Sometimes tags and names are not enough to describe the work done within a history. Galaxy allows you to create history annotations: longer text entries that allow for more formatting options. The formatting of the text is preserved. Later, if you publish or share the history, the annotation will be displayed automatically - allowing you to share additional notes about the analysis.
To annotate a history:
- Click the annotation button at the top of the history panel. A larger text section will appear displaying any existing annotation (or, if there’s none, italic text saying you can click on the control to create an annotation).
- Click the annotation section. A larger input field will appear.
- Add your annotations. Enter will move the cursor to the next line. (Tabs cannot be entered since the ‘Tab’ button is used to switch between controls on the page - tabs can be pasted in however).
- To save the annotation, click the ‘Done’ button.
History size
As datasets are added to a history, Galaxy will store them on the server. The total size of these files, for all the datasets in a history, is displayed underneath the history name. For example, if a history has 200 megabytes of dataset data on Galaxy’s filesystem, ‘200 MB’ will be displayed underneath the history name.
If your Galaxy server uses quotas, the total combined size of all your histories will be compared to your quota. If you’re using more than the quota allows, Galaxy will prevent you from running any new jobs until you’ve deleted some datasets and brought that total below the quota.
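The quota rule can be pictured as a simple comparison between a sum and a limit. The sketch below is an illustrative model only, not Galaxy's implementation: the function names, the history names, and the sizes are all made up, and treating usage exactly equal to the quota as allowed is an assumption.

```python
from typing import Dict

MB = 1_000_000  # decimal megabytes, chosen only to keep the numbers readable


def total_usage(history_sizes: Dict[str, int]) -> int:
    """Combined size in bytes of all histories (hypothetical model)."""
    return sum(history_sizes.values())


def can_run_new_jobs(history_sizes: Dict[str, int], quota_bytes: int) -> bool:
    """New jobs are blocked only while usage is above the quota.

    Allowing usage exactly equal to the quota is an assumption.
    """
    return total_usage(history_sizes) <= quota_bytes


# Made-up histories and sizes for illustration.
history_sizes = {"RNA-seq": 200 * MB, "Variants": 900 * MB}
quota = 1_000 * MB

print(can_run_new_jobs(history_sizes, quota))  # False: 1100 MB used of 1000 MB

# Removing 150 MB of datasets brings the total back under the quota.
history_sizes["Variants"] -= 150 * MB
print(can_run_new_jobs(history_sizes, quota))  # True: 950 MB used
```

Note that the comparison uses the total across all histories, so freeing space in any one history can unblock jobs in another.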
History Panel Datasets
Datasets in the history panel show the state of the job that has generated or will generate the data.
There are several different ‘states’ a dataset can be in:
- When you first upload a file or run a tool, the dataset will be in the queued state. This indicates that the job that will create this dataset has not yet started and is in line to begin.
- When the job starts, the dataset will be in the running state. The job that created these datasets is now running on Galaxy’s cluster.
- When the job has completed successfully, the datasets it generated will be in the ok state.
- If there’s been an error while running the tool, the datasets will be in the error state.
- If a previously running or queued job has been paused by Galaxy, the dataset will be in the paused state. You can re-start/resume paused jobs using the options menu above the history panel and selecting ‘Resume Paused Jobs’.
<figcaption>Figure 5: Dataset states</figcaption>
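The states listed above can be read as a small state machine. The following sketch models them in Python under stated assumptions: the transition table is inferred from the descriptions in this section, the move from paused back to queued on resume is a guess about what ‘Resume Paused Jobs’ does, and none of the names come from Galaxy's code.

```python
from enum import Enum


class State(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    OK = "ok"
    ERROR = "error"
    PAUSED = "paused"


# Allowed moves, inferred from the list of states in this section.
# PAUSED -> QUEUED (resuming a paused job) is an assumption.
TRANSITIONS = {
    State.QUEUED: {State.RUNNING, State.PAUSED},
    State.RUNNING: {State.OK, State.ERROR, State.PAUSED},
    State.PAUSED: {State.QUEUED},
    State.OK: set(),
    State.ERROR: set(),
}


def advance(current: State, target: State) -> State:
    """Return the target state if the move is allowed, else raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


# A successful job: queued, then running, then ok.
state = State.QUEUED
for next_state in (State.RUNNING, State.OK):
    state = advance(state, next_state)
print(state.value)  # ok
```

In this model ok and error are terminal: once a job has finished, its dataset no longer changes state.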
Datasets in the panel are initially shown in a ‘summary’ view, that only displays:
- A number indicating in what order (or what step) this dataset was created.
- The dataset name.
- The view button: click this to view the dataset contents in raw format in the browser.
- The edit button: click this to edit dataset properties.
- The delete button: click this to delete the dataset from the history (don’t worry, you can undo this action).
Some of the buttons above may be disabled if the dataset is in a state that doesn’t allow the action. For example, the ‘edit’ button is disabled for datasets that are still queued or running.
You can click the dataset name and the view will expand to show more details:
- A short description of the data.
- The file format (Bed in this case) and the reference sequence (or database) for the data (? here).
- (Optionally) some information/output from the job that produced this dataset.
- A row of buttons that allow further actions on the dataset.
- A peek of the data: a couple of rows of data with the column headers (if available).
Many of these details are only displayed if the dataset has finished running, is in the ‘ok’ state, and is not deleted. Otherwise, you may only see a shorter message describing the dataset’s state (e.g. ‘this dataset is waiting to run’).
Managing Datasets Individually
Hiding and unhiding datasets
Some procedures in Galaxy, such as workflows, will often hide history datasets in order to simplify the history and hide intermediate steps of an automated analysis. These hidden datasets won’t normally appear in the history panel, but they’re still mentioned in the history subtitle (the smaller, grey text that appears below the history name). If your history has hidden datasets, the number will appear there (e.g. ‘3 hidden’) as a clickable link. If you click this link, the hidden datasets are shown. Each hidden dataset has a link at the top of the summary view that allows you to unhide it. You can click the subtitle link again (which will now be ‘hide hidden’) to hide them again.
Deleting and undeleting datasets
You can delete any dataset in your history by clicking the delete button. This does not immediately remove the dataset’s data from Galaxy and it is reversible. When you delete a dataset from the history, it will be removed from the panel but (like hidden datasets) the total number of deleted datasets is shown in the history subtitle as a link. Clicking this link (e.g. ‘3 deleted’) will make the deleted datasets visible, and each deleted dataset will have a link above its title for manually undeleting it. You can click the subtitle link again (which will now be ‘hide deleted’) to hide them again.
Admins may purge your deleted datasets
Depending on the policy of your Galaxy server, administrators will often run scripts that search for and purge the datasets you’ve marked as deleted. Often, deleted datasets and histories are purged based on the age of the deletion (e.g. datasets that have been marked as deleted for 90 days or more). Check with the administrators of your Galaxy instance to find out the policy used.
Tagging datasets
There are two types of tags that can be used as an additional level of labeling for datasets: standard tags and hashtags. The standard tags work similarly to history tags described above: they add another level of description to datasets, making them easier to find. Hashtags (also known as name tags or propagating tags) are much more powerful as they propagate through the analysis, so a hashtag added to a dataset is carried over to the datasets derived from it.
For more information on name tags, a dedicated nametag tutorial is available.
Managing Multiple Datasets Easily
Multi-selection
You can also hide, delete, and purge multiple datasets at once by multi-selecting datasets:
- Click the multi-select button containing the checkbox just below the history size.
- Checkboxes will appear inside each dataset in the history.
- Scroll and click the checkboxes next to the datasets you want to manage.
- Click the ‘n of N selected’ to choose the action. The action will be performed on all selected datasets, except for the ones that don’t support the action. That is, if an action doesn’t apply to a selected dataset - like deleting a deleted dataset - nothing will happen to that dataset, while all other selected datasets will be deleted.
- You can click the multi-select button again to hide the checkboxes again.
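The rule that a bulk action skips datasets it does not apply to can be sketched as follows. This is an illustrative model, not Galaxy's code: the `Dataset` class, its fields, and the dataset names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Dataset:
    """Hypothetical stand-in for a history dataset."""
    number: int           # position in the history
    name: str
    deleted: bool = False


def delete_selected(selected: Iterable[Dataset]) -> List[int]:
    """Mark the selected datasets as deleted, skipping any already deleted.

    Returns the numbers of the datasets that were actually changed.
    """
    changed = []
    for dataset in selected:
        if dataset.deleted:
            continue  # the action does not apply, so leave it untouched
        dataset.deleted = True
        changed.append(dataset.number)
    return changed


history = [
    Dataset(1, "reads.fastq"),
    Dataset(2, "FASTQC on data 1", deleted=True),
    Dataset(3, "calls.vcf"),
]
print(delete_selected(history))  # [1, 3]
```

Dataset 2 was already deleted, so it is passed over without an error while datasets 1 and 3 are deleted.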
Basic Searching
You can filter what datasets are shown and search for datasets using the search bar at the top of the panel. Enter any text that a dataset you’d be looking for would contain, including:
- the name or part of the name
- any text (or partial text) from the info field
- the file format or reference database
- any text or partial text from the annotation or tags of a dataset
For example:
- To find all vcf files you might enter: vcf alone.
- To find all files whose names contain data 1, you can enter: data 1
- To search for a VCF file named ‘VCF filter on data 1’ and tagged with ‘experiment_1’, you could enter: vcf filter on data 1 experiment_1
Clearing a Search
You can clear a search and show all visible datasets by clicking the round ‘X’ button in the right of the search bar or - while entering text in the search bar - hitting the escape key (‘Esc’).
Advanced Searching
You can also specify dataset properties that you want to filter on. If you search with multiple properties, these are connected with ANDs, so datasets must match all provided attributes.
Query | Results |
---|---|
name:'FASTQC on' | Any datasets with “FASTQC on” in the title, but avoids items which have “FASTQC on” in other fields like the description or annotation. |
extension:vcf | Datasets with a specific format. Some formats are hierarchical, e.g. searching for fastq will find fastq files but also fastqsanger and fastqillumina files. You can see more formats in the upload dialogue. |
tag:experiment1 tag:to_publish | For searching on (a partial) dataset tag. You can repeat to search for more tags. |
related:10 | A specific history item ID (based on the ordering in the history) |
state:error | To show only datasets in a given state. Other options include ok, running, paused, and new. |
If you find normal searching is showing too many datasets, and not what you’re looking for, try the advanced search! Just use the advanced search button next to the search field to show the advanced selector.
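To make the AND behaviour of multi-property queries concrete, here is a minimal sketch of such a filter. It is not Galaxy's search implementation: the dictionary layout, the sample datasets, and the matching rules (case-insensitive partial matches for name and tag, exact matches for everything else, no format hierarchy) are assumptions made for illustration.

```python
import shlex
from typing import Dict, List, Tuple


def parse_query(query: str) -> List[Tuple[str, str]]:
    """Split a query such as "name:'FASTQC on' state:ok" into (key, value) pairs."""
    pairs = []
    for token in shlex.split(query):  # shlex keeps quoted values together
        key, sep, value = token.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {token!r}")
        pairs.append((key, value))
    return pairs


def matches(dataset: Dict, query: str) -> bool:
    """True only if the dataset satisfies every key:value pair (AND)."""
    for key, value in parse_query(query):
        wanted = value.lower()
        if key == "tag":
            ok = any(wanted in tag.lower() for tag in dataset["tags"])
        elif key == "name":
            ok = wanted in dataset["name"].lower()
        else:
            ok = str(dataset.get(key, "")).lower() == wanted
        if not ok:
            return False
    return True


# Made-up datasets for illustration.
datasets = [
    {"name": "FASTQC on data 1", "extension": "html", "state": "ok",
     "tags": ["experiment1"]},
    {"name": "VCF filter on data 1", "extension": "vcf", "state": "error",
     "tags": ["experiment1", "to_publish"]},
    {"name": "calls", "extension": "vcf", "state": "ok", "tags": []},
]


def search(query: str) -> List[str]:
    return [d["name"] for d in datasets if matches(d, query)]


print(search("extension:vcf"))                   # ['VCF filter on data 1', 'calls']
print(search("extension:vcf state:error"))       # ['VCF filter on data 1']
print(search("tag:experiment1 tag:to_publish"))  # ['VCF filter on data 1']
print(search("name:'FASTQC on'"))                # ['FASTQC on data 1']
```

Each extra property narrows the result, which is why adding `state:error` to `extension:vcf` drops the dataset named calls.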
Undeleting … deleted histories
If you have not purged a history and have only deleted it, it is possible to ‘undelete’ it and reverse or undo the deletion. Since one of the purposes of deleting histories is to remove them from view, we’ll use the interface to specifically search for deleted histories and then to undelete the one we’re interested in.
There is one way to do this currently: via the saved histories page.
- Go to the “User” menu at the top
- Select “Histories”
- Click “Advanced Search” below the search box.
- Click “Deleted”
- Click on the title of the history you want to un-delete, and un-delete it.