Once you’re comfortable with Circos in Galaxy, you might want to explore some real-world use cases, such as making a simple genome annotation plot like one you might publish alongside a genome annotation paper.
Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
Click on Import at the top-right of the screen
Paste the following URL into the box labelled “Archived Workflow URL”: https://training.galaxyproject.org/training-material/topics/visualisation/tutorials/circos-microbial/workflows/main_workflow.ga
Click the Import workflow button
Below is a short video demonstrating how to import a workflow from GitHub using this procedure:
Alternatively, you can run the pre-processing steps and configure Circos manually as follows:
Manual Configuration
We’ll calculate the GC skew first from the genome sequence:
Hands On: GC Skew
GC Skew (Galaxy version 0.69.8+galaxy9) with the following parameters:
“Source for reference genome”: Use a genome from history
param-file“Select a reference genome”: genome.fa (Input dataset)
“Window size”: 200
Comment: Window size
Finding the optimal window size is sometimes a process of trial and error: you are balancing too many data points against the expected smooth curve that indicates forward or reverse strand genes.
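To get a feel for how the window size changes the curve, here is a minimal sketch of a windowed GC skew calculation. It uses the common definition (G - C) / (G + C) per window. Whether the Galaxy tool computes it in exactly this way is an assumption, and the function name `gc_skew` and the toy sequence are made up for illustration.

```python
def gc_skew(sequence, window=200):
    """Return (start, end, skew) tuples, one per non-overlapping window.

    skew = (G - C) / (G + C); windows without any G or C get 0.0.
    """
    sequence = sequence.upper()
    results = []
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, start + len(chunk), skew))
    return results


# Toy 400-base sequence (hypothetical data, not the tutorial genome)
toy = "GGGGCCAT" * 50
for row in gc_skew(toy, window=200):
    print(row)
```

A smaller window gives more data points and a noisier curve, while a larger one smooths the curve at the cost of resolution.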
With that file available, we’re ready to convert our datasets into a format Circos can understand. We store the files natively as bigWig because it is a very space-efficient format; however, Circos only processes text files and expects a dataset with the following structure:
Chromosome name, start position, end position, value
so we’ll use a tool to convert them into the Circos-preferred format.
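As a rough illustration of the target format, the sketch below writes intervals as plain-text lines. It assumes a four-column, whitespace-separated layout (chromosome, start, end, value), the same shape the variant calls are cut into later in this tutorial. The helper name `to_circos_lines` and the sample values are hypothetical, and the real tool reads the intervals from a bigWig file rather than from a Python list.

```python
def to_circos_lines(intervals):
    """Format (chrom, start, end, value) tuples as Circos-style text lines."""
    return [
        f"{chrom} {start} {end} {value:.4f}"
        for chrom, start, end, value in intervals
    ]


# Hypothetical coverage values for two 200-base windows
intervals = [
    ("chr1", 0, 200, 12.5),
    ("chr1", 200, 400, 8.0),
]
for line in to_circos_lines(intervals):
    print(line)
```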
Hands On: Dataset Pre-processing
Circos: bigWig to Scatter (Galaxy version 0.69.8+galaxy9) with the following parameters:
param-files“Data file”:
output of GC Skew tool
RNA-Seq coverage 1.bw (Uploaded Dataset)
RNA-Seq coverage 2.bw (Uploaded Dataset)
DNA sequencing coverage.bw (Uploaded Dataset)
Comment: Multi-select to automate processing
Multi-select allows you to easily process several datasets at once in Galaxy.
You can use a tool like bamCoverage (Galaxy version 3.5.4+galaxy0) to create a coverage bigWig file from a BAM or CRAM sequencing dataset.
Preparing Variant Calls
Variant calls in VCF format can easily be transformed into the same format that we converted the bigWig files to.
Hands On: Dataset Pre-processing
Cut with the following parameters:
“Cut columns”: c1,c2,c2,c6
param-file“From”: variants.vcf (Uploaded dataset)
Why these columns? What do they represent?
Why is c2 selected twice?
c1 is the chromosome name, c2 is the position of the variant, and c6 is the quality column.
c2 is used twice because in Circos there are no ‘point’ values, everything has a start and end. So here we re-use the start position to represent a 1 base long feature.
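The same column selection can be sketched in a few lines of Python. This is only an illustration of what `c1,c2,c2,c6` picks out, not how the Cut tool is implemented. Skipping the `#` header lines is a choice made for this sketch, and the function name and sample records are hypothetical.

```python
def vcf_to_circos(vcf_lines):
    """Keep CHROM, POS, POS, QUAL (columns 1, 2, 2, 6) from VCF records."""
    rows = []
    for line in vcf_lines:
        # Skip header lines and blank lines (a choice made for this sketch)
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, qual = fields[0], fields[1], fields[5]
        # POS is used as both start and end: a feature one base long
        rows.append("\t".join([chrom, pos, pos, qual]))
    return rows


# Hypothetical VCF content
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1200\t.\tA\tG\t228.0\tPASS\t.",
    "chr1\t5310\t.\tT\tC\t45.5\tPASS\t.",
]
for row in vcf_to_circos(vcf):
    print(row)
```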
Gene annotations (gff3, bed, gtf), known as “intervals” in the Circos world, can be converted into a couple different formats, namely text labels and tiles.
Hands On: Prepare gene calls
Circos: Interval to Circos Text Labels (Galaxy version 0.69.8+galaxy9) with the following parameters:
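To make the two target formats more concrete, here is a minimal sketch that turns GFF3 gene records into label lines and tile lines. The nine-column GFF3 layout is standard, but the exact output of the Galaxy conversion tools is an assumption here. In particular, encoding the strand as `1`/`-1` in a `strand=` option, taking the label from the `Name` attribute, and the function names are all hypothetical choices made for illustration.

```python
def parse_gff3(lines, feature_type="gene"):
    """Collect features of one type from tab-separated GFF3 lines."""
    features = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        # Column 9 holds key=value pairs separated by semicolons
        attrs = dict(p.split("=", 1) for p in cols[8].split(";") if "=" in p)
        features.append({
            "chrom": cols[0],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": {"+": 1, "-": -1}.get(cols[6], 0),
            "name": attrs.get("Name", attrs.get("ID", "unknown")),
        })
    return features


def as_text_labels(features):
    """One 'chrom start end label' line per feature."""
    return [f"{f['chrom']} {f['start']} {f['end']} {f['name']}" for f in features]


def as_tiles(features):
    """One 'chrom start end strand=N' line per feature (assumed layout)."""
    return [f"{f['chrom']} {f['start']} {f['end']} strand={f['strand']}" for f in features]


# Hypothetical annotation records
gff = [
    "##gff-version 3",
    "chr1\tNCBI\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=thrA",
    "chr1\tNCBI\tgene\t1200\t2000\t.\t-\t.\tID=gene2;Name=yaaX",
]
genes = parse_gff3(gff)
print(as_text_labels(genes))
print(as_tiles(genes))
```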
We’re ready to run Circos! As this is a ‘near-final’ Circos plot, it requires complicated configuration. Normally you would arrive at a configuration like this only after many iterations. It took the tutorial author around 20 executions of the Circos tool to produce this plot.
Hands On: Circos
Circos (Galaxy version 0.69.8+galaxy9) with the following parameters:
In “Karyotype”:
“Reference Genome Source”: FASTA File from History (can be slow, generate a length file to improve execution time.)
param-file“Source FASTA Sequence”: genome.fa (Uploaded dataset)
In “Ideogram”:
“Chromosome units”: Kilobases
“Spacing Between Ideograms (in chromosome units)”: 0.3
“Thickness”: 10.0
In “Labels”:
“Label Font Size”: 48
In “2D Data Tracks”:
In “2D Data Plot”:
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.98
“Inside Radius”: 0.92
“Plot Type”: Histogram
param-file“Histogram Data Source”: output of Circos: bigWig to Scatter on RNA Seq Coverage 2 tool
In “Plot Format Specific Options”:
“Fill Color”: #f08fa4
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.92
“Inside Radius”: 0.86
“Plot Type”: Histogram
param-file“Histogram Data Source”: output of Circos: bigWig to Scatter on RNA Seq Coverage 1 tool
In “Plot Format Specific Options”:
“Fill Color”: #8ff0a4
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.86
“Inside Radius”: 0.8
“Plot Type”: Histogram
param-file“Histogram Data Source”: output of Circos: bigWig to Scatter on DNA sequencing coverage tool
In “Plot Format Specific Options”:
“Fill Color”: #ffbe6f
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.79
“Inside Radius”: 0.6
“Z-index”: 10 (This is used to plot over the genes which are added later.)
“Plot Type”: Scatter
param-file“Scatter Data Source”: output of cut on variants.vcf tool
In “Plot Format Specific Options”:
“Glyph”: Triangle
“Glyph Size”: 6
“Fill Color”: #dc8add
“Stroke Thickness”: 0
In “Axes”:
In “Axis”:
param-repeat“Insert Axis”
“Radial Position”: Absolute position (values match data values)
“Spacing”: 5000.0
“y1”: 40000.0
“Color”: #1a5fb4
“Color Transparency”: 0.4
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.6
“Inside Radius”: 0.55
“Plot Type”: Text Labels
param-file“Text Data Source”: output of Circos: Interval to Text on genes (NCBI).gff tool
In “Plot Format Specific Options”:
“Label Size”: 18
“Show Link”: No
“Snuggle Labels”: Yes
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.7
“Inside Radius”: 0.6
“Plot Type”: Tiles
param-file“Tile Data Source”: output of Circos: Interval to Tiles on genes (NCBI).gff tool
In “Plot Format Specific Options”:
“Fill Color”: #1c71d8
“Overflow Behavior”: Hide: overflow tiles are not drawn
In “Rules”:
In “Rule”:
param-repeat“Insert Rule”
In “Conditions to Apply”:
param-repeat“Insert Conditions to Apply”
“Condition”: Based on qualifier value (when available)
“Qualifier name”: strand
“Condition”: Less than (numeric)
“Qualifier value to compare against”: 0
In “Actions to Apply”:
param-repeat“Insert Actions to Apply”
“Action”: Change Visibility
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.53
“Inside Radius”: 0.45
“Plot Type”: Tiles
param-file“Tile Data Source”: output of Circos: Interval to Tiles on genes (NCBI).gff tool
In “Plot Format Specific Options”:
“Overflow Behavior”: Hide: overflow tiles are not drawn
“Orient Inwards”: Yes
In “Rules”:
In “Rule”:
param-repeat“Insert Rule”
In “Conditions to Apply”:
param-repeat“Insert Conditions to Apply”
“Condition”: Based on qualifier value (when available)
“Qualifier name”: strand
“Condition”: Greater than (numeric)
“Qualifier value to compare against”: 0
In “Actions to Apply”:
param-repeat“Insert Actions to Apply”
“Action”: Change Visibility
param-repeat“Insert Rule”
In “Conditions to Apply”:
param-repeat“Insert Conditions to Apply”
“Condition”: Apply to Every Point
In “Actions to Apply”:
param-repeat“Insert Actions to Apply”
“Action”: Change Fill Color for all points
“Fill Color”: #99c1f1
param-repeat“Insert 2D Data Plot”
“Outside Radius”: 0.45
“Inside Radius”: 0.35
“Plot Type”: Histogram
param-file“Histogram Data Source”: output of Circos: bigWig to Scatter on the GC Skew Plot tool
In “Plot Format Specific Options”:
“Fill Color”: #ff5757
In “Rules”:
In “Rule”:
param-repeat“Insert Rule”
In “Conditions to Apply”:
param-repeat“Insert Conditions to Apply”
“Condition”: Based on value (ONLY for scatter/histogram/heatmap/line)
“Points below this value”: 0.0
In “Actions to Apply”:
param-repeat“Insert Actions to Apply”
“Action”: Change Fill Color for all points
“Fill Color”: #5092f7
In “Ticks”:
“Skip first label”: Yes
In “Tick Group”:
param-repeat“Insert Tick Group”
“Tick Spacing”: 10.0
“Tick Size”: 20.0
“Show Tick Labels”: Yes
param-repeat“Insert Tick Group”
“Tick Size”: 15.0
“Show Tick Labels”: No
param-repeat“Insert Tick Group”
“Tick Spacing”: 0.25
“Color”: #9a9996
“Show Tick Labels”: No
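The rules in the tile and GC skew tracks above can be hard to picture from the form alone, so here is a small sketch of the logic they express. It is only a model of the conditions, not how Circos evaluates rules internally. The feature dictionaries, the numeric strand values, and the function names are hypothetical.

```python
def split_by_strand(features):
    """Mimic the two tile tracks.

    The outer track hides features with strand < 0, and the inner track
    hides features with strand > 0.
    """
    outer = [f for f in features if not f["strand"] < 0]
    inner = [f for f in features if not f["strand"] > 0]
    return outer, inner


def skew_color(value, default="#ff5757", below_zero="#5092f7"):
    """Mimic the GC skew rule: points below 0.0 get the second fill color."""
    return below_zero if value < 0.0 else default


# Hypothetical features and skew values
features = [
    {"name": "thrA", "strand": 1},
    {"name": "yaaX", "strand": -1},
    {"name": "misc", "strand": 0},
]
outer, inner = split_by_strand(features)
print([f["name"] for f in outer])
print([f["name"] for f in inner])
print([skew_color(v) for v in (0.12, -0.08)])
```

Note that in this model a feature without a strand (value 0) matches neither hiding condition, so it would be drawn on both tile tracks.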
Comment: Circos is complicated
Please check your parameters carefully, and expect that mistakes can be made. Just re-run the tool and modify your parameters!
And while this example is probably very overwhelming, creating your own
Circos plot from scratch will be less so: it will be your data, which
you know better, and you’ll add one track at a time.
