After running a coverage analysis, you can navigate to its results
by going to the "Analysis Results" tab on the "Project Details" screen and clicking the "View Analysis Results" button. The analysis results
screen provides various ways to visualize and navigate the coverage metrics, as well as the ability to compare multiple samples to each other.
After navigating to the results of a coverage analysis, Curio shows a screen (see example below) with various ways to visualize, navigate, and
download the coverage metrics that were calculated as part of the analysis.
Feature Density: In the upper left hand corner Curio will draw an overview of where in the genome the features
were analyzed along with scaled blue and green bars that represent the number of sequenced bases encountered and average coverage depth calculated
in those regions. Clicking directly on a chromosome number is a quick way to visualize the coverage metrics of all of the features found
within that chromosome, and hovering over the circle visualization shows a tooltip with the basic coverage metrics calculated for each area.
Display Options: The upper right hand corner of the screen provides various options to change the way
the results are displayed - either for a single sample or for a collection of samples that are
being compared in different ways. In addition, some filtering capabilities are provided to control the types of features that are included
in the "Feature/Region Details" area based on their calculated coverage metrics.
Coverage Metrics: At the bottom of the screen Curio will provide a series of tabs that organize the coverage
metrics in three groups: (a) key summary metrics for the entire sample, (b) on/off target metrics and per chromosome metrics, and (c) detailed
metrics for each feature (e.g. region of the genome) that was analyzed. When looking at the "Feature/Region Details" you can change the range
of the genome you're looking at by using the text box on the right hand side to jump to a particular gene by name (e.g. "BRCA1"), to a particular
range by entering the genomic position (e.g. "chr7:55000000-56000000"), or visualize the metrics of all the features found within a chromosome by
entering its name (e.g. "chr7"). Metrics shown in the results area are color coded by sample, and hovering over any individual point provides
additional details along with a "View Alignments" link that opens the original aligned reads on which the coverage metrics were calculated.
In the top left hand corner of the "Coverage Summary" tab, some key metrics of the analysis are presented, which summarize
all the features that were analyzed across the entire sample. If you enabled UMI/UMT processing
as part of the read processing options when running
the analysis, the coverage metrics are based on the consensus read families; otherwise they represent the individual aligned reads that
were found in the sample. The following key metrics are included in this summary area:
Represents the total number of features that were analyzed across the entire sample. This number would include both features that had some coverage,
as well as features that did not have any coverage found within them. Note that the term "feature" here is equivalent to the genomic regions that
were specified in the original BED/GTF/GFF file that was
selected when the analysis was performed.
Total Aligned Sequences:
Represents the total number of reads that were aligned to some area of any chromosome in the sample, including both reads that were aligned to
positions that would overlap with one of the features that were analyzed for coverage and reads that aligned outside of the analyzed features. This
number can therefore be used as a denominator when calculating an "on" or "off" target coverage percentage. In addition to the total number of reads
that were aligned, the total number of base pairs within the individual reads that were aligned to any chromosome is also provided.
In the case that read deduplication was enabled these
numbers represent only the reads that remained after the deduplication filter was applied, and therefore you can easily analyze the impact
of read deduplication on your sample by simply running a coverage analysis with and without that setting enabled. Similarly, in the case that
UMI/UMT processing was enabled, these metrics instead represent the total number
of consensus families (i.e. consensus reads) that were aligned to some area of any chromosome.
Genomic Positions Analyzed:
Represents the total number of base positions on the genome that were analyzed, whether or not any coverage was detected at any particular position.
This metric therefore logically represents the sum of the lengths of all the features that were analyzed.
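As a rough illustration of that relationship, the sketch below sums the lengths of a few made-up features. It assumes BED-style coordinates (0-based start, exclusive end) and features that do not overlap each other; how Curio treats overlapping features is not described here, and the `features` list and `total_positions_analyzed` name are hypothetical.

```python
# Hypothetical features as (chromosome, start, end) in BED-style coordinates:
# start is 0-based and end is exclusive, so length = end - start.
features = [
    ("chr1", 1000, 1100),  # 100 positions
    ("chr1", 5000, 5250),  # 250 positions
    ("chr7", 200, 260),    # 60 positions
]


def total_positions_analyzed(features):
    """Sum the lengths of all features (assumes they do not overlap)."""
    return sum(end - start for _chrom, start, end in features)


print(total_positions_analyzed(features))  # 410
```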
Covered Genomic Positions:
Represents the total number of analyzed positions where at least one read (or family in the case of UMI/UMT processing) was found to cover the
position. Note that in the case of UMI/UMT processing, it's possible that a position will be determined to have no coverage if the only families
found are small families which are then filtered out (based on the
"Minimum Family Size" setting selected when the analysis was started).
Represents the total number of reads that overlapped with any portion of the total set of features that were analyzed. Note that in the case that
UMI/UMT processing was enabled, this metric instead represents the total number of consensus families (i.e. consensus reads) that overlapped with
the analyzed features.
Overlapping Sequenced Bases:
Represents the total number of base pairs within individual reads that overlapped with any portion of the total set of features that were analyzed.
In the case that UMI/UMT processing was enabled, this metric instead represents the total number of bases within consensus families (i.e. consensus
reads) that overlapped with the analyzed features.
Average Coverage Depth:
Represents the average number of reads (or read families if UMI/UMT processing is enabled) covering each genomic position that was analyzed. Note
that a genomic position without any coverage is still included when calculating the average. In effect, this metric is the
total number of sequenced bases that were encountered within the target regions divided by the total number of genomic positions that were analyzed.
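To make the effect of uncovered positions concrete, here is a small sketch using made-up per-position depths for a single 10-position feature. The variable names are illustrative only and are not part of Curio.

```python
# Hypothetical per-position depths for a 10-position feature.
# Positions with zero coverage still count toward the average.
depths = [0, 0, 3, 5, 5, 6, 6, 5, 0, 0]

n_positions = len(depths)                      # genomic positions analyzed
n_covered = sum(1 for d in depths if d > 0)    # covered genomic positions
sequenced_bases = sum(depths)                  # overlapping sequenced bases

average_depth = sequenced_bases / n_positions
print(average_depth)                # 3.0
print(sequenced_bases / n_covered)  # 5.0 (average over covered positions only)
```

The second number is higher because it ignores the four uncovered positions, which is why the two averages should not be confused.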
UMI/UMT Family Size:
This metric is only visible if UMI/UMT processing was enabled when the coverage
analysis was started. Represents the average number of reads that were grouped into each consensus read family based on having a matching UMI/UMT
barcode at the same genomic position. The standard deviation of consensus read family size is also reported. Note that if the "Minimum Family Size"
option was enabled to filter out small families when the coverage analysis was started then those small families are excluded from the average and
standard deviation calculation. For example, if you choose to exclude UMI/UMT families that have fewer than 5 reads within each family, then
the average family size calculated here will necessarily be 5 or greater.
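The interaction between the family size filter and the reported average can be sketched as follows. This is an illustration only: the family sizes are made up, the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form Curio reports.

```python
from statistics import mean, pstdev


def family_size_stats(family_sizes, min_family_size=1):
    """Return (average, standard deviation) of the families that pass the filter.

    Families smaller than min_family_size are dropped before the statistics
    are computed. Returns None when no family passes the filter.
    """
    kept = [size for size in family_sizes if size >= min_family_size]
    if not kept:
        return None
    # Assumption: population standard deviation.
    return mean(kept), pstdev(kept)


sizes = [1, 2, 5, 7, 9]  # hypothetical reads per UMI/UMT family
avg, sd = family_size_stats(sizes, min_family_size=5)
print(f"{avg:.2f} {sd:.2f}")  # 7.00 1.63
```

Because only the families of size 5, 7, and 9 survive the filter, the average cannot fall below the chosen minimum.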
The top right hand corner of the "Coverage Summary" tab provides a heatmap visualization of the %GC content in relation to the coverage depth, and
uses color intensity to demonstrate the number of windows within the covered area that have a particular GC content percentage at each coverage depth.
Specifically, the three axes of the graph are set up as follows:
X-Axis / %GC (Guanine and Cytosine):
Represents the percentage of the nucleotides encountered within each window that were either a "G" or a "C", relative to the total number of known
nucleotides encountered within the window (thereby excluding any low quality "N" calls within the window).
Note that Curio uses a sliding window approach when calculating these %GC values, where each window is 100 base pairs long. So, for example, a
feature that is 104 bps wide would have a GC percentage calculated for the five windows of 1-100, 2-101, 3-102, 4-103, and 5-104.
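The sliding window calculation described above can be sketched as follows. This is a minimal illustration, not Curio's implementation: the function name is hypothetical, and the behavior for features shorter than one window (no windows are produced) is an assumption.

```python
def gc_windows(sequence, window=100):
    """Yield (start, end, gc_percent) for every sliding window, 1-based inclusive.

    "N" calls are excluded from the denominator. A window made up only of
    "N" calls yields None as its percentage. Sequences shorter than the
    window size yield nothing (an assumption made for this sketch).
    """
    seq = sequence.upper()
    for i in range(len(seq) - window + 1):
        chunk = seq[i:i + window]
        gc = chunk.count("G") + chunk.count("C")
        known = gc + chunk.count("A") + chunk.count("T")
        pct = 100.0 * gc / known if known else None
        yield i + 1, i + window, pct


# A made-up 104-bp feature: 52 G/C bases followed by 52 A/T bases.
feature_seq = "GC" * 26 + "AT" * 26
for start, end, pct in gc_windows(feature_seq):
    print(f"{start}-{end}: {pct:.0f}% GC")
```

For this 104-bp sequence the loop prints five windows, 1-100 through 5-104, matching the example in the text.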
Y-Axis / Average Coverage:
Represents the average read coverage depth (or read family depth if UMI/UMT processing is enabled) across each 100-bp window that was encountered
with the corresponding %GC.
Note that Curio calculates the %GC numbers for every individual coverage level, but when presenting the information in the heatmap it will group
several similar coverage levels together to avoid overwhelming the browser with too much data. So, for example, if you hover over a position in
the heatmap and see a message like "61,805 windows have an average depth between 82 and 90 reads and contain
48% GC", that means that Curio consolidated all of the coverage data for windows with 48% GC that had an average read depth between 82
and 90 reads into a single data point on the chart.
Color Intensity / # of Windows:
The color that each point in the heatmap is rendered at represents the number of 100-bp windows within the entire area that was analyzed which have
a given coverage depth at a specific %GC. The darker the color, the more 100-bp windows were found.
Note that by default Curio only renders in the heatmap the data for the 100-bp windows that had an average coverage depth within 2 standard deviations
of the average coverage depth of the entire analyzed area. You can adjust the heatmap to show a more focused or less focused set of data by adjusting
the "Limit GC content to positions within X standard deviations from the average" option in the "Display Options" above.
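The default filter can be pictured with the sketch below, which keeps only the windows whose average depth lies within a chosen number of standard deviations of the overall mean. The depths are made up, the function name is hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def within_std_devs(window_depths, limit=2.0):
    """Keep window depths within `limit` standard deviations of the mean."""
    mu = mean(window_depths)
    sigma = pstdev(window_depths)  # assumption: population standard deviation
    low, high = mu - limit * sigma, mu + limit * sigma
    return [d for d in window_depths if low <= d <= high]


# Hypothetical average depths for ten 100-bp windows, one of them an outlier.
window_depths = [80, 82, 85, 88, 90, 84, 86, 83, 87, 400]
print(within_std_devs(window_depths))           # the 400-read window is dropped
print(within_std_devs(window_depths, limit=4))  # a wider limit keeps all ten
```

Raising the limit shows a less focused set of data, while lowering it narrows the heatmap to windows near the average depth.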
At the bottom left hand corner of the "Coverage Summary" tab a chart is provided that will help you visualize the minimum depth of coverage you
have across all of the positions within the features you
targeted when running the coverage analysis. The y-axis of the chart represents the percentage of covered positions (rounded to the
nearest whole number) and the x-axis of the chart represents the minimum coverage depth of all the positions at that percentage. Hovering over
the chart provides an explanatory message at each point to help explain the visualization, e.g. "12% of covered positions
have an average coverage depth of at least 423 reads".
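One way to read this chart is as a lookup from a percentage to a depth: for a given percentage of covered positions, what is the lowest depth found among the best-covered positions making up that percentage? The sketch below implements that reading with made-up depths. The function name and the rounding choice are assumptions, not a description of how Curio computes the chart.

```python
import math


def min_depth_at_percent(depths, percent):
    """Return the depth that at least `percent`% of covered positions reach.

    Only positions with depth > 0 are considered. Returns None when no
    position is covered.
    """
    covered = sorted((d for d in depths if d > 0), reverse=True)
    if not covered:
        return None
    # Number of best-covered positions that make up the requested percentage.
    k = max(1, math.ceil(len(covered) * percent / 100))
    return covered[k - 1]


# Hypothetical per-position depths; the single 0 is not a covered position.
depths = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
for pct in (10, 50, 100):
    depth = min_depth_at_percent(depths, pct)
    print(f"{pct}% of covered positions have a depth of at least {depth} reads")
```

With these numbers, 10% of covered positions reach 100 reads, 50% reach 60 reads, and all of them reach 10 reads.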
At the bottom right hand corner of the "Coverage Summary" tab a chart is provided that will help you visualize the minimum average depth of coverage
you have across all the features you targeted when
running the coverage analysis. The y-axis of the chart represents the percentage of features (rounded to the
nearest whole number) and the x-axis of the chart represents the minimum average coverage depth of all the features at that percentage. Hovering over
the chart provides an explanatory message at each point to help explain the visualization, e.g. "5% of features have an
average coverage depth of at least 2,552 reads".
The coverage analysis system analyzes a set of regions (i.e. "features") of the genome to calculate metrics about the reads that either overlap
with those regions (i.e. are "on target") or fall outside of those regions (i.e. are "off target"). The "Targets & Chromosomes" tab at the bottom
of the screen provides a visualization of the percentage of those "on" and "off" target reads, as well as the reads that would have been considered
on target if each region were padded by some number of base pairs.
This pie chart visualization therefore provides a convenient way for you to get a sense of how well your target area was captured, along with
the percentage of the data that falls within a close proximity of your target. Note that the padded metric ranges are based on the
padding options you chose when starting the analysis.
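The on target percentages at each padding level can be sketched as below. The reads, the single feature, and the function names are made up for illustration. The sketch assumes 1-based inclusive coordinates, counts a read as on target when it overlaps a (padded) feature by at least one base, and uses the total number of aligned reads as the denominator.

```python
def overlaps(read_start, read_end, feat_start, feat_end, padding=0):
    """Inclusive overlap test with the feature widened by `padding` on both sides."""
    return read_start <= feat_end + padding and read_end >= feat_start - padding


def on_target_percentages(reads, features, paddings=(0, 100, 200)):
    """Map each padding size to the percentage of reads that are on target."""
    result = {}
    for pad in paddings:
        on_target = sum(
            1
            for chrom, r_start, r_end in reads
            if any(
                chrom == f_chrom and overlaps(r_start, r_end, f_start, f_end, pad)
                for f_chrom, f_start, f_end in features
            )
        )
        result[pad] = 100.0 * on_target / len(reads)
    return result


features = [("chr1", 1047, 1092)]
reads = [
    ("chr1", 1050, 1080),  # overlaps the feature itself
    ("chr1", 960, 1000),   # on target only with +/- 100 bps padding
    ("chr1", 1250, 1290),  # on target only with +/- 200 bps padding
    ("chr2", 1050, 1080),  # different chromosome, always off target
]
print(on_target_percentages(reads, features))
# {0: 25.0, 100: 50.0, 200: 75.0}
```

The off target percentage at each padding level is simply 100 minus the on target value.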
The "Targets & Chromosomes" tab also provides a visualization of the key coverage metrics summarized per chromosome. Hovering
over any chromosome will provide a tooltip of the same coverage metrics
that were available for the entire sample, but instead isolated to just the features within the given chromosome.
By default the chart will plot (and highlight in the tooltips) the "average coverage depth" and "percentage of covered feature positions"
of the features within each chromosome. Under the "Display Options" at the top of the screen you can then change the type of metric that
is plotted by adjusting the
"Show on a column series each chromosome's <metric type>" and "Show on a line series each chromosome's <metric type>" options.
The "Features/Region Details" tab provides a visualization of the key coverage metrics for each individual feature (e.g. each region of the
genome) that was investigated as part of the coverage analysis. You can change the range of the genome you're looking at by using the text box on the
right hand side to jump to a particular gene by name (e.g. "BRCA1"), to a particular range by entering the genomic position (e.g.
"chr7:55000000-56000000"), or visualize the metrics of all the features found within a chromosome by entering its name (e.g. "chr7").
Hovering over any feature will provide a tooltip of similar coverage metrics
that were available for the entire sample, but instead isolated to just the single feature. In addition, each tooltip provides a convenient
way to review the original aligned reads that the feature's coverage metrics were calculated from by clicking on the "View Alignments" link.
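The three kinds of text box input described above (a gene name, a genomic range, or a chromosome name) can be told apart with a small parser like the one below. This is only a sketch of the idea: the function name and return shape are hypothetical, it assumes chromosome names start with "chr", and it says nothing about how Curio itself interprets the input.

```python
import re

# Assumption: chromosome names look like "chr7", "chrX", "chrM", and so on.
RANGE_PATTERN = re.compile(r"^(chr\w+):(\d+)-(\d+)$")
CHROM_PATTERN = re.compile(r"^chr\w+$")


def parse_region_query(text):
    """Classify a query as a genomic range, a whole chromosome, or a gene name."""
    text = text.strip()
    match = RANGE_PATTERN.match(text)
    if match:
        start, end = int(match.group(2)), int(match.group(3))
        if start > end:
            raise ValueError(f"range start {start} is greater than end {end}")
        return {"kind": "range", "chrom": match.group(1), "start": start, "end": end}
    if CHROM_PATTERN.match(text):
        return {"kind": "chromosome", "chrom": text}
    # Anything else is treated as a gene name.
    return {"kind": "gene", "name": text}


for query in ("BRCA1", "chr7:55000000-56000000", "chr7"):
    print(parse_region_query(query))
```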
By default the chart will plot (and highlight in the tooltips) the "average coverage depth" and "percentage of covered feature positions" of each
of the features. Under the "Display Options" at the top of the screen you can then change the type of metric that is plotted by adjusting the
"Show on a column series each feature's <metric type>" and "Show on a line series each feature's <metric type>" options.
Also by default, the features that are included in the chart are restricted to those features that the coverage analysis determined had at
least 1% of their base positions covered. This filter can be changed by adjusting the "Exclude features with <less than/at least>
In addition to calculating the coverage of each feature's exact genomic range, the system
can also calculate the coverage if each feature were padded
by some number of base pairs (+/- 100 bps and +/- 200 bps by default). So, for example, if a feature was designated in the feature file to have a
genomic range of "chr1:1047-1092", Curio would calculate the coverage for the exact "1047-1092" range of "chr1" as well as "947-1192" and "847-1292".
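The padded ranges in that example follow from simple arithmetic, sketched below. The function name is hypothetical, and clamping the padded start at position 1 is an assumption made here for features near the beginning of a chromosome. The sketch does not clamp at the chromosome end.

```python
def padded_ranges(chrom, start, end, paddings=(100, 200)):
    """Return the feature range widened by each padding size on both sides."""
    # Assumption: a padded start is never allowed to drop below position 1.
    return [(chrom, max(1, start - pad), end + pad) for pad in paddings]


print(padded_ranges("chr1", 1047, 1092))
# [('chr1', 947, 1192), ('chr1', 847, 1292)]
```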
The "Shared Settings" area allows you to control which set of metrics you would like to display in the chart visualizations.
Beyond visualizing the coverage metrics of a single sample, Curio also provides a powerful way to compare the results of multiple samples
to each other directly on the coverage analysis results report. To access the comparison feature, you need to click on the "Compare"
tab near the top of the screen. The original sample that you navigated to will automatically be included in the comparison, and you can then
select other samples within the same project that you would like to use as part of the comparison. Only the coverage analysis samples that
were based on the same assembly (e.g. HG19, HG38, MM10, etc.) will be available to select from for comparison.
After selecting two or more samples for comparison, Curio will then allow you to visualize relationships between the "on/off" target percentages
and per chromosome metrics on the "Targets & Chromosomes" tab across the selected samples.
Under the "More Options" button in the top right hand corner of the screen, Curio allows you to download a detailed export
of the coverage analysis results. The download is provided in CSV format, which can be opened as a spreadsheet in
a program such as Microsoft Excel. The report includes both the key
coverage metrics for each of the samples currently being compared, as well as the detailed coverage metrics for each
individual feature of the primary coverage analysis file on the screen.
When you create an account within the Curio Genomics Platform,
your new organization is automatically credited with 50 GB of storage and 100 Computational Units to use for data analyses. This is provided
all free of charge with no obligation to continue using Curio and no strings attached - we will not ask for your credit card or any other
method of payment while you are enjoying your free trial. The free trial allows you to use all of the publicly available features of
Curio: you can create as many projects as you would like, collaborate with as many people as you would like, and execute full data analyses.
With the credit allowances provided, most users are able to upload FASTQ file(s) totaling approximately 200M reads, align that data and run
QC report(s) on it, and even perform variant or other analyses across panels that total approximately 5 Mbp.
Once your Storage or Computational Units are exhausted, if you would like to continue using the Curio Platform simply
contact us and we will work with you to get you back to discovering at the speed of your curiosity.