Reviewing Expression Metrics and Comparing Samples
After running either an unaligned expression analysis or an
aligned expression analysis, you can navigate to the results of any analysis
by going to the "Analysis Results" tab on the "Project Details" screen and clicking the "View Analysis Results" button. The analysis results
screen provides various ways to visualize, navigate, and download the expression metrics, and lets you conveniently
compare multiple samples to each other.
When you navigate to the results of an expression analysis, Curio shows a screen (see example below) with various ways to visualize, navigate, and
download the expression metrics that were calculated as part of the analysis.
Feature Density: In the upper left hand corner Curio will draw an overview of where in the genome the expressed
features were found along with scaled blue and green bars that represent the number of expressed reads encountered and number of expressed
features found in each region. Clicking directly on the chromosome number provides a quick way to visualize the expression metrics of all of the
features found within that chromosome, or hovering over the circle visualization will provide a tooltip of the basic expression metrics calculated for
Display Options: The upper right hand corner of the screen provides various options to change the way
the results are displayed - either for a single sample or for a collection of samples that are being compared in different ways. In addition,
some filtering capabilities are provided to control which expressed features will be included in the "Expression Metrics" area based on
their calculated expression metrics.
Expression Metrics: At the bottom of the screen Curio will provide detailed information on the expressed features
found within whichever section of the genome you've narrowed in on. You can change the range of the genome you're looking at by using the text box
on the right hand side to jump to a particular gene by name (e.g. "BRCA1"), to a particular range by entering the genomic position (e.g.
"chr7:55000000-56000000"), or visualize all of the features found within a chromosome by entering its name (e.g. "chr7"). Hovering over any individual
expressed feature provides additional expression metrics on the right hand side, as well as a way to see the original aligned reads (if the
analysis was based on an aligned expression analysis) that the expression metrics
were calculated from by clicking on the "View Alignments" link.
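The three kinds of input accepted by the text box (a gene name, a genomic range, or a chromosome name) can be told apart with a small amount of pattern matching. The following is only an illustrative sketch, not Curio's implementation: the `Query` type, the `classify_query` function, and the assumption that chromosome names always begin with "chr" are all hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Query(NamedTuple):
    kind: str                  # "range", "chromosome", or "gene"
    name: str                  # chromosome name or gene name
    start: Optional[int] = None
    end: Optional[int] = None


# Assumption: chromosome names start with "chr" (e.g. "chr7", "chrX").
_RANGE = re.compile(r"^(chr\w+):(\d+)-(\d+)$")
_CHROM = re.compile(r"^chr\w+$")


def classify_query(text: str) -> Query:
    """Classify a search string as a range, a chromosome, or a gene name."""
    text = text.strip()
    match = _RANGE.match(text)
    if match:
        return Query("range", match.group(1), int(match.group(2)), int(match.group(3)))
    if _CHROM.match(text):
        return Query("chromosome", text)
    # Anything else is treated as a gene name to look up in the annotation source.
    return Query("gene", text)


for example in ("BRCA1", "chr7:55000000-56000000", "chr7"):
    print(classify_query(example))
```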
Within the "Display/Filter Options" tab, several options are available that allow you to filter the expressed features that
appear at the bottom of the screen. Note that any filter options set here will also affect the expression download results.
For example, if you enable the "Exclude features whose expressed read count is less than 5" option, the corresponding features will
be excluded both from the results rendered at the bottom of the screen and from the list of expressed features reported via
the "More Options -> Download These Results" option.
The "Search gene and exon positions from <source>" option will control the list of genes (and the transcript positions of each) that
will be available when searching for genes by name in the text box on the right hand side of the results area. Clicking on the blue "from XYZ"
text allows you to change which annotation source (e.g. UCSC, Ensembl, RefSeq) the gene names will come from for the species of the project
you're working on.
By default, Curio will render a column for each expressed feature that represents the feature's transcripts per million (TPM), as well
as a line series for the feature's expressed read count. By clicking on the blue text in the "Show on a column/line series each feature's
<metric type>" option, you can switch which statistical metric you would like to visualize. The available metrics to visualize are:
Transcripts Per Million: TPM as defined by the equation:
Number of Reads Mapped to Feature * Average Overlap Length of Mapped Read *
Total Number of Expressed Features * Transcript Length of Feature
Reads Per Million Mapped Reads: RPM as defined by the equation:
RPM = (Number of Reads Mapped to Feature * 1,000,000) / (Total Number of Filtered Reads in Sample)
Reads Per Kilobase Per Million Mapped Reads: RPKM as defined by the equation:
RPKM = (Number of Reads Mapped to Feature * 1,000 * 1,000,000) / (Total Number of Filtered Reads in Sample * Transcript Length of Feature)
Expressed Read Count: Total number of reads that mapped to the feature.
Total Expressed Bases: Total number of bases that overlapped with the feature for all mapped reads.
Average Expressed Bases Per Read: Average number of bases that overlapped with the feature per mapped read.
Feature Transcript Length: The transcript nucleotide sequence length of the feature.
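As a worked illustration of the RPM and RPKM definitions in the list above, the following minimal sketch computes both for a single feature. The function names and the sample numbers are hypothetical and chosen only for the arithmetic; this is not Curio's code.

```python
def rpm(read_count: int, total_filtered_reads: int) -> float:
    """Reads per million mapped reads for one feature."""
    return read_count * 1_000_000 / total_filtered_reads


def rpkm(read_count: int, total_filtered_reads: int, transcript_length: int) -> float:
    """Reads per kilobase of transcript per million mapped reads for one feature."""
    return read_count * 1_000 * 1_000_000 / (total_filtered_reads * transcript_length)


# Hypothetical sample: 500 reads mapped to a 2,000-base feature,
# with 20 million filtered reads in the whole sample.
print(rpm(500, 20_000_000))           # 25.0
print(rpkm(500, 20_000_000, 2_000))   # 12.5
```

RPKM is simply RPM divided by the feature's transcript length in kilobases (here 25.0 / 2 = 12.5), which is what makes features of different lengths comparable.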
Notes on expression metric calculations:
To determine the length of each feature, Curio calculates the logical length of the transcript nucleotide sequence for the feature. In
the case of an exon or a generic feature as defined by a BED file, this length will simply be the feature's "end position - start position + 1".
In the case of a transcript, the length will be the total lengths of all the exons that make up the transcript. In the case of a gene, Curio
will intelligently merge all of the exons of all transcripts of the gene into a single transcript, and then use the length of that merged transcript
as the gene's feature length.
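The length rules described above can be sketched as follows. This is an illustration under stated assumptions, not Curio's actual algorithm: it assumes 1-based inclusive coordinates (consistent with "end position - start position + 1") and assumes that merging the exons of a gene means taking the union of their intervals. All names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), both inclusive


def interval_length(start: int, end: int) -> int:
    """Length of a single exon or BED feature."""
    return end - start + 1


def transcript_length(exons: List[Interval]) -> int:
    """Total length of the exons that make up one transcript."""
    return sum(interval_length(start, end) for start, end in exons)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Union of intervals: overlapping or directly adjacent ones are combined."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def gene_length(transcripts: List[List[Interval]]) -> int:
    """Length of the single merged transcript built from all exons of a gene."""
    all_exons = [exon for transcript in transcripts for exon in transcript]
    return transcript_length(merge_intervals(all_exons))


# Hypothetical gene with two transcripts whose exons partly overlap.
transcript_a = [(100, 200), (300, 400)]
transcript_b = [(150, 250), (300, 350)]
print(transcript_length(transcript_a))            # 202
print(gene_length([transcript_a, transcript_b]))  # 252
```

Bases shared by several transcripts are counted once in the gene length, so the merged length (252) is smaller than the sum of the two transcript lengths.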
The "Total Expressed Bases" and "Average Expressed Bases Per Read" are only available if you utilized the
aligned expression analysis job type. Analyses that are based on the
unaligned expression analysis jobs are currently unable to report
the number of base pairs that overlap with the expressed feature.
When using the unaligned expression analysis job type, the "RPKM" and "RPM" metrics are estimated based on the total number of reads
available in the input FASTQ file (since that type of analysis
does not require aligning the reads first). With the aligned expression analysis
job types, however, "RPKM" and "RPM" use the more conventional denominator: the total number of aligned reads in the sample (after any quality filtering
or UMI processing has been applied).
By default, Curio will include all features that had at least one expressed read. You can use the "Exclude features whose <metric type>
is less than <metric value>" option to filter out features whose expression doesn't meet a required threshold. Clicking on the blue
link text within the filter will allow you to set the desired metric type and threshold value to your liking.
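The effect of the "Exclude features whose <metric type> is less than <metric value>" filter can be pictured with a short sketch. The record layout and field names below are hypothetical and do not reflect Curio's internal data model or the columns of its export.

```python
from typing import Any, Dict, List

Feature = Dict[str, Any]


def exclude_below(features: List[Feature], metric: str, threshold: float) -> List[Feature]:
    """Keep only the features whose chosen metric is at least the threshold."""
    return [feature for feature in features if feature[metric] >= threshold]


# Hypothetical features; each has at least one expressed read,
# so all of them would be shown by default.
features = [
    {"name": "featureA", "expressed_read_count": 3},
    {"name": "featureB", "expressed_read_count": 5},
    {"name": "featureC", "expressed_read_count": 12},
]

kept = exclude_below(features, "expressed_read_count", 5)
print([feature["name"] for feature in kept])  # ['featureB', 'featureC']
```

Because the filter excludes values that are strictly less than the threshold, a feature sitting exactly at the threshold (featureB here) is kept.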
Beyond visualizing the expression metrics of a single sample, Curio also provides a powerful way to compare the results of multiple samples
to each other directly on the expression analysis results report. To access the comparison feature, you need to click on the "Compare"
tab near the top of the screen. The original sample that you navigated to will automatically be included in the comparison, and you can then
select other samples within the same project that you would like to use as part of the comparison. Only the expression analysis samples that
were based on the same assembly (e.g. HG19, HG38, MM10, etc.) will be available to select from for comparison.
After you select one or more files on the "Compare" tab, whichever metrics
you selected will be rendered side-by-side for each of the samples.
Under the "More Options" button in the top right hand corner of the screen, Curio allows you to download a detailed export
of the expression analysis results. The download is provided in CSV format, which can easily be opened as a spreadsheet in
a program such as Microsoft Excel. The report includes both a summary of some key metrics for each of the samples currently being
compared and the detailed expression metrics for each individual expressed feature of the primary expression analysis file on the screen.
When you create an account within the Curio Genomics Platform,
your new organization is automatically credited with 50 GB of storage and 100 Computational Units to use for data analyses. This is provided
all free of charge with no obligation to continue using Curio and no strings attached - we will not ask for your credit card or any other
method of payment while you are enjoying your free trial. The free trial allows you to use all of the publicly available features of
Curio: you can create as many projects as you would like, collaborate with as many people as you would like, and execute full data analyses.
With the credit allowances provided, most users are able to upload FASTQ file(s) totaling approximately 200M reads. They are then able to align
and run QC report(s) on that data, and even perform variant or other analyses across panels that total approximately 5 Mbp.
Once your Storage or Computational Units are exhausted, if you would like to continue using the Curio Platform simply
contact us and we will work with you to get you back to discovering at the speed of your curiosity.