The expression analysis system analyzes a set of genome features (e.g. "genes" or "transcripts") to calculate metrics about
the number of reads that can best be mapped to each defined feature. The "Genome Feature Set" option allows
you to specify which regions of the genome you want to calculate expression levels for. Curio will
automatically build the nucleotide reference transcript sequences (i.e. the cDNA) based on the features found in the selected file. Then,
depending on the mapping algorithm you choose, it will attempt to determine which features each unaligned read (or read pair) in the FASTQ can best be
mapped to as part of calculating the expression metrics for each feature.
The "Feature to Count" setting allows you to specify the type of feature whose expression levels you would like to count. The expression
analysis system will automatically build the nucleotide reference transcript sequences (i.e. the cDNA) based on the features found in the selected
file, and will count the expression level of the feature type you select here.
For example, if you're analyzing features defined by a GTF or GFF file that has separate feature records for "genes", "transcripts", and "exons", then Curio
can automatically assemble the cDNA sequence of each transcript from its defined exons. It will then use those sequences directly if you choose to
count transcript expression, and can also conveniently count gene expression using those same transcript sequences (by mapping each transcript to its
corresponding gene). Alternatively, if you choose to count exon expression (or just the expression of generic genome ranges defined in a
feature file such as a BED file), then Curio will use the genomic range (start to end position) of each feature as its nucleotide sequence.
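To make the transcript assembly step concrete, here is a minimal Python sketch of the general idea. It is not Curio's actual implementation: the function names, the tuple layout for exon records, and the choice to derive a gene count by summing the counts of its transcripts are all assumptions made for illustration. Coordinates are assumed to be 1-based and inclusive, as in GTF and GFF files.

```python
from collections import defaultdict

# Complement table for reverse-complementing minus-strand transcripts.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def assemble_transcripts(exons, chrom_seqs):
    """Build one cDNA sequence per transcript from its exon records.

    exons: iterable of (transcript_id, chrom, start, end, strand) tuples
           with 1-based inclusive coordinates (hypothetical layout).
    chrom_seqs: dict mapping chromosome name to its sequence string.
    """
    by_transcript = defaultdict(list)
    for tx_id, chrom, start, end, strand in exons:
        by_transcript[tx_id].append((chrom, start, end, strand))

    cdna = {}
    for tx_id, parts in by_transcript.items():
        parts.sort(key=lambda p: p[1])  # order exons by genomic start
        chrom, _, _, strand = parts[0]
        # Convert 1-based inclusive ranges to Python slices and join them.
        seq = "".join(chrom_seqs[chrom][s - 1:e] for _, s, e, _ in parts)
        cdna[tx_id] = reverse_complement(seq) if strand == "-" else seq
    return cdna


def gene_counts(transcript_counts, transcript_to_gene):
    """Roll transcript-level counts up to genes (assumed here: simple sum)."""
    totals = defaultdict(int)
    for tx_id, count in transcript_counts.items():
        totals[transcript_to_gene[tx_id]] += count
    return dict(totals)
```

For a generic range feature (an exon or a BED interval), the same slicing step applies to a single range instead of a joined list of exons. Note that BED files use 0-based, half-open coordinates, so the slice would be `chrom_seqs[chrom][start:end]` in that case.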
Note that only the feature types found in the selected feature set file will be available as options to choose from. If you select a feature file that
only has one feature type (such as a BED file), then Curio simply calculates the expression level of each feature's nucleotide sequence independently.
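If you want to check ahead of time which feature types a GTF or GFF file contains, the following Python sketch lists the distinct values of the third column (the feature type column in both formats). The function name is hypothetical and this is only an illustration of the idea, not how Curio itself reads the file. BED files have no feature type column, which is why they are treated as a single generic feature type.

```python
def feature_types(lines):
    """Return the sorted, distinct feature types found in GTF/GFF lines.

    lines: any iterable of text lines, such as an open file handle.
    """
    types = set()
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # GTF and GFF records have nine tab-separated columns;
        # the feature type is the third one.
        if len(fields) >= 9:
            types.add(fields[2])
    return sorted(types)
```

Typical usage would be something like `with open(path) as handle: print(feature_types(handle))`, where `path` points to your own feature file.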
Curio offers a collection of standard feature sets for common use cases (whole genome analysis, whole exome analysis, etc.) for the standard
assemblies (hg19, hg38, mm10, etc.). Those feature sets are available under the "Community Genome Feature Sets" section which you are welcome to use as
In addition, if you have a custom panel or set of genomic ranges you'd like to analyze, Curio can easily support that too. Simply upload your custom
GTF, GFF, or BED file into the project, assign it to an assembly, and then you'll be able to select it when running an unaligned expression analysis.