getExons

Return table of exons from GTFAnnotation object

Description

exons = getExons(AnnotObj) returns exons, a table of existing exons in AnnotObj.

[exons,junctions] = getExons(AnnotObj) also returns junctions, a table of spliced junctions for each reference listed in AnnotObj.

[___] = getExons(AnnotObj,"Reference",R) returns the exons that belong to one or more references specified by R.

[___] = getExons(AnnotObj,"Gene",G) returns the exons that belong to one or more genes specified by G.

[___] = getExons(AnnotObj,"Transcript",T) returns the exons that belong to one or more transcripts specified by T.

Examples

Create a GTFAnnotation object from a GTF-formatted file.

obj = GTFAnnotation('hum37_2_1M.gtf');

Get the list of gene names listed in the object.

gNames = getGeneNames(obj)
gNames = 28×1 cell
    {'uc002qvu.2'}
    {'uc002qvv.2'}
    {'uc002qvw.2'}
    {'uc002qvx.2'}
    {'uc002qvy.2'}
    {'uc002qvz.2'}
    {'uc002qwa.2'}
    {'uc002qwb.2'}
    {'uc002qwc.1'}
    {'uc002qwd.2'}
    {'uc002qwe.3'}
    {'uc002qwf.2'}
    {'uc002qwg.2'}
    {'uc002qwh.2'}
    {'uc002qwi.3'}
    {'uc002qwk.2'}
    {'uc002qwl.2'}
    {'uc002qwm.1'}
    {'uc002qwn.1'}
    {'uc002qwo.1'}
    {'uc002qwp.2'}
    {'uc002qwq.2'}
    {'uc010ewe.2'}
    {'uc010ewf.1'}
    {'uc010ewg.2'}
    {'uc010ewh.1'}
    {'uc010ewi.2'}
    {'uc010yim.1'}

Get a table of the exons that belong to the first gene, uc002qvu.2.

exons = getExons(obj,'Gene',gNames{1})
exons=8×7 table
      Transcript       GeneName         GeneID        Reference    Start      Stop     Strand
    ______________    __________    ______________    _________    ______    ______    ______

    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       218138    219001      -   
    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       224864    224920      -   
    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       229966    230044      -   
    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       231023    231191      -   
    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       233101    233229      -   
    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       234160    234272      -   
    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       247538    247602      -   
    {'uc002qvu.2'}    {0×0 char}    {'uc002qvu.2'}      chr2       249731    249852      -   
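
The remaining syntaxes can be exercised in the same way. The following is a minimal sketch, not output-verified here: it assumes the same hum37_2_1M.gtf file is on the MATLAB path, takes the reference name 'chr2' from the table shown above, and picks the first two transcript names returned by getTranscriptNames purely for illustration.

```matlab
% Assumption: hum37_2_1M.gtf is available on the MATLAB path.
obj = GTFAnnotation('hum37_2_1M.gtf');

% Request both outputs: all exons and the junctions table.
[exons, junctions] = getExons(obj);

% Keep only exons on one reference. 'chr2' is the reference
% that appears in the example output above.
exonsChr2 = getExons(obj, 'Reference', 'chr2');

% Keep only exons (and junctions) for selected transcripts.
% The choice of the first two transcripts is arbitrary.
tNames = getTranscriptNames(obj);
[exonsT, junctionsT] = getExons(obj, 'Transcript', tNames(1:2));
```

The filter value can also be passed as a string vector or categorical array, as described under Input Arguments.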

Input Arguments

AnnotObj: GTF annotation, specified as a GTFAnnotation object.

R: Names of reference sequences, specified as a character vector, string, string vector, cell array of character vectors, or categorical array.

The names must come from the Reference field of AnnotObj. If a name does not exist, the function issues a warning and ignores the name.

Data Types: char | string | cell | categorical

G: Names of genes, specified as a character vector, string, string vector, cell array of character vectors, or categorical array.

The names must come from the Gene field of AnnotObj. If a name does not exist, the function issues a warning and ignores the name.

Data Types: char | string | cell | categorical

T: Names of transcripts, specified as a character vector, string, string vector, cell array of character vectors, or categorical array.

The names must come from the Transcript field of AnnotObj. If a name does not exist, the function issues a warning and ignores the name.

Data Types: char | string | cell | categorical
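
The warn-and-ignore behavior described for missing names can be observed with a sketch like the one below. It assumes the hum37_2_1M.gtf file from the Examples section is available; 'chrNotThere' is a made-up reference name chosen only because it is not expected to exist in that file. The exact warning text is not documented here, so the code only checks whether a warning was raised.

```matlab
% Assumption: hum37_2_1M.gtf is available on the MATLAB path.
obj = GTFAnnotation('hum37_2_1M.gtf');

% Clear the last warning so that any new one can be detected.
lastwarn('');

% 'chr2' exists in this file; 'chrNotThere' is a hypothetical
% name that should trigger the warning and then be ignored.
exons = getExons(obj, 'Reference', {'chr2', 'chrNotThere'});

% Inspect the most recent warning, if any.
[msg, id] = lastwarn;
if ~isempty(msg)
    fprintf('Warning issued (%s): %s\n', id, msg);
end
```

Mixing one valid name with the invalid one keeps the result non-empty, so the returned table can still be inspected after the warning.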

Output Arguments

exons: Exons in AnnotObj, returned as a table. The table contains the following variables for each exon.

Variable Name    Description
Transcript       Cell array of character vectors containing transcript IDs, obtained from the Transcript field of AnnotObj.
GeneName         Cell array of character vectors containing the names of expressed genes, obtained from the Attributes field of AnnotObj. This cell array can contain empty character vectors if the corresponding gene names are not found in Attributes.
GeneID           Cell array of character vectors containing the expressed gene IDs, obtained from the Gene field of AnnotObj.
Reference        Categorical array representing the names of reference sequences to which the expressed genes belong. The reference names are from the Reference field of AnnotObj.
Start            Start location of each exon.
Stop             Stop location of each exon.
Strand           Categorical array containing the strand of the expressed gene.
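
Because exons is an ordinary MATLAB table, its Start and Stop variables can be combined directly. The sketch below is illustrative only: it assumes the hum37_2_1M.gtf file from the Examples section, relies on GTF coordinates being 1-based and inclusive (so an exon spans Stop - Start + 1 bases), and uses findgroups and splitapply, which require R2015b or later. The names exonLength and perTranscript are introduced here for illustration.

```matlab
% Assumption: hum37_2_1M.gtf is available on the MATLAB path.
obj = GTFAnnotation('hum37_2_1M.gtf');
gNames = getGeneNames(obj);
exons = getExons(obj, 'Gene', gNames{1});

% GTF coordinates are 1-based and inclusive, so add 1.
exonLength = exons.Stop - exons.Start + 1;

% Group the exons by transcript ID and summarize each group.
[g, transcriptID] = findgroups(exons.Transcript);
numExons    = splitapply(@numel, exonLength, g);
totalLength = splitapply(@sum, exonLength, g);

perTranscript = table(transcriptID, numExons, totalLength)
```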

junctions: Spliced junctions for each reference, returned as a table. The table contains the following variables for each junction.

Variable Name    Description
Start            Start location of each junction.
Stop             Stop location of each junction.
Reference        Categorical array representing the names of reference sequences to which the junctions belong. The reference names are from the Reference field of AnnotObj.

Version History

Introduced in R2014b