genelowvalfilter

Remove gene profiles with low absolute values

Syntax

  • Mask = genelowvalfilter(Data)
  • [Mask,FData] = genelowvalfilter(Data)
  • [Mask,FData,FNames] = genelowvalfilter(Data,geneNames)
  • [___] = genelowvalfilter(___,Name,Value)

Description

Mask = genelowvalfilter(Data) returns a logical vector Mask that identifies gene expression profiles in Data with absolute expression levels in the lowest 10% of the data set. Elements of Mask are 0 for these low-value profiles and 1 for all other profiles.

Gene expression profile experiments often produce data in which the absolute values are very low. The quality of such data is often poor because of large quantization errors or poor spot hybridization. Use this function to filter out these profiles.

[Mask,FData] = genelowvalfilter(Data) also returns FData, a data matrix containing filtered expression profiles.

[Mask,FData,FNames] = genelowvalfilter(Data,geneNames) also returns FNames, a cell array of filtered gene names or IDs. You must specify geneNames to return FNames, unless Data is a DataMatrix object with specified row names.

[___] = genelowvalfilter(___,Name,Value) returns any of the previous output arguments using any input arguments from the previous syntaxes and additional options, specified as one or more optional name-value pair arguments.

Examples

Filter Out Genes with Low Absolute Expression Levels

Load the sample yeast data.

load yeastdata;

Retrieve the genes and corresponding expression data where absolute expression levels exceed the 10th percentile.

[mask,filteredData,filteredGenes] = genelowvalfilter(yeastvalues,genes);

Compare the number of filtered genes (filteredGenes) with the number of genes in the original data set (genes).

size(filteredGenes,1)

ans =

        6394

size(genes,1)

ans =

        6400

Filter Out Genes with Low Absolute Expression Levels Using a Logical Vector

Load the sample yeast data.

load yeastdata;

Create a logical mask that marks the genes whose absolute expression levels fall below the 10th percentile of the data set.

mask = genelowvalfilter(yeastvalues);

The variable genes contains all gene names in the yeast data set. Use the generated logical vector mask to retrieve the genes whose expression levels exceed the 10th percentile.

filteredGenes = genes(mask);

Extract corresponding expression profile data for the selected genes from the variable yeastvalues, which contains expression profiles of every gene in the yeast data set.

filteredData = yeastvalues(mask,:);

Filter Out Genes with Absolute Expression Levels that are Lower Than a User-Defined Threshold

Load the sample yeast data.

load yeastdata;

Retrieve the genes and corresponding expression data where absolute expression levels exceed the 30th percentile of the data set.

[mask,filteredData,filteredGenes] = genelowvalfilter(yeastvalues,genes,'Percentile',30);

Compare the number of filtered genes (filteredGenes) with the number of genes in the original data set (genes).

size(filteredGenes,1)

ans =

        6384

size(genes,1)

ans =

        6400

Input Arguments

Data — Input data
DataMatrix object | numeric matrix

Input data, specified as a DataMatrix object or numeric matrix. Each row of the matrix corresponds to the experimental results for one gene. Each column represents the results for all genes from one experiment.

geneNames — Gene names or IDs
cell array of strings

Gene names or IDs, specified as a cell array of strings. The array has the same number of rows as Data. Each row contains the name or ID of the gene in the data set.

    Note:   If Data is a DataMatrix object with specified row names, you do not need to provide the second input geneNames to return the third output FNames.
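The case described in the note can be sketched as follows. This is a minimal example, assuming the yeastdata sample file and the name-value form of the DataMatrix constructor; the variable names dm, fdata, and fnames are illustrative only.

```matlab
% Load the sample data (provides yeastvalues and genes).
load yeastdata;

% Wrap the numeric matrix in a DataMatrix object, using the gene names as row names.
dm = bioma.data.DataMatrix(yeastvalues,'RowNames',genes);

% geneNames is omitted here; the row names of dm are expected to supply FNames.
[mask,fdata,fnames] = genelowvalfilter(dm);

% Number of names returned and number of profiles retained.
numel(fnames)
nnz(mask)
```

If the function behaves as the note describes, the two displayed counts should match.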

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'AbsValue',10.5 specifies genelowvalfilter to remove expression profiles with absolute values less than 10.5.

'Percentile' — Percentile value
10 (default) | scalar value in the range (0,100)

Percentile value, specified as a scalar value in the range (0,100). The function genelowvalfilter removes gene expression profiles with absolute values less than the specified percentile of the data set.

Example: 'Percentile',50

'AbsValue' — Absolute expression profile value
real number

Absolute expression profile value, specified as a real number. The function genelowvalfilter removes gene expression profiles with absolute values less than the value specified by 'AbsValue'.

Example: 'AbsValue',10.5
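As a brief sketch of how 'AbsValue' might be used with the yeast sample data from the examples above: the threshold 0.5 is an arbitrary value chosen for illustration, not a recommended setting.

```matlab
% Load the sample data (provides yeastvalues and genes).
load yeastdata;

% Filter with a fixed absolute threshold instead of a percentile.
% The value 0.5 is an arbitrary, illustrative choice.
[mask,filteredData,filteredGenes] = genelowvalfilter(yeastvalues,genes,'AbsValue',0.5);

% Compare the number of retained profiles with the total number of genes.
nnz(mask)
numel(genes)
```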

'AnyVal' — Logical indicator to select minimum or maximum absolute value
false (default) | true

Logical indicator to select the minimum or maximum absolute value of each expression profile, specified as true or false. Set the value to true to select the minimum absolute value. Set it to false to select the maximum absolute value.

Example: 'AnyVal',true
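The difference between the two settings can be illustrated on a toy matrix using only core MATLAB functions. This is not the source code of genelowvalfilter; it is a sketch that assumes a profile is kept when its selected absolute value exceeds the threshold. The data values and the threshold are made up.

```matlab
% Toy data: 3 genes (rows) by 3 experiments (columns); values are made up.
data = [0.1 -0.2 0.05;
        5.0 -7.0 0.30;
        2.0 -1.5 3.00];
threshold = 1;                    % hypothetical absolute threshold

% Per-row maximum absolute value, analogous to 'AnyVal',false.
rowMax = max(abs(data),[],2);     % [0.2; 7; 3]

% Per-row minimum absolute value, analogous to 'AnyVal',true.
rowMin = min(abs(data),[],2);     % [0.05; 0.3; 1.5]

maskMax = rowMax > threshold      % [0; 1; 1]
maskMin = rowMin > threshold      % [0; 0; 1]
```

In this sketch, selecting the minimum is the stricter test: the second gene passes on its largest value but fails on its smallest.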

Output Arguments

Mask — Logical vector
vector of 0s and 1s

Logical vector, returned as a vector of 0s and 1s for each row in Data. The elements of Mask with value 1 correspond to rows with absolute expression levels exceeding the threshold, and those with value 0 correspond to rows with absolute expression levels less than or equal to the threshold.

FData — Filtered data matrix
data matrix

Filtered data matrix, returned as a data matrix that contains gene expression profiles with absolute expression levels exceeding the threshold value. You can also create FData using FData = Data(Mask,:).

FNames — Array of filtered gene names
cell array of strings

Array of filtered gene names, returned as a cell array of strings. It contains gene names or IDs corresponding to each row of Data that contains gene expression profiles with absolute expression levels exceeding the threshold value. You can also create FNames using FNames = geneNames(Mask).
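The relationship between Mask and the other two outputs can be checked directly. The following sketch assumes the yeastdata sample file; the names sameData and sameNames are illustrative.

```matlab
% Load the sample data (provides yeastvalues and genes).
load yeastdata;

[mask,filteredData,filteredGenes] = genelowvalfilter(yeastvalues,genes);

% isequaln treats NaN entries as equal, which matters if the data
% contain missing values.
sameData  = isequaln(filteredData,yeastvalues(mask,:))
sameNames = isequal(filteredGenes,genes(mask))
```

Both comparisons should return true if the outputs match the descriptions above.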

