Bowtie2AlignOptions
Options to map reads to reference sequence
Description
A Bowtie2AlignOptions object contains options to run the
                bowtie2 function, which aligns reads to a reference
            sequence.
Creation
Syntax
Description
alignOptions = Bowtie2AlignOptions creates a Bowtie2AlignOptions object with default property values.
Bowtie2AlignOptions requires the Bowtie 2 Support Package for Bioinformatics Toolbox™. If this support package is not installed, then the function provides a download
        link. For details, see Bioinformatics Toolbox Software Support Packages.
alignOptions = Bowtie2AlignOptions(Name,Value) sets the object properties using one or more name-value pair arguments. For example, alignOptions = Bowtie2AlignOptions('Trim5',10) specifies to trim 10 residues from the 5' end.
Input Arguments
Alignment parameters, specified as a character vector.
                                S must be in the Bowtie 2 option syntax
                            (prefixed by one or two dashes) [1].
Properties
Since R2023b
Flag to allow unpaired reads to be aligned to the forward (Watson)
                        reference strand, specified as a numeric or logical 1
                            (true) or 0 (false). Set this
                        option to false to prevent bowtie2
                        from aligning reads to the forward reference strand.
Data Types: double | logical
Since R2023b
Flag to allow unpaired reads to be aligned to the reverse (Crick)
                        reference strand, specified as a numeric or logical 1
                            (true) or 0 (false). Set this
                        option to false to prevent bowtie2
                        from aligning reads to the reverse reference strand.
Data Types: double | logical
Since R2023b
Base name of files where aligned paired reads are saved, specified as a
                        character vector or string scalar. Paired reads that align at least one time
                        are saved to the files. bowtie2 creates two files, one
                        for each read pair. The files have the same format as the input data.
The function appends ".1" or ".2" to
                        the base file name to specify each read pair file. If the base file name
                        includes the % symbol, bowtie2
                        inserts 1 or 2 at this
                            % position instead of appending
                            ".1" or ".2". Use
                            ReadSupplementFileCompression to compress these
                        supplement files.
By default, bowtie2 does not create these supplement
                        files.
Data Types: char | string
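As a rough illustration of the naming rule above, the following sketch computes the two file names that would result from a given base name. The base names are hypothetical, and the code only mimics the rule as described; it does not call bowtie2.

```matlab
% Hypothetical base names: one without and one with the % placeholder.
baseNames = ["aligned_pairs.fq" "aligned_%_pairs.fq"];
fileNames = strings(numel(baseNames),2);

for i = 1:numel(baseNames)
    for mate = 1:2
        if contains(baseNames(i),"%")
            % Insert the mate number at the % position.
            fileNames(i,mate) = replace(baseNames(i),"%",string(mate));
        else
            % Otherwise append ".1" or ".2" to the base name.
            fileNames(i,mate) = baseNames(i) + "." + mate;
        end
    end
end

disp(fileNames)
```

Each row of fileNames holds the pair of names derived from one base name.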
Since R2023b
Name of a file where aligned unpaired reads are saved, specified as a
                        character vector or string scalar. Unpaired reads that align at least one
                        time are saved to the file. The file has the same format as the input data.
                        Use ReadSupplementFileCompression to compress these
                        supplement files.
By default, bowtie2 does not create the file.
Data Types: char | string
Flag to allow dovetail configurations of input reads, specified as a numeric or logical 1 (true) or 0 (false). This property specifies whether the alignment of one mate can extend past the beginning of the alignment of the other mate and still be considered concordant.
This property applies to paired-end reads only.
Data Types: double | logical
Penalty for positions with ambiguous characters on the read sequence, reference sequence, or both, specified as a nonnegative integer.
Data Types: double
Since R2023b
Flag to append FASTQ or FASTA comments to the output SAM file, specified
                        as a numeric or logical 1 (true) or 0
                            (false). A comment is any text after the first space
                        in the read name.
Data Types: double | logical
Since R2023b
Flag to align paired-end BAM reads, specified as a numeric or logical 1 (true) or 0 (false). This flag is functional only if you also set ReadFormat="BAM".
By default, bowtie2 attempts to align unpaired BAM
                        reads only. Set the value to true to align paired-end
                        reads instead.
Data Types: double | logical
Since R2023b
Flag to preserve tags from the input BAM file by appending them to the SAM output, specified as a numeric or logical 1 (true) or 0 (false). Set the value to true to add the tags to the end of the corresponding SAM output file.
Data Types: double | logical
 Encoding format of the base quality in the input files, specified as one
                        of the following: 'Phred33',
                        'Phred64', or 'Solexa'.
Data Types: char | string
Flag to allow one mate alignment to contain the alignment of the other
                        mate and to be considered concordant, specified as a numeric or logical 1
                            (true) or 0 (false).
This property applies to paired-end reads only.
Data Types: double | logical
Flag to include discordant alignments, specified as a numeric or logical 1
                            (true) or 0 (false). A discordant
                        alignment is an alignment where both mates align uniquely, but not in a way
                        that satisfies the paired-end constraints.
Data Types: double | logical
Flag to exclude mixed alignments, specified as a numeric or logical 1 (true) or 0 (false). A mixed alignment consists of mate reads that are not concordant or discordant, but align individually.
This property applies to paired-end reads only.
Data Types: double | logical
Flag to allow the alignment of one mate to overlap with the alignment of
                        the other mate and to be considered concordant, specified as a numeric or
                        logical 1 (true) or 0 (false). 
Data Types: double | logical
Since R2023b
Flag to exclude SAM headers, specified as a numeric or logical 1 (true) or 0 (false). A SAM header starts with the @ symbol.
Data Types: double | logical
Since R2023b
Flag to exclude SAM reference sequence header lines in the output SAM file, specified as a numeric or logical 1 (true) or 0 (false). A reference sequence header line starts with @SQ.
Data Types: double | logical
Flag to exclude reads that failed to align, specified as a numeric or
                        logical 1 (true) or 0 (false).
Data Types: double | logical
Additional options not included in the object properties, specified as
                                    a character vector. The character vector must be in the Bowtie 2
                                    option syntax (prefixed by one or two dashes). The default value
                                    is an empty character vector ''.
Example: 'ExtraBowtie2Command','--version'
Data Types: char | string
Since R2023b
K-mer length and step size to use when you set
                            ReadFormat="FASTAKMer", specified as a two-element
                        vector of positive integers.
Data Types: double
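To make the k-mer length and step size concrete, the following sketch extracts k-mers from a short sequence in plain MATLAB. It assumes that k-mers start at offsets 0, step, 2*step, and so on, which is how the Bowtie 2 manual describes k-mer extraction; the sequence and parameter values are hypothetical, and the code does not call bowtie2.

```matlab
seq = 'ACGTACGTAC';      % hypothetical reference sequence
kmerParams = [4 3];      % [k-mer length, step size]

k = kmerParams(1);
step = kmerParams(2);

% Start positions of every full-length k-mer (1-based indexing).
starts = 1:step:(numel(seq) - k + 1);

% Extract one k-mer per start position.
kmers = arrayfun(@(s) seq(s:s+k-1), starts, 'UniformOutput', false);
disp(kmers)
```

With these values, the k-mers start at positions 1, 4, and 7 of the sequence.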
Since R2023b
Flag to filter reads with nonzero QSEQ filter field, specified as a numeric or logical 1 (true) or 0 (false). This flag is functional only if you also set ReadFormat="QSeq".
Data Types: double | logical
Flag to ignore the actual read position quality when a mismatch occurs,
                        specified as a numeric or logical 1 (true) or 0
                            (false). Setting this property to
                            true allows the quality value at that mismatched
                        position to be the highest possible, regardless of the actual value.
Data Types: double | logical
Since R2023b
Flag to consider soft-clipped bases as unmapped when calculating TLEN in the output SAM file, specified as a numeric or logical 1 (true) or 0 (false). This flag is functional only if you also set Mode="Local". TLEN stands for signed observed template length.
Data Types: double | logical
Since R2023b
Flag to specify quality values in the input reads as space-separated integers rather than ASCII characters, specified as a numeric or logical 1 (true) or 0 (false).
Data Types: double | logical
Reward added to the alignment score when a position in the read matches a position in the reference, specified as a nonnegative integer.
Data Types: double
Since R2023b
Orientation of mate pairs for paired-end alignment, specified as one of the following:
- "ForwardReverse"— Aligned pairs are derived from a forward-oriented mate upstream of a reverse-oriented complement mate.
- "ReverseForward"— Aligned pairs are derived from a reverse-oriented complement mate upstream of a forward-oriented mate.
- "ForwardForward"— Aligned pairs are derived from a forward-oriented mate upstream of a forward-oriented mate.
Data Types: char | string
Function governing the maximum number of ambiguous characters allowed in a read, specified as a character vector or string scalar.
The function has the format 'f,B,A', where
        f is a function type, B is a constant term, and
        A is a coefficient. Available function types are:
- 'C'– Constant
- 'L'– Linear
- 'S'– Square root
- 'G'– Natural log
The resulting function is H(x) = B + A * f(x), where
        x is the read length.
The default function is 'L,0,0.15', that is,
                            H(x) = 0 + 0.15 * x.
Example: 'MaxAmbiguousFunction','L,-0.4,-0.6'
Data Types: char | string
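The 'f,B,A' format can be evaluated by hand. The following sketch parses a function string and computes H(x) for one read length. The read length is hypothetical, and the treatment of the constant type 'C' (no read-length term, so H(x) = B) is an assumption rather than something stated above.

```matlab
spec = 'L,0,0.15';            % default function string from above
readLength = 100;             % hypothetical read length

parts = strsplit(spec, ',');
B = str2double(parts{2});     % constant term
A = str2double(parts{3});     % coefficient

switch parts{1}
    case 'C'
        f = @(x) 0 * x;       % assumption: constant type has no length term
    case 'L'
        f = @(x) x;           % linear
    case 'S'
        f = @(x) sqrt(x);     % square root
    case 'G'
        f = @(x) log(x);      % natural log
    otherwise
        error('Unknown function type: %s', parts{1});
end

H = B + A * f(readLength);
fprintf('H(%d) = %g\n', readLength, H);
```

For the default function, a 100-base read gives H = 0 + 0.15 * 100 = 15.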
Since R2023b
Maximum fragment length for the paired-end alignment, specified as a positive integer.
The larger the difference between MaxFragmentLength
                        and MinFragmentLength is, the slower
                            bowtie2 runs.
This option does not take trimming into account. That is, if you specify trimming options, such as Trim3 or Trim5, MaxFragmentLength is applied to the untrimmed mates.
Data Types: double
Flag to use memory mapping (instead of file I/O) when loading the index, specified as a numeric or logical 1 (true) or 0 (false). Memory mapping allows many concurrent processes to share the memory image of the index, resulting in a more efficient parallelization of the task.
Data Types: double | logical
Since R2023b
Name of the metrics file, specified as a character vector or string
                        scalar. This file contains performance metrics for the alignment generated
                        by bowtie2. By default, bowtie2
                        does not generate a metrics file.
Data Types: char | string
Since R2023b
Time interval in seconds for writing to the metrics file, specified as a
                        positive integer. This option is functional only if you also specify
                            MetricsFile. If so, by default,
                            bowtie2 writes a new metrics record every
                        second.
Data Types: double
Since R2023b
Minimum fragment length for the paired-end alignment, specified as a nonnegative integer.
The larger the difference between MaxFragmentLength
                        and MinFragmentLength is, the slower
                            bowtie2 runs.
This option does not take trimming into account. That is, if you specify trimming options, such as Trim3 or Trim5, MinFragmentLength is applied to the untrimmed mates.
Data Types: double
Function governing the minimum score threshold of an alignment, specified as a character vector or string scalar.
The function has the format 'f,B,A', where
        f is a function type, B is a constant term, and
        A is a coefficient. Available function types are:
- 'C'– Constant
- 'L'– Linear
- 'S'– Square root
- 'G'– Natural log
The resulting function is H(x) = B + A * f(x), where
        x is the read length.
For the 'EndToEnd' alignment mode, the default function
                        is 'L,-0.6,-0.6'. For the 'Local'
                        mode, the default function is 'G,20,8'.
Example: 'MinScoreFunction','L,-0.4,-0.6'
Data Types: char | string
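As a worked example of the two default functions, the following sketch computes the minimum score threshold for a hypothetical 100-base read in each alignment mode. It only evaluates the formulas given above and does not call bowtie2.

```matlab
readLength = 100;   % hypothetical read length

% 'L,-0.6,-0.6' (EndToEnd default): H(x) = -0.6 + (-0.6) * x
minScoreEndToEnd = -0.6 + (-0.6) * readLength;

% 'G,20,8' (Local default): H(x) = 20 + 8 * ln(x)
minScoreLocal = 20 + 8 * log(readLength);

fprintf('EndToEnd threshold: %.1f\n', minScoreEndToEnd);
fprintf('Local threshold:    %.2f\n', minScoreLocal);
```

The end-to-end threshold is negative (-60.6 here), while the local threshold is positive (about 56.84 here).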
Maximum and minimum values to compute the mismatch penalty during alignment, specified as a two-element vector. The first element is the maximum value and the second element is the minimum value.
A number less than or equal to the maximum value, and greater than or
                        equal to the minimum value is subtracted from the alignment score for each
                        position where a read character aligns to a reference character, the
                        characters do not match, and neither is an N
                        character.
Example: 'MismatchPenalty',[5 3]
Data Types: double
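The description above does not state how the penalty is chosen between the two bounds. The Bowtie 2 manual gives it as MN + floor((MX - MN) * min(Q,40)/40), where Q is the Phred quality of the mismatched read position, MX is the maximum, and MN is the minimum. The following sketch applies that formula; the formula comes from the Bowtie 2 manual rather than from this page, and the penalty vector and quality values are hypothetical.

```matlab
mismatchPenalty = [6 2];   % [maximum minimum], hypothetical setting
MX = mismatchPenalty(1);
MN = mismatchPenalty(2);

Q = [0 10 30 40 41];       % hypothetical Phred quality values

% Penalty per mismatch, following the Bowtie 2 manual formula.
penalty = MN + floor((MX - MN) .* (min(Q, 40) ./ 40));

disp([Q; penalty])
```

Higher-quality positions incur a larger penalty, capped at the maximum value once Q reaches 40.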
Alignment mode, specified as 'EndToEnd' or
                            'Local'.
In the 'Local' mode, only part of the read must align
                        to the reference, and some residues can be omitted (soft-clipped) to achieve
                        the best alignment score. In the 'EndToEnd' mode, the
                        entire read must align without any soft-clipping.
Data Types: char | string
Flag to reinitialize the pseudo-random generator for each read using the
                        current time, specified as a numeric or logical 1 (true)
                        or 0 (false). If true, the alignments
                        reported for two identical reads can be different. The default value is
                            false, that is, the pseudo-random generator is
                        reinitialized using a seed derived from read information and the seed
                        number.
Data Types: double | logical
Number of positions at the beginning or end of each read where gaps are not allowed, specified as a nonnegative integer.
Data Types: double
Maximum number of valid alignments to report before terminating the
                        search, specified as a positive integer, 'Best', or
                            'All'. If you specify a positive integer
                            N, the function searches for up to
                            N distinct, valid alignments for each read.
                            'Best' reports the best alignment for each read.
                            'All' reports all the valid alignments for each read
                        sorted by alignment scores.
The alignment score for a paired-end alignment equals the sum of the alignment scores of individual mates.
Data Types: double | char | string
Maximum number of reseeding attempts with repetitive seeds, specified as a nonnegative integer. During reseeding, the function chooses a new set of seeds at different offsets to find more alignments.
Data Types: double
Maximum number of consecutive seed extension attempts before getting a new seed, specified as a nonnegative integer. A seed extension fails if it does not yield an alignment with the best (or second-best) score.
Data Types: double
Number of allowed mismatches in a seed alignment during the multiseed alignment, specified as 0 or 1.
Data Types: double
Number of parallel threads to perform the alignment, specified as a positive integer. Threads run on separate processors or cores. Increasing the number of threads provides a significant increase in speed (close to linear) but also increases the memory footprint.
Data Types: double
Offrate to use when reading the index to reduce the memory footprint, specified as a positive integer. The offrate must be greater than the offrate used to build the index.
Data Types: double
Since R2023b
Flag to omit SEQ and QUAL fields, specified as a numeric or logical 1
                            (true) or 0 (false). When this
                        option is true, bowtie2 prints an asterisk
                            "*" for these fields in the output SAM file.
Data Types: double | logical
Position in the reference sequence where the alignment for each sequence begins, specified as a nonnegative integer.
Data Types: double
Since R2023b
File format for the input reads, specified as one of the following strings.
- "": Uses the extensions of the input files to determine the file format. All the input files must have the same file extension.
- "FASTQ": FASTQ file format.
- "FASTA": FASTA file format.
- "FASTAKMer": FASTA file format, where you align k-mers extracted from the input files. You must also specify FASTAKMerParameters, which defines the k-mer length and step size.
- "Interleaved": Interleaved FASTQ files, where the first two records represent a mate pair.
- "BAM": Sorted and unaligned BAM files.
- "RawSequences": Input files contain a single sequence per line.
- "QSeq": QSEQ file format.
- "Tab5": TAB5 file format, where each read or pair is on a single line. An unpaired read line is [name]\t[seq]\t[qual]\n. A paired-end read line is [name]\t[seq1]\t[qual1]\t[seq2]\t[qual2]\n. An input file can contain a mix of unpaired and paired-end reads, and the function can distinguish and handle both read types.
- "Tab6": TAB6 file format, where an unpaired read line is [name]\t[seq]\t[qual]\n and a paired read line is [name1]\t[seq1]\t[qual1]\t[name2]\t[seq2]\t[qual2]\n.
Data Types: char | string
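To show what the TAB5 and TAB6 line layouts look like in practice, the following sketch builds one line of each kind with sprintf and counts the tab-separated fields. The read names, sequences, and quality strings are hypothetical.

```matlab
% Unpaired read line (same layout in TAB5 and TAB6): name, sequence, quality.
unpairedLine = sprintf('%s\t%s\t%s\n', 'read1', 'ACGT', 'IIII');

% TAB5 paired-end line: one shared name, then seq/qual for each mate.
pairedTab5 = sprintf('%s\t%s\t%s\t%s\t%s\n', ...
    'pair1', 'ACGT', 'IIII', 'TTGA', 'HHHH');

% TAB6 paired line: each mate has its own name.
pairedTab6 = sprintf('%s\t%s\t%s\t%s\t%s\t%s\n', ...
    'pair1/1', 'ACGT', 'IIII', 'pair1/2', 'TTGA', 'HHHH');

% Count the tab-separated fields on a line (trailing newline removed).
countFields = @(line) numel(strsplit(strtrim(line), sprintf('\t')));

fprintf('%d %d %d\n', countFields(unpairedLine), ...
    countFields(pairedTab5), countFields(pairedTab6));
```

The field count (3, 5, or 6) is what distinguishes an unpaired line from a paired line in each format.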
Gap costs for opening and extending a gap on the read, specified as a
                        two-element vector of nonnegative integers. The first element is the cost of
                        opening a gap, and the second element is the cost of extending a gap. Given
                        the cost vector [GO
                            GE], a read gap of length
                            N is assigned a penalty of
                                GO + N *
                                GE.
Example: 'ReadGapCosts',[4 2]
Data Types: double
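The gap penalty formula GO + N * GE can be checked with a few values. The following sketch uses the cost vector from the example above and a hypothetical range of gap lengths.

```matlab
readGapCosts = [4 2];      % [GO GE]: gap open and gap extension costs
GO = readGapCosts(1);
GE = readGapCosts(2);

N = 1:5;                   % hypothetical gap lengths
penalty = GO + N .* GE;    % penalty for each gap length

disp([N; penalty])
```

With these costs, a gap of length 3 is assigned a penalty of 4 + 3 * 2 = 10.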
Read group information to add as a field on the @RG
                        header line in the output SAM report, specified as a character vector or
                        string. This property applies only if you specify
                            'ReadGroupID'.
Data Types: char | string
Read group ID to add on the @RG header line in the
                        output SAM report, specified as a character vector or string. If you specify
                        any read group ID, the function prints the @RG header
                        line with the tag ID: followed by the specified group
                        ID.
Data Types: char | string
Since R2023b
Compression type to use for the supplement files, specified as "None", "gz", "bz2", or "lz4". Use the following options to specify supplement files: AlignedPairedReadSupplementFile, AlignedUnpairedReadSupplementFile, UnalignedPairedReadSupplementFile, and UnalignedUnpairedReadSupplementFile.
Data Types: char | string
Gap costs for opening and extending a gap on the reference, specified as a
                        two-element vector of nonnegative integers. The first element is the cost of
                        opening a gap, and the second element is the cost of extending a gap. Given
                        the cost vector [GO
                            GE], a reference gap of length
                            N is assigned a penalty of
                                GO + N *
                                GE.
Example: 'RefGapCosts',[4 2]
Data Types: double
Flag to reorder SAM records to maintain the same order as in the input
                        files, specified as a numeric or logical 1 (true) or 0
                            (false). This property applies only when the number
                        of parallel threads is greater than one. When you use one thread, the order
                        of the records in the output is the same as the order of the input.
Data Types: double | logical
Number to set the seed in the pseudo-random number generator, specified as a nonnegative integer.
Example: 'Seed',3
Data Types: double
Function governing the distance between seed substrings during the multiseed alignment, specified as a character vector or string scalar.
The function has the format 'f,B,A', where
        f is a function type, B is a constant term, and
        A is a coefficient. Available function types are:
- 'C'– Constant
- 'L'– Linear
- 'S'– Square root
- 'G'– Natural log
The resulting function is H(x) = B + A * f(x), where
        x is the read length.
For the 'EndToEnd' alignment mode, the default function
                        is 'S,1,1.15'. For the 'Local' mode,
                        the default function is 'S,1,0.75'.
Example: 'SeedIntervalFunction','S,2,2.15'
Data Types: char | string
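For a sense of scale, the following sketch evaluates the two default seed interval functions for a hypothetical 100-base read. It only applies the formula H(x) = B + A * sqrt(x) given above.

```matlab
readLength = 100;   % hypothetical read length

% 'S,1,1.15' (EndToEnd default)
intervalEndToEnd = 1 + 1.15 * sqrt(readLength);

% 'S,1,0.75' (Local default)
intervalLocal = 1 + 0.75 * sqrt(readLength);

fprintf('EndToEnd interval: %.1f\n', intervalEndToEnd);
fprintf('Local interval:    %.1f\n', intervalLocal);
```

The local default yields a smaller interval (8.5 versus 12.5 here), so seeds are spaced more densely along the read.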
Seed substring length to align during the multiseed alignment, specified as a positive integer.
Data Types: double
Number of reads to ignore from the beginning of the input files, specified as a nonnegative integer.
Data Types: double
Number of residues to trim from the 3' end of each read before aligning, specified as a nonnegative integer.
Data Types: double
Number of residues to trim from the 5' end of each read before aligning, specified as a nonnegative integer.
Data Types: double
Since R2023b
Threshold to trim reads exceeding a given number of bases, specified as a nonnegative integer or two-element array. By default, no reads are trimmed.
If the value is a nonnegative integer N, reads that contain more bases than N are trimmed from the 3' end.
If the value is a two-element array
                                [M,N], the
                        first number M must be either 3 or 5, which indicates
                        either the 3' or 5' end to trim from. The second number specifies the
                        maximum read length and any reads containing more bases than
                            N are trimmed.
Data Types: double
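The two accepted forms of this value can be illustrated on a single read. The following sketch mimics the trimming rule in plain MATLAB, assuming the read is written from its 5' end to its 3' end; the read and threshold values are hypothetical, and the code does not call bowtie2.

```matlab
read = 'AACCGGTTAC';    % hypothetical 10-base read, written 5' to 3'
value = [5 6];          % trim from the 5' end to at most 6 bases

if isscalar(value)
    trimEnd = 3;        % a single integer trims from the 3' end
    maxLength = value;
else
    trimEnd = value(1); % 3 or 5
    maxLength = value(2);
end

trimmed = read;
if numel(read) > maxLength
    if trimEnd == 3
        trimmed = read(1:maxLength);           % drop bases at the 3' end
    else
        trimmed = read(end-maxLength+1:end);   % drop bases at the 5' end
    end
end

disp(trimmed)
```

Setting value to the scalar 6 instead keeps the first six bases of the read.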
Since R2023b
Flag to truncate read names, specified as a numeric or logical 1
                            (true) or 0 (false). By default,
                            bowtie2 truncates the read name after the first
                        white space.
Data Types: double | logical
Since R2023b
Base name of files where paired reads that are not aligned are saved,
                        specified as a character vector or string scalar.
                            bowtie2 creates two files, one for each read pair.
                        The files have the same format as the input data.
The function appends ".1" or ".2" to
                        the base file name to specify each read pair file. If the base file name
                        includes the % symbol, bowtie2
                        inserts 1 or 2 at this
                            % position instead of appending
                            ".1" or ".2". Use
                            ReadSupplementFileCompression to compress these
                        supplement files.
By default, bowtie2 does not create these supplement
                        files.
Data Types: char | string
Since R2023b
Name of a file where unpaired reads that are not aligned are saved,
                        specified as a character vector or string scalar. The file has the same
                        format as the input data. Use
                            ReadSupplementFileCompression to compress these
                        supplement files.
By default, bowtie2 does not create the file.
Data Types: char | string
Number of reads to consider from the beginning of input files, specified
                        as a positive integer. The default value is Inf, that is,
                        all reads are considered.
Data Types: double
Since R2023b
Flag to indicate the prioritization of 1-mismatch alignments over the
                        multiseed alignment, specified as a numeric or logical 1
                            (true) or 0 (false). By default,
                            bowtie2 attempts to find the exact matches or
                        matches with a single mismatch before trying a multiseed alignment.
Data Types: double | logical
Object Functions
| getBowtie2Command | Translate object properties to Bowtie 2 options | 
| getBowtie2Table | Retrieve table with object properties and equivalent Bowtie 2 options | 
| preset | Set combination of alignment options | 
| run | Map sequence reads to reference sequence using Bowtie 2 | 
Examples
Build a set of index files for the Drosophila genome. An error message appears if you do not have the Bowtie 2 Support Package for Bioinformatics Toolbox installed when you run the function. Click the provided link to download the package from the Add-on menu.
For this example, the reference sequence Dmel_chr4.fa is already
            provided with the toolbox.
status = bowtie2build('Dmel_chr4.fa', 'Dmel_chr4_index');
If the index build is successful, the function returns 0 and
            creates the index files (*.bt2) in the current folder. The files have
            the prefix 'Dmel_chr4_index'.
Sometimes the index files exist, and you want to know the reference sequence used to
            build the index. In this case, use the bowtie2inspect function to get more information about the
            reference.
bowtie2inspect('Dmel_chr4', 'Dmel_chr4_retrieved.fa');
By default, the output file Dmel_chr4_retrieved.fa contains the sequence of the reference. You can also get summary information about the reference names and lengths instead of the actual sequence. For details on the available options, see Bowtie2InspectOptions.
Once the index is ready, map the read sequences to the reference using the
                bowtie2 function. The paired-end read files
                (SRR6008575_10k_1.fq and SRR6008575_10k_2.fq)
            are already provided with the toolbox.
bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4.sam');
The output is a SAM-formatted file that contains the mapping results.
You can specify different alignment options by passing in a Bowtie 2 syntax string or
            using a Bowtie2AlignOptions object. 
Suppose you want to trim some residues from the 3' end before
            aligning. First, create a Bowtie2AlignOptions object.
alignOpt = Bowtie2AlignOptions;
Trim four residues from the 3' end before aligning.
alignOpt.Trim3 = 4;
Map reads to the reference using the specified alignment option.
flag = bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4_trimmed.sam',alignOpt);
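To check how property values translate into Bowtie 2 options before running an alignment, you can pass the options object to getBowtie2Command, which is listed under Object Functions. The following is a minimal sketch that assumes the support package is installed. The property values are arbitrary, and the returned command text is not shown because it may vary by release.

```matlab
% Create an options object using name-value pairs documented on this page.
alignOpt = Bowtie2AlignOptions('Trim5',10,'Seed',3);

% Properties can also be set after creation.
alignOpt.ReadGapCosts = [4 2];

% Translate the object properties to Bowtie 2 option syntax.
cmd = getBowtie2Command(alignOpt)
```

Inspecting the command this way can help confirm that an option is set as intended before you start a long alignment.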
References
[1] Langmead, B., and S. Salzberg. "Fast gapped-read alignment with Bowtie 2." Nature Methods. 9, 2012, 357–359.
Version History
Introduced in R2018a
See Also
bowtie2 | bowtie2inspect | bowtie2build | Bowtie2BuildOptions | Bowtie2InspectOptions