setSequence

Update read sequences

Description

newObject = setSequence(object,sequenceInfo) returns a new object that is a copy of object with the Sequence property set to sequenceInfo.
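
As a minimal sketch of this syntax, the following replaces every read sequence in a single call. It assumes the sample file SRR005164_1_50.fastq used in the Examples section is on the MATLAB path. The variable names newSeqs and brAll are illustrative only.

```matlab
% Load the reads into memory so that the properties can be modified.
br = BioRead('SRR005164_1_50.fastq','InMemory',true);

% Build one random replacement per read, matching each original length.
newSeqs = cellfun(@(s) randseq(length(s)), br.Sequence, 'UniformOutput', false);

% No subset argument: sequenceInfo supplies a sequence for every element.
brAll = setSequence(br, newSeqs);
```

Because no subset is given, newSeqs holds one entry per element of the object (br.NSeqs entries in total).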

newObject = setSequence(object,sequenceInfo,subset) returns a new object that is a copy of object with the Sequence property of a subset of elements set to sequenceInfo. sequenceInfo and subset must correspond one-to-one: they must specify the same number of elements, in the same order, so that each entry of sequenceInfo is assigned to the matching element of subset.
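
The ordering requirement can be illustrated with a short sketch. It assumes the sample file SRR005164_1_50.fastq from the Examples section is available. The name brRev and the use of fliplr to produce easily recognizable replacement sequences are illustrative choices, not part of the function.

```matlab
% Load the reads into memory so that the properties can be modified.
br = BioRead('SRR005164_1_50.fastq','InMemory',true);

% subset lists element 3 first, then element 1.
subset = [3 1];

% sequenceInfo follows the same order:
% entry 1 is intended for element 3, entry 2 for element 1.
sequenceInfo = {fliplr(br.Sequence{3}); fliplr(br.Sequence{1})};

brRev = setSequence(br, sequenceInfo, subset);

% Check that element 3 received the first entry of sequenceInfo.
isequal(brRev.Sequence{3}, sequenceInfo{1})
```

Listing the elements in a different order in subset requires reordering sequenceInfo to match.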

Examples

Store read data from a FASTQ-formatted file in a BioRead object. Set 'InMemory' to true to load the object into memory so that you can modify its properties.

br = BioRead('SRR005164_1_50.fastq','InMemory',true)
br = 
  BioRead with properties:

     Quality: {50x1 cell}
    Sequence: {50x1 cell}
      Header: {50x1 cell}
       NSeqs: 50
        Name: ''

Check the read sequences of the first three elements of the object.

br.Sequence(1:3)
ans = 3x1 cell array
    {'TGGCTTTAAAGCAGAACTTGTGAAAGAAGGAAAGCATTATGATTATCTGGCTAAGCTTAGCATTGTTTAGAA'                                                                                                     }
    {'TTACACTATCCTCTGATTACCAAAGACGTTTCTCGGTCATACAGACAGTCCTTGAGCAAGGGAAGAATTTATTTGCAGGCAAAAAAGTGTCCAACCGTATCGTGAGTATCGACCGGCATTACCTT'                                                }
    {'CACGAGCGGTATATTTGCCTTTTTGTGCTGTGATTCGATTCTTTTCTCTCCTCCACCCAAGCGAGCTTGCTCACGAAGTGCGATGAGCTCTTTTACTTTTCAAGCTGGTTACTCATTGTATTTTGATTTGTTGTTAGAAATGAACGGATTAATTATTTGTTGCCCGGCATGCA'}

Generate random sequences for the first three reads. Use the randseq function to generate random sequences with the same length as the original sequences.

sequenceInfo = cell(3,1);
rng('default');
for i = 1:3
    sequenceInfo{i} = randseq(length(br.Sequence{i}));
end
sequenceInfo
sequenceInfo = 3x1 cell array
    {'TTATGACGTTATTCTACTTTGATTGTGCGAGACAATGCTACCTTACCGGTCGGAACTCGATCGGTTGAACTC'                                                                                                     }
    {'TATCACGCCTGGTCTTCGAAGTTAGCACATCGAGCGGGCAATATGTACATATTTACCTCTACAATGGATGCGCAAAAACATTCCCTCATCACAATTGAACTAAAGGGCGCGAGACGTATTCCCCG'                                                }
    {'GTTGCTGCTTGGGACCATAAAACCTCATTCACCGCGGAACCCGACTATGCGACTGGACGGCCTATTTACCGAGAGCTGTTCGAAGGCTGGTTGAATACATGGCAGAAGATTGAGGTGTCCTAAACTTACGCGGCCATAACACCTTAGCCGTCTCGGGGGAATAAGTGACCTAT'}

Update the sequences of the first three elements. setSequence does not modify br. It returns br2, a copy of br with the updated read sequences. To update br itself, assign the output of the function back to br.

br2 = setSequence(br,sequenceInfo,1:3);
br2.Sequence(1:3)
ans = 3x1 cell array
    {'TTATGACGTTATTCTACTTTGATTGTGCGAGACAATGCTACCTTACCGGTCGGAACTCGATCGGTTGAACTC'                                                                                                     }
    {'TATCACGCCTGGTCTTCGAAGTTAGCACATCGAGCGGGCAATATGTACATATTTACCTCTACAATGGATGCGCAAAAACATTCCCTCATCACAATTGAACTAAAGGGCGCGAGACGTATTCCCCG'                                                }
    {'GTTGCTGCTTGGGACCATAAAACCTCATTCACCGCGGAACCCGACTATGCGACTGGACGGCCTATTTACCGAGAGCTGTTCGAAGGCTGGTTGAATACATGGCAGAAGATTGAGGTGTCCTAAACTTACGCGGCCATAACACCTTAGCCGTCTCGGGGGAATAAGTGACCTAT'}
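
To overwrite the original object rather than create a second one, the output can be assigned back to the input variable. The following is a minimal, self-contained sketch of that pattern, assuming the same sample file is on the path.

```matlab
% Load the reads into memory so that the properties can be modified.
br = BioRead('SRR005164_1_50.fastq','InMemory',true);

% Generate replacement sequences for the first three reads.
sequenceInfo = cell(3,1);
for i = 1:3
    sequenceInfo{i} = randseq(length(br.Sequence{i}));
end

% Assign the result back to br so that br itself holds the new sequences.
br = setSequence(br, sequenceInfo, 1:3);
```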

You can also update the sequences of the br object directly using dot notation.

br.Sequence(1:3) = sequenceInfo;
br.Sequence(1:3)
ans = 3x1 cell array
    {'TTATGACGTTATTCTACTTTGATTGTGCGAGACAATGCTACCTTACCGGTCGGAACTCGATCGGTTGAACTC'                                                                                                     }
    {'TATCACGCCTGGTCTTCGAAGTTAGCACATCGAGCGGGCAATATGTACATATTTACCTCTACAATGGATGCGCAAAAACATTCCCTCATCACAATTGAACTAAAGGGCGCGAGACGTATTCCCCG'                                                }
    {'GTTGCTGCTTGGGACCATAAAACCTCATTCACCGCGGAACCCGACTATGCGACTGGACGGCCTATTTACCGAGAGCTGTTCGAAGGCTGGTTGAATACATGGCAGAAGATTGAGGTGTCCTAAACTTACGCGGCCATAACACCTTAGCCGTCTCGGGGGAATAAGTGACCTAT'}

Input Arguments

object: Object containing the read data, specified as a BioRead or BioMap object. If the object is not stored in memory, you cannot modify its properties, except the Name property.

Example: readData

sequenceInfo: Read sequences, specified as a cell array of character vectors or a string vector containing nucleotide sequences.

Example: {'TGGCTTC','AAAGCAGTACG'}

subset: Subset of elements in the object, specified as a vector of positive integers, a logical vector, or a string vector or cell array of character vectors containing valid sequence headers.

Example: [1 3]

Tip

When you use a sequence header (or a cell array of headers) for subset, a repeated header specifies all elements with that header.
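
The following sketch shows the logical and header forms of subset side by side. It assumes the sample file SRR005164_1_50.fastq from the Examples section is available and that the headers of reads 1 and 3 are unique in that file. If a header were repeated, it would select every element with that header, and sequenceInfo would need one entry per selected element. The names mask, hdrs, brLogical, and brHeader are illustrative only.

```matlab
% Load the reads into memory so that the properties can be modified.
br = BioRead('SRR005164_1_50.fastq','InMemory',true);

% Two replacement sequences with the same lengths as reads 1 and 3.
newSeqs = {randseq(length(br.Sequence{1})); randseq(length(br.Sequence{3}))};

% Logical subset: one flag per element, true for the elements to update.
mask = false(br.NSeqs,1);
mask([1 3]) = true;
brLogical = setSequence(br, newSeqs, mask);

% Header subset: select the same two reads by their headers.
% Assumption: these two headers each occur only once in the file.
hdrs = br.Header([1 3]);
brHeader = setSequence(br, newSeqs, hdrs);
```

Under the uniqueness assumption, both calls are expected to update the same two elements as the numeric subset [1 3].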

Output Arguments

newObject: New object with updated properties, returned as a BioRead or BioMap object.

Introduced in R2010a