Find and Replace Overlapping Substrings
3 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
Hello,
I want to find a set of substrings (between 19 and 24 characters long, 'ACGT' mix = DNA sequences) in a bigger string (template DNA) and replace them with '*' for the length of the substring. I have following code.
%"template" is a 8x1 cell array with original DNA sequence data (araound 1800 chars each). To minimize the example I just go through the first cell.
%"substring" is e.g. a 50x2 cell array, with column 1 = substring and olumn 2 = length of the substring.
%"substituted_seq" is a 8x1 cell array with the replaced sequence (substrings substituted by '*')
%
substituted_seq{1,1} = strrep(template{1,1},substring{1,1},'*');
for j=1:size(substring,1)
substituted_seq{1,1} = strrep(substituted_seq{1,1},substring{j,1},'*');
end
The first problem I have is, that these substrings are overlapping with each other. So when I replace the first substring with '*' and search for the next one (which is overlapping the first) this code will not replace it anymore.
Second: I also couldn't figure out, how to replace a substing of e.g. 'ACGTCG' with the same number of '*' (in this example '******').
I would be very grateful for any help. Thanks!
0 commentaires
Réponse acceptée
Robert Cumming
le 30 Août 2012
I would make a binary flag = to the length of your string. Then run through all your substrings and mark the flag true for the characters to be replaced wiht *. This will eliminate the fact the problem of overlapping.
Once its all done you then replace all the true items recorded by flag in your string.
For your second iss: something like:
flag = 'CC';
key = regexprep ( 'CC', '.', '*' );
regexprep ( 'ABCCCCCDEFG', flag, key )
ans =
AB****CDEFG
0 commentaires
Plus de réponses (0)
Voir également
Catégories
En savoir plus sur Genomics and Next Generation Sequencing dans Help Center et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!