MATLAB Answers

Text file manipulation, find specific lines, find there a string and replace the specific line with new content

1 view (last 30 days)
Martin Landesberger
Martin Landesberger on 5 Dec 2019
Commented: Stephan on 6 Dec 2019
Hello Community,
Is there a simple way in matlab to perform the following task?
I have a text file that contains of many datasets that looks like that:
#sp|A00G945|CRER-FG Beta-morph B2 chain PS=Legro soma RB=10653 GN=CWERG PE=20 SV=1
CDMSXXXTGHTJKDLGLSKDJGHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFF
GHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFFCDMSXXXTGHTJKDLGLSKDJ
DMSXXXTGHTJKDLGLSKDJDMSXXXTGHTJKDLGLSKDJDMERUTUEZZUUFFF
#sp|P011|CRYAB_Ceta Beta-morph C chain PS=Legro alto RB=5456 GN=CWERF PE=60 SV=2
CDMSXXXTGHTJKDLGLSKDJGHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFF
GHLJSDGHFKSJDHFGKSJDFHGKSDJGHKSDJHFGFFFCDMSXXXTGHTJKDLGLSKDJ
DMSXXXTGHTJKDLGLSKDJDMSXXXTGHTJKDLGLSKDJDMERUTUEZZUUFFF
...
The beginning of each dataset starts with "#".
I'd like to replace spaces with undescore,
then extract the string between "PS=" and "RB=" (In case of 1. dataset it would give back "Legro_soma"
and replace the line with "#Legro_soma"
fid =fopen('test.txt');
C=textscan(fid,'%s','delimiter','\n');
fclose(fid);
for k=1:numel(C{1,1})
tmp = regexp(C{1,1}(k),'\s'); % find empty spaces
tmp2= strfind(C{1,1}(k),'>');
C{1,1}{k,1}(tmp{1,1}) = '_'; % substitute empty spaces by '_'
end
%Here comes the attemp to extract the string:
idx = strfind(A{u},'PS=');
idend = strfind(A{u},'RB=');
remain = C{1,1}(idx+3:idend-2);
%Replacing the line can be down how?
Thanks for your ideas and help!

  0 Comments

Sign in to comment.

Answers (2)

Stephan
Stephan on 5 Dec 2019
Edited: Stephan on 5 Dec 2019

Sign in to answer this question.