How to read text files from sub-sub folders
Mekala balaji
on 4 Oct 2017
Commented: Mekala balaji
on 17 Oct 2017
Hi,
I want to read text files from sub-sub folders:
Architecture:
Mainfolder
    Tool1
        sub-subFolder1
        sub-subFolder2
        .....
    Tool2
        sub-subFolder1
        sub-subFolder2
        .....
    ......
1. Read the text files under each sub-folder (i.e., Tool1, Tool2, etc.)
2. Output one workbook per sub-folder: Tool1.xlsx, Tool2.xlsx
I use the following code, but I cannot access the sub-sub folders.
% - Define output header.
header = {'RainFallID', 'IINT', 'Rain Result', 'Start Time', 'Param1.pipe', ...
    '10 Un Para2.pipe', 'Verti 2 mixing.dis', 'Rate.alarm times'} ;
Mainfolder='Mainfolder';
outLocatorFolder='OutputFolder';
nHeaderCols = numel( header ) ;
% - Build listing sub-folders of main folder.
% D_main = dir( 'D:\Mekala_Backupdata\Matlab2010\Mainfolder' ) ;
D_main = dir(Mainfolder ) ;
D_main = D_main(3:end) ; % Eliminate "." and ".."
% - Iterate through sub-folders and process.
for dId = 1 : numel( D_main )
    % - Build listing files of sub-folder.
    D_sub = dir( fullfile(Mainfolder, D_main(dId).name, '*.txt' )) ;
    nFiles = numel( D_sub ) ;
    keyboard
    % - Prealloc output cell array.
    data = cell( nFiles, nHeaderCols ) ;
    % - Iterate through files and process.
    for fId = 1 : nFiles
        % - Read input text file.
        inLocator = fullfile(Mainfolder, D_main(dId).name, D_sub(fId).name ) ;
        content = fileread( inLocator ) ;
        % - Extract relevant data.
        rainfallId = str2double( regexp( content, '(?<=RainFallID\s+:\s*)\d+', 'match', 'once' )) ;
        iint = regexp( content, '(?<=IINT\s+:\s*)\S+', 'match', 'once' ) ;
        rainResult = regexp( content, '(?<=Rain Result\s+:\s*)\S+', 'match', 'once' ) ;
        startTime = strtrim( regexp( content, '(?<=Start Time\s+:\s*).*?(?= -)', 'match', 'once' )) ;
        param1Pipe = str2double( regexp( content, '(?<=Param1.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        tenUn = str2double( regexp( content, '(?<=10 Un Para2.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        verti2 = regexp( content, '(?<=Verti 2 mixing.dis\s+\S+\s%\s+)\S+', 'match', 'once' ) ;
        rateAlarm = strtrim( regexp( content, '(?<=Rate.alarm times\s+\S+\s+)[^\r\n]+', 'match', 'once' )) ;
        % - Populate data cell array.
        data(fId,:) = {rainfallId, iint, rainResult, startTime, ...
            param1Pipe, tenUn, verti2, rateAlarm} ;
    end
    % - Output to XLSX.
    % outLocator = fullfile( 'D:\Mekala_Backupdata\Matlab2010\OutputFolder', sprintf( '%s.xlsx', D_main(dId).name )) ;
    outLocator = fullfile(outLocatorFolder, sprintf( '%s.xlsx', D_main(dId).name )) ;
    fprintf( 'Output XLSX: %s ..\n', outLocator ) ;
    xlswrite( outLocator, [header; data] ) ;
end
Many thanks in advance.
Accepted Answer
More Answers (1)
Cedric
on 4 Oct 2017
Edited: Cedric
on 4 Oct 2017
Look at the EDIT 4:09pm block in the thread. Update the pseudo-code
Iterate through sub folders of 'Mainfolder'
    Iterate through files of sub folder
        Extract data from file and store in data array
    Export data array to relevant Excel file
specifically for your new problem; doing so should show you how to restructure and update the former code. First, remove all the code that is not necessary for crawling through the folders and files, and run it to check that it crawls as desired.
Big hint: you should be able to add a level of FOR loop. Define D_sub at a strategic place:
for dmId = 1 : numel( D_main )
    D_sub = dir( fullfile( Mainfolder, D_main(dmId).name )) ;
    D_sub = D_sub(3:end) ; % Eliminate "." and ".."
iterate through its elements (sub-sub-folders):
    for dsId = 1 : numel( D_sub )
        D_subsub = dir( fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, '*.txt' )) ;
        nFiles = numel( D_subsub ) ;
and finally iterate through D_subsub elements (the text files):
        for fId = 1 : nFiles
            inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
            content = fileread( inLocator ) ;
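Assembled, the three loop fragments above could look like the sketch below. It only crawls and reports what it finds, with the data extraction and the Excel export left out, so that the folder traversal can be checked on its own. The isdir/ismember filter is a substitution for the positional (3:end) trick and is not part of the hints above; the nRead counter and the fprintf calls are there purely for illustration.

```matlab
Mainfolder = 'Mainfolder' ;   % same folder name as in the question

% Keep only real sub-folders. This filter is an assumption: it replaces
% D_main(3:end), which relies on "." and ".." being listed first.
D_main = dir( Mainfolder ) ;
D_main = D_main([D_main.isdir] & ~ismember( {D_main.name}, {'.', '..'} )) ;

for dmId = 1 : numel( D_main )
    % Sub-sub-folders of the current tool folder.
    D_sub = dir( fullfile( Mainfolder, D_main(dmId).name )) ;
    D_sub = D_sub([D_sub.isdir] & ~ismember( {D_sub.name}, {'.', '..'} )) ;

    nRead = 0 ;   % illustrative counter: text files found for this tool
    for dsId = 1 : numel( D_sub )
        % Text files of the current sub-sub-folder.
        D_subsub = dir( fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, '*.txt' )) ;
        for fId = 1 : numel( D_subsub )
            inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
            fprintf( 'Would read: %s\n', inLocator ) ;
            nRead = nRead + 1 ;
        end
    end
    fprintf( '%s: %d text file(s)\n', D_main(dmId).name, nRead ) ;
end
```

Once the printed paths match the expected folder layout, the extraction code can go back inside the innermost loop. The export would presumably stay at the dmId level, so that each tool still produces a single workbook.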
Note that if you have a recent version of MATLAB (R2016b or later, where the output of DIR includes a folder field), you can shorten most calls to FULLFILE by using the folder field of the relevant DIR output, e.g.:
inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
could be replaced by:
inLocator = fullfile( D_subsub(fId).folder, D_subsub(fId).name ) ;
Finally, note that if you have a lot of different situations with varying depths of nested folders, a better approach would be to build a recursive crawler, but this is a bit more complex.
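As a rough illustration of what such a recursive crawler might look like, here is a minimal sketch. The function name listTextFiles is hypothetical and not from this thread; it simply returns the full paths of all .txt files found at any depth below a root folder, and it would have to be saved in its own file, listTextFiles.m.

```matlab
function files = listTextFiles( rootFolder )
% listTextFiles  Collect full paths of all .txt files below rootFolder.
%   Hypothetical helper for illustration only: it calls itself on every
%   sub-folder, so it works for any depth of nesting.
    files = {} ;
    entries = dir( rootFolder ) ;
    for k = 1 : numel( entries )
        name = entries(k).name ;
        if ismember( name, {'.', '..'} )
            continue   % skip the current/parent folder entries
        end
        fullPath = fullfile( rootFolder, name ) ;
        if entries(k).isdir
            % Recurse into the sub-folder and append whatever it returns.
            files = [files, listTextFiles( fullPath )] ; %#ok<AGROW>
        elseif numel( name ) >= 4 && strcmpi( name(end-3:end), '.txt' )
            files{end+1} = fullPath ; %#ok<AGROW>
        end
    end
end
```

A call such as files = listTextFiles( 'Mainfolder' ) would then return a cell array of paths to loop over with fileread. Grouping the results per tool folder, as the one-workbook-per-tool output requires, is left out of this sketch. In releases where DIR accepts the '**' wildcard (R2016b or later, to my knowledge), dir( fullfile( 'Mainfolder', '**', '*.txt' )) may give a similar listing without any explicit recursion.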
4 Comments
Cedric
on 4 Oct 2017
Edited: Cedric
on 4 Oct 2017
You should index D_main with dmId when you generate the output locator. When I wrote the hints above with an additional level of loop, I changed the name of the loop index variables to make them more consistent: dmId for "dir main ID" and dsId for "dir sub ID".
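For instance, with the renamed loop index, the output locator line could be written as in the short sketch below. The D_main struct here is a hypothetical stand-in with two tool names so that the snippet runs on its own; in the real code it comes from DIR.

```matlab
outLocatorFolder = 'OutputFolder' ;                 % as in the question
D_main = struct( 'name', {'Tool1', 'Tool2'} ) ;     % stand-in for the DIR output

for dmId = 1 : numel( D_main )
    % One workbook per tool folder, named after D_main(dmId), not D_sub.
    outLocator = fullfile( outLocatorFolder, sprintf( '%s.xlsx', D_main(dmId).name )) ;
    fprintf( 'Output XLSX: %s ..\n', outLocator ) ;
end
```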