Getting a List of Files.... Should Be Easy But

I have a GUI app that is 99% complete, but I have spent several days trying to resolve what, to me, should be an extremely simple task.
Namely, I need to get a list of all files from a given folder down (including all subfolders).
The list should exclude:
  • Directories.
  • System Files
  • Hidden Files
  • Files which start with a full stop (ie Mac hidden files)
  • Files which have ".Spotlight-V100" as part of the folder path.
  • Files which have ".Trashes" as part of the folder path.
Questions.
  1. Is there a MATLAB command to do this?
  2. Or is there an elegant routine to do this?
  3. Or is there a MATLAB plug-in which will do this?
I have created a MATLAB script to do this on Windows, but it is approximately 90 lines of code. I hate to think what I might have to do to get this working on a Mac as well.
I will be using this regularly, so I wish to make it as elegant and efficient as possible.

4 Comments

@dpb: Shouldn't it be something like this?:
rootpath='YourRootPath';
[~,d]=system(sprintf('dir /A:-H-D-S /S /B "%s"',rootpath));
Walter Roberson on 30 Aug 2022
The list should exclude;
Directories.
System Files
Hidden Files
Files which start with a full stop (ie Mac hidden files)
Files which have ".Spotlight-V100" as part of the folder path.
Files which have ".Trashes" as part of the folder path.
None of that is difficult except for the "System Files" part. MacOS does not have any file attribute for system files, and determining whether something in MacOS depends on a file could be a lot of work.
Matt O'Brien on 30 Aug 2022
[~,d]=system(sprintf('dir /A:-H-D-S /S /B "%s"',rootpath));
This is really close, but it delivers the result as one long string of chars (1x13424 char).
Is there any way to get the result as an array or structure of some kind?
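One way to get an array is to split the char vector on its line breaks. A minimal sketch, assuming each path in the `dir /B` output ends with a newline (possibly preceded by a carriage return); the sample string below is made up so the snippet runs without calling system():

```matlab
% Hypothetical sample of what system('dir /S /B ...') returns: one path per line.
d = sprintf('W:\\Card\\DCIM\\100MSDCF\\DSC00001.JPG\r\nW:\\Card\\DCIM\\100MSDCF\\DSC00002.JPG\r\n');

fileList = splitlines(string(d));               % one string per line
fileList = strtrim(fileList);                   % drop stray leading/trailing whitespace
fileList = fileList(strlength(fileList) > 0);   % remove the empty trailing entry
```

The result is a string column vector with one full path per element, which can be indexed or looped over like any other array.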
dpb on 30 Aug 2022
@Voss -- good catch -- I threw that in at the end...will fix original.


Accepted Answer

Matt O'Brien on 14 Sep 2022
Edited: Matt O'Brien on 14 Sep 2022
I think this might be the final version..... I may find some other odd directories which may need to be excluded, but this code provides a good basis for coding for such scenarios.
dinfo = dir('F:\**\*');
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
%filtering
[~,stats] = cellfun(@fileattrib, full_filenames);
is_unwanted = [stats.hidden]==1 | [stats.system]==1;
dinfo(is_unwanted) = [];
full_filenames(is_unwanted) = [];
dinfo([dinfo.isdir]) = []; %exclude directories
dinfo( startsWith({dinfo.name}, '.') ) = []; %exclude hidden files, which is same as . files on MacOS
dinfo( contains({dinfo.folder}, {'.Spotlight-V100', '.Trashes','System Volume Information'}) ) = []; %exclude those particular directories
My thanks to Walter Roberson in particular, and all who have contributed to this discussion / solution.
P.S. I am happy working with the struct dinfo as the final output of this snippet.

4 Comments

dpb on 14 Sep 2022
Edited: dpb on 14 Sep 2022
Consider streamlining a little and putting the list of exclusion strings into a variable so you can edit/update it without modifying functional code; only data...
The fields of interest in the attributes structure are logical; explicitly testing against the value is superfluous; use the logical as is; by "==1" you're just computing a new logical in place of the one you already have.
CULL_STRS={'.Spotlight-V100', '.Trashes','System Volume Information'};
dinfo = dir('F:\**\*');
%filtering
full_filenames = fullfile({dinfo.folder}, {dinfo.name}).'';
[~,stats] = cellfun(@fileattrib, full_filenames);
dinfo=dinfo(~([stats.hidden]|[stats.system]|[stats.directory]));
dinfo=dinfo(~contains({dinfo.folder},CULL_STRS));
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
Matt O'Brien on 14 Sep 2022
Edited: Matt O'Brien on 14 Sep 2022
I like your thinking a lot but I get this error when I test it.
Error using |
Too many input arguments.
Error in BestCodeToGetListOfFilesInAllSubDirs (line 9)
dinfo=dinfo(~(stats.hidden|stats.system|stats.directory));
dpb on 14 Sep 2022
Edited: dpb on 14 Sep 2022
Oh,yeah...forgot about stats being an array when putting together...just put the [] you had back that I left out to assimilate back into vectors...
...
full_filenames = fullfile({dinfo.folder}, {dinfo.name}).'; % convert to column for pretty
[~,stats] = cellfun(@fileattrib, full_filenames);
dinfo=dinfo(~([stats.hidden]|[stats.system]|[stats.directory]));
...
Yes... just confirming .... the following snippet works with my test data. Thanks for the prompt response.
CULL_STRS={'.Spotlight-V100', '.Trashes','System Volume Information'};
dinfo = dir('F:\**\*');
%filtering
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
[~,stats] = cellfun(@fileattrib, full_filenames);
dinfo=dinfo(~([stats.hidden]|[stats.system]|[stats.directory]));
dinfo=dinfo(~contains({dinfo.folder},CULL_STRS));
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
Bringing the ad-hoc folders into a variable and using a single filter statement on the attributes is very elegant coding.
I will be using it in a variety of scenarios, and it will probably see substantial use in due course. I will post back here if I find any unusual gotchas. BTW, I have only tested on Windows; I will test on Mac within a few months.


More Answers (9)

dpb on 30 Aug 2022
Moved: dpb on 30 Aug 2022
  1. Directly as one command, no.
  2. Probably not extant, no.
  3. Certainly not specifically, no.
I "know nuthink!" of Mac OS ls equivalent, but doesn't really seem as though it should be particularly difficult -- certainly can't see why it would take some 90 lines of code.
rootpath='YourRootPath';
[~,d]=system(['dir /A:-H-D-S /S /B "' rootpath '"'] ); % quote the path so folders containing spaces work
d=string(split(d,newline));
d=d(strlength(d)>0);
will give you a list of all files that are not hidden/directories/system files for the rootpath folder and all subfolders in a list on Windows using the default CMD command shell. Mac surely has something equivalent.
A few well-chosen filters against the not-wanted list of this list should be pretty simple to code; a regexp guru there might be of some help; that wouldn't be me, however... :)
I've always wondered why/wished for that TMW would have just supported the basic OS command line switches for the native OS in its incarnation of dir -- having it neutered as it is to "plain vanilla" is a real pain.
ADDENDUM:
Thinking about the exclude list, led me to thinking it's not that hard, either...with the caveat you have had the discipline to not name a file with the excluded path name in a directory not in the excluded list.
excludeList=[".Spotlight-V100"; ".Trashes"]; % filename content to exclude
d=d(~contains(d,excludeList)); % get rid of 'em...
I guess even that part above could be handled if used
d=d(~contains(fileparts(d),excludeList)); % exclude unwanted folders only
I dunno how you would handle @Walter Roberson's comment re: Mac and OS files -- although I'd hope you aren't putting your data where the OS stores its files so it wouldn't be an issue, anyway.

2 Comments

ls('-ld', tempdir)
drwxrwxrwt 8 root root 200 Aug 30 22:45 /tmp/
You can see that on MacOs and Linux, basic command line switches for MATLAB ls() are supported.
dpb on 30 Aug 2022
Yeah, but not for Winwoes -- nor does dir for either which was my specific complaint.


Walter Roberson on 30 Aug 2022
query_folder = tempdir; %set as appropriate, tempdir is just used for example purposes
dinfo = dir( fullfile(query_folder, '**', '*') );
dinfo([dinfo.isdir]) = []; %exclude directories
dinfo( startsWith({dinfo.name}, '.') ) = []; %exclude hidden files, which is same as . files on MacOS
dinfo( contains({dinfo.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
Unless, that is, when you refer to "hidden files", you refer to things such as ~/Library . If so then you would need to use ls -@ to query looking for the extended attribute com.apple.FinderInfo 32 or system() out to xattr looking for com.apple.FinderInfo
Well, except for the fact that if you add a color tag to a file then the com.apple.FinderInfo attribute with value 32 gets added to the file, and com.apple.metadata:_kMDItemUserTags gets added as well. If you then remove the color from the file, then com.apple.FinderInfo gets removed but a com.apple.metadata:_kMDItemUserTags attribute gets left behind. So to determine whether a file is hidden you need to look for com.apple.FinderInfo is present with value 32 but com.apple.metadata:_kMDItemUserTags is not present...

2 Comments

Matt O'Brien on 31 Aug 2022
My app should not be going near system directories. It is mainly to transfer and process images from SD cards to a hard drive. In some cases the SD cards are backed up to an SSD drive in the field. It is while trying to process from the SSD drives that I am encountering large volumes of hidden, system, spotlight and trash scenarios.
My app needs to cater for all such scenarios, but will not be looking in traditional system folders.
Matt O'Brien on 31 Aug 2022
Edited: Matt O'Brien on 31 Aug 2022
Sony marks certain folders or files as hidden on SD cards used within Sony cameras. E.G. There is a database stored on the card, used by the camera for various camera functions. It is understandable that Sony has marked such files as hidden. I need to make sure these files are not included in the list I wish to transfer and process.


Matt O'Brien on 30 Aug 2022
I will explore the various suggestions..... Thanks... all good.
Some of the hidden files can be recognised with a name starting with a full stop. Some are dependent on the system or hidden attribute. Very messy.
Getting back to the system command.
The suggested syntax gives me a long stream of chars, with no delimiter between each file name/path.
The following is a quick and fairly crude work around.
Here I split using the Drive Letter and will then need to add it back in at some stage. I can do that with code in my app.
rootpath='W:\MoB_AllData\MoB_TestCopyOfSD_Card Valid\';
[~,d]=system(sprintf('dir /A:-H-D-S /S /B "%s"',rootpath));
DirSplit = split(d,'W:\')
This gives me an array of filenames (ie file urls)... which I can work with.
I will need to reconcile the results.... (of the various suggestions).... Will work on this tomorrow (midnight here in Dublin, Ireland)... and post back some comments when I check the details.

11 Comments

Walter Roberson on 30 Aug 2022
fileattrib can tell you whether a MS Windows file has Hidden or System set.
Matt O'Brien on 30 Aug 2022
Edited: Matt O'Brien on 30 Aug 2022
I cannot get the split of the datastream from the system(dir) method to work. I have tried lots of different control chars in the split function, hoping one of them will work. No go.
I will start again in the morning.... using Walter Roberson suggestion.
I got funny results before when I used something like the following ...
dinfo( contains({dinfo.folder}, {'.Spotlight-V100', '.Trashes'}) ) = [];
So... I created a separate command for each folder type.... It may have something to do with the dataset.
I will have to add some extra logic to filter on the system and hidden attributes...
Signing off for now, will pick up with this in the morning.
Thanks for all the really good responses...
dpb on 30 Aug 2022
Edited: dpb on 30 Aug 2022
"I cannot get the split of the datastream from the system(dir) method to work."
Which OS are we speaking of here, Windows I presume since dir isn't a UNIX-like syntax?
I've never seen a system that doesn't return the list as the character string that looks just like the command window output -- which is a newline after each. Attach a .mat file of your result along with the command line that created it.
Of course, under Windows, dir won't return hidden or system files anyways, so just dir with the magic incantation of the extra **\* to traverse subdirectories returns the same list, just in the struct array instead of a list of names.
Matt O'Brien on 30 Aug 2022
Windows. Will post tomorrow. Using my ipad to respond.
No problem; just surprised you're having an issue with this; I've done this kind of exact thing for 30+ years and never had a failure to parse...can't imagine what could be that would change that behavior.
I presume if you just do
!dir
at the command window prompt you see the normal OS directory listing as normal???
Matt O'Brien on 31 Aug 2022
Sometime in the not too distant past, the Dos prompt has been replaced by the Windows PowerShell environment. I worked on the original dos, wrote apps in machine code and assembler, but never had the need to get familiar with the Windows PowerShell. So, there may be a syntax adj needed for PowerShell rather than Dos.
dpb on 31 Aug 2022
Oh, bleech! I can't even figure out how to get a useful directory listing at the command prompt with that abomination. I suppose maybe you're also on Win11, too...
Matt O'Brien on 31 Aug 2022
No.... still win 10. But I did not request / configure Powershell replacement of Dos prompt. A gift from Microsoft !
Matt O'Brien on 31 Aug 2022
Powershell uses Get-ChildItem cmdlet in Recursive mode. That is for another day or project. I will post better screen grabs / info, mat file later tonight of the output from the system(dir) code.... will be v busy for the rest of the day.
dpb on 31 Aug 2022
I've not delved into how to control it (although I may have done and just forgotten it) but when "bang" to OS under MATLAB here on Win10, it still uses CMD.EXE. system is builtin so can't see what it actually does, I presume, however, it uses a start command to spawn a new CMD.EXE process, passing it the rest of the command as parameters.
I deduce that because personally on Windows I use the JPSoftware replacement command processor instead of the MS-supplied CMD, and even if Windows is configured to use it as default, MATLAB still uses CMD.EXE, not the system default. I've also thought it very rude of MathWorks to have done that and not use the system default so users could have their toolsets at hand if they wish. With the TakeCommand processor from JPSoft, one could add the various exclusions into its enhanced DIR and do virtually all the culling before returning the list. That, however, doesn't help for Mac nor for those who don't use it, of course.
But, looks as though you've basically got the problem solved with Walter's esteemed help so I'll retire from the field here unless you have something else specific along this line you care to pursue.
Good luck!!!
Matt O'Brien on 31 Aug 2022
Walter's elegant package of code is so sweet. I am seriously impressed. I just tested it and found a glitch .... hoping Walter will recognise the issue.... posted a comment relative to his post.
I will explore the PowerShell / System(dir) combo. It looks very powerful... but I need to put time and energy into grasping this tool kit... in due course.
Given the elegance of Walter's package of code, I intend to create a general purpose function I can use to generate the list of files required and provide options to include /exclude system files, hidden files, etc. I can then use this for both Win and Mac versions of my app. Also impressed with your use of the dir command. It has been so long since I was near a dos prompt. My thanks. This has been a most useful discussion.
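A general-purpose function along those lines might look like the sketch below. The name `listVisibleFiles` and its arguments are hypothetical, not something posted in this thread; it simply wraps the dir/fileattrib steps discussed here. On macOS and Linux fileattrib reports NaN for the hidden and system attributes, so there only the leading-dot rule and the folder exclusions take effect.

```matlab
function fileInfo = listVisibleFiles(rootFolder, excludeFolders)
% listVisibleFiles  Recursively list files, skipping hidden/system entries.
% Hypothetical helper; the name and arguments are illustrative only.
%   rootFolder     - folder to search (all subfolders are included)
%   excludeFolders - cell array of folder-name fragments to skip
    if nargin < 2
        excludeFolders = {'.Spotlight-V100', '.Trashes'};
    end
    fileInfo = dir(fullfile(rootFolder, '**', '*'));   % recursive listing
    fileInfo([fileInfo.isdir]) = [];                   % drop directories
    if isempty(fileInfo)
        return;                                        % nothing left to filter
    end
    fileInfo(startsWith({fileInfo.name}, '.')) = [];   % drop dot files
    fileInfo(contains({fileInfo.folder}, excludeFolders)) = [];
    if isempty(fileInfo)
        return;
    end
    fullNames = fullfile({fileInfo.folder}, {fileInfo.name});
    [~, stats] = cellfun(@fileattrib, fullNames);
    % "== 1" treats NaN (reported on macOS/Linux) the same as "not set"
    fileInfo([stats.hidden] == 1 | [stats.system] == 1) = [];
end
```

A call such as `info = listVisibleFiles('F:\', {'.Spotlight-V100', '.Trashes', 'System Volume Information'})` would return the same kind of struct array as dir, so existing code that works with dinfo should not need to change.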


Matt O'Brien on 30 Aug 2022
Edited: Matt O'Brien on 30 Aug 2022
Just some feedback ...
rootdir = 'W:\MoB_AllData\MoB_TestCopyOfSD_Card Valid\**\*.*'
dinfo = dir( rootdir );
dinfo([dinfo.isdir]) = []; %exclude directories
dinfo( startsWith({dinfo.name}, '.') ) = []; %exclude hidden files, which is same as . files on MacOS
dinfo( contains({dinfo.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
This works beautifully .. but the result includes a small number of hidden files ....
I will add filters tomorrow to filter dinfo() based on the system and hidden attributes as well as the above filters.
Matt O'Brien on 31 Aug 2022
Here is my current state of play.....
This elegant snippet (from Walter Roberson ...much appreciated) does most of the heavy lifting.
rootdir = strcat(MyDrive,'**\*.*'); %MyDrive stores full path to folder of interest
MyFileList=dir(rootdir); %get info of files/folders in current directory
%dinfo = dir( rootdir );
MyFileList([MyFileList.isdir]) = []; %exclude directories
MyFileList( startsWith({MyFileList.name}, '.') ) = []; %exclude hidden files, which is same as . files on MacOS
MyFileList( contains({MyFileList.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
But it does not catch Windows files with the attribute of Hidden.
This following snippet catches the hidden Windows files (using my test data).
% remove all folders (already done, but this initialises isBadFile)
isBadFile = cat(1,MyFileList.isdir); %# all directories are bad
% loop to identify hidden files
for iFile = find(~isBadFile)' %'# loop only non-dirs
    %# on OSX, hidden files start with a dot
    %isBadFile(iFile) = strcmp(MyFileList(iFile).name(1),'.'); % already removed in code above
    if ~isBadFile(iFile) && ispc
        %# check for hidden Windows files - only works on Windows
        tmpName = MyFileList(iFile).name;
        tmpFullName = fullfile(MyFileList(iFile).folder, tmpName); % semicolon added to suppress output
        [~,stats] = fileattrib(tmpFullName);
        if stats.hidden
            isBadFile(iFile) = true;
        end
        if stats.system
            isBadFile(iFile) = true;
        end
    end
end
%# remove bad files
MyFileList(isBadFile) = [];
I found the above snippet which catered for Hidden Windows files and also added a filter to catch Windows files with a System Attribute.
I am focused at the moment to get my app working for Windows. Will use it for a few months before I consider adjusting to make it work for Windows and Mac.
In due course I will explore the use of PowerShell to generate the equivalent of a Dir list. This involves the use of Get-ChildItem cmdlet in Recursive mode. That is for another day or project.
However, I have already created a simple MATLAB GUI to allow the selection of a folder and generate the file list, outputting it to an Excel file, splitting it up into component fields/columns such as Drive, Folder, FileName, Extension, Bytes, Date. This mini app uses the code included here. This will work for me to generate lists until I get a better handle on Windows PowerShell, and has the advantage that it should work for Mac (with a few tweaks) and Windows.
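For anyone wanting to do something similar, the export step can be sketched as below. This is only an illustration with a made-up two-file listing standing in for the filtered dir output; the column names follow the description above (the Drive column is left out for brevity), and the output file name is hypothetical.

```matlab
% Made-up listing standing in for the filtered struct array returned by dir().
fileInfo = struct( ...
    'name',   {'DSC00001.JPG', 'C0001.MP4'}, ...
    'folder', {'F:\DCIM\100MSDCF', 'F:\PRIVATE\M4ROOT\CLIP'}, ...
    'bytes',  {24117248, 734003200}, ...
    'date',   {'31-Aug-2022 10:15:00', '31-Aug-2022 10:20:00'});

% Split each file name into base name and extension.
[~, baseNames, extensions] = cellfun(@fileparts, {fileInfo.name}, 'UniformOutput', false);

% One row per file, one column per component field.
fileTable = table( ...
    {fileInfo.folder}.', baseNames.', extensions.', [fileInfo.bytes].', {fileInfo.date}.', ...
    'VariableNames', {'Folder', 'FileName', 'Extension', 'Bytes', 'Date'});

% writetable(fileTable, 'FileList.xlsx');   % uncomment to write the spreadsheet
```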
As a MATLAB beginner, my main GUI app might have been a bit too complex, but it delivers functionality which will save me a lot of time ingesting images, movies, sound and other digital assets, in a structured manner, from SD cards to a digital repository. I can start to use it now for my real-world needs.
I am truly impressed and grateful for the promptness and quality of the responses here.

2 Comments

We already know the names cannot be folders, so there is no point testing that.
The code below will work on MacOS and Linux as well -- those will return NaN for the hidden and system attributes, but by specifically testing == 1, both NaN and 0 are treated as false, so NaN does not need to be special cased.
No loop is needed.
tmpFullNames = fullfile( {MyFileList(iFile).folder}, {MyFileList(iFile).name});
[~,stats] = fileattrib(tmpFullNames);
isBadFile(iFile) = [stats.hidden]==1 | [stats.system]==1;
myFileList(isBadFile) = [];
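The NaN behaviour described above can be checked without touching the file system. A small sketch using made-up attribute values (the first element mimics what fileattrib reports on MacOS/Linux, the other two mimic Windows):

```matlab
% Synthetic attribute structs, not real fileattrib output.
stats = struct('hidden', {NaN, 0, 1}, 'system', {NaN, 0, 0});

% Comparing against 1 maps NaN and 0 to false, so no special case is needed.
isBadFile = [stats.hidden] == 1 | [stats.system] == 1;   % [false false true]

% Using the raw values as logicals fails as soon as a NaN is present.
try
    isBadRaw = [stats.hidden] | [stats.system];
    rawFailed = false;
catch
    rawFailed = true;    % NaN cannot be converted to logical
end
```

So the explicit == 1 is what keeps the one-liner portable; using the attribute values directly as logicals only works where they are never NaN.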
Matt O'Brien on 31 Aug 2022
Brilliant.... will test with my data in the next few days.


Matt O'Brien on 31 Aug 2022
Edited: Matt O'Brien on 5 Sep 2022
To Walter.
I ran into a problem with the following.
"No loop is needed."
iFile in the following line is not defined. This is the loop variable in the code I was using to cater for system and hidden files, so I am not sure how iFile should be defined or initialised. I get the following error.
"Unrecognized function or variable 'iFile'."
[ Final working code can be found at this link https://uk.mathworks.com/matlabcentral/answers/1791125-getting-a-list-of-files-should-be-easy-but?s_tid=mlc_ans_email_view#comment_2341705 . Thanks to Walter for his valuable snippet. ]
rootdir = strcat(MyDrive,'**\*.*'); %MyDrive stores full path to folder of interest
MyFileList=dir(rootdir); %get info of files/folders in current directory
%dinfo = dir( rootdir );
MyFileList([MyFileList.isdir]) = []; %exclude directories
MyFileList( startsWith({MyFileList.name}, '.') ) = []; %exclude hidden files, which is same as . files on MacOS
MyFileList( contains({MyFileList.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
tmpFullNames = fullfile( {MyFileList(iFile).folder}, {MyFileList(iFile).name});
[~,stats] = fileattrib(tmpFullNames);
isBadFile(iFile) = [stats.hidden]==1 | [stats.system]==1;
myFileList(isBadFile) = [];
Maybe I am using your suggested code in the wrong place. I can see the overall package of code is a 'thing of beauty' and an inspiration to me to achieve this standard of coding. Apologies if I am missing something obvious.
Regards.

11 Comments

Walter Roberson on 31 Aug 2022
Edited: Walter Roberson on 31 Aug 2022
tmpFullNames = fullfile( {MyFileList.folder}, {MyFileList.name});
isBadFile = [stats.hidden]==1 | [stats.system]==1;
Matt O'Brien on 31 Aug 2022
Thanks....still a glitch.... do not have the skills to debug this myself ...
Error message and screen grab of section of tmpFullNames enclosed.
Error using fileattrib Argument must be a text scalar.
Error in MoB_DirTable (line 28)
[~,stats] = fileattrib(tmpFullNames);
[~,stats] = cellfun(@fileattrib, tmpFullNames);
Matt O'Brien on 31 Aug 2022
Still an Error ....
myFileList(isBadFile) = [];
Deletion requires an existing variable.
Just fyi. isBadFile is a 1x71 logical array (ie horizontal)....
I do not want to be a pest with this..... I can see what the snippet is trying to do. I can also see how I can get it to work with a loop. I will use the loop method now for the system and hidden attributes and circle back to this later, when I can try to figure out how to do this without a loop. I have taken up enough of your time.
Well, then myFileList better be an array of 71 elements as well.
I've not tried to patch together all the myriad partial code postings to see; maybe you have a change in spelling? What does
whos myFileList
return at that point?
Do you know how to set a breakpoint and use the debugger to step through code and find/fix logic errors? You can stop at the last working point and poke around with the debugger to figure out any syntax you are uncertain of...
Matt O'Brien on 31 Aug 2022
Yes... very familiar with IDEs, debuggers, etc.... will post the info requested in a minute or two.
Matt O'Brien on 31 Aug 2022
Resolved....
'myFileList' should read 'MyFileList' (....doh!).
The rest of the routine completes successfully and my GUI fills with the expected results.
I should have spotted this, not sure when the lower case 'm' arrived, was distracted by status of isBadFile for some reason.
Thanks for sticking with this. I admire your ability to condense your code down to single statements. Impressive.
I intend to use this discussion to build a more general-purpose function to generate this list, going forward, with options to be selective regarding system or hidden files, etc., ensuring it works on Mac and Windows. It is such a neat, powerful snippet and a learning experience for me.
My thanks.
Matt O'Brien on 31 Aug 2022
Edited: Matt O'Brien on 31 Aug 2022
For the benefit of others, here is the current version of the snippet....
rootdir = strcat(MyDrive,'**\*.*'); %MyDrive stores full path to folder of interest
MyFileList=dir(rootdir); %get info of files/folders in current directory
MyFileList([MyFileList.isdir]) = []; %exclude directories
MyFileList( startsWith({MyFileList.name}, '.') ) = []; %exclude hidden files, which is same as . files on MacOS
MyFileList( contains({MyFileList.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
tmpFullNames = fullfile( {MyFileList.folder}, {MyFileList.name});
[~,stats] = cellfun(@fileattrib, tmpFullNames);
isBadFile = [stats.hidden]==1 | [stats.system]==1;
MyFileList(isBadFile) = [];
I need to do more complete testing ... As it is, I am using a comprehensive test pack... which I have reconciled with the final outputs and this snippet delivers the expected results... . I have a further series of test scripts to progress. I will post a comment here if I discover any issues. I am not expecting any surprises.
dpb on 1 Sep 2022
I thought the MATLAB dir() function did not return hidden/system files, but I see that it does...more reason for TMW to have included switches to support the OS. I suppose I never got "bit" because just "never" have either in working data directories and haven't ever tried to process a device output files such as you describe that does such things to its file structure to have been bit.
I virtually always can manage to write a search explicitly enough with wildcards to retrieve only those files of interest; usually that also includes an extension of the data type, which will remove all the unwanted directory entries, etc., automagically as well.
Anyways, glad Walter got you sorted...and hope I'm long since departed from the scene before ever having to deal with PowerShell -- it seems grossly overly verbose and complex -- TakeCommand has all it has and more in much simpler form while retaining compatibility with CMD.
Matt O'Brien on 1 Sep 2022
Edited: Matt O'Brien on 1 Sep 2022
Some background. Ignore the following if not interested in the background.
Dealing with Digital Assets from Movie/Still cameras.
In most typical cases, people using Matlab dir feature may be looking for a specific range of files such as *.txt or *.jpg, or be exploring directories which do not contain unusual, hidden or system files or directories.
In my case I am dealing with digital assets from cameras (ie SD cards or similar) in a card reader, or a direct copy of an SD card, for example a card backed up to disk.
Occasional shooters may never notice the potential complexity of the folder structure on an SD card, but volume shooters (i.e. sports, wildlife, etc.) will generate thousands of images in a very short time frame, which may then be stored in a series of DCIM folders. Each camera maker needs to conform to the DCIM standard, but also has freedom in certain aspects of how the images are stored within a DCIM folder substructure. The gotcha for many people is that a DCIM folder will normally only hold 9999 images. If more than 9999 images are shot, then they go into a different DCIM subfolder. If the images are copied directly from card to disk, then there is a risk of either duplicate file names or, horror of horrors, images from one DCIM subfolder overwriting images from another DCIM subfolder when copied to a target destination folder (i.e. the file names are the same but the image contents are different). The situation for video, sound and other digital assets can be even more complex.
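The overwrite risk can be detected before any copy is made by looking for file names that occur in more than one folder. A minimal sketch, assuming a made-up listing (the folder names are illustrative, not taken from a real card):

```matlab
% Made-up listing standing in for the struct array returned by dir().
fileInfo = struct( ...
    'name',   {'DSC00001.JPG', 'DSC00002.JPG', 'DSC00001.JPG'}, ...
    'folder', {'F:\DCIM\100MSDCF', 'F:\DCIM\100MSDCF', 'F:\DCIM\101MSDCF'});

[uniqueNames, ~, groupIdx] = unique({fileInfo.name});
counts = accumarray(groupIdx(:), 1);          % occurrences of each name
clashingNames = uniqueNames(counts > 1);      % names that would overwrite each other
isClash = ismember({fileInfo.name}, clashingNames);   % flags every affected file
```

Files flagged in isClash would need renaming (or a per-folder destination) before being copied into a single target folder. On a case-insensitive file system it may be safer to compare lower({fileInfo.name}) instead.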
The camera (stills or video) may also use the card as storage to allow the photographer to manage the captured images, rename folders and file names, rate images, delete images, etc., directly on their camera, in the field. Therefore the maker may have hidden databases and status files to manage this real-time camera functionality.
As photographers, videographers and editors may use a large range of applications to manage their images, lots of other rubbish gets added by these apps, such as '.Spotlight-V100', '.Trashes', etc.
So, my SD cards look very simple, but may contain a nightmare of hidden delights. It is super critical to filter out all such files or folders.
As I test out my app, using different cameras, I expect to find more surprises.
My app caters for managing the transfer of image, sound, video and related files from card (or backup of card) to disk, providing the option to filter files based on format (eg raw, jpg, wav, mov,xmp,etc) and date. Eg. Jpgs from a specific date may be directed to Project A, and video files with a different date may be directed to project B. The app guarantees that the original camera number, unique image number and other details are retained, as well as keeping an Excel based audit log of all files copied.
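The format and date filtering described here can be expressed against the dir output in a couple of lines. A hedged sketch with made-up data; the variables wantedExt and wantedDate are hypothetical and this is not the code used in the app:

```matlab
% Made-up listing; dir() supplies the same 'name' and 'datenum' fields.
fileInfo = struct( ...
    'name',    {'DSC00001.JPG', 'DSC00001.ARW', 'C0001.MP4'}, ...
    'datenum', {datenum(2022, 8, 31, 10, 15, 0), ...
                datenum(2022, 8, 31, 10, 15, 0), ...
                datenum(2022, 9, 1, 9, 0, 0)});

wantedExt  = {'.jpg'};                 % hypothetical format filter (lower case)
wantedDate = datenum(2022, 8, 31);     % hypothetical date of interest

[~, ~, ext] = cellfun(@fileparts, {fileInfo.name}, 'UniformOutput', false);
keep = ismember(lower(ext), wantedExt) & floor([fileInfo.datenum]) == wantedDate;
projectFiles = fileInfo(keep);         % files to route to one project
```

The floor() call discards the time of day so that every file modified on the chosen date matches.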
I had the main GUI built and the major application logic coded (and tested).... but kept getting stuck with unwanted system or hidden files.
Thanks to your help here, I have made a significant step forward.
dpb on 1 Sep 2022
Interesting. I've never been "bit" by the camera bug -- I've bought several with the intent over the years, but they all end up just sitting on the shelf gathering dust, so I've never poked around with any.
I do think this thread is another "shot across the bow" that TMW should strengthen the builtin functionality of dir() to support the underlying OS switches.
I've not tried the dir route via expressly "banging" to CMD.EXE with the command string to avoid powershell -- going that route has the benefit of return the FQNs as a list without having to construct them from the struct returned by dir() as well as the various attribute screening done first instead of later.
Anyways, looks as though you've basically got it sorted -- the other vendor idiosyncrasies are likely just going to be additional specific strings to add to the exclusion list. Unfortunately, one can imagine that may continue to grow as new models/features are introduced...


Matt O'Brien on 1 Sep 2022

I am a serious amateur photographer. A civil engineer by qualification but got into the Information Technology world ex college. Lived in the world of large / global scale enterprise computing at an application and infrastructure layer. I have deep sympathy for the end user and will always champion ease of use above technical challenges. Given my background, I cannot live with inefficient workflows. So the modern world of photography is a nightmare both for the beginner and the seasoned professional (ie flow from image capture to a published or printed product).
I have dipped my toe into this space because I see photographers (especially high-volume shooters) struggle with the needless complexity of getting images from their cameras to a back-end computer and having them arrive there in a structured and audited manner.
I left the world of writing code in the early 80's, but have been responsible for large development and implementation teams in a large variety of industries in manufacturing, distribution and banking. I am familiar with quite a number of IDEs, but I am impressed with the small footprint of MATLAB. However, while I know how to structure and design complex apps or systems, I seriously struggle with MATLAB syntax and the ever-present challenge of converting simple ideas to simple lines of code.
I like the fact that migrating from Windows to Mac is at least feasible, but probably another mountain to be climbed in due course.
I cannot believe that the dir function has not got simple switches to make it easier to filter system, hidden or sub directories, and there are other anomalies working with drives and/or folders. Yes, the attrib variable is there, but it should be simpler to use.
The important thing is to make progress... even if it is only tiny steps.
Resolving the dir issue for me might be small in the scheme of things but was very important in terms of creating a working solution. Photographers who might get to use this will never know what happens under the hood, and maybe there is a little satisfaction in that.
My best regards and deep felt appreciation to the members here..
Matt O'Brien on 12 Sep 2022

I discovered another condition which needs to be catered for, to be able to get a list of all files and subfiles from a selected drive or folder.
Dir 'folder' entries with a value of 'System Volume Information' should be excluded from the final dir list generated.

11 comments

dpb on 12 Sep 2022
@Matt O'Brien -- Given the multiple places in this thread what appears final code ended up in and that the above is more apropos as a comment than an Answer, I'd suggest taking your final function and post it in its entirety as a new Answer; then put your "Accept" stamp on it...then this comment/Answer and several others could probably be dispensed with as not being of much benefit to save...
$0.02, imo, ymmv, etc., etc., ...
Matt O'Brien on 12 Sep 2022
Edited: Matt O'Brien on 12 Sep 2022
I discovered the 'System Volume Information' scenario while working on a different app. I will revert in due course and will add the code for the extra condition and post this as the final answer. However, it will be a few days time before I can do this.
You would add that to the list in the command
MyFileList( contains({MyFileList.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
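As a minimal sketch of that one-line change: the struct array below is a hypothetical stand-in for a real dir listing (the names and folders are invented for illustration), and `excludedFolders` is just a convenience variable, not something from the original code.

```matlab
% Hypothetical stand-in for the struct array returned by dir
MyFileList = struct( ...
    'name',   {'IMG_0001.ARW', 'WPSettings.dat', 'store.db'}, ...
    'folder', {'F:\DCIM\100MSDCF', 'F:\System Volume Information', 'F:\.Spotlight-V100\Store-V2'});

% Folder-name fragments to exclude; the third entry is the new condition
excludedFolders = {'.Spotlight-V100', '.Trashes', 'System Volume Information'};

% contains returns true where the folder path includes ANY of the fragments
MyFileList(contains({MyFileList.folder}, excludedFolders)) = [];

remainingNames = {MyFileList.name};
```

Keeping the fragments in one cell array means a new vendor or OS folder only needs one more entry in the list.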
Matt O'Brien on 12 Sep 2022
Edited: Matt O'Brien on 12 Sep 2022
Agreed and will do so after I test the extra code added (one line hopefully). I need to check if the reference to 'System Volume Information' appears in the file or folder cells. I may also need to set up the test data script scenarios to make sure I trigger this case, as I am surprised I did not discover this situation previously.
I would have thought that 'System Volume Information' would be marked as System or Hidden ?
Matt O'Brien on 12 Sep 2022
Maybe. I am dealing with 2 diff apps and 2 diff datasets. I need to reconcile both. I think the folder was caught but not the two files which were in the folder. It is too late tonight. Will pick up in the morning.
Perhaps you want to remove any file that has a directory element that is System or Hidden?
And on the Mac / Linux side, perhaps you want to remove any file that has a directory element that starts with period ?
For example, if there was a /Users/obrien/.matlab/license/logo.png then does the .matlab disqualify the file from consideration?
Perhaps for Mac and Linux, an option with several states:
  • skip all dot directories
  • skip .Trashes and .Spotlight-V100 directories
  • examine all directories
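One way the three states could look in code, as a rough sketch only: the policy names ('alldot', 'known', 'none') and the sample folders are made up for illustration, and the regular expression is one possible reading of "directory element that starts with a period".

```matlab
% Hypothetical sample folders (not taken from a real listing)
folders = {'/Users/obrien/.matlab/license', ...
           '/Volumes/SD/.Trashes/501', ...
           '/Volumes/SD/DCIM/100MSDCF'};

policy = 'known';   % assumed option names: 'alldot' | 'known' | 'none'

switch policy
    case 'alldot'
        % Any path element that begins with a period, ignoring '.' and '..'
        dotElement = '(^|[\\/])\.[^\\/.]';
        skip = ~cellfun(@isempty, regexp(folders, dotElement, 'once'));
    case 'known'
        % Only the two named macOS housekeeping directories
        skip = contains(folders, {'.Trashes', '.Spotlight-V100'});
    otherwise
        % Examine all directories
        skip = false(size(folders));
end

kept = folders(~skip);
```

With 'known' the .matlab folder survives; switching to 'alldot' would drop it as well.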
Bottom line... testing for System or Hidden will trap the System Volume Information folder, but not the two files contained therein.
See attached screen grab.
This is just to show you the underlying data.
To assist my testing, I created a utility which stores the status of all the key fields per entry from the Dir command in an Excel spreadsheet. In this case, it is a listing for an SD card (F:\). I populated the various fields using a loop, as follows. This logic and code here is not related to the code above, but is in a separate utility app I built to troubleshoot these various scenarios.
Name, Folder, Date, Bytes, IsDir and DateNum are directly from the Dir command. FullPath is assembled to allow me to extract the fileparts and attributes.
FileName and Ext are from fileparts.
Hidden and System are from the attributes.
. and .. entries are removed from the Dir list.
Spotlight, Trashes, FirstDot and SysVol are populated via the following code. I have tested all of these filters individually and they correctly identify the various scenarios.
if contains(d_folder,'.Spotlight-V100')
    MySpotLight(i,1) = 1;
end
if contains(d_name,'.Spotlight-V100')
    MySpotLight(i,1) = 1;
end
if contains(d_folder,'.Trashes')
    MyTrash(i,1) = 1;
end
if contains(d_folder,'System Volume Information')
    MySysVol(i,1) = 1;
end
if startsWith(d_name,'.')
    MyFirstDot(i,1) = 1;
end
It is clear from the screen grab (attached) that the 'System Volume Information' folder is both a System and Hidden folder, but IndexerVolumeGuid & WPSettings.dat files are not flagged as either System or Hidden.
My assumption is that Windows Explorer will hide the files in a hidden folder but the MatLab Dir does not.
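If that assumption holds, one workaround is to treat a file as unwanted whenever any folder above it is flagged Hidden or System. The sketch below uses invented sample data in place of a real listing: `paths`, `isDir` and `isFlagged` are hypothetical variables that would in practice be filled from dir and fileattrib.

```matlab
% Hypothetical listing of an SD card (F:\), one row per dir entry
paths = {'F:\DCIM'; ...
         'F:\DCIM\100MSDCF\IMG_0001.ARW'; ...
         'F:\System Volume Information'; ...
         'F:\System Volume Information\WPSettings.dat'; ...
         'F:\System Volume Information\IndexerVolumeGuid'};
isDir     = logical([1; 0; 1; 0; 0]);   % would come from the dir isdir field
isFlagged = logical([0; 0; 1; 0; 0]);   % Hidden or System, per fileattrib

% Flagged folders, with a trailing separator so that a sibling such as
% 'F:\System Volume InformationX' would not match by accident.
% (Sample data is Windows-style; real code would use filesep.)
flaggedDirs = strcat(paths(isDir & isFlagged), '\');

% A file inherits the flag if its path starts with any flagged folder
inFlagged = false(size(paths));
if ~isempty(flaggedDirs)
    inFlagged = startsWith(paths, flaggedDirs);
end

keep = ~isDir & ~isFlagged & ~inFlagged;
keptFiles = paths(keep);
```

Here only the image under DCIM is kept; the two files inside System Volume Information are dropped even though they carry no flag themselves.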
I continue to be gobsmacked with the level of effort needed to get a simple dir listing of all the files and folders on a drive or folder (incl subfolders). I think there is a strong case to be made for MatDir to create a new Dir listing function which reveals all the attributes possible for Windows, Mac and Linux as a single line command. This will reduce errors and assist with making Matlab easier to port between Mac,Windows and Linux.
My long term solution is to build my own dir function.
I am posting this info in case anyone else runs into these conditions. In the next week or so, I will revisit my original app and add the code for the System Volume Folder and its contents (ie after I have tested that it works) and mark it as the final solution.
Dir will probably work well if one is looking for a specific type of file eg *.jpg. That is not an option for me as SD cards can contain Raw files from a large number of makers, each with their own extension. New raw file extensions can appear at any time. The situation for stills images is relatively simple as they are stored within a DCIM folder structure, but dealing with video files is exponentially more complicated.
I am going to this level of effort, because I need to be 100% certain I catch all stills images and video clips on any SD card presented to my app. So far, I am only working on stills. The good news is this has been an excellent app for me to develop my MatLab skills and I have the confidence to deal with Phase 2 (ie Video) in due course.
I continue to be gobsmacked with the level of effort needed to get a simple dir
listing of all the files and folders on a drive or folder (incl subfolders).
dinfo = dir('C:\**\*');
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
That's it. full_filenames is now a list of all the files and folders on the drive.
You can then filter
[~,stats] = fileattrib(full_filenames);
is_unwanted = [stats.hidden]==1 | [stats.system]==1;
dinfo(is_unwanted) = [];
full_filenames(is_unwanted) = [];
Matt O'Brien on 13 Sep 2022
Edited: Matt O'Brien on 13 Sep 2022
This generates an error ....
Error using fileattrib
Argument must be a text scalar.
Error in Test (line 6)
[~,stats] = fileattrib(full_filenames);
I cannot debug now, will revisit this evening. Have to travel for an appointment.
I suspect fileattrib needs an actual file name (char or string) but may not work with an array of filenames. If so, I will need to loop.
dinfo = dir('C:\**\*');
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
%filtering
[~,stats] = cellfun(@fileattrib, full_filenames);
is_unwanted = [stats.hidden]==1 | [stats.system]==1;
dinfo(is_unwanted) = [];
full_filenames(is_unwanted) = [];
(On Mac, fileattrib is happy to work on the cell array when I test)
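Pulling the pieces of this thread together, here is a hedged end-to-end sketch rather than a definitive implementation. It builds a small throwaway folder tree (the file names are invented) so it can run on any platform, then applies the filters discussed above. It loops over fileattrib one file at a time, which sidesteps the "text scalar" error, and it relies on the hidden and system fields being NaN on Mac and Linux so that those tests are simply false there.

```matlab
% Build a small throwaway tree so the sketch can run anywhere
rootpath = tempname;
mkdir(fullfile(rootpath, 'DCIM'));
mkdir(fullfile(rootpath, '.Trashes'));
testFiles = {fullfile('DCIM', 'IMG_0001.ARW'), 'notes.txt', ...
             '.DS_Store', fullfile('.Trashes', 'old.jpg')};
for k = 1:numel(testFiles)
    fid = fopen(fullfile(rootpath, testFiles{k}), 'w');
    fclose(fid);
end

% Recursive listing (the ** wildcard needs R2016b or later)
dinfo = dir(fullfile(rootpath, '**', '*'));

% 1. Drop directories, which also removes the '.' and '..' entries
dinfo = dinfo(~[dinfo.isdir]);

% 2. Drop files under the named housekeeping folders
excludedFolders = {'.Spotlight-V100', '.Trashes', 'System Volume Information'};
dinfo = dinfo(~contains({dinfo.folder}, excludedFolders));

% 3. Drop files whose name starts with a full stop
dinfo = dinfo(~startsWith({dinfo.name}, '.'));

% 4. Drop Hidden or System files, querying one file at a time
isUnwanted = false(size(dinfo));
for k = 1:numel(dinfo)
    [ok, stats] = fileattrib(fullfile(dinfo(k).folder, dinfo(k).name));
    if ok
        isUnwanted(k) = isequal(stats.hidden, 1) || isequal(stats.system, 1);
    end
end
dinfo(isUnwanted) = [];

full_filenames = sort(fullfile({dinfo.folder}, {dinfo.name}));

rmdir(rootpath, 's');   % clean up the throwaway tree
```

For a real card, rootpath would be the drive or folder chosen by the user, and the setup and cleanup lines would be removed.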


Matt O'Brien on 13 Sep 2022

Thanks. Just home, so will pick up in the morning. The documentation on fileattrib seems to indicate char or string but not an array. I can explore the use of the @ (function handle) feature.

Version

R2021b
