Getting a List of Files.... Should Be Easy But
I have a GUI app 99% complete, but have spent several days trying to resolve what, to me, should be an extremely simple task.
Namely,
I need to get a list of all files from a given folder name down (including all subfolders).
The list should exclude:
- Directories.
- System Files
- Hidden Files
- Files which start with a full stop (ie Mac hidden files)
- Files which have ".Spotlight-V100" as part of the folder path.
- Files which have ".Trashes" as part of the folder path.
Questions.
- Is there a MATLAB command to do this?
- Or is there an elegant routine to do this?
- Or is there a MATLAB plug-in which will do this?
I have created a MATLAB script to do this on Windows, but it is approximately 90 lines of code. I hate to think what I might have to do to get this working on a Mac as well.
I will be using this regularly, so wish to make this as elegant/efficient as possible.
4 comments
Walter Roberson
on 30 Aug 2022
The list should exclude:
Directories.
System Files
Hidden Files
Files which start with a full stop (ie Mac hidden files)
Files which have ".Spotlight-V100" as part of the folder path.
Files which have ".Trashes" as part of the folder path.
None of that is difficult except for the "System Files" part. MacOS does not have any file attribute for system files, and determining whether something in MacOS depends on a file could be a lot of work.
Matt O'Brien
on 30 Aug 2022
Accepted Answer
More Answers (9)
- Directly as one command, no.
- Probably not extant, no.
- Certainly not specifically, no.
I "know nuthink!" of Mac OS ls equivalent, but doesn't really seem as though it should be particularly difficult -- certainly can't see why it would take some 90 lines of code.
rootpath='YourRootPath';
[~,d]=system(['dir /A:-H-D-S /S /B "' rootpath '"']); % quote the path so folder names with spaces work
d=string(split(d,newline));
d=d(strlength(d)>0);
will give you a list of all files that are not hidden/directories/system files for the rootpath folder and all subfolders in a list on Windows using the default CMD command shell. Mac surely has something equivalent.
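For the Mac side, one possible counterpart is the standard find utility called through system(). This is only a sketch, assuming macOS or Linux with find on the path; tempdir is just a stand-in root folder, and the three exclusions are the ones from the question.

```matlab
% Sketch only: assumes macOS or Linux, where the find utility is available.
rootpath = tempdir;   % stand-in root folder, replace with your own

% -type f                        : regular files only, no directories
% -not -name ".*"                : skip files whose name starts with a full stop
% -not -path "*/.Trashes/*" etc. : skip anything below the unwanted folders
cmd = sprintf(['find "%s" -type f -not -name ".*" ' ...
               '-not -path "*/.Spotlight-V100/*" ' ...
               '-not -path "*/.Trashes/*" 2>/dev/null'], rootpath);
[status, out] = system(cmd);

% One full path per line, same post-processing as for the Windows dir output
d = string(split(strtrim(out), newline));
d = d(strlength(d) > 0);
```

A nonzero status does not necessarily mean the list is unusable; find also reports it when it could not read some subfolder.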
A few well-chosen filters applied to this list to remove the not-wanted entries should be pretty simple to code; a regexp guru might be of some help there; that wouldn't be me, however... :)
I've always wondered why TMW didn't just support the basic command-line switches of the native OS in its incarnation of dir (and wished it had) -- having it neutered as it is to "plain vanilla" is a real pain.
ADDENDUM:
Thinking about the exclude list led me to think it's not that hard, either...with the caveat that you have had the discipline not to use one of the excluded path names in a file name inside a directory that is not on the excluded list.
excludeList=[".Spotlight-V100"; ".Trashes"]; % filename content to exclude
d=d(~contains(d,excludeList)); % get rid of 'em...
I guess even that caveat could be handled by using
d=d(~contains(fileparts(d),excludeList)); % exclude unwanted folders only
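To make the difference between the two filters concrete, here is a small sketch on three made-up path names (all hypothetical). It uses cellfun over fileparts rather than passing the string array directly, on the assumption that this also works on releases where fileparts does not accept string arrays.

```matlab
% Hypothetical file list; the last name merely contains ".Trashes"
d = ["/data/a.txt"; "/data/.Trashes/b.txt"; "/data/notes.Trashes.txt"];
excludeList = [".Spotlight-V100"; ".Trashes"];

% Filter on the whole path: also drops the innocent notes.Trashes.txt
keptByFullPath = d(~contains(d, excludeList));

% Filter on the folder part only: keeps notes.Trashes.txt
folders = cellfun(@fileparts, cellstr(d), 'UniformOutput', false);
keptByFolder = d(~contains(folders, excludeList));
```

Only the folder-based variant keeps /data/notes.Trashes.txt, which is the case the caveat above is about.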
I dunno how you would handle @Walter Roberson's comment re: Mac and OS files -- although I'd hope you aren't putting your data where the OS stores its files so it wouldn't be an issue, anyway.
2 comments
ls('-ld', tempdir)
You can see that on macOS and Linux, basic command line switches for MATLAB ls() are supported.
dpb
on 30 Aug 2022
Yeah, but not for Winwoes -- nor does dir for either which was my specific complaint.
Walter Roberson
on 30 Aug 2022
query_folder = tempdir; %set as appropriate, tempdir is just used for example purposes
dinfo = dir( fullfile(query_folder, '**', '*') );
dinfo([dinfo.isdir]) = []; %exclude directories
dinfo( startsWith({dinfo.name}, '.') ) = []; %exclude hidden files, which is same as . files on MacOS
dinfo( contains({dinfo.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
Unless, that is, when you refer to "hidden files", you refer to things such as ~/Library . If so then you would need to use ls -@ to query looking for the extended attribute com.apple.FinderInfo 32 or system() out to xattr looking for com.apple.FinderInfo
Well, except for the fact that if you add a color tag to a file, then the com.apple.FinderInfo attribute with value 32 gets added to the file, and com.apple.metadata:_kMDItemUserTags gets added as well. If you then remove the color from the file, com.apple.FinderInfo gets removed but the com.apple.metadata:_kMDItemUserTags attribute gets left behind. So to determine whether a file is hidden, you need to check that com.apple.FinderInfo is present with value 32 and that com.apple.metadata:_kMDItemUserTags is not present...
2 comments
Matt O'Brien
on 31 Aug 2022
Matt O'Brien
on 31 Aug 2022
Edited: Matt O'Brien
on 31 Aug 2022
Matt O'Brien
on 30 Aug 2022
11 comments
Walter Roberson
on 30 Aug 2022
fileattrib can tell you whether a MS Windows file has Hidden or System set.
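As a rough illustration of that check on a single file, the sketch below creates a throwaway file in the temp folder (hypothetical, just so the example runs anywhere) and reads its attributes. The fields hidden and system are only meaningful on Windows; on macOS and Linux they come back as NaN.

```matlab
% Create a throwaway file so the example is self-contained
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fclose(fid);

% Second output is a struct of attributes when the call succeeds
[ok, attrs] = fileattrib(fname);
if ok
    % isequal(NaN, 1) is false, so macOS and Linux need no special case
    isHiddenOrSystem = isequal(attrs.hidden, 1) || isequal(attrs.system, 1);
else
    isHiddenOrSystem = false;   % attributes could not be read
end

delete(fname);   % clean up the throwaway file
```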
Matt O'Brien
on 30 Aug 2022
Edited: Matt O'Brien
on 30 Aug 2022
"I cannot get the split of the datastream from the system(dir) method to work."
Which OS are we speaking of here, Windows I presume since dir isn't a UNIX-like syntax?
I've never seen a system that doesn't return the list as the character string that looks just like the command window output -- which is a newline after each. Attach a .mat file of your result along with the command line that created it.
Of course, under Windows, dir won't return hidden or system files anyways, so just dir with the magic incantation of the extra **\* to traverse subdirectories returns the same list, just in the struct array instead of a list of names.
Matt O'Brien
on 30 Aug 2022
dpb
on 31 Aug 2022
No problem; just surprised you're having an issue with this; I've done this kind of exact thing for 30+ years and never had a failure to parse...can't imagine what could be that would change that behavior.
I presume if you just do
!dir
at the command window prompt you see the normal OS directory listing as normal???
Matt O'Brien
on 31 Aug 2022
dpb
on 31 Aug 2022
Oh, bleech! I can't even figure out how to get a useful directory listing at the command prompt with that abomination. I suppose maybe you're also on Win11, too...
Matt O'Brien
on 31 Aug 2022
Matt O'Brien
on 31 Aug 2022
dpb
on 31 Aug 2022
I've not delved into how to control it (although I may have done and just forgotten it), but when I "bang" to the OS under MATLAB here on Win10, it still uses CMD.EXE. system is builtin, so I can't see what it actually does; I presume, however, that it uses a start command to spawn a new CMD.EXE process, passing it the rest of the command as parameters.
I deduce that because personally on Windows I use the JPSoftware replacement command processor instead of the MS-supplied CMD, and even if Windows is configured to use it as default, MATLAB still uses CMD.EXE, not the system default. I've always thought it very rude of MathWorks to have done that and not use the system default, so the user could have their toolsets at hand if they wish. With the TakeCommand processor from JPSoft, one could add the various exclusions into its enhanced DIR and do virtually all the culling before returning the list. That, however, doesn't help for Mac nor for those who don't use it, of course.
But, looks as though you've basically got the problem solved with Walter's esteemed help so I'll retire from the field here unless you have something else specific along this line you care to pursue.
Good luck!!!
Matt O'Brien
on 31 Aug 2022
Matt O'Brien
on 30 Aug 2022
Edited: Matt O'Brien
on 30 Aug 2022
Matt O'Brien
on 31 Aug 2022
2 comments
Walter Roberson
on 31 Aug 2022
We already know the names cannot be folders, so there is no point testing that.
The below code will work on MacOS and Linux as well -- those will return NaN for the hidden and system attributes, but by specifically testing == 1, both NaN and 0 are treated as false, so NaN does not need to be special cased.
No loop is needed.
tmpFullNames = fullfile( {MyFileList.folder}, {MyFileList.name} ); %full path of every remaining file
[~,stats] = fileattrib(tmpFullNames);
isBadFile = [stats.hidden]==1 | [stats.system]==1; %NaN==1 is false, so no special case is needed
MyFileList(isBadFile) = [];
Matt O'Brien
on 31 Aug 2022
Matt O'Brien
on 31 Aug 2022
Edited: Matt O'Brien
on 5 Sep 2022
11 comments
Walter Roberson
on 31 Aug 2022
Edited: Walter Roberson
on 31 Aug 2022
tmpFullNames = fullfile( {MyFileList.folder}, {MyFileList.name});
isBadFile = [stats.hidden]==1 | [stats.system]==1;
Matt O'Brien
on 31 Aug 2022
Walter Roberson
on 31 Aug 2022
[~,stats] = cellfun(@fileattrib, tmpFullNames);
Matt O'Brien
on 31 Aug 2022
dpb
on 31 Aug 2022
Well, then myFileList better be an array of 71 elements as well.
I've not tried to patch together all the myriad partial code postings to see; maybe you have a change in spelling? What does
whos myFileList
return at that point?
Do you know how to set a breakpoint and use the debugger to step through code and find/fix logic errors? You can stop at the last working point and poke around with the debugger to figure out any syntax you are uncertain of...
Matt O'Brien
on 31 Aug 2022
Matt O'Brien
on 31 Aug 2022
Matt O'Brien
on 31 Aug 2022
Edited: Matt O'Brien
on 31 Aug 2022
dpb
on 1 Sep 2022
I thought the MATLAB dir() function did not return hidden/system files, but I see that it does...more reason for TMW to have included switches to support the OS. I suppose I never got "bit" because I just "never" have either in working data directories, and I haven't ever tried to process the output files of a device such as you describe that does such things to its file structure.
I can virtually always manage to write a search explicit enough with wildcards to retrieve only those files of interest; usually that also includes an extension of the data type, which will remove all the unwanted directory entries, etc., automagically as well.
Anyways, glad Walter got you sorted...and hope I'm long since departed from the scene before ever having to deal with powershell -- it seems grossly overly verbose and complex -- TakeCommand has all it has and more in much simpler form while retaining compatibility with CMD.
Matt O'Brien
on 1 Sep 2022
Edited: Matt O'Brien
on 1 Sep 2022
dpb
on 1 Sep 2022
Interesting. I've never been "bit" by the camera bug -- I've bought several with the intent over the years, but they all end up just sitting on the shelf gathering dust, so I've never poked around with any.
I do think this thread is another "shot across the bow" that TMW should strengthen the builtin functionality of dir() to support the underlying OS switches.
I've not tried the dir route via expressly "banging" to CMD.EXE with the command string to avoid powershell -- going that route has the benefit of returning the FQNs as a list without having to construct them from the struct returned by dir(), as well as having the various attribute screening done first instead of later.
Anyways, looks as though you've basically got it sorted -- the other vendor idiosyncrasies are likely just going to be additional specific strings to add to the exclusion list. Unfortunately, one can imagine that list may continue to grow as new models/features are introduced...
Matt O'Brien
on 1 Sep 2022
Matt O'Brien
on 12 Sep 2022
11 comments
dpb
on 12 Sep 2022
@Matt O'Brien -- Given the multiple places in this thread where what appears to be the final code ended up, and that the above is more apropos as a comment than an Answer, I'd suggest taking your final function and posting it in its entirety as a new Answer; then put your "Accept" stamp on it...this comment/Answer and several others could then probably be dispensed with as not being of much benefit to save...
$0.02, imo, ymmv, etc., etc., ...
Matt O'Brien
on 12 Sep 2022
Edited: Matt O'Brien
on 12 Sep 2022
Walter Roberson
on 12 Sep 2022
You would add that to the list in the command
MyFileList( contains({MyFileList.folder}, {'.Spotlight-V100', '.Trashes'}) ) = []; %exclude those particular directories
Matt O'Brien
on 12 Sep 2022
Edited: Matt O'Brien
on 12 Sep 2022
Walter Roberson
on 12 Sep 2022
I would have thought that 'System Volume Information' would be marked as System or Hidden ?
Matt O'Brien
on 12 Sep 2022
Walter Roberson
on 13 Sep 2022
Perhaps you want to remove any file that has a directory element that is System or Hidden?
And on the Mac / Linux side, perhaps you want to remove any file that has a directory element that starts with period ?
For example, if there was a /Users/obrien/.matlab/license/logo.png then does the .matlab disqualify the file from consideration?
Perhaps for Mac and Linux, an option with several states:
- skip all dot directories
- skip .Trashes and .Spotlight-V100 directories
- examine all directories
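A sketch of how such a three-state option might look is below. The option names ('skipAllDot', 'skipKnown', 'examineAll') and the two /Volumes/CARD folders are made up for illustration; only the .matlab path comes from the example above. The paths are split on '/' because the sample paths are hard-coded that way; with real dir() output you would presumably split on filesep.

```matlab
% Folder part of each candidate file (as found in the .folder field of dir output)
folders = {'/Users/obrien/.matlab/license', ...
           '/Volumes/CARD/.Trashes/501', ...
           '/Volumes/CARD/DCIM/100MSDCF'};

policy = 'skipAllDot';   % hypothetical names: 'skipAllDot' | 'skipKnown' | 'examineAll'

switch policy
    case 'skipAllDot'
        % Drop a file if ANY element of its folder path starts with a period
        parts = cellfun(@(f) split(f, '/'), folders, 'UniformOutput', false);
        keep  = ~cellfun(@(p) any(startsWith(p, '.')), parts);
    case 'skipKnown'
        % Drop only the two well-known macOS housekeeping folders
        keep = ~contains(folders, {'.Spotlight-V100', '.Trashes'});
    case 'examineAll'
        keep = true(size(folders));
    otherwise
        error('Unknown policy: %s', policy);
end

folders = folders(keep);
```

With 'skipAllDot' the .matlab folder is disqualified along with .Trashes; with 'skipKnown' it survives.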
Matt O'Brien
on 13 Sep 2022
Walter Roberson
on 13 Sep 2022
"I continue to be gobsmacked with the level of effort needed to get a simple dir listing of all the files and folders on a drive or folder (incl subfolders)."
dinfo = dir('C:\**\*');
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
That's it. full_filenames is now a list of all the files and folders on the drive.
You can then filter
[~,stats] = fileattrib(full_filenames);
is_unwanted = [stats.hidden]==1 | [stats.system]==1;
dinfo(is_unwanted) = [];
full_filenames(is_unwanted) = [];
Matt O'Brien
on 13 Sep 2022
Edited: Matt O'Brien
on 13 Sep 2022
Walter Roberson
on 13 Sep 2022
dinfo = dir('C:\**\*');
full_filenames = fullfile({dinfo.folder}, {dinfo.name});
%filtering
[~,stats] = cellfun(@fileattrib, full_filenames);
is_unwanted = [stats.hidden]==1 | [stats.system]==1;
dinfo(is_unwanted) = [];
full_filenames(is_unwanted) = [];
(On Mac, fileattrib is happy to work on the cell array when I test)
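Pulling the pieces of this thread together, a version that sidesteps the cell-array question by calling fileattrib one file at a time might look like the sketch below. It is not anyone's final code from the thread, just an assumption about how the steps combine; rootFolder is a placeholder and tempdir is used only so the example runs.

```matlab
rootFolder = tempdir;   % placeholder, replace with the folder you want to scan

% Recursive listing; the '**' wildcard needs R2016b or later
dinfo = dir(fullfile(rootFolder, '**', '*'));
dinfo([dinfo.isdir]) = [];                       % exclude directories
dinfo(startsWith({dinfo.name}, '.')) = [];       % exclude names starting with a full stop
dinfo(contains({dinfo.folder}, {'.Spotlight-V100', '.Trashes'})) = [];  % exclude those folders

files = {};
if ~isempty(dinfo)
    files = fullfile({dinfo.folder}, {dinfo.name});
    isUnwanted = false(size(files));
    for k = 1:numel(files)
        [ok, attrs] = fileattrib(files{k});
        % hidden/system are NaN on macOS and Linux, so isequal(...,1) is false there;
        % if the attributes cannot be read (ok is false) the file is kept
        isUnwanted(k) = ok && (isequal(attrs.hidden, 1) || isequal(attrs.system, 1));
    end
    files(isUnwanted) = [];
end
```

The loop is slower than a vectorized call, but it behaves the same way on Windows, macOS and Linux.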
Matt O'Brien
on 13 Sep 2022
