Shadowing built-in functions
It is a well-known problem from the forums: the name of a built-in function is used for a variable, and later, perhaps in another script, the behavior becomes unexpected:
sum = 1:10;
...
sum(1:5) % replies [1,2,3,4,5] instead of 15
If a function was used before, MLint shows a warning (at least in my Matlab version):
clear('sum');
a = sum(1:5);
sum = 1:10;
This happens for common names like: max, min, sum, i, j, line, text, input, ...
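A minimal sketch of the effect described above, written as a script. It assumes nothing beyond base MATLAB; the variable names viaVariable, viaBuiltin and isShadowing are made up for illustration. It shows that the function is still reachable through builtin while the variable exists, and that exist can tell you a variable of that name is in the way.

```matlab
sum = 1:10;                          % the variable now shadows the function

viaVariable = sum(1:5);              % indexing into the variable: [1 2 3 4 5]
viaBuiltin  = builtin('sum', 1:5);   % bypasses the variable: 15

% exist(..., 'var') returns 1 if a variable of that name is in the workspace
isShadowing = exist('sum', 'var') == 1;

clear sum                            % remove the variable again
```

builtin only helps for true built-ins such as sum; it is not a general cure for shadowed M-file functions.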
It seems obvious to prevent these problems by applying the general rule:
- Never shadow built-in functions with variables!
But nobody can remember the names of all built-in functions, especially if they belong to toolboxes not installed on the user's computer. And is the name "xbuffer" really smarter or safer than "buffer"?!
We could create a tool that compares the names of all (not dynamically created) variables with all documented toolbox functions and shows a warning for every collision. But I'm not convinced that the time required to check the heap of warning messages is worth it just to avoid the rare (but ugly) real collisions.
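A rough sketch of what the core of such a tool might look like. It only checks a hand-written list of candidate names against what is visible on the current machine, so it cannot know about toolboxes that are not installed. The list candidates and the name zz_unlikely_name_q7 are hypothetical.

```matlab
% Candidate variable names to check (hypothetical list)
candidates = {'sum', 'buffer', 'zz_unlikely_name_q7'};

isTaken = false(size(candidates));
for k = 1:numel(candidates)
    name = candidates{k};
    % exist(name, 'builtin') is 5 for built-in functions.
    % exist(name, 'file') is 2, 3, 4 or 6 for files on the path
    % (M-file, MEX-file, Simulink model, P-code); 7 would be a folder.
    isTaken(k) = exist(name, 'builtin') == 5 || ...
                 any(exist(name, 'file') == [2 3 4 6]);
end

% Names that collide with something on this installation
disp(candidates(isTaken))
```

Whether 'buffer' is reported depends on the installed toolboxes, which is exactly the limitation raised above.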
Which strategy do you use or suggest to reduce naming collisions?
8 comments
Jan
on 15 Oct 2013
@Jan - As I said in the other thread, I know the built-in buffer, and on my side I decided that I was fine with shadowing it, essentially because I find that giving a built-in such a common name was not appropriate (and why not name a built-in x if we go this way? [see note 1]). It also seems that whichever toolbox claims a name first gets it, even if the name could legitimately be used by other toolboxes (buffer would have been a good name for a tool from the Mapping toolbox, for example). Yet I think you were right to suggest (in the other thread) that I shouldn't use buffer in my answers on the forum and should try to stick to best practice.
But more generally, I think that toolboxes should come as e.g. objects or packages, which would make all naming more consistent. As mentioned below, I'd love to be able to "mount" libs as packages instead of by adding them to the path. If you downloaded a FEX submission, e.g. CedricJunkLib, which includes 200 M-files with a lot of quite common function names, wouldn't that be great to be able to do
>> sandbox = addpathpack( '/home/jan/matlab/libs/CedricJunkLib' ) ;
>> r = sandbox.regress( ... ) ;
instead of adding the whole thing to the path with ADDPATH?
Note 1: I propose the following definition for built-in x:
x = @(x) fprintf('%s\n', cell2mat(regexp(urlread('http://fr.wikipedia.org/wiki/Homard'), 'Le homard se distingue.*?euse.', 'match'))) ;
This way, we can easily get a sentence in French about lobsters, by evaluating e.g.
>> x(1)
Le homard se distingue facilement de la langouste par la présence de pinces imposantes et par une carapace moins épineuse.
dpb
on 16 Oct 2013
oooh! That's ugly! :( I had not discovered the Toolbox buffer yet having not done any serious signal processing with the newer release. That is indeed, just evil of TMW to have done that. :(
Stephen Becker
on 9 Jan 2014
This isn't a proposed solution, but I wanted to mention that there are very non-obvious functions like "sigma" (from sigma.m in the control toolbox) and "beta" (from beta.m in special functions) that create the problems. These are natural variable names but if you use them in a script and don't initialize them, chaos results!
Some other really bad names: alpha (from 3D graphs), mu (from robust toolbox). Moral of the story: unfortunately, it's not completely safe to use any Greek letters as variables.
"... that the name of a built-in function is used for a variable and later, perhaps in another script, the behavior becomes unexpected:"
@Jan,
I assume you really do mean scripts and not mfile functions. I've never known it to occur that a function name was shadowed by a variable name (inside an mfunction workspace) once the variable went out of scope.
Under this assumption, I don't really consider shadowing function names with variable names to be a big problem. I think it's not unreasonable to ask users to keep track of the contents of their workspace and the meanings they assign to names there. If someone uses a variable called "max", it's fine as long as they are committed to that usage for the duration of the variable's lifetime. If you can't keep track, you're writing functions that are too long. In any case, there is MLint to catch you.
To me, the bigger danger is in shadowing builtin MATLAB functions with other user-written functions. In that case, you change the behavior of all builtin mfunctions trying to call the native versions. There are now warnings to alert you to that too, but I remember a time when there weren't.
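As a small illustration of how to see function-over-function shadowing, the functional form of which with '-all' returns every entry with a given name; as I understand it, the first entry is the one MATLAB would call. The variable name hits is arbitrary.

```matlab
% All entries named "sum" that MATLAB can see, in precedence order
hits = which('sum', '-all');

fprintf('%d entries named "sum" found\n', numel(hits));
fprintf('  %s\n', hits{:});

% If a user-written sum.m appeared above the built-in entry in this list,
% every caller of sum would silently get the user version.
```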
Jan
on 22 Feb 2014
Accepted answer
More answers (7)
Jan
on 15 Oct 2013
John Barber
on 16 Oct 2013
3 votes
I have recently started to use packages, although their utility is significantly reduced by the fact that packages do not import their own namespace. If they did, you could trivially move a toolbox or library into a package for namespace management purposes. As is, converting a directory to a package requires editing all of the m-files to add the package name to internal function calls and class references.
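To make the limitation concrete, here is a self-contained sketch that builds a throwaway package in a temporary folder. Everything in it (the package name sandbox, the functions sum and total) is hypothetical. Note how total has to spell out sandbox.sum to reach its own sibling, which is the editing burden described above.

```matlab
root   = tempname;                     % temporary parent folder
pkgDir = fullfile(root, '+sandbox');   % "+" marks a package folder
mkdir(pkgDir);

% sandbox.sum: same name as the built-in, reachable only via the prefix
fid = fopen(fullfile(pkgDir, 'sum.m'), 'w');
fprintf(fid, 'function s = sum(x)\n');
fprintf(fid, 's = 2 * builtin(''sum'', x);\n');
fprintf(fid, 'end\n');
fclose(fid);

% sandbox.total: must use the package prefix to call its sibling
fid = fopen(fullfile(pkgDir, 'total.m'), 'w');
fprintf(fid, 'function t = total(x)\n');
fprintf(fid, 't = sandbox.sum(x);\n');
fprintf(fid, 'end\n');
fclose(fid);

addpath(root);                         % add the PARENT of +sandbox
plainResult   = sum(1:5);              % built-in sum: 15
packageResult = sandbox.sum(1:5);      % package version: 30
siblingResult = sandbox.total(1:5);    % goes through sandbox.sum: 30
rmpath(root);
```

The built-in sum stays untouched, so the package gives a clean namespace at the cost of the explicit prefix inside the package itself.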
I think that the following features could be useful if added to MATLAB:
1. An optional MLint warning when a variable name shadows a function or class name
2. An optional environment setting to throw a warning when a variable is created that shadows a function or class name
Obviously, these warnings should be optional and disabled by default because they are a departure from long-established MATLAB behavior. Also, I'm not sure if (2) would affect performance by slowing down variable creation.
As a sidebar, the variable/function shadowing issue seems to have influenced some of TMW's internal design decisions: http://www.mathworks.com/matlabcentral/newsreader/view_thread/171344#439111
1 comment
Matt Cooper
on 7 Nov 2022
This doesn't resolve all of the issues raised, but for readers who stumble upon this, as of R2020b (and I think R2019b), you can at least add an import wildcard statement at the beginning of each function. This means you can avoid adding the package prefix to every function call within each function. It is pretty trivial to do this using fscanf, strrep, and fprintf, if you want to convert an existing project to a package. It is also trivial to take it one step further and use dir to read all the package function names, and append the package prefix to each function call using the same fscanf, strrep, fprintf workflow. Unfortunately, there remain many limitations to package namespaces (such as lack of tab completion to find package functions, even after a wildcard import, although function hints do work).
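A naive sketch of the strrep step, working on a string instead of a real file so that it is self-contained. The package name mylib and the functions outer and helper are hypothetical; in a real conversion the text would come from reading each M-file and the helper list from dir.

```matlab
pkg     = 'mylib';        % hypothetical package name
helpers = {'helper'};     % hypothetical list of package function names

% Stand-in for the text of one M-file in the package
src = sprintf('function y = outer(x)\ny = helper(x) + 1;\nend\n');

out = src;
for k = 1:numel(helpers)
    % Prefix calls such as "helper(" with "mylib."
    out = strrep(out, [helpers{k} '('], [pkg '.' helpers{k} '(']);
end

fprintf('%s', out);
```

Plain strrep also rewrites longer names that merely end in the same text (for example myhelper( would become mymylib.helper( ), so a real script would need a stricter match.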
Sean de Wolski
on 15 Oct 2013
which -all xbuffer
'xbuffer' not found.
We're in the clear! I try to use this all of the time.
6 comments
'which', of course, works as long as no additional toolboxen are loaded. :)
There's no guaranteed solution for anything other than the case of the local machine at the time of the check--including updates can introduce collisions into the previously clean system.
Jan
on 15 Oct 2013
Sean de Wolski
on 15 Oct 2013
@dpb: Full install.
@Jan: Of course! Though you would detect it when trying this anyway, unless someone has maliciously shadowed which but made the output look standard.
A new clean installation doesn't solve the problem of user code name collisions any more than does adding the toolbox with the name or the upgrade of base product.
It's simply unavoidable in general given the nature of the global namespace in Matlab and there being no separation in names by context (as in, say, Fortran where a function and a variable can have same name as they are distinguished owing to the syntax rules making such determinable).
ADDENDUM: Of course, the type of problem if any depends on what the collision is -- if one uses a variable name that is introduced as a new function in existing code it simply prevents that new function from being used. If otoh, there's a reference to a new function of the same name that's higher precedence than the existing one in the user code, then unless they just happen to have identical functionality the update breaks the existing code.
ADDENDUM 2:
And, of course, that's not to say it can't happen in Fortran or other languages; only that it's much less problematical. Fortran has added new intrinsics that can conflict w/ existing user-supplied ones as just one example.
dpb
on 16 Oct 2013
Well, given the design of Matlab I'd say all of what has been talked about in the thread is intended behavior. :)
It's just an unfortunate result of initial decisions and that Matlab has turned into something far beyond its original visions wherein it now is a lot more problematical than it was way back when.
dpb
on 15 Oct 2013
2 votes
I use the strategy of trying not to, then fixing it if I do... :)
Philip Borghesani
on 16 Oct 2013
Edited: Philip Borghesani
on 16 Oct 2013
2 votes
As others have pointed out here, avoiding shadowing is impossible, particularly because I have not yet met a programmer who is an oracle.
If a few simple rules are followed then shadowing should not be a problem.
- Never write scripts. I only find scripts useful for a few minutes of prototyping before turning the script into a function
- Keep functions short. Clean C++ programming presentations I have seen suggest 4 lines or less. I think this is a bit excessive but it is nearly impossible to shadow a function and try to use it in a short function. It would also be quite obvious.
- Did I say never write scripts?
It seems to me that most cases of shadowing causing confusion are with the function "sum" and novice users working from the command line. Keeping a tidy base workspace and never writing scripts is the best way around this issue.
Function naming and avoiding collisions is a bigger problem. One technique I have started employing is to use static methods on a class to group a set of related functions in a single file. This ends up looking like a package to the user but simplifies a bit on calling related functions and sharing with others. Local helper functions can be added after the classdef block for shared implementation. This sort of thing keeps the namespace much cleaner.
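A minimal sketch of that pattern, assuming a single file named GeoTools.m on the path. The class name GeoTools and its methods distance and midpoint are made up for illustration; localNorm is a local helper placed after the classdef block.

```matlab
% File: GeoTools.m (hypothetical name)
classdef GeoTools
    methods (Static)
        function d = distance(p, q)
            % Euclidean distance between two points given as vectors
            d = localNorm(p - q);
        end

        function m = midpoint(p, q)
            % Point halfway between p and q
            m = (p + q) / 2;
        end
    end
end

function n = localNorm(v)
% Local helper shared by the static methods, invisible outside this file
n = sqrt(sum(v .^ 2));
end
```

Callers write GeoTools.distance([0 0], [3 4]), which reads much like a package call, and only the one name GeoTools is added to the global namespace.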
W. Owen Brimijoin
on 10 Jan 2014
I've always taken a stupidly simple approach to this problem: I just type what I am planning to name the variable into the command window and hit enter. If Matlab reports something like
Error using sum
Not enough input arguments.
then I don't use 'sum' as a variable name. Too quick and dirty?
4 comments
Bjorn Gustavsson
on 10 Jan 2014
Works - but this approach doesn't (nothing ever will) solve the "function appearing in the future" problem either. I once had a function saveas.m that broke when the next release introduced a MathWorks function with that name - that was the moment I switched from having my personal directories at the beginning of the path to the end.
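For reference, a short sketch of appending a folder to the end of the search path rather than the beginning. The folder here is a temporary stand-in for a personal code directory, and the variable names are arbitrary.

```matlab
myDir = tempname;                 % stand-in for a personal code folder
mkdir(myDir);

addpath(myDir, '-end');           % append instead of the default prepend

entries = strsplit(path, pathsep);   % current search path, one folder per cell
isOnPath = any(strcmp(entries, myDir));

rmpath(myDir);                    % undo the change
rmdir(myDir);
```

With this ordering, a newly shipped function of the same name wins over the personal one, which is the trade-off described in the comment above.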
Sean de Wolski
on 10 Jan 2014
Interesting idea Bjorn for adding your paths to the end!
This still doesn't take care of the current directory being always at the top of the path.
Bjorn Gustavsson
on 10 Jan 2014
When I had my saveas.m ahead of the suddenly appearing MATLAB saveas.m, I could not save my work to ordinary .mat files; that MathWorks had "shipped a new version with such a horrendous bug" confused me for quite a while. When I understood what was going on, I realized that either MathWorks or I had to modify our code - and that I was likely the one who had to do it. After some further pondering I thought it was better to have MATLAB functions screw over my functions than the other way around.
Sean de Wolski
on 10 Jan 2014
The best one I saw was a shadowed assert(). The effect was similar, GUIDE would not let the files be saved.
Jos (10584)
on 15 Oct 2013
Edited: Jos (10584)
on 15 Oct 2013
0 votes
From the release notes of ML2015a:
- New built-in function xbuffer. So check all your M-files and change the variable xbuffer into something else.
;-)
1 comment
Jan
on 15 Oct 2013