Passing structs/objects to functions

broken_arrow on 16 Apr 2021
When applying functions to structs, I often pass the entire struct as an input argument even if not all the variables within it are needed by the function in order to keep the input argument list short. AFAIK functions are JIT compiled on their first execution. That makes me wonder: Does the compiler realize which parts of the struct are actually needed and pass only these to the function? Or is the entire struct loaded into the function even if just one element is modified (which could potentially cause enormous overhead)? I would guess it's the former since with objects, the entire object is usually passed to each method:
function obj = whatever(obj,otherArgument1,etc)
% function code
end
Is that correct?

Accepted Answer

Bruno Luong on 16 Apr 2021
Edited: Bruno Luong on 16 Apr 2021
"Does the compiler realize which parts of the struct are actually needed and pass only these to the function? "
MATLAB does not pass input values to functions the way C and C++ do; it passes the "address" of the input (structure), i.e. an mxArray pointer. So your question is not applicable and thus irrelevant.
5 Comments
Bruno Luong on 16 Apr 2021
Edited: Bruno Luong on 16 Apr 2021
"Or does it work like a path"
I don't know what you mean by "work like a path". Are you talking about nested structures?
"only the top level adress"
All MATLAB objects (of all classes) are encapsulated in a data structure called "mxArray"; only their addresses are passed into the function.
broken_arrow on 16 Apr 2021
Yes, I'd imagine that the address of the first level of a struct would contain the addresses of the second level and so on, so one can either pass the address of level 1 and let MATLAB do the work or pass the sublevel addresses directly. Anyway, I take it that it's fine to just pass the whole thing as input.

More Answers (2)

Steven Lord on 16 Apr 2021
Just because MATLAB passes a large array into a function doesn't mean it needs to make a copy of that array if you don't modify the input in the function:
tic
S = repmat(dir, 1000, 1000);
toc
Elapsed time is 0.590862 seconds.
whos S
  Name         Size                   Bytes  Class     Attributes

  S         2000x1000            1464000384  struct
tic
y = myfun(S);
toc
Elapsed time is 0.008938 seconds.
It took much less time to perform the operation in myfun than it did to create S, and if myfun were copying S you'd expect that time to be closer to the creation time. As a different example using a numeric array rather than a struct:
A = rand(4000, 4000);
tic
y = myfun2(A); % Does not modify A
toc
Elapsed time is 0.009651 seconds.
tic
y = myfun3(A); % Modifies A
toc
Elapsed time is 0.107321 seconds.
There's a limit to how big I can make A in a MATLAB Answers post, but try the code yourself in a desktop installation of MATLAB with a larger A if you want to see a larger difference in the times.
function y = myfun(S)
y = fieldnames(S); % S is not modified and so not copied
end
function y = myfun2(A)
y = A(42, 999);
end
function y = myfun3(A)
A = A + 1;
y = A(42, 999);
end

James Tursa on 19 Apr 2021
Edited: James Tursa on 19 Apr 2021
MATLAB typically passes shared data copies of input arguments to functions. That means creating a separate mxArray header for the variable and then sharing the data pointers. For cell arrays and structs, that means that only one top level data pointer is copied. The individual addresses of all of the cell and field variables (perhaps hundreds or thousands of these) are not copied ... they are part of the "shared data" that is pointed to by the one top level data pointer. Passing cell or struct variables to functions takes about the same amount of overhead as passing a regular numeric variable. I.e., it takes about the same amount of effort to pass a struct with thousands of field elements as it does to pass a small 2D numeric matrix.
If you subsequently modify one of those input arguments inside the function, then MATLAB needs to make a deep copy of the variable (or deep copy of the cell or field you are modifying) first, which will take additional time and memory.
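One rough way to check that the cost of passing a struct does not depend on its size is to time the same read-only call on a small and a large struct. This is only a sketch: the names smallS, bigS and readFlag are made up for illustration, the call goes through an anonymous function rather than a function file, and timeit results vary between machines and releases.

```matlab
% Hypothetical structs: same fields, very different payload sizes.
smallS = struct('flag', 1, 'payload', zeros(10, 10));
bigS   = struct('flag', 1, 'payload', zeros(3000, 3000));   % about 72 MB

% Read-only accessor: receives the whole struct but touches only one field.
readFlag = @(s) s.flag;

% timeit calls each handle several times and returns a typical time in seconds.
tSmall = timeit(@() readFlag(smallS));
tBig   = timeit(@() readFlag(bigS));

fprintf('small struct: %.2e s\n', tSmall);
fprintf('big struct:   %.2e s\n', tBig);
```

If the description above holds, the two timings should be of the same order of magnitude, since no payload data is copied in either call.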
3 Comments
Bruno Luong on 7 Mar 2022
Edited: Bruno Luong on 7 Mar 2022
@Trym Gabrielsen Interesting question. I don't know the answer, and I have been asking myself the same thing.
Short answer: I use a lot of structures that capture my program "state" and pass them as input and output of my functions. The functions either change fields or add new fields to a structure. Performance-wise, I have not noticed any significant speed penalty.
However, for speed-critical parts I try to use simpler variable types extracted from the structure before the loop, then assign the result back after the loop terminates. I avoid assigning the fields multiple times when it is not needed. This also makes my code more readable and easier to maintain, so I don't see a reason to do it differently.
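The pattern of pulling a field out before a loop and writing it back once afterwards could look like the following minimal sketch. The struct state and its fields total and count are hypothetical names chosen for illustration, not taken from any real program.

```matlab
% Hypothetical program state held in a struct.
state = struct('total', 0, 'count', 0);
n = 1e5;

% Extract the field into a plain local variable before the loop.
total = state.total;
for k = 1:n
    total = total + k;    % the loop works on the simple variable only
end

% Assign the fields once, after the loop has finished.
state.total = total;
state.count = n;

fprintf('total = %d, count = %d\n', state.total, state.count);
```

The loop body never indexes into the struct, so each field is written a single time regardless of the number of iterations.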
Long answer: the internal organization and data sharing are not documented, and TMW makes reverse engineering more and more difficult. I must admit I have not understood how it works for the last few releases. I just want to show how the different data pointers change internally in this short example, run on R2021b. It is up to you to draw a conclusion because, as I said, I don't completely understand how it works or why it works like that.
The code:
clear all
clc
foo
function foo
format debug;
s = struct('a', 'a', 'b', 'b');
fprintf('\nbefore calling function\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); s.a, PrintPtr(s.a); fprintf('s.b\n'); s.b, PrintPtr(s.b);
s = modifyb(s);
fprintf('\nafter calling function\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
bar
end
%%
function s = modifyb(s)
fprintf('\ninside function before modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
s.b = 'bb';
fprintf('\ninside function after modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
end
%%
function bar()
s = struct('a', 'a', 'b', 'b');
fprintf('\ninside function bar before modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
s.b = 'bb';
fprintf('\ninside function bar after modification\n'); fprintf('s\n'); PrintPtr(s); fprintf('s.a\n'); PrintPtr(s.a); fprintf('s.b\n'); PrintPtr(s.b);
end
Mex file PrintPtr.c
// PrintPtr.c
// Simple MEX gateway that prints the mxArray address and the data pointer
// of a MATLAB variable.
// Save as PrintPtr.c and compile in MATLAB with
//   mex -R2018a PrintPtr.c
// Bruno Luong: 06 Dec 2008
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    const mxArray *Var;

    // Check for proper inputs and output
    if (nrhs != 1)
        mexErrMsgTxt("PrintPtr needs one input");
    if (nlhs > 0)
        mexErrMsgTxt("PrintPtr does not return value");

    // The input is read-only, so keep it behind a const pointer
    Var = prhs[0];

    // The trailing "x" in the format is kept so the output matches the result listing
    mexPrintf("mxArray = %px\n", (const void *)Var);
    mexPrintf("Data ptr = %px\n", (const void *)mxGetData(Var));
}
Result
before calling function
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
ans =
'a'
mxArray = 0000022F7C2C71C0x
Data ptr = 0000022F16401680x
s.b
ans =
'b'
mxArray = 0000022F7C4911E0x
Data ptr = 0000022F010D99C0x
inside function before modification
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
mxArray = 0000022F7C2C7160x
Data ptr = 0000022F16401360x
s.b
mxArray = 0000022F7C4910C0x
Data ptr = 0000022F010DAC80x
inside function after modification
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
mxArray = 0000022F7C2C6DA0x
Data ptr = 0000022F163FBC80x
s.b
mxArray = 0000022F7C4BB760x
Data ptr = 0000022F0DC140C0x
after calling function
s
mxArray = 0000022F7C4B8640x
Data ptr = 0000022F14204360x
s.a
mxArray = 0000022F7C490FA0x
Data ptr = 0000022F010DAA40x
s.b
mxArray = 0000022F7C4BB760x
Data ptr = 0000022F0DC140C0x
inside function bar before modification
s
mxArray = 0000022F7C4B99C0x
Data ptr = 0000022F142041C0x
s.a
mxArray = 0000022F7C2C7760x
Data ptr = 0000022F16401500x
s.b
mxArray = 0000022F7C490C40x
Data ptr = 0000022F010DB4A0x
inside function bar after modification
s
mxArray = 0000022F7C4B99C0x
Data ptr = 0000022F142041C0x
s.a
mxArray = 0000022F7C2C76A0x
Data ptr = 0000022F16400BA0x
s.b
mxArray = 0000022F7C4B9120x
Data ptr = 0000022F162CB400x
James Tursa on 8 Mar 2022
Edited: James Tursa on 8 Mar 2022
@Trym Gabrielsen "I am wondering if the entire struct B is copied, when you modify only one field, or is only that one field copied?"
Only that one field being modified is deep copied. All of the other fields remain as reference copies, i.e. only the top level mxArray variable address is copied. None of the data within these other fields is deep copied.
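A small experiment along these lines can be sketched with the built-in setfield, which takes a struct, changes one field, and returns the result. The struct S and its fields big and small are made-up names, and the second timing is only a rough proxy for the cost of producing a fresh array of the same size as the large field; actual numbers depend on the machine and the release.

```matlab
% Hypothetical struct with one large and one small field.
S = struct('big', zeros(2000, 2000), 'small', 1);   % 'big' is about 32 MB

% setfield receives S, changes one field, and returns the modified struct.
T = setfield(S, 'small', 2);

% Value semantics: the original is untouched, the result carries the change.
fprintf('S.small = %d, T.small = %d\n', S.small, T.small);

% Rough, machine-dependent timings.
tField = timeit(@() setfield(S, 'small', 2));   % modify only the small field
tArray = timeit(@() S.big + 0);                 % builds a new 32 MB array
fprintf('modify small field: %.2e s\n', tField);
fprintf('new 32 MB array:    %.2e s\n', tArray);
```

If only the modified field is deep copied, the first timing should not scale with the size of the big field.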

Version

R2020b