Automatically create a multi-level structure from a cell array

Hello,
I have a signal that varies along three parameters a, b and c, each of which can take several values.
a can take n different values a1, a2, ..., an
b can take m different values b1, b2, ..., bm
c can take p different values c1, c2, ..., cp
So far I have all the variations stored in a cell array C of size n*m*p such that C(x,y,z) corresponds to my signal with parameters ax, by and cz.
This works but is not very easy to use. If I want to get the signal corresponding to certain parameters, I first need to find the indices of the wanted parameters (because I know the value of a that I want, but not the index to which it corresponds).
My idea was to create a structure S such that
S.ax.by.cz=C(x,y,z)
so that I won't need a correspondence table.
So far the only way I have found to do it is
for i=1:n
    for j=1:m
        for k=1:p
            % the third field name is indexed with k, not j
            S.(a_value(i)).(b_value(j)).(c_value(k))=C(i,j,k);
        end
    end
end
I was wondering whether a nicer/more efficient solution exists?
PS: I can't put the values directly into the structure when I generate them because the generation happens inside a parfor loop. If you have a solution to that issue, that would also solve my problem.
Thanks in advance
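
Regarding the PS, one pattern that may help is sketched below: let parfor fill a flat cell array that is indexed only by the loop variable, then reshape it and build the structure in an ordinary loop afterwards. The label vectors a_names, b_names and c_names and the placeholder "signal" are hypothetical stand-ins, not taken from the question. Without Parallel Computing Toolbox, parfor should simply run as a serial loop.

```matlab
% Hypothetical parameter labels (assumption); they must be valid field names.
a_names = ["a1", "a2"];
b_names = ["b1", "b2", "b3"];
c_names = ["c1", "c2", "c3", "c4"];
sz = [numel(a_names), numel(b_names), numel(c_names)];

% A flat cell array indexed by the loop variable can be sliced by parfor.
C = cell(1, prod(sz));
parfor idx = 1:prod(sz)
    [i, j, k] = ind2sub(sz, idx);   % recover the three subscripts
    C{idx} = [i, j, k];             % placeholder for the real signal
end
C = reshape(C, sz);                 % back to an n-by-m-by-p cell array

% Build the nested structure serially, after the parallel part.
S = struct();
for i = 1:sz(1)
    for j = 1:sz(2)
        for k = 1:sz(3)
            S.(a_names(i)).(b_names(j)).(c_names(k)) = C{i, j, k};
        end
    end
end
```

The structure is still built with three loops, but only the cheap bookkeeping is serial; the expensive signal generation stays inside parfor.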

7 Comments

" Let's say I want to get the signal corresponding to certain parameter I first need to go find the indexes of the wanted parameters..."
Can you give an example? Say the data is
C{1}=[61 43 50 53 95]; %a
C{2}=[63 96 51 45 57 27]; %b
C{3}=[8 67 76 76 83 50 7]; %c
Now what do you want to access?
Also, it's not clear as to how you are storing the data as C(x,y,z)?
"I was wondering if a nicer/more efficient solution exist?"
ND arrays perhaps.
You indicate that the "signal" varies with some parameters which correspond very neatly to indices 1:n, 1:m, and 1:p. It is unclear why you think replacing this simple index correspondence with fieldnames would be an improvement (it would most likely be slower and more complex to access).
Let me give a concrete example.
Let's say I have a salary depending on three parameters: age, job and country.
What I have is
salary=arrayfun(@(x) x,randi(100000,[2,2,2]),'UniformOutput',false);
job_array=["doctor","waiter"];
country_array=["USA","France"];
age_array=["less_than_40","more_than_40"];
%How it is now (salary is a 2*2*2 cell array)
salary(1,1,1); %the salary of a doctor in USA under 40 years old
salary(1,1,2) %the salary of a doctor in USA over 40 years old
ans = 1×1 cell array
{[65633]}
% Now let's say I want the salary of a French waiter under 40; this is
% what I have to do
idx_waiter=find(strcmp(job_array,"waiter"));
idx_country=find(strcmp(country_array,"France"));
idx_age=find(strcmp(age_array,"less_than_40"));
salary_waiter_french_less_40=salary(idx_waiter,idx_country,idx_age)
salary_waiter_french_less_40 = 1×1 cell array
{[96705]}
What I want is to create a structure so that
%could be created like this but is very slow
for i=1:length(job_array)
    for j=1:length(country_array)
        for k=1:length(age_array)
            salary_struct.(job_array(i)).(country_array(j)).(age_array(k))=salary(i,j,k) ;
        end
    end
end
salary_struct.doctor.USA.more_than_40 %the salary of a doctor in USA over 40 years old
ans = 1×1 cell array
{[65633]}
salary_struct.waiter.France.less_than_40 %salary of a french waiter less than 40
ans = 1×1 cell array
{[96705]}
What I have technically works, but the code would not be readable.
I'm also supposed to share this data with other people, so having clear code is really useful.
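
As a side note on the lookup itself, here is a minimal sketch of a shorter form of the index search: comparing a string array with == returns a logical mask, and that mask can be used directly as a subscript, so the find(strcmp(...)) step is not needed. The label arrays are the ones from the example above; the salary values are fixed placeholders (an assumption) so that the result is reproducible.

```matlab
% Label vectors from the example; salary values are fixed placeholders.
job_array     = ["doctor", "waiter"];
country_array = ["USA", "France"];
age_array     = ["less_than_40", "more_than_40"];
salary = num2cell(reshape(1:8, [2, 2, 2]));   % 2x2x2 cell array

% Each comparison yields a logical mask along one dimension.
% Brace indexing returns the stored value rather than a 1x1 cell.
s = salary{job_array == "waiter", ...
           country_array == "France", ...
           age_array == "less_than_40"};
```

This keeps the plain cell array and still reads almost like the field-name version.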
You can keep the indexing approach, but make a function that does the necessary indexing so as to keep the code more legible. If you make it a nested function then you don't have to pass job_array, country_array, etc., to it.
Example:
main()
salary_doctor_USA_less_40 = 1×1 cell array
{[90751]}
salary_doctor_FRE_less_40 = 1×1 cell array
{[40773]}
salary_waiter_USA_less_40 = 1×1 cell array
{[96740]}
salary_waiter_FRE_less_40 = 1×1 cell array
{[1566]}
salary_doctor_USA_more_40 = 1×1 cell array
{[67015]}
salary_doctor_FRE_more_40 = 1×1 cell array
{[93363]}
salary_waiter_USA_more_40 = 1×1 cell array
{[22547]}
salary_waiter_FRE_more_40 = 1×1 cell array
{[57732]}
function main()
    salary=arrayfun(@(x) x,randi(100000,[2,2,2]),'UniformOutput',false);
    job_array=["doctor","waiter"];
    country_array=["USA","France"];
    age_array=["less_than_40","more_than_40"];
    salary_doctor_USA_less_40 = get_salary("doctor","USA","less_than_40")
    salary_doctor_FRE_less_40 = get_salary("doctor","France","less_than_40")
    salary_waiter_USA_less_40 = get_salary("waiter","USA","less_than_40")
    salary_waiter_FRE_less_40 = get_salary("waiter","France","less_than_40")
    salary_doctor_USA_more_40 = get_salary("doctor","USA","more_than_40")
    salary_doctor_FRE_more_40 = get_salary("doctor","France","more_than_40")
    salary_waiter_USA_more_40 = get_salary("waiter","USA","more_than_40")
    salary_waiter_FRE_more_40 = get_salary("waiter","France","more_than_40")
    % Nested function: it shares salary and the label arrays with main()
    function s = get_salary(job,country,age_group)
        idx_job = find(strcmp(job_array,job));
        idx_country = find(strcmp(country_array,country));
        idx_age = find(strcmp(age_array,age_group));
        s = salary(idx_job,idx_country,idx_age);
    end
end
Yeah that would work
I ended up biting the bullet and just doing this:
for i=1:length(job_array)
    for j=1:length(country_array)
        for k=1:length(age_array)
            salary_struct.(job_array(i)).(country_array(j)).(age_array(k))=salary(i,j,k) ;
        end
    end
end
It's not taking that much time (compared to the rest of the script).
Aaaah, so you actually have text data... in which case a few dynamic fieldnames in loops, as you show, is likely the most efficient approach (you could use SETFIELD, but it is slower).
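
For comparison, here is a rough sketch of the SETFIELD variant mentioned above, together with GETFIELD for reading a value when the three labels are held in variables. The salary values are fixed placeholders (an assumption) so that the result is reproducible, and the labels are converted with char() only as a precaution for releases where these functions might not accept string scalars.

```matlab
job_array     = ["doctor", "waiter"];
country_array = ["USA", "France"];
age_array     = ["less_than_40", "more_than_40"];
salary = num2cell(reshape(1:8, [2, 2, 2]));   % placeholder values

% Build the nested structure with setfield instead of dynamic fieldnames.
S = struct();
for i = 1:numel(job_array)
    for j = 1:numel(country_array)
        for k = 1:numel(age_array)
            S = setfield(S, char(job_array(i)), char(country_array(j)), ...
                         char(age_array(k)), salary{i, j, k});
        end
    end
end

% Read a value back when the labels are stored in variables.
job = "waiter"; country = "France"; age = "less_than_40";
v1 = getfield(S, char(job), char(country), char(age));
v2 = S.(job).(country).(age);   % dynamic-fieldname equivalent
```

Both reads return the same value; the dynamic-fieldname form is usually the more readable of the two.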
The approach you show is not easily generalizable nor easily expandable.
"forcing meta-data into variable names should be avoided"
Yes, you are right. I was following @Matteo Bonhomme's naming pattern, so that he could easily see how the nested function would work.
"The approach you show is not easily generalizable nor easily expandable."
I disagree.


Answers (1)

Another approach is to string this out into a table, converting what were labels along each dimension into grouping variables. Starting with the same salary data example, the trick is to use ndgrid.
salary=arrayfun(@(x) x,randi(100000,[2,2,2]),'UniformOutput',false);
job_array=["doctor","waiter"];
country_array=["USA","France"];
age_array=["less_than_40","more_than_40"];
salary = [salary{:}]'
salary = 8×1
       13085
       40527
        9337
       23364
       27223
       34191
       94748
       16327
[Job, Country, Age] = ndgrid(job_array,country_array,age_array)
Job = 2×2×2 string array
Job(:,:,1) =
    "doctor"    "doctor"
    "waiter"    "waiter"
Job(:,:,2) =
    "doctor"    "doctor"
    "waiter"    "waiter"
Country = 2×2×2 string array
Country(:,:,1) =
    "USA"    "France"
    "USA"    "France"
Country(:,:,2) =
    "USA"    "France"
    "USA"    "France"
Age = 2×2×2 string array
Age(:,:,1) =
    "less_than_40"    "less_than_40"
    "less_than_40"    "less_than_40"
Age(:,:,2) =
    "more_than_40"    "more_than_40"
    "more_than_40"    "more_than_40"
t = table(Job(:), Country(:), Age(:), salary, VariableNames=["Job", "Country","Age","Salary"])
t = 8×4 table
      Job       Country          Age          Salary
    ________    ________    ______________    ______
    "doctor"    "USA"       "less_than_40"    13085
    "waiter"    "USA"       "less_than_40"    40527
    "doctor"    "France"    "less_than_40"     9337
    "waiter"    "France"    "less_than_40"    23364
    "doctor"    "USA"       "more_than_40"    27223
    "waiter"    "USA"       "more_than_40"    34191
    "doctor"    "France"    "more_than_40"    94748
    "waiter"    "France"    "more_than_40"    16327
% Now you can either use logical indexing to select data or grouping to do
% calculations that would be equivalent to slicing in your original array:
t.Salary(t.Job=="waiter" & t.Country == "France" & t.Age == "less_than_40")
ans = 23364
groupsummary(t,"Country","mean","Salary")
ans = 2×3 table
    Country     GroupCount    mean_Salary
    ________    __________    ___________
    "France"        4            35944
    "USA"           4            28756
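
If the same lookup is needed often, it could be wrapped in a small helper. The sketch below rebuilds the table from the values shown above and defines a hypothetical anonymous function get_salary (a name introduced here for illustration; it is not part of the answer). It also groups by two variables at once, which is roughly what slicing along one dimension of the original array would give.

```matlab
% Same data as in the answer, written out so the result is reproducible.
salary = [13085; 40527; 9337; 23364; 27223; 34191; 94748; 16327];
[Job, Country, Age] = ndgrid(["doctor", "waiter"], ...
                             ["USA", "France"], ...
                             ["less_than_40", "more_than_40"]);
t = table(Job(:), Country(:), Age(:), salary, ...
          'VariableNames', ["Job", "Country", "Age", "Salary"]);

% Hypothetical helper: look up a salary by its three labels.
get_salary = @(job, country, age) ...
    t.Salary(t.Job == job & t.Country == country & t.Age == age);
s = get_salary("waiter", "France", "less_than_40");

% Mean salary for every job/country pair (averaging over the age groups).
g = groupsummary(t, ["Job", "Country"], "mean", "Salary");
```

The helper returns an empty result when no row matches, so a caller may want to check isempty(s) before using the value.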

Version

R2021a
