Looking for a way to "vectorize" my code (or some other efficiency help)

Without going into too much detail, my code uses multiple matrices (matrix1, matrix2, etc.) that are six columns wide and of varying length, from empty to thousands of entries. Each of these matrices is a list of xyz coordinates plus 3 numbers of metadata, and each matrix represents a different set of coordinates that needs to be kept separate from the other lists. For simplicity, you can think of it as a sort of electric field where I'm listing locations and a vector at each point. Each of these points needs to be categorized, thus the need for multiple matrices.
In my code, I need to scale these lists up, which I do using scaledmatrix1 = repmat(matrix1, frequency1); scaledmatrix2 = repmat(matrix2, frequency2); etc. Each of these scaled lists is then mashed into a big matrix manually by doing BigScaledList = [scaledmatrix1; scaledmatrix2; ...] and then one is "picked." This is the way we're simulating scaled probability. Later in the code I need certain coordinates to go from one matrix to another, some metadata to change, and some coordinates to be deleted.
Initially I only had 2 lists, now I have 6, and soon I need to scale up to 20 or 30 lists while also keeping this process intact. It is also important that each of these matrices remain "separate" entries from one another. It's getting very unwieldy to have each matrix being called, by name, in multiple places and having conditionals for each matrix. Now that I have to scale up, my code is going to become monstrous and potentially very buggy.
Here's my question: How can I keep my code's workflow the same (as in having the scaling/merging step still intact and keeping each matrix list separate) while making the scaling/merging and calling of matrices much more efficient?
Also, if you see something dumb in the way I'm doing things, please let me know. I would really prefer to not have them all as one object if possible.

3 Comments

Stephen23 on 11 Feb 2021
Edited: Stephen23 on 11 Feb 2021
"Initially I only had 2 lists, now I have 6, and soon I need to scale up to 20 or 30 lists while also keeping this process intact."
That is exactly what container arrays (e.g. cell arrays, structure arrays, etc.) are used for. Use a few container arrays and you can forget all about how many "lists" they contain: your code simply adjusts automatically. So easy.
"It is also important that each of these matrices remain "separate" entries from one another."
That is exactly what container arrays do: they contain other separate arrays.
"It's getting very unwieldy to have each matrix being called, by name, in multiple places..."
Yes, your current approach is not efficiently expandable. Numbering variables like that is a sign that your data structure is fundamentally flawed and will make expanding/generalizing your processing difficult and/or inefficient.
"How can I keep my code's workflow the same ... while making the scaling/merging and calling of matrices much more efficient?"
The very simple and very efficient approach is to use container arrays and indexing. That is what all experienced MATLAB users would do. That is what the MATLAB parser is optimised for. That is what you should do too.
"if you see something dumb in the way I'm doing things, please let me know."
Numbering variable names ... that is one way that beginners force themselves into writing slow, complex, obfuscated, buggy code which is hard to debug.
What do you want to spend your time on: actually processing your data, or trying to figure out how to access your data?
In addition to Stephen's advice (of the options, I'd lean towards cell arrays given your description), I'd like to suggest that after the "first day of coding this up" you take some time to sketch, on paper, how you want to organize your program and how you access the data. This might help you arrive at a thought-out structure that is cleaner and easier to handle.
Thank you very much for the responses. I am going to make the switch to cell arrays.
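The container-array approach suggested in the comments can be sketched roughly as follows. This is only a minimal illustration under assumptions: the variable names (lists, frequency, listIdx, rowIdx) and the random data are made up for the example and are not from the original code. Note that it uses repmat(A, n, 1) to stack copies vertically, since repmat(A, n) tiles A in both dimensions.

```matlab
% Hypothetical data: each list is N-by-6, [x y z meta1 meta2 meta3].
lists     = {rand(4, 6), zeros(0, 6), rand(2, 6)};
frequency = [3, 5, 1];   % hypothetical repetition count for each list

% Scale every list by stacking copies vertically. repmat(A, n, 1) is used
% because repmat(A, n) tiles A n times in both dimensions.
scaled = cellfun(@(M, f) repmat(M, f, 1), lists, num2cell(frequency), ...
    'UniformOutput', false);

% Merge the scaled lists; this line does not change as lists are added.
bigScaledList = vertcat(scaled{:});

% Remember which list and which row each merged row came from.
numRows = cellfun(@(M) size(M, 1), lists);
listIdx = repelem(1:numel(lists), numRows .* frequency).';
rowIdxParts = arrayfun(@(n, f) repmat((1:n).', f, 1), numRows, frequency, ...
    'UniformOutput', false);
rowIdx = vertcat(rowIdxParts{:});

% Pick one merged row at random and trace it back to its source.
pick       = randi(size(bigScaledList, 1));
sourceList = listIdx(pick);
sourceRow  = rowIdx(pick);

% Move the picked point into list 2, then delete it from its source list.
lists{2}(end+1, :) = lists{sourceList}(sourceRow, :);
lists{sourceList}(sourceRow, :) = [];
```

Because every step indexes into lists rather than naming matrix1, matrix2, and so on, going from 6 to 30 lists should only change how lists and frequency are filled in.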


Answers (1)

"It is also important that each of these matrices remain "separate" entries from one another. It's getting very unwieldy to have each matrix being called, by name, in multiple places and having conditionals for each matrix. Now that I have to scale up, my code is going to become monstrous and potentially very buggy."
Yes, I can see that.
I know you said they need to remain separate, but it could be quite a bit easier to manage your data if you combined them into a larger array with some ID information stating to which of the original data sets each row belongs. One way to do this would be using a table array. Suppose for example that you had two data sets about patients, one for male patients in a variable named male and one for female patients in a variable named female. Let's also make a combined table T.
Since in this data set Gender can only take the values 'Male' and 'Female' it'll be easier to work with if we turn them into a categorical array before we start creating tables using them.
load patients
Gender = categorical(Gender);
VN = ["Last name", "Gender", "Height", "Weight"];
M = Gender == 'Male';
male = table(LastName(M), Gender(M), Height(M), Weight(M), 'VariableNames', VN);
head(male)
ans = 8x4 table
     Last name      Gender    Height    Weight
    ____________    ______    ______    ______
    {'Smith'   }    Male        71       176
    {'Johnson' }    Male        69       163
    {'Wilson'  }    Male        68       180
    {'Moore'   }    Male        68       183
    {'Jackson' }    Male        71       174
    {'White'   }    Male        72       202
    {'Martin'  }    Male        71       181
    {'Thompson'}    Male        69       191
female = table(LastName(~M), Gender(~M), Height(~M), Weight(~M), 'VariableNames', VN);
head(female)
ans = 8x4 table
     Last name      Gender    Height    Weight
    ____________    ______    ______    ______
    {'Williams'}    Female      64       131
    {'Jones'   }    Female      67       133
    {'Brown'   }    Female      64       119
    {'Davis'   }    Female      68       142
    {'Miller'  }    Female      64       142
    {'Taylor'  }    Female      66       132
    {'Anderson'}    Female      68       128
    {'Thomas'  }    Female      66       137
T = table(LastName, Gender, Height, Weight, 'VariableNames', VN);
head(T)
ans = 8x4 table
     Last name      Gender    Height    Weight
    ____________    ______    ______    ______
    {'Smith'   }    Male        71       176
    {'Johnson' }    Male        69       163
    {'Williams'}    Female      64       131
    {'Jones'   }    Female      67       133
    {'Brown'   }    Female      64       119
    {'Davis'   }    Female      68       142
    {'Miller'  }    Female      64       142
    {'Wilson'  }    Male        68       180
Now if I want to extract data only for Males I can do that.
maleData = T(T.Gender == 'Male', :);
head(maleData)
ans = 8x4 table
     Last name      Gender    Height    Weight
    ____________    ______    ______    ______
    {'Smith'   }    Male        71       176
    {'Johnson' }    Male        69       163
    {'Wilson'  }    Male        68       180
    {'Moore'   }    Male        68       183
    {'Jackson' }    Male        71       174
    {'White'   }    Male        72       202
    {'Martin'  }    Male        71       181
    {'Thompson'}    Male        69       191
I don't have to do something like the following, which doesn't look that bad for two possible ID values to process but would be much more verbose for 20 or 30.
%{
switch genderToProcess
    case 'Male'
        % Work with a variable named male
    otherwise
        assert(isequal(genderToProcess, 'Female'))
        % Work with a variable named female
end
%}
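As a rough illustration of the same point applied to coordinate lists, the sketch below loops over whatever IDs are present in a combined table, so one code path serves 2 lists or 30. The table layout and the names xyz, listID, ids, and counts are assumptions made up for this example, not taken from the original code.

```matlab
% Hypothetical combined table: coordinates plus a categorical list ID.
xyz    = rand(6, 3);
listID = categorical(["A"; "B"; "A"; "C"; "B"; "A"]);
T = table(xyz, listID, 'VariableNames', {'xyz', 'listID'});

% One code path for every ID, however many there are.
ids    = categories(T.listID);      % cell array of category names
counts = zeros(numel(ids), 1);
for k = 1:numel(ids)
    rows      = T.listID == ids{k}; % logical mask for this list
    subset    = T(rows, :);         % the rows belonging to this list
    counts(k) = height(subset);     % placeholder for the real processing
end
```

The body of the loop is where the per-list processing would go; adding a new list only adds a new value of listID.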
Oh, as for picking someone at random? With or without replacement? It's easy with the combined table.
numEntries = height(T);
randomEntriesWithout = randperm(numEntries, 5)
randomEntriesWithout = 1×5
42 17 38 5 58
TWithout = T(randomEntriesWithout, :)
TWithout = 5x4 table
     Last name      Gender    Height    Weight
    ____________    ______    ______    ______
    {'Perez'   }    Male        69       183
    {'Thompson'}    Male        69       191
    {'Gonzalez'}    Female      66       118
    {'Brown'   }    Female      64       119
    {'Bell'    }    Male        70       170
randomEntriesWith = randi(numEntries, 1, 5)
randomEntriesWith = 1×5
32 82 90 75 74
TWith = T(randomEntriesWith, :)
TWith = 5x4 table
      Last name       Gender    Height    Weight
    ______________    ______    ______    ______
    {'Lopez'     }    Female      66       137
    {'Coleman'   }    Male        69       188
    {'Washington'}    Female      65       129
    {'Sanders'   }    Female      67       115
    {'Kelly'     }    Female      65       127
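For the coordinate lists in the question, a combined array with an ID per row might also make the "scaled" pick possible without building repmat copies. The sketch below is one possible way to do that, not the asker's method: it swaps the repmat-and-merge step for a weighted pick using cumulative weights. All names (data, listID, frequency, w) and values are hypothetical.

```matlab
% Hypothetical combined array: rows are [x y z meta1 meta2 meta3],
% with a separate column vector giving each row's list ID.
data      = rand(6, 6);
listID    = [1; 1; 2; 3; 3; 3];
frequency = [3; 5; 1];            % hypothetical weight for each list

% Each row's weight is its list's frequency. Picking with these weights
% gives the same probabilities as a uniform pick from the scaled,
% merged matrix.
w     = frequency(listID);
edges = cumsum(w) / sum(w);
pick  = find(rand() <= edges, 1, 'first');

% "Move" the picked row to list 2 by relabeling it, and change metadata.
listID(pick)    = 2;
data(pick, 4:6) = 0;

% Delete a row: remove it from both the data and the ID vector.
data(1, :) = [];
listID(1)  = [];
```

The lists stay logically separate through listID, and data(listID == k, :) recovers list k whenever it is needed on its own.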

Release

R2018b

Commented:

18 Feb 2021
