How do I group data into different groups based on values of 2 columns for analysis?

I am a biology student and do not have much experience with MATLAB programming beyond simple code for plotting graphs. I have tracked some living cells using tracking software, and the results are in CSV format with the location of each cell (x and y position) and its speed (a sample dataset is attached). I want to group all cells within the same neighbourhood (for example, within 3 µm of each other in x and y position) and average their speed. How do I write code that allocates them to different bins based on their min and max position and returns the average speed per bin? As long as x and y fall within a particular range, I want to allocate the cell to a grid square or bin and get the speed for that bin. I am not sure of the exact terminology here, but I hope I have conveyed my question clearly. I am new to this platform, so I appreciate any help. Thanks.

Accepted Answer

dpb
dpb on 19 Apr 2023
Edited: dpb on 19 Apr 2023
fn='https://www.mathworks.com/matlabcentral/answers/uploaded_files/1361218/position_velocity%20data.xlsx';
tLC=readtable(fn);
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
height(tLC)
ans = 88
head(tLC)
    XPosition_mu_m_    YPosition_mu_m_    Velocity_mu_m_s_
    _______________    _______________    ________________
          0                  0                 -1
     22.988            -90.419             3665.2
     9.0209            -86.485             570.05
         12                -83             180.11
     24.993            -77.041             561.57
     30.986            -74.524             255.36
     16.527            -74.464             568.06
         43                -66             1091.9
tLC.Properties.VariableNames={'X','Y','V'}; % let's set easier names to use...
histogram2(tLC.X,tLC.Y) % then just look at the dispersion
xlabel('X'),ylabel('Y')
That doesn't look too bad. You may want to try different numbers of bins in each direction just to see the effect; it would probably make sense to use equal bin widths in each direction, but we won't get that sophisticated here... :) Counting and getting the bin numbers isn't too tough, either...
[N,~,~,bX,bY]=histcounts2(tLC.X,tLC.Y); % count, keep bin IDs
tLC=addvars(tLC,bX,bY,'NewVariableNames',{'binX','binY'},'After','V'); % add the grouping variables
groupsummary(tLC,{'binX','binY'},{'mean'},{'X','Y','V'}) % find mean of position, speed
ans = 15×6 table
    binX    binY    GroupCount      mean_X        mean_Y     mean_V
    ____    ____    __________    ___________    _______    ______
     1       3           2            -1.4926     39.508    145.59
     1       4           1        -1.7764e-15     64.133    184.66
     2       1           3             7.8561    -73.471    546.64
     2       3           7              6.557     32.171    168.77
     2       4           1              3.254     60.741    429.65
     3       1           5             15.101    -65.403    324.78
     3       2          18             15.682    -15.592    112.62
     3       3          21             14.598     17.921    128.27
     3       4           3              17.28     53.133    259.75
     4       1           8             23.923    -67.318    687.94
     4       2          12             24.243     -34.32    201.63
     4       3           1             25.494     42.803    380.46
     5       1           4             33.124    -66.386    240.03
     5       2           1                 38        -50    541.63
     6       1           1                 43        -66    1091.9
We didn't look at the counts before. This illustrates the problem of not having scads of data when you start splitting it up by multiple variables: quite a number of bins hold only one or two observations. You may need variably spaced intervals to balance out the counts somewhat; it's hard to put much of a bound on estimates with that few members in a bin.
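One way to get variably spaced intervals is to place the bin edges at evenly spaced ranks of the sorted data, so each X interval (and each Y interval) holds roughly the same number of points. The following is only a sketch under assumptions: the number of bins `nB`, the threshold `minCount`, and the synthetic stand-in table are all hypothetical choices, not part of the original data. Balancing each direction separately does not guarantee balanced counts in the 2-D bins, so the last step drops sparse bins.

```matlab
% Stand-in data so the snippet runs on its own; with the real file,
% use the tLC table (variables X, Y, V) built above instead.
rng(1)
tLC = table(40*rand(88,1), 150*rand(88,1) - 90, 500*rand(88,1), ...
            'VariableNames', {'X','Y','V'});

nB = 4;                                   % bins per direction (hypothetical)
k  = round(linspace(1, height(tLC), nB+1));   % evenly spaced ranks
xs = sort(tLC.X);
ys = sort(tLC.Y);
xEdges = unique(xs(k));                   % unique() guards against repeated edges
yEdges = unique(ys(k));

% discretize returns the bin index of each point; the last bin
% includes its right edge, so the maximum value is not lost.
tLC.binX = discretize(tLC.X, xEdges);
tLC.binY = discretize(tLC.Y, yEdges);

G = groupsummary(tLC, {'binX','binY'}, 'mean', 'V');

minCount = 3;                             % hypothetical minimum bin population
Gok = G(G.GroupCount >= minCount, :);     % keep only bins with enough points
```

`Gok` then holds the mean speed only for bins with at least `minCount` members; the sparse bins remain visible in `G` if you want to inspect them.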
I would guess one thing may be missing here: the starting location of each cell, so that the analysis isn't just about where each one ended up but also how far, and in what net direction, it moved. In that case, you'd be grouping by individual cell, if you had such information available.
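Coming back to the 3 µm neighbourhoods in the original question: instead of letting `histcounts2` pick the bins, explicit edges can be passed for each direction. This is a minimal sketch assuming square bins of width `w = 3` aligned to multiples of `w`; the synthetic table is a stand-in, and the `xLow`/`yLow` variables are names introduced here for illustration.

```matlab
% Stand-in data so the snippet runs on its own; with the real file,
% use the tLC table (variables X, Y, V) built above instead.
rng(0)
tLC = table(40*rand(88,1), 150*rand(88,1) - 90, 500*rand(88,1), ...
            'VariableNames', {'X','Y','V'});

w = 3;                                    % bin width in um (from the question)
% Edges at multiples of w that cover the full data range
xEdges = floor(min(tLC.X)/w)*w : w : ceil(max(tLC.X)/w)*w;
yEdges = floor(min(tLC.Y)/w)*w : w : ceil(max(tLC.Y)/w)*w;

% Fourth and fifth outputs are the bin indices of each point
[~,~,~,bX,bY] = histcounts2(tLC.X, tLC.Y, xEdges, yEdges);
tLC.binX = bX;
tLC.binY = bY;

% Mean speed per occupied 3 x 3 um bin
G = groupsummary(tLC, {'binX','binY'}, 'mean', 'V');

% Lower-left corner of each bin, to make the table easier to read
G.xLow = reshape(xEdges(G.binX), [], 1);
G.yLow = reshape(yEdges(G.binY), [], 1);
```

With only 88 points, bins this small will mostly contain a single cell, so the count issue described above becomes more pronounced; a larger `w` trades spatial resolution for more members per bin.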
