operation between timetables with different index

2 vues (au cours des 30 derniers jours)
Attilio Pittelli
Attilio Pittelli le 6 Juil 2021
Commenté : dpb le 28 Juil 2021
What i want to do is to subtract the first five elements of a timetable with just one another element of a different timetable with different indexes. The code below is kind of expensive as i want to apply this subtraction to a large dataset. Do you have any suggestion? thank you
T.Vull(1:5)-AvWe.Valp(1);
  18 commentaires
Attilio Pittelli
Attilio Pittelli le 7 Juil 2021
i need to subtract the elements of the timetable to the corrisponding weekly mean (Av.We)
example:
T.('first week',Volume) - AvWe('first week') %this needs to be done for all the values and i want to store the results in an array or timetable
dpb
dpb le 7 Juil 2021
Modifié(e) : dpb le 8 Juil 2021
Well, maybe that's what the early request meant, but it wasn't easy to tell that... :) Language is undoubtedly part of the problem...but, that's a trivial operation although you need a grouping variable over week and year to do it. And, retime on its own isn't the tool; this is a job for rowfun -- but, to do it as neatly as would like, also takes an m-file function; not possible to rearrange the output as would like in an anonymous function.
It's still not too bad and is quite concise, but does, I'll grant, take some expertise in using MATLAB more than pure novice level... :)

Connectez-vous pour commenter.

Réponse acceptée

dpb
dpb le 8 Juil 2021
tTicker=readtable('SPCE.csv'); % read the data file
tTicker.Yr=year(tTicker.Date;tTicker.Wk=week(tTicker.Date); % create grouping variables
So far, leaves us with
> [head(tTicker);tail(tTicker)]
ans =
16×9 table
Date Open High Low Close AdjClose Volume Yr Wk
__________ ______ ______ ______ _____ ________ __________ ____ __
2020-07-06 16.58 17.09 16.05 16.95 16.95 1.3902e+07 2020 28
2020-07-07 16.63 16.84 16.151 16.55 16.55 8.204e+06 2020 28
2020-07-08 16.61 16.82 16.2 16.79 16.79 8.7226e+06 2020 28
2020-07-09 17.1 19.49 17.08 19.46 19.46 5.1498e+07 2020 28
2020-07-10 19.32 19.98 18.27 19.25 19.25 2.7769e+07 2020 28
2020-07-13 19.55 20.99 17.712 18.1 18.1 3.6018e+07 2020 29
2020-07-14 18.111 18.61 17.05 18.43 18.43 1.5803e+07 2020 29
2020-07-15 18.97 19.14 18.06 18.55 18.55 1.165e+07 2020 29
2021-06-23 38.96 41.65 38.751 40.82 40.82 2.0806e+07 2021 26
2021-06-24 41.1 41.58 39.26 40.26 40.26 1.3388e+07 2021 26
2021-06-25 47.81 57.29 45.87 55.91 55.91 2.6135e+08 2021 26
2021-06-28 57.15 57.51 52.48 54.84 54.84 1.0957e+08 2021 27
2021-06-29 52.3 53.6 45.97 47.02 47.02 7.4309e+07 2021 27
2021-06-30 44.42 47.981 43.081 46 46 6.4212e+07 2021 27
2021-07-01 46 46.43 43.19 43.19 43.19 2.7077e+07 2021 27
2021-07-02 53.23 54.4 44.04 44.94 44.94 1.3545e+08 2021 27
>>
Now write function to return a variable array minus mean of the array by column that will return a variable for each input column...
function varargout=fnAP(x)
% computes x-mean(x) for input array x
% returns size(x,2) outputs each as a column vector
varargout=mat2cell(x-mean(x),size(x,1),ones(1,size(x,2)));
end
Now for the magic... :)
tVal=rowfun(@fnAP,tTicker,'InputVariables',{'Open','High'}, ...
'GroupingVariables',{'Yr','Wk'}, ...
'SeparateInputs',0,'NumOutputs',2, ...
'OutputVariableNames',{'Open-Mean','High-Mean'});
Let's look at what this bought us---
>> [head(tVal);tail(tVal)]
ans =
16×5 table
Yr Wk GroupCount Open-Mean High-Mean
____ __ __________ _________ _________
2020 28 5 -0.668 -0.954
2020 28 5 -0.618 -1.204
2020 28 5 -0.638 -1.224
2020 28 5 -0.148 1.446
2020 28 5 2.072 1.936
2020 29 5 -0.2262 -0.1536
2020 29 5 -1.6652 -2.5336
2020 29 5 -0.8062 -2.0036
2021 26 5 -1.032 -1.754
2021 26 5 1.108 -1.824
2021 26 5 7.818 13.886
2021 27 5 6.53 5.5258
2021 27 5 1.68 1.6158
2021 27 5 -6.2 -4.0032
2021 27 5 -4.62 -5.5542
2021 27 5 2.61 2.4158
>>
I just did two variables to demonstrate; you can use whichever variables you want as the 'InputVariables' setting the 'NumOutputs' to match as well as the 'OutputVariableNames'
These can all be variables as well and do something like use
>> tTicker.Properties.VariableNames
ans =
1×9 cell array
{'Date'} {'Open'} {'High'} {'Low'} {'Close'} {'AdjClose'} {'Volume'} {'Yr'} {'Wk'}
>>
from which could say
inVars=tTicker.Properties.VariableNames(2:7);
outVars=outVars=strcat(inVars,'-Mean');
numOut=numel(outVars);
tVal=rowfun(@fnAP,tTicker,'InputVariables',inVars, ...
'GroupingVariables',{'Yr','Wk'}, ...
'SeparateInputs',0, ...
'NumOutputs',numOut, ...
'OutputVariableNames',outVars);
Well, I was so close already,
>> [head(tVal);tail(tVal)]
ans =
16×9 table
Yr Wk GroupCount Open-Mean High-Mean Low-Mean Close-Mean AdjClose-Mean Volume-Mean
____ __ __________ _________ _________ ________ __________ _____________ ___________
2020 28 5 -0.668 -0.954 -0.7002 -0.85 -0.85 -8.1175e+06
2020 28 5 -0.618 -1.204 -0.5992 -1.25 -1.25 -1.3815e+07
2020 28 5 -0.638 -1.224 -0.5502 -1.01 -1.01 -1.3296e+07
2020 28 5 -0.148 1.446 0.3298 1.66 1.66 2.9479e+07
2020 28 5 2.072 1.936 1.5198 1.45 1.45 5.75e+06
2020 29 5 -0.2262 -0.1536 -1.1184 -1.94 -1.94 1.8498e+06
2020 29 5 -1.6652 -2.5336 -1.7804 -1.61 -1.61 -1.8365e+07
2020 29 5 -0.8062 -2.0036 -0.7704 -1.49 -1.49 -2.2518e+07
2021 26 5 -1.032 -1.754 -0.1152 -1.666 -1.666 -4.6083e+07
2021 26 5 1.108 -1.824 0.3938 -2.226 -2.226 -5.35e+07
2021 26 5 7.818 13.886 7.0038 13.424 13.424 1.9446e+08
2021 27 5 6.53 5.5258 6.7278 7.642 7.642 2.745e+07
2021 27 5 1.68 1.6158 0.2178 -0.178 -0.178 -7.816e+06
2021 27 5 -6.2 -4.0032 -2.6712 -1.198 -1.198 -1.7913e+07
2021 27 5 -4.62 -5.5542 -2.5622 -4.008 -4.008 -5.5048e+07
2021 27 5 2.61 2.4158 -1.7122 -2.258 -2.258 5.3327e+07
>>
You can then keep the new table or pick the variables from it you want/need and append them to the tTicker table or whatever at will.
NB:
>> height(tVal)
ans =
252
>> height(tTicker)
ans =
252
>>
so the two match up, row for row.
So, as noted, going back to the original posting, it's not at all an "expensive" calculation and rowfun is magical in what you can do...you just have to be somewhat creative in thinking about what the function definition needs to be.
MOST of the time, one can write what one wants in a simple anonymous function; your needs were a little more exotic. If one were satisfied to just get an array back, then you can, in fact, do this in a one-liner using an anonymous function --
>> tValAnon=rowfun(@(v)(v-mean(v)),tTicker,'InputVariables',{'Open','High'},'GroupingVariables',{'Yr','Wk'},'SeparateInputs',0,'OutputVariableNames','Variables-Week Mean');
>> head(tValAnon)
ans =
8×4 table
Yr Wk GroupCount Variables-Week Mean
____ __ __________ ___________________
2020 28 5 -0.668 -0.954
2020 28 5 -0.618 -1.204
2020 28 5 -0.638 -1.224
2020 28 5 -0.148 1.446
2020 28 5 2.072 1.936
2020 29 5 -0.2262 -0.1536
2020 29 5 -1.6652 -2.5336
2020 29 5 -0.8062 -2.0036
>>
This, of course, works, and you can produce the same output from it by extracting the columns from the one composite 2D array; doing it that way means have to do that every time run it; as opposed to writing the explicit function once and being done with it.
  6 commentaires
dpb
dpb le 28 Juil 2021
chuckles...
dpb
dpb le 28 Juil 2021
ADDENDUM:
Indeed, as noted above and that I have commented on numerous times before, rowfun is truly almost magical if one "thinks out of the box" on what the function definition can be besides just the normal run-of-the-mill @mean, etc., etc., ...
This one isn't so bad excepting @Attilio Pittelli got off on the wrong foot by computing his means first independently and then trying to match those up with the proper elements does get more complicated.
It's entertainment for the old fogey retired consulting guy to keep a hand in... :)

Connectez-vous pour commenter.

Plus de réponses (0)

Catégories

En savoir plus sur Descriptive Statistics dans Help Center et File Exchange

Produits


Version

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by