How can I extract time series data from cell matrix and put all the data together in one matrix with corresponding date and fund ID? See Example here
6 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
I am a new user of Matlab so not aware of the terminology used here. I am working on Mutual Funds analysis.
I have data in a cell named ndata with each cell having 3 columns: Date, Fund-ID, and Returns. Total Unique dates are 465, and total unique Funds are 305. The problem is that most funds start and end on different dates.
The dimensions of ndata are {1,305} with each cell having data of a different Fund. Each cell has dimensions (n,3) where n varies. First column of every cell i.e. ndata{1,i}(:,1) contains Dates. Now I would like to merge all the cells in one matrix, with Dates in first Column, and the Returns in subsequent columns, with every column representing a different Fund, and every row a different date. The Return should correspond to the unique date and unique Fund.
This Example might help explain my problem better:
ndata=cell(1,3);
ndata{1,1} = [1, 101, 4; 2, 101, 4.5; 3, 101, 3];
ndata{1,2} = [2, 202, 1; 3, 202, 1.1; 4, 202, 1.2; 5, 202, 3.2;];
ndata{1,3} = [3, 303, 4.1; 4 303, 4.9];
Funds = [101 202 303];
Dates = [1;2;3;4;5];
I would like to see something like: Returns:
1st Row is: Col1=Date Col2= 101 Col3= 202 Col4= 303
and then
[1, 4, NaN, NaN; 2, 4.5, 1, NaN; 3, 3, 1.1, 4.1; 4, NaN, 1.2, 4.9; 5, NaN, 3.2, NaN]
The code I tried for this kind of example is:
Returns = NaN * ones(5+1, 3+1);
Returns(2:end,1) =Dates; %from the date vector which is same as 1st column of answer I'm looking for i.e. Return Matrix above
Returns(1,2:end) =Funds; %from the FundID vector
for i = 1:3
for j = 2:6
for k=1:size(ndata{1,i},1);
if any(Returns(j,1)==(ndata{1,i}(:,1)));
Returns(j:size(ndata{1,i},1)+j-1,i+1) = (ndata{1,i}(:,3));
end
end
end
end
Can anyone help. I'd really appreciate.
2 commentaires
Jan
le 3 Juil 2013
@Sunny: This is your first question in this forum and you have formatted the text and the code properly, added a useful piece of code and all required information. This is a nice example for an efficient posting and therefore I've voted for it: This is my question of the week! Thanks.
Plus de réponses (3)
Guru
le 3 Juil 2013
Well, using the knowledge that the number of funds is the size of ndata and the rows for each fund that has data is based on the Dates within each cell, we can actually preallocate a NaN array and simplify this process a bit.
% Preallocate a double array Returns of all NaNs and then fill in Dates and Funds
Returns = NaN(length(Dates)+1, length(ndata)+1);
Returns(2:end,1) = Dates;
Returns(1,2:end) = Funds
% Use the Dates we have in each cell of ndata to assign the Fund data
for n = 2:length(ndata)+1
% Extract the dates that we have for that Fund
D = ndata{n-1}(:,1);
Returns(D,n) = ndata{n-1}(:,3);
end
HTH!
1 commentaire
dpb
le 4 Juil 2013
Modifié(e) : dpb
le 8 Juil 2013
CORRECTION -- THE ISMEMBER search wasn't right in initial, sorry. The following corrected version produces the desired results...
A "deadahead" solution...can undoubtedly be more fully vectorized w/ cellfun and friends of similar thought but didn't have time to think it out...
MATL
r=sortrows(vertcat(d{:}),[1 2]); % make single array by date, stock
s=splitvec(r(:,2),'value','bracket'); % Bruno's function @ FEX
u1=unique(r(:,1); u2=unique(r(:,2); % find how many each
res=nan(length(u1))+1,length(u2))+1); % preallocate
res(1,2:end)=u2; % load the IDs on first row
res(2:end,1)=u1; % and the times in first column
for i=1:length(s)
[i1,i2]=dealvec(s(i,:));
[~,ib]=ismember(r(i1:i2,2),u2);
res(i+1,ib+1)=r(i1:i2,3)';
end
res =
NaN 101 202 303
1.0000 4.0000 NaN NaN
2.0000 4.5000 1.0000 NaN
3.0000 3.0000 1.1000 4.1000
4.0000 NaN 1.2000 4.9000
5.0000 NaN 3.2000 NaN
>>
...
NB: I renamed Bruno's function here to splitvec() instead of keeping CamelCase as Matlab doesn't use it and I have too short a memory retention span to recall what is/isn't if were to try to...suit yourself on that--just be aware that above won't work until you do one or t'other--i.e., either change the call to match Bruno's capitalization or rename to all lowercase to match my spelling.
Basically just pulls a subset vector from the list of dates separated by those dates and then finds their location in the columns that are preloaded.
I also used a little utility routine of mine called splitvec that is convenient when a short variable name is less typing than a repeated indexing expression.
MATL
function varargout = dealvec(x)
varargout = num2cell(x);
If you don't want to use it, it's equivalent is s1=s(i,1); s2=s(i,2);
Sorry didn't catch the indexing snafu initially; I just assumed I was good... :)
0 commentaires
dpb
le 8 Juil 2013
OK, I recast as a function and tested so for a single shorter reporting I'll post it as a new answer even though is same algorithm...
MATL
>> % raw data...
>> ndata{:}
ans =
1.0000 101.0000 4.0000
2.0000 101.0000 4.5000
3.0000 101.0000 3.0000
ans =
2.0000 202.0000 1.0000
3.0000 202.0000 1.1000
4.0000 202.0000 1.2000
5.0000 202.0000 3.2000
ans =
3.0000 303.0000 4.1000
4.0000 303.0000 4.9000
>> % function results...
>> collectdata(ndata)
ans =
NaN 101.0000 202.0000 303.0000
1.0000 4.0000 NaN NaN
2.0000 4.5000 1.0000 NaN
3.0000 3.0000 1.1000 4.1000
4.0000 NaN 1.2000 4.9000
5.0000 NaN 3.2000 NaN
>>
>> % function code
function res=collectdata(d)
% Given cell array of nx3 per cell of date, stockID, return as
% an array collecting all returns by date and stockID with missing
% values as NaN
% splitvec is Bruno Luong's FEX submission splitVec
% dealvec is dpbozarth's utility routine
% function varargout = dealvec(x)
% varargout = num2cell(x);
r=sortrows(vertcat(d{:}),[1 2]); % make single array by date, stock
s=splitvec(r(:,1),'value','bracket'); % Bruno's function @ FEX
u1=unique(r(:,1)); u2=unique(r(:,2)); % find how many each
res=nan(length(u1)+1,length(u2)+1); % preallocate
res(1,2:end)=u2; % load the IDs on first row
res(2:end,1)=u1; % and the times in first column
for i=1:length(s)
[i1,i2]=dealvec(s(i,:));
[~,ib]=ismember(r(i1:i2,2),res(1,:));
res(i+1,ib)=r(i1:i2,3)';
end
end
Voir également
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!