how to select particular rows from a matrix

2 vues (au cours des 30 derniers jours)
ektor
ektor le 6 Mar 2017
Commenté : ektor le 7 Mar 2017
Dear all,
I have a panel data set(countries, years) of dimension 20000 by 200. My goal is
1) to choose those countries that correspond to a specific period (1970-2013) AND
2) for which period there are no NaN values Is there a way to do that? Here is a sample matrix Many thanks in advance.
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};

Réponse acceptée

Image Analyst
Image Analyst le 6 Mar 2017
Try converting it to a table to make it easier:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
Echoed to the command window, you can see the steps:
t =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ ___ ___ ___ ___
'Albania' 1919 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1920 1 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1921 0 0 0 0 0 0 1 0 NaN 1 0 0
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Albania' 1923 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1924 1 0 1 0 0 0 2 1 NaN 1 2 0
'Albania' 1933 0 0 0 0 0 0 0 0 NaN 1 0 0
yearsInRange =
7×1 logical array
0
0
1
1
1
0
0
nonNanRows =
7×1 logical array
0
0
0
1
0
0
0
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
>>
  3 commentaires
Image Analyst
Image Analyst le 7 Mar 2017
Sure it will. Here is proof:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Canada' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 11] [ 1] [ 2] [ 0]
'USA' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 12] [ 1] [ 0] [ 0]
'Germany' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 13] [ 1] [ 2] [ 0]
'Belgium' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 14] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
And look at all the countries it found:
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Canada' 1922 1 0 1 0 0 0 2 1 11 1 2 0
'USA' 1923 0 0 0 0 0 0 0 0 12 1 0 0
'Germany' 1922 1 0 1 0 0 0 2 1 13 1 2 0
'Belgium' 1923 0 0 0 0 0 0 0 0 14 1 0 0
so explain why you said that - why you think it's getting countries outside the range or not including countries inside the year range. For example you can give me a cell array that proves what you said.
ektor
ektor le 7 Mar 2017
So I was confused! Yes, it is fine. Many thanks!

Connectez-vous pour commenter.

Plus de réponses (1)

Rik
Rik le 6 Mar 2017
You can use ismember to select the country, then you can convert those rows (excluding the first column) to a matrix with cell2mat. With logical indexing you can select the correct range of years, and with isnan you can find out the rows with NaN values.
I think you should be able to figure out what to do with this pointer. If you have trouble writing the code, don't hesitate to reply.
  2 commentaires
ektor
ektor le 6 Mar 2017
Dear Rik,thank you for this. I am not sure how to choose the years. For example, I am interested in the 1945-2015 time period. However, if I set, for example
x>=1945&x<=2015
I might get counties with time period 1989-2009. But I want countries over 1945-2015. My logical indexing is wrong. Any suggestion on that?
Rik
Rik le 7 Mar 2017
I would say that your data is not complete then, because there is nothing wrong with your logical indexing.
I also like to add that the solution from Image Analyst is more elegant and will be easier to understand if you come back to your code after a year. You do need to make sure that the all machines you run your code on have 2013b or later, because older versions don't have the table data structure.

Connectez-vous pour commenter.

Catégories

En savoir plus sur Numeric Types dans Help Center et File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by