MATLAB Answers

how to select particular rows from a matrix

31 views (last 30 days)
ektor
ektor on 6 Mar 2017
Commented: ektor on 7 Mar 2017
Dear all,
I have a panel data set(countries, years) of dimension 20000 by 200. My goal is
1) to choose those countries that correspond to a specific period (1970-2013) AND
2) for which period there are no NaN values Is there a way to do that? Here is a sample matrix Many thanks in advance.
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};

  0 Comments

Sign in to comment.

Accepted Answer

Image Analyst
Image Analyst on 6 Mar 2017
Try converting it to a table to make it easier:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
Echoed to the command window, you can see the steps:
t =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ ___ ___ ___ ___
'Albania' 1919 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1920 1 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1921 0 0 0 0 0 0 1 0 NaN 1 0 0
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Albania' 1923 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1924 1 0 1 0 0 0 2 1 NaN 1 2 0
'Albania' 1933 0 0 0 0 0 0 0 0 NaN 1 0 0
yearsInRange =
7×1 logical array
0
0
1
1
1
0
0
nonNanRows =
7×1 logical array
0
0
0
1
0
0
0
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
>>

  3 Comments

ektor
ektor on 6 Mar 2017
Hi, what worries me is that the command yearsInRange will not give me the interval [1921-1923] FOR EACH COUNTRY. For some countries this interval does not exist. So I want to exclude those countries and keep only those countries for which the interval [1921-1923] exists. I want to have a balanced panel data. Is there a solution to that? Many thanks
Image Analyst
Image Analyst on 7 Mar 2017
Sure it will. Here is proof:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Canada' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 11] [ 1] [ 2] [ 0]
'USA' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 12] [ 1] [ 0] [ 0]
'Germany' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 13] [ 1] [ 2] [ 0]
'Belgium' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 14] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
And look at all the countries it found:
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Canada' 1922 1 0 1 0 0 0 2 1 11 1 2 0
'USA' 1923 0 0 0 0 0 0 0 0 12 1 0 0
'Germany' 1922 1 0 1 0 0 0 2 1 13 1 2 0
'Belgium' 1923 0 0 0 0 0 0 0 0 14 1 0 0
so explain why you said that - why you think it's getting countries outside the range or not including countries inside the year range. For example you can give me a cell array that proves what you said.
ektor
ektor on 7 Mar 2017
So I was confused! Yes, it is fine. Many thanks!

Sign in to comment.

More Answers (1)

Rik
Rik on 6 Mar 2017
You can use ismember to select the country, then you can convert those rows (excluding the first column) to a matrix with cell2mat. With logical indexing you can select the correct range of years, and with isnan you can find out the rows with NaN values.
I think you should be able to figure out what to do with this pointer. If you have trouble writing the code, don't hesitate to reply.

  2 Comments

ektor
ektor on 6 Mar 2017
Dear Rik,thank you for this. I am not sure how to choose the years. For example, I am interested in the 1945-2015 time period. However, if I set, for example
x>=1945&x<=2015
I might get counties with time period 1989-2009. But I want countries over 1945-2015. My logical indexing is wrong. Any suggestion on that?
Rik
Rik on 7 Mar 2017
I would say that your data is not complete then, because there is nothing wrong with your logical indexing.
I also like to add that the solution from Image Analyst is more elegant and will be easier to understand if you come back to your code after a year. You do need to make sure that the all machines you run your code on have 2013b or later, because older versions don't have the table data structure.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by