How to extract rows of data according to text containing specific words in cells in MATLAB

Hi all,
I am quite stuck with a problem. I am trying to extract certain variables from large Excel files that classify organisms from multiple years so that I can process them in MATLAB. I want to extract all columns from A to L, and the rows I need run from 657828 to 1048576. I have tried the filter function in Excel but it doesn't work, so I am doing it in MATLAB. I want to filter on column J, called object_annotation_hierachy, and the precise species I am trying to select are the following:
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Calanidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Metridinidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Candaciidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Heterorhabdidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Euchaetidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Metridinidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Cyclopoida_Oithonidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Acartiidae
Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Temoridae
All other species are variations of these, but I am trying to include all data with 'Copepoda' in the name.
Further, I want to extract by year, which is in the first column, called object_id, with the name ['cruise2012'] up to 2016.
The code so far looks like this; however, it does not work:
C=csvread('cruise_2004_2016_ZooScan_dataset.csv');
%R657828, C1048576 (just used in first line of code to show location)
copepods= contains(C.object_id=="cruise2012")&(C.object_annotation_hierachy,"Copepoda");
C1=C(copepods,:);
Any help would be much appreciated!
  5 comments
Matt J on 28 May 2023
Edited: Matt J on 28 May 2023
"Hi, I can't, I'm afraid; it is confidential."
That shouldn't matter. Replace the real data with random synthetic data of the same general structure. We need a representative example that we can all look at and work with.
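One way to build such a synthetic stand-in is sketched below. The column names follow the question; the suffix after the cruise year in object_id, the non-copepod label, and the object_area values are invented purely for illustration and are assumptions about the real file's layout.

```matlab
% Hypothetical stand-in data: column names follow the question, values are invented.
rng(0);                                    % make the random values reproducible
n = 20;                                    % number of synthetic rows
years = randi([2012 2016], n, 1);          % random cruise year for each row

% Two taxa from the question plus one made-up non-copepod label.
taxa = ["Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Calanidae"; ...
        "Arthropoda_Crustacea_Maxillopoda_Copepoda_Cyclopoida_Oithonidae"; ...
        "Mollusca_Gastropoda_Dummy"];

% Assumed id format: "cruise<year>_<row number>", e.g. "cruise2014_7".
object_id = "cruise" + years + "_" + (1:n)';
object_annotation_hierarchy = taxa(randi(numel(taxa), n, 1));
object_area = rand(n, 1);                  % placeholder measurement

C = table(object_id, object_annotation_hierarchy, object_area);
```

A table like this can be saved with writetable and attached to the thread, so that everyone works from the same structure without seeing the confidential values.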
Sophia on 30 May 2023
Okay, I have uploaded it and replaced the data with dummy text; the real file is usually a lot longer. I have also split and numerized the taxonomy, and the code is in the sheet in case there is an easier way to do it. Because I have numerized the 1st and 10th row, I used this code, but it still doesn't seem to be working.
opts=detectImportOptions("Zoocam.xlsx");
opts.VariableTypes(2)={'double'};
opts.VariableTypes(19)={'double'};
opts.VariableTypes(20)={'double'};
C=readtable('Zoocam.xlsx', opts);
C.index=(C.object==2017 & C.object_annotation_hierarchy<18);
C_new=C(C.index==1,{'object','object_lat', 'object_lon','object_annotation_hierarchy', 'object_area'});
writetable(C_new,'2017_data.csv');


Accepted Answer

Matt J on 27 May 2023
Edited: Matt J on 30 May 2023
copepods= contains(C.object_id,"cruise2012") & ...
contains(C.object_annotation_hierarchy,"Copepoda");
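As a rough end-to-end sketch of the same idea, extended to the 2012 to 2016 range mentioned in the question: the table below is invented in-memory data, and the id suffixes and the non-copepod label are assumptions. With the real file, the table would have to come from readtable rather than csvread, since csvread reads numeric data only and returns a plain matrix, so dot indexing such as C.object_id would not work on its output.

```matlab
% Small in-memory table standing in for the real file (all values are invented).
object_id = ["cruise2011_a"; "cruise2012_a"; "cruise2012_b"; ...
             "cruise2014_a"; "cruise2016_a"];
object_annotation_hierarchy = [ ...
    "Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Calanidae"; ...
    "Arthropoda_Crustacea_Maxillopoda_Copepoda_Cyclopoida_Oithonidae"; ...
    "Mollusca_Gastropoda_Dummy"; ...
    "Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Temoridae"; ...
    "Arthropoda_Crustacea_Maxillopoda_Copepoda_Calanoida_Acartiidae"];
C = table(object_id, object_annotation_hierarchy);

% contains accepts several patterns and is true where any of them occurs.
yearTags  = "cruise" + (2012:2016);        % "cruise2012" ... "cruise2016"
inYears   = contains(C.object_id, yearTags);
isCopepod = contains(C.object_annotation_hierarchy, "Copepoda");

% Keep every column of the rows that satisfy both conditions.
C1 = C(inYears & isCopepod, :);
```

Here the 2011 row is dropped by the year test and the made-up mollusc row by the taxon test, so C1 keeps three rows. Matching on the substring "Copepoda" avoids listing each family separately.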
  14 comments
Matt J on 31 May 2023
Edited: Matt J on 31 May 2023
"Thanks a lot Matt, it ended up working when I left it overnight!"
I'm glad. Please Accept-click the answer to indicate that the question was resolved.
Error in test5 (line 12)
Should be,
mask1 = find(contains(C.object_id,"cruise2012"));
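The script that produced the error is not shown in this thread, so the following is only a generic illustration of what the find call changes: contains returns a logical column, and find turns it into a list of row numbers. The table and its values are invented; mask1 is the only name taken from the line above.

```matlab
% Invented example table.
object_id = ["cruise2011_a"; "cruise2012_a"; "cruise2013_a"; "cruise2012_b"];
value = [1; 2; 3; 4];
C = table(object_id, value);

tf    = contains(C.object_id, "cruise2012");   % logical column, one entry per row
mask1 = find(tf);                              % row numbers where tf is true

% Both forms select the same rows when used to index the table.
rowsByLogical = C(tf, :);
rowsByIndex   = C(mask1, :);
sameRows = isequal(rowsByLogical, rowsByIndex);
```

Row numbers are needed when later code expects positions (for example, to combine or offset indices), whereas a logical mask is usually enough for plain row selection.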


More Answers (0)

Release

R2022a
