Reading large .csv data with 12 million rows

Hi all, I have a .csv file with 14 columns and around 12 million rows. I would like to read the data and use it for further analysis. When I use the "readtable" command, it reads the first two columns completely but fails to read the data in the other columns. Is there any way I can read the entire dataset at once?
Thanks in advance.

3 comments

Mathieu NOE on 10 Nov 2021
Hello,
can you share an extract of your file (14 columns x 1000 rows)?
Ganesh Naik on 16 Dec 2021
Hi Mathieu, thanks for your message. I solved the problem using the following method:
1) I used the "Split CSV" program to split the original CSV file into 12 smaller CSV files (one million rows each).
2) Read each CSV file:
data1 = readtable('Data-1.csv');
data2 = readtable('Data-2.csv');
.
.
.
data12 = readtable('Data-12.csv');
3) Combine the pieces into the final table:
Final_data = [data1; data2; data3; data4; data5; data6; data7; data8; data9; data10; data11; data12];
This created one large table in the workspace (on the fly), and loading each of the files above does not take much memory. It worked for me, but I believe there may be better alternative methods.
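Steps 2 and 3 can also be written as a loop rather than twelve separate calls. The following is only a sketch, assuming the pieces are named Data-1.csv through Data-12.csv, sit in one folder, and share the same header row. So that it runs on its own, it first writes 12 tiny stand-in files with hypothetical columns `id` and `value` to a temporary folder; with real data, point `folder` at the split files and skip that first loop.

```matlab
% Stand-in data so the sketch is runnable: 12 tiny CSV files.
% The column names 'id' and 'value' are hypothetical placeholders.
folder = tempname;
mkdir(folder);
nFiles = 12;
for k = 1:nFiles
    T = table((1:3)' + 3*(k-1), rand(3,1), 'VariableNames', {'id','value'});
    writetable(T, fullfile(folder, sprintf('Data-%d.csv', k)));
end

% Read each piece into a cell array, then stack everything once at the end.
parts = cell(nFiles, 1);
for k = 1:nFiles
    parts{k} = readtable(fullfile(folder, sprintf('Data-%d.csv', k)));
end
Final_data = vertcat(parts{:});   % equivalent to [data1; data2; ...; data12]
```

Collecting the pieces in a cell array and concatenating once avoids growing the table inside the loop.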
Mathieu NOE on 16 Dec 2021
OK, glad you have found a workaround! :)


Answers (1)

Sivani Pentapati on 1 Dec 2021

0 votes

Hi Ganesh,
You can try using readmatrix in place of readtable. Another workaround is to convert the CSV files into MAT-files and save them with the '-v7.3' option. Please refer to this answer for more information.
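A minimal sketch of both suggestions, assuming the data is entirely numeric (readmatrix returns a plain numeric array, so text columns would not survive). The file names here are hypothetical, and a small stand-in CSV is written first so the example runs on its own; replace `csvFile` with the path to the real file.

```matlab
% Stand-in CSV so the sketch is runnable; use the real file path instead.
csvFile = [tempname '.csv'];
writematrix(magic(4), csvFile);

% readmatrix imports the file as a numeric array rather than a table.
M = readmatrix(csvFile);

% Save once as a version 7.3 MAT-file, the format that supports
% variables larger than 2 GB.
matFile = [tempname '.mat'];
save(matFile, 'M', '-v7.3');

% Later sessions can load the MAT-file instead of parsing the CSV again.
S = load(matFile);
```

The CSV is parsed only once this way; afterwards the data is read back from the MAT-file.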

1 comment

Ganesh Naik on 16 Dec 2021
Dear Sivani, I solved the problem with the Split CSV method: I read each file and combined the data (I explained my method above). It worked for me.
