Read Spreadsheet Data into Table
The best way to represent spreadsheet data in MATLAB® is in a table, which can store a mix of numeric and text data, as well as variable and row names. You can read data into tables interactively or programmatically. To interactively select data, click Import Data on the Home tab, in the Variable section. To programmatically import data, use one of these functions:
readtable
— Read a single worksheet.spreadsheetDatastore
— Read multiple worksheets or files.
This example shows how to import spreadsheet data programmatically using both functions. The sample data, airlinesmall_subset.xlsx
, contains one sheet for each year between 1996 and 2008. The sheet names correspond to the year, such as 2003
.
Read All Data from Worksheet
Call readtable
to read all the data in the worksheet called 2008, and then display only the first 10 rows and columns. Specify the worksheet name using the Sheet
name-value pair argument. If your data is on the first worksheet in the file, you do not need to specify Sheet
.
T = readtable('airlinesmall_subset.xlsx','Sheet','2008'); T(1:10,1:10)
ans=10×10 table
Year Month DayofMonth DayOfWeek DepTime CRSDepTime ArrTime CRSArrTime UniqueCarrier FlightNum
____ _____ __________ _________ _______ __________ _______ __________ _____________ _________
2008 1 3 4 1012 1010 1136 1135 {'WN'} 752
2008 1 4 5 1303 1300 1411 1415 {'WN'} 1161
2008 1 6 7 2134 2115 2242 2220 {'WN'} 1830
2008 1 7 1 1734 1655 54 30 {'WN'} 302
2008 1 8 2 1750 1755 2018 2035 {'WN'} 1305
2008 1 9 3 640 645 855 905 {'WN'} 896
2008 1 10 4 1943 1945 2039 2040 {'WN'} 120
2008 1 11 5 1303 1305 1401 1400 {'WN'} 1685
2008 1 13 7 1226 1230 1415 1400 {'WN'} 1118
2008 1 14 1 1337 1340 1623 1630 {'WN'} 730
Read Selected Range from Specific Worksheet
From the worksheet named 1996
, read only 10 rows of data from the first 5 columns by specifying a range, 'A1:E11'
. The readtable
function returns a 10-by-5 table.
T_selected = readtable('airlinesmall_subset.xlsx','Sheet','1996','Range','A1:E11')
T_selected=10×5 table
Year Month DayofMonth DayOfWeek DepTime
____ _____ __________ _________ _______
1996 1 18 4 2117
1996 1 12 5 1252
1996 1 16 2 1441
1996 1 1 1 2258
1996 1 4 4 1814
1996 1 31 3 1822
1996 1 18 4 729
1996 1 26 5 1704
1996 1 11 4 1858
1996 1 7 7 2100
Convert Variables to Datetimes, Durations, or Categoricals
During the import process, readtable
automatically detects the data types of the variables. However, if your data contains nonstandard dates, durations, or repeated labels, then you can convert those variables to their correct data types. Converting variables to their correct data types lets you perform efficient computations and comparisons and improves memory usage. For instance, represent the variables Year
, Month
, and DayofMonth
as one datetime
variable, the UniqueCarrier
as categorical
, and ArrDelay
as duration
in minutes.
data = T(:,{'Year','Month','DayofMonth','UniqueCarrier','ArrDelay'}); data.Date = datetime(data.Year,data.Month,data.DayofMonth); data.UniqueCarrier = categorical(data.UniqueCarrier); data.ArrDelay = minutes(data.ArrDelay);
Find the day of the year with the longest delay, and then display the date.
ind = find(data.ArrDelay == max(data.ArrDelay)); data.Date(ind)
ans = datetime
07-Apr-2008
Read All Worksheets from Spreadsheet File
A datastore is useful for processing arbitrarily large amounts of data that are spread across multiple worksheets or multiple spreadsheet files. You can perform data import and data processing through the datastore.
Create a datastore from the collection of worksheets in airlinesmall_subset.xlsx
, select the variables to import, and then preview the data.
ds = spreadsheetDatastore('airlinesmall_subset.xlsx'); ds.SelectedVariableNames = {'Year','Month','DayofMonth','UniqueCarrier','ArrDelay'}; preview(ds)
ans=8×5 table
Year Month DayofMonth UniqueCarrier ArrDelay
____ _____ __________ _____________ ________
1996 1 18 {'HP'} 6
1996 1 12 {'HP'} 11
1996 1 16 {'HP'} -13
1996 1 1 {'HP'} 1
1996 1 4 {'US'} -9
1996 1 31 {'US'} 9
1996 1 18 {'US'} -2
1996 1 26 {'NW'} -10
Before importing data, you can specify what data types to use. For this example, import UniqueCarrier
as a categorical variable.
ds.SelectedVariableTypes(4) = {'categorical'};
Import data using the readall
or read
functions. The readall
function requires that all the data fit into memory, which is true for the sample data. After the import, compute the maximum arrival delay for this dataset.
alldata = readall(ds); max(alldata.ArrDelay)/60
ans = 15.2333
For large data sets, import portions of the file using the read
function. For more information, see Read Collection or Sequence of Spreadsheet Files.
See Also
readtable
| spreadsheetDatastore