How to read all rows of an Excel with missing rows included?
Afficher commentaires plus anciens
My goal is to read all contents of an Excel file as cells using the function "readcell". However, every time when I used this function, it automatically eliminates any rows with missing contents above. How do I force this function to read the entire content of an Excel file with the missing rows above included?
4 commentaires
Leon
le 9 Mar 2025
Image Analyst
le 9 Mar 2025
Modifié(e) : Image Analyst
le 9 Mar 2025
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
Leon
le 9 Mar 2025
Réponse acceptée
Plus de réponses (3)
You can see if it has been fixed (I submitted bug/enhancement report quite some time ago), but I'm still at R2021b, but
readcell('tst.xlsx','range','A1')
should work, but it didn't as of R2021b
The simplest workaround uses an import data options object--
opt=detetctimportoptions('test.xlsx'); % create the base default import struct
opt.DataRange='A1'; % tell it to read data from the start
readcell('test.xlsx',opt) % read using the import options
Locally, R2021b
>> readcell('test.xlsx','range','A1')
ans =
3×3 cell array
{[ 1.00]} {[ 2.00]} {[ 3.00]}
{1×1 missing} {1×1 missing} {1×1 missing}
{[ 4.00]} {[ 5.00]} {[ 6.00]}
>> opt=detectImportOptions('test.xlsx')
opt =
SpreadsheetImportOptions with properties:
Sheet Properties:
Sheet: ''
Replacement Properties:
MissingRule: 'fill'
ImportErrorRule: 'fill'
Variable Import Properties: Set types by name using setvartype
VariableNames: {'Var1', 'Var2', 'Var3'}
VariableTypes: {'double', 'double', 'double'}
SelectedVariableNames: {'Var1', 'Var2', 'Var3'}
VariableOptions: Show all 3 VariableOptions
Access VariableOptions sub-properties using setvaropts/getvaropts
VariableNamingRule: 'modify'
Range Properties:
DataRange: 'A2' (Start Cell)
VariableNamesRange: ''
RowNamesRange: ''
VariableUnitsRange: ''
VariableDescriptionsRange: ''
To display a preview of the table, use preview
>> opt.DataRange='A1';
>> readcell('test.xlsx',opt)
ans =
4×3 cell array
{1×1 missing} {1×1 missing} {1×1 missing}
{[ 1.00]} {[ 2.00]} {[ 3.00]}
{1×1 missing} {1×1 missing} {1×1 missing}
{[ 4.00]} {[ 5.00]} {[ 6.00]}
>>
1 commentaire
readcell('test2.xlsx') % default fails to read blank line
readcell('test2.xlsx','range','A1') % has starting range been fixed yet???
No, it hasn't.
readcell('test2.xlsx','numheaderlines',0)
opt=detectImportOptions('test2.xlsx')
opt.DataRange='A1';
readcell('test2.xlsx',opt)
readcell('test2.xlsx','numheaderlines',0,'range','A1')
You can consider specifying which sheet and which range of cells to be read/imported using readcell() or readtable():
% READCELL()
D1 = readcell("tst.xlsx", Sheet=1, Range = 'A1:C5')
% READTABLE()
D2 = readtable("tst.xlsx", Sheet=1, Range = 'A1:C5')
4 commentaires
dpb
le 9 Mar 2025
Works, but then you need to know the actual size of the file a priori which is inconvenient at best...
Sulaymon Eshkabilov
le 9 Mar 2025
Déplacé(e) : Matt J
le 9 Mar 2025
Here is the code with readcell() and readtable() with default options and adjusted options to read/import data from Cell A1.
Note that readcell() does the job with its default settings as well.
% READCELL()
D1 = readcell("test2.xlsx", Sheet=1) % Default Settings with the start
% READCELL()
D1_ALT = readcell("test2.xlsx", Sheet=1, Range = 'A1') % Start reading/importing from Cell A1 in Test2.xlsx
% READTABLE()
D2 = readtable("test2.xlsx", Sheet=1)
% READTABLE()
D2ALT = readtable("test2.xlsx", Sheet=1, Range = 'A1') % Start reading/importing from Cell A1
Sulaymon Eshkabilov
le 9 Mar 2025
Déplacé(e) : Matt J
le 9 Mar 2025
Note that I am using this online MATLAB.
dpb
le 9 Mar 2025
No, it doesn't read the initial opening empty lines per OP's request in any of the above, the metainfo data is actually the fourth line in the Excel file, the three blank lines preceding it have not been returned.
See the <examples I added as a comment to my earlier Answer> to see the actual content of the beginning of the file.
ActiveX can be used to determine the last row/column of the worksheet: Then, you can force readcell to read in a larger range with the Range option:
[lastRow,lastCol] = getLastCell('TST.xlsx')
lastRow =
3
lastCol =
4
readcell('TST.xlsx',Range=[1,1,lastRow,lastCol])
ans =
3×4 cell array
{[<missing>]} {[<missing>]} {[<missing>]} {[<missing>]}
{[ 1]} {[ 2]} {[ 3]} {[ 50]}
{[ 4]} {[ 5]} {[ 6]} {[ 70]}
function [lastRow,lastCol] = getLastCell(filename)
sheet = 1; % Change if needed
% Open Excel via ActiveX
excel = actxserver('Excel.Application');
wb = excel.Workbooks.Open(fullfile(pwd, filename));
ws = wb.Sheets.Item(sheet);
% Get the UsedRange
range = ws.UsedRange;
% Get the first used row/column index
firstRow = range.Row; % First used row index
firstCol = range.Column; % First used column index
% Get number of rows and columns in the UsedRange
numRows = range.Rows.Count;
numCols = range.Columns.Count;
% Compute the last used row/column index
lastRow = firstRow + numRows - 1;
lastCol = firstCol + numCols - 1;
% Clean up
wb.Close(false);
excel.Quit();
delete(excel);
end
7 commentaires
The above can be made a little more user intuitive
function usedRange=getLastCell(filename)
% return used range address of current active workbook active sheet...
% Open Excel via ActiveX
excel = actxserver('Excel.Application');
wbk=excel.Workbooks.Open(fullfile(pwd, filename));
usedRange=wbk.ActiveSheet.UsedRange.Address;
wbk.Close(0)
excel.Quit
delete(excel)
end
This returns a char string address range all ready for use -- although you may need to strip the absolute address references before MATLAB readXXX family routines will accept it. There's a current enhancement request under evalutaion to remove this limitation; as of now, an absolute address likely will fail silently.
To handle that, one could instead use
usedRange=strrep(wbk.ActiveSheet.UsedRange.Address,'$','');
instead.
Matt J
le 9 Mar 2025
It's not quit 'ready for use' even after removing the '$' characters, because the upper left corner coordinate needs to be changed to A1
>> excel=actxserver('Excel.Application');
>> wbk=excel.Workbooks.Open(fullfile(pwd,'test.xlsx'));
>> usedRange=wbk.ActiveSheet.UsedRange;
>> usedRange.Address
ans =
'$A$1:$C$4'
>> excel.ActiveWorkbook.Close(0)
>> excel.Quit
>> delete(excel)
It returns the entire used range...
More likely causes of problems with this approach lie in that the used range may indicate a larger area of the sheet than what actually contains current data. If there has been more on the specific worksheet at some point that has only been cleared but not deleted, UsedRange may still reference that prior larger area.
But that's not what you want. If the range starts at A2, like the attached, you want to change it to A1.
>> excel=actxserver('Excel.Application');
>> wbk=excel.Workbooks.Open(fullfile(pwd,'TST.xlsx'));
>> usedRange=wbk.ActiveSheet.UsedRange;
>> usedRange.Address
ans =
'$A$2:$D$3'
Matt J
le 9 Mar 2025
It's not a manipulation. The use case given by the OP was that the spreadsheet my contain blank initial lines.
dpb
le 9 Mar 2025
If the lines were created by writing to somewhere other than the whole sheet...
Catégories
En savoir plus sur Spreadsheets dans Centre d'aide et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!