How do I import Velocity 3.2.0 CSV DVH data into MATLAB 9.1 (R2016b)?

How do I import radiation oncology software Velocity 3.2.0's dose-volume histogram (DVH) data in a comma-separated value file (CSV, sample file attached) into MATLAB 9.1 (R2016b)? Using Velocity one can create DVH data for multiple tissues displayed in a single graph, and export this data as a sequential two-column CSV.
csvread requires that "the file must contain only numeric values", whereas the CSV is two columns of data sets that begin with header text and end with an empty row.
It appears that for this reason a simple call to importdata is insufficient, because the command terminates after importing only the first data set:
test = importdata('filename.csv');
test =
struct with fields:
data: [1024×2 double]
textdata: {2×1 cell}
whereas the file actually contains additional data sets (e.g. copying from row 1026):
58.1704 0.00692086
Prostate
GY (CC)
55.2304 0.0046139
55.2333 0.00230695
What do we use to import data in CSV that is formatted as follows? (The following describes what is seen using Excel 2016.)
header text in Column 1
header text in Columns 1 and 2
numerical data in columns 1 and 2 in multiple rows
empty row
(repeat for next data set for multiple data sets of various length)
Walter Roberson requested a sample data file and provided a solution below using fopen, fgetl, feof, and textscan.
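For reference, the layout described above can also be walked line by line without textscan. This is only a minimal sketch, not the approach from the accepted answer: it assumes each header occupies its own line, numeric rows are comma-separated pairs, and the separator row is blank or holds only commas. The stand-in file, the exact header text, and the names sets and tline are hypothetical; the numeric rows are the ones quoted in the question.

```matlab
% Write a tiny stand-in file that follows the described layout.
% Header text and comma placement are assumptions about the real export.
fname = [tempname '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, 'Prostate,\nGY,(CC)\n55.2304,0.0046139\n55.2333,0.00230695\n,\n');
fclose(fid);

sets = struct('name', {}, 'units', {}, 'values', {});
fid = fopen(fname, 'rt');
tline = fgetl(fid);
while ischar(tline)                       % fgetl returns -1 at end of file
    if isempty(strrep(strtrim(tline), ',', ''))
        tline = fgetl(fid);               % skip blank or comma-only separator rows
        continue
    end
    k = numel(sets) + 1;
    sets(k).name   = tline;               % first header line
    sets(k).units  = fgetl(fid);          % second header line
    sets(k).values = zeros(0, 2);
    tline = fgetl(fid);
    while ischar(tline)
        row = sscanf(tline, '%f,%f');
        if numel(row) ~= 2, break; end    % separator row or next header reached
        sets(k).values(end+1, :) = row.'; % append one numeric row
        tline = fgetl(fid);
    end
end
fclose(fid);
delete(fname);
```

Because the inner loop stops at any line that does not parse as two numbers, the sketch should also cope with a file in which the separator row is missing between two data sets.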
4 comments
Walter Roberson on 5 Jan 2017
Velocity appears to be from Varian. Varian advertises, "Velocity provides a vendor-neutral platform that integrates image, structure, plan and dose data to create a unified patient dataset." Unfortunately their documentation is a bit sparse as to what that format is, except that they mention DICOM and RT Plan software. Someone has written software to read DICOM RT Plan data in MATLAB; see https://github.com/ulrikls/dicomrt2matlab
It sounds like your data is not DICOM based.
As I poke around, the information I am finding about DVH suggests that the most common formats are not what you are describing your file as having. But it is difficult to tell, as you have not given an example file.
Walter Roberson on 5 Jan 2017
There are millions of file formats. People invent their own more often than they use standard formats, and they modify the file format over time, often without considering backwards compatibility. There is no practical way for MathWorks to already support them all.
Mag tape was always written in records, often fixed-length binary records. Variable-length records did exist, but when it came time to start a new data structure, typically a new record was written. Not inevitably, though: packing multiple structures into one tape record did happen. Remember, though, that memory was typically not large and mag tape has to be read a complete record at a time (no positioning by bytes), so variable-length records did not pack in long continuous streams the way that became common in disk files.


Accepted Answer

Walter Roberson on 5 Jan 2017
There is no pre-written MathWorks routine to read that file format. It is, however, not difficult to write code for it.
num = 0;
fid = fopen('sample.csv','rt');
while true
    H1 = fgetl(fid);    % first header line (organ name)
    if feof(fid); break; end
    H2 = fgetl(fid);    % second header line
    if feof(fid); break; end
    % read numeric pairs until a non-numeric line is reached
    datacell = textscan(fid, '%f%f', 'delimiter', ',', 'collectoutput', true);
    if isempty(datacell) || isempty(datacell{1}); break; end
    num = num + 1;
    headers(num,:) = {H1, H2};    % one row of two header strings per data set
    data(num) = datacell;
    fgetl(fid); %the empty line between organs
end
fclose(fid);
This will create two cell arrays, one of headers and the other of corresponding numeric values. You might want to do some processing on H1 (organ name) and H2 (not sure what that line is for) before storing that information.
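As a rough illustration of that post-processing step, the sketch below turns the two cell arrays into a struct array with one element per organ. It assumes headers is an N-by-2 cell array of strings and data is an N-element cell array of two-column matrices; the stand-in values reuse the rows quoted in the question, while the raw header text (including the commas), the field names, and the reading of column 1 as dose and column 2 as volume are assumptions.

```matlab
% Stand-in inputs; in practice these come from the reading loop.
headers = {'Prostate,', 'GY,(CC)'};     % assumed raw text of the two header lines
data    = {[55.2304 0.0046139; 55.2333 0.00230695]};

dvh = struct('name', {}, 'units', {}, 'dose', {}, 'volume', {});
for k = 1:size(headers, 1)
    % strip surrounding whitespace and any trailing commas from the organ name
    dvh(k).name   = regexprep(strtrim(headers{k,1}), ',+$', '');
    % split the units line into one string per column
    dvh(k).units  = strsplit(strtrim(headers{k,2}), ',');
    dvh(k).dose   = data{k}(:,1);       % assumed: column 1 is dose
    dvh(k).volume = data{k}(:,2);       % assumed: column 2 is volume
end
```

Each curve could then be drawn with something like plot(dvh(k).dose, dvh(k).volume), using dvh(k).name for the legend entry.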
7 comments
Daniel Bridges on 7 Jan 2017 (edited 7 Jan 2017)
Is it not more legible and memory-efficient to put the NaN-row removal immediately after textscan creates the cell array?
datacell = textscan(fid,'%f%f','delimiter',',','collectoutput',true);
if isempty(datacell) || isempty(datacell{1}); break; end
if any(isnan(datacell{1}(end,:))); datacell{1}(end,:) = []; end
Walter Roberson on 7 Jan 2017
No, it is the same efficiency. But it certainly does not hurt to have it closer to where datacell is created.
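The check discussed here inspects only the last row. If NaN rows could turn up elsewhere in a block (an assumption; the thread only mentions the trailing one), a logical row mask removes all of them in one step. A minimal sketch on a stand-in matrix M, using the rows quoted in the question plus an artificial NaN row:

```matlab
% Stand-in for datacell{1}: two real rows followed by a row of NaNs
M = [55.2304 0.0046139; 55.2333 0.00230695; NaN NaN];

% any(isnan(M), 2) flags every row that contains at least one NaN
M(any(isnan(M), 2), :) = [];
```

Note that this drops a row even when only one of its two values is NaN, which may or may not be what is wanted for a partially filled row.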
