How do I import Velocity 3.2.0 CSV DVH data into MATLAB 9.1 (R2016b)?

Question

Daniel Bridges on 5 Jan 2017

Commented: Walter Roberson on 7 Jan 2017

sample.csv

How do I import radiation oncology software Velocity 3.2.0's dose-volume histogram (DVH) data in a comma-separated value file (CSV, sample file attached) into MATLAB 9.1 (R2016b)? Using Velocity one can create DVH data for multiple tissues displayed in a single graph, and export this data as a sequential two-column CSV.

csvread requires that "the file must contain only numeric values", whereas the CSV is two columns of data sets that begin with header text and end with an empty row.

For this reason a simple call to importdata appears to be insufficient, because the command terminates after importing only the first data set:

   test = importdata('filename.csv');
   test = 
     struct with fields:
           data: [1024×2 double]
       textdata: {2×1 cell}

whereas the file actually contains additional data sets (e.g. copying from row 1026):

   58.1704  0.00692086
  
   Prostate  
   GY   (CC)
   55.2304  0.0046139
   55.2333  0.00230695

What do we use to import data in CSV that is formatted as follows? (The following describes what is seen using Excel 2016.)

header text in Column 1
header text in Columns 1 and 2
numerical data in columns 1 and 2 in multiple rows
empty row
(repeat for next data set for multiple data sets of various length)

Walter Roberson requested a sample data file and provided a solution below using fopen, fgetl, feof, and textscan.

4 comments

Daniel Bridges on 5 Jan 2017

Edited: Daniel Bridges on 5 Jan 2017

I am now seeking to answer this question.

It seems a counterproductive workaround to import the entire file into a string and then write a script to parse its contents. To put it another way, I expect MathWorks to have a more elegant solution already prepared that I merely need to find.

One workaround is for Velocity: Instead of creating the "full" multiple-tissue DVH one wishes to export, one must save to multiple files a separate DVH for each organ of interest, so that there is only one data set per CSV. This is not ideal, but it seems faster than continuing to search for additional ideas.

Edit: Walter, I thought it was not uncommon for data to be written sequentially (i.e. appended end-to-end); old magnetic tape comes to mind. Because I thought MathWorks had prepared for common data files, I thought there was a command or option I was simply unaware of. I am sorry if this expectation was incorrect, but I don't see why it was unrealistic. I have attached a sample data file to the original post.
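For the one-file-per-organ workaround mentioned above, a minimal sketch might look like the following. It assumes each exported CSV holds a single data set, as in the importdata output shown in the question. The folder name `dvh_exports` and the struct array `dvh` are hypothetical names chosen for illustration, and the file name merely stands in for the organ name.

```matlab
% Sketch: import one DVH per file with importdata.
% Assumption: every CSV in the (hypothetical) folder 'dvh_exports'
% contains exactly one organ's data set.
folder = 'dvh_exports';                      % hypothetical folder name
files  = dir(fullfile(folder, '*.csv'));     % empty if nothing matches
dvh    = struct('name', {}, 'data', {});     % empty struct array to fill

for k = 1:numel(files)
    raw = importdata(fullfile(folder, files(k).name));
    if isstruct(raw)
        values = raw.data;                   % header text present: numbers are in .data
    else
        values = raw;                        % purely numeric file: importdata returns a matrix
    end
    dvh(k).name = files(k).name;             % file name stands in for the organ name
    dvh(k).data = values;                    % two numeric columns
end
```

Each element of `dvh` then holds one organ's curve, at the cost of one export per organ in Velocity.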

Walter Roberson on 5 Jan 2017

Velocity appears to be from Varian. Varian advertises,

https://www.varian.com/oncology/products/software/image-management-informatics/velocity?cat=store

"Velocity provides a vendor-neutral platform that integrates image, structure, plan and dose data to create a unified patient dataset." Unfortunately their documentation is a bit sparse as to what that format is. Except they mention DICOM, and they mention RT Plan software. Someone has written software to read DICOM RT Plan data in MATLAB; see https://github.com/ulrikls/dicomrt2matlab

It sounds like your data is not DICOM based.

As I poke around, the information I am finding about DVH suggests that the most common formats are not what you are describing your file as having. But it is difficult to tell, as you have not given an example file.

Walter Roberson on 5 Jan 2017

There are millions of file formats. People invent their own more often than they use standard formats, and they modify the file format over time, often without considering backwards compatibility. There is no practical way for MathWorks to already support them all.

Mag tape was always written in records, often fixed-length binary records. Variable-length records did exist, but when it came time to start a new data structure, typically a new record was written. Not inevitably though: packing multiple structures into one tape record did happen. Remember, though, that memory was typically not large and mag tape had to be read a complete record at a time (no positioning by bytes), so variable-length records did not pack in long continuous streams the way that later became common in disk files.

Answer 1

Walter Roberson on 5 Jan 2017

There is no pre-written MathWorks routine to read that file format. It is, however, not difficult to write code for it.

   num = 0;
   fid = fopen('sample.csv','rt');
   while true
     H1 = fgetl(fid);   % first header line (organ name)
     if feof(fid); break; end
     H2 = fgetl(fid);   % second header line
     if feof(fid); break; end
     % read numeric rows until textscan reaches something that is not a number
     datacell = textscan(fid, '%f%f', 'delimiter', ',', 'collectoutput', true);
     if isempty(datacell) || isempty(datacell{1}); break; end
     num = num + 1;
     headers(num) = {H1, H2};
     data(num) = datacell;
     fgetl(fid);  %the empty line between organs
   end
   fclose(fid);

This will create two cell arrays, one of headers and the other of corresponding numeric values. You might want to do some processing on H1 (organ name) and H2 (not sure what that line is for) before storing that information.
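As one possible way to process the two header lines, the sketch below splits them on the comma. The example strings are assumptions about what fgetl returns for the sample shown in the question; the thread does not confirm the exact raw text. H2 looks like column units (dose in Gy, volume in cc), but that is a guess too.

```matlab
% Sketch: tidy the two header lines returned by fgetl.
% Assumption: the raw CSV lines look like the strings below.
H1 = 'Prostate,';                      % assumed form of the first header line
H2 = 'GY,(CC)';                        % assumed form of the second header line

organ = strtrim(strtok(H1, ','));      % text before the first comma
units = strtrim(strsplit(H2, ','));    % one cell per column label
```

Storing `organ` and `units` instead of the raw lines keeps stray commas and whitespace out of plot legends and axis labels.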

7 comments

Daniel Bridges on 6 Jan 2017

Edited: Daniel Bridges on 6 Jan 2017

The headers line causes the error:

headers(num) = {H1,H2};

It is fixed by allowing for columns, enabling the creation of a 3x2 cell array in this case:

headers(num,:) = {H1,H2};

To get the headers to read correctly, I've had to omit the last line:

fgetl(fid); %the empty line between organs

This command was actually skipping the first header of the next data section, causing the first row of data to be stored as the second header. With it removed, the headers are stored correctly, but the empty row is being stored at the end of the numerical data as NaN in each column.

I'd like to accept this answer once I can remove the NaN from the end of the imported data. I've been writing a script to plot the data, and while the NaN may not negatively affect it since it's at the end of the vectors, for the sake of propriety it seems better to remove it.

I plan to return to this problem in about 10 hours, and try to post a solution myself unless someone does so first.

Daniel Bridges on 7 Jan 2017

Edited: Daniel Bridges on 7 Jan 2017

Is it not more legible and memory-efficient to put the NaN removal immediately after textscan's cell array creation?

     datacell = textscan(fid,'%f%f','delimiter',',','collectoutput',true); 
     if isempty(datacell) || isempty(datacell{1}); break; end 
     if any(isnan(datacell{1}(end,:))); datacell{1}(end,:) = []; end

Walter Roberson on 7 Jan 2017

No, it is the same efficiency. But it certainly does not hurt to have it closer to where datacell is created.
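Pulling the corrections from these comments together, the loop might end up looking like the sketch below. To keep it self-contained it first writes a tiny stand-in file to a temporary path. The organ name `OrganA`, the exact header text, and the `,` form of the "empty" row are assumptions; the numbers are the ones quoted in the question. With the real export, replace `samplePath` with the path to the attached sample.csv.

```matlab
% Assumption: a miniature file imitating the layout described in the question.
samplePath = [tempname '.csv'];
fid = fopen(samplePath, 'wt');
fprintf(fid, 'OrganA,\nGY,(CC)\n58.1704,0.00692086\n,\n');
fprintf(fid, 'Prostate,\nGY,(CC)\n55.2304,0.0046139\n55.2333,0.00230695\n,\n');
fclose(fid);

num = 0;
headers = cell(0, 2);                  % one row of {H1, H2} per data set
data = cell(0, 1);                     % one numeric array per data set
fid = fopen(samplePath, 'rt');
while true
    H1 = fgetl(fid);                   % first header line (organ name)
    if feof(fid); break; end
    H2 = fgetl(fid);                   % second header line
    if feof(fid); break; end
    % textscan stops at the next header because it cannot parse it as a number
    datacell = textscan(fid, '%f%f', 'delimiter', ',', 'collectoutput', true);
    if isempty(datacell) || isempty(datacell{1}); break; end
    % the empty row between data sets is read as NaN; drop it
    if any(isnan(datacell{1}(end,:))); datacell{1}(end,:) = []; end
    num = num + 1;
    headers(num,:) = {H1, H2};         % row assignment, as suggested in the comments
    data(num) = datacell;
    % no extra fgetl here: per the comments, it would swallow the next header
end
fclose(fid);
```

After the loop, `data{k}` holds the numeric columns for the data set whose header lines are in `headers(k,:)`.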
