large textfile 27580*1102 cell

Question

chocho on 20 Feb 2017


https://fr.mathworks.com/matlabcentral/answers/325972-large-textfile-27580-1102-cell

Commented: Rik on 21 Feb 2017

fid  = fopen('Cancer.txt','r');
fout = fopen('Cancer_split.txt','w');      % write to a separate file; the input is opened read-only
while ~feof(fid)
    l = fgetl(fid);                        % get the next line
    if ~ischar(l)
        break                              % fgetl returns -1 at end of file
    end
    cols = regexp(l,'\t','split');         % split the line into its columns
    if any(strcmp(cols,'NA'))
        continue                           % remove NA rows (whole-field match, so gene names containing 'NA' are kept)
    end
    cols  = strrep(cols,'"','');           % drop the quotes around values such as "COX8C;KIAA1409"
    parts = regexp(cols,';','split');      % look for ';' in every column and split it
    nrep  = max(cellfun(@numel,parts));    % number of output rows needed for this line
    for r = 1:nrep
        % columns without ';' are repeated, columns with ';' contribute their r-th part
        row = cellfun(@(p) p{min(r,numel(p))}, parts, 'UniformOutput', false);
        fprintf(fout,'%s\n',strjoin(row,'\t'));
    end
end
fclose(fid);
fclose(fout);

inputs:

Hybridization REF  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05
Composite Element REF  Beta_value  Gene_Symbol  Chromosome  Genomic_Coordinate  Beta_value  Gene_Symbol
cg00000292  0.511852232819811  ATP2A1  16  28890100  0.787687855895422  ATP2A1
cg00003994  0.0341977140819682    MEOX2   15725862  0.334815614333325     MEOX2
cg00008493  0.987979722052904  "COX8C;KIAA1409"  14  93813777  0.986128428295584  "COX8C;KIAA1409"
cg00011459  0.922491239231445  "TMEM186;PMM2"  16  8890425  0.961124285303233  "TMEM186;PMM2"

output:

Hybridization REF  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05 ……
cg00000292  0.511852232819811  ATP2A1   0.787687855895422  
cg00003994  0.0341977140819682    MEOX2   0.334815614333325     
cg00008493  0.987979722052904  COX8C     0.986128428295584      
cg00008493  0.987979722052904  KIAA1409  0.986128428295584
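
The step from the input rows to the output rows can be sketched for a single row as follows. This is only an illustration: the column selection [1 2 3 6] is an assumption inferred from the desired output above, and the variable names are made up for the example.

```matlab
% One parsed input row as a cell array of strings (values taken from the sample above).
cols = {'cg00008493','0.987979722052904','COX8C;KIAA1409','14', ...
        '93813777','0.986128428295584','COX8C;KIAA1409'};

keep  = cols([1 2 3 6]);                 % assumed selection: ID, beta, gene symbol, second beta
genes = strsplit(keep{3}, ';');          % split the gene symbols on ';'
rows  = repmat(keep, numel(genes), 1);   % one copy of the row per gene symbol
rows(:,3) = genes(:);                    % put a single gene symbol in each copy
```

Each row of `rows` then corresponds to one line of the desired output; rows without a ';' pass through unchanged because `strsplit` returns a single element for them.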

4 comments

chocho on 21 Feb 2017

Edited: Walter Roberson on 21 Feb 2017

The text file contains the following data:

Hybridization REF  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05
Composite Element REF  Beta_value  Gene_Symbol  Chromosome  Genomic_Coordinate  Beta_value  Gene_Symbol
cg00000292  0.511852232819811  ATP2A1  16  28890100  0.787687855895422  ATP2A1
cg00003994  0.0341977140819682    MEOX2   15725862  0.334815614333325     MEOX2
cg00008493  0.987979722052904  "COX8C;KIAA1409"  14  93813777  0.986128428295584  "COX8C;KIAA1409"
cg00011459  0.922491239231445  "TMEM186;PMM2"  16  8890425  0.961124285303233  "TMEM186;PMM2"
.......................................................................

I want to get this output:

Hybridization REF  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05  TCGA-A6-2672-11A-01D-1551-05 ……
cg00000292  0.511852232819811  ATP2A1   0.787687855895422  
cg00003994  0.0341977140819682    MEOX2   0.334815614333325     
cg00008493  0.987979722052904  COX8C     0.986128428295584      
cg00008493  0.987979722052904  KIAA1409  0.986128428295584

chocho on 21 Feb 2017

Appreciate your help!


Answer 1

Rik on 21 Feb 2017


https://fr.mathworks.com/matlabcentral/answers/325972-large-textfile-27580-1102-cell#answer_255655

So essentially you have a tab-separated file from which you only want to keep specific columns.

You can read a file like this with readtable. If you really have to go through it line by line you can use a for loop, but with readtable you can select the columns you want to keep by indexing the resulting table, and with writetable you can write the new file.

Note 1: You can set the 'Delimiter' parameter to a tab with '\t'.

Note 2: You'll need MATLAB R2013b or later. Otherwise you'll have to muck about with the textscan function.
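
A minimal sketch of the readtable/writetable approach, under a few assumptions that are not stated in the thread: the file has two header lines, the columns to keep are 1, 2, 3 and 6, and the file names are temporary stand-ins for Cancer.txt. The sample rows are shortened and written without the quotes that appear in the real file.

```matlab
% Build a tiny tab-separated sample file (hypothetical stand-in for Cancer.txt).
inFile  = [tempname '.txt'];
outFile = [tempname '.txt'];
fid = fopen(inFile, 'w');
fprintf(fid, 'Hybridization REF\tS1\tS1\tS1\tS1\tS1\tS1\n');
fprintf(fid, ['Composite Element REF\tBeta_value\tGene_Symbol\tChromosome\t' ...
              'Genomic_Coordinate\tBeta_value\tGene_Symbol\n']);
fprintf(fid, 'cg00000292\t0.51\tATP2A1\t16\t28890100\t0.78\tATP2A1\n');
fprintf(fid, 'cg00008493\t0.98\tCOX8C;KIAA1409\t14\t93813777\t0.99\tCOX8C;KIAA1409\n');
fclose(fid);

% Read the data rows, skipping the two header lines; columns are named Var1, Var2, ...
T = readtable(inFile, 'Delimiter', '\t', ...
              'ReadVariableNames', false, 'HeaderLines', 2);

% Keep only the assumed columns of interest and write them back out, tab-separated.
keep = T(:, [1 2 3 6]);
writetable(keep, outFile, 'Delimiter', '\t', 'WriteVariableNames', false);
```

Because the header names repeat (Beta_value, Gene_Symbol), the sketch skips them rather than using them as variable names; the header lines would have to be copied to the output separately if they are needed.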

5 comments

chocho on 21 Feb 2017

Yes, that is what I want. Then, from columns 6:4:end, I want to calculate the average, because all of them are floating-point values.
Could you please help me do that? It seems too hard for me.
I really appreciate your help.

Rik on 21 Feb 2017

If you have managed to convert your data to a matrix, you can use mean(data,2) to get the average along the 2nd dimension, that is, across the columns, which gives one value per row.
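
As a small illustration of that suggestion, the sketch below averages columns 6:4:end of a numeric matrix row by row. The matrix is made-up data; only the 6:4:end selection and mean(data,2) come from the thread.

```matlab
% Hypothetical numeric matrix: 3 rows (probes) by 10 columns.
data = reshape(1:30, 3, 10);

sel     = data(:, 6:4:end);   % columns 6 and 10 in this example
rowMean = mean(sel, 2);       % average across the selected columns, one value per row
```

If the matrix contains NaN values (for example from NA entries), mean(sel, 2, 'omitnan') ignores them in R2015a and later.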
