Hello, I have an IDL script (.pro) that I want to port to MATLAB. I don't know the MATLAB equivalent of IDL's READU (read unformatted data). The original script was:
OPENR, lun, fileName2, /GET_LUN ; read the input file
var0a = BYTARR(4) ; create array to load 4 char source program identifier
var0b = var0a; ; create 2nd array to load 4 char file type identifier
var1 = FLTARR(1)
var2 = LONARR(2)
var3 = DBLARR(4)
READU, lun, var0a, var1, var0b, var2, var3
If I open the file in Notepad, I see only symbols.
I have tried
fid = fopen('GSS1_t001_cx_ImgRall.dat');
col = fread(fid);
fclose(fid);
and I get a column vector, but I need to assign the values the way the IDL code does: an array of 4 chars, an array of one element, another array of 4 chars, an array of two elements, and an array of four elements.
I don't understand what lun is for. Maybe it gives the number of times we have to read the results.

 Accepted Answer

Walter Roberson on 11 Mar 2016

IDL's lun stands for "logical unit number", and it serves the same purpose as MATLAB's file identifier (fid).
fid = fopen('GSS1_t001_cx_ImgRall.dat', 'r');
var0a = fread(fid, [1 4], '*uint8');
var0b = fread(fid, [1 4], '*uint8');
var1 = fread(fid, [1 1], '*float');
var2 = fread(fid, [1 2], '*long');
var3 = fread(fid, [1 4], '*double');
fclose(fid);
Note: the above code assumes that the data in the file has the same "endianness" as the system running the code. All current releases of MATLAB happen to run on "little endian" architectures, so the code assumes little endian as well. If that does not match the file (certainly possible if you got the file from somewhere else), then in the fopen() call, after the 'r', add an additional parameter 'b' (big-endian).
"Endian" in files and in memory refers to whether the lowest memory address holds the "Most Significant Byte" or the "Least Significant Byte" of a multi-byte value. So if you have two consecutive bytes in memory, such as hex 01 and then hex 02 in that order, and you are reading 16 bits ("short", int16), is the 01 the most significant byte of the short or the least significant byte? To put it a different way, if the file has hex 01 then 02, is the numeric value 1 * 256 + 2, or is it 1 + 256 * 2?
It is just like writing the coefficients of a polynomial: do you write the coefficient of x^0, then the coefficient of x^1, then the coefficient of x^2 and so on, or do you write the coefficient of x^2 followed by the coefficient of x^1 followed by the coefficient of x^0? The two orders are not compatible with each other, but each is internally consistent. "Little endian" architectures, like all of the x86 and x64 architectures, say that the order in memory corresponds to 1 + 256 * 2 = 513, whereas "big endian" architectures say the order in memory corresponds to 1 * 256 + 2 = 258.
The Internet standards say that when data is transmitted between systems and the byte order is not specifically indicated, "big endian" order should be used, but that is not what Intel processors use. So if you got your file from outside, such as from a research site, it might be in big-endian order.
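As a small illustration of the 01 02 example, the sketch below writes those two bytes to a scratch file and reads them back as a single uint16 under each byte order, using the machine-format argument of fopen. The file name tmpFile is a placeholder made up for this example.

```matlab
% Write the two bytes 01 02 to a scratch file (tmpFile is a made-up name).
tmpFile = [tempname '.dat'];
fid = fopen(tmpFile, 'w');
fwrite(fid, [1 2], 'uint8');
fclose(fid);

% Read them back as one 16-bit unsigned integer, little-endian ('l').
fid = fopen(tmpFile, 'r', 'l');
littleValue = fread(fid, 1, 'uint16');   % 1 + 256*2
fclose(fid);

% Same two bytes, big-endian ('b').
fid = fopen(tmpFile, 'r', 'b');
bigValue = fread(fid, 1, 'uint16');      % 1*256 + 2
fclose(fid);

delete(tmpFile);
fprintf('little endian: %d, big endian: %d\n', littleValue, bigValue);
% prints: little endian: 513, big endian: 258
```

Because the byte order is given to fopen explicitly, the result does not depend on the machine that runs the code.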

9 comments

OldCar on 11 Mar 2016
Edited: Walter Roberson on 11 Mar 2016
I have tried your solution but it doesn't work. I get a result, but it is nonsense. I have tried both big and little endian.
I have also tried
clear
clc
fid = fopen('GSS1_t001_cx_ImgRall.dat');
C = textscan(fid,'%4s %f %4s %u %f');
fclose(fid);
but I obtain only the first 4 chars; the others are empty.
The big issue is that the file format is home-made and I have no idea how it is laid out.
I made a mistake: as far as I understand the IDL code, I need to obtain the objects var0a, var1, var0b, var2, var3, defined as
var0a = BYTARR(4) ; create array to load 4 char source program identifier
var0b = var0a; ; create 2nd array to load 4 char file type identifier
var1 = FLTARR(1)
var2 = LONARR(2)
var3 = DBLARR(4)
Walter Roberson on 11 Mar 2016
What are the expected values, as determined by IDL?
Is it possible that the expected value for var1 is 1.17 ?
OldCar on 11 Mar 2016
Edited: OldCar on 11 Mar 2016
I don't know, but it could be. I don't have IDL, so I can't run the code. Did you get the other values? I think that var0a is GSS1. What did you get as var0b? Which code did you use in MATLAB? With my code I have a problem with the variable separator.
Walter Roberson
The first 4 bytes are 'GSS1'. The 4 bytes after that are numeric. The 3rd group of 4 bytes are 'comp'.
That group of 4 numeric bytes might represent any of:
  • 1066779279 - long, or unsigned long, little endian order
  • 1.16999995708465576171875 - float, little endian order. This is the closest approximation single precision numbers can make to 1.17
  • [-15729, 16277] - signed short (16 bit), little endian
  • [143, 194, 149, 63] - unsigned byte, either endian
  • [-113, -62, -107, 63] - signed byte, either endian
  • 2411894079 - unsigned long, big endian order
  • -1883073217 - signed long, big endian order
  • -1.918736445581672621892012844971095850555915077770112453503514871044899336993694305419921875e-29 -- float, big endian order
  • [-28734 -27329] - signed short (16 bit), big endian
Out of those, the little-endian value of 1.17 has the look of being least arbitrary, so I suspect your data is little endian and that your var1 is before your var0b.
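The candidate readings listed above can be reproduced with typecast and swapbytes. This is only a sketch: typecast reinterprets bytes in the host's native order, so the code assumes (and checks) that it runs on a little-endian machine. The variable names are made up for the example.

```matlab
% The four bytes that follow 'GSS1' in the file, as listed above.
b = uint8([143 194 149 63]);

% typecast uses the host's native byte order; this sketch assumes
% a little-endian host (x86/x64) and stops otherwise.
[~, ~, hostEndian] = computer;
assert(hostEndian == 'L', 'sketch assumes a little-endian host');

% Little-endian readings.
asFloatLE = typecast(b, 'single');   % approximately 1.17
asLongLE  = typecast(b, 'int32');
asShortLE = typecast(b, 'int16');
asInt8    = typecast(b, 'int8');     % byte order is irrelevant for 8-bit types

% Big-endian readings: reinterpret, then swap the bytes within each element.
asULongBE = swapbytes(typecast(b, 'uint32'));
asLongBE  = swapbytes(typecast(b, 'int32'));
asShortBE = swapbytes(typecast(b, 'int16'));
asFloatBE = typecast(asULongBE, 'single');

fprintf('float LE: %.8g, long LE: %d\n', asFloatLE, asLongLE);
fprintf('float BE: %.8g, ulong BE: %u\n', asFloatBE, asULongBE);
```

Only the little-endian float comes out as a "round" number, which is the basis for the guess about the file's byte order.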
The 8 bytes after that look like they might be [1, 117] as little-endian long values. That would be plausible as your var2.
If we then guess that what follows is 4 doubles in little endian, those would come out as
[74.9500000000000028421709430404007434844970703125, Inf, 300, 3350]
The first value is the closest double representation of 74.95, and the 300 and 3350 are exact. The big-endian versions of the values are pretty arbitrary, not "nice" at all. I think it pretty likely that we have correctly identified the file as little endian.
This leads to an overall solution of:
fid = fopen('GSS1_t001_cx_ImgRall.dat', 'r');
var0a = fread(fid, [1 4], '*char');
var1 = fread(fid, [1 1], '*float');
var0b = fread(fid, [1 4], '*char');
var2 = fread(fid, [1 2], '*long');
var3 = fread(fid, [1 4], '*double');
fclose(fid);
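One way to check this read order without the original file is to write a small stand-in header with the values inferred in this thread and read it back. This is a sketch under assumptions: tmpFile is a made-up scratch file, the byte order is forced to little endian through fopen, and the size-explicit names 'single' and 'int32' are used in place of 'float' and 'long'.

```matlab
% Build a stand-in header with the values inferred in this thread.
% tmpFile is a made-up scratch file, not the real data file.
tmpFile = [tempname '.dat'];
fid = fopen(tmpFile, 'w', 'l');               % 'l' = little-endian
fwrite(fid, uint8('GSS1'), 'uint8');          % 4-byte program identifier
fwrite(fid, 1.17, 'single');                  % FLTARR(1)
fwrite(fid, uint8('comp'), 'uint8');          % 4-byte file type identifier
fwrite(fid, [1 117], 'int32');                % LONARR(2)
fwrite(fid, [74.95 Inf 300 3350], 'double');  % DBLARR(4)
fclose(fid);

% Read it back in the same order as the IDL READU call.
fid = fopen(tmpFile, 'r', 'l');
var0a = fread(fid, [1 4], 'uint8=>char');     % bytes converted to a char row
var1  = fread(fid, [1 1], '*single');
var0b = fread(fid, [1 4], 'uint8=>char');
var2  = fread(fid, [1 2], '*int32');
var3  = fread(fid, [1 4], '*double');
fclose(fid);

info = dir(tmpFile);
headerBytes = info.bytes;                     % 4 + 4 + 4 + 8 + 32 = 52
delete(tmpFile);
fprintf('%s %.2f %s [%d %d], header = %d bytes\n', ...
        var0a, var1, var0b, var2(1), var2(2), headerBytes);
```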
OldCar on 12 Mar 2016
Edited: Walter Roberson on 12 Mar 2016
I am uncertain because the next instructions are
nCol = var2[0] ; copy column and row numbers to suitably named variables
nRow = var2[1]
var4 = DBLARR( 2*nCol, nRow ) ; now know image size so declare a variable for it and read it
READU, lun, var4
and I have tried
fid = fopen('GSS1_t001_cx_ImgRall.dat', 'r');
var0a = fread(fid, [1 4], '*char');
var1 = fread(fid, [1 1], '*float');
var0b = fread(fid, [1 4], '*char');
var2 = fread(fid, [1 2], '*long');
var3 = fread(fid, [1 4], '*double');
var4 = fread(fid, [2*var2(1) var2(2)], '*double');
fclose(fid);
but I get NaN in every element of var4
Walter Roberson on 12 Mar 2016
The entire rest of the file is filled with a bit pattern that looks completely meaningless unless you interpret it as NaN, in which case it is exactly the right pattern: the usual non-signaling NaN. NaN is a family of values, and if the file held any other bit pattern in that family I would have said it was suspicious, but it is exactly the bit pattern of the most common NaN.
OldCar on 12 Mar 2016
With the commands
fid = fopen('GSS1_t001_cx_ImgRall.dat');
col9 = fread(fid);
fclose(fid);
I obtain a vector of 1924 elements, and they are not NaN?
Walter Roberson on 12 Mar 2016
Those would be uint8 values, with the pattern 0 0 0 0 0 0 248 255 repeated, and that happens to be the pattern for nan in little endian. The simplest explanation is that the data is indeed nan.
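A short sketch of both points: the repeated 8-byte pattern really is a double NaN when read as little endian, and the 1924 bytes match the layout read earlier with var2 = [1 117]. The variable names are made up for the example, and typecast again assumes a little-endian host.

```matlab
% The 8-byte pattern that repeats through the rest of the file.
pattern = uint8([0 0 0 0 0 0 248 255]);

% typecast uses the host's native byte order; assumes a little-endian host.
[~, ~, hostEndian] = computer;
assert(hostEndian == 'L', 'sketch assumes a little-endian host');
value = typecast(pattern, 'double');
isPatternNaN = isnan(value);

% Byte budget for the layout, using var2 = [1 117] from the header.
nCol = 1;
nRow = 117;
headerBytes = 4 + 4 + 4 + 2*4 + 4*8;   % var0a, var1, var0b, var2, var3
imageBytes  = 8 * (2*nCol) * nRow;     % var4 stored as doubles
totalBytes  = headerBytes + imageBytes;

fprintf('pattern is NaN: %d, total bytes: %d\n', isPatternNaN, totalBytes);
% prints: pattern is NaN: 1, total bytes: 1924
```

The total agrees with the 1924 elements returned by the byte-wise fread, which supports reading the whole remainder of the file as var4.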
OldCar on 17 Mar 2016
You are right. Thank you for your help.
