Get next or prior single precision value MATLAB function ?

15 vues (au cours des 30 derniers jours)
Firan Lucian
Firan Lucian le 12 Août 2019
Modifié(e) : Stephen23 le 26 Jan 2021
Is there a MATLAB function for next or prior float32 number ?
(similar to nextafterf, nexttowardf or float_next, float_prior)
tried these functions:
function [out] = next_SP(val)
% Get next after SP value - float32, single precision
% Increment float with smalllest step, SP representable
% TODO check for sign changes, domain INF NAN subnormals changes,
% uint32 ovf realmax + 1, mantissa exp changes
% Check nextafterf from math.h
% The nextafter() functions return the next representable floating-point
% value following x in the direction of y. If y is less than x, these
% functions will return the largest representable number less than x.
% If x equals y, the functions return y.
% https://www.boost.org/doc/libs/1_46_1/libs/math/doc/sf_and_dist/html/math_toolkit/utils/next_float/float_next.html
% Returns the next representable value which is greater than x.
% If x is non-finite then returns the result of a domain_error.
% If there is no such value greater than x then returns an overflow_error.
int_v = typecast(single(val), 'uint32' );
%disp([ 'Init ' num2str(single(val)) ' 0x' num2hex(single(val)) ', ' dec2bin(int_v, 32) ]);
int_v = int_v + 1;
out = typecast(uint32(int_v), 'single' );
%disp([ 'Init ' num2str(out) ' 0x' num2hex(out) ', ' dec2bin(int_v, 32) ]);
end
and
function [out] = prior_SP(val)
% Get next before SP value - float32, single precision
% Decrement float with smalllest step, SP representable
% TODO check for sign changes, domain INF NAN subnormals changes,
% uint32 udf realmax + 1, mantissa exp changes
% Check nexttowardf from math.h
% https://www.boost.org/doc/libs/1_48_0/libs/math/doc/sf_and_dist/html/math_toolkit/utils/next_float/float_prior.html
% Returns the next representable value which is less than x.
% If x is non-finite then returns the result of a domain_error.
% If there is no such value less than x then returns an overflow_error.
int_v = typecast(single(val), 'uint32' );
%disp([ 'Init ' num2str(single(val)) ' 0x' num2hex(single(val)) ', ' dec2bin(int_v, 32) ]);
int_v = int_v - 1;
out = typecast(uint32(int_v), 'single' );
%disp([ 'Init ' num2str(out) ' 0x' num2hex(out) ', ' dec2bin(int_v, 32) ]);
end
nexttoward_SP(-0)
Init 0 0x80000000, 10000000000000000000000000000000
Init NaN 0x7fffffff, 01111111111111111111111111111111
nexttoward_SP(0)
Init 0 0x00000000, 00000000000000000000000000000000
Init 0 0x00000000, 00000000000000000000000000000000
nexttoward_SP(-inf)
Init -Inf 0xff800000, 11111111100000000000000000000000
Init -3.402823466385289e+38 0xff7fffff, 11111111011111111111111111111111
nexttoward_SP(Inf)
Init Inf 0x7f800000, 01111111100000000000000000000000
Init 3.402823466385289e+38 0x7f7fffff, 01111111011111111111111111111111
nexttoward_SP(nan)
Init NaN 0xffc00000, 11111111110000000000000000000000
Init NaN 0xffbfffff, 11111111101111111111111111111111
(maybe similar for half or double precision)
  5 commentaires
Firan Lucian
Firan Lucian le 12 Août 2019
Modifié(e) : Firan Lucian le 12 Août 2019
My code seems to work - just wonder if all special cases are addressed, as incrementing uint you can increment mantissa, exponent than sign.
Or some build in already available.
There are some differences compared to proposed :
x + eps(single(x))
x - eps(single(x))
close all
clear all
tic
% inf
dispv(inf);
% inf -
dispv(-inf);
% zero
dispv(0);
% zero -
dispv(-0);
% 1
dispv(1);
% 1 -
dispv(-1);
% nan
dispv(nan);
% realmin
dispv(realmin('single'));
% realmin-
dispv(-realmin('single'));
% realmax
dispv(realmax('single'));
% realmax-
dispv(-realmax('single'));
toc
function dispv(x)
x = single(x);
disp([' x ' num2str(x) ]);
disp(['x-eps(x) ' num2str(x-eps(x)) ...
', x+eps(x) ' num2str(x+eps(x)) ]);
disp(['prior_SP(x) ' num2str(prior_SP(x)) ...
', next_SP(x) ' num2str(next_SP(x)) ]) ;
disp([ 'x ' dec2bin(typecast(x,'uint32'), 32) ...
' ' dec2bin(typecast(x,'uint32'), 32) ]);
disp([ 'eps ' dec2bin(typecast((x-eps(x)),'uint32'), 32) ...
' ' dec2bin(typecast((x+eps(x)),'uint32'), 32) ]);
disp([ 'p n ' dec2bin(typecast(prior_SP(x),'uint32'), 32) ...
' ' dec2bin(typecast(next_SP(x),'uint32'), 32) ]);
end
prior_next_SP_tester
x Inf
x-eps(x) NaN, x+eps(x) NaN
prior_SP(x) 3.402823466385289e+38, next_SP(x) NaN
x 01111111100000000000000000000000 01111111100000000000000000000000
eps 11111111110000000000000000000000 11111111110000000000000000000000
p n 01111111011111111111111111111111 01111111100000000000000000000001
x -Inf
x-eps(x) NaN, x+eps(x) NaN
prior_SP(x) -3.402823466385289e+38, next_SP(x) NaN
x 11111111100000000000000000000000 11111111100000000000000000000000
eps 11111111110000000000000000000000 11111111110000000000000000000000
p n 11111111011111111111111111111111 11111111100000000000000000000001
x 0
x-eps(x) -1.4013e-45, x+eps(x) 1.4013e-45
prior_SP(x) 0, next_SP(x) 1.4013e-45
x 00000000000000000000000000000000 00000000000000000000000000000000
eps 10000000000000000000000000000001 00000000000000000000000000000001
p n 00000000000000000000000000000000 00000000000000000000000000000001
x 0
x-eps(x) -1.4013e-45, x+eps(x) 1.4013e-45
prior_SP(x) NaN, next_SP(x) -1.4013e-45
x 10000000000000000000000000000000 10000000000000000000000000000000
eps 10000000000000000000000000000001 00000000000000000000000000000001
p n 01111111111111111111111111111111 10000000000000000000000000000001
x 1
x-eps(x) 1, x+eps(x) 1
prior_SP(x) 1, next_SP(x) 1
x 00111111100000000000000000000000 00111111100000000000000000000000
eps 00111111011111111111111111111110 00111111100000000000000000000001
p n 00111111011111111111111111111111 00111111100000000000000000000001
x -1
x-eps(x) -1, x+eps(x) -1
prior_SP(x) -1, next_SP(x) -1
x 10111111100000000000000000000000 10111111100000000000000000000000
eps 10111111100000000000000000000001 10111111011111111111111111111110
p n 10111111011111111111111111111111 10111111100000000000000000000001
x NaN
x-eps(x) NaN, x+eps(x) NaN
prior_SP(x) NaN, next_SP(x) NaN
x 11111111110000000000000000000000 11111111110000000000000000000000
eps 11111111110000000000000000000000 11111111110000000000000000000000
p n 11111111101111111111111111111111 11111111110000000000000000000001
x 1.1755e-38
x-eps(x) 1.1755e-38, x+eps(x) 1.1755e-38
prior_SP(x) 1.1755e-38, next_SP(x) 1.1755e-38
x 00000000100000000000000000000000 00000000100000000000000000000000
eps 00000000011111111111111111111111 00000000100000000000000000000001
p n 00000000011111111111111111111111 00000000100000000000000000000001
x -1.1755e-38
x-eps(x) -1.1755e-38, x+eps(x) -1.1755e-38
prior_SP(x) -1.1755e-38, next_SP(x) -1.1755e-38
x 10000000100000000000000000000000 10000000100000000000000000000000
eps 10000000100000000000000000000001 10000000011111111111111111111111
p n 10000000011111111111111111111111 10000000100000000000000000000001
x 3.402823466385289e+38
x-eps(x) 3.402823263561193e+38, x+eps(x) Inf
prior_SP(x) 3.402823263561193e+38, next_SP(x) Inf
x 01111111011111111111111111111111 01111111011111111111111111111111
eps 01111111011111111111111111111110 01111111100000000000000000000000
p n 01111111011111111111111111111110 01111111100000000000000000000000
x -3.402823466385289e+38
x-eps(x) -Inf, x+eps(x) -3.402823263561193e+38
prior_SP(x) -3.402823263561193e+38, next_SP(x) -Inf
x 11111111011111111111111111111111 11111111011111111111111111111111
eps 11111111100000000000000000000000 11111111011111111111111111111110
p n 11111111011111111111111111111110 11111111100000000000000000000000
Elapsed time is 0.054545 seconds.
Firan Lucian
Firan Lucian le 12 Août 2019
Modifié(e) : Firan Lucian le 12 Août 2019
similar implementation
float ulpsIncrement(float f, int32_t ulps)
{
if (isnan(f) || isinf(f)) return f;
int32_t i;
memcpy(&i, &f, sizeof(float));
i += ulps;
memcpy(&f, &i, sizeof(float));
return f;
}
GLM_FUNC_QUALIFIER float nextafterf(float x, float y)
{
volatile float t;
glm::detail::int32 hx, hy, ix, iy;
GLM_GET_FLOAT_WORD(hx, x);
GLM_GET_FLOAT_WORD(hy, y);
ix = hx&0x7fffffff; // |x|
iy = hy&0x7fffffff; // |y|
if((ix>0x7f800000) || // x is nan
(iy>0x7f800000)) // y is nan
return x+y;
if(x==y) return y; // x=y, return y
if(ix==0) { // x == 0
GLM_SET_FLOAT_WORD(x,(hy&0x80000000)|1);// return +-minsubnormal
t = x*x;
if(t==x) return t; else return x; // raise underflow flag
}
if(hx>=0) { // x > 0
if(hx>hy) { // x > y, x -= ulp
hx -= 1;
} else { // x < y, x += ulp
hx += 1;
}
} else { // x < 0
if(hy>=0||hx>hy){ // x < y, x -= ulp
hx -= 1;
} else { // x > y, x += ulp
hx += 1;
}
}
hy = hx&0x7f800000;
if(hy>=0x7f800000) return x+x; // overflow
if(hy<0x00800000) { // underflow
t = x*x;
if(t!=x) { // raise underflow flag
GLM_SET_FLOAT_WORD(y,hx);
return y;
}
}
GLM_SET_FLOAT_WORD(x,hx);
return x;
}

Connectez-vous pour commenter.

Réponse acceptée

James Tursa
James Tursa le 12 Août 2019
Modifié(e) : James Tursa le 12 Août 2019
The designers of IEEE floating point were brilliant. The next largest value (in magnitude) is always obtained by just adding 1 to the underlying integer bit pattern. When the integer value rolls over into the mantissa part, guess what ... that’s the correct result. When the value reaches realmax and you add 1 to the underlying integer, guess what ... you get the correct result of inf. Brilliant! The issues of 0 crossings and NaN have already been noted by Walter. You will have to decide for yourself what the various NaN bit patterns mean for your application.
  12 commentaires
Steven Lord
Steven Lord le 26 Jan 2021
format hex
x = 3
x =
4008000000000000
y = x + eps(x)
y =
4008000000000001
Note that x and y differ only in the last bit.
z = realmax
z =
7fefffffffffffff
w = z + eps(z)
w =
7ff0000000000000
isinf(w)
ans = logical
1
Stephen23
Stephen23 le 26 Jan 2021
Modifié(e) : Stephen23 le 26 Jan 2021
The original question specifies single type:
format hex
x = single(3)
x = single
40400000
y = x + eps(x)
y = single
40400001
However this does not completely answer the question, which as well as the next value also requests the previous value. As far as I can tell, the previous value can be obtained like this (for positive values):
x = flintmax('single')
x = single
4b800000
z = x - eps(x-eps(x)) % correct
z = single
4b7fffff
z = x - eps(x) % incorrect
z = single
4b7ffffe
As far as I can tell, previous negative values only require this:
x = -flintmax('single')
x = single
cb800000
y = x - eps(x)
y = single
cb800001
By symmetry this implies that the next negative value would actually be given by:
y = x + eps(x+eps(x)) % correct
y = single
cb7fffff
y = x + eps(x) % incorrect
y = single
cb7ffffe
This is just me thinking aloud... please feel free to continue / comment. Is there a neat way to combine the 4 cases?

Connectez-vous pour commenter.

Plus de réponses (0)

Produits


Version

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by