Half precision using GPU

As pointed out, gpuArray does not support half. The main reason is that half is an emulated type that is only meaningful for deployment to special hardware; it is not native to most processors. Feel free to investigate the use of half for code generation.

Do you just want to store data in half to save space on the GPU? You can use the following code to get something like the behaviour you're after:

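function u = toHalf(x)
% Pack values into a crude 16-bit layout held in a uint16 (storage only).
realmaxHalf = single(65504);
x = min(max(x,-realmaxHalf),realmaxHalf);   % clamp to the fp16 range
[f,e] = log2(abs(x));                       % abs(x) = f.*2.^e, f in [0.5,1)
sgn = uint16(x>=0);                         % 1 for non-negative, 0 for negative
sgnbit = bitshift(sgn,15);
expbits = bitshift(uint16(e+15),10);        % biased exponent in bits 10..14
fbits = uint16(f.*2.^10 - 1);               % mantissa stored as f*1024 - 1
u = bitor(bitor(sgnbit, expbits), fbits);
u(x==0) = 0;                                % reserve the all-zero code for zero
end

function x = fromHalf(u)
% Unpack values produced by toHalf back to single.
u = uint16(u);
sgn = single(bitshift(u,-15));
fbits = bitand(u,uint16(1023));
f = single(fbits+1)./(2.^10);
expbits = bitand(u,uint16(31744));
e = single(bitshift(expbits,-10))-15;
x = (sgn.*2-1).*f.*2.^e;
x(u==0) = 0;                                % elementwise, so arrays work too
end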

Note, this is a very crude implementation of fp16 that takes no account of NaNs, Infs, correct overflow behaviour or denormals. The half version is just a uint16 with the data in it; you can't actually use it to compute anything in fp16.
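
As a rough illustration of how the two functions might be used, here is a minimal sketch. It assumes toHalf and fromHalf are saved on the MATLAB path; the test data and variable names are made up for the example, and the GPU branch is only taken when a supported device is present.

```matlab
% Sketch only: assumes toHalf and fromHalf from this answer are on the path.
x = single(1 + 99*rand(1000,1));   % hypothetical test data in [1, 100]
if canUseGPU                       % move to the GPU only if one is available
    x = gpuArray(x);
end

u = toHalf(x);                     % uint16 storage: 2 bytes per element
y = fromHalf(u);                   % convert back to single before any math

% Roughly 10 mantissa bits are kept, so expect an error near 2^-10.
relErr = gather(max(abs(y - x)./abs(x)));
fprintf('max relative error = %.2e\n', relErr);
```

The idea is to keep only u resident between computations and call fromHalf on the piece you need right before operating on it.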

Joss Knight on 11 Apr 2023

'fraid not. No chance of that! Your only hope is to actually convert to int16 (by rescaling to some range), but you will find many blockers in the way such as integer overflow and unsupported mathematical operations. The code I gave you merely stores the number you have as a float into 16 bits; you can't actually do any computation with it.
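
A rough sketch of what rescaling to int16 could look like, purely for illustration. The range [lo, hi], the sample data and the variable names are assumptions, not something from the original code. The last line shows one of the blockers mentioned: MATLAB integer arithmetic saturates instead of wrapping.

```matlab
% Sketch only: the data and the range [lo, hi] are hypothetical.
x  = single(linspace(-3, 5, 1001));
lo = min(x);
hi = max(x);
scale = (hi - lo)/65535;                    % 65535 steps span the int16 range

q  = int16(round((x - lo)/scale) - 32768);  % [lo, hi] -> [-32768, 32767]
xr = (single(q) + 32768)*scale + lo;        % map back to the original range
maxErr = max(abs(xr - x));                  % about scale/2 at worst

% Arithmetic done directly on the int16 values saturates at the type limits.
s = int16(30000) + int16(30000);            % gives 32767, not 60000
```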

Fernando on 11 Apr 2023

I see. The issue is that I gain more from having larger matrices, as opposed to having smaller ones with higher precision.

I guess I could try to work with your solution while I figure out another way or buy a better GPU.

Answer 2

Matt J on 11 Apr 2023

https://fr.mathworks.com/matlabcentral/answers/1944744-half-precision-using-gpu#answer_1213224

Edited: Matt J on 11 Apr 2023

GPU code generation does support half, but the Parallel Computing Toolbox, which is where gpuArray is defined, does not.

Matt J on 11 Apr 2023

Edited: Matt J on 11 Apr 2023

You should probably break the data sets into smaller chunks and process them sequentially. The GeForce RTX 3080 can only process about 70000 threads at a time anyway.
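
One way the chunking could be sketched is shown below. The chunk size, the data and the per-chunk operation are all placeholders, so treat this as an outline rather than a tuned implementation. Only one chunk sits in GPU memory at a time, and the code falls back to the CPU when no supported GPU is found.

```matlab
% Sketch only: chunkSize, the data and the computation are placeholders.
N = 1e6;
data = rand(N, 1, 'single');       % full data set stays in host memory
chunkSize = 2.5e5;                 % hypothetical; pick to fit GPU memory
result = zeros(N, 1, 'single');
useGPU = canUseGPU;

for first = 1:chunkSize:N
    last  = min(first + chunkSize - 1, N);
    block = data(first:last);
    if useGPU
        block = gpuArray(block);   % only this chunk occupies GPU memory
    end
    % Placeholder computation; gather brings the result back to the host.
    result(first:last) = gather(sqrt(block) + 1);
end
```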

Fernando on 11 Apr 2023

Ok, I will try to look into this.
