Numerical instabilities in GPU results

I ran this code:
T = randn(10000, 64);        % random test matrix
data = randn(1000, 64, 10);  % 10 slices of random data
Tg = gpuArray(T);            % copy inputs to the GPU
datag = gpuArray(data);
res = zeros(10000, 1000);
resg = gpuArray(res);
for i = 1:10                 % accumulate on the CPU
    res = res + T * data(:, :, i)';
end
for i = 1:10                 % same accumulation on the GPU
    resg = resg + Tg * datag(:, :, i)';
end
resg = gather(resg);         % copy the GPU result back to the host
norm(res - resg, 'fro') / norm(res, 'fro')
where I would expect "res" (CPU-computed) and "resg" (GPU-computed) to agree to within rounding error, but they do not.
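A per-iteration comparison (a sketch of the same computation; the printed relative error is purely diagnostic and not part of the original code) shows at which accumulation step the two results start to diverge:

```matlab
% Sketch: compare CPU and GPU partial sums after every iteration to find
% the first step where they diverge.
T = randn(10000, 64);
data = randn(1000, 64, 10);
Tg = gpuArray(T);
datag = gpuArray(data);
res = zeros(10000, 1000);
resg = gpuArray(res);
for i = 1:10
    res  = res  + T  * data(:, :, i)';
    resg = resg + Tg * datag(:, :, i)';
    relerr = norm(res - gather(resg), 'fro') / norm(res, 'fro');
    fprintf('iteration %d: relative error %g\n', i, relerr);
end
```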
I am running this on a Tesla card:
gpuDevice
ans =
parallel.gpu.CUDADevice handle
Package: parallel.gpu
Properties:
Name: 'Tesla C1060'
Index: 1
ComputeCapability: '1.3'
SupportsDouble: 1
DriverVersion: 3.2000
MaxThreadsPerBlock: 512
MaxShmemPerBlock: 16384
MaxThreadBlockSize: [512 512 64]
MaxGridSize: [65535 65535]
SIMDWidth: 32
TotalMemory: 4.2948e+09
FreeMemory: 4.0671e+09
MultiprocessorCount: 30
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Methods, Events, Superclasses

3 comments

James Tursa
James Tursa on 18 May 2011
I would presume that this is simply the difference in how the BLAS matrix multiply routines are coded on the GPU vs CPU (different blocking, etc). What kind of differences are you seeing?
Felix
Felix on 18 May 2011
There are large numerical differences: norm(res-resg,'fro')/norm(res,'fro') returns something on the order of 1e234. These are clearly not subtle BLAS differences. I suspect something is going wrong when moving data between the CPU and the GPU.
Gaszton
Gaszton on 19 May 2011
I ran the code on my GT 425M:
ans =
2.4946e-016


Accepted Answer

Felix
Felix on 20 May 2011


I upgraded to the latest driver, 270.41.19, which seems to have fixed the problem.

1 comment

James Tursa
James Tursa on 20 May 2011
FYI, it is bad form to accept your own answer when Edric was the one that suggested updating your drivers.


More Answers (1)

Edric Ellis
Edric Ellis on 19 May 2011
I've just run this using R2011a on Linux and Windows using C1060 cards, and in each case the final "norm" calculation gives a result of around 2e-16. So, this should work! Could you post the output of running
parallel.internal.gpu.CUDADriverVersion
and
ver distcomp

4 comments

Felix
Felix on 19 May 2011
I should add that I ran this code on three different devices (two C1060s and a GTX 285, all in the same computer) and I get the same discrepancy on all of them, so I suspect it is not a hardware problem.
Edric Ellis
Edric Ellis on 20 May 2011
Very strange, I've run on a whole series of different x64 Linux machines here and not seen the problem. That driver is slightly older than the ones we use here, perhaps you could try updating. Also, do you know if it's the matrix multiplication that is introducing the problem?
Felix
Felix on 20 May 2011
What is your driver version?
When I run this:
T=randn(10000,64);
A=randn(1000,64);
Ag=gpuArray(A);
Tg=gpuArray(T);
res=gather(Tg*Ag');
norm(res-T*A','fro')/norm(T*A','fro')
I get ~1e-16 on the first run and ~0.05 on repeated runs, so there is a problem in the matrix multiplication.
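A repeated-run loop (a sketch based on the snippet above; the run count of 5 is an arbitrary choice) makes the irreproducibility easy to see, since every run uses the same inputs and should report roughly the same ~1e-16 relative error:

```matlab
% Sketch: repeat the single GPU multiply on fixed inputs to check whether
% the result is reproducible against a CPU reference computed once.
T = randn(10000, 64);
A = randn(1000, 64);
ref = T * A';            % CPU reference
Tg = gpuArray(T);
Ag = gpuArray(A);
for k = 1:5
    res = gather(Tg * Ag');
    fprintf('run %d: relative error %g\n', k, ...
            norm(res - ref, 'fro') / norm(ref, 'fro'));
end
```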
Sean de Wolski
Sean de Wolski on 14 Mar 2012
Copying Felix's first post with the license number censored:
Here it is:
parallel.internal.gpu.CUDADriverVersion
ans =
260.19.26
ver distcomp
-------------------------------------------------------------------------------------
MATLAB Version 7.12.0.635 (R2011a)
MATLAB License Number: ############
Operating System: Linux 2.6.30.10-105.2.23.fc11.x86_64 #1 SMP Thu Feb 11 07:06:34 UTC 2010 x86_64
Java VM Version: Java 1.6.0_17-b04 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode
-------------------------------------------------------------------------------------
Parallel Computing Toolbox Version 5.1 (R2011a)

