Reset GPU & Clear its Memory

Question

1 vote

I'm running simulations and computations in MATLAB using some reasonably big data sets, and the bulk of the work is done on the GPU. I can only get through about a third of the work I need to do before I receive an error saying the GPU memory is full:

Warning: An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_OUT_OF_MEMORY

I've had this problem for a while, and have tried to get around it by resetting the GPU between each simulation, using any and all of the following:

gpuDevice;
gpuDevice(1);
reset(gpuDevice(1));
wait(gpuDevice(1));

None of these work, either on their own or combined, and they don't work if I attempt them after my simulations have crashed out either. There seems to be no effective way to reset/flush the GPU other than rebooting my computer.

I'm getting work done this way, but it's slow, and annoying, and means I can't just leave my code running over the weekend as I'd like to - only half of it gets done. I'm sure there must be a way to reset the GPU in MATLAB, and if one of the methods I've tried is correct, what am I doing wrong?

Any ideas?

EDIT: Problem occurs on both R2016a and the R2017a Prerelease.

4 comments (2 older comments not shown)

Dan Johnson on 23 Jan 2017

Edited: Dan Johnson on 23 Jan 2017

Thanks for the comments. I'm running a GeForce GTX 960.

I'd love to provide you with an example, but short of copying out my entire codebase I'm not sure what I could post that would be helpful. Here's the code I execute for each data run (I've renamed the functions for clarity):

for m = 1:8
  inputVars = CreateVars();
  SimulateData(inputVars);
  for n = 1:50
    [outputVars] = RunReconstruction(inputVars);
    save([savePath(m,n)],'outputVars');
  end
  close all; clear;
end

NOTE: 1. RunReconstruction() gathers the "outputVars" before passing them back. 2. I typically get to m=4 before I get the CUDA error.

Joss Knight on 20 Jul 2017

I think you're going to have to create a minimal reproduction, a condensed version of your code; otherwise it's impossible to diagnose. Also see below for advice about monitoring your memory usage.

Answer 1

Joss Knight on 23 Jan 2017

1 vote

Presumably your simulations are adding results continually to some output variables, which are getting larger and larger. Try gathering your results back to the CPU so that you're not clogging up GPU memory with data that isn't being used for computation any more.
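As a rough illustration of that advice, the sketch below keeps only the working data on the GPU and copies each result to host memory with gather as soon as it is computed. The loop count and the computation are illustrative assumptions, not the asker's actual simulation.

```matlab
% Minimal sketch: store results on the host, not on the GPU.
% numRuns and the matrix product are placeholders (assumptions).
numRuns = 5;
results = cell(1, numRuns);          % host-side storage for the results

for k = 1:numRuns
    x = rand(1000, 'gpuArray');      % working data lives on the GPU
    y = x * x.';                     % computation runs on the GPU
    results{k} = gather(y);          % copy the result back to host memory
end                                  % x and y are overwritten on each pass
```

Because x and y are reassigned on every pass, the GPU holds only one iteration's data at a time, while the accumulated results live in ordinary host arrays.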

3 comments (1 older comment not shown)

Joss Knight on 20 Jul 2017

No, MATLAB releases variables as soon as they are no longer referenced. But it's common for users to run scripts rather than functions, and to aggregate results into a big output array that sits in their MATLAB workspace, e.g.

results(end+1,:) = myNewResults;

Why don't you run your simulation and monitor GPU memory in a separate terminal or command window using nvidia-smi, something like:

nvidia-smi -l 1 -q -d MEMORY

If memory usage is continually going up then you've got some sort of problem with your simulation not releasing variables.
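The same check can probably be done from inside MATLAB by reading the AvailableMemory property of the gpuDevice object after each iteration. This is only a sketch: the loop body is a stand-in for a real simulation step, and the numbers reported here may differ from what nvidia-smi shows.

```matlab
% Sketch: log free GPU memory after each iteration.
% The GPU work in the loop is a placeholder (assumption).
gpu = gpuDevice;                          % currently selected device, no reset
numRuns = 5;
freeMB = zeros(1, numRuns);

for k = 1:numRuns
    a = rand(2000, 'gpuArray');           % placeholder GPU work
    s = gather(sum(a(:)));                %#ok<NASGU> result moved to the host
    wait(gpu);                            % let queued GPU work finish
    freeMB(k) = gpu.AvailableMemory / 2^20;
    fprintf('run %d: %.0f MB free of %.0f MB\n', ...
        k, freeMB(k), gpu.TotalMemory / 2^20);
end
```

If freeMB keeps falling from one run to the next, some gpuArray variables are still being referenced somewhere in the workspace.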

Vitaly Bur on 29 Oct 2020

I have the same problem with clearing GPU memory. After executing this code, 2 GB of GPU memory is in use, even though only the D matrix is left in GPU memory:

A=fix(gpuArray(rand(1,1000))*99)+1;
B=fix(gpuArray(rand(1,1000))*99)+1;
C=gpuArray(rand(100000,100));
E=C(:,A);
F=C(:,B);
D=E.*F;
clear E F C A B

However, if I execute this code:

D=gpuArray(rand(100000,1000));

This also leaves a D matrix of the same size in GPU memory, but now only 1 GB of GPU memory is in use. Why is there a difference, and how can I clear the memory in the first case?

Answer 2

Remi D on 19 Jul 2017

0 votes

I also think there is a problem. As soon as I have called a CUDA MEX file, running reset(gpuDevice) throws an error:

Error using parallel.gpu.CUDADevice/reset
An unexpected error occurred during CUDA execution. The CUDA error was:
all CUDA-capable devices are busy or unavailable

If I don't try to call reset, I can call the MEX function again and it works fine. But as soon as I use reset, the only way to use the GPU again is to restart MATLAB.

I guess I have to go back to C and leave MATLAB in the drawer when I need parallel computing :(

1 comment

Joss Knight on 20 Jul 2017

Edited: Joss Knight on 20 Jul 2017

If you are using custom MEX functions then we'd have to know more about what they're doing to diagnose. Are you storing state, GPU memory, cufft plans? Are you spinning off threads that are using the GPU? You may need to register a listener to the GPUDeviceManager's DeviceDeselecting event (see the documentation here) in order to respond to a call to reset by tidying up your state or waiting for threads to finish.
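A listener of that kind might look roughly like the sketch below. It assumes the manager object is obtained with parallel.gpu.GPUDeviceManager.instance, and cleanupMyMexState is a hypothetical placeholder for whatever tidy-up your MEX code needs (freeing buffers, destroying cuFFT plans, joining threads).

```matlab
% Sketch: run cleanup code just before the current GPU device is
% deselected or reset. cleanupMyMexState is a hypothetical placeholder.
mgr = parallel.gpu.GPUDeviceManager.instance;
cleanupMyMexState = @() disp('Releasing MEX-owned GPU resources...');
lh = addlistener(mgr, 'DeviceDeselecting', @(src, evt) cleanupMyMexState());

% ... call custom MEX functions here ...

reset(gpuDevice);   % the listener should fire before the device is reset
delete(lh);         % remove the listener once it is no longer needed
```

Keeping the listener handle lh in a variable lets you remove the callback explicitly when the MEX state it guards no longer exists.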

Another very common scenario is that your custom MEX function is erroring, perhaps seriously, and you are not checking for or clearing that error. If the next thing you do on the GPU is call reset, then that will be the first place the error is detected and reported. So ensure your MEX function ends with something like

cudaDeviceSynchronize();
auto err = cudaGetLastError();
if (err != cudaSuccess) {
    mexPrintf("CUDA error: %s\n", cudaGetErrorString(err));
}
