Shared memory in parfor genetic algorithm

Alexander Kobyzhev on 11 May 2022
Hello, MATLAB community!
I use gamultiobj with parallelization because my objective function takes a long time to evaluate. To speed this up, I want to cache results that the objective function has already computed in some array or map. That is, the function can receive inputs it has already seen, and it makes no sense to recompute the result for them. All workers must have access to this cache, both to read and to write values.
The problem is that I do not know how to do this, because gamultiobj uses parfor internally, and, as I understand it, parfor does not allow data to be passed between workers. I wanted to use global variables, but they cannot be used in parfor. labProbe / labReceive / labSend would have suited me perfectly, but unfortunately they can only be used in spmd.
Thanks, Alexander.

Accepted Answer

Edric Ellis on 12 May 2022
If you're using R2022a or later, you could use ValueStore to allow workers to share values. The main requirement here is for you to come up with a way to convert the input arguments of your function into a "key" that can be used with the ValueStore. If that is straightforward, then you might be able to get things to work like this:
if isempty(gcp("nocreate")); parpool("local"); end
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 2).
parfor ii = 1:10
    val = cachingMagic(randi(3));
    out(ii) = sum(val, "all");
end
Analyzing and transferring files to the workers ...done.
Cache miss for magic_3 at 12-May-2022 12:43:00
Cache miss for magic_2 at 12-May-2022 12:43:00
Cache hit for magic_2 at 12-May-2022 12:43:01
Cache hit for magic_1 at 12-May-2022 12:43:01
Cache hit for magic_2 at 12-May-2022 12:43:01
Cache hit for magic_1 at 12-May-2022 12:43:01
Cache hit for magic_2 at 12-May-2022 12:43:00
Cache miss for magic_1 at 12-May-2022 12:43:00
Cache hit for magic_2 at 12-May-2022 12:43:01
Cache hit for magic_2 at 12-May-2022 12:43:01
% cachingMagic returns "magic(in)", with caching.
function out = cachingMagic(in)
arguments
    in (1,1) double {mustBeInteger}
end
% Because the input is a simple scalar, we can generate a string key very
% easily.
cacheKey = sprintf("magic_%d", in);
% getCurrentValueStore returns empty on the client, so we should guard
% against that
vs = getCurrentValueStore();
isCached = ~isempty(vs) && vs.isKey(cacheKey);
if isCached
    fprintf('Cache hit for %s at %s\n', cacheKey, string(datetime));
    out = vs(cacheKey);
else
    fprintf('Cache miss for %s at %s\n', cacheKey, string(datetime));
    % Not in cache, must compute
    out = magic(in);
    % Introduce an arbitrary delay to simulate slow computation
    % (the 1 s duration here is an assumed value)
    pause(1);
    % If we have a ValueStore, cache the result
    if ~isempty(vs)
        vs(cacheKey) = out;
    end
end
end
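For gamultiobj the objective function receives a vector of decision variables rather than a single integer, so the key has to encode the whole vector. The following is only a minimal sketch of one way to do that; makeCacheKey is a hypothetical helper name, and printing each element with 17 significant digits is an assumption chosen so that distinct double values produce distinct keys.

```matlab
function key = makeCacheKey(x)
% makeCacheKey  Hypothetical helper: build a string key from a numeric vector.
% sprintf reuses the format for every element of x, so [1 2.5] becomes
% "1_2.5_". 17 significant digits is an assumed choice that distinguishes
% any two different double values.
arguments
    x double {mustBeVector}
end
key = "x_" + sprintf("%.17g_", x);
end
```

Inside the objective function the key could then be used in the same way as cacheKey above. Note that two inputs only hit the same cache entry if they are identical in every element, so for continuous variables the hit rate depends on how often the algorithm re-evaluates exactly the same point.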
1 Comment
Alexander Kobyzhev on 12 May 2022
Thank you very much for the answer, it helped me implement the cache!
Sending and receiving data between workers is not as fast as one would like. The profiler showed that one thousand get and put calls take about 25-30 seconds in total, which is not very fast. Even so, I got a 25% performance increase.
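One possible way to reduce the number of ValueStore round trips is to keep a small local cache on each worker in front of the shared store, so that a value fetched or computed once on a worker is not requested again. This is a sketch under assumptions, not something measured in this thread: twoLevelCachedMagic is a hypothetical name, and the local cache relies on a persistent variable, which each worker process holds separately.

```matlab
function out = twoLevelCachedMagic(in)
% twoLevelCachedMagic  Hypothetical variant of cachingMagic with a
% per-worker local cache in front of the shared ValueStore.
persistent localCache
if isempty(localCache)
    localCache = containers.Map("KeyType", "char", "ValueType", "any");
end
key = sprintf('magic_%d', in);

% Level 1: local cache, no communication needed.
if isKey(localCache, key)
    out = localCache(key);
    return
end

% Level 2: shared ValueStore (empty on the client, so guard against that).
vs = getCurrentValueStore();
if ~isempty(vs) && isKey(vs, key)
    out = vs(key);
else
    out = magic(in);          % not cached anywhere, must compute
    if ~isempty(vs)
        vs(key) = out;        % publish for the other workers
    end
end
localCache(key) = out;        % remember locally for later calls
end
```

Whether this helps depends on how often the same worker sees the same key again; it does not make an individual get or put call any faster.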


More Answers (1)

Walter Roberson on 11 May 2022
You can use a parallel data queue (DataQueue) to send results back from the workers to the controller, and another set of queues to distribute results to the workers. It is a bit of a nuisance, and might not be efficient.
You could also do something like hash the arguments to get an index to use into a memory map. This might be a challenge to do efficiently.
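As a rough illustration of the first half of the queue idea (workers reporting results back to the client), here is a minimal sketch using parallel.pool.DataQueue with afterEach. The storeEntry helper, the struct layout, and the magic example are assumptions made for illustration; distributing the collected values back to the workers is not shown.

```matlab
% Client-side cache; containers.Map is a handle object, so the callback
% below can update it in place.
cache = containers.Map("KeyType", "char", "ValueType", "any");

% Queue created on the client; workers send entries to it.
q = parallel.pool.DataQueue;
afterEach(q, @(entry) storeEntry(cache, entry));

parfor ii = 1:10
    n = randi(3);
    result = magic(n);    % stand-in for the expensive computation
    send(q, struct("key", sprintf('magic_%d', n), "value", result));
end

function storeEntry(cache, entry)
% storeEntry  Hypothetical callback: record one worker result on the client.
cache(entry.key) = entry.value;
end
```

The callbacks run on the client as the data arrives, so the client only sees a value after a worker has already computed it; other workers may compute the same value in the meantime.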
1 Comment
Alexander Kobyzhev on 12 May 2022
Thanks for the answer!
It turns out that to exchange at least some values between the client and the workers, I need to create 1 DataQueue on the client side and 4 DataQueues on the workers (if I have 4 workers). I also need to send these 4 DataQueues from the workers to the client via the client's DataQueue. In addition, I can only process the received data after each generation of the genetic algorithm, which is not very good, because in several cases the same data can be computed and sent to the client more than once (a loss of performance).
Apparently I have to abandon this idea with caching...
