Discarded Messages with SPMD and labReceive ... why?
I am using SPMD and trying to get some workers communicating with each other. There is a flag they need to send and receive. Whichever worker gets its job done and committed first sends out the flag, which the remaining workers should receive and therefore not commit their work.
Here is some abstract code that hopefully gets across what I am trying to do. I would have thought the labBarrier at the bottom would ensure that all workers finishing in 2nd place or later had received the flag from the first worker to finish. Some do, but I also get many warning messages similar to the following:
Warning: An incoming message was discarded from lab 2 (tag: 2)
Some workers are indeed missing the message, even if they finish seconds after the flag was sent out.
How does labSend work? Am I missing something here?
% Emulating workers doing some variable time task
% See if other workers got there first and sent an update
[Updates(i),srcWkrIdx,tag] = labReceive(i,2);
Updates(i) = 0;
% Commit work
flag = 1
% Otherwise take a nap
flag = 0
labSend(flag,agentVec(agentVec ~= labindex),2);
Edric Ellis on 20 Jul 2022
Using conditional receives in this way is not a robust way to get the workers to collaborate - you have an ordering problem that cannot be solved. I think you can probably achieve your goal by using one of the "reduction" functions which are designed to collect together results from multiple workers. In particular, you could try gcat to allow each worker to find out what happened on every other worker. gcat (effectively) collects values from all workers and concatenates them together on each worker. In this way, you don't need the labBarrier call either. Something a bit like this:
myResult = doSomeWork();
allResults = gcat(myResult);
% Now, choose what to do based on the results from all workers.
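
To make the gcat suggestion a little more concrete, here is a minimal sketch of how the "first one done commits" decision might look. It rests on several assumptions that are not in the original post: the work is emulated with pause, elapsed time is used as the ranking criterion, and commitWork is a hypothetical placeholder for the real commit step. Note that with this pattern every worker finishes its task before anyone decides, which is a different shape from the send/receive version in the question.

```matlab
spmd
    % Hypothetical stand-in for the real task: each worker pauses for a
    % random time and records how long it took.
    t = tic;
    pause(rand);
    myElapsed = toc(t);

    % gcat is a collective call: every worker must reach this line.
    % It returns the same row vector on every worker, with one entry
    % per worker, ordered by labindex.
    allElapsed = gcat(myElapsed);

    % All workers hold an identical vector, so they all agree on the
    % winner. min returns the first index on ties, so ties are broken
    % the same way everywhere.
    [~, winnerIdx] = min(allElapsed);
    shouldCommit = (labindex == winnerIdx);

    if shouldCommit
        % commitWork();   % hypothetical placeholder, not a real function
    end
end
```

Because the decision is derived from data that every worker holds in full, there are no unmatched labSend calls left pending when the spmd block ends, which is what the "incoming message was discarded" warning is reporting.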