How to interpret tocBytes results

Jim Riggs on 21 Jul 2023
Commented: Jim Riggs on 22 Jul 2023
I am currently running R2019b, but when I attempt to start parpool using the 'threads' option, it tells me that 'threads' is not a valid option for parpool (the only option is 'local'). I noticed that 'threadPool' was introduced in R2020a, so I am guessing that I need a later MATLAB release to use the 'threads' option with parpool.
According to the decision chart located here, using a thread-based local pool is advantageous if you are running on a single machine and a large amount of data is being transferred to each worker.
So I have 2 questions:
  1. What is preventing me from being able to use the 'threads' option? (Do I need a MATLAB release later than R2019b?)
  2. How do I interpret the results from tocBytes to determine if 'threads' will be a benefit to me?
I am running on a system with dual Xeon Gold 6148 CPUs @ 2.4 GHz (20 cores each, 40 cores total) and 256 GB of memory.
Using 36 workers, tocBytes shows the following data transfer to the workers:
Min: 35 MB, Max: 506 MB, Mean: 153 MB, Median: 95.4 MB
Total (all 36 workers) is 5.5 GB
So, are these numbers considered "large", and would I expect to see a benefit from using thread-based processing?

Accepted Answer

Walter Roberson on 21 Jul 2023
You need R2020a or later to use parpool("threads").
However, if you are using R2021b or later, it is recommended that you use backgroundPool.
Unfortunately, ticBytes() and tocBytes() do not work in parpool("threads") or backgroundPool, so it is not possible to use those tools to compare the data transfer.
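Since ticBytes() and tocBytes() are limited to process-based pools, the measurement has to be taken there. The following is a minimal sketch of how that might look, assuming Parallel Computing Toolbox is installed and the current pool is process-based ('local' in R2019b). The variables data and result and the loop body are placeholders, not the code from the question.

```matlab
% Sketch only: measure data transfer on a process-based pool.
% ticBytes/tocBytes are not supported on thread pools or backgroundPool.
pool = gcp;                          % current pool, started if none exists

data = rand(2000);                   % placeholder input, broadcast to every worker
result = zeros(1, 100);

ticBytes(pool);                      % start counting bytes for this pool
parfor k = 1:100
    result(k) = sum(data(:)) * k;    % placeholder work that reads the broadcast variable
end
bytes = tocBytes(pool);              % one row per worker: [sent to worker, received from worker]

sent_mb = bytes(:, 1) / 1e6;         % bytes sent to each worker, in decimal MB
fprintf('min %.1f  max %.1f  mean %.1f  median %.1f  total %.1f MB\n', ...
    min(sent_mb), max(sent_mb), mean(sent_mb), median(sent_mb), sum(sent_mb));
```

The first column of the returned matrix is the per-worker figure that the min, max, mean, and median in the question summarize.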
My understanding is that, for ordinary numeric classes, shared pointers are used across the different threads, but if copy-on-write is needed, the newly allocated memory comes from a per-thread memory pool (so that it can be easily released when the parfeval() finishes). However, I have not yet been able to come up with a consistent internal description of how threads work that would lead to the same limitations that thread pools have. The architectures I have come up with mentally would either have fewer limitations than thread pools have in practice, or would block all handle objects. I haven't figured out yet what MathWorks is doing that would allow some handle objects to work with thread-shared memory but would still require the limitations that are seen in practice.
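For code that has to run on both R2019b and newer releases, the release requirement for thread pools could be handled with a version check. This is only a sketch, assuming Parallel Computing Toolbox is installed; it relies on MATLAB version 9.8 corresponding to R2020a, and it leaves the pool size at the default.

```matlab
% Sketch only: choose a pool type based on the running MATLAB release.
% MATLAB version 9.8 corresponds to R2020a.
pool = gcp('nocreate');                  % reuse an existing pool if there is one
if isempty(pool)
    if verLessThan('matlab', '9.8')
        pool = parpool('local');         % process-based pool (R2019b and earlier)
    else
        pool = parpool('threads');       % thread-based pool (R2020a and later)
    end
end
disp(class(pool))                        % shows which kind of pool is in use
```

On R2019b this takes the process-based branch, which is the only kind of pool that release offers.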
  6 Comments
Walter Roberson on 21 Jul 2023
Edited: Walter Roberson on 21 Jul 2023
Suppose that you were able to eliminate 100% of the 5.5 gigabytes of data transfer. That would reduce your computation time by:
format long g
bytes_to_transfer = 5.5 * 10^9;
max_bandwidth_bytes_per_second = 119.21 * 2^30
max_bandwidth_bytes_per_second =
128000762839.04
seconds_to_transfer = bytes_to_transfer / max_bandwidth_bytes_per_second
seconds_to_transfer =
0.0429684939215262
That is roughly 1/23 of a second, which is less than the measurement error of "44 minutes".
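To put that in proportion, the same estimate can be expressed as a fraction of the run. This sketch reuses the figures above and takes the quoted "44 minutes" as the total run time; the variable names run_time_seconds and fraction_of_run are introduced here for illustration.

```matlab
format long g
bytes_to_transfer = 5.5 * 10^9;                      % 5.5 gigabytes, as above
max_bandwidth_bytes_per_second = 119.21 * 2^30;      % same bandwidth figure as above
seconds_to_transfer = bytes_to_transfer / max_bandwidth_bytes_per_second;

run_time_seconds = 44 * 60;                          % the quoted 44-minute run
fraction_of_run = seconds_to_transfer / run_time_seconds
```

The result is on the order of 1e-5, so even eliminating the transfer entirely would not be visible in the total run time.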
Jim Riggs on 22 Jul 2023
Thank you for the analysis. This is very helpful.
