
Windows named pipe for data input and output in Matlab extremely slow compared to other languages

Passing data between two named pipes with a Python module runs roughly 40 times faster than an equivalent Matlab implementation.
Source of data: a binary program generates a data stream of records, each consisting of 4 float values of 8 bytes each. Every 32 bytes therefore represent 4 measurements taken at the same time, followed immediately by the next 4 measurements.
Destination of data: the same binary, or a separate one, should receive the data blocks after some processing. The output stream must therefore have the same structure, so that individual data sources and destinations can be chained together.
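To make the record layout concrete, here is a minimal Python sketch of encoding and decoding such a stream. The byte order is an assumption (little-endian IEEE 754 doubles); the post only states that each record holds 4 floats of 8 bytes. The names `RECORD`, `pack_record` and `unpack_records` are hypothetical.

```python
import struct

# Assumption: little-endian IEEE 754 doubles; "<4d" = 4 values x 8 bytes = 32 bytes.
RECORD = struct.Struct("<4d")


def pack_record(values):
    """Encode one set of 4 measurements as a single 32-byte record."""
    return RECORD.pack(*values)


def unpack_records(blob):
    """Split a byte string of concatenated records into 4-tuples of floats."""
    if len(blob) % RECORD.size != 0:
        raise ValueError("stream length is not a multiple of the record size")
    return list(RECORD.iter_unpack(blob))


# Two records back to back, as they would appear in the stream.
stream = pack_record((1.0, 2.5, -3.0, 4.25)) + pack_record((5.0, 6.0, 7.0, 8.0))
records = unpack_records(stream)
```

Because input and output share this layout, any stage that consumes and emits whole 32-byte records can be chained with any other.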
A Python function that just passes the data from one named pipe to another serves as the performance baseline.
Doing the same with a Matlab module, even without any data processing in between (just passing input pipe data to the output pipe), takes roughly 40 times longer than the Python function. Python simply reads 32 bytes from the input pipe ( inpipe.read() ) and writes them to the output pipe ( outpipe.write() ).
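For reference, the Python pass-through described here might look roughly like the following minimal sketch. The original Python code is not shown in the post, so the function name `relay`, the use of buffered binary file objects, and the handling of a truncated trailing record are all assumptions. In-memory streams stand in for the two pipes.

```python
import io

RECORD_SIZE = 32  # 4 floats x 8 bytes, as described in the post


def relay(inpipe, outpipe, record_size=RECORD_SIZE):
    """Copy fixed-size records from inpipe to outpipe; return the record count.

    Assumes buffered binary streams, where read(n) returns fewer than n bytes
    only at end of stream.
    """
    count = 0
    while True:
        block = inpipe.read(record_size)
        if len(block) < record_size:
            break  # end of stream; a truncated trailing record is dropped
        # data processing would happen here
        outpipe.write(block)
        count += 1
    outpipe.flush()
    return count


# Stand-ins for the two named pipes: three full records plus 2 stray bytes.
source = io.BytesIO(bytes(range(32)) * 3 + b"xx")
sink = io.BytesIO()
copied = relay(source, sink)
```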
The same is done in Matlab, but at a data rate about 40 times lower.
Here is the example Matlab code:
bufferSize = 32;
timeOut = 100; % connection timeout in milliseconds
% Pipe name and server definition
pipeNameIn = "datain_fifo";
pipeNameOut = "dataout_fifo";
serverName = "localhost";
% Load the .NET assembly that provides System.IO.Pipes
NET.addAssembly('System.Core');
pipeStreamIn = System.IO.Pipes.NamedPipeClientStream(serverName,...
    pipeNameIn,...
    System.IO.Pipes.PipeDirection.In);
pipeStreamOut = System.IO.Pipes.NamedPipeClientStream(serverName,...
    pipeNameOut,...
    System.IO.Pipes.PipeDirection.Out);
pipeStreamIn.Connect(timeOut);
pipeStreamOut.Connect(timeOut);
if ~pipeStreamIn.IsConnected
    error('Pipe %s is not connected...', pipeNameIn);
end
if ~pipeStreamOut.IsConnected
    error('Pipe %s is not connected...', pipeNameOut);
end
read_buffer = NET.createArray('System.Byte', bufferSize);
write_buffer = NET.createArray('System.Byte', bufferSize); % not used below
while pipeStreamIn.IsConnected
    % Read returns the number of bytes read; the data itself is placed in
    % read_buffer (32 bytes holding 4 * 8 byte floats)
    read_data = pipeStreamIn.Read(read_buffer, int32(0), int32(bufferSize));
    % convert the .NET byte array to a Matlab uint8 array
    inBuf = read_buffer.uint8;
    % data processing should happen here
    outBuf = inBuf;
    pipeStreamOut.Write(outBuf, int32(0), int32(bufferSize));
end
Any idea to increase the data throughput to a rate comparable to Python would be appreciated. Since no data processing is currently involved in this data passing, I assume that the lower data rate results from an incorrectly configured pipe or usage mode.
Processing e.g. 1M data rows takes ~10 s with Python, but ~400 s with Matlab.
The Matlab code above is more or less derived from an example found on "MATLAB Answers" (Link)
Even though the binary producing the data can provide both the output and the input pipe, this is not mandatory. Therefore, separate input and output named pipes are used, to be more general. A single in/out pipe could not be used, since the interface of the binary model cannot be modified. Since Python's performance with two pipes is that much higher, I would not expect this to be the issue.
Asynchronous mode on both pipes did not improve anything. So I assume that there must be a way to optimize the data throughput, but I do not see the current caveats.
Any comments and ideas would be really appreciated.
  2 comments
MikeMoc on 18 Dec 2023
In the meantime I tried the same on a Linux machine. Named pipes on Linux can be used natively, without any additional functionality, also from within Matlab. It is more or less file IO on a special file object.
And most importantly, the factor-40 penalty compared to Python is gone.
So the current assumption is that the interface to .NET is causing the performance penalty, since it is not required on Linux and the rest of the code is nearly identical.
So any ideas about modifying the usage of named pipes on Windows together with Matlab will be really appreciated.
Eric on 1 Apr 2024
Some ideas (I just started messing with named pipes on Windows myself, so I apologize for dumb suggestions):
  1. For me, MATLAB likes to default to Message mode instead of Byte mode for the In pipe's ReadMode (no idea why). I'm not sure if that's affecting your performance somehow. (I fixed it by just recreating the pipe. Bizarre.)
  2. I discovered that my server writes were initially hanging because the default buffer size is zero. Apparently, a buffer size of 0 is supposed to mean the buffer is allocated "as needed", but this was clearly not happening for me. So my server writes and client reads had to occur simultaneously, and I imagine no optimization was being done by the .NET interface. I'm not sure what your server settings are, but this could be contributing to your slowdown. The fix would be to set the buffer size in the server constructor.
  3. I would try increasing bufferSize. In my experience, 32 is a bit small for reading/writing data chunks. I recommend trying 4096 (i.e. a page size on Linux) and see if you do any better. Your program already captures the number of bytes read in read_data, and so could easily be adjusted to account for less than full buffer reads.
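The third suggestion (larger reads that may return fewer bytes than requested) can be sketched as follows. This is a Python illustration of the bookkeeping only, not a measured fix for the Matlab case: the name `relay_chunked` and the `read`/`write` callables are hypothetical, and 4096 is simply the value suggested above. Batching can only help if the producer writes ahead without waiting for each record to come back.

```python
RECORD_SIZE = 32   # one record: 4 floats x 8 bytes
CHUNK_SIZE = 4096  # read size suggested above


def relay_chunked(read, write, chunk_size=CHUNK_SIZE, record_size=RECORD_SIZE):
    """Forward whole records using large reads; return the number of records.

    read(n) must return up to n bytes and b"" at end of stream;
    write(data) must consume a bytes object.
    """
    pending = b""  # bytes received so far that do not yet form a full record
    total = 0
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break  # end of stream; an incomplete trailing record is dropped
        pending += chunk
        usable = len(pending) - len(pending) % record_size
        if usable:
            # data processing on pending[:usable] would happen here
            write(pending[:usable])
            total += usable // record_size
            pending = pending[usable:]
    return total


# Simulate short reads: 96 bytes (3 records) arriving as 10 + 50 + 36 bytes.
data = bytes(range(96))
pieces = iter([data[:10], data[10:60], data[60:], b""])
received = []
forwarded = relay_chunked(lambda size: next(pieces), received.append)
```

Each call now moves many records at once, so any fixed per-call overhead is paid once per chunk rather than once per 32-byte record.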


Answers (1)

Shubham on 22 Jan 2024
Hey MikeMoc,
It appears that some adjustments to the handling of named pipes in Windows using MATLAB are required. The discrepancy in performance between the Python module and MATLAB's approach may stem from MATLAB's reliance on the .NET framework in Windows, an issue not observed on the Linux system.
Although there is no library support for using named pipes in MATLAB, I can suggest some workarounds.
  1. MEX files for using named pipes: A “MEX” file is a MATLAB executable file. “MEX” files provide an interface for using functions written in C/C++. A named pipe can be created in C/C++ and can be used for inter-process communication. Please have a look at the following answer for creating a named pipe in C: https://stackoverflow.com/questions/2784500/how-to-send-a-simple-string-between-two-programs-using-pipes After creating a C function, use the “mex” command for building the file. Please refer to the following MATLAB documentation for building a “MEX” function: https://www.mathworks.com/help/matlab/ref/mex.html For more information regarding external language interfaces that could be used in MATLAB, please check the following: https://www.mathworks.com/help/matlab/external-language-interfaces.html?s_tid=srchbrcm
  2. Using other IPC methods: Please have a look at the following IPC benchmarks: https://github.com/goldsborough/ipc-bench#: The “Shared Memory” and “Memory-Mapped Files” methods benchmark much better than named pipes.
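As a rough illustration of the memory-mapped file idea, the sketch below writes records to a file and reads them back through a memory map. It is written in Python with hypothetical names (`write_records`, `read_records`), assumes little-endian 8-byte floats, and leaves out any synchronization between producer and consumer, which a real setup would need.

```python
import mmap
import os
import struct
import tempfile

# Assumption: records are 4 little-endian doubles (32 bytes).
RECORD = struct.Struct("<4d")


def write_records(path, records):
    """Producer side: write each 4-tuple as one 32-byte record."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))


def read_records(path):
    """Consumer side: map the file and decode every record in place.

    The file must be non-empty and a whole number of records long.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return [RECORD.unpack_from(mm, offset)
                    for offset in range(0, len(mm), RECORD.size)]


# Round trip through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "records.bin")
    write_records(demo_path, [(1.0, 2.0, 3.0, 4.0)])
    demo_rows = read_records(demo_path)
```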
Hope this helps.
  1 comment
MikeMoc on 23 Jan 2024
Hi Shubham,
Thanks a lot for your detailed response and your suggestions. Since we have to use a compiled model (no access to the source code) that provides a pipe-based interface (named pipes) or file IO, we are not totally free in selecting the IPC method. But I will take a closer look at how we can make use of your suggestions.
As you will also have noticed, yours is so far the only reply to this topic. Interestingly, model interaction (models from different sources) does not seem to be a frequently occurring issue for many other Matlab users out there.
Thanks a lot again, Mike



Version

R2022b
