histogram of signals gaps width

I am looking for algorithm (effective + vectorized) how to find histogram of gaps (NaN) width in the following manner:
  1. signals are represented by (Nsamples x Nsig) array
  2. gaps in signal are encoded by NaN's
  3. width of gaps: is number of consecutive NaN's in the signal
  4. gaps width histogram: is frequency of gaps with specific widths in signals
And the following conditions are fulfilled:
[Nsamples,Nsig ]= size(signals)
isequal(size(signals),size(gapwidthhist)) % true
isequal(sum(gapwidthhist.*(1:Nsamples)',1),sum(isnan(signals),1)) % true
Of course, compressed form of gapwidthhist (represented by two cells: "gapwidthhist_compressed_widths" and "gapwidthhist_compressed_freqs") is required too.
Example:
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN]' % signal No. 2
gapwidthhist = [1 1 1 1 0 0 0 0 0 0 0 0 0 0; % gap histogram for signal No. 1
3 1 0 0 1 0 0 0 0 0 0 0 0 0]' % gap histogram for signal No. 2
where integer histogram bins (gap widths) are 1:Nsamples (Nsamples=14).
Coresponding compressed gap histogram looks like:
gapwidthhist_compressed_widths = cell(1,Nsig)
gapwidthhist_compressed_widths =
1×2 cell array
{[1 2 3 4]} {[1 2 5]}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
gapwidthhist_compressed_freqs = cell(1, Nsig)
gapwidthhist_compressed_freqs =
1×2 cell array
{[1 1 1 1]} {[3 1 1]}
Typical problem dimension:
Nsamples = 1e5 - 1e6
Nsig = 1e2 - 1e3
Thanks in advance for any help.

Réponses (2)

Image Analyst
Image Analyst le 25 Mai 2021

0 votes

If you have the Image Processing Toolbox and can use regionprops() to count the number and length of NaN regions, you can do this:
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN]' % signal No. 2
[numData, numSignals] = size(signals)
gapwidthhist = zeros(ceil(numData/2), numSignals);
for column = 1 : numSignals
thisSignal = signals(:, column); % Extract this column.
% Find lengths of all NAN runs
props = regionprops(isnan(thisSignal), 'Area');
allLengths = [props.Area];
hc = histcounts(allLengths)
% Load up gapwidthhist
for k2 = 1 : length(hc)
gapwidthhist(k2, column) = hc(k2);
end
end
% Should be
% gapwidthhist = [1 1 1 1 0 0 0 0 0 0 0 0 0 0; % gap histogram for signal No. 1
% 3 1 0 0 1 0 0 0 0 0 0 0 0 0]' % gap histogram for signal No. 2
% What it is:
gapwidthhist

4 commentaires

Michael's comment moved here:
Of course I prefer pure Matlab code (without any additional toolbox). But anyway, why
size(gapwidthhist) is not equal to size(signal)?
size(gapwidthhist)
ans =
7 2
and
size(signals)
ans =
14 2
why use:
gapwidthhist = zeros(ceil(numData/2), numSignals);
NaN region in each separate signal could have the length = numData in general.
OK, the line:
gapwidthhist = zeros(ceil(numData/2), numSignals);
should be modified:
gapwidthhist = zeros(numData, numSignals);
Then are results in required format.
Michael:
You're right. Try this:
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN;
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN]' % signal No. 2
[numData, numSignals] = size(signals)
gapwidthhist = zeros(numData, numSignals);
for column = 1 : numSignals
thisSignal = signals(:, column); % Extract this column.
% Find lengths of all NAN runs
props = regionprops(isnan(thisSignal), 'Area');
allLengths = [props.Area]
edges = [1:max(allLengths), inf]
hc = histcounts(allLengths, edges)
% Load up gapwidthhist
for k2 = 1 : length(hc)
gapwidthhist(k2, column) = hc(k2);
end
end
% What it is:
gapwidthhist'
Michal
Michal le 25 Mai 2021
Well done ... Thanks! Your code is pretty fast even for large dimension problem.
But still, I am looking for pure Matlab code without any toolbox functions, because final user have only basic Matlab.
There is no way how to extract source code of the core functionality, because function "regionprops" calls some
internal built-in functions.

Connectez-vous pour commenter.

Michal
Michal le 26 Mai 2021
Modifié(e) : Michal le 26 Mai 2021

0 votes

This is much more simple Matlab implementation but still not optimal (+ not vectorized):
signals = [1.1 NaN NaN NaN -1.4 NaN 8.3 NaN NaN NaN NaN 1.5 NaN NaN; % signal No. 1
NaN 2.2 NaN 4.9 NaN 8.2 NaN NaN NaN NaN NaN 2.4 NaN NaN; % signal No. 2
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN]'; % signal No. 3
signals
[numData, numSignals] = size(signals);
gapwidthhist = zeros(numData, numSignals);
gaps = zeros(numData+1,numSignals);
auxnan = isnan(signals);
for i = 1:numSignals
c = 0;
for j = 1:numData
if auxnan(j,i)
c = c + 1;
else
gaps(j,i) = c;
c = 0;
end
end
gaps(numData+1,i) = c;
gapwidthhist(:,i) = histcounts(gaps(:,i),1:numData+1);
end
gapwidthhist
Any idea how to optimize (vectorize) this code to be more effective?

Catégories

En savoir plus sur Data Distribution Plots dans Centre d'aide et File Exchange

Produits

Version

R2021a

Question posée :

le 25 Mai 2021

Modifié(e) :

le 26 Mai 2021

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by