Missing counts during histcount?

Joy Shen on 30 Aug 2023
Commented: Voss on 31 Aug 2023
Hi, I am randomly generating Nsim = 10000 values and then binning them with histcounts, but some values seem to be missing and I'm not sure where they went. I believe that when I sum my histcounts output I should get Nsim, since each simulated value is binned into VBinEdges, but right now the sum is 3742. I'm not sure where the rest disappeared to, or whether I'm misunderstanding what histcounts does.
VBinEdges=[0 1E-17 0.3:0.05:0.8 1.1].*Vroom(:,1); %ft^3 goes to 1.1*Vroom because "infinity" to capture everything. Then we cap it later
IndEFH2 = [1:length(ExtSurgeBinEdges_Md)]'; % EFH index
IndV1 = [1:length(VBinEdges_Md)]'; %Flow volume 1 index
IndA_ps= [1:length(DamState)]'; % PS flow area index
temp1=[]; temp2=[]; temp3=[]; [temp1,temp2, temp3]=ndgrid(IndA_ps, IndV1, IndEFH2);
IndMatCVol1=[temp1(:),temp2(:), temp3(:)];
% CVol1 Node Sim
for iCombCVol1=1:size(IndMatCVol1,1) % iCombCVol1=1:624
% EFH loc 2 sim
EFH2_a=[]; EFH2_a=ExtSurgeBinEdges_Lo(IndMatCVol1(iCombCVol1,3));
EFH2_c=[]; EFH2_c=ExtSurgeBinEdges_Hi(IndMatCVol1(iCombCVol1,3));
pdEFH2=[]; pdEFH2=makedist('Uniform',EFH2_a,EFH2_c);
Esim2=[]; Esim2=random(pdEFH2,[Nsim,1]); %simulate external flood heights within bin (Uniform within bin)
Esim2=Esim2.*(Esim2>=0); % make sure it's positive
% Flow volume 1 sim
V1_a=[]; V1_a=VBinEdges_Lo(IndMatCVol1(iCombCVol1,2));
V1_c=[]; V1_c=VBinEdges_Hi(IndMatCVol1(iCombCVol1,2));
pdV1=[]; pdV1=makedist('Uniform',V1_a,V1_c);
V1sim=[]; V1sim=random(pdV1,[Nsim,1]);
V1sim=V1sim.*(V1sim>=0); % make sure the difference is never below zero
V1sim=V1sim.*(V1sim<=Vroom(:,1)); % make sure the difference is always below Vroom
%Flow volume 2 sim
% PS state sim through area
Mu_ps=[]; Mu_ps=Amean_ps(IndMatCVol1(iCombCVol1,1));
Sig_ps=[]; Sig_ps=0.2*Mu_ps;
pd_Asim_ps=[]; pd_Asim_ps = makedist('normal',Mu_ps,Sig_ps);
Asim_ps = []; Asim_ps=random(pd_Asim_ps,[Nsim,1]);
Asim_ps(Asim_ps<0)=0; % make sure the area is postive
Asim_ps(Asim_ps>A_ps)=A_ps; %make sure area is never bigger than the full area
% Volumetric flow rate 2 sim calculation
Qsim_ps2=[]; Qsim_ps2=Cd.*Asim_ps.*sqrt(2*g*(Esim2-n_ps)); %ft^3/s
Qsim_ps2(Esim2<=n_ps)=1E-20; %No flow when Esim is less than or equal to installation height
% Flow volume sim calculation
Vsim_ps2=[]; Vsim_ps2=Qsim_ps2.*Dsim; %ft^3
Vsim_ps2=Vsim_ps2.*(Vsim_ps2<=Vroom(:,1)); %Ensure Vsim is less than Vroom (3 Vroom configs)
Vsim_ps2(Esim2<=n_ps)=1E-20; %No flow when Esim is less than or equal to installation height
% Cumulative flow 1 sim calculation
CVol1sim= V1sim+Vsim_ps2;
Hist_CVol1=[]; [Hist_CVol1,~] = histcounts(CVol1sim,VBinEdges);
PMF_CVol1(:,iCombCVol1)=Hist_CVol1./sum(Hist_CVol1); % Bins it by the parents
end
  2 comments
Image Analyst on 30 Aug 2023
I tried to reproduce and I just got "Unrecognized function or variable 'Vroom'." What is that function?
Joy Shen on 31 Aug 2023
Vroom = [6250 8125 5000];
This is why I use Vroom(:,1) because I'm checking each value of Vroom. Eventually I'll index and store values for each value but for now I'm sticking with just the first value to make it simple.


Answers (1)

Voss on 30 Aug 2023
I suspect that the data you are using histcounts on has elements outside the range of bin edges you have specified.
For example:
% generate 10000 samples from a standard normal distribution
d = makedist('normal',0,1);
x = random(d,[10000,1]);
size(x)
ans = 1×2
       10000           1
% specify some edges that don't include +/- infinity.
% the domain of normal distributions is (-Inf,Inf)
e = [0 1E-17 0.3:0.05:0.8 1.1];
% do the histogram binning
N = histcounts(x,e);
% count the number of samples within the specified bins:
N_inside = sum(N)
N_inside = 3584
% count the number of samples outside the specified bins,
% either below the first edge or above the last edge:
N_outside = nnz(x < e(1) | x > e(end))
N_outside = 6416
% the total number of samples:
N_inside + N_outside
ans = 10000
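If the goal is for the counts to add up to the number of samples, one option is to add open-ended bins at both ends. The following is only a sketch of that idea, not a drop-in fix for the original loop: it uses randn from base MATLAB instead of makedist/random, a fixed seed for repeatability, and the names N_open, n_below and n_above are made up for illustration. It relies on histcounts accepting -Inf and Inf as bin edges.

```matlab
% Sketch: add open-ended bins so out-of-range samples are still counted.
rng(0);                               % fixed seed, only for repeatability
x = randn(10000,1);                   % standard normal samples (base MATLAB)
e = [0 1E-17 0.3:0.05:0.8 1.1];       % same finite edges as above

N      = histcounts(x, e);            % samples outside [e(1), e(end)] are dropped
N_open = histcounts(x, [-Inf e Inf]); % first and last bins catch the overflow

n_below = N_open(1);                  % samples below e(1)
n_above = N_open(end);                % samples at or above e(end)
total   = sum(N_open);                % equals numel(x) when x contains no NaN
```

The two extra bins also show how many samples fall on each side of the finite edges, which helps when deciding whether the edges or the data need to change. NaN values are still ignored by histcounts, so they would remain uncounted.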
  3 comments
Voss on 31 Aug 2023
Those lines work to limit the values of each vector, but you do histcounts on the sum of two of those vectors:
CVol1sim = V1sim+Vsim_ps2;
[Hist_CVol1,~] = histcounts(CVol1sim,VBinEdges);
It looks to me like there's no guarantee that all elements of CVol1sim are within the range of VBinEdges. In fact, it looks like CVol1sim could be as high as 2*Vroom(:,1).
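To make that concrete, here is a minimal sketch under stated assumptions: the two simulated volumes are replaced by hypothetical uniform stand-ins on [0, Vroom] (the real ones depend on variables not shown in the thread), and Vroom is taken as 6250, the first value given above. It counts how many sums land above the last edge, then caps the sum itself before binning. Whether capping at Vroom is the right physical choice is for the model's author to decide.

```matlab
% Sketch: capping each term does not cap the sum.
rng(0);                                         % fixed seed, only for repeatability
Nsim  = 10000;
Vroom = 6250;                                   % first Vroom value from the thread
VBinEdges = [0 1E-17 0.3:0.05:0.8 1.1].*Vroom;  % edges as in the question

% Hypothetical stand-ins for the two simulated volumes, each within [0, Vroom]
V1sim    = Vroom.*rand(Nsim,1);
Vsim_ps2 = Vroom.*rand(Nsim,1);

CVol1sim = V1sim + Vsim_ps2;                    % can be as large as 2*Vroom
n_lost   = nnz(CVol1sim > VBinEdges(end));      % samples histcounts will drop
N_raw    = histcounts(CVol1sim, VBinEdges);     % sum(N_raw) is Nsim - n_lost

CVol1cap = min(CVol1sim, Vroom);                % cap the sum, not only the parts
N_cap    = histcounts(CVol1cap, VBinEdges);     % sum(N_cap) is Nsim
```

With the cap applied to the sum, every sample lies between the first and last edge, so the normalised counts form a proper PMF over the bins.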
Voss on 31 Aug 2023
By the way, this logic:
V1sim=V1sim.*(V1sim<=Vroom(:,1)); % make sure the difference is always below Vroom
sets the elements of V1sim that are greater than Vroom(:,1) to zero (or NaN if the element was Inf). Is that what you want to do?
Here's an example:
Vroom = 6250;
V1sim = [-Inf -1000 0 1 7000 Inf]
V1sim = 1×6
-Inf -1000 0 1 7000 Inf
V1sim = V1sim.*(V1sim<=Vroom(:,1))
V1sim = 1×6
-Inf -1000 0 1 0 NaN
I imagine you want to set those elements that are greater than Vroom(:,1) to Vroom(:,1), in which case, you can use logic like you have in other places:
V1sim = [-Inf -1000 0 1 7000 Inf]
V1sim = 1×6
-Inf -1000 0 1 7000 Inf
V1sim(V1sim > Vroom(:,1)) = Vroom(:,1)
V1sim = 1×6
-Inf -1000 0 1 6250 6250
(This doesn't address the original question or impact my suggestion that elements sent to histcounts are outside the range of the bin edges.)
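If the intent is to clamp at both ends, the same idea can be written with min and max. This is a sketch that assumes negative values should become 0 and values above Vroom should become Vroom; the name V1clamped is made up for illustration.

```matlab
% Sketch: clamp V1sim to the interval [0, Vroom] in one step.
Vroom = 6250;
V1sim = [-Inf -1000 0 1 7000 Inf];

% max(...,0) lifts negative values (including -Inf) to 0,
% then min(...,Vroom) pulls large values (including Inf) down to Vroom.
V1clamped = min(max(V1sim, 0), Vroom);   % gives 0 0 0 1 6250 6250
```

Unlike multiplying by a logical mask, this never computes Inf*0, so infinite inputs do not turn into NaN.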
