# Troubles assigning colorbar values in a bivariate histogram plot

18 views (last 30 days)
Mar on 3 Jan 2020
Edited: Adam Danz on 10 Jan 2020
Hello!
Im working with data of height and period of ocean waves, each pair have a power value associated, I have to get a bivariate histogram but instead of number of observations in colorbar i want that the scale represents the power associated.
I attached an example of the plots that i already have and the code to obtain it.
Xedges = [1:0.25:16];
Yedges = [0.00:0.025:1.6];
h = histogram2(Tp,Hs,Xedges,Yedges)
h.FaceColor = 'flat';
h.DisplayStyle = 'tile';
c=colorbar;
c.Label.String = 'Número de observaciones';
c.Label.FontSize = 14;
view(2)
I am new in matlab and maybe Im wrong with the function that i am using, i also tried with sccater3 but i have to group my data in bins
so this option doesnt work.

Max Murphy on 4 Jan 2020
You could try using imagesc instead; without knowing the format of data in Tp or Hs it is hard to say otherwise.
Mar on 7 Jan 2020
Sorry, i just attach the data, first row is Hs, second Tp and the last one is power, i will check imagesc to see if I can achieve something,
Thanks a lot

Adam Danz on 8 Jan 2020
Edited: Adam Danz on 8 Jan 2020
Binned approach
The problems with using the binned bivariate histogram are
1. The colors on the histrogram will be based on density rather than the power values.
2. There are often more than 1 power value within each bin.
A solution to problem 1 is to use imagesc() instead.
A solution to problem 1 is to average the power values within each bin. Note that you're losing information here since the powers are averaged and the actual (x,y) coordinates of the data are binned. The scatter plot version does not have these losses.
Hs = d(:,1);
Tp = d(:,2);
power = d(:,3);
Xedges = [1:0.25:16]; % defined by OP
Yedges = [0.00:0.025:1.6]; % defined by OP
% determine which bin each Tp and Hs value is in
TpBins = discretize(Tp,Xedges);
HsBins = discretize(Hs,Yedges);
% group the Power values by bin
xyGroups = [TpBins(:),HsBins(:)];
[xyGroupUnq, ~, unqGroupID] = unique(xyGroups,'rows');
% average power values within groups
powerGroupMean = splitapply(@mean, power, unqGroupID);
% Reorganize power into matrix
powerMat = zeros(numel(Yedges)-1, numel(Xedges)-1);
idx = sub2ind(size(powerMat),xyGroupUnq(:,2),xyGroupUnq(:,1));
powerMat(idx) = powerGroupMean;
% Plot it
imagesc(Xedges, Yedges, powerMat)
set(gca,'YDir','normal')
cmap = jet(100);
cmap(1,:) = 1; % to make background white
colormap(cmap)
colorbar()

Adam Danz on 7 Jan 2020
Edited: Adam Danz on 8 Jan 2020
Scatter plot approach
"i also tried with sccater3 but i have to group my data in bins so this option doesnt work."
What's wrong with grouping the data in bins? You've got more than 61,000 unique values of power within a range of 45,400 values but that doesn't mean you need tens of thousands of unique color values to get a meaningful image of your data. And if you want that level of precision, you could greatly increase the number of color levels but you won't see much of a difference.
This solution uses scatter() and allows you to set as many color levels as you want. The image below is based on 100 color levels of the jet colormap. You could change the number of levels to 1000 or 10,000 and you probably won't notice a difference.
See the inline comments for details.
% Read in the matrix data from text file and split the columns into variables.
Hs = d(:,1);
Tp = d(:,2);
power = d(:,3);
% set your colormap and the number of levels
cmap = jet(100); % you can use any colormap with any number of levels.
% scale the power values to the colormap
cIdx = round((power - min(power))/range(power)*(size(cmap,1)-1))+1;
% Plot the data
figure()
scatter(Tp, Hs, 15, cmap(cIdx,:),'filled')
xlabel('Tp')
ylabel('Hs')
set(gca,'Colormap',cmap)
cb = colorbar();
caxis([min(power),max(power)])
ylabel(cb, 'power')
One drawback to this method is that you lose the density of clustered data. You could overlay a contour map that shows the density by adding these lines.
hold on
[nCount,xEdges,yEdges] = histcounts2(Tp,Hs,50);
contour(xEdges(2:end)-(xEdges(2)-xEdges(1))/2, yEdges(2:end)-(yEdges(2)-yEdges(1))/2, nCount','-','Color',[.6 .6 .6])

Mar on 8 Jan 2020
Hello Adam! thanks for your contribution, maybe i didnt explain properly, I want my data in bins that´s why i was using histogram2, anyway i found helpful your propose to overlay the contour map to show the density of the data.
At the end what i have in mind is a bivariate histogram in which the color bar represents power values and then add in each square of the histogram the probability that represents, but it seems to me that with a scatter plot with density contours (like the one that you post) the objective of the graph is achieved, that is to show the probability to found some power values.
Thank you so much
Adam Danz on 8 Jan 2020
" I want my data in bins that´s why i was using histogram2"
"the objective of the graph is achieved"

David Goodmanson on 7 Jan 2020
Edited: David Goodmanson on 8 Jan 2020
Hi Mar,
I have been looking at this with some data I made up, but now that you posted it I was able to plug it in.
I thought your plot was pretty good. From your data, the power is
Pdens = 4*Hs^2*Tp or Hs = (1/2)*sqrt(Pdens/Tp) [ Pdens in kW/m ]
with variations on the order of 15%. Is going to color-by-power to try and bring out that fairly minor effect worth the information on number of occurrences that you lose?
Incidentally, the dashed lines on your plot do not seem to be consistent with the formula above. Right or wrong, the plots in the code below use the formula above.
A loglog plot of the data may be useful here, because the power density curves turn into straight lines and the data for small Hs and Tp is not all squished into the corner. The binning is done by logs of the variables and comes out differently. Now the highest-occupancy bins have higher values of Hs and Tc than before. I believe this is because in terms of logs there is a lot more space for small-value bins and the small data get spread around among more bins. On the loglog plot the power density lines aren't labeled but they go from 20 to 40 dBW/m (so 30 dB is 1kW/m) in 5dB steps.
The code below uses x for Tp and y for Hs, as on a plot.
yvals = Dzi(:,1);
xvals = Dzi(:,2);
Pwr = Dzi(:,3);
% initial histogram
figure(1)
h = histogram2(xvals,yvals,'DisplayStyle','tile')
hold on
xPwr = linspace(1,15,100);
yPwr = (1/2)*sqrt([.25 1 3 7 13]')./sqrt(xPwr);
plot(xPwr,yPwr,'k--')
xlim([1 16])
ylim([0 1.6])
grid on
colorbar
hold off
% duplicate histogram2 with patch
figure(2);
h2 = histogram2(xvals,yvals);
xbe = h2.XBinEdges;
ybe = h2.YBinEdges;
bco = h2.BinCounts'; % transpose is necessary
close(2)
bco = bco(:);
[x y] = meshgrid(xbe,ybe);
[xp yp] = mesh2patch(x,y);
ind = (bco == 0);
xp(:,ind) = [];
yp(:,ind) = [];
bco(ind) = [];
figure(2)
patch(xp,yp,bco,'EdgeColor','none')
hold on
xPwr = linspace(1,15,100);
yPwr = (1/2)*sqrt([.25 1 3 7 13]')./sqrt(xPwr);
plot(xPwr,yPwr,'k--')
xlim([1 16])
ylim([0 1.6])
grid on
colorbar
hold off
% loglog version
figure(4)
h4 = histogram2(log10(xvals),log10(yvals))
xbe = 10.^h4.XBinEdges;
ybe = 10.^h4.YBinEdges;
bco = h4.BinCounts'; % transpose is necessary
close(4)
[x y] = meshgrid(xbe,ybe);
[xp yp] = mesh2patch(x,y);
ind = (bco == 0);
xp(:,ind) = [];
yp(:,ind) = [];
bco(ind) = [];
figure(4)
ax = gca;
ax.XScale = 'log';
ax.YScale = 'log';
patch(xp,yp,bco,'EdgeColor','none')
hold on
xPwr = linspace(1,15,100);
dB = [20 25 30 35 40]; % power density, dBW/m
Pvals = 10.^(dB/10)/1000; % power density, KW/m
yPwr = (1/2)*sqrt(Pvals')./sqrt(xPwr);
plot(xPwr,yPwr,'k--')
xlim([1 16])
ylim([.01 1.6])
grid on
colorbar
hold off
function [xpatch ypatch] = mesh2patch(x,y)
% using x and y matrices provided by meshgrid, function creates
% corresponding patch matrices for every 'cell' in meshgrid.
% See notes below.
%
% [xpatch ypatch] = mesh2patch(x,y)
xlf = x(1:end-1,1:end-1); % col index for x
xlf = xlf(:);
xrt = x(1:end-1,2:end);
xrt = xrt(:);
xpatch = [xlf xrt xrt xlf]';
ydn = y(1:end-1,1:end-1); % row index for y
ydn = ydn(:);
yup = y(2:end,1:end-1);
yup = yup(:);
ypatch = [ydn ydn yup yup]';
end
% Meshgrid:
% For [x y] = meshgrid(xvec,yvec)
% coordinate x varies with the column index of x, and
% coordinate y varies with the row index of y.
% If xvec and yvec are vectors of increasing values as usual,
% then the lower left corner of a plot has the coordinates (x(1,1),y(1,1)).
% In a plot, since increasing y goes from down to up, the upper left
% corner of the y matrix is the lower left corner of the plot.
% Patches:
% If x and y have dimension mxn, then the patches, if they were
% the contents of a matrix, have dimension (m-1)x(n-1).
% The xpatch and ypatch matrices have dimension 4x((m-1)*(n-1)).
% The order of patches in xpatch and ypatch is such that if the patch
% colors are a function of x and y, e.g. ColoR = f(x1,y1), then
%
% patch(xpatch,ypatch,ColoR(:))
%
% gives the correct coloring. Note that x1 and y1
% have dimension (m-1)x(n-1), not mxn.