Convolution? Kernel density? Resample? Help!

ndb on 1 Sep 2020
Commented: Jeff Miller on 3 Sep 2020
Hello:
I have a curve (let's assume it's a continuous function) that fluctuates in x,y space, with regions where the data increase monotonically and regions where they decrease. Each y-value will have a probability density value. I would like to transform the dataset so that the y-data become the "new" x-values and the "new" y-data are the probability densities assigned to them; the new dataset will sum the "new" PDF values along this "new" x-space, so that the result is unique (one value per new x). My issue is representing these data graphically and mathematically.
A visual representation helps. Take the following figure as an example:
In the top plot, we have the probability density value that I want each value in the lower left plot (x,y) to have. I want a new, transformed dataset in which the new x values will sum the PDF along the old y-values. For example, the y-value of 2.6x10^-9 will have the probability associated with x~1.51, 1.565, 1.665, etc. The figure on the right is just a multiplication of this PDF and the y-values, and isn't what I want, because it doesn't represent the sum of the probabilities and you end up with a non-unique representation of the data.
My idea (now that I'm typing this question out) is to maybe resample or interpolate the y-data and...then find the PDF value at the corresponding x-value. If the y-value is repeated across x, sum up the PDF values for each.
My question is: is a convolution doing something similar? Or, since I'm doing this with a normal PDF, is using ksdensity() the right way to go about this? I'm a bit unfamiliar with these techniques, but they seem to be similar to what I'd like to do. It could also be simpler than that....
Thanks ahead of time and let me know if you need clarification; I'll do my best to provide it.

Answers (1)

Jeff Miller on 2 Sep 2020
As I understand it, you could approximate your situation (to any desired degree of accuracy) using a table with 3 columns:
X Y PDF
The possible X,Y points correspond to your lower left graph, and the PDF values of those points are depicted (somehow) in the upper graph.
So, make this table for all X,Y pairs, using as many rows as you need to get the resolution accuracy that you want.
Now you could sum the PDF values of the rows with identical (within tolerance) Y values (sorting the table rows by Y might facilitate this). As I understand it, this sum would be the PDF(Y) that you are looking for?
Some normalization would surely be required to make sure the PDF approximation integrated to 1, but that would be straightforward.
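As a rough illustration of the table idea above, here is a minimal sketch in plain Python (standard library only). The function name `sum_pdf_by_y`, the tolerance, and the four table rows are made up for the example and are not from the original data. "Identical within tolerance" is approximated by rounding Y to the nearest multiple of `tol`, so two values that straddle a bin edge could still land in different groups. In MATLAB, functions such as `discretize` and `accumarray` could play a similar role.

```python
from collections import defaultdict

def sum_pdf_by_y(rows, tol):
    """Sum the PDF column over rows whose Y values fall in the same bin of width tol."""
    totals = defaultdict(float)
    for _x, y, p in rows:
        # Rows whose Y rounds to the same multiple of tol are treated as identical.
        key = round(y / tol)
        totals[key] += p
    # Return (representative Y, summed PDF) pairs, sorted by Y.
    return [(k * tol, totals[k]) for k in sorted(totals)]

# Hypothetical (X, Y, PDF) table: three X values map onto Y ~ 2.0, one onto Y ~ 3.0.
rows = [
    (1.0, 2.00, 0.10),
    (1.5, 3.00, 0.40),
    (2.0, 2.01, 0.30),
    (2.5, 1.99, 0.20),
]
summed = sum_pdf_by_y(rows, tol=0.1)
for y, p in summed:
    print(f"Y ~ {y:.2f}  summed PDF = {p:.2f}")
```

Here the three rows near Y = 2.0 collapse into a single entry whose value is the sum of their PDF values, while the row at Y = 3.0 stays on its own.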
2 Comments
ndb on 3 Sep 2020
Edited: ndb on 3 Sep 2020
I guess to simplify the problem:
I have a 3-dimensional vector. I want to sum the z-values along the y-axis, even though it is not 1-1 along the y-axis, where the z-values are equal to:
z = normpdf(x,1.587,0.018);
New visualization:
You wrote: "Now you could sum the PDF values of the rows with identical (within tolerance) Y values (sorting the table rows by Y might facilitate this). As I understand it, this sum would be the PDF(Y) that you are looking for?"
I want to have a new vector [y, z'] where z' is the sum of the z values across y. Does that make more sense? I feel like there's a standard set of methods to do this, but I'm not sure how. It doesn't seem straightforward, as the z values across y are not unique....
Jeff Miller on 3 Sep 2020
Your new description seems consistent with my original understanding (which may of course still be wrong), so I still think my previous answer applies.
As I would describe it, you have a normal random variable X. There is a second random variable Y = f(X) whose distribution you seek (what you are calling the [y, z'] vector). There are various standard methods for computing the distribution of Y (maybe Google "transformation of a random variable"), including cases where f transforms multiple X's onto the same Y. (The chi-square distribution with one degree of freedom, for example, is the distribution of X^2, where X is a standard normal, so positive and negative X's get transformed onto the same Y.) Your transformation is obviously very complicated, so I don't think any of those standard methods will work for you.
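For reference, the textbook change-of-variables result behind those standard methods can be written as below. The notation is an assumption for illustration: p_X and p_Y denote the densities of X and Y, and f is taken to be differentiable and piecewise monotonic, with x_k running over all solutions of f(x) = y. Note the factor 1/|f'(x_k)| on each term: the density of Y is not a plain sum of the X densities. The second line works through the X^2 example mentioned above, with φ the standard normal density.

```latex
p_Y(y) \;=\; \sum_{k \,:\, f(x_k) = y} \frac{p_X(x_k)}{\lvert f'(x_k) \rvert}

% Example: Y = X^2 with X standard normal, so x_k = \pm\sqrt{y} and |f'(x_k)| = 2\sqrt{y}
p_Y(y) \;=\; \frac{\varphi(\sqrt{y}) + \varphi(-\sqrt{y})}{2\sqrt{y}}
       \;=\; \frac{e^{-y/2}}{\sqrt{2\pi y}}, \qquad y > 0
```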
So, back to my original suggestion, which is essentially to use a discrete approximation, something like this:
(a) generate a list of X's from the known distribution and compute their pdfs. The denser the list, the better your approximation.
(b) for each of those X's, compute Y using whatever your function is.
(c) for each Y value in the list, search through the list of X's, find which ones produce approximately that Y value, and sum their pdfs. This is the pdf of that Y value.
(d) normalize so that the pdfs sum to one. (maybe it is better to normalize the pdfs at step a, or maybe it doesn't matter)
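Steps (a) to (d) could be sketched roughly as follows, again in plain Python with only the standard library. The transformation `f` here is a made-up oscillating function, since the real curve is not given in the thread; `density_of_y`, the grid size, the bin count, and the ±6σ range are likewise assumptions. The mean and standard deviation are the ones from the `normpdf` call earlier in the thread. The sketch assumes `f` is not constant on the grid, since the bin width would otherwise be zero.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def density_of_y(f, mu, sigma, n_x=20001, n_bins=50, half_width=6.0):
    """Discrete approximation to the density of Y = f(X), X ~ Normal(mu, sigma)."""
    # (a) Evenly spaced X grid over mu +/- half_width*sigma, with the pdf at each point.
    x_lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / (n_x - 1)
    xs = [x_lo + i * dx for i in range(n_x)]
    pdfs = [normal_pdf(x, mu, sigma) for x in xs]
    # (b) Compute Y for every X.
    ys = [f(x) for x in xs]
    # (c) Group Y values into equal-width bins and sum the pdfs landing in each bin.
    y_min, y_max = min(ys), max(ys)
    width = (y_max - y_min) / n_bins
    sums = [0.0] * n_bins
    for y, p in zip(ys, pdfs):
        k = min(int((y - y_min) / width), n_bins - 1)  # clamp y_max into the last bin
        sums[k] += p
    # (d) Normalize so the result integrates to one over Y.
    total = sum(sums)
    centers = [y_min + (k + 0.5) * width for k in range(n_bins)]
    density = [s / (total * width) for s in sums]
    return centers, density, width

# Hypothetical transformation: oscillates, so several X values map onto the same Y.
f = lambda x: math.sin(40.0 * x)
centers, density, width = density_of_y(f, mu=1.587, sigma=0.018)
```

Because the X grid is evenly spaced, stretches where `f` is nearly flat contribute more grid points to a given Y bin, so the summed pdfs weight those regions more heavily without any explicit derivative term. With an unevenly spaced X list, each pdf value would need to be multiplied by its local spacing before summing.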
