Interpolation schemes that produce positive second derivatives of the interpolant
32 views (last 30 days)
Given a set of x-values and y-values, the interpolant to this data should have non-negative second derivatives, which are allowed to be discontinuous.
The y-values are strictly increasing, and the interpolant is allowed to be a C1 function.
Take this data set for example:
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0111, 0.0293, 0.0521, 0.0787, 0.1086, 0.1416, 0.1774, 0.2155, 0.2556, 1.0248];
plot(x,y)
The first 10 points are equally distributed, only the last segment is bigger. Another characteristic of my y-values is that they are close to linear in the beginning, and nonlinear after that.
As for the cubic interpolation family, I tried makima, pchip, as well as csape, but they all result in negative second derivatives at some points.
Is there an interpolation scheme for my purpose? If not, are there easy adjustments to the linear system for the spline coefficients to enforce positive second derivatives?
Thank you!
UPDATE:
As stated in the comments below, I implemented an optimization problem to find the unknown derivatives at the break points using a C1 cubic Hermite interpolant. On the interval [x_i, x_(i+1)], we have
p_i(t) = H_1(t)*y_i + H_2(t)*y_(i+1) + h_i*H_3(t)*y'_i + h_i*H_4(t)*y'_(i+1),
where y_i and y_(i+1) are the interpolation values and y'_i and y'_(i+1) the unknown derivatives at the break points.
The basis functions are given by
H_1(t) = 1 - 3t^2 + 2t^3,  H_2(t) = 3t^2 - 2t^3,  H_3(t) = t - 2t^2 + t^3,  H_4(t) = -t^2 + t^3,
and t = (x - x_i)/h_i with h_i = x_(i+1) - x_i is the mapping from the "real" interval [x_i, x_(i+1)] to the unit interval [0, 1]. All these equations can be taken from the Wikipedia article "Cubic Hermite spline".
In the optimization problem, I want to find the unknown derivatives at the break points under the constraints that the second derivative of the interpolant is positive everywhere and that the second derivative is as smooth as possible. That said, the objective function measures the change in the second derivative at the break points. Also, I decided to implement "second derivative > 0" as a penalty term in the objective function, although these are linear constraints.
I solved the problem using fmincon, but as stated in the comments, linprog or quadprog could work too. If this is indeed a linear optimization problem, I expect that the derivative of the interpolant w.r.t the interpolation values is the unit spline. My overall goal is to get these sensitivities, and the question is how to get them, if my problem were non-linear.
Here is my code:
% interpolation points (break points)
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
% interpolation values
y = [0, 0.0198463, 0.0397084, 0.0596019, 0.0795445, 0.0995199, 0.119908, 0.140298, 0.160773, 0.181772, 6.64478];
% create points at which the change of the 2nd derivative is measured
% for every breakpoint x(i) in the interior, i.e., x(2:end-1),
% two points are created x(i)-e and x(i)+e
xObj = zeros(2*(numel(x)-2), 1);
% offset "e" in x(i)+e
offset = min(diff(x))/100;
ctr = 1;
for i=2:numel(x)-1
xObj(ctr) = x(i) - offset;
xObj(ctr+1) = x(i) + offset;
ctr = ctr + 2;
end
% objective function handle
objectiveFunc = @(yprime)Objective_function(yprime, x, y, xObj);
% start vector
yprime0 = ones(size(x));
% call optimizer
yprime = fmincon(objectiveFunc, ...
yprime0, [], []);
function fval = Objective_function(yprime, pointsx, pointsy, pointsxObj)
% minimize the jump of the second derivative at the
% break points (to obtain best smoothness of 2nd derivative)
fval = 0.0;
ctr = 1;
for i = 1:numel(pointsxObj)/2
% calculate second derivative at point (x(i) - e)
idxL = find(pointsxObj(ctr) >= pointsx, 1, "last");
hiL = pointsx(idxL + 1) - pointsx(idxL);
tL = (pointsxObj(ctr) - pointsx(idxL)) / hiL;
secDerivL = Evaluate_cubicSegment(tL, 2, hiL, ...
pointsy(idxL), pointsy(idxL + 1), ...
yprime(idxL), yprime(idxL + 1));
% calculate second derivative at point (x(i) + e)
idxR = find(pointsxObj(ctr+1) >= pointsx, 1, "last");
hiR = pointsx(idxR + 1) - pointsx(idxR);
tR = (pointsxObj(ctr+1) - pointsx(idxR)) / hiR;
secDerivR = Evaluate_cubicSegment(tR, 2, hiR, ...
pointsy(idxR), pointsy(idxR + 1), ...
yprime(idxR), yprime(idxR + 1));
% add squared difference to objective function
fval = fval + (secDerivL - secDerivR)^2;
ctr = ctr + 2;
end
% ----------------- 2nd derivative > 0 (linear constraints) -----------------
% impose the constraint at discrete points in the interpolation domain
% choose 100 points in each cubic segment
nConstraints = 100*(numel(pointsx)-1);
xCtr = linspace(min(pointsx), max(pointsx)-1e-3, nConstraints);
for i=1:nConstraints
% find polynomial segment in which xCtr(i) is located
idx = find(xCtr(i) >= pointsx, 1, "last");
hi = pointsx(idx+1) - pointsx(idx);
t = (xCtr(i) - pointsx(idx)) / hi;
% calculate second derivatives
% - of basis functions H
H = Evaluate_basisFunctions(t, 2);
% - of the interpolant
secondDeriv = pointsy(idx)*H(1) + pointsy(idx+1)*H(2) + hi*yprime(idx)*H(3) + hi*yprime(idx+1)*H(4);
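% note: the (1/hi)^2 chain-rule factor used in Evaluate_cubicSegment is omitted
% here; it is strictly positive, so it does not change the sign of secondDeriv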
% add contribution to objective function if derivative is negative
tmp = max(0.0, -secondDeriv);
fval = fval + 10*tmp^2;
end
end
function H = Evaluate_basisFunctions(t, order)
if order==0
% value
H1 = 1 - 3*t^2 + 2*t^3;
H2 = 3*t^2 - 2*t^3;
H3 = t - 2*t^2 + t^3;
H4 = -t^2 + t^3;
elseif order==1
% first derivative
H1 = -6*t + 6*t^2;
H2 = 6*t - 6*t^2;
H3 = 1 - 4*t + 3*t^2;
H4 = -2*t + 3*t^2;
elseif order==2
% second derivative
H1 = -6 + 12.*t;
H2 = 6 - 12.*t;
H3 = -4 + 6.*t;
H4 = -2 + 6.*t;
elseif order==3
% third derivative
H1 = 12;
H2 = -12;
H3 = 6;
H4 = 6;
else
assert(false, "Evaluations up to 3rd derivative possible");
end
H = [H1; H2; H3; H4];
end
function pi = Evaluate_cubicSegment(t, order, hi, yi, yi1, yiprime, yi1prime)
H = Evaluate_basisFunctions(t, order);
chainRuleFactor = (1/hi)^order;
pi = chainRuleFactor*( H(1).*yi + H(2).*yi1 + H(3).*hi*yiprime + H(4).*hi*yi1prime );
end
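For completeness, here is a rough, untested sketch (not part of the code above; nCon, xCon, A, b are names I made up) of how the same positivity conditions could instead be handed to fmincon as linear inequality constraints A*yprime <= b, reusing x, y and Evaluate_basisFunctions:
nCon = 100*(numel(x)-1);
xCon = linspace(min(x), max(x)-1e-3, nCon);
A = zeros(nCon, numel(x));
b = zeros(nCon, 1);
for k = 1:nCon
    % segment containing xCon(k) and local coordinate t
    idx = find(xCon(k) >= x, 1, "last");
    hi  = x(idx+1) - x(idx);
    t   = (xCon(k) - x(idx)) / hi;
    H   = Evaluate_basisFunctions(t, 2);
    % (unscaled) second derivative >= 0 rewritten as A*yprime <= b
    A(k, idx)   = -hi*H(3);
    A(k, idx+1) = -hi*H(4);
    b(k)        = y(idx)*H(1) + y(idx+1)*H(2);
end
% yprime = fmincon(objectiveFunc, yprime0, A, b);  % the penalty term is then unnecessary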
Accepted Answer
Bruno Luong
23 Jan 2024
Edited: Bruno Luong
23 Jan 2024
You can use an approximation spline; just put more (denser) knots than data points so that it actually interpolates.
For example, using my FEX BSFK to get a cubic spline interpolant that is C^2 and a convex function:
Note that the "knee" location of the interpolant seems to be quite unstable.
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0198463, 0.0397084, 0.0596019, 0.0795445, 0.0995199, 0.119908, 0.140298, 0.160773, 0.181772, 1.02478];
% FEX https://www.mathworks.com/matlabcentral/fileexchange/25872-free-knot-spline-approximation?s_tid=srchtitle_BSFK_1
opt = struct('KnotRemoval','none','sigma',1e-10,'shape',struct('p',2,'lo',0,'up',inf));
pp = BSFK(x,y,4,20,[],opt);
% prepare graphical data
xi = linspace(min(x),max(x),1025);
yi = ppval(pp,xi);
xb = pp.breaks(2);
yb = ppval(pp,xb);
% Check if approximation is close to interpolation
norm(y-ppval(pp,x),'inf')/norm(y,'inf') % 2.6938e-05
% Evaluate second derivative
fdd = ppder(ppder(pp));
yddi = ppval(fdd,xi);
all(yddi >= 0) % true
figure(1)
clf
plot(x, y, 'or')
hold on
plot(xi, yi, 'b')
xlabel('x')
ylabel('y')
yyaxis right
plot(xi, yddi) % second derivative is positive
ylabel('y''''')
legend('data', 'spline interpolation', 'second derivative')
Note: you can regularize the interpolant by setting the parameters d and lambda; however, the function will then not interpolate the data as strictly as above.
opt = struct('KnotRemoval','none','sigma',1e-10,'d',2,'lambda', 1e-6,'shape',struct('p',2,'lo',0,'up',inf));
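With this option, the call itself stays the same as above (assuming the identical signature):
pp = BSFK(x,y,4,20,[],opt);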
6 comments
Bruno Luong
23 Jan 2024
I can't tell; I have never used John's code.
Mine focuses on free-knot splines, which is a flexible method for many situations.
Bruno Luong
23 Jan 2024
Edited: Bruno Luong
23 Jan 2024
My FEX does not necessarily interpolate, but approximates. So the solution of the approximation always exists (with the constraint f''>=0, and C^2 because it is a cubic spline). However, when I add enough knots, the approximation becomes an interpolation, since the best approximation is overdetermined and yields 0 residual beyond a certain threshold of knot density, assuming the data allows such a constraint.
Therefore the interpolation solution can be guaranteed to exist only when I raise the number of knots high enough.
The solution is not necessarily unique, but if you additionally require the specific solution to minimize some (semi-)norm, such as the L^2 norm of the (second) derivative, then this forces uniqueness of the solution. My second code option opt does roughly that.
SA-W
24 Jan 2024
I see.
So if you use the approach with lsqlin and f''>=0, a solution always exists and is unique, as you said above. If I go with your FEX, I can achieve similar results by playing with the regularization. Right?
Bruno Luong
24 Jan 2024
Yes.
There is a subtlety you should be aware of. My code makes a trade-off between fitness and regularity of the solution via the lambda value. Ideally, if the solution of the non-regularized problem is not unique, you should rather formulate the problem with the interpolation values as hard constraints and minimize the second-derivative energy.
More Answers (4)
Torsten
18 Jan 2024
Moved: Torsten
18 Jan 2024
If you do function approximation (e.g. by splines), there are methods to enforce f''>=0. But for function interpolation, it doesn't make sense in my opinion.
46 comments
SA-W
18 Jan 2024
What do you mean by approximation? Least-squares fitting?
But for function interpolation, it doesn't make sense in my opinion.
Why? If I know a priori that my discrete set of y-values has a "convex shape", the interpolant should have this property too.
John D'Errico
18 Jan 2024
NO. Even if you know the data points lie along a convex curve, that does not force an interpolant to be everywhere convex on that support.
Torsten
18 Jan 2024
Edited: Torsten
18 Jan 2024
What do you mean by approximation? Least-squares fitting?
Yes. If there are some "outliers" in your data that disturb convexity, it will usually be very difficult to find an adequate interpolant with the convexity property. Here, an approximant is better suited.
Why? If I know a priori that my discrete set of y-values has a "convex shape", the interpolant should have this property too.
Cases where the usual interpolant is non-convex if the data have a convex shape will be very exotic in my opinion. In your case, the usual spline interpolation looks convex.
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0111, 0.0293, 0.0521, 0.0787, 0.1086, 0.1416, 0.1774, 0.2155, 0.2556, 1.0248];
hold on
plot(x,y)
xx = 3:0.1:16;
yy = interp1(x,y,xx,'spline');
plot(xx,yy)
hold off
grid on
SA-W
18 Jan 2024
Using a slightly different convex-shaped data set
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0198463, 0.0397084, 0.0596019, 0.0795445, 0.0995199, 0.119908, 0.140298, 0.160773, 0.181772, 1.02478];
plot(x,y,'-o')
This gives a negative second derivative in a small support region.
SA-W
23 Jan 2024
Edited: SA-W
23 Jan 2024
Maybe I should indeed consider resorting to some sort of least-squares fitting, given our discussion below.
For instance, if I call the slmengine function (@John D'Errico) with the 'concaveUp', 'on' option, this routine internally calls lsqlin with linear inequality constraints to make sure f''>=0. In that case, can I be sure that the resulting curve (which may not be an interpolant anymore) is linear w.r.t. the y-values?
"All spline interpolation or least-squares fitting with homogeneous boundary conditions (natural, not-a-knot, periodic, clamp first derivative to 0) are linear with respect to y values regardless of the degree."
If I am not mistaken, what the linked FEX download does is least-squares fitting and Bruno's statement is valid here. But please correct me if I am wrong.
Bruno Luong
23 Jan 2024
I believe that with an inequality such as f''>=0, the resulting spline is no longer linear, for the simple reason that f'' is linear wrt f. By contradiction: if the constraint is f''>=0, it cannot be linear, since if we multiply the y data by -1, f'' must reverse its sign, so one of those two cases does not satisfy f''>=0, unless in the trivial case f==0.
SA-W
23 Jan 2024
@Bruno Luong But can we at least say that the lsqlin problem with the inequality constraints f''>=0 has a unique solution, such that we can compute the sensitivities via finite differencing?
Bruno Luong
23 Jan 2024
You can always do finite differences, for any (differentiable) problem. The linear approximation (sensitivity) is only valid locally, however.
SA-W
23 Jan 2024
Of course, but if the optimization problem had no unique solution, finite differencing would be nonsense. Clearly, the lsqlin fitting problem with f''>=0 must then have a unique solution. Is that the case?
Bruno Luong
23 Jan 2024
Edited: Bruno Luong
23 Jan 2024
Yes, since lsqlin minimizes a quadratic function (strictly convex assuming appropriate bc are used) on a non-empty intersection of half spaces { f''>=0 }, which is a convex set. So the solution exists and is unique.
Torsten
23 Jan 2024
Do you know whether @John D'Errico's formulation in his slmengine function is such that the objective is strictly convex?
Bruno Luong
23 Jan 2024
Not in detail; I do not know how John formulates f''(x)>=0 everywhere ('concaveUp', 'on'?). This is an infinite number of constraints, and John must use some trick to reduce it to a finite number of constraints, I guess.
SA-W
23 Jan 2024
For a cubic polynomial, the second derivative is a piecewise linear function, and slmengine formulates f''(x)>=0 at the knots. If there are n knots, there are n linear inequality constraints, and if f''>=0 at the knots, it is positive everywhere. Does that make sense?
Does the number of inequality or equality constraints have an influence on the uniqueness of the solution of the problem? I guess not; the uniqueness is, sloppily speaking, a "property" of choosing lsqlin, right?
Bruno Luong
23 Jan 2024
I don't know the details of John's implementation. The best is let him comment on his code.
Torsten
23 Jan 2024
Edited: Torsten
23 Jan 2024
In case you use "lsqlin", a sufficient condition for uniqueness is that the matrix C'*C is positive definite (not only positive semidefinite which is per se true). C is explained in the documentation of "lsqlin":
Existence of a solution is assumed, of course (i.e. the feasible region defined by the constraints is not empty).
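A quick numerical check of this condition (my own sketch; it assumes C is the matrix you pass to lsqlin):
min(eig(C'*C)) > 0   % true => C'*C is (numerically) positive definite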
SA-W
24 Jan 2024
I believe that with an inequality such as f''>=0, the resulting spline is no longer linear, for the simple reason that f'' is linear wrt f.
We talked about it already, but I want to understand what you said here. Linear means something like
f(y1+b*y2) = f(y1) + b*f(y2)
How can I translate "f'' is linear w.r.t f" into a similar equation? It sounds like f'' can be formulated as a function of f.
SA-W
25 Jan 2024
Edited: SA-W
25 Jan 2024
I tried to implement the finite difference approach to calculate the derivative of f w.r.t. the y-values.
x = linspace(3, 18, 17); %numel(x) = 17
y = [0 0.0915 0.2564 0.4722 0.7258 1.0146 1.3345 1.6827 2.0519 2.4502 2.8733 3.3100 3.7742 4.2586 4.7577 5.2692 5.8130];
plot(x,y);
% fit curve f with lsqlin and f''>=0 and f'(x(1))>0
slmF = slmengine(x,y,'knots', x, 'concaveup', 'on', 'leftminslope', 0);
% forward differencing to calculate df/dyi
h = 1e-6;
unitVectors = eye(numel(x));
for i=1:numel(x)
yi = y + h*unitVectors(i,:); % perturb the i-th value (row vector, matching the shape of y)
slmi = slmengine(x,yi,'knots', x, 'concaveup', 'on', 'leftminslope', 0);
normCoef = (slmi.coef - slmF.coef) ./ h;
% normCoef are the coefficients for the function si(x) =: df/dyi (x)
% ...
end
The .coef matrices have numel(x) rows and 2 columns, where the first column stores the values of the fitted curve at the points, and the second column the first derivatives.
Given that the x-y data are convex, I expected that the step size h has negligible influence on df/dyi = (f(y + h*e_i) - f(y)) / h. However, this is not the case. Consider normCoef for i=1:
h = 1e-6
normCoef =
0.999995033031503 -1.317473322631768
0.000009659231748 -0.504431128950378
-0.000013950784972 0.135169090947906
0.000019731993817 -0.036213002152508
-0.000017326473589 0.009672115941495
0.000007386535827 -0.002514967512024
0.000006979528067 0.000465534388816
-0.000062704286208 0.000428542923370
0.000286064505417 -0.001286629025543
-0.001068024335638 0.001500947199862
0.001697645135579 -0.000200101823999
-0.001211744038443 -0.001160455398441
0.000550326451076 0.001170506247483
-0.000295791835470 -0.000590524629196
0.000131358923738 -0.000149107171055
-0.000021653789872 0.002064198922902
-0.000012998491172 -0.008569634557531
h = 1e-4
normCoef =
0.999995491566257 -1.349519663240706
0.000007276449859 -0.495840698605254
-0.000009334372719 0.132867014661087
0.000015788467200 -0.035600121559565
-0.000015832012368 0.009512679141621
0.000007386422585 -0.002477481512164
0.000006079643455 0.000467364215662
-0.000058511488987 0.000397151335152
0.000268930491210 -0.001214846808706
-0.001009108188121 0.001420326427026
0.001605728923160 -0.000188703885295
-0.001144649104390 -0.001099241813685
0.000515927571776 0.001098306983138
-0.000269104241113 -0.000492242546724
0.000111552225235 -0.000423337971345
-0.000015869225933 0.002995946402073
-0.000011753211737 -0.011955025051025
h = 1e-2
normCoef =
0.999994228845148 -1.193527414114345
0.000011898524646 -0.537634109765617
-0.000009258211109 0.144052694596855
0.000006093611810 -0.038595244342654
-0.000005391433910 0.010340656461227
0.000002375859265 -0.002779278310200
0.000001377334069 0.000798116836970
0.000012155281559 -0.000381894886553
-0.000082785041489 0.000460143107456
0.000320083377625 -0.000473307636045
-0.000511022516925 0.000062727514960
0.000376935793556 0.000404325307030
-0.000191000298022 -0.000655957641682
0.000120700500794 0.001399552322212
-0.000074713115872 -0.004570132664994
0.000030995909484 0.016593923645802
-0.000002674421129 -0.061575038096129
E.g., the derivative of s1=:df/dy1 at x(1) varies between -1.19 and -1.31 if the step size alters between 1e-2 and 1e-6. For my real code, I evaluate ds1/dx, where the differences are even more visible.
How would you choose the step size h here given the large variations? Does that make sense at all in your opinion?
Bruno Luong
25 Jan 2024
Could you compute finite differences with negative steps: h = -1e-2, h = -1e-4, h = -1e-6
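(For reference, the two one-sided differences can also be combined into a central difference. A sketch, reusing x, y, h, i and the slmengine call from the code above; this is my illustration, not from the thread:)
yiPlus  = y;  yiPlus(i)  = yiPlus(i)  + h;   % forward-perturbed data
yiMinus = y;  yiMinus(i) = yiMinus(i) - h;   % backward-perturbed data
slmPlus  = slmengine(x, yiPlus,  'knots', x, 'concaveup', 'on', 'leftminslope', 0);
slmMinus = slmengine(x, yiMinus, 'knots', x, 'concaveup', 'on', 'leftminslope', 0);
normCoefCentral = (slmPlus.coef - slmMinus.coef) ./ (2*h);   % central-difference estimate of df/dyi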
SA-W
25 Jan 2024
@Bruno Luong Here is what I get. The variations are similar.
h = -1e-6
normCoef =
0.999995830302609 -1.360127859353910
0.000006013509135 -0.492996190398776
-0.000006815048526 0.132104157335444
0.000013115675213 -0.035397712150331
-0.000014465095788 0.009462210348588
0.000007326805829 -0.002469653426207
0.000004652056518 0.000477580253122
-0.000053457682725 0.000364824837007
0.000252192933203 -0.001144739469883
-0.000951206224897 0.001341342537042
0.001514661285285 -0.000180731041244
-0.001080445510837 -0.001031955965125
0.000487524243198 0.001021727369377
-0.000252407872381 -0.000405212974286
0.000101975317079 -0.000634624464091
-0.000013292478229 0.003708900631061
-0.000011200818051 -0.014563127259670
h = -1e-4
normCoef =
0.999995557095890 -1.344854551437596
0.000007355945575 -0.497090932684574
-0.000009569136594 0.133201878235456
0.000015747003146 -0.035689728855592
-0.000015648105034 0.009537584476083
0.000007524463275 -0.002486921180100
0.000004888007776 0.000475815832290
-0.000055428370782 0.000382208811822
0.000261331662976 -0.001184031336865
-0.000983200907356 0.001385044410829
0.001564531446618 -0.000185907001460
-0.001116373882226 -0.001067569649904
0.000505343344948 0.001066783836823
-0.000266497268697 -0.000479960434729
0.000112867839519 -0.000402863722382
-0.000016596670704 0.002891097362623
-0.000011832490543 -0.011560566747226
h = -1e-2
normCoef =
0.999997290366472 -0.936422807636378
0.000006362756698 -0.606526258139201
-0.000006834026561 0.162514642135314
0.000004531791242 -0.043538169491225
-0.000001453809995 0.011655252522630
-0.000001651546766 -0.003102627281132
0.000007493624055 0.000783888390993
-0.000021443238807 -0.000096259697407
0.000065923896475 -0.000211872729772
-0.000226974935025 0.000286049188003
0.000353742383163 -0.000011304865039
-0.000229165389465 -0.000247839180934
0.000072819274166 0.000103707640642
-0.000040335123863 0.000437265468689
0.000029608248386 -0.001991044797922
-0.000006640745109 0.007634735736906
-0.000003273524651 -0.028653119822264
Bruno Luong
25 Jan 2024
Edited: Bruno Luong
25 Jan 2024
It looks like the slope result is more or less equal when you flip the sign of h, meaning the function seems to be differentiable at this specific point (not stuck at an active constraint). John's second-derivative plot in his answer seems to be strictly positive everywhere, which also backs up this observation.
I guess the variation of the finite difference with respect to abs(h) is then not real but specific to the slmengine implementation, or h = 1e-2 is too large. I would test with h = 1e-8, 1e-10, 1e-12 to see if it converges to a value (before numerical truncation spoils the calculation).
Bruno Luong
25 Jan 2024
Edited: Bruno Luong
25 Jan 2024
"Or is that reasonable what we measure here?"
As long as you haven't shown the mathematical well-posedness of the formulation (existence, uniqueness, continuity, and differentiability of the spline interpolation solution), any numerical scheme to derive the sensitivity is doubtful IMO.
Bruno Luong
25 Jan 2024
Edited: Bruno Luong
25 Jan 2024
If you remove the constraint 'concaveup', 'on' and set h = 1e-2, ..., 1e-6, what finite differences do you get?
It should become linear and the finite differences should give the same numerical answer; however, I see only one bc is provided, whereas the problem needs two to be well-posed.
SA-W
25 Jan 2024
Edited: SA-W
25 Jan 2024
If you remove the constraint 'concaveup', 'on' and set h = 1e-2, ..., 1e-6, what finite differences do you get?
It should become linear and the finite differences should give the same numerical answer
Yes, this is exactly what happens. h = {1e-2, 1e-4, 1e-6, ...} all give the coefficients below. Interestingly, the derivative of df/dy1 at x(1) is roughly -1, while it was around -1.3 with f''>=0.
The slmengine imposes two BCs internally if none are provided.
It is probably nonsense to calculate the coefficients for f with 'concaveup', 'on', but the sensitivities without it. The constraints and BCs must match, right?
normCoef =
0.999999999773452 -1.000317634407111
0.000000000416375 -0.589403113383802
-0.000000000242162 0.157930087892399
0.000000000066819 -0.042317239304165
-0.000000000018496 0.011338870040206
0.000000000005018 -0.003038241054493
-0.000000000001377 0.000814094232565
0.000000000000333 -0.000218135890789
-0.000000000000044 0.000058449334900
-0.000000000000089 -0.000015661450015
-0.000000000000044 0.000004196465159
0.000000000000133 -0.000001124410148
0.000000000000089 0.000000301175940
-0.000000000000089 -0.000000080293905
0.000000000000089 0.000000019999180
0.000000000000089 0.000000000297629
0.000000000000178 -0.000000021189644
Bruno Luong
25 Jan 2024
To me, strictly speaking, when you have a finite number of constraints, the approximation solution is in general no longer differentiable when there is an active constraint. It doesn't make sense to speak about "sensitivity".
I would expect the finite difference result not to be the same when you reverse the sign of h.
Now in practice, the constrained problem is hard to solve numerically accurately. The finite difference result therefore depends on the solver implementation. I just repeat myself here.
Bruno Luong
25 Jan 2024
Edited: Bruno Luong
25 Jan 2024
What is normCoef(1,2) = -1.000317634407111 precisely?
Can it be -d(s(x1))/d(y1) or something similar, where s is the spline function?
If it is -1.34, that means s does not interpolate the y data at x1 when 'concaveup' is 'on'.
SA-W
25 Jan 2024
What is normCoef (1,2) = -1.000317634407111 ?
It is the derivative of s1 =: df/dy1 at the left point x(1).
normCoef(1,1) = 0.999999999773452
This is the value of s1 =: df/dy1 at the left point x(1). Since it is close to one, the spline interpolates at that point.
So there seems to be no real solution that satisfies all my requirements: an approximating spline with f''>0 AND f differentiable w.r.t. the y-values. The latter seems to be the problem because too many constraints are active in general. Do you think your FEX below can handle this problem better, or do I simply want to enforce something that is unlikely to work?
Bruno Luong
25 Jan 2024
Edited: Bruno Luong
25 Jan 2024
You always get a non-differentiable solution when the spline solution has at least one point x where f''(x) = 0. When you perturb yi to yi + h, there is one sign of h that makes f''(x) > 0, and the other makes it violate the constraint, meaning f''(x) < 0 if the constraint were disregarded. So the derivative is not defined, because the constraint clamps f''(x) == 0 on the "bad" side.
This should be true regardless of the algorithm.
Bruno Luong
25 Jan 2024
If f'' > 0 for all x, the derivative is defined and equal to that of the unconstrained problem.
SA-W
25 Jan 2024
I see.
So, forgetting about everything related to cubic splines that we talked about so far, can you think of a completely different solution to my problem statement: f''>0 everywhere AND f linear w.r.t. the y-values?
For cubic polynomials, you told me it is difficult to achieve this. Polynomials with higher degrees are possibly even harder to control, but have more degrees of freedom.
That said, is there any approach in your mind that makes f''>0 and f linear w.r.t y-values? Of course, if there is no FEX for that, I would also try to implement something new.
I am ready to give up the C2 continuity (C1 is enough to build piecewise constant second derivatives). Also, I am ready to give up the interpolation property.
Bruno Luong
25 Jan 2024
"can you think of a completely different solution to my problem statement: f''>0 everywhere AND f linear w.r.t. the y-values?"
I thought I had already explained this: no, it's not possible. Since f'' is linear wrt f, if you assume f is linear wrt y, then f'' would be linear wrt y. Therefore you cannot enforce f'' >= 0 if you invert the sign of the y data.
SA-W
25 Jan 2024
Alright, then I have to live with that!
Just for me to understand:
f'' is linear wrt f
Can you explain that? Example: f = x^3, f'' = 6x; I do not see how f'' is linear with respect to f. Linear to me means something like f(a + b*c) = f(a) + b*f(c).
Bruno Luong
25 Jan 2024
Edited: Bruno Luong
26 Jan 2024
Linear operator https://mathworld.wolfram.com/LinearOperator.html
It is trivial to verify that any-order (second order in your case) differential operator satisfies the two required properties in the definition in this link. For any functions f and g and scalar t, the second derivative satisfies
- (f + g)'' = f'' + g''
- (t*f)'' = t * f''
Like many people, you are confusing a linear function (wrt its variable argument) with a linear operator.
SA-W
26 Jan 2024
No, it's not possible. Since f'' is linear wrt f, if you assume f is linear wrt y, then f'' would be linear wrt y. Therefore you cannot enforce f'' >= 0 if you invert the sign of the y data.
So if f''(y)>=0, linearity of f'' w.r.t. y would mean f''(-y) = -f''(y) <= 0.
Is that what you mean?
Bruno Luong
26 Jan 2024
Yes.
SA-W
30 Jan 2024
Let me clarify one last point here:
To me, strictly speaking, when you have a finite number of constraints, the approximation solution is in general no longer differentiable when there is an active constraint
...
You always get a non-differentiable solution when the spline solution has at least one point x where f''(x) = 0
...
If f'' > 0 for all x, the derivative is defined and equal to that of the unconstrained problem.
All of what you said makes sense. However, what happens if the constrained lsqlin problem results in a curve that has f''>=m where m >> h, say f''>=0.1. Assuming "traditional cubic spline interpolation" results in f''<0 somewhere, is it possible that the constrained problem results in f''>=m ?
In that case, no constraints were active and the derivative w.r.t the interpolation values is defined (as long as m>>h), isn't it?
Bruno Luong
30 Jan 2024
Edited: Bruno Luong
30 Jan 2024
@SA-W " Assuming "traditional cubic spline interpolation" results in f''<0 somewhere, is it possible that the constrained problem results in f''>=m ?"
IMO it is not possible, since the fitting score is a convex function. Here is the outline of the proof.
By contradiction: if you draw a line in the space of functions that links the unconstrained solution fu to the constrained solution fc with fc''>=m > 0, the score function will have a negative directional derivative starting from fc (since the function is convex and score(fu) < score(fc)); that means there is a small step h > 0 such that
- fh := fc + h g
- score(fh) < score(fc), since g is negative
- fh''(x) >= 0 for all (x) by continuity of f''
which contradicts the fact that fc is the minimum of the constrained problem. End of the proof.
I'm sorry, but you seem to want something that looks mathematically impossible.
PS: For the same reason, I think the SLM solution posted by John is not fully converged yet, since it has f''(x) > 0 for all x.
SA-W
30 Jan 2024
PS: For the same reason, I think the SLM solution posted by John is not fully converged yet, since it has f''(x) > 0 for all x.
I just double-checked that the x-y-data I posted and John used in his answer produces f''>0 using traditional cubic spline interpolation such as csape. Hence no active constraints.
Indeed, for all the fits I made so far, there is an f'' at a knot amounting to about 1e-5. This supports your proof.
I would like to understand the gist of your sketched proof.
You make the assumption that unconstrained solution fu has a lower score (objective function value) than the constrained solution fc. Makes sense, unconstrained solution should have numerical zero score.
the score function will have directional derivative negative started from fc
Not fully clear to me. Let's make a 1D example and minimize f(x) = x^2. fu is at x=0 (since f' = 2x). Say the constraint is x=3; then the "minimum" is at x=3, and the directional derivative at x=3 is 2*3 = 6 > 0.
Is that example too simple?
Bruno Luong
30 Jan 2024
Edited: Bruno Luong
30 Jan 2024
The directional derivative is by definition df(x+h*d)/dh = dot(gradient of f at x, d).
You only considered the gradient df/dx in your 1D example. Note that the direction is d = fu-fc = 0-3 = -3, so the directional derivative is 6*(-3) = -18 < 0.
SA-W
30 Jan 2024
Makes sense.
that means there is a small step h > 0 such that
- fh := fc + h g
- score(fh) < score(fc), since g is negative
- fh''(x) >= 0 for all (x) by continuity of f''
which contradicts the fact that fc is the minimum of the constrained problem.
Why is that a contradiction?
Bruno Luong
30 Jan 2024
fh has a lower score than fc; therefore fc is NOT the constrained solution, as we assumed earlier.
SA-W
30 Jan 2024
Edited: Bruno Luong
30 Jan 2024
@Bruno Luong fh''(x) >= 0 for all (x) by continuity of f''
And why only >=0 and not >=m ?
fh := fc + h g, hence fh'' = fc'' >= m since h and g are scalars.
Bruno Luong
30 Jan 2024
Edited: Bruno Luong
30 Jan 2024
fh''(x) >= 0 for all (x) by continuity of f''
And why only >=0 and not >=m ?
There is nothing (at least I don't see it) to ensure this inequality (fh'' >= m) still holds, since the only lower bound we know is m (fc''>=m).
When we perturb fc, the bound might no longer be a bound; but for a given epsilon, if you reduce the bound to m-epsilon, then there is a small h such that fh'' >= m-epsilon holds (definition of continuity applied). epsilon here is m.
fh := fc + h g, hence fh'' = fc'' >= m since h and g are scalars.
Oops, sorry. There is a mistake in this definition of fh; the correct one is
fh := fc + h d, with d := fu-fc.
Thanks.
SA-W
30 Jan 2024
Ok. But where in your proof does fc''>=m > 0 become relevant? Are not the steps and the conclusion the same if we only assume fc''>=0? What we wanted to show is that the constrained solution cannot fulfill fc''>=m > 0.
Bruno Luong
30 Jan 2024
Edited: Bruno Luong
31 Jan 2024
"Are not the steps and the conclusion the same if we only assume fc''>=0 ? "
Because I can only show a smaller lower bound, strictly smaller than m (with epsilon). If m is 0 (your case), the whole proof cannot work, since m-epsilon is negative and fh no longer meets the constraint, so I cannot tell whether it is an admissible solution.
John D'Errico
18 Jan 2024
Edited: John D'Errico
18 Jan 2024
Are there easy adjustments? Not really. At least not totally trivial. It sounds as if you want an interpolant, NOT an approximant, so you want a curve that truly passes exactly through the points. That may not always be possible of course.
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0111, 0.0293, 0.0521, 0.0787, 0.1086, 0.1416, 0.1774, 0.2155, 0.2556, 1.0248];
plot(x,y,'-o')
In this case, an interpolant with a positive second derivative should work, though I would need to look very carefully at the data to be sure of that.
How would I do it? I would start with a C1 cubic interpolant, probably one formulated in terms of a set of unknown first derivatives at each break point. Then choose the set of first derivatives such that the second derivative is everywhere positive, and that minimizes the integral of the square of the second derivative over the support of the interpolant. You could write it using a quadratic programming tool to solve the problem, so it would be efficient. That square of the integral would be written as a simple quadratic form.
So, not utterly trivial to write, but not hard either. That would depend on your skill in writing and formulating things like quadratic programming problems, and in working with piecewise interpolants.
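Purely as an illustration of this quadratic-programming formulation (this is not John's code; the simple quadrature, the variable names, and the reuse of x, y and Evaluate_basisFunctions from the question are my own assumptions), a rough sketch could look like this:
% Sketch: integral of f''^2 as a quadratic form in the unknown derivatives d = yprime,
% approximated by a rectangle rule; f''(xq) = c + A*d is affine in d
n  = numel(x);
nq = 50*(n-1);                            % quadrature / constraint points
xq = linspace(x(1), x(end)-1e-9, nq);
w  = (x(end) - x(1)) / nq;                % uniform quadrature weight
A  = zeros(nq, n);
c  = zeros(nq, 1);
for q = 1:nq
    i  = find(xq(q) >= x, 1, "last");
    hi = x(i+1) - x(i);
    t  = (xq(q) - x(i)) / hi;
    H  = Evaluate_basisFunctions(t, 2);   % 2nd derivatives of the Hermite basis
    c(q)      = (y(i)*H(1) + y(i+1)*H(2)) / hi^2;
    A(q, i)   = H(3) / hi;
    A(q, i+1) = H(4) / hi;
end
Hq = 2*w*(A'*A);                          % quadratic term of sum_q w*(c(q) + A(q,:)*d)^2
fq = 2*w*(A'*c);                          % linear term
% minimize 0.5*d'*Hq*d + fq'*d  subject to  f'' >= 0, i.e. -A*d <= c
yprime = quadprog(Hq, fq, -A, c);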
Can it be done? Using my SLM toolbox, I built this curve:
With a second derivative plot as shown below:
As you can see, it is everywhere non-negative as required.
33 comments
SA-W
18 Jan 2024
Yes, I want an interpolant.
I have some experience with the optimization toolbox and want to give your scheme a try.
I would start with a C1 cubic interpolant, probably one formulated in terms of a set of unknown first derivatives at each break point. Then choose the set of first derivatives such that the second derivative is everywhere positive, and that minimizes the integral of the square of the second derivative over the support of the interpolant.
A cubic segment can be written as
p(x) = a + b(x-x_i) + c(x-x_i)^2 + d(x-x_i)^3
p'(x) = b + 2c(x-x_i) + 3d(x-x_i)^2
p''(x) = 2c + 6d(x-x_i)
It sounds like you would minimize an objective function like
min ∫ f''(x)^2 dx,
i.e., minimizing the second derivatives. But why? f''(x) > 0 is required, but the (positive) value itself can be as large as it likes.
Also, it is not clear to me what the parameters of the optimization problem are. For cubic splines, we have 4 unknowns per segment, but it sounds like you only assume the first derivatives to be unknowns.
Maybe I did not fully understand your idea and you can clarify my concerns.
SA-W
20 Jan 2024
Edited: SA-W
20 Jan 2024
I implemented a slightly different approach than you suggested.
I started with a C1 cubic Hermite interpolant formulated in terms of the derivatives at the break points. However, in my objective function, I minimized the jump in the second derivatives of the interpolant over the break points. Say x(2) is breakpoint 2; then I created two points in the vicinity of x(2), x(2) - delta and x(2) + delta, evaluated the second derivative there, and built the difference. I do this for all interior break points and sum the squared values up.
The constraints that the second derivative must be greater than zero are implemented as linear constraints (A and b, in MATLAB terminology).
I solved the problem using the fmincon optimization algorithm to get the derivatives at the break points.
That worked; however, for my application, I also need the derivative of the interpolant w.r.t. the y-values. You have answered this question here and mentioned that, in most cases, the interpolant is linear w.r.t. the y-values because all operations to solve for the spline are linear.
But, in my case, the optimization problem I set up is non-linear, right? At least, fmincon takes several iterations to find the derivatives. Do you see a way to calculate the sensitivities in that case? Or do you think I can treat my problem as a linear optimization problem? ( you mentioned quadprog in your answer)
Torsten
20 Jan 2024
Edited: Torsten
20 Jan 2024
Say x(2) is breakpoint 2, then I created two points in the vincinity of x(2), x(2) - delta, and x(2) + delta, evaluated the second derivative there, build the difference.
But you have the coefficients of the piecewise polynomials left and right of x(2). Just evaluate their respective 2nd derivative at x(2) to get the jump.
But, in my case, the optimization problem I set up is non-linear, right? At least, fmincon takes several iterations to find the derivatives. Do you see a way to calculate the sensitivities in that case?
Maybe we can tell you more if you show us your code.
Torsten
20 Jan 2024
Shall I open a new question for that and link it here or shall I post my code in this chat?
Doesn't matter.
If I'm not mistaken, your problem can be formulated as a linear optimization problem that can be solved using "linprog".
SA-W
20 Jan 2024
I updated my question including my code. Let me know if something is not clear or not well documented.
If I'm not mistaken, your problem can be formulated as a linear optimization problem that can be solved using "linprog".
If this is true, the interpolant is linear w.r.t the interpolation values, right?
Torsten
20 Jan 2024
If I'm not mistaken, your problem can be formulated as a linear optimization problem that can be solved using "linprog".
If this is true, the interpolant is linear w.r.t the interpolation values, right?
I don't think so. The solution of an optimization problem does not depend linearly on its inputs.
SA-W
20 Jan 2024
I don't think so. The solution of an optimization problem does not depend linearly on its inputs.
I see. But if I consider the linear system for standard cubic spline interpolation, Ax=b, where the interpolant is linear w.r.t. the y-values, I could treat the system as a linear optimization problem, namely min ||Ax-b||. Here my intuition is that the solution of a linear system can be seen as the solution of a linear optimization problem.
Anyway, do you think my problem can be solved using linear algorithms?
And, as I said, I need the derivative of the interpolant w.r.t. the y-values; how would you compute it here?
John D'Errico
20 Jan 2024
Edited: John D'Errico
20 Jan 2024
i.e., minimizing the second derivatives. But why? f''(x) > 0 is required, but the absolute (positive) value can be as big as it is.
You want to minimize the square of the second derivatives to ensure the resulting interpolant is as smooth as possible. Small second derivatives do that.
But, in my case, the optimization problem I set up is non-linear, right? At least, fmincon takes several iterations to find the derivatives. Do you see a way to calculate the sensitivities in that case? Or do you think I can treat my problem as a linear optimization problem? ( you mentioned quadprog in your answer)
No. The optimization is nonlinear, but only in the sense that it would be a quadratic functional of the unknown parameters, thus the unknown first derivatives at the breaks. Again, you can write the integral of the square of the second derivative as a quadratic form. I said EXACTLY that as you should recall.
And that can be formulated as a QUADRATIC PROGRAMMING problem. Ergo, quadprog. That it took fmincon more than one iteration is because a quadratic is not a linear objective. This means it takes fmincon a few iterations just to figure out the shape of the surface it is wandering around.
However, in my objective function, I minimized the jump in the second derivatives of the interpolant over the break points. Say x(2) is breakpoint 2, then I created two points in the vincinity of x(2), x(2) - delta, and x(2) + delta, evaluated the second derivative there, build the difference. I do this for all interior break points and sum the squared values up.
That will do something approximate. Not at all valid, and surely not the smoothest possible curve that passes through the data, yet satisfies the requirements on the second derivative. Probably not obscenely terrible either. It is not the same thing as the integral of the square of the second derivative, which will produce different results.
Why do I say the approach you took is not at all the best? Suppose you had a second derivative that was perfectly continuous, AND always positive, but also a truly jagged sawtooth? If you did, then your solution would love it! But that sawtooth second derivative would be pure crap in terms of being smooth.
Also, it is not clear to me what the parameters are of the optimization problem. For cubic splines, we have 4 unknowns per segment, but it sounds you only assume the first derivatives to be unknowns.
Suppose you have a cubic spline that passes through two points, (x1,y1), (x2,y2). This means the values of the spline interpolant at the end points of that interval are fully KNOWN. Do you agree with that? So you can then easily formulate a cubic segment using a variation of a Hermite form, where only the end point first derivatives are unknown. Again, do you follow?
For example, write a cubic segment on the interval [0,h] as I do below:
syms t h
syms y1 y2 d1 d2
P1(t) = 2*t^3/h^3 - 3*t^2/h^2 + 1;
P2(t) = 3*t^2/h^2 - 2*t^3/h^3;
P3(t) = t - 2*t^2/h + t^3/h^2;
P4(t) = t^3/h^2 - t^2/h;
Now build a cubic that passes through the points (0,y1), (h,y2), and has first derivatives as d1 and d2, where d1 and d2 are currently unknown. That is simply:
P(t) = y1*P1 + y2*P2 + d1*P3 + d2*P4
P(t) = y1*(2*t^3/h^3 - 3*t^2/h^2 + 1) + y2*(3*t^2/h^2 - 2*t^3/h^3) + d1*(t - 2*t^2/h + t^3/h^2) + d2*(t^3/h^2 - t^2/h)
We can verify it has the necessary properties.
P(0)
ans = y1
P(h)
ans = y2
dP = diff(P,t);
dP(0)
ans = d1
dP(h)
ans = d2
Do you understand that P(t), on the interval [0,h] has some nice properties? It interpolates a function with known function values at each end. And you can set the first derivatives to be anything you wish.
Do the same for each interval between breaks. Do you see that the function will be C1 continuous?
I've given you enough information now to build a C1 curve where only the first derivatives are unknowns. It will be fully C1, but by varying those first derivatives at the breaks, you can easily have a curve that is more or less smooth and well-behaved. The curve you will want is the one with minimum integral of the second derivative squared.
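To make that last point concrete, here is a small symbolic sketch (my own addition, assuming the Symbolic Math Toolbox; Iseg and Hseg are made-up names) showing that the integral of the squared second derivative over one segment is a quadratic form in (y1, y2, d1, d2), which is exactly what a quadratic programming tool needs:
ddP  = diff(P(t), t, 2);              % second derivative of the segment
Iseg = int(ddP^2, t, 0, h)            % integral of (P'')^2 over [0,h]; quadratic in y1, y2, d1, d2
Hseg = hessian(Iseg, [y1 y2 d1 d2])   % constant 4x4 matrix whose entries depend only on h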
Torsten
20 Jan 2024
Maybe I misunderstood the first argument in the call to "Evaluate_cubicSegment", but shouldn't the two curves obtained this way be equal ?
hold on
plot(x,y)
for i = 1:numel(x)-1
yapprox(i) = Evaluate_cubicSegment(0, 0, x(i+1)-x(i), y(i), y(i+1), yprime(i), yprime(i+1));
end
yapprox(numel(x)) = Evaluate_cubicSegment(1, 0, x(end)-x(end-1), y(end-1), y(end), yprime(end-1), yprime(end));
plot(x,yapprox)
hold off
SA-W
20 Jan 2024
Maybe I misunderstood the first argument in the call to "Evaluate_cubicSegment", but shouldn't the two curves obtained this way be equal ?
Yes. And the code you showed also plots the same curves (y == yapprox).
And it results in a linear optimization problem.
I do not see how to translate my code into the linprog or quadprog notation. I would appreciate it if you could show that, maybe in a separate answer.
SA-W
20 Jan 2024
Suppose you have a cubic spline that passes through two points, (x1,y1), (x2,y2). This means the value of the spline interpolant at the end points of that interval are fully KNOWN. Do you aggree with that? So you can then easily formulate a cubic segment using a variation of a Hermite form, where only the end point first derivatives are unknown. Again, do you follow?
Yes, that makes total sense. When I wrote this comment, I actually was not aware of the Hermite form of a cubic polynomial.
That will do something approximate. Not at all valid, and surely not the smoothest possible curve that passes through the data, yet satisfies the requirements on the second derivative. Probably not obscenely terrible either. It is not the same thing as the integral of the square of the second derivative, which will produce different results.
Why do I say the approach you took is not at all the best? Suppose you had a second derivative that was perfectly continuous, AND always positive, but also a truly jagged sawtooth? If you did, then your solution would love it! But that sawtooth second derivative would be pure crap in terms of being smooth.
My goal is to obtain a smooth second derivative. By smooth, I think about a C2 interpolant, that is, a continuous second derivative. But whether the absolute value of the second derivative is of order 1e2 or 1e5, does not matter to me. By minimizing the square of the second derivative, I think it can easily happen that the second derivative has jumps at the breaks, a property I want to avoid if possible.
I do not understand your example with the sawtooth second derivative? Using my approach, a second derivative which is "sawtooth shaped" is not possible. Maybe you can comment on that again.
If I stick to the approach (minimizing the jumps of the second derivative), is it also possible to formulate it as a quadprog problem?
Torsten
20 Jan 2024
Edited: Torsten
20 Jan 2024
min: sum_{i=2}^{n-1} eps_(i)
p_(i)''(0) - p_(i-1)''(1) <= eps_(i) (1)
-(p_(i)''(0) - p_(i-1)''(1)) <= eps_(i) (1a) (i = 2,...,n-1)
((1) together with (1a) set abs(p_i''(0)-p_(i-1)''(1)) (jump in breakpoint i) to eps_i)
p_(i-1)(1) - p_(i)(0) = 0 (2) (i = 2,...,n-1)
(continuity of function values at break points)
p_(i-1)'(1) - p_(i)'(0) = 0 (3) (i = 2,...,n-1)
(continuity of first derivatives at break points)
p_(i)(0) - y(i) = 0 (4) (i = 1,...,n)
(interpolation property)
-p_(i)''(0) <= 0 (5) (i = 1,...,n) (or i = 2,...,n-1 ?)
(convexity condition)
You will have to think about useful boundary conditions of your approximating function (for i=1 and i=n).
Unknowns are eps_i and yprime_i.
It might be the case that a solution only exists for "mildly nonconvex" functions. For others, it could be necessary to weaken the continuity conditions. That's why I wrote that for me, interpolation with positivity constraint on the second derivative doesn't make much sense.
John D'Errico
20 Jan 2024
Edited: John D'Errico
20 Jan 2024
My goal is to obtain a smooth second derivative. By smooth, I think about a C2 interpolant, that is, a continuous second derivative.
If this is your goal, then by implication you think you want to generate a traditional cubic spline. That is, the almost unique curve that is C2. I say almost unique, because the only free parameters are two parameters, and there are several common ways to choose those parameters. HOWEVER, you cannot find a cubic spline (in general) which will always have non-negative second derivatives. (For some problems it can exist. In fact, I showed that I was able to generate such a curve in my answer for your data.) You simply do not have sufficient degrees of freedom in a piecewise cubic interpolant to create a C2 interpolant that has everywhere non-negative second derivatives, thus fitting any set of data.
SA-W
20 Jan 2024
Edited: SA-W
20 Jan 2024
If this is your goal, then by implication you think you want to generate a traditional cubic spline. That is, the almost unique curve that is C2. I say almost unique, because the only free parameters are two parameters, and there are several common ways to choose those parameters. HOWEVER, you cannot find a cubic spline (in general) which will always have non-negative second derivatives. (For some problems it can exist. In fact, I showed that I was able to generate such a curve in my answer for your data.) You simply do not have sufficent degrees of freedom in a piecewise cubic interpolant to create a C2 interpolant that has everywhere non-negative second derivatives, thus fitting any set of data.
I got your point. However, the property of positive second derivatives is crucial for me. If I observe that a traditional cubic spline interpolant violates that, I have to come up with a different interpolant. Your suggestion (minimizing the square of the second derivatives) and my approach (minimizing the jumps in the second derivatives at the breaks) are ways to better ensure f''>0.
As I also wrote, I need to calculate the derivative of the interpolant w.r.t. the interpolation values y. Since we solve a non-linear optimization problem to completely define the interpolant, the resulting interpolant is probably not linear w.r.t. the interpolation values y. In such a case, how can we compute those sensitivities?
Torsten
20 Jan 2024
Edited: Torsten
20 Jan 2024
In such a case, how can we compute those sensitivities?
By numerical difference quotients. But of course it only makes sense if your problem is well-posed, which means that a solution exists and is unique. Up to now, I don't see that this is guaranteed with your nonlinear solution method.
SA-W
20 Jan 2024
Edited: SA-W
20 Jan 2024
Unknowns are eps_i and yprime_i.
(2)-(5), I guess, is just a summary of important relations.
You will have to think about useful boundary conditions of your approximating function (for i=1 and i=n).
Why would you impose boundary conditions for i=1 and i=n? The optimization problem can be solved without boundary conditions there.
By numerical difference quotients. But of course it only makes sense if your problem is well-posed which means that a solution exists and is unique solution. Up to now, I don't see that this is guaranteed with your nonlinear solution method.
Yes. I tested multistart on my problem with random start vectors, which converged to a unique solution. That's of course no guarantee.
If we were able to translate the problem into quadprog (what @John D'Errico said), we could easily show that the optimization problem is convex and has one local minimum, which is the global one.
Torsten
20 Jan 2024
Edited: Torsten
20 Jan 2024
Unknowns are eps_i and yprime_i.
(2)-(5), I guess, is just a summary of important relations.
(1),(1a),(2)-(5) are the constraints of the optimization problem. They can be formulated in terms of yprime - the main part of the solution vector.
You will have to think about useful boundary conditions of your approximating function (for i=1 and i=n).
Why would you impose boundary conditions for i=1 and i=n? The optimization problem can be solved without boundary conditions there.
In the usual spline interpolation, two boundary conditions for i = 1 and i = n are necessary to make the solution of the problem unique. I don't know what further conditions might be necessary to make the solution of your or my formulation of the optimization problem unique. An existing and unique solution is necessary to build the derivatives you requested. If you don't have a function that uniquely maps a vector of y-values to a vector of yprime-values, you cannot build derivatives.
SA-W
20 Jan 2024
Edited: SA-W
20 Jan 2024
I don't know what further conditions might be necessary to make the solution of your or my formulation of the optimization problem unique.
So your intuition is that the solution to the problem is not unique at the moment?
I mean, we already specify the function value, the first derivative (via the solution), and the second derivative as a constraint. Only the third derivative is untouched.
Actually, I thought it makes sense to impose (for i=2 and i=n-1) that the third derivative matches from both sides as well - to have a smoother "beginning" and "ending" of the second derivative. But this is of course not really a boundary condition but rather an additional constraint.
But do you agree that we simply cannot impose any BC (except on the third derivative)?
Torsten
21 Jan 2024
Edited: Torsten
21 Jan 2024
For possible boundary conditions for usual spline interpolation, look up Bruno's answer under
My intuition is that the problem is too strictly constrained to have a solution in most cases. Did you try your code e.g. for a concave function ? I did for y=-x^2, and it gives negative 2nd derivatives almost everywhere. Thus the constraints in my problem formulation
-p_(i)''(0) <= 0 (5) (i = 1,...,n) (or i = 2,...,n-1 ?)
(convexity condition)
would most probably lead to a failure.
Of course, you could weaken the matching conditions for function values and derivatives at the breakpoints in the same way as I did for the second derivatives and put the sum of the deviations from matching into the objective function. This will at least guarantee that a solution exists.
SA-W
21 Jan 2024
Edited: SA-W
21 Jan 2024
Of course, you could weaken the matching conditions for function values and derivatives at the breakpoints in the same way as I did for the second derivatives and put the sum of the deviations from matching into the objective function. This will at least guarantee that a solution exists.
But matching function values and first derivatives at the breaks is a priori satisfied by the choice of a C1 interpolant. Regardless of which point in the parameter space we are at, the interpolant always interpolates values and first derivatives. So I would expect that "matching values" and "matching first derivatives" give zero contribution to the objective function. Is that not true? Maybe I misunderstood something in your approach.
Torsten
21 Jan 2024
Edited: Torsten
21 Jan 2024
What I want to say is that the weakened condition of a possible jump in the second derivatives (1), (1a) at the breakpoints might not be enough to ensure that a solution of the problem with "hard constraints"
p_(i-1)(1) - p_(i)(0) = 0 (2) (i = 2,...,n-1)
(continuity of function values at break points)
p_(i-1)'(1) - p_(i)'(0) = 0 (3) (i = 2,...,n-1)
(continuity of first derivatives at break points)
p_(i)(0) - y(i) = 0 (4) (i = 1,...,n)
(interpolation property)
together with the convexity condition
-p_(i)''(0) <= 0 (5) (i = 1,...,n) (or i = 2,...,n-1 ?)
(convexity condition)
will exist.
If you have y-values that follow a convex curve, the problem I defined will be solvable.
But I'm quite sure it won't give a solution for more general curves if you don't weaken either condition (5) or conditions (2)-(4).
Of course you have the option to choose splines with more degrees of freedom, e.g. quintic spline functions instead of cubic ones.
SA-W
21 Jan 2024
Of course you have the option to choose splines with more degrees of freedom, e.g. quintic spline functions instead of cubic ones.
Never thought about going to higher orders.
If we go for a quintic spline in Hermite form, we can specify the value, the first, and the second derivative at every break (6 degrees of freedom per segment). That way, the interpolant is C2 a priori.
The parameters of the optimization problem are the first and second derivatives at the breaks (2n parameters) and, within the objective function, we only sum up the contributions from the convexity condition (f'' >= 0).
Does that make sense or would you setup the problem/objective function in a different manner?
Torsten
21 Jan 2024
Edited: Torsten
22 Jan 2024
As said, if you really need a derivative of the spline coefficients generating procedure with respect to the y-values in the end, you should not start the whole task without being able to prove that the optimization problem has a unique solution, thus that your optimization establishes a function between y-values and results of the optimization.
If this is not that important and you only want to test the approach, I'd start with quartic splines, demand continuity of 0th, 1st and 2nd derivatives and positivity of the second derivative in the breakpoints and minimize sum p_i''(0) over the breakpoints in the objective function.
The disadvantage of taking a spline of higher order than cubic is that you no longer have control over whether you get a negative second derivative between the breakpoints. For quartic splines, e.g., the second derivative is still quadratic, while it is linear in the cubic case.
SA-W
22 Jan 2024
Yes, without being able to calculate sensitivities, this interpolation scheme is useless for me. So the uniqueness of the optimization problem must be given.
But I thought the motivation to move to quintic splines was that the resulting optimization problem has a unique solution.
The disadvantage of taking a spline of higher order than cubic is that you no longer have control whether you get a negative second-order derivative between the breakpoints
For instance, using quintic splines, the second derivative is cubic. But we are not restricted to minimizing "sum p_i''(0)"; we can also choose many more points between the breaks at which to enforce the constraint, e.g. z = linspace(x(1), x(n), m) with m >> n, as I did in the code shown in the question. Is this problematic for the uniqueness of the problem? Or why did you suggest choosing only the breaks?
Torsten
22 Jan 2024
But I thought the motivation to move to quintic splines was that the resulting optimization problem has a unique solution.
The motivation is that a solution exists at all.
For instance, using quintic splines, the second derivative is cubic. But we are not restricted to minimizing "sum p_i''(0)"; we can also choose many more points between the breaks at which to enforce the constraint, e.g. z = linspace(x(1), x(n), m) with m >> n, as I did in the code shown in the question. Is this problematic for the uniqueness of the problem? Or why did you suggest choosing only the breaks?
Choosing constraints between the breakpoints makes the problem more and more arbitrary and thus impossible to prove anything about its solvability.
SA-W
22 Jan 2024
Choosing constraints between the breakpoints makes the problem more and more arbitrary and thus impossible to prove anything about its solvability.
Makes sense.
And are you convinced that, if we use quintic splines and apply p_i''(0) >= 0 only at the breaks, the optimization problem has a unique solution?
Torsten
22 Jan 2024
Edited: Torsten
22 Jan 2024
And are you convinced that, if we use quintic splines and apply p_i''(0) >= 0 only at the breaks, the optimization problem has a unique solution?
No. I'm only quite certain that it has at least one solution. And I can easily imagine that the interpolant could be non-convex between the breakpoints - a cubic polynomial for the second derivative can be almost anything.
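As a small illustration of that last point, here is a toy cubic of my own (unrelated to the data): it is nonnegative at both ends of the unit interval and yet dips below zero inside it.
% toy cubic: q(0) = 0 and q(1) = 0.24, both >= 0, yet q < 0 somewhere in (0,1)
q = @(t) t.*(t - 0.4).*(t - 0.6);
t = linspace(0, 1, 101);
plot(t, q(t)), yline(0), grid on
min(q(t))   % about -0.005 < 0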
SA-W
22 Jan 2024
So we do not gain much from a polynomial of degree 5. In previous comments, you suggested using quartic splines. But how do you want to ensure continuity of the second derivative? We only have 5 degrees of freedom and would need 6 to also prescribe the second derivative.
Torsten
22 Jan 2024
Edited: Torsten
22 Jan 2024
I don't understand. For usual quartic spline interpolation, even continuity of the third derivative is ensured at the breakpoints. I don't know which constraints can still be imposed to get a unique solution if you want positive second derivatives at the breakpoints.
And what do you mean by "prescribe the second derivative" ? Does it mean "keep continuous" ?
I think the whole debate is in vain because proving that a unique solution for an optimization problem exists (even if it's a linear one) will be hopeless.
SA-W
22 Jan 2024
Yes, but in "usual quartic spline interpolation", I have no control over the second derivative. Then I do not see a benefit compared to "usual cubic spline interpolation".
And what do you mean by "prescribe the second derivative" ? Does it mean "keep continuous" ?
For the cubic Hermite form, we made an ansatz of the form
p_i(t) = H_0(t) y_i + H_1(t) y_(i+1) + H_2(t) y'_i + H_3(t) y'_(i+1).
A cubic polynomial has four degrees of freedom, so we can prescribe the function values (y_i and y_(i+1)) as well as the first derivatives (y'_i and y'_(i+1)) at the left and right ends to obtain a C1 interpolant.
Similarly, for a quintic polynomial, we can make an ansatz with six summands
p_i(t) = H_0(t) y_i + H_1(t) y_(i+1) + H_2(t) y'_i + H_3(t) y'_(i+1) + H_4(t) y''_i + H_5(t) y''_(i+1)
which gives two more degrees of freedom to prescribe the second derivatives (y''_i and y''_(i+1)). Since we can prescribe the second derivatives at the breaks, the interpolant is a C2 function.
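Here is a minimal sketch of this quintic Hermite ansatz (my own illustration with made-up segment data, not yet the optimization). On [x_i, x_(i+1)] with h = x_(i+1) - x_i and t = (x - x_i)/h, the first- and second-derivative data have to be scaled by h and h^2.
% standard quintic Hermite basis on the unit interval
H0 = @(t) 1 - 10*t.^3 + 15*t.^4 - 6*t.^5;             % value, left end
H1 = @(t) 10*t.^3 - 15*t.^4 + 6*t.^5;                 % value, right end
H2 = @(t) t - 6*t.^3 + 8*t.^4 - 3*t.^5;               % 1st derivative, left end
H3 = @(t) -4*t.^3 + 7*t.^4 - 3*t.^5;                  % 1st derivative, right end
H4 = @(t) 0.5*t.^2 - 1.5*t.^3 + 1.5*t.^4 - 0.5*t.^5;  % 2nd derivative, left end
H5 = @(t) 0.5*t.^3 - t.^4 + 0.5*t.^5;                 % 2nd derivative, right end
% second derivatives of the basis functions w.r.t. t (to check p'' >= 0)
D0 = @(t) -60*t + 180*t.^2 - 120*t.^3;
D1 = @(t)  60*t - 180*t.^2 + 120*t.^3;
D2 = @(t) -36*t +  96*t.^2 -  60*t.^3;
D3 = @(t) -24*t +  84*t.^2 -  60*t.^3;
D4 = @(t) 1 - 9*t + 18*t.^2 - 10*t.^3;
D5 = @(t) 3*t - 12*t.^2 + 10*t.^3;
% hypothetical data for one segment (the derivatives would be the optimization variables)
xi = 3; xip1 = 3.8125; h = xip1 - xi;
yi = 0;       yip1 = 0.0111;    % interpolation values at the two breaks
dyi = 0.01;   dyip1 = 0.02;     % first derivatives at the two breaks
ddyi = 0.005; ddyip1 = 0.01;    % second derivatives at the two breaks
t = linspace(0, 1, 101);
p   = H0(t)*yi + H1(t)*yip1 + h*(H2(t)*dyi + H3(t)*dyip1) + h^2*(H4(t)*ddyi + H5(t)*ddyip1);
d2p = (D0(t)*yi + D1(t)*yip1 + h*(D2(t)*dyi + D3(t)*dyip1) + h^2*(D4(t)*ddyi + D5(t)*ddyip1)) / h^2;
% endpoint checks: p reproduces the values, d2p reproduces the prescribed curvatures
[p(1), p(end), d2p(1), d2p(end), min(d2p)]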
I think the whole debate is in vain because proving that a unique solution for an optimization problem exists (even if it's a linear one) will be hopeless.
Probably yes. At least I do not have the mathematical background to come up with a proof of the uniqueness of the problem. However, I know my y-values result from sampling a convex curve, so I can at least give the ideas we discussed a try.
Above, you said
If this is not that important and you only want to test the approach, I'd start with quartic splines, demand continuity of 0th, 1st and 2nd derivatives and positivity of the second derivative in the breakpoints and minimize sum p_i''(0) over the breakpoints in the objective function.
that you would implement this approach. However, it is not clear to me how you can demand continuity of the second derivative with a quartic spline. A quartic spline in Hermite form has only five degrees of freedom, but we need six (two function values, two first derivatives, two second derivatives). Maybe you can comment again on how you want to realize the continuity of the second derivative with a quartic spline.
Torsten
22 Jan 2024
Edited: Torsten
22 Jan 2024
Maybe it's a language problem, but the word "prescribe" is wrong for y',y'',... at the breakpoints in usual spline interpolation. The only thing you prescribe are x and y. All other parameters (y',y'',... at the breakpoints) are computed, namely such that the spline function has continuous first, second, third,... derivatives at the breakpoints. So usually you don't have control over the numerical values of y',y'',... at the breakpoints.
But you want to have some sort of control over y'', namely y'' should be >=0. Since the usual cubic spline only has the property that the second derivative is continuous in the breakpoints, you either have to weaken smoothness properties of the cubic spline or add one or more polynomial degree(s) to gain degrees of freedom for the additional conditions p_i''(0) >= 0 in the breakpoints. That's why I suggested the next higher polynomial degree, namely quartic. In usual spline interpolation, for each additional degree of the local polynomials, you can demand 1 degree higher smoothness in the breakpoints. So while in usual spline interpolation, cubic polynomials lead to C^2 functions, quartic polynomials lead to C^3 functions. But maybe if you reduce the demand for quartic splines to C^2 instead of C^3, you gain degrees of freedom for the additional conditions p_i''(0) >= 0. I don't know - it has to be programmed to see what comes out.
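To make the degree-of-freedom bookkeeping explicit (my own count, assuming n breakpoints and one quartic piece per interval):
- unknowns: 5 coefficients per piece, i.e. 5(n-1) in total;
- interpolation at both ends of every piece: 2(n-1) conditions;
- continuity of the 1st, 2nd and 3rd derivatives at the n-2 interior breaks: 3(n-2) conditions;
- that is 5n-8 conditions for 5n-5 unknowns, leaving 3 conditions for the usual end conditions.
Dropping the C3 requirement frees exactly one condition per interior break (n-2 in total), which is what could be spent on the constraints p_i''(0) >= 0 there.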
David Goodmanson
26 Jan 2024
Hi SW,
Here is one way to go about it. The data has 10 closely spaced points and one well-separated point x(11). Fit the first 10 points with a cubic polynomial. It so happens that the second derivative is positive in that interval. Then fit the last interval, from x(10) to x(11), with a quadratic that has the right values at the end points and the same first derivative at x(10) as the cubic fit (three eqns, three unknowns). That way C1 is good, and by eye (and calculation) the second derivative is going to be positive in that interval.
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
xshort = x(1:end-1);
xfin = x(end);
y = [0, 0.0111, 0.0293, 0.0521, 0.0787, 0.1086, 0.1416, 0.1774, 0.2155, 0.2556, 1.0248];
yshort = y(1:end-1);
yfin = y(end);
% fit all but the last point with a cubic
p = polyfit(xshort,yshort,3)
% make sure that y''> 0
d2ydx2 = 6*p(1)*xshort + 2*p(2);
figure(1)
plot(xshort,d2ydx2), grid on % it does stay positive
xs1 = xshort(1);
xse = xshort(end);
yse = yshort(end);
% find the derivative at xse, and fit the interval from xse to xfin
% by a quadratic that matches the derivative at xse
dydx = sum(p.*[3*xse^2 2*xse 1 0])
q = [xse^2 xse 1; xfin^2 xfin 1; 2*xse 1 0]\[yse; yfin; dydx];
q = q'
xfit1 = linspace(xs1,xse,50);
yfit1 = polyval(p,xfit1);
xfit2 = linspace(xse,xfin,50);
yfit2 = polyval(q,xfit2);
xfit = [xfit1 xfit2];
yfit = [yfit1 yfit2];
figure(2)
plot(x,y,'o-',xfit,yfit), grid on
p = -0.0002 0.0058 -0.0186 0.0074
dydx = 0.0496
q = 0.0151 -0.2611 1.3464
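A quick numerical check of the convexity of the last segment, using the q printed above: the second derivative of the quadratic piece is the constant 2*q(1).
d2_last = 2*q(1)   % = 0.0302 > 0, so the quadratic piece is indeed convex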
Alex Sha
31 Jan 2024
Moved: Bruno Luong
31 Jan 2024
How about replacing interpolation with a fitting function that ensures non-negative second derivatives:
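The formula image is not reproduced here; based on the symbolic check in the comments further down, it appears to be of the form y = exp(p1 + p2/x + p3*exp(p4*x)). A minimal sketch of how such a fit could be set up with lsqcurvefit (the start values are rough guesses of mine, not the original fitting procedure):
% assumed model form: y = exp(p1 + p2/x + p3*exp(p4*x))
model = @(p, x) exp(p(1) + p(2)./x + p(3)*exp(p(4)*x));
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0111, 0.0293, 0.0521, 0.0787, 0.1086, 0.1416, 0.1774, 0.2155, 0.2556, 1.0248];
p0 = [0.3, -18, 0.001, 0.4];          % rough start values
pfit = lsqcurvefit(model, p0, x, y);  % requires the Optimization Toolbox
xx = linspace(x(1), x(end), 200);
plot(x, y, 'o', xx, model(pfit, xx), '-')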
1: For data:
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0111, 0.0293, 0.0521, 0.0787, 0.1086, 0.1416, 0.1774, 0.2155, 0.2556, 1.0248];
Result:
Sum Squared Error (SSE): 1.81612553970697E-5
Root of Mean Square Error (RMSE): 0.00128492148317141
Correlation Coef. (R): 0.999990141249137
R-Square: 0.999980282595468
Parameter Best Estimate
--------- -------------
p1 0.260835290168589
p2 -17.6776599023055
p3 0.00142484062709233
p4 0.400795422709269
2: For data:
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0198463, 0.0397084, 0.0596019, 0.0795445, 0.0995199, 0.119908, 0.140298, 0.160773, 0.181772, 6.64478];
Result:
Sum Squared Error (SSE): 7.99353965884897E-5
Root of Mean Square Error (RMSE): 0.00269571033965396
Correlation Coef. (R): 0.999999018472319
R-Square: 0.999998036945602
Parameter Best Estimate
--------- -------------
p1 -0.468107959471592
p2 -12.9882511576562
p3 2.13196222726725E-6
p4 0.88833547002951
3: For data:
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0198463, 0.0397084, 0.0596019, 0.0795445, 0.0995199, 0.119908, 0.140298, 0.160773, 0.181772, 1.02478];
Result:
Sum Squared Error (SSE): 8.04747420418266E-5
Root of Mean Square Error (RMSE): 0.00270478938924384
Correlation Coef. (R): 0.999953749555241
R-Square: 0.999907501249586
Parameter Best Estimate
--------- -------------
p1 -0.467984356631285
p2 -12.993660082635
p3 9.9621944291978E-6
p4 0.736411490505736
19 comments
Bruno Luong
31 Jan 2024
Edited: Bruno Luong
31 Jan 2024
Good fit. It is not clear to me how this expression would give a positive second derivative, especially the p2/x term: differentiating twice it becomes 2*p2/x^3, which is negative for p2 < 0, as is the case in your fitting result.
Bruno Luong
31 Jan 2024
Edited: Bruno Luong
31 Jan 2024
If we specify lower bounds > 0 on the pi,
To be precise, we have to specify pi >= lower bound > 0. A strict inequality (pi > 0) is never allowed in optimization theory for continuous cost functions.
But yes, the problem under discussion, non-differentiability when one of the constraints is active, remains the same.
SA-W
1 Feb 2024
With some sort of quadratic variable transformation, one can enforce f'' > 0 and solve an unconstrained optimization problem. Hence there are no active constraints at the solution; however, the optimization problem is now non-linear, and uniqueness etc. cannot be proven. Makes sense?
Bruno Luong
1 Feb 2024
Edited: Bruno Luong
1 Feb 2024
When a function is defined on an open set (strict inequality constraints; to be rigorous, a non-closed set rather than an open set), a minimizer may not exist.
Tell me: what is the solution of the 1D problem
argmin x such that x > 0 ?
Bruno Luong
1 Feb 2024
Edited: Bruno Luong
1 Feb 2024
The transformation does not solve the closure issue.
You can transform x = y^2 and minimize
argmin y^2
But the requirement x > 0 then becomes y ~= 0, so y is still not free to move.
On the contrary, if you let y be a free variable, then it is equivalent to x >= 0 (a closed set).
Bruno Luong
1 Feb 2024
Edited: Bruno Luong
1 Feb 2024
Free variable == no constraint. That is equivalent to x >= 0 (NOT x > 0).
y ~= 0 is NOT free.
SA-W
1 Feb 2024
That's clear.
But is this not more of a theoretical discussion? I mean, if I fit exponential terms like exp(y*x), it is obvious that y = 0 can never be the solution if my data do not represent a horizontal line.
Bruno Luong
1 Feb 2024
Edited: Bruno Luong
1 Feb 2024
To be clear, the y in x = y^2 is not the y of your data (sorry for the notational confusion). And this inequality is a side discussion.
My original question to @Alex Sha is how he can be sure y'' >= 0 with y being the exponential form he proposed.
Another remark is that when the first n-1 data points lie exactly on a straight line, the exponential y can only approximate (though accurately) that first part, and not interpolate it as a spline would be able to do.
SA-W
1 Feb 2024
To be clear, the y in x = y^2 is not the y of your data (sorry for the notational confusion). And this inequality is a side discussion.
Sure. The data y can be what it is. What I mean is: with z = x^2, when fitting exp(-z*x), z = 0 does not make sense for reasonable y data.
My original question to @Alex Sha is how he can be sure y'' >= 0 with y being the exponential form he proposed.
Maybe he comments again on that.
In the plotted curves, f'' < 0 somewhere. This is why I suggested making the p_i positive by a transformation z_i = p_i^2 and backing out the solution at the end. In fact, it is sufficient to have p2 > 0 and p3 > 0 to obtain f'' > 0 (for x > 0):
syms x
syms p [1 4]
diff(exp(p1 + p2/x + p3*exp(p4*x)), 2)
ans =
exp(p1 + p2/x + p3*exp(p4*x))*((p3*p4*exp(p4*x) - p2/x^2)^2 + (2*p2)/x^3 + p3*p4^2*exp(p4*x))
But you think transforming z2 = p2^2, z3 = p3^2 is not a good idea, in theory? Whether this is still a good fit is a different question. Also, the fact that the exponentials underflow/overflow very quickly is of course a separate issue.
Bruno Luong
1 Feb 2024
Edited: Bruno Luong
1 Feb 2024
"But you think transforming z2 = p2^2, z3 = p3^2 is not a good idea, in theory? "
I didn't say that; it's just not what is behind @Alex Sha's original idea as I understood it. His idea is to find free parameters (without needing to transform them, which in theory is the same as adding constraints) of a formula that automatically ensures y'' >= 0 (?) and at the same time fits the data well. Obviously it did well with *some* data, as he showed.
Now you are the one who is interested in this problem. It is up to you to decide which road to take; I hope all this discussion helps you make an informed decision.
SA-W
1 Feb 2024
I do not want to bother you further, only one last query:
(without needing to transform them, which in theory is the same as adding constraints)
My only motivation for fitting to an exponential or whatever is that such an approach makes the resulting curve differentiable w.r.t. the y-values. Is that the case if all parameters of the fit are strictly positive?
Bruno Luong
1 Feb 2024
Edited: Bruno Luong
1 Feb 2024
We have already discussed some aspects of your question. Let me summarize:
- For unconstrained optimization with a unique solution, the solution is differentiable w.r.t. the input data (y).
- For constrained optimization, the solution is NOT differentiable w.r.t. the input data (y) if one of the inequality constraints is active.
- IMO, transforming the parameters only changes the way you solve the optimization problem numerically; it does not change the nature of the dependency of the solution on the input, and thus not the derivative.
- Using a "model" (such as the exponential formula here, or a quadratic or cubic spline), you compute the dependency of the solution within that model space. For example, you can decide the model is f(x) = const. The solution is then f(x) = mean(y) = 1/n sum_i y(i) for all x, where n is the number of data points. It satisfies f'' >= 0 and fits the data poorly, but you have the derivative df/dy(i) = 1/n. If that makes sense for you, then OK. This example is in the spirit of what Alex proposed, just pushed to the extreme to demonstrate the point of model dependency.
SA-W
1 Feb 2024
That's all clear. But what I meant is:
Do I perform constrained optimization if I transform y = x^2, or is it unconstrained optimization? IMO, it affects the quality of the fit a lot, but it is unconstrained optimization.
Torsten
1 Feb 2024
Edited: Torsten
1 Feb 2024
If you can substitute all the constraints into your objective function, the optimization becomes unconstrained.
Problem 1 (constrained):
min: x^2 + y^2
s.t. x + y = 1
Problem 2 (unconstrained):
min: x^2 + (1-x)^2
Problem 1 (constrained):
min: x^3
s.t. x >= 0
Problem 2 (unconstrained):
min: (f(y))^3
where f(y) is any function whose range is all the non-negative real numbers.
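A tiny numerical illustration of the second pair (toy code of my own, choosing f(y) = y^2):
% constrained form: min x^3 subject to x >= 0
xcon = fmincon(@(x) x^3, 1, [], [], [], [], 0, []);
% unconstrained form after substituting x = f(y) = y^2, i.e. min y^6
yopt = fminunc(@(y) y^6, 1);
xunc = yopt^2;
fprintf('constrained: x = %g, unconstrained via x = y^2: x = %g\n', xcon, xunc)
Both drive x towards 0, where the minimum is attained.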
Bruno Luong
1 Feb 2024
Edited: Bruno Luong
1 Feb 2024
" IMO, it affects the quality of the fit a lot, but it is unconstrained optimization."
It should not affect the fit quality. With the transformation you simply reparametrize the feasible space. In practice, the numerical optimizer may prefer one parametrization or, on the contrary, might fail to find the solution with another, but that is another issue.
Alex Sha
2 Feb 2024
The fitting function I provided previously produces negative second derivatives in some cases (for example, for the second data set). Try the function below, which ensures the second derivative is positive everywhere.
Fitting function:
The second derivative of the function:
It is easy to see that the above function will always be greater than zero.
1: For data:
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0111, 0.0293, 0.0521, 0.0787, 0.1086, 0.1416, 0.1774, 0.2155, 0.2556, 1.0248];
Result:
Sum Squared Error (SSE): 8.47449963637154E-5
Root of Mean Square Error (RMSE): 0.00277562435832365
Correlation Coef. (R): 0.999949332317241
R-Square: 0.999898667201696
Parameter Best Estimate
--------- -------------
p1 -74.8659942748385
p2 -2.31846362397161
p3 0.056083091099568
p4 4.31132419470017
2: For data:
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0198463, 0.0397084, 0.0596019, 0.0795445, 0.0995199, 0.119908, 0.140298, 0.160773, 0.181772, 6.64478];
Result:
Sum Squared Error (SSE): 1.46665162719541E-7
Root of Mean Square Error (RMSE): 0.000115469461810764
Correlation Coef. (R): 0.999999998124083
R-Square: 0.999999996248166
Parameter Best Estimate
--------- -------------
p1 -22.141281236849
p2 1.89538261915128
p3 0.00134010429376403
p4 1.38071369179591
3: For data:
x = [3, 3.8125, 4.6250, 5.4375, 6.25, 7.0625, 7.8750, 8.6875, 9.5, 10.3125, 16];
y = [0, 0.0198463, 0.0397084, 0.0596019, 0.0795445, 0.0995199, 0.119908, 0.140298, 0.160773, 0.181772, 1.02478];
Result:
Sum Squared Error (SSE): 7.55692231606095E-8
Root of Mean Square Error (RMSE): 8.28850371191159E-5
Correlation Coef. (R): 0.999999954352124
R-Square: 0.99999990870425
Parameter Best Estimate
--------- -------------
p1 -18.4830172399681
p2 1.74240209734898
p3 0.00155605799596946
p4 1.05985027372439