How to define a custom equation in fitlm function for linear regression?

29 vues (au cours des 30 derniers jours)
Spirit
Spirit le 22 Nov 2017
Commenté : the cyclist le 8 Juin 2023
I'd like to define a custom equation for linear regression. For example y = a*log(x1) + b*x2^2 + c*x3 + k. This is a linear regression problem - but how to do this within FitLm function?
Thanks, Shriram

Réponses (3)

the cyclist
the cyclist le 22 Nov 2017
Modifié(e) : the cyclist le 22 Nov 2017
% Set the random number seed for reproducibility
rng default
% Make up some pretend data
N = 100;
x1 = rand(N,1);
x2 = rand(N,1);
x3 = rand(N,1);
a = 2;
b = 3;
c = 5;
k = 7;
noise = 0.2*randn(N,1);
y = a*log(x1) + b*x2.^2 + c*x3 + k + noise;
% Put the variables into a table, naming them appropriately
tbl = table(log(x1),x2.^2,x3,y,'VariableNames',{'log_x1','x2_sqr','x3','y'});
% Specify and carry out the fit
mdl = fitlm(tbl,'y ~ 1 + log_x1 + x2_sqr + x3')
  5 commentaires
Walter Roberson
Walter Roberson le 8 Juin 2023
If you have a discontinuity in the first or second derivatie of the model, then you surely do not have a linear regression situation.
You probably need to use ga(). Not fmincon() or similar -- those optimizers cannot handle discontinuities in derivatives either.
the cyclist
the cyclist le 8 Juin 2023
I suggest you search the keywords segmented regression matlab and/or piecewise regression matlab. Although I don't believe there are any built-in functions for this, you should find a few different threads that you might find useful. I also think you might want to start a brand-new question for this, after you have done that search. In that question, I would suggest posting your data, which makes it easier for people to try out code suggestions.

Connectez-vous pour commenter.


laurent jalabert
laurent jalabert le 19 Déc 2021
Modifié(e) : laurent jalabert le 19 Déc 2021
To proceed with a custom function it is possible to use the non linear regression model
The example below is intended to fit a basic Resistance versus Temperature at the second order such as R=R0*(1+alpha*(T-T0)+beta*(T-T0)^2), and the fit coefficient will be b(1)=R0, b(2) = alpha, and b(3)=beta.
The advantage here, is that the SE will be computed directly for R0, alpha and beta.
beta0 is an initial range of [R0 alpha beta]
b(n) is retrieved using mdl.Coefficients.Estimate(n), for n=1,2,3
standard deviation on the coefficients are retrieved by mdl.Coefficients.SE(n)
(Curve fitting toolbox and Statistical/Machine Learning toolbox are both requiered)
clear tbl mdl
% your vector data T_T0 and R of same dimension
tbl = table(T_T0,R);
modelfun = @(b,x)b(1).*(1+b(2).*x(:,1)+b(3).*x(:,1).^2);
beta0 = [100 1e-3 1e-6];
mdl = fitnlm(tbl,modelfun,beta0,'CoefficientNames',{'R0';'alpha';'beta'})
  1 commentaire
Erin Evans
Erin Evans le 7 Juin 2023
Would you be able to help me write this such that there is a conditional statement in it? I essentially need two connected trendlines such that the statistics are for both sections together.
my equation is:
log10(Qs) = log10(a) + b*log(Qr) + c*log10(Max(1, Qr / Qc)) + d*log10(Qr(i) / Qr(i - 1))
the code I have so far is:
AgaDisc = readtable("File Path");
%% Bring in data columns
AgaDischargeArray = table2array(AgaDisc(:,"Discharge"));
AgaLoadArray = table2array(AgaDisc(:,"SedimentLoad"));
%% Normalize the discharge and sediment data
meanDisc = mean(AgaDischargeArray);
medianLoad = median(AgaLoadArray);
AgaDischargeArray = AgaDischargeArray/meanDisc;
AgaLoadArray = AgaLoadArray/medianLoad;
%% log10 sediment
logQs = log10(AgaDischargeArray);
%% log10 discharge
logQt = log10(AgaLoadArray);
%% Create new table of log-normalized data
tbl = table(logQs, logQt);
%% Add weighting factor
weights = zeros(length(logQt), 1);
weightFactor = [0.5, 1, 5, 10];
Q = quantile(logQt, 3);
for i = 1:length(logQt)
if logQt(i) > Q(3)
weights(i) = weightFactor(4);
elseif logQt(i) > Q(2)
weights(i) = weightFactor(3);
elseif logQt(i) > Q(1)
weights(i) = weightFactor(2);
else
weights(i) = weightFactor(1);
end
end
m = fitlm(tbl, 'logQs ~ logQt', 'RobustOpts', 'on', 'weight', weights);

Connectez-vous pour commenter.


laurent jalabert
laurent jalabert le 8 Juin 2023
please check carefully your expression, cause you use log10 and log (I guess neperian log here)
log10(Qs) in equation and logQs = log10(AgaDischargeArray) in your program; is it same ?
d*log(Qr(i)/Qr(i-1) might be similar to d* diff(log(Qr))
log10(Qs) = log10(a) + b*log(Qr) + c*log10(Max(1, Qr / Qc)) + d*log10(Qr(i) / Qr(i - 1))
log10Qs is x(:,1) as first column of tbl.
logQt is x(:,2) as the second column of tbl.
a,b,c,d are unknown
Qc is not defined
d* diff(log(Qr)) will lead to problem because its length is length(Qr) -1

Catégories

En savoir plus sur Linear and Nonlinear Regression dans Help Center et File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by