How to get the difference between two robust regressions (based on repeated measures) and statistically analyse it?
I have been struggling with this problem for a couple of weeks now. Perhaps someone can offer some help or suggestions.
I collected data from two different populations of 6 samples each at 5 different time points. To analyse the progression pattern of each population over time, I performed a robust regression on each set. Briefly, I have an array [x] containing the days on which I gathered the data, and two more arrays, [v] and [d], containing the values for each subject at each of the 5 time points. I used the following code to run and plot the robust regression, in this case on the array [v]:
x=[0 0 0 0 0 0 28 28 28 28 28 28 56 56 56 56 56 56 84 84 84 84 84 84 112 112 112 112 112 112];
v=[1.000000 1.000000 1.000000 1.000000 1.000000 NaN 0.777491 1.071769 0.649429 0.612013 0.568944 NaN 1.115822 0.624034 0.659042 0.583451 0.437539 NaN 1.156354 0.721808 0.574364 0.556517 0.448168 NaN 0.771199 1.259949 0.486212 0.642605 0.439475 NaN];
d=[1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.889396 0.818145 0.992733 0.967640 0.561485 0.714877 0.840673 0.849952 1.214985 1.095830 0.470337 0.611467 0.961033 0.779437 1.240590 0.800760 0.476940 0.733969 0.746624 0.664785 1.002410 0.829174 0.420633 0.719057];
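b(1) = 1; % intercept fixed at 1 (all values are 1 at day 0)
[b(2),stats] = robustfit(x,v - 1,[],[],'off'); % slope only: default weight function and tuning constant, no constant term
scatter(x,v,'filled'); grid on; hold on;
%[b,stats] = robustfit(x,v);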
Ultimately, what I really want is to know statistically how different the two regressions are. For that, it was suggested that I subtract [v] from [d] and check how different from 0 the result is. I am struggling with this task.
Very naïvely, I tried simply subtracting the arrays and running the regression again on the resulting array. However, I realised that the result depends on the order of the values within the arrays. I was thinking that perhaps I could calculate the average per time point for each population and work with those values, but the standard errors of the mean are so large that I think it is important to take them into account in this analysis (which I do not know how to do). In summary, does MATLAB have a toolbox or function that I can use to run a statistical test that does what I need? Otherwise, can anyone help me code this?
I was using the Curve Fitting Tool from MATLAB to analyse the robust regressions of my populations individually, and I realised that a robust Rational fit (numerator degree = 0 and denominator degree = 1) is much better than a robust Polynomial fit (degree = 1), both in goodness of fit and in the biological interpretation of the results. Is it possible to solve my previous question using the Rational fit instead of a simple linear fit?
Thanks a lot everybody!
Ben Drebing on 20 Dec 2017
You could take the norm of the difference between the two fits. This gives a pretty good indicator of their similarity: the closer the norm is to 0, the closer the lines are to one another. Going along with your example:
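[b_1, ~] = robustfit(x,v-1,[],[],'off'); % slope for population v (intercept fixed at 1)
[b_2, ~] = robustfit(x,d-1,[],[],'off'); % slope for population d (intercept fixed at 1)
y_1 = 1 + b_1 * unique(x); % fitted values at the sampled days
y_2 = 1 + b_2 * unique(x);
fit_dist = norm(y_1 - y_2) % The closer this value is to 0, the closer the fits are to one another. (Named fit_dist so the data array d is not overwritten.)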
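The norm above measures how far apart the two lines are, but it does not say whether that distance is statistically meaningful. One possible way to get a p-value, sketched here under assumptions rather than offered as a definitive analysis, is to stack both populations into one robust regression with an extra term for the slope difference. The names `t`, `y`, `g`, `X`, `slope_v` and `slope_diff` are made up for this illustration. The sketch also treats all observations as independent, which ignores the repeated-measures structure; a mixed-effects model may be more appropriate if subject identity matters.

```matlab
% Data from the question: 5 time points, 6 samples each (v has one missing subject).
x = repelem([0 28 56 84 112], 6);
v = [1.000000 1.000000 1.000000 1.000000 1.000000 NaN 0.777491 1.071769 0.649429 0.612013 0.568944 NaN 1.115822 0.624034 0.659042 0.583451 0.437539 NaN 1.156354 0.721808 0.574364 0.556517 0.448168 NaN 0.771199 1.259949 0.486212 0.642605 0.439475 NaN];
d = [1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.889396 0.818145 0.992733 0.967640 0.561485 0.714877 0.840673 0.849952 1.214985 1.095830 0.470337 0.611467 0.961033 0.779437 1.240590 0.800760 0.476940 0.733969 0.746624 0.664785 1.002410 0.829174 0.420633 0.719057];

% Stack both populations into single column vectors.
t = [x(:); x(:)];                               % days, first for v, then for d
y = [v(:); d(:)] - 1;                           % shift so the fit passes through (0, 1)
g = [zeros(numel(v), 1); ones(numel(d), 1)];    % hypothetical group flag: 0 = v, 1 = d

% Model with no constant term:  y = slope_v * t + slope_diff * (t .* g)
% slope_diff is the slope of d minus the slope of v.
X = [t, t .* g];
[b, stats] = robustfit(X, y, [], [], 'off');    % rows containing NaN are dropped

slope_v    = b(1);
slope_diff = b(2);

% Under the independence assumption, stats.p(2) tests slope_diff == 0.
fprintf('slope of v:       %g\n', slope_v);
fprintf('slope difference: %g (SE %g, p = %g)\n', ...
        slope_diff, stats.se(2), stats.p(2));
```

A small p-value for the second coefficient would suggest the two slopes differ. Because the weights and the scale estimate come from the pooled residuals, the slopes here need not match those from two separate `robustfit` calls exactly.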