What is the reference category in the output of a fitlme model with categorical variables and three-way interaction terms?
The table below summarizes the output of a linear mixed-effects model with a random intercept and slope, fitted to structured panel data ('tbl_early'). The model is specified as:
lme_PrimaryHU = fitlme(tbl_early, 'logRoL ~ 1 + logLoL + logAnnLioL + Dur + PPI + AvgEffTax_1 + HU + logEP*logAP + EQ*PrimaryHU*relInsLoss_1 + Wstorm*relInsLoss_1 +Storm*PrimaryHU*relInsLoss_1 +(FFR|ID)')
'Dur' has 4 levels, so I understand that the output shows three levels whose estimates are relative to the fourth, i.e. the reference level ('Dur_one'). From the results one could interpret that Dur_onehalf trades at a discount compared to Dur_one, all else equal.
'HU', 'Storm', 'EQ' and 'Wstorm' are binary variables. They are not mutually exclusive (cross-sectional analysis), and there is no case in the data in which all of them are 0. The question is therefore which of these variables MATLAB chose as the reference case. Note that some of the peril variables are used in two- or three-way interaction terms that appear a bit lower in the table. 'PrimaryHU' is a binary variable that controls for a certain condition which affects the potential effects of relInsLoss or of 'HU', 'Storm', 'EQ' and 'Wstorm' (e.g. 'EQ' alone is positive but not significant at p<0.1, 'EQ*relInsLoss' is negative and still not significant, 'EQ*PrimaryHU' is negative and significant, 'EQ*relInsLoss*PrimaryHU' is positive and significant). All remaining variables are continuous.
Two-way interactions used:
Three-way interactions used:
Other interaction terms, and the separate estimates for the underlying variables, should result from expanding the interaction terms above.
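To illustrate how a three-way term expands, here is a minimal sketch on synthetic data. The table and the variable names (g, a, b, c, y) are hypothetical and not taken from 'tbl_early'; the point is only that a term like 'a*b*c' in the formula produces all main effects, all two-way interactions and the three-way interaction.

```matlab
% Synthetic data; every name below is illustrative only.
rng(2);
n = 300;
g = categorical(randi(15, n, 1));   % hypothetical grouping variable
a = randi([0 1], n, 1);             % numeric 0/1
b = randi([0 1], n, 1);             % numeric 0/1
c = randn(n, 1);                    % continuous
y = 1 + 0.4*a - 0.3*b + 0.2*c + 0.5*a.*b.*c + 0.3*randn(n, 1);
tbl = table(g, a, b, c, y);

% 'a*b*c' is shorthand for a + b + c + a:b + a:c + b:c + a:b:c
lme = fitlme(tbl, 'y ~ a*b*c + (1|g)');
names = lme.CoefficientNames;       % fixed-effects coefficient names
disp(names')
```

With three numeric predictors this should list the intercept plus seven terms, which is why estimates for lower-order terms appear in the output even though only the three-way term was written in the formula.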
Many thanks for any help in advance!
Peng Li on 10 May 2020
The table you copied isn't the default display from MATLAB, so it's difficult to tell anything from it. It looks like ANOVA output, since each categorical item (including interaction terms) corresponds to only one line.
As you mentioned, for categorical variables the regression output states explicitly which level each row is for, and the level without an output row is the reference level. A dichotomous variable is just a special case of a categorical variable. For example, if you have sex (0/1), the output usually shows one row, sex xx, xx, xx, xx..., which means 0 is used as the reference. The same strategy is used to display interaction terms that involve categorical variables.
You also have to make them categorical explicitly, e.g. tbl.sex = categorical(tbl.sex); otherwise the variable is treated as continuous by default, and 0 is then always the implied reference value.
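A minimal sketch of the difference, using synthetic data (the table and the names ID, sex, x, y are made up for illustration). It assumes the default 'reference' dummy coding of fitlme, under which the first category of a categorical predictor serves as the reference level; reordercats is one way to choose a different one.

```matlab
% Synthetic data; all names here are illustrative, not from the original post.
rng(0);
n   = 200;
ID  = categorical(randi(20, n, 1));
sex = randi([0 1], n, 1);                  % numeric 0/1
x   = randn(n, 1);
y   = 1 + 0.5*x + 0.8*sex + 0.3*randn(n, 1);
tbl = table(ID, sex, x, y);

% Numeric 0/1 predictor: treated as continuous, one slope term
lmeNum = fitlme(tbl, 'y ~ 1 + x + sex + (1|ID)');
disp(lmeNum.CoefficientNames)

% Categorical predictor: the dummy is named after the non-reference level
tbl.sex = categorical(tbl.sex);
lmeCat = fitlme(tbl, 'y ~ 1 + x + sex + (1|ID)');
disp(lmeCat.CoefficientNames)
disp(categories(tbl.sex))                  % first entry is the reference level

% Make '1' the first category, so it becomes the reference instead
tbl.sex = reordercats(tbl.sex, {'1', '0'});
lmeRe = fitlme(tbl, 'y ~ 1 + x + sex + (1|ID)');
disp(lmeRe.CoefficientNames)
```

Comparing the three coefficient-name lists shows which level has no row of its own, and that level is the reference.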
In the formula you used, FFR doesn't appear as a fixed effect. If you only want a subject-specific intercept, use (1|ID); otherwise, make sure that this is what you really want.
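The three specifications can be compared side by side in a small sketch. The data are simulated and the names (ID, FFR, y) only mirror the formula in the question; nothing here reproduces the original model.

```matlab
% Simulated panel: 30 subjects, 8 observations each (illustrative only).
rng(1);
nID = 30; nObs = 8;
idx = repelem((1:nID)', nObs);             % subject index per row
ID  = categorical(idx);
FFR = randn(nID*nObs, 1);
b0  = 0.5*randn(nID, 1);                   % subject-specific intercept shifts
b1  = 0.3*randn(nID, 1);                   % subject-specific slope shifts
y   = 2 + b0(idx) + (0.7 + b1(idx)).*FFR + 0.2*randn(nID*nObs, 1);
tbl = table(ID, FFR, y);

% Random intercept only
lmeInt   = fitlme(tbl, 'y ~ 1 + FFR + (1|ID)');
% Random intercept and random slope for FFR, with FFR also as a fixed effect
lmeSlope = fitlme(tbl, 'y ~ 1 + FFR + (FFR|ID)');
% As in the question: random slope for FFR but no fixed FFR term,
% which constrains the average FFR slope to zero
lmeNoFix = fitlme(tbl, 'y ~ 1 + (FFR|ID)');

% Likelihood ratio test of the nested random-effects structures
compare(lmeInt, lmeSlope)
```

If the random slope is wanted, including FFR in the fixed part as well (as in lmeSlope) lets the model estimate the average slope rather than assuming it is zero.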