Which formula fits better for this Linear Mixed-Effects Model?
I am currently analyzing a dataset that contains a list of flight simulator tests performed by different pilots. I want to analyze whether a given flight parameter (e.g. the number of input errors during flight, or the lateral deviation from the ideal path) is affected by the categorical variables in the following list:
- Experimental Campaign: the pilots flew the same flights but in different places and under different environmental conditions (lack of oxygen, isolation, etc.). Five campaigns were run. A performance difference between campaigns is likely.
- Group: in each campaign, the pilots were divided into two groups: frequent flyers (FF) and infrequent flyers (IF). A performance difference is expected between the two groups.
- Session: during each campaign, the same number of flights was performed, every month for FF pilots or every three months for IF pilots. In total, 10 sessions were held. Performance might vary over the course of the experiment, and that variation might itself depend on Group and Campaign.
- Flight Scenario: three different flight scenarios were flown, each requiring a different skill level. Performance is also expected to vary between scenario types.
Additionally, an extra list of categorical variables could be considered (Gender, Age, Background, etc.).
Could you please tell me which LME model formula you would use to analyze the dataset described above? And, if you wish, how would you best plot the results of such an analysis?
Peng Li on 5 Aug 2020
To understand your dataset, I still believe that descriptive statistics can give you the broad general picture. In statistical analysis, the hypothesis always comes first and the model second, not the other way around. So, to get back to your question: I don't recommend running linear regressions of any type blindly, before you have a hypothesis in mind. The software will certainly produce results even if your design has problems, but those results will not make sense.
It might be okay to explore a little. But a four-way interaction is far too complicated to be interpretable! From your question, it seems that the same pilot may have completed multiple sessions. That is where random effects are needed.
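To put a number on "too complicated", here is a rough count of the fixed-effect coefficients that a full Campaign * Group * Session * Scenario model would estimate. It is only a sketch in plain Python (no statistics library), and it assumes a fully crossed design with dummy coding and the level counts given in the question (5, 2, 10, 3); the variable and function names are made up for illustration.

```python
from itertools import combinations
from math import prod

# Factor levels as described in the question (assumes a fully crossed design).
levels = {"campaign": 5, "group": 2, "session": 10, "scenario": 3}


def n_coefficients(factors):
    """Coefficients contributed by one main effect or interaction term
    under dummy coding: the product of (levels - 1) over its factors."""
    return prod(levels[f] - 1 for f in factors)


# Tally coefficients by interaction order (1 = main effects, 4 = four-way term).
by_order = {}
for order in range(1, len(levels) + 1):
    by_order[order] = sum(n_coefficients(c) for c in combinations(levels, order))

total = 1 + sum(by_order.values())  # +1 for the intercept

for order, count in by_order.items():
    print(f"order {order}: {count} coefficients")
print(f"total fixed-effect coefficients: {total}")
```

Under these assumptions the four-way term alone contributes 72 coefficients and the full model 300, one per cell of the design. If not every pilot group flew every session, some of those cells would be empty and the corresponding terms could not be estimated at all.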
To showcase a simpler scenario: suppose you'd like to test whether frequent and infrequent pilots (the group factor) perform differently in different places, while controlling for demographic variations. You may then want to apply this LME model: outcome ~ group * place + age + sex + background + (1|pilot)
The (1|pilot) part of the formula is a random intercept for each pilot; it takes the within-pilot correlation into account (random effect).
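To illustrate what "within-pilot correlation" means, here is a small simulation sketch in plain Python. It does not fit a mixed model; it only generates data of the form outcome = grand mean + pilot offset + noise, which is the structure a random intercept assumes, and then estimates how much of the variance is due to the pilot (the intraclass correlation). All numbers (200 pilots, 10 flights each, the two standard deviations) are hypothetical and not taken from the flight data.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical simulation settings (not taken from the flight data).
N_PILOTS, N_FLIGHTS = 200, 10
SD_PILOT, SD_RESIDUAL = 2.0, 1.0  # between-pilot and within-pilot spread
GRAND_MEAN = 50.0

# outcome = grand mean + pilot-specific offset (the random intercept) + noise
scores = []
for _ in range(N_PILOTS):
    pilot_offset = random.gauss(0.0, SD_PILOT)
    scores.append([GRAND_MEAN + pilot_offset + random.gauss(0.0, SD_RESIDUAL)
                   for _ in range(N_FLIGHTS)])

# One-way ANOVA mean squares for a balanced design.
pilot_means = [mean(row) for row in scores]
overall = mean(pilot_means)
ms_between = (N_FLIGHTS * sum((m - overall) ** 2 for m in pilot_means)
              / (N_PILOTS - 1))
ms_within = (sum((y - m) ** 2 for row, m in zip(scores, pilot_means) for y in row)
             / (N_PILOTS * (N_FLIGHTS - 1)))

# Intraclass correlation: share of the variance attributable to the pilot.
icc_estimate = (ms_between - ms_within) / (ms_between + (N_FLIGHTS - 1) * ms_within)
icc_true = SD_PILOT ** 2 / (SD_PILOT ** 2 + SD_RESIDUAL ** 2)

print(f"true ICC      = {icc_true:.2f}")
print(f"estimated ICC = {icc_estimate:.2f}")
```

With these settings the true intraclass correlation is 0.8, meaning two flights by the same pilot are strongly correlated. An ordinary regression would treat all flights as independent and so overstate how much information the data contain; the random intercept is what accounts for that.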