manova

Multivariate analysis of variance (MANOVA) results

Since R2023b

    Description

    A manova object contains the results of a one-, two-, or N-way MANOVA. Use the properties of a manova object to determine if the vector of means in a set of response data differs with respect to the values (levels) of a factor or multiple factors. The object properties include information about the coefficient estimates, MANOVA model fit to the response data, and factors used to perform the analysis. For more information about MANOVA, see Multivariate Analysis of Variance for Repeated Measures.

    Creation

    Description

    maov = manova(factors,Y) performs a one-, two-, or N-way MANOVA and returns the manova object for the factors in factors and the response variables in Y. The manova object maov contains the results of performing the MANOVA.

    maov = manova(tbl,Y) uses the variables in the table tbl as factors for the response variables Y. Each table variable corresponds to a factor.

    maov = manova(tbl,ResponseVarNames) uses the variables in tbl as factors and response variables. The ResponseVarNames argument specifies which variables contain the response data.

    maov = manova(tbl,formula) specifies the MANOVA model in Wilkinson notation. The terms of formula use only the variable names in tbl.

    maov = manova(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify which test statistic to calculate, which factors are categorical, and the MANOVA model type.

    Input Arguments

    Y

    Response data, specified as a numeric matrix or numeric vector. Each column of Y corresponds to a separate response variable. You must also specify factor values by passing the factors or tbl input argument to manova. Y must have the same number of rows as the input argument containing the factor values, because manova assigns factor values to response data by row index.

    The following example shows how manova assigns factor values to response data for a one-way MANOVA:

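    $$y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ \vdots \\ y_N \end{bmatrix}, \qquad g = \begin{bmatrix} \text{"A"} \\ \text{"A"} \\ \text{"C"} \\ \text{"B"} \\ \text{"B"} \\ \vdots \\ \text{"D"} \end{bmatrix}$$

    In this example, g contains the factor values, y contains the response data, and $y_i = [y_{i,1}, y_{i,2}, \ldots, y_{i,R}]$ is a row vector of response data for the $i$th observation. R is the number of response variables.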

    The following example shows how manova assigns factor values to response data for a three-way MANOVA:

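    $$y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ \vdots \\ y_N \end{bmatrix}, \quad g_1 = \begin{bmatrix} 1.25 \\ 3.68 \\ 9.11 \\ 5.90 \\ 1.47 \\ \vdots \\ 8.86 \end{bmatrix}, \quad g_2 = \begin{bmatrix} 1 \\ 2 \\ 1 \\ 3 \\ 1 \\ \vdots \\ 2 \end{bmatrix}, \quad g_3 = \begin{bmatrix} 100 \\ 500 \\ 300 \\ 200 \\ 300 \\ \vdots \\ 400 \end{bmatrix}$$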

    In this example, g1, g2, and g3 contain the values for the three factors.
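    The row-wise pairing described above can be sketched with synthetic data. The values, the random seed, and the variable names g, F, and Y are illustrative assumptions rather than data from this page, and the sketch assumes Statistics and Machine Learning Toolbox is available.

```matlab
% Synthetic data: 12 observations, 2 response variables (illustrative only)
rng(0)                                   % reproducible random numbers
N = 12;
Y = randn(N,2);                          % row i holds the responses for observation i

% One factor: element i of g is the factor value for row i of Y
g = ["A";"A";"C";"B";"B";"C";"A";"B";"C";"A";"B";"C"];
assert(numel(g) == size(Y,1))            % factor values and responses must align by row
maov1 = manova(g,Y);                     % one-way MANOVA

% Two factors: each column of F is a factor, each row matches a row of Y
g2 = repmat([1;2;3],4,1);                % factor with three values
g3 = repelem([100;200],6,1);             % factor with two values
F = [g2 g3];
assert(size(F,1) == size(Y,1))
maov2 = manova(F,Y);                     % two-way MANOVA with the default linear model
```

    Because the assignment is purely by row index, reordering the rows of Y without reordering the factor values in the same way changes which responses belong to which group.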

    Note

    The manova function ignores NaN values, <undefined> values, empty characters, and empty strings in Y. If factors or tbl contains NaN or <undefined> values, or empty characters or strings, the function ignores the corresponding observations in Y.

    Data Types: single | double

    factors

    Factors and factor values for the MANOVA, specified as a numeric, logical, categorical, or string vector, a cell array of character vectors, or a numeric matrix. Factors and factor values are sometimes called grouping variables and group names, respectively.

    For a one-way MANOVA, factors is a column vector or cell array of vectors in which each element represents the factor value of the response data in the same row of Y. For a two- or N-way MANOVA, factors is a numeric matrix in which each column corresponds to a different factor. Each row of factors contains the factor values for the observation in the same row of Y. factors and Y must have the same number of rows.

    The following example shows how manova assigns the factor values to response data for a one-way MANOVA:

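    $$y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ \vdots \\ y_N \end{bmatrix}, \qquad g = \begin{bmatrix} \text{"A"} \\ \text{"A"} \\ \text{"C"} \\ \text{"B"} \\ \text{"B"} \\ \vdots \\ \text{"D"} \end{bmatrix}$$

    In this example, g is a vector containing the factor values, y is a matrix of response data, and $y_i = [y_{i,1}, y_{i,2}, \ldots, y_{i,R}]$ is a row vector of response data for the $i$th observation. R is the number of response variables.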

    The following example shows how manova assigns factor values to the response data for a three-way MANOVA:

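    $$y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ \vdots \\ y_N \end{bmatrix}, \quad g_1 = \begin{bmatrix} 1.25 \\ 3.68 \\ 9.11 \\ 5.90 \\ 1.47 \\ \vdots \\ 8.86 \end{bmatrix}, \quad g_2 = \begin{bmatrix} 1 \\ 2 \\ 1 \\ 3 \\ 1 \\ \vdots \\ 2 \end{bmatrix}, \quad g_3 = \begin{bmatrix} 100 \\ 500 \\ 300 \\ 200 \\ 300 \\ \vdots \\ 400 \end{bmatrix}$$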

    In this example, g1, g2, and g3 are columns of a numeric matrix that contain values for three factors.

    Note

    If factors or tbl contains NaN values, <undefined> values, empty characters, or empty strings, the manova function ignores the corresponding observations in Y.

    Example: [1,2,1,3,1,...,3,1]

    Example: ["white","red","white",...,"black","red"]

    Example: stats=[height,age]; manova(stats,Y);

    Data Types: single | double | logical | categorical | string | cell

    tbl

    Factors, factor values, and response data, specified as a table. The variables of tbl can contain numeric, logical, categorical, or string elements, or cell arrays of character vectors. When you specify tbl, you must also specify the response data Y, ResponseVarNames, or formula.

    • If you specify the response data in Y, the table variables represent only the factors for the MANOVA. A factor value in a variable of tbl corresponds to the response data in Y at the same row index. tbl must have the same number of rows as Y.

    • If you do not specify Y, you must indicate which variables in tbl contain the response data by using the ResponseVarNames or formula input argument. You can also choose a subset of factors in tbl to use in the MANOVA by setting the name-value argument FactorNames. The manova function associates the values of the factor variables in tbl with the response data in the same row.

    Note

    If factors or tbl contains NaN values, <undefined> values, empty characters, or empty strings, the manova function ignores the corresponding observations in Y.

    Example: mountain=table(altitude,temperature,soilpH,iron); manova(mountain,["soilpH" "iron"])

    Data Types: table
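    As a minimal sketch of the second case above, the following call names the response variables explicitly and restricts the factors to a subset of the table variables with FactorNames. It reuses the patients sample data set from the examples on this page; the choice of Smoker as the only factor is an illustrative assumption.

```matlab
load patients   % sample data set also used in the examples on this page

% The table holds both the response variables and the candidate factors
tbl = table(Systolic,Diastolic,Smoker,SelfAssessedHealthStatus, ...
    VariableNames=["Systolic" "Diastolic" "Smoker" "SelfAssessed"]);

% Systolic and Diastolic are the responses; FactorNames keeps only Smoker
% as a factor, so SelfAssessed is left out of the model
maov = manova(tbl,["Systolic" "Diastolic"],FactorNames="Smoker");
```

    Omitting FactorNames in this call would instead use every non-response variable in tbl as a factor.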

    ResponseVarNames

    Names of the response variables, specified as a string vector or a cell array of character vectors. ResponseVarNames indicates which variables in tbl contain the response data. When you specify ResponseVarNames, you must also specify the tbl input argument. The names in ResponseVarNames must be names of variables in tbl.

    Example: "r"

    Data Types: char | string | cell

    formula

    MANOVA model, specified as a string scalar or a character vector in Wilkinson notation. When you specify formula, you must also specify tbl. The terms in formula must be names of variables in tbl.

    Example: "r1-r3 ~ f1 + f1:f2:f3"

    Data Types: char | string
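    A formula call might look like the following sketch. The table, its variable names (r1, r2, r3, f1, f2), and the random data are hypothetical and exist only to give the formula something to refer to; the response range r1-r3 follows the notation in the example above.

```matlab
rng(1)                                           % reproducible synthetic data
N = 24;
f1 = categorical(repmat(["lo";"hi"],N/2,1));     % factor with two values
f2 = categorical(repelem(["a";"b";"c"],N/3,1));  % factor with three values
R = randn(N,3);                                  % three response variables

tbl = table(R(:,1),R(:,2),R(:,3),f1,f2, ...
    VariableNames=["r1" "r2" "r3" "f1" "f2"]);

% Responses r1 through r3; main effects of f1 and f2 plus their interaction
maov = manova(tbl,"r1-r3 ~ f1 + f2 + f1:f2");
```

    Every name on either side of the tilde must be a variable in tbl, which is why the table is built with explicit variable names first.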

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: manova(factors,Y,CategoricalFactors=[1 2],FactorNames=["school" "major" "age"],ResponseNames=["GPA" "StartYear" "GraduationYear"]) specifies the first two factors in factors as categorical, the factor names as "school", "major", and "age", and the names of the response variables as "GPA", "StartYear", and "GraduationYear".

    CategoricalFactors

    Factors to treat as categorical, specified as a numeric, logical, or string vector, or a cell array of character vectors. When CategoricalFactors is set to the default value "all", the manova function treats all factors as categorical.

    Specify CategoricalFactors as one of the following:

    • A numeric vector with indices between 1 and N, where N is the number of factor variables. The manova function treats factors with indices in CategoricalFactors as categorical. The index of a factor is the order in which it appears in the columns of tbl.

    • A logical vector of length N, where a true entry means that the corresponding factor is categorical.

    • A string vector of factor names that match names in tbl or FactorNames.

    Example: CategoricalFactors=["Location" "Smoker"]

    Example: CategoricalFactors=[1 3 4]

    Data Types: single | double | logical | char | string | cell
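    The three forms listed above can select the same factor. The sketch below uses the patients sample data set and treats Weight as continuous and Smoker as categorical; the factor names passed through FactorNames are chosen for illustration.

```matlab
load patients

factors = [Weight double(Smoker)];   % column 1: Weight, column 2: Smoker (0 or 1)
Y = [Systolic Diastolic];            % two response variables
names = ["Weight" "Smoker"];

% Numeric index: the second factor is categorical
maovIdx = manova(factors,Y,FactorNames=names,CategoricalFactors=2);

% Logical vector of length N = 2: true marks a categorical factor
maovLogical = manova(factors,Y,FactorNames=names,CategoricalFactors=[false true]);

% Factor name matching an entry of FactorNames
maovName = manova(factors,Y,FactorNames=names,CategoricalFactors="Smoker");
```

    In all three calls Weight is left out of CategoricalFactors, so it enters the model as a continuous factor.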

    FactorNames

    Factor names, specified as a string vector or a cell array of character vectors.

    • If you specify tbl in the call to manova, FactorNames must be a subset of the table variables in tbl. manova uses only the factors specified in FactorNames. In this case, the default value of FactorNames is the collection of names of the factor variables in tbl.

    • If you specify factors in the call to manova, you can specify any names for FactorNames. In this case, the default value of FactorNames is ["Factor1","Factor2",…,"FactorN"], where N is the number of factors.

    When you specify formula, manova ignores FactorNames.

    Example: FactorNames=["time","latitude"]

    Data Types: char | string | cell

    ModelSpecification

    Type of MANOVA model to fit, specified as one of the options in the following table or an integer, string scalar, character vector, or terms matrix. The default value for ModelSpecification is "linear".

    Option | Terms Included in MANOVA Model
    "linear" (default) | Main effect (linear) terms
    "interactions" | Main effect and pairwise interaction terms
    "purequadratic" | Main effects and squared main effects. All factors must be continuous to use this option. Set CategoricalFactors = [] to specify all factors as continuous.
    "quadratic" | Main effects, squared main effects, and pairwise interaction terms. All factors must be continuous to use this option.
    "polyIJK" | Polynomial terms up to degree I for the first factor, degree J for the second factor, and so on. The degree of an interaction term cannot exceed the maximum exponent of a main term. You must specify a degree for each factor.
    "full" | Main effect and all interaction terms

    To include all main effects and interaction terms up to the kth level, set ModelSpecification equal to k. When ModelSpecification is an integer, the maximum level of an interaction term in the MANOVA model is the minimum of ModelSpecification and the number of factors.

    If you specify formula, manova ignores ModelSpecification.

    You can also specify the terms of a MANOVA model using one of the following:

    • Double or single terms matrix T with a column for each factor. Each term in the MANOVA model is a product corresponding to a row of T. The row elements are the exponents of their corresponding factors. For example, T(i,:) = [1 2 1] means that term i is (Factor1)(Factor2)^2(Factor3). Because the manova function automatically includes a constant term in the MANOVA model, you do not need to include a row of zeros in the terms matrix.

    • Character vector or string scalar formula in Wilkinson notation, representing one or more terms. The formula must use names contained in FactorNames, ResponseNames, or table variable names (if tbl is specified).
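    To make the terms-matrix convention concrete, the following sketch decodes each row of a matrix T into a readable term. The helper loop and the names factorNames and termLabels are illustrative and not part of manova; the commented call at the end uses placeholder inputs.

```matlab
factorNames = ["Factor1" "Factor2" "Factor3"];

% One row per model term, one column per factor; entries are exponents
T = [1 0 0      % Factor1
     0 1 0      % Factor2
     0 0 1      % Factor3
     1 2 1];    % Factor1 * Factor2^2 * Factor3

termLabels = strings(size(T,1),1);
for i = 1:size(T,1)
    parts = strings(1,0);
    for j = 1:size(T,2)
        if T(i,j) == 1
            parts(end+1) = factorNames(j);                    % exponent 1
        elseif T(i,j) > 1
            parts(end+1) = factorNames(j) + "^" + T(i,j);     % higher exponent
        end
    end
    termLabels(i) = join(parts,"*");
end
disp(termLabels)

% Hypothetical call (factors and Y are placeholders for your own data):
% maov = manova(factors,Y,ModelSpecification=T,CategoricalFactors=[]);
```

    The last row reproduces the [1 2 1] case described above, and no row of zeros is needed because the constant term is added automatically.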

    Example: ModelSpecification="poly3212"

    Example: ModelSpecification=3

    Example: ModelSpecification="r1-r3 ~ c1*c2"

    Example: ModelSpecification=[0 0 0;1 0 0;0 1 0;0 0 1]

    Data Types: single | double | char | string

    ResponseNames

    Names of the response variables, specified as a 1-by-R string vector or a 1-by-R cell array of character vectors, where R is the number of response variables. If you specify ResponseVarNames or formula, manova ignores ResponseNames.

    Example: ResponseNames=["soilpH" "plantHeight"]

    Data Types: char | string | cell

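    TestStatistic

    MANOVA test statistics, specified as "all" or one or more of the following values.

    Value | Test Name | Equation
    "pillai" (default) | Pillai's trace | $V = \operatorname{trace}\left(Q_h (Q_h + Q_e)^{-1}\right) = \sum_i \theta_i$, where the $\theta_i$ values are the solutions of the characteristic equation $\lvert Q_h - \theta (Q_h + Q_e) \rvert = 0$. $Q_h$ and $Q_e$ are, respectively, the hypothesis and residual sums of squares and products matrices.
    "hotelling" | Hotelling-Lawley trace | $U = \operatorname{trace}\left(Q_h Q_e^{-1}\right) = \sum_i \lambda_i$, where the $\lambda_i$ values are the solutions of the characteristic equation $\lvert Q_h - \lambda Q_e \rvert = 0$.
    "wilks" | Wilks' lambda | $\Lambda = \dfrac{\lvert Q_e \rvert}{\lvert Q_h + Q_e \rvert} = \prod_i \dfrac{1}{1 + \lambda_i}$.
    "roy" | Roy's maximum root statistic | $\Theta = \max\left(\operatorname{eig}\left(Q_h Q_e^{-1}\right)\right)$.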

    If you specify TestStatistic as "all", manova calculates all the test statistics in the table above.

    Example: TestStatistic=["pillai" "roy"]

    Data Types: char | string | cell
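    The four statistics can be reproduced from the matrices Qh and Qe with base MATLAB. The sketch below is not how manova is implemented; it uses synthetic one-way data and assumes the standard one-way construction in which Qh is the between-group and Qe the within-group sums of squares and products matrix.

```matlab
rng(0)
g = repelem((1:3)',10,1);             % 3 groups, 10 observations each
Y = randn(30,2) + [g, 0.5*g];         % 2 responses with group-dependent means

grandMean = mean(Y,1);
Qh = zeros(2);                        % hypothesis (between-group) matrix
Qe = zeros(2);                        % residual (within-group) matrix
for k = 1:3
    Yk = Y(g == k,:);
    mk = mean(Yk,1);
    d  = mk - grandMean;
    Qh = Qh + size(Yk,1)*(d'*d);
    Ek = Yk - mk;                     % deviations from the group mean
    Qe = Qe + Ek'*Ek;
end

lambda = eig(Qh,Qe);                  % roots of |Qh - lambda*Qe| = 0
theta  = lambda./(1 + lambda);        % roots of |Qh - theta*(Qh + Qe)| = 0

pillai    = sum(theta);               % V
hotelling = sum(lambda);              % U
wilks     = prod(1./(1 + lambda));    % Lambda
roy       = max(lambda);              % Theta
```

    The trace and determinant forms give the same values: trace(Qh/(Qh + Qe)) matches pillai and det(Qe)/det(Qh + Qe) matches wilks up to rounding error.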

    Properties

    This property is read-only.

    Indices of categorical factors, specified as a numeric vector. This property is set by the CategoricalFactors name-value argument.

    Data Types: double

    This property is read-only.

    Fitted MANOVA model coefficients, specified as a numeric matrix. Each column of the matrix corresponds to a different response variable, and each row corresponds to a different term in the MANOVA model.

    For each categorical factor in the maov object, manova reserves one factor value as the reference value. manova then expands each categorical factor into F – 1 dummy variables, where F is the number of values for the factor. Each dummy variable is fit with a different coefficient during the MANOVA. A dummy variable corresponding to a factor value is 1 when an observation is assigned the same factor value, -1 when it is assigned the reference factor value, and 0 otherwise. For more information, see Dummy Variables Created with Effects Coding.

    Continuous factors have coefficients that are constant across factor values.

    Data Types: single | double
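    The effects coding described above can be written out by hand for a small factor. In this sketch the factor values are made up, and using the last category as the reference value is an assumption made for illustration; this page does not state which value manova reserves.

```matlab
g = categorical(["A";"A";"C";"B";"B";"C"]);   % factor with F = 3 values
levels = categories(g);                       % {'A';'B';'C'}
ref = levels{end};                            % assumed reference value ('C')

F = numel(levels);
D = zeros(numel(g),F-1);                      % one column per dummy variable
for j = 1:F-1
    D(g == levels{j},j) = 1;                  % 1 where the observation has value j
end
D(g == ref,:) = -1;                           % -1 in every column for the reference value

disp(D)
```

    Each of the F - 1 columns of D receives its own coefficient; rows for the reference value are -1 in every column, and all other entries are 0 or 1.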

    This property is read-only.

    Degrees of freedom for the error (residuals), equal to the number of observations minus the number of estimated coefficients, specified as a positive integer.

    Data Types: double
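    As a quick arithmetic check, the one-way fisheriris example on this page has 150 observations and one categorical factor with three values, so each response has an intercept plus two dummy-variable coefficients. The variable names below are illustrative.

```matlab
numObs    = 150;                    % observations in the fisheriris data
numLevels = 3;                      % species: setosa, versicolor, virginica
numCoef   = 1 + (numLevels - 1);    % intercept + (F - 1) dummy variables
dfe       = numObs - numCoef        % 147, the Error DF shown in the example table
```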

    This property is read-only.

    Names of the coefficients, specified as a string vector. The manova function expands each categorical factor into F – 1 dummy variables, where F is the number of values for the factor. The vector ExpandedFactorNames contains the name of each dummy variable. For more information, see Coefficients.

    Data Types: string

    This property is read-only.

    Names and values of the factors used to fit the MANOVA model, specified as a table. The names of the table variables are the factor names, and each variable contains the values of its corresponding factor. If the factors used to fit the model are not given as a table, manova converts them into a table with one column per factor.

    This property is set by the tbl input argument, or the factors input argument together with the FactorNames name-value argument.

    Data Types: table

    This property is read-only.

    Names of the factors used to fit the MANOVA model, specified as a string vector. This property is set by the tbl input argument or the FactorNames name-value argument.

    Data Types: string

    This property is read-only.

    MANOVA model, specified as a MultivariateLinearFormula object. This property is set by the formula input argument or the ModelSpecification name-value argument.

    This property is read-only.

    Estimated covariance matrix for the response variables, specified as a double or single matrix. For more information about covariance matrices, see Covariance.

    Data Types: single | double

    This property is read-only.

    Names of the response variables, specified as a string vector. This property is set by the ResponseVarNames input argument or the ResponseNames name-value argument.

    Data Types: string

    This property is read-only.

    Test statistics used to perform the MANOVA, specified as a string vector. This property is set by the TestStatistic name-value argument.

    Data Types: string

    This property is read-only.

    Response data used to fit the MANOVA model, specified as a numeric vector. This property is set by the Y input argument, or the tbl input argument together with the ResponseVarNames input argument.

    Data Types: single | double

    Object Functions

    barttest | Bartlett's test for multivariate analysis of variance (MANOVA)
    boxchart | Box chart (box plot) for multivariate analysis of variance (MANOVA)
    canonvars | Canonical variables
    coeftest | Linear hypothesis test on MANOVA model coefficients
    groupmeans | Mean response estimates for multivariate analysis of variance (MANOVA)
    multcompare | Multiple comparison of marginal means for multivariate analysis of variance (MANOVA)
    plotprofile | Plot MANOVA response variable means with grouping
    stats | Multivariate analysis of variance (MANOVA) table

    Examples

    Load the fisheriris data set.

    load fisheriris

    The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flower: the length and width of sepals and petals in centimeters.

    Perform a one-way MANOVA to test the null hypothesis that the vector of means for the four measurements is the same across the three flower species.

    maov = manova(species,meas)
    maov = 
    1-way manova
    
    Y1,Y2,Y3,Y4 ~ 1 + Factor1
    
        Source     DF     TestStatistic    Value       F       DFNumerator    DFDenominator      pValue  
        _______    ___    _____________    ______    ______    ___________    _____________    __________
    
        Factor1      2       pillai        1.1919    53.466          8             290         9.7422e-53
        Error      147                                                                                   
        Total      149                                                                                   
    
    
      Properties, Methods
    
    
    

    maov is a one-way manova object that contains the results of the one-way MANOVA. The output displays the formula for the MANOVA model and a MANOVA table. In the formula, the flower measurements are represented by the terms Y1, Y2, Y3, and Y4. Factor1 represents the flower species. The MANOVA table contains the p-value for the Pillai's trace test statistic. The p-value indicates that enough evidence exists to reject the null hypothesis at the 95% confidence level, and that the iris species has an effect on at least one of the four measurements.
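    A variation on this example, sketched below, replaces the generic Y1 to Y4 and Factor1 labels with descriptive names and requests every test statistic instead of only Pillai's trace. The specific names are chosen for illustration; the column order of meas is sepal length, sepal width, petal length, petal width.

```matlab
load fisheriris

maovAll = manova(species,meas, ...
    FactorNames="Species", ...
    ResponseNames=["SepalLength" "SepalWidth" "PetalLength" "PetalWidth"], ...
    TestStatistic="all");

% MANOVA table for the fitted object
tblStats = stats(maovAll)
```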

    Load the carsmall data set.

    load carsmall

    The variable Model_Year contains data for the year a car was manufactured, and the variable Cylinders contains data for the number of engine cylinders in the car. The Acceleration and Displacement variables contain data for car acceleration and displacement.

    Use the table function to create a table of factor values from the data in Model_Year and Cylinders.

    tbl = table(Model_Year,Cylinders,VariableNames=["Year" "Cylinders"]);

    Create a matrix of response variables from Acceleration and Displacement.

    y = [Acceleration Displacement];

    Perform a two-way MANOVA using the factor values in tbl and the response variables in y.

    maov = manova(tbl,y)
    maov = 
    2-way manova
    
    Y1,Y2 ~ 1 + Year + Cylinders
    
         Source      DF    TestStatistic     Value        F       DFNumerator    DFDenominator      pValue  
        _________    __    _____________    ________    ______    ___________    _____________    __________
    
        Year          2       pillai        0.084893    2.1056          4             190           0.081708
        Cylinders     2       pillai         0.94174     42.27          4             190         2.5049e-25
        Error        95                                                                                     
        Total        99                                                                                     
    
    
      Properties, Methods
    
    
    

    maov is a two-way manova object that contains the results of the two-way MANOVA. The output displays the formula for the MANOVA model and a MANOVA table. In the formula, the car acceleration and displacement are represented by the variables Y1 and Y2, respectively. The MANOVA table contains a small p-value corresponding to the Cylinders term in the MANOVA model. The small p-value indicates that, at the 95% confidence level, enough evidence exists to conclude that Cylinders has a statistically significant effect on the mean response vector. Year has a p-value larger than 0.05, which indicates that not enough evidence exists to conclude that Year has a statistically significant effect on the mean response vector at the 95% confidence level.

    Use the barttest function to determine the dimension of the space spanned by the mean response vectors corresponding to the factor Year.

    barttest(maov,"Year")
    ans = 0
    

    The output shows that the mean response vectors corresponding to Year span a point, indicating that they are not statistically different from each other. This result is consistent with the large p-value for Year.

    Load the patients data set.

    load patients

    The variables Systolic and Diastolic contain data for patient systolic and diastolic blood pressure. The variables Smoker and SelfAssessedHealthStatus contain data for patient smoking status and self-assessed health status.

    Use the table function to create a table of response data and factor values from the data in Systolic, Diastolic, Smoker, and SelfAssessedHealthStatus.

    tbl = table(Systolic,Diastolic,Smoker,SelfAssessedHealthStatus,VariableNames=["Systolic" "Diastolic" "Smoker" "SelfAssessed"]);

    Perform a two-way MANOVA to test the null hypothesis that smoking status does not have a statistically significant effect on systolic and diastolic blood pressure, and the null hypothesis that self-assessed health status does not have an effect on systolic and diastolic blood pressure.

    maov = manova(tbl,["Systolic" "Diastolic"])
    maov = 
    2-way manova
    
    Systolic,Diastolic ~ 1 + Smoker + SelfAssessed
    
           Source       DF    TestStatistic     Value         F       DFNumerator    DFDenominator      pValue  
        ____________    __    _____________    ________    _______    ___________    _____________    __________
    
        Smoker           1       pillai         0.67917     99.494          2              94         6.2384e-24
        SelfAssessed     3       pillai        0.053808    0.87552          6             190            0.51392
        Error           95                                                                                      
        Total           99                                                                                      
    
    
      Properties, Methods
    
    
    

    maov is a manova object that contains the results of the two-way MANOVA. The small p-value for the Smoker term in the MANOVA model indicates that enough evidence exists to conclude that mean response vectors are statistically different across the factor values of Smoker. However, the large p-value for the SelfAssessed term indicates that not enough evidence exists to reject the null hypothesis that the mean response vectors are statistically the same across the values for SelfAssessed.

    Calculate the marginal means for the values of the factor Smoker.

    groupmeans(maov,"Smoker")
    ans=2×5 table
        Smoker     Mean       SE       Lower     Upper 
        ______    ______    _______    ______    ______
    
        false     99.203    0.45685    98.296    100.11
        true      109.45    0.62574    108.21     110.7
    
    

    The output shows that the marginal mean for non-smokers is lower than the marginal mean for smokers.

    Load the patients data set.

    load patients

    The variables Systolic and Diastolic contain data for patient systolic and diastolic blood pressure. The variables Weight, Height, and Smoker contain data for patient weight, height, and smoking status.

    Use the table function to create a table of response data and factor values from the data in Systolic, Diastolic, Smoker, Weight, and Height.

    tbl = table(Systolic,Diastolic,Smoker,Weight,Height,VariableNames=["Systolic" "Diastolic" "Smoker" "Weight" "Height"]);

    Perform a three-way MANOVA to test the null hypothesis that smoking status does not have a statistically significant effect on systolic and diastolic blood pressure, and the null hypothesis that the interaction between weight and height does not have a statistically significant effect on systolic and diastolic blood pressure.

    maov = manova(tbl,"Systolic,Diastolic ~ Smoker + Weight*Height",CategoricalFactors=["Smoker"])
    maov = 
    N-way manova
    
    Systolic,Diastolic ~ 1 + Smoker + Weight*Height
    
           Source        DF    TestStatistic     Value         F       DFNumerator    DFDenominator      pValue  
        _____________    __    _____________    ________    _______    ___________    _____________    __________
    
        Smoker            1       pillai         0.66141     91.809          2              94         7.8511e-23
        Weight            1       pillai        0.020516    0.98446          2              94            0.37746
        Height            1       pillai        0.012788     0.6088          2              94            0.54613
        Weight:Height     1       pillai        0.019438    0.93169          2              94            0.39749
        Error            95                                                                                      
        Total            99                                                                                      
    
    
      Properties, Methods
    
    
    

    maov is a manova object that contains the results of the three-way MANOVA. The small p-value for the Smoker term in the MANOVA model indicates that enough evidence exists to conclude that mean response vectors are statistically different across the factor values of Smoker. However, the large p-value for the Weight:Height term indicates that not enough evidence exists to reject the null hypothesis that the mean response vectors are not statistically different across the combinations of the values for weight and height.

    Display a profile plot of the means for the values of Smoker.

    plotprofile(maov,"Smoker")
    legend

    The plot shows that the mean systolic and diastolic blood pressure values are higher for smokers than non-smokers.

    Alternative Functionality

    The manova1 function returns the output of the barttest object function, and a subset of the manova object properties. manova1 is limited to one-way MANOVA.
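    For comparison, the two routes might be run side by side as in the sketch below, which uses the fisheriris data and the default factor name Factor1. Note that manova1 takes the response matrix first and the grouping variable second, the reverse of manova.

```matlab
load fisheriris

% Function route: d is the estimated dimension of the space spanned by the
% group means, p contains the p-values of the sequential dimension tests
[d,p] = manova1(meas,species);

% Object route: fit once, then query the object
maov = manova(species,meas);
dObj = barttest(maov,"Factor1");
```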

    Version History

    Introduced in R2023b