Data Types

Grouping variables, categorical data, and dataset arrays

Statistics and Machine Learning Toolbox™ provides two additional data types. Work with ordered and unordered discrete, nonnumeric data using the nominal and ordinal data types. Store multiple variables, including those with different data types, in a single object using the dataset array data type. However, these data types are unique to Statistics and Machine Learning Toolbox. For greater cross-product compatibility, use the categorical or table data types, respectively, available in MATLAB®. For more information, see Create Categorical Arrays, Create Tables and Assign Data to Them, or watch Tables and Categorical Arrays.
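As a minimal sketch of the recommended alternatives, the following example builds an unordered categorical array, an ordinal categorical array, and a table that holds both alongside numeric data. The variable names (colors, sizes, T, bigRows) and the sample values are illustrative assumptions, not taken from the toolbox documentation.

```matlab
% Unordered categories: categorical stands in for nominal.
colors = categorical({'red'; 'blue'; 'red'; 'green'});

% Ordered categories: an ordinal categorical array stands in for ordinal.
% The second argument fixes the category order from lowest to highest.
sizes = categorical({'small'; 'large'; 'medium'; 'small'}, ...
    {'small', 'medium', 'large'}, 'Ordinal', true);

% Heterogeneous variables in one container: table stands in for dataset.
T = table(colors, sizes, [1.2; 3.4; 0.7; 2.1], ...
    'VariableNames', {'Color', 'Size', 'Weight'});

% Ordinal categories support relational comparisons,
% so rows can be selected by category order.
bigRows = T(T.Size >= 'medium', :);
```

If a dataset array already exists in the workspace, dataset2table converts it to a table without rebuilding the variables by hand.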


nominal         (Not Recommended) Arrays for nominal data
ordinal         (Not Recommended) Arrays for ordinal data
dummyvar        Create dummy variables
onehotencode    Encode data labels into one-hot vectors
onehotdecode    Decode probability vectors into class labels
gplotmatrix     Matrix of scatter plots by group
grp2idx         Create index vector from grouping variable
gscatter        Scatter plot by group
mat2dataset     (Not Recommended) Convert matrix to dataset array
cell2dataset    (Not Recommended) Convert cell array to dataset array
struct2dataset  (Not Recommended) Convert structure array to dataset array
table2dataset   (Not Recommended) Convert table to dataset array
dataset2cell    (Not Recommended) Convert dataset array to cell array
dataset2struct  (Not Recommended) Convert dataset array to structure
dataset2table   Convert dataset array to table
export          (Not Recommended) Write dataset array to file
ismissing       (Not Recommended) Find dataset array elements with missing values
join            (Not Recommended) Merge dataset array observations


dataset         (Not Recommended) Arrays for statistical data


Categorical Data

Nominal and Ordinal Arrays

Nominal and ordinal arrays store data that have a finite set of discrete levels, which might or might not have a natural order.

Advantages of Using Nominal and Ordinal Arrays

Easily manipulate category levels, carry out statistical analysis, and reduce memory requirements.

Grouping Variables

Grouping variables are utility variables used to group or categorize observations.

Dummy Variables

Dummy variables let you adapt categorical data for use in classification and regression analysis.

Other MATLAB Functions Supporting Nominal and Ordinal Arrays

Learn about MATLAB functions that support nominal and ordinal arrays.

Create Nominal and Ordinal Arrays

Create nominal and ordinal arrays using nominal and ordinal, respectively.

Categorize Numeric Data

Categorize numeric data into a categorical ordinal array using ordinal.

Change Category Labels

Change the labels for category levels in nominal or ordinal arrays using setlabels.

Add and Drop Category Levels

Add and drop levels from a nominal or ordinal array.

Merge Category Levels

Merge categories in a nominal or ordinal array using mergelevels.

Reorder Category Levels

Reorder the category levels in nominal or ordinal arrays using reorderlevels.

Sort Ordinal Arrays

Determine sorting order for ordinal arrays.

Plot Data Grouped by Category

Plot data grouped by the levels of a categorical variable.

Summary Statistics Grouped by Category

Compute summary statistics grouped by levels of a categorical variable.

Test Differences Between Category Means

Test for significant differences between category (group) means using a t-test, two-way ANOVA (analysis of variance), and ANOCOVA (analysis of covariance).

Index and Search Using Nominal and Ordinal Arrays

Index and search data by its category, or group.

Linear Regression with Categorical Covariates

Perform a regression with categorical covariates using categorical arrays and fitlm.

Dataset Arrays

Dataset Arrays

Dataset arrays store data with heterogeneous types.

Create a Dataset Array from Workspace Variables

Create a dataset array from a numeric array or heterogeneous variables existing in the MATLAB workspace.

Create a Dataset Array from a File

Create a dataset array from the contents of a tab-delimited or comma-separated text file, or an Excel file.

Add and Delete Observations

Add and delete observations in a dataset array.

Add and Delete Variables

Add and delete variables in a dataset array.

Access Data in Dataset Array Variables

Work with dataset array variables and their data.

Select Subsets of Observations

Select an observation or subset of observations from a dataset array.

Sort Observations in Dataset Arrays

Sort observations (rows) in a dataset array using the command line.

Merge Dataset Arrays

Merge dataset arrays using join.

Stack or Unstack Dataset Arrays

Reformat dataset arrays using stack and unstack.

Clean Messy and Missing Data

Find, clean, and delete observations with missing data in a dataset array.

Calculations on Dataset Arrays

Perform calculations on dataset arrays, including averaging and summarizing with a grouping variable.

Export Dataset Arrays

Export a dataset array from the MATLAB workspace to a text or spreadsheet file.

Dataset Arrays in the Variables Editor

The MATLAB Variables editor provides a convenient interface for viewing, modifying, and plotting dataset arrays.

Index and Search Dataset Arrays

Learn the many ways to index into dataset arrays.

Regression Using Dataset Arrays

This example shows how to perform linear and stepwise regression analyses using dataset arrays.