onehotencode
Syntax
Description
B = onehotencode(A,featureDim) encodes data labels in categorical array A into a one-hot encoded array B. The function replaces each element of A with a numeric vector of length equal to the number of unique classes in A along the dimension specified by featureDim. The vector contains a 1 in the position corresponding to the class of the label in A, and a 0 in every other position. Any <undefined> values are encoded to NaN values.
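As a minimal sketch of this syntax, the following encodes a short row vector of labels along dimension 1 and then along dimension 2. The label values are made up for illustration. The sketch assumes that a categorical array built from a string array orders its categories alphabetically, and it picks featureDim to be a singleton dimension of the input in each call.

```matlab
% Hypothetical label data for illustration.
labels = categorical(["red" "blue" "red" "green"]);   % 1-by-4 categorical

% Categories are assumed to be ordered alphabetically: blue, green, red.
disp(categories(labels))

% Expand along dimension 1: each column of B is the one-hot vector
% for the corresponding label, so B is 3-by-4.
B = onehotencode(labels,1);
disp(B)

% Expand along dimension 2 instead: transpose the labels to a column
% so that dimension 2 is singleton. Each row of B2 is a one-hot vector.
B2 = onehotencode(labels',2);
disp(size(B2))
```

With this input, the first column of B has its 1 in the third row because red is the third category.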
tblB = onehotencode(tblA) encodes categorical data labels in table tblA into a table of one-hot encoded numeric values. The function replaces the single variable of tblA with as many variables as the number of unique classes in tblA. Each row in tblB contains a 1 in the variable corresponding to the class of the label in tblA, and a 0 in all other variables.
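The table syntax can be sketched as follows. The variable name color and its values are hypothetical, and the sketch assumes that the output variables are named after the classes, in category order.

```matlab
% Hypothetical single-variable table of categorical labels.
color = categorical(["blue"; "red"; "blue"; "green"]);
tblA = table(color);

% Replace the single variable with one numeric variable per class.
tblB = onehotencode(tblA);

% Each row holds a 1 under the class of the original label
% and a 0 under every other class.
disp(tblB)
```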
___ = onehotencode(___,typename) encodes the labels into numeric values of data type typename. Use this syntax with any of the input and output arguments in previous syntaxes.
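For instance, a sketch that requests single-precision output might look like this. The labels are made up, and the sketch assumes typename is passed as the argument immediately after featureDim. A floating-point type is used here so that NaN remains representable.

```matlab
% Hypothetical labels for illustration.
labels = categorical(["cat" "dog" "cat"]);

% Request the encoded values as single instead of the default type.
B = onehotencode(labels,1,"single");

% Confirm the data type of the result.
disp(class(B))
```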
___ = onehotencode(___,'ClassNames',classes) also specifies the names of the classes to use for encoding. Use this syntax when A or tblA does not contain categorical values, when you want to exclude any class labels from being encoded, or when you want to encode the vector elements in a specific order. Any label in A or tblA of a class that does not exist in classes is encoded to a vector of NaN values.
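The three uses of ClassNames described above can be combined in one small sketch. The labels and the class list are hypothetical, and the sketch assumes that a string array is an accepted noncategorical input type.

```matlab
% Hypothetical noncategorical labels (a string array).
labels = ["red" "blue" "green" "red"];

% Encode only two classes, in this explicit order.
% "blue" is deliberately left out of the class list.
classes = ["red" "green"];
B = onehotencode(labels,1,'ClassNames',classes);

% Row 1 corresponds to red and row 2 to green.
% The column for the excluded label "blue" contains only NaN values.
disp(B)
```

Because NaN values do not compare equal to themselves, compare results such as B with isequaln rather than isequal.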
Examples
Input Arguments
Output Arguments
Alternative Functionality
To encode data labels, you can also use dummyvar, which creates dummy variables from grouping variables. The following table compares the onehotencode and dummyvar functions for different use cases.
Use Case | When to Use onehotencode | When to Use dummyvar |
---|---|---|
Encoding multiple variables | Use onehotencode in a loop. For an example, see One-Hot Encode Table with Several Variables. | Specify the input argument group as a cell array or positive integer matrix. For examples, see Create Dummy Variables from Multiple Grouping Variables and Create Dummy Variables from Numeric Grouping Variables. |
Encoding a variable in cell array format | Convert the cell array variable to a categorical array. | Specify the input argument group as a cell array containing one or more grouping variables. |
Encoding noncategorical data labels | Specify the data labels as a categorical array or specify the classes to encode using the ClassNames name-value argument. For an example, see One-Hot Encode Subset of Classes. | You do not need to convert the data labels, because dummyvar accepts noncategorical grouping variables as input. |
Encoding an array of data labels | Specify the dimension to expand (featureDim). | The software automatically determines the dimension to expand. dummyvar returns dummy variables as a numeric array with columns created from the columns of the input grouping variables. |
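The "use onehotencode in a loop" approach for multiple variables could look roughly like the sketch below. This is not the linked example. It is an illustration with hypothetical variable names (color, sz), and it assumes the class names are distinct across variables so that the concatenated table has unique variable names.

```matlab
% Hypothetical table with two categorical variables.
color = categorical(["red"; "blue"; "red"]);
sz = categorical(["small"; "large"; "small"]);
tbl = table(color,sz);

% Encode each variable separately as a single-variable table.
parts = cell(1,width(tbl));
for k = 1:width(tbl)
    parts{k} = onehotencode(tbl(:,k));
end

% Concatenate the encoded tables horizontally.
encoded = [parts{:}];
disp(encoded)
```

Collecting the pieces in a cell array and concatenating once avoids growing a table inside the loop.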
In many cases, you do not need to use the onehotencode or dummyvar function for encoding. Most Statistics and Machine Learning Toolbox™ functions can operate directly on categorical response data. Most classification and regression functions also accept categorical predictors.
Version History
Introduced in R2021b