
dlode45

Deep learning solution of nonstiff ordinary differential equation (ODE)

Since R2021b

    Description

    The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE.

    The dlode45 function applies the neural ODE operation to dlarray data. Using dlarray objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.

    Note

    This function applies the neural ODE operation to dlarray data in deep learning models defined as functions or in custom layer functions. If you want to apply the neural ODE operation within a dlnetwork object or Layer array, use neuralODELayer. To solve ODEs for other workflows, use ode45.

    Y = dlode45(net,tspan,Y0) integrates the system of ODEs characterized by the dlnetwork object net on the time interval defined by the first and last elements of tspan and with the initial conditions Y0. (since R2023a)


    Y = dlode45(odefun,tspan,Y0,theta) integrates the system of ODEs given by the function handle odefun on the time interval defined by the first and last elements of tspan, with the initial conditions Y0 and parameters theta.


    Y = dlode45(___,DataFormat=FMT) specifies the data format for the unformatted initial conditions Y0. The format must contain "S" (spatial), "C" (channel), and "B" (batch) dimension labels only.

    Y = dlode45(___,Name=Value) specifies additional options using one or more name-value arguments. For example, Y = dlode45(odefun,tspan,Y0,theta,GradientMode="adjoint") integrates the system of ODEs given by odefun and computes gradients by solving the associated adjoint ODE system.

    Examples


Apply the neural ODE operation to a mini-batch of feature data using a dlnetwork object.

    For the initial conditions, create a formatted dlarray object containing 128 observations of numeric feature data with 4 channels. Specify the format "CB" (channel, batch).

    miniBatchSize = 128;
    numChannels = 4;
    Y0 = rand(numChannels,miniBatchSize);
    Y0 = dlarray(Y0,"CB");

    View the size and format of the initial conditions.

    size(Y0)
    ans = 1×2
    
         4   128
    
    
    dims(Y0)
    ans = 
    'CB'
    

    Create a dlnetwork object that characterizes the ODE function, given by two fully connected operations with a tanh activation between them.

    layers = [
        featureInputLayer(4)
        fullyConnectedLayer(10)
        tanhLayer
        fullyConnectedLayer(4)];
    net = dlnetwork(layers);

    Specify an interval of integration of [0 0.1].

    tspan = [0 0.1];

    Apply the neural ODE operation.

    Y = dlode45(net,tspan,Y0);

    View the size and format of the output.

    size(Y)
    ans = 1×2
    
         4   128
    
    
    dims(Y)
    ans = 
    'CB'
    

Apply the neural ODE operation to image data using an ODE function specified as a function handle with custom parameters.

    For the initial conditions, create a formatted dlarray object containing a batch of 128 28-by-28 images with 64 channels. Specify the format "SSCB" (spatial, spatial, channel, batch).

    miniBatchSize = 128;
    inputSize = [28 28];
    numChannels = 64;
    Y0 = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
    Y0 = dlarray(Y0,"SSCB");

    View the size and format of the initial conditions.

    size(Y0)
    ans = 1×4
    
        28    28    64   128
    
    
    dims(Y0)
    ans = 
    'SSCB'
    

    Specify the ODE function. Define the function odeModel, listed in the ODE Function section of the example, which applies a convolution operation followed by a hyperbolic tangent operation to the input data.

    odefun = @odeModel;

    Initialize the parameters for the convolution operation in the ODE function. The output size of the ODE function must match the size of the initial conditions, so specify the same number of filters as the number of input channels.

    filterSize = [3 3];
    numFilters = numChannels;
    
    parameters = struct;
    parameters.Weights = dlarray(rand(filterSize(1),filterSize(2),numChannels,numFilters));
    parameters.Bias = dlarray(zeros(1,numFilters));

    Specify an interval of integration of [0 0.1].

    tspan = [0 0.1];

    Apply the neural ODE operation.

    Y = dlode45(odefun,tspan,Y0,parameters);

    View the size and format of the output.

    size(Y)
    ans = 1×4
    
        28    28    64   128
    
    
    dims(Y)
    ans = 
    'SSCB'
    

    ODE Function

    The ODE function odeModel takes as input the time input t (unused), the input data y, and the ODE function parameters p containing the convolution weights and biases. The function returns the output of the convolution-tanh block operation. The convolution operation applies padding such that the output size matches the input size.

    function z = odeModel(t,y,p)
    
    weights = p.Weights;
    bias = p.Bias;
    z = dlconv(y,weights,bias,Padding="same");
    z = tanh(z);
    
    end

    Input Arguments


    Since R2023a

    net — Neural network characterizing the neural ODE function, specified as an initialized dlnetwork object.

    If net has one input, then predict(net,Y) defines the ODE system. If net has two inputs, then predict(net,T,Y) defines the ODE system, where T is a time step repeated over the batch dimension.

    When the first argument is a dlnetwork object and GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".
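A two-input network can be built by concatenating the time step with the state along the channel dimension. This is a minimal sketch, not from the original page; it assumes 4-channel "CB" state data, and the input order in the resulting network should be checked via net.InputNames, since dlode45 passes the time step to the first input.

```matlab
% Sketch: a network whose first input is the scalar time t and whose
% second input is the state y. dlode45 evaluates predict(net,T,Y).
numChannels = 4;
layers = [
    featureInputLayer(1,Name="t")
    concatenationLayer(1,2,Name="cat")   % concatenate along channels
    fullyConnectedLayer(numChannels)];   % output size must match y
lgraph = layerGraph(layers);
lgraph = addLayers(lgraph,featureInputLayer(numChannels,Name="y"));
lgraph = connectLayers(lgraph,"y","cat/in2");
net = dlnetwork(lgraph);
net.InputNames   % confirm the time input is listed first (assumption)
```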

    odefun — Function to integrate, specified as a function handle.

    Specify odefun as a function handle with syntax z = fcn(t,y,p), where t is a scalar, y is a dlarray, and p is a set of parameters given by theta. The function returns a dlarray with the same size and format as y. The function must accept all three input arguments t, y, and p, even if not all the arguments are used in the function. The size of the ODE function output z must match the size of the initial conditions.

    For example, specify the ODE function that applies a convolution operation followed by a tanh operation.

    function z = dlconvtanh(t,y,p)
    
    weights = p.Weights;
    bias = p.Bias;
    z = dlconv(y,weights,bias,Padding="same");
    z = tanh(z);
    
    end
    Note here that the t argument is unused.

    Data Types: function_handle

    tspan — Interval of integration, specified as a numeric vector or an unformatted dlarray vector with two or more elements. The elements in tspan must be all increasing or all decreasing.

    The solver imposes the initial conditions given by Y0 at the initial time tspan(1), then integrates the ODE function from tspan(1) to tspan(end).

    • If tspan has two elements, [t0 tf], then the solver returns the solution evaluated at point tf.

    • If tspan has more than two elements, [t0 t1 t2 ... tf], then the solver returns the solution evaluated at the given points [t1 t2 ... tf]. The solver does not step precisely to each point specified in tspan. Instead, the solver uses its own internal steps to compute the solution, then evaluates the solution at the points specified in tspan. The solutions produced at the specified points are of the same order of accuracy as the solutions computed at each internal step.

      Specifying several intermediate points has little effect on the efficiency of computation, but for large systems it can affect memory management.
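The effect of a multi-element tspan can be sketched as follows, assuming the network net and initial conditions Y0 from the first example ("CB" data with 4 channels and 128 observations). The output gains an appended "T" dimension with one entry per element of tspan(2:end).

```matlab
% Request the solution at two intermediate/final points.
tspan = [0 0.05 0.1];
Y = dlode45(net,tspan,Y0);
size(Y)    % 4 128 2: solutions at t = 0.05 and t = 0.1
dims(Y)    % 'CBT'
```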

    Note

    The behavior of the dlode45 function differs from that of the ode45 function.

    If InitialStep or MaxStep is [], then the software uses the values of tspan to initialize these options.

    • If InitialStep is [], then the software uses the elements of tspan as an indication of the scale of the task. When you specify tspan with different numbers of elements, the solution of the solver can change.

    • If MaxStep is [], then the software calculates the maximum step size using the first and last elements of tspan. When you change the initial or final values of tspan, the solution of the solver can change because the solver uses a different step sequence.

    Y0 — Initial conditions, specified as a formatted or unformatted dlarray object.

    If Y0 is an unformatted dlarray, then you must specify the format using the DataFormat option.

    For neural ODE operations, the data format must contain "S", "C", and "B" dimension labels only. The initial conditions must not have a "T" or "U" dimension.

    theta — Parameters of the ODE function, specified as one of these values:

    • dlnetwork object (since R2023a)

    • dlarray object

    • Structure or nested structures of one or more of these values:

      • dlnetwork object (since R2023a)

      • dlarray object

      • Any MATLAB® data type (since R2023b)

    • Cell array of one or more of these values:

      • dlnetwork object (since R2023a)

      • dlarray object

      • Any MATLAB data type (since R2023b)

    • Table with these variables:

      • Layer — Layer name

      • Parameter — Parameter name

      • Value — Parameter value, specified as one of these values:

        • dlarray object

        • Any MATLAB data type (since R2023b)

    For automatic differentiation workflows (for example, when you define a custom layer or use the dlgradient and dlfeval functions), the software only computes gradients with respect to the dlarray objects contained in the object specified by theta.

    The software passes the parameters to odefun as its third input argument.

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: Y = dlode45(odefun,tspan,Y0,theta,GradientMode="adjoint") integrates the system of ODEs given by odefun and computes gradients by solving the associated adjoint ODE system.

    DataFormat — Description of the data dimensions, specified as a character vector or string scalar.

    A data format is a string of characters, where each character describes the type of the corresponding data dimension.

    The characters are:

    • "S" — Spatial

    • "C" — Channel

    • "B" — Batch

    • "T" — Time

    • "U" — Unspecified

    For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).

    For neural ODE operations, the data format must contain "S", "C", and "B" dimension labels only. The initial conditions must not have a "T" or "U" dimension.

    If Y0 is not a formatted dlarray object, then you must specify the DataFormat option.

    For more information, see Deep Learning Data Formats.

    Example: DataFormat="SSCB"

    Data Types: char | string

    GradientMode — Method to compute gradients with respect to the initial conditions and parameters when you use the dlgradient function, specified as one of these values:

    • "direct" — Compute gradients by backpropagating through the operations undertaken by the numerical solver. This option best suits large mini-batch sizes or when tspan contains many values.

    • "adjoint" — Compute gradients by solving the associated adjoint ODE system. This option best suits small mini-batch sizes or when tspan contains a small number of values.

    • "adjoint-seminorm" (since R2023b) — Compute gradients using adjoint seminorm computations [4]. This option typically uses fewer terms in the approximation error computations and can speed up some neural ODE models.

    When the first argument is a dlnetwork object and GradientMode is "adjoint", the network State property must be empty. To use a network with a nonempty State property, set GradientMode to "direct".

    The dlaccelerate function does not support accelerating the dlode45 function when the GradientMode option is "direct". To accelerate code that calls dlode45, set the GradientMode option to "adjoint", or accelerate only the parts of your code that do not call dlode45 with GradientMode set to "direct".
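A typical use of GradientMode is inside a model loss function evaluated with dlfeval. This is a minimal sketch, assuming the network net, interval tspan, and initial conditions Y0 from the first example, and a placeholder loss (the sum of the output).

```matlab
% Evaluate the loss and gradients with the adjoint method.
[loss,gradients] = dlfeval(@modelLoss,net,tspan,Y0);

function [loss,gradients] = modelLoss(net,tspan,Y0)
% Integrate the neural ODE, computing gradients via the adjoint system.
Y = dlode45(net,tspan,Y0,GradientMode="adjoint");
loss = sum(Y,"all");              % placeholder scalar loss (assumption)
gradients = dlgradient(loss,Y0);  % gradients w.r.t. the initial conditions
end
```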

    Warning

    When GradientMode is "adjoint", odefun must support function acceleration. Otherwise, the function can return unexpected results.

    When GradientMode is "adjoint", the software traces the ODE function input to determine the computation graph used for automatic differentiation. This tracing process can take some time and can end up recomputing the same trace. By optimizing, caching, and reusing the traces, the software can speed up the gradient computation.

    Because of the nature of caching traces, not all functions support acceleration.

    The caching process can cache values that you might expect to change or that depend on external factors. You must take care when you accelerate functions that:

    • have inputs with random or frequently changing values

    • have outputs with frequently changing values

    • generate random numbers

    • use if statements and while loops with conditions that depend on the values of dlarray objects

    • have inputs that are handles or that depend on handles

    • read data from external sources (for example, by using a datastore or a minibatchqueue object)

    For more information on deep learning function acceleration, see Deep Learning Function Acceleration for Custom Training Loops.

    InitialStep — Initial step size, specified as a positive scalar or [].

    If InitialStep is [], then the function automatically determines the initial step size based on the interval of integration and the output of the ODE function corresponding to the initial conditions.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    MaxStep — Maximum step size, specified as a positive scalar or [].

    If MaxStep is [], then the function uses a tenth of the size of the interval of integration.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    RelativeTolerance — Relative error tolerance, specified as a positive scalar. The relative tolerance applies to all components of the solution.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    AbsoluteTolerance — Absolute error tolerance, specified as a positive scalar. The absolute tolerance applies to all components of the solution.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
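The solver options can be combined in a single call. This is a minimal sketch, assuming the network net, interval tspan, and initial conditions Y0 from the first example; the numeric values are illustrative only.

```matlab
% Bound the solver step sizes and tighten the error tolerances.
Y = dlode45(net,tspan,Y0, ...
    InitialStep=0.01, ...
    MaxStep=0.05, ...
    RelativeTolerance=1e-4, ...
    AbsoluteTolerance=1e-7);
```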

    Output Arguments


    Y — Solution of the neural ODE at the times given by tspan(2:end), returned as a dlarray object with the same underlying data type as Y0.

    If Y0 is a formatted dlarray and tspan contains exactly two elements, then Y has the same format as Y0. If Y0 is not a formatted dlarray and tspan contains exactly two elements, then Y is an unformatted dlarray with the same dimension order as Y0.

    If Y0 is a formatted dlarray and tspan contains more than two elements, then Y has the same format as Y0 with an additional appended "T" (time) dimension. If Y0 is not a formatted dlarray and tspan contains more than two elements, then Y is an unformatted dlarray with the same dimension order as Y0 with an additional appended dimension corresponding to time.

    Algorithms


    Neural Ordinary Differential Equation

    The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE. In particular, given an input, a neural ODE operation outputs the numerical solution of the ODE y′ = f(t,y,θ) on the time horizon (t0,t1) with the initial condition y(t0) = y0, where t and y denote the ODE function inputs and θ is a set of learnable parameters. Typically, the initial condition y0 is either the network input or the output of another deep learning operation.

    To apply the operation, dlode45 uses the ode45 function, which is based on an explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair. It is a single-step solver—in computing y(tn), it needs only the solution at the immediately preceding time point, y(tn-1) [2] [3].
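For an ODE with a known closed-form solution, the numerical solution can be checked directly. This is a minimal sketch, not from the original page: it integrates y′ = −a.*y, whose exact solution is y0.*exp(−a*t), and compares the result against dlode45.

```matlab
% Decay rates as learnable-style parameters (illustrative values).
a = dlarray([0.5; 2]);
odefun = @(t,y,p) -p.*y;        % t is unused but required
y0 = dlarray([1; 1],"CB");      % 2 channels, 1 observation

% Integrate from t = 0 to t = 1 and compare with the exact solution.
Y = dlode45(odefun,[0 1],y0,a);
yExact = exp(-extractdata(a));
max(abs(extractdata(Y) - yExact))   % small, on the order of the tolerances
```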

    Deep Learning Array Formats

    Most deep learning networks and functions operate on different dimensions of the input data in different ways.

    For example, an LSTM operation iterates over the time dimension of the input data, and a batch normalization operation normalizes over the batch dimension of the input data.

    To provide input data with labeled dimensions or input data with additional layout information, you can use data formats.

    A data format is a string of characters, where each character describes the type of the corresponding data dimension.

    The characters are:

    • "S" — Spatial

    • "C" — Channel

    • "B" — Batch

    • "T" — Time

    • "U" — Unspecified

    For example, consider an array containing a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can specify that this array has the format "CBT" (channel, batch, time).

    To create formatted input data, create a dlarray object and specify the format using the second argument.

    To provide additional layout information with unformatted data, specify the format using the DataFormat argument.

    For more information, see Deep Learning Data Formats.
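The two ways of supplying layout information can be sketched as follows, assuming the network net from the first example and "CB" feature data with 4 channels and 128 observations.

```matlab
% Formatted input: label the dimensions when you create the dlarray.
Y0 = dlarray(rand(4,128),"CB");
Y = dlode45(net,[0 0.1],Y0);

% Unformatted input: pass the layout with the DataFormat option instead.
Y0 = dlarray(rand(4,128));
Y = dlode45(net,[0 0.1],Y0,DataFormat="CB");
```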

    References

    [1] Chen, Ricky T. Q., Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. “Neural Ordinary Differential Equations.” Preprint, submitted June 19, 2018. https://arxiv.org/abs/1806.07366.

    [2] Dormand, J. R., and P. J. Prince. “A Family of Embedded Runge-Kutta Formulae.” Journal of Computational and Applied Mathematics 6, no. 1 (March 1980): 19–26. https://doi.org/10.1016/0771-050X(80)90013-3.

    [3] Shampine, Lawrence F., and Mark W. Reichelt. “The MATLAB ODE Suite.” SIAM Journal on Scientific Computing 18, no. 1 (January 1997): 1–22. https://doi.org/10.1137/S1064827594276424.

    [4] Kidger, Patrick, Ricky T. Q. Chen, and Terry Lyons. “‘Hey, That’s Not an ODE’: Faster ODE Adjoints via Seminorms.” arXiv, May 10, 2021. https://doi.org/10.48550/arXiv.2009.09457.


    Version History

    Introduced in R2021b
