simulate

Monte Carlo simulation of ARIMA or ARIMAX models

Description

[Y,E] = simulate(Mdl,numObs) simulates a numObs-period response path Y and the corresponding innovations path E from the fully specified ARIMA model Mdl. The responses can include the effects of seasonality.

[Y,E] = simulate(Mdl,numObs,Name=Value) specifies additional options using one or more name-value arguments. For example, simulate(Mdl,10,NumPaths=1000,Y0=y0) simulates 1000 sample paths of length 10 from the ARIMA model Mdl, and uses the observations in y0 as a presample to initialize each generated path.

[Y,E,V] = simulate(___) also simulates paths of conditional variances V for a composite conditional mean and variance model (for example, an ARIMA and GARCH composite model) using any of the input argument combinations in the previous syntaxes.

Examples

Consider the ARIMA(4,0,1) model

$$(1 + 0.75L^4)y_t = 2 + (1 + 0.1L)\varepsilon_t,$$

where $\varepsilon_t$ is a Gaussian innovations series with a mean of 0 and a variance of 1.

Create the ARIMA(4,0,1) model. The AR coefficient of -0.75 at lag 4 corresponds to the term $+0.75L^4$ in the AR lag polynomial, and because the code does not set D, the model has no integration.

Mdl = arima('AR',-0.75,'ARLags',4,'MA',0.1,...
    'Constant',2,'Variance',1)
Mdl = 
  arima with properties:

     Description: "ARIMA(4,0,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 4
               D: 0
               Q: 1
        Constant: 2
              AR: {-0.75} at lag [4]
             SAR: {}
              MA: {0.1} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 1

Mdl is a fully specified arima object representing the ARIMA(4,0,1) model.

Simulate a 100-period random response path from the ARIMA(4,0,1) model.

rng(1)  % For reproducibility
y = simulate(Mdl,100);

y is a 100-by-1 vector containing the random response path.

Plot the simulated path.

plot(y)
ylabel('y')
xlabel('Period')

[Figure: line plot of y against Period.]
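The simulated path can be cross-checked against the difference equation implied by the AR and MA properties of Mdl. The following is a minimal sketch, assuming a hypothetical presample (y0 and e0 are illustrative values, not part of the original example). It simulates with explicit presample data, rebuilds the responses from the returned innovations by direct recursion, and compares the two.

```matlab
% Same model specification as in the example above
Mdl = arima('AR',-0.75,'ARLags',4,'MA',0.1,...
    'Constant',2,'Variance',1);

rng(1)                      % For reproducibility
y0 = zeros(4,1);            % Hypothetical presample responses (Mdl.P = 4 rows)
e0 = 0;                     % Hypothetical presample innovation (Mdl.Q = 1 row)
[y,e] = simulate(Mdl,100,'Y0',y0,'E0',e0);

% Rebuild the path from the innovations:
% y(t) = 2 - 0.75*y(t-4) + e(t) + 0.1*e(t-1)
yAll = [y0; zeros(100,1)];  % Presample followed by simulation periods
eAll = [e0; e];
for t = 1:100
    yAll(4+t) = 2 - 0.75*yAll(t) + eAll(1+t) + 0.1*eAll(t);
end

maxAbsDiff = max(abs(yAll(5:end) - y))
```

If simulate applies the standard ARMA recursion, maxAbsDiff should be at the level of floating-point round-off.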

Simulate three predictor series and a response series.

Specify and simulate a path of length 20 for each of the three predictor series modeled by

$$(1 - 0.2L)x_{it} = 2 + (1 + 0.5L - 0.3L^2)\eta_{it},$$

where $\eta_{it}$ follows a Gaussian distribution with mean 0 and variance 0.01, and $i \in \{1,2,3\}$.

[MdlX1,MdlX2,MdlX3] = deal(arima('AR',0.2,'MA',...
    {0.5,-0.3},'Constant',2,'Variance',0.01));

rng(4); % For reproducibility 
simX1 = simulate(MdlX1,20);
simX2 = simulate(MdlX2,20);
simX3 = simulate(MdlX3,20);
SimX = [simX1 simX2 simX3];

Specify and simulate a path of length 20 for the response series modeled by

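$$(1 - 0.05L + 0.02L^2 - 0.01L^3)(1 - L)y_t = 0.5 + x_t\,[0.5 \;\; -0.03 \;\; -0.7]^\top + (1 + 0.04L + 0.01L^2)\varepsilon_t,$$

where $x_t$ is the row vector of the three predictors at time $t$ and $\varepsilon_t$ follows a Gaussian distribution with mean 0 and variance 1.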

MdlY = arima('AR',{0.05 -0.02 0.01},'MA',...
    {0.04,0.01},'D',1,'Constant',0.5,'Variance',1,...
    'Beta',[0.5 -0.03 -0.7]);
simY = simulate(MdlY,20,'X',SimX);

Plot the series together.

figure
plot([SimX simY])
title('Simulated Series')
legend('{X_1}','{X_2}','{X_3}','Y')

[Figure: line plot titled "Simulated Series" showing X_1, X_2, X_3, and Y.]

Forecast the daily NASDAQ Composite Index using Monte Carlo simulations.

Load the NASDAQ data included with the toolbox. Extract the first 1500 observations for fitting.

load Data_EquityIdx
nasdaq = DataTable.NASDAQ(1:1500);
n = length(nasdaq);

Specify, and then fit an ARIMA(1,1,1) model.

NasdaqModel = arima(1,1,1);
NasdaqFit = estimate(NasdaqModel,nasdaq);
 
    ARIMA(1,1,1) Model (Gaussian Distribution):
 
                  Value      StandardError    TStatistic      PValue  
                _________    _____________    __________    __________

    Constant      0.43031       0.18555          2.3191       0.020392
    AR{1}       -0.074391      0.081985        -0.90737        0.36421
    MA{1}         0.31126      0.077266          4.0284     5.6158e-05
    Variance       27.826       0.63625          43.735              0

Simulate 1000 paths with 500 observations each. Use the observed data as presample data.

rng default;
Y = simulate(NasdaqFit,500,'NumPaths',1000,'Y0',nasdaq);

Plot the simulation mean forecast and approximate 95% forecast intervals.

lower = prctile(Y,2.5,2);
upper = prctile(Y,97.5,2);
mn = mean(Y,2);

figure
plot(nasdaq,'Color',[.7,.7,.7])
hold on
h1 = plot(n+1:n+500,lower,'r:','LineWidth',2);
plot(n+1:n+500,upper,'r:','LineWidth',2)
h2 = plot(n+1:n+500,mn,'k','LineWidth',2);

legend([h1 h2],'95% Interval','Simulation Mean',...
    'Location','NorthWest')
title('NASDAQ Composite Index Forecast')
hold off

[Figure: line plot titled "NASDAQ Composite Index Forecast" showing the observed series, the 95% interval bounds, and the simulation mean.]

Simulate response and innovation paths from a multiplicative seasonal model.

Specify the model

$$(1 - L)(1 - L^{12})y_t = (1 - 0.5L)(1 + 0.3L^{12})\varepsilon_t,$$

where $\varepsilon_t$ follows a Gaussian distribution with mean 0 and variance 0.1.

Mdl = arima('MA',-0.5,'SMA',0.3,...
    'SMALags',12,'D',1,'Seasonality',12,...
    'Variance',0.1,'Constant',0);

Simulate 500 paths with 100 observations each.

rng default % For reproducibility
[Y,E] = simulate(Mdl,100,'NumPaths',500);

figure
subplot(2,1,1);
plot(Y)
title('Simulated Response')

subplot(2,1,2);
plot(E)
title('Simulated Innovations')

[Figure: two stacked line plots titled "Simulated Response" and "Simulated Innovations", each showing 500 paths.]

Plot the 2.5th, 50th (median), and 97.5th percentiles of the simulated response paths.

lower = prctile(Y,2.5,2);
middle = median(Y,2);
upper = prctile(Y,97.5,2);

figure
plot(1:100,lower,'r:',1:100,middle,'k',...
    1:100,upper,'r:')
legend('95% Interval','Median')

[Figure: line plot of three series with legend entries "95% Interval" and "Median".]

The percentiles and median are computed along the second dimension (across paths), so each summary series has one value per period.

Plot a histogram of the simulated paths at time 100.

figure
histogram(Y(100,:),10)
title('Response Distribution at Time 100')

[Figure: histogram titled "Response Distribution at Time 100".]
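The examples above use a constant innovation variance. As a minimal sketch of the three-output syntax, the following code builds a composite AR(1) and GARCH(1,1) model and returns conditional variance paths in V. All coefficient values are hypothetical and chosen only for illustration.

```matlab
% Hypothetical GARCH(1,1) conditional variance model
CondVarMdl = garch('Constant',0.01,'GARCH',0.8,'ARCH',0.1);

% Hypothetical AR(1) conditional mean model with the GARCH variance
Mdl = arima('AR',0.5,'Constant',0,'Variance',CondVarMdl);

rng(1)  % For reproducibility
[Y,E,V] = simulate(Mdl,100,'NumPaths',5);

size(Y)   % Each output has numObs rows and NumPaths columns
size(V)
```

Each column of V holds the conditional variances of the innovations in the matching column of E.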

Input Arguments

Mdl

Fully specified ARIMA model, specified as an arima model object created by arima or estimate.

The properties of Mdl cannot contain NaN values.

numObs

Sample path length, specified as a positive integer. numObs is the number of random observations to generate per output path. Y, E, and V have numObs rows.

Data Types: double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: simulate(Mdl,10,NumPaths=1000,Y0=y0) simulates 1000 sample paths of length 10 from the ARIMA model Mdl, and uses the observations in y0 as a presample to initialize each generated path.
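As a brief illustration of the two name-value syntaxes, the following sketch calls simulate both ways with the same random seed. The model coefficients are hypothetical, and the Name=Value form assumes R2021a or later.

```matlab
% Hypothetical AR(1) model used only to compare the two call styles
Mdl = arima('AR',0.5,'Constant',0,'Variance',1);

rng(1)
Y1 = simulate(Mdl,10,NumPaths=3);      % Name=Value syntax (R2021a and later)

rng(1)
Y2 = simulate(Mdl,10,'NumPaths',3);    % Comma-separated syntax (all releases)

sameResult = isequal(Y1,Y2)
```

Both calls pass the same arguments, so with the same seed they return the same paths.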

NumPaths

Number of independent sample paths to generate, specified as a positive integer. The output arguments Y, E, and V have NumPaths columns.

Example: NumPaths=1000

Data Types: double

Y0

Presample response data used as initial values for the model, specified as a numeric column vector or a numeric matrix.

Each row of Y0 corresponds to a period in the presample. The following conditions apply:

  • The last row contains the latest presample responses.

  • To initialize the AR components, Y0 must have at least Mdl.P rows.

  • If Y0 has more rows than is required to initialize the model, simulate uses only the latest required rows.

Each column of Y0 corresponds to a separate, independent presample path. The following conditions apply:

  • If Y0 is a column vector, simulate applies it to each path. In this case, the AR components and conditional variance model of all paths in Y derive from common initial responses.

  • If Y0 is a matrix, simulate applies Y0(:,j) to initialize path j. Y0 must have at least NumPaths columns; simulate uses only the first NumPaths columns of Y0.
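The row and column conventions for Y0 can be sketched as follows. This is an illustrative example with a hypothetical AR(2) model and made-up presample values: one call shares a single presample column across all paths, and the other supplies one presample column per path.

```matlab
% Hypothetical AR(2) model, so Mdl.P = 2 and Y0 needs at least 2 rows
Mdl = arima('AR',{0.5 -0.2},'Constant',1,'Variance',0.2);
numPaths = 3;

y0Common = [9; 10];             % Column vector: applied to every path
Y0PerPath = [0 5 10; 0 5 10];   % Matrix: column j initializes path j

rng(1)  % For reproducibility
YCommon = simulate(Mdl,20,'NumPaths',numPaths,'Y0',y0Common);
YPerPath = simulate(Mdl,20,'NumPaths',numPaths,'Y0',Y0PerPath);

size(YCommon)
size(YPerPath)
```

In both cases the last row of the presample is the most recent observation, so the simulated paths continue from it.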

By default, simulate sets any necessary presample observations by using one of the following methods:

  • For a model with a stationary AR process and without a regression component, simulate sets all presample responses to the unconditional mean of the model.

  • For a model that represents a nonstationary process or that contains a regression component, simulate sets all necessary presample responses to zero.
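The effect of the default presample on a stationary model can be checked with a short simulation. The sketch below assumes a hypothetical AR(1) model whose unconditional mean is Constant/(1 - AR) = 2, and compares the average of the first simulated observation under the default presample with the average obtained when Y0 is set to 0.

```matlab
% Hypothetical stationary AR(1) model without a regression component
Mdl = arima('AR',0.5,'Constant',1,'Variance',0.2);
mu = Mdl.Constant/(1 - Mdl.AR{1});   % Unconditional mean of the model

rng(1)  % For reproducibility
YDefault = simulate(Mdl,1,'NumPaths',10000);        % Default presample
YZero = simulate(Mdl,1,'NumPaths',10000,'Y0',0);    % Presample response of 0

meanDefault = mean(YDefault)
meanZero = mean(YZero)
```

If the default presample is the unconditional mean, meanDefault should be close to mu, whereas starting from 0 gives an average near the model constant.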

Data Types: double

E0

Presample innovation data used to initialize either the moving average (MA) component of the ARIMA model or the conditional variance model, specified as a numeric column vector or a numeric matrix. simulate assumes the presample innovations have a mean of zero.

Each row of E0 corresponds to a period in the presample. The following conditions apply:

  • The last row contains the latest presample innovations.

  • To initialize the MA components, E0 must have at least Mdl.Q rows.

  • If Mdl.Variance is a conditional variance model (for example, a garch model object), E0 can require more rows than Mdl.Q to initialize the model.

  • If E0 has more rows than is required to initialize the model, simulate uses only the latest required rows.

Each column of E0 corresponds to a separate, independent presample path. The following conditions apply:

  • If E0 is a column vector, simulate applies it to each simulated path. In this case, the MA components and conditional variance model of all paths in Y derive from the same initial innovations.

  • If E0 is a matrix, simulate applies E0(:,j) to initialize simulating path j. E0 must have at least NumPaths columns; simulate uses only the first NumPaths columns of E0.

By default, simulate sets all necessary presample innovations to 0.

Data Types: double

V0

Presample conditional variance data used to initialize the conditional variance model, specified as a positive numeric column vector or a positive numeric matrix. If the conditional variance Mdl.Variance is constant, simulate ignores V0.

Each row of V0 corresponds to a period in the presample. The following conditions apply:

  • The last row contains the latest presample conditional variances.

  • To initialize the conditional variance model, V0 must have enough rows. For details, see the simulate function of conditional variance models.

  • If V0 has more rows than is required to initialize the conditional variance model, simulate uses only the latest required rows.

Each column of V0 corresponds to a separate, independent presample path. The following conditions apply:

  • If V0 is a column vector, simulate applies it to each simulated path.

  • If V0 is a matrix, simulate applies V0(:,j) to initialize simulating path j. V0 must have at least NumPaths columns; simulate uses only the first NumPaths columns of V0.

By default, simulate sets all necessary presample conditional variances to the unconditional variance of the conditional variance process.

Data Types: double

X

Exogenous predictor data for the regression component in the model, specified as a numObs-by-numPreds numeric matrix.

numPreds is the number of predictor variables (numel(Mdl.Beta)).

Each row of X corresponds to a period in the numObs-period simulation sample, that is, the periods after the presample for which simulate generates observations. The following conditions apply:

  • The last row contains the latest predictor data.

  • If the specified predictor data has more than numObs rows, simulate uses only the latest numObs rows.

  • simulate does not use the regression component in the presample period.

Each column of X corresponds to a separate predictor variable.

simulate applies X to each simulated path; that is, X represents one path of observed predictors.

By default, simulate excludes the regression component, regardless of its presence in Mdl.

Data Types: double
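The row-selection rule for X can be illustrated with a small sketch. The model and predictor data below are hypothetical. The code passes 30 rows of predictor data for a 10-period simulation, then repeats the call with only the last 10 rows and the same random seed.

```matlab
% Hypothetical ARX(1) model with two predictors
Mdl = arima('AR',0.3,'Constant',1,'Variance',0.5,'Beta',[2 -1]);

rng(1)
X = randn(30,2);    % Made-up predictor data with more than numObs rows

rng(2)
YAll = simulate(Mdl,10,'X',X);                 % All 30 rows supplied

rng(2)
YLast = simulate(Mdl,10,'X',X(end-9:end,:));   % Latest 10 rows supplied

maxAbsDiff = max(abs(YAll - YLast))
```

Under the documented behavior, simulate uses only the latest numObs rows, so the two calls should produce the same path.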

Note

  • NaNs in input data indicate missing values. simulate uses listwise deletion to delete all sampled times (rows) in the input data containing at least one missing value. Specifically, simulate performs these steps:

    1. Synchronize, or merge, the presample data sets E0, V0, and Y0 to create the set Presample.

    2. Remove all rows from Presample and the predictor data X containing at least one NaN.

    Listwise deletion applied to the in-sample data can reduce the sample size and create irregular time series.

  • simulate assumes that you synchronize the predictor series such that the most recent observations occur simultaneously. The software also assumes that you synchronize the presample series similarly.
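The listwise deletion steps described in this note can be sketched in a few lines. This is an illustration of the idea with hypothetical presample series of equal length, not the internal code of simulate.

```matlab
% Hypothetical presample series; NaN marks a missing value
Y0 = [1; NaN; 3; 4];
E0 = [0.1; 0.2; NaN; 0.4];

% Step 1: merge the presample series so each row is one time period
Presample = [Y0 E0];

% Step 2: drop every row that contains at least one NaN
hasMissing = any(isnan(Presample),2);
Presample = Presample(~hasMissing,:)
```

Only the first and fourth periods survive, so the remaining rows are no longer adjacent in time, which is how listwise deletion can create an irregular series.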

Output Arguments

Y

Simulated response paths, returned as a length numObs numeric column vector or a numObs-by-NumPaths numeric matrix. Y represents the continuation of the presample responses in Y0.

Each row corresponds to a period in the simulated series; the simulated series has the periodicity of Mdl. Each column is a separate simulated path.

E

Simulated model innovations paths, returned as a length numObs numeric column vector or a numObs-by-NumPaths numeric matrix.

The dimensions of E correspond to the dimensions of Y.

V

Simulated conditional variance paths of the mean-zero innovations associated with Y, returned as a length numObs numeric column vector or a numObs-by-NumPaths numeric matrix.

The dimensions of V correspond to the dimensions of Y.

Version History

Introduced in R2012a