1970
Joseph George Caldwell, PhD (Statistics)
1432 N Camino Mateo, Tucson, AZ 85745-3311 USA
Tel. (001)(520)222-3446, E-mail jcaldwell9@yahoo.com
(File converted to Microsoft Word March 25, 2018, with minor edits.)
Copyright © 2018 Joseph George Caldwell. All rights reserved.
ROAD MAP
1. DEFINITIONS / EXAMPLES / FRAMEWORK
2. THE NATURE OF TIME SERIES MODELING
3. TIME SERIES DESCRIPTORS
4. BASIC TIME SERIES MODELS
5. TIME SERIES MODEL BUILDING (ADDITIONAL DETAILS)
6. BOX-JENKINS MODELS
7. PROGRAMS / DATA REQUIREMENTS / REFERENCES
I. DEFINITIONS / EXAMPLES / FRAMEWORK
· DEFINITION OF TIME SERIES
· EXAMPLES OF TIME SERIES
· TYPES OF TIME SERIES CONSIDERED
· DEFINITION OF TIME SERIES ANALYSIS
WHAT IS A TIME SERIES?
A SET OF OBSERVATIONS ARRANGED CHRONOLOGICALLY
A REALIZATION OF A STOCHASTIC PROCESS
STOCHASTIC PROCESS = A FAMILY OF RANDOM VARIABLES: {X(t), t ∈ T}
EXAMPLES OF TIME SERIES
ECONOMIC
· DAILY COMMODITY OR STOCK PRICES
· MONTHLY INTEREST RATES
· SALES AND PRICES
· CASH BALANCES
PHYSICAL
· DAILY RAINFALL, MEAN DAILY TEMPERATURE
· INDUSTRIAL PROCESS YIELDS
· AIRCRAFT VIBRATION
· ACOUSTIC AND ELECTROMAGNETIC SIGNALS AND NOISE
· TRACKS OF SHIPS, PLANES, RE-ENTRY VEHICLES
BIOLOGICAL
· DAILY EGG PRODUCTION, POPULATION GROWTH
· BRAIN WAVES
TYPES OF TIME SERIES CONSIDERED HERE
DISCRETE, EQUIDISTANT POINTS IN TIME
STOCHASTIC, NOT DETERMINISTIC
E.G., $z_t = z_{t-1} + x_{t-1} + e_t$, NOT $z_t = a \cos(2\pi b t)$
“EMPIRICAL-THEORETICAL” MODELS
E.G., NO KALMAN FILTER (MODEL ESSENTIALLY KNOWN)
EMPHASIS ON FORECASTING (NOT CONTROL)
UNIVARIATE, NOT MULTIVARIATE
WHAT IS TIME SERIES ANALYSIS?
TIME SERIES ANALYSIS IS THE APPLICATION OF MATHEMATICAL TECHNIQUES TO DESCRIBE, PREDICT AND CONTROL TIME SERIES
WHICH TECHNIQUE TO USE DEPENDS ON THE TYPE OF ANSWER WANTED
TWO CATEGORIES OF TIME SERIES ANALYSIS TECHNIQUES:
SPECTRAL ANALYSIS
FREQUENCY RESPONSE STUDIES
MODEL BUILDING
FORECASTING
SIMULATION
CONTROL
II. THE NATURE OF TIME SERIES MODELING
HOW TIME SERIES MODELS ARE USED
PROCEDURE FOR MODEL DEVELOPMENT
GENERAL CLASSES OF TIME SERIES MODELS
CRITERIA FOR SELECTING A TIME SERIES MODEL CLASS
HOW TIME SERIES MODELS ARE USED
REPRESENT IMPORTANT CHARACTERISTICS OF TIME SERIES BY MATHEMATICAL MODEL
DEVELOP OPTIMAL FORECAST OR CONTROL PROCEDURE FOR MODEL
APPLY MODEL-OPTIMAL PROCEDURE IN REAL-WORLD SITUATION
PROCEDURE FOR MODEL DEVELOPMENT
ITERATIVE APPROACH TO MODEL BUILDING:
GENERAL CLASSES OF TIME SERIES MODELS
UNIVARIATE
SINGLE VARIABLE (E.G., SALES OF A MINOR ITEM UNRELATED TO OTHERS)
MULTIPLE VARIABLES, ONE OF WHICH IS DEPENDENT ON THE OTHERS (E.G., CORPORATE QUARTERLY FORECAST; CONTROL PROBLEMS; LEADING INDICATORS)
MULTIVARIATE
MULTIPLE RELATED VARIABLES
FORECAST SALES AND PRICE SIMULTANEOUSLY
COORDINATES OF REENTRY VEHICLE:
CRITERIA FOR SELECTING A TIME SERIES MODEL CLASS
NATURE OF APPLICATION (E.G., FORECASTING, CONTROL, POLICY ANALYSIS)
FORECAST HORIZON (RANGE)
PRECISION REQUIREMENTS
COST (MODEL DEVELOPMENT AND IMPLEMENTATION, DATA COLLECTION)
DATA REQUIREMENTS AND AVAILABILITY
ALL OF THE PRECEDING CONSIDERATIONS ARE TAKEN INTO ACCOUNT IN SELECTING AN APPROPRIATE MODEL CLASS
NATURE OF APPLICATION
FORECASTING, CONTROL OR POLICY ANALYSIS
UNCONDITIONAL FORECASTING: NO REQUIREMENT FOR EXPLANATORY VARIABLES
CONDITIONAL FORECASTING; CONTROL, POLICY ANALYSIS: NEED EXPLANATORY / CONTROL VARIABLES
FORECAST HORIZON
SHORT RANGE: 1-3 PERIODS INTO THE FUTURE
MEDIUM RANGE: 4-12 PERIODS, 1-2 SEASONS
LONG RANGE: BEYOND THE MEDIUM RANGE
MEASUREMENT OF PRECISION
STANDARD ERROR OF THE ESTIMATE:
LONG-RANGE VS SHORT-TERM MODELS:
MODEL COSTS
DATA COLLECTION COSTS
· NUMBER OF VARIABLES, SCOPE
· HISTORY REQUIRED
MODEL DEVELOPMENT COSTS
· PROFESSIONAL TIME (STATISTICAL ANALYSIS; DATABASE DEVELOPMENT; WEB DEVELOPMENT)
· COMPUTER SOFTWARE / HARDWARE / SYSTEMS
MODEL IMPLEMENTATION COSTS
· PERSONNEL TIME
· COMPUTER-RELATED
III. TIME SERIES DESCRIPTORS
BASIC CHARACTERISTICS OF TIME SERIES
DESCRIPTORS OF STATIONARY TIME SERIES
ESTIMATION OF TIME SERIES DESCRIPTORS
BASIC CHARACTERISTICS OF TIME SERIES
CONTINUOUS VS. DISCRETE
EQUISPACED
DETERMINISTIC VS. STOCHASTIC COMPONENTS
(E.G., MEAN MONTHLY TEMPERATURE VS. SALES)
STATIONARY VS. NONSTATIONARY
STATIONARITY
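STRONG STATIONARITY (PROBABILITY DISTRIBUTION INDEPENDENT OF SHIFTS): THE JOINT DISTRIBUTION OF $(z_{t_1}, \ldots, z_{t_m})$ IS THE SAME AS THAT OF $(z_{t_1+k}, \ldots, z_{t_m+k})$ FOR ALL $m$, ALL $t_1, \ldots, t_m$ AND ALL SHIFTS $k$
WEAK STATIONARITY (2ND MOMENTS INDEPENDENT OF SHIFTS): $E[z_t] = \mu$ AND $\mathrm{cov}(z_t, z_{t+k}) = \gamma_k$ FOR ALL $t$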
(NOTE: THIS PRESENTATION USES LOWER-CASE LETTERS BOTH FOR ABSTRACT RANDOM VARIABLES AND THEIR OBSERVED REALIZATIONS.)
DESCRIPTORS OF STATIONARY TIME SERIES
MEAN
VARIANCE
AUTOCORRELATION FUNCTION (ACF)
SPECTRAL DENSITY FUNCTION (SDF)
PARTIAL AUTOCORRELATION FUNCTION (PACF OR PAF)
AUTOCORRELATION FUNCTION (ACF)
(CORRELOGRAM)
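OBSERVATION AT TIME t: $z_t$
MEAN: $\mu = E[z_t]$
VARIANCE: $\sigma_z^2 = E[(z_t - \mu)^2]$
AUTOCOVARIANCE AT LAG k: $\gamma_k = E[(z_t - \mu)(z_{t+k} - \mu)]$
AUTOCORRELATION AT LAG k: $\rho_k = \gamma_k / \gamma_0$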
EXAMPLE
MODEL: OR
WHERE $a_t, a_{t-1}, \ldots$ ARE A SEQUENCE OF UNCORRELATED NORMAL RANDOM VARIABLES WITH MEAN 0 AND VARIANCE 1 (“WHITE NOISE”)
SPECTRAL DENSITY FUNCTION (SDF)
THE SPECTRAL DENSITY FUNCTION IS THE FOURIER COSINE TRANSFORM OF THE AUTOCORRELATION FUNCTION:
THE POWER SPECTRUM IS THE FOURIER COSINE TRANSFORM OF THE AUTOCOVARIANCE FUNCTION (CALLED THE PERIODOGRAM FOR SERIES WITH DETERMINISTIC COMPONENTS):
ESTIMATION OF THE AUTOCORRELATION FUNCTION (ACF)
SAMPLE:
SAMPLE MEAN:
SAMPLE VARIANCE:
SAMPLE AUTOCOVARIANCE FUNCTION:
SAMPLE AUTOCORRELATION FUNCTION:
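The sample quantities listed above can be computed directly from an observed series. The following is a minimal sketch in plain Python, not the author's program: the function name `sample_acf` is hypothetical, and the divisor n in the sample autocovariance (the convention used by Box and Jenkins) is an assumption, since some texts divide by n − k instead.

```python
def sample_acf(z, max_lag):
    """Return (mean, c, r): the sample mean, the sample autocovariances
    c[0..max_lag] and the sample autocorrelations r[0..max_lag]."""
    n = len(z)
    mean = sum(z) / n
    dev = [x - mean for x in z]
    # Assumption: divisor n (not n - k) for every lag k.
    c = [sum(dev[t] * dev[t + k] for t in range(n - k)) / n
         for k in range(max_lag + 1)]
    r = [ck / c[0] for ck in c]  # r[0] is 1 by construction
    return mean, c, r


# Small illustrative series with period 4.
series = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0]
mean, c, r = sample_acf(series, 2)
```

Here `c[0]` is the sample variance under the same divisor-n assumption, so the sample autocorrelation at lag k is simply `c[k] / c[0]`.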
ESTIMATION OF THE SPECTRAL DENSITY FUNCTION (SDF)
FOR SERIES WITH DETERMINISTIC COMPONENTS (MIXTURES OF SINE AND COSINE WAVES AT FIXED FREQUENCIES, BURIED IN NOISE):
THE SAMPLE SPECTRUM IS THE ESTIMATE OF THE PERIODOGRAM:
FOR SERIES WITH RANDOM CHANGES OF FREQUENCY, AMPLITUDE AND PHASE:
THE SMOOTHED SPECTRUM IS THE ESTIMATE OF THE SPECTRUM:
WHERE $\lambda_k$ ARE SUITABLY CHOSEN WEIGHTS, CALLED A LAG WINDOW.
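A lag-window estimate of this kind can be sketched as follows. The sketch assumes the Box-Jenkins normalization, in which the smoothed estimate at frequency f (0 ≤ f ≤ 1/2) is 2[c_0 + 2 Σ λ_k c_k cos(2πfk)], and it assumes a Bartlett window λ_k = 1 − k/M with truncation point M. The window choice, the name `smoothed_spectrum` and the parameter `M` are illustrative assumptions, not taken from the briefing.

```python
import math


def autocovariances(z, max_lag):
    """Sample autocovariances c[0..max_lag] with divisor n (an assumption)."""
    n = len(z)
    mean = sum(z) / n
    dev = [x - mean for x in z]
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def smoothed_spectrum(z, M, freqs):
    """Lag-window spectrum estimate at each frequency in freqs (cycles per period)."""
    c = autocovariances(z, M)
    # Assumed lag window: Bartlett weights, falling linearly to zero at lag M.
    lam = [1.0 - k / M for k in range(M + 1)]
    return [2.0 * (c[0] + 2.0 * sum(lam[k] * c[k] * math.cos(2.0 * math.pi * f * k)
                                    for k in range(1, M + 1)))
            for f in freqs]


# An alternating series has its power concentrated at the highest frequency, f = 0.5.
alternating = [1.0, -1.0] * 8
estimate = smoothed_spectrum(alternating, 2, [0.0, 0.25, 0.5])
```

A smaller M gives a smoother but less sharply resolved estimate; a larger M does the opposite.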
REPRESENTATIONAL EQUIVALENCE
STATIONARY NORMAL STOCHASTIC PROCESSES ARE CHARACTERIZED BY:
· THEORETICAL MODEL
· MEAN, VARIANCE AND AUTOCORRELATION FUNCTION
· MEAN, VARIANCE AND SPECTRAL DENSITY FUNCTION
IV. BASIC TIME SERIES MODELS
EXAMPLES
SUMMARY STATISTICS
WHITE NOISE (WN) PROCESS
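MODEL: $z_t = a_t$
AUTOREGRESSIVE (AR) PROCESS
MODEL: $z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t$
OR $(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p) z_t = a_t$
OR $\phi(B) z_t = a_t$
WHERE THE BACKWARD SHIFT OPERATOR, B, IS DEFINED BY $B z_t = z_{t-1}$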
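To make the autoregressive recursion concrete, the sketch below simulates a zero-mean AR(p) series driven by white noise with mean 0 and variance 1. It assumes the usual form in which each observation is a weighted sum of the p previous observations plus a new shock; the function names, the burn-in length and the seed are hypothetical choices for illustration.

```python
import random


def simulate_ar(phi, n, seed=0, burn_in=200):
    """Simulate n values of z_t = phi[0]*z_{t-1} + ... + phi[p-1]*z_{t-p} + a_t."""
    rng = random.Random(seed)
    p = len(phi)
    z = [0.0] * p  # start-up values; their effect fades during the burn-in
    for _ in range(n + burn_in):
        a_t = rng.gauss(0.0, 1.0)  # white noise shock
        z.append(sum(phi[j] * z[-1 - j] for j in range(p)) + a_t)
    return z[-n:]


def lag1_autocorrelation(z):
    """Sample autocorrelation at lag 1."""
    n = len(z)
    mean = sum(z) / n
    dev = [x - mean for x in z]
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / sum(x * x for x in dev)


ar1_series = simulate_ar([0.6], 5000)
r1 = lag1_autocorrelation(ar1_series)  # should be near phi = 0.6 for a long series
```

Passing an empty coefficient list, `simulate_ar([], n)`, yields pure white noise, the p = 0 special case.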
PARTIAL AUTOCORRELATION FUNCTION (PACF) FOR AUTOREGRESSIVE PROCESSES
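THE PARTIAL AUTOCORRELATION OF LAG k IS THE SOLUTION $\phi_{kk}$ TO $\rho_j = \phi_{k1} \rho_{j-1} + \phi_{k2} \rho_{j-2} + \cdots + \phi_{kk} \rho_{j-k}$, $j = 1, \ldots, k$
FOR AN AUTOREGRESSIVE PROCESS OF ORDER p, $\phi_{kk} = 0$ FOR $k > p$
ESTIMATE THE PACF BY THE LEAST-SQUARES ESTIMATE OF $\phi_{kk}$ IN THE LINEAR MODEL: $z_t = \phi_{k1} z_{t-1} + \phi_{k2} z_{t-2} + \cdots + \phi_{kk} z_{t-k} + a_t$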
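One way to obtain partial autocorrelations from a given autocorrelation function is the Durbin-Levinson recursion, which solves the Yule-Walker equations lag by lag. Note that this is a different technique from the least-squares fit named above; it is shown here only because it is short. The function name `pacf_from_acf` is hypothetical.

```python
def pacf_from_acf(rho, max_lag):
    """Partial autocorrelations phi_11..phi_KK from autocorrelations rho[0..K]
    (rho[0] must be 1), using the Durbin-Levinson recursion."""
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1} from the previous lag
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical ACF of an AR(1) process with phi = 0.6 is rho_k = 0.6 ** k.
ar1_rho = [0.6 ** k for k in range(6)]
ar1_pacf = pacf_from_acf(ar1_rho, 5)  # first value 0.6, the rest essentially zero
```

The AR(1) case illustrates the cut-off property: the partial autocorrelation is nonzero at lag 1 and vanishes at every higher lag.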
MOVING AVERAGE (MA) PROCESS
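MODEL: $z_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$
OR $z_t = (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q) a_t$
OR $z_t = \Theta(B) a_t$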
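The theoretical autocorrelation function of a moving average process follows directly from its coefficients. The sketch below assumes the Box-Jenkins sign convention, in which the MA(q) process is the current shock minus θ-weighted past shocks; the function name `ma_acf` is hypothetical.

```python
def ma_acf(theta, max_lag):
    """Theoretical autocorrelations rho[0..max_lag] of an MA(q) process
    z_t = a_t - theta[0]*a_{t-1} - ... - theta[q-1]*a_{t-q}."""
    q = len(theta)
    psi = [1.0] + [-t for t in theta]  # weights applied to a_t, a_{t-1}, ..., a_{t-q}
    # Autocovariance at lag k, in units of the shock variance; zero beyond lag q.
    gamma = [sum(psi[j] * psi[j + k] for j in range(q - k + 1)) if k <= q else 0.0
             for k in range(max_lag + 1)]
    return [g / gamma[0] for g in gamma]


ma1_rho = ma_acf([0.5], 4)  # lag 1 is -0.5 / 1.25; lags 2 and above are zero
```

The autocorrelations are exactly zero beyond lag q, which is the cut-off behavior used to recognize a moving average process.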
MIXED AUTOREGRESSIVE MOVING AVERAGE (ARMA) PROCESS
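MODEL: $z_t = \phi_1 z_{t-1} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$
OR $(1 - \phi_1 B - \cdots - \phi_p B^p) z_t = (1 - \theta_1 B - \cdots - \theta_q B^q) a_t$
OR $\phi(B) z_t = \Theta(B) a_t$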
SUMMARY STATISTICS
PROCESS | ACF | PACF | SDF
WN | 0 | 0 | CONSTANT
AR | TAILS OFF | CUTS OFF | VARIES
MA | CUTS OFF | TAILS OFF | VARIES
ARMA | TAILS OFF | TAILS OFF | VARIES
V. TIME SERIES MODEL BUILDING (ADDITIONAL DETAILS)
· ITERATIVE APPROACH
· REVIEW OF MODEL CLASSES (“FORECASTING METHODS”)
· COMPARISONS BETWEEN METHODS
ITERATIVE APPROACH TO MODEL BUILDING
THE THEORY-BASED ITERATIVE APPROACH TO DEVELOPMENT OF TIME SERIES MODELS WAS DEVELOPED AND PROMOTED BY G. E. P. BOX AND GWILYM JENKINS, STARTING IN THE 1960s. THE DESCRIPTOR “BOX-JENKINS” IS APPLIED EITHER TO DESCRIBE THEIR APPROACH TO DEVELOPING TIME-SERIES MODELS, OR TO DESCRIBE THE CLASS OF AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODELS EMPLOYED IN THEIR APPROACH.
TYPES OF FORECASTING MODELS
General model class | Model type | Usual forecast horizon | Accuracy, short term (ST) | Accuracy, medium term (MT) | Accuracy, long term (LT) | Cost to develop | Cost to implement
Qualitative | Delphi | LT | MED | MED | MED | HI | HI
Qualitative | Market research | MT | HI | MED | MED | HI | HI
Qualitative | Panel consensus | LT | LO | LO | LO | HI | HI
Qualitative | Visionary | LT | LO | LO | LO | HI | HI
Qualitative | Historical analogy | LT | LO | MED | MED | HI | HI
Single variable | Moving average | ST | LO | LO | LO | LO | VLO
Single variable | Exponential smoothing | ST | MED | LO | LO | LO | VLO
Single variable | Univariate time series models (“Box-Jenkins,” AR, MA, ARMA, ARIMA) | ST | HI | MED | LO | MED | VLO
Single variable | X-11 | ST | MED | MED | LO | LO | LO
Single variable | Trend projections | ST | MED | MED | MED | LO | VLO
Single variable | Neural network | ST | HI | MED | LO | MED | LO
Multiple variable univariate | Regression model (BJ) | MT | HI | MED | LO | MED | LO
Multiple variable univariate | Econometric model | MT | MED | HI | MED | HI | LO
Multiple variable univariate | Anticipation survey | MT | MED | LO | LO | HI | HI
Multiple variable univariate | Input-output model | LT | LO | MED | MED | VHI | MED
Multiple variable univariate | Economic input-output | LT | LO | HI | HI | VHI | MED
Multiple variable univariate | Diffusion index | MT | LO | LO | LO | HI | LO
Multiple variable univariate | Leading indicator (BJ) | ST | HI | MED | LO | MED | LO
Multiple variable univariate | Life-cycle analysis | MT | LO | LO | LO | HI | HI
Multiple variable univariate | Neural network | ST | HI | MED | LO | MED | LO
Multivariate | Econometric simultaneous-equation model (“SEM”) | MT | HI | MED | LO | VHI | HI
Multivariate | Multivariate time series models (“Box-Jenkins,” VAR, VARMA, VARX, MARMA, MARIMA, Kalman filter) | MT | HI | MED | LO | MED | LO
FORECASTS BASED ON UNVALIDATED MODELS
MODEL IS “FITTED” TO DATA – LITTLE OR NO DIAGNOSTIC CHECKING (MODEL VALIDATION)
· MOVING AVERAGE
· EXPONENTIAL SMOOTHING
· REGRESSION ON COMPONENTS
· TREND, SEASONAL PATTERNS, FOURIER COMPONENTS
· NEURAL NETWORK
FORECASTS BASED ON VALIDATED MODELS
· SINGLE VARIABLE
o BOX-JENKINS
· MULTIPLE VARIABLE (THEORETICAL OR EMPIRICAL MODEL)
o BOX-JENKINS
o ECONOMETRIC
SOME FORECASTING ACCURACY COMPARISONS
STOCHASTIC VS. INTUITIVE
Forecast Error Variance
Lead time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
MSE (Brown) | 102 | 158 | 218 | 256 | 363 | 452 | 554 | 669 | 799 | 944
MSE (BJ) | 42 | 91 | 136 | 180 | 222 | 266 | 317 | 371 | 427 | 483
ECONOMETRIC VS. STOCHASTIC
Theil Coefficient
Model | Price | Quantity
Econometric | 0.80 | 0.65
Box-Jenkins | 1.00 | 0.70
Random walk | 1.00 | 1.00
Mean | 18.23 | 0.96
IBM STOCK PRICE SERIES WITH COMPARISON OF LEAD-3 FORECASTS OBTAINED FROM BEST IMA(0,1,1) PROCESS AND BROWN’S QUADRATIC FORECAST FOR A PERIOD BEGINNING JULY 11, 1960. (REF. BJ p. 168)
HYBRID MODELS
· TREND REMOVAL
· REGRESSION + BOX-JENKINS
VI. BOX-JENKINS MODELS
· MODEL IDENTIFICATION
· PARAMETER ESTIMATION
· DIAGNOSTIC CHECKING
· FORECASTING
· EXAMPLE
BOX-JENKINS MODEL CLASS
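AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODEL: $\phi(B)(1 - B)^d z_t = \Theta(B) a_t$
OR $\phi(B) w_t = \Theta(B) a_t$
WHERE $w_t = (1 - B)^d z_t$
AND $w_t$
IS STATIONARY. THAT IS, $w_t = \phi_1 w_{t-1} + \cdots + \phi_p w_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$
WHERE THE $a_t$ ARE WHITE NOISE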
MODEL IDENTIFICATION
APPLY SUCCESSIVE DIFFERENCING (OR OTHER PROCEDURE) TO ORIGINAL DATA UNTIL STATIONARITY IS ACHIEVED (ACF DIES OUT): $w_t = (1 - B)^d z_t$
EXAMINE THE ACF AND PACF OF $w_t$ TO SUGGEST A TENTATIVE MODEL (I.E., IDENTIFY THE DEGREE AND STRUCTURE OF φ(B) AND Θ(B) POLYNOMIALS).
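Successive differencing is simple to carry out. The sketch below applies the first difference d times; the function name `difference` is hypothetical, and the quadratic series is only a toy example of data that needs two differences.

```python
def difference(z, d=1):
    """Apply the first-difference operator (1 - B) to the series d times."""
    w = list(z)
    for _ in range(d):
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    return w


quadratic_trend = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
first_diff = difference(quadratic_trend, 1)   # still trending upward
second_diff = difference(quadratic_trend, 2)  # constant, so no trend remains
```

Each pass shortens the series by one observation, so d differences leave n − d values for estimating the ACF and PACF.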
IDENTIFICATION EXAMPLES
BEHAVIOR OF THE AUTOCORRELATION FUNCTIONS FOR THE d-th DIFFERENCE OF AN ARIMA PROCESS OF ORDER (p,d,q). REFERENCE: BJ p. 176.
PARAMETER ESTIMATION
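IF THE MODEL IS AUTOREGRESSIVE (NO θs), WE HAVE A LINEAR STATISTICAL MODEL: $z = X\phi + a$
WHERE $z = (z_n, z_{n-1}, \ldots, z_{p+1})'$, $\phi = (\phi_1, \ldots, \phi_p)'$, $a = (a_n, a_{n-1}, \ldots, a_{p+1})'$, AND $X$ IS THE $(n - p) \times p$ MATRIX WHOSE ROW FOR TIME $t$ IS $(z_{t-1}, \ldots, z_{t-p})$
SO THE LEAST-SQUARES ESTIMATE OF φ IS $\hat{\phi} = (X'X)^{-1} X'z$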
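The least-squares fit of an autoregressive model can be sketched in plain Python by forming and solving the normal equations. The sketch assumes the series has already been centered (zero mean), and the function name `fit_ar_least_squares` is hypothetical. A production program would normally use a numerical linear algebra library rather than the hand-written elimination shown here.

```python
def fit_ar_least_squares(z, p):
    """Least-squares AR(p) coefficients, obtained by solving (X'X) phi = X'z."""
    # Row for time t holds the p lagged values z_{t-1}, ..., z_{t-p}.
    rows = [[z[t - j] for j in range(1, p + 1)] for t in range(p, len(z))]
    y = [z[t] for t in range(p, len(z))]
    # Augmented matrix [X'X | X'z].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yt for r, yt in zip(rows, y))]
         for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for i in range(col + 1, p):
            factor = A[i][col] / A[col][col]
            for j in range(col, p + 1):
                A[i][j] -= factor * A[col][j]
    # Back substitution.
    phi = [0.0] * p
    for i in reversed(range(p)):
        s = A[i][p] - sum(A[i][j] * phi[j] for j in range(i + 1, p))
        phi[i] = s / A[i][i]
    return phi


# A noise-free AR(2) recursion, so the fit should recover the coefficients.
z = [1.0, 2.0]
for _ in range(10):
    z.append(0.5 * z[-1] - 0.3 * z[-2])
phi_hat = fit_ar_least_squares(z, 2)
```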
IF θs ARE PRESENT, THE MODEL IS A NONLINEAR STATISTICAL MODEL:
I.E.,
IS A LINEAR MODEL WITH PARAMETER δ = β – β0. DETERMINE THE PARAMETER ESTIMATES ITERATIVELY.
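The iterative linearization can be illustrated for the simplest nonlinear case, a first-order moving average. The sketch below is a Gauss-Newton iteration under several assumptions that are not in the briefing: the model is taken to be an MA(1) process with the shock before the first observation set to zero, the step is clamped to keep θ inside the invertibility region, and the names `ma1_residuals` and `fit_ma1` are hypothetical.

```python
import random


def ma1_residuals(z, theta):
    """Residuals a_t = z_t + theta * a_{t-1} (with a_0 = 0) and their
    derivatives d a_t / d theta, computed recursively."""
    a, d = [], []
    a_prev = d_prev = 0.0
    for z_t in z:
        a_t = z_t + theta * a_prev
        d_t = a_prev + theta * d_prev
        a.append(a_t)
        d.append(d_t)
        a_prev, d_prev = a_t, d_t
    return a, d


def fit_ma1(z, theta0=0.0, iterations=20):
    """Gauss-Newton estimate of theta: at each step, regress the current
    residuals on the negated derivatives to get the correction delta."""
    theta = theta0
    for _ in range(iterations):
        a, d = ma1_residuals(z, theta)
        delta = -sum(ai * di for ai, di in zip(a, d)) / sum(di * di for di in d)
        theta = max(-0.99, min(0.99, theta + delta))  # assumed safeguard
    return theta


# Simulated MA(1) series z_t = a_t - 0.5 * a_{t-1} for illustration.
rng = random.Random(1)
shocks = [rng.gauss(0.0, 1.0) for _ in range(2001)]
series = [shocks[t] - 0.5 * shocks[t - 1] for t in range(1, 2001)]
theta_hat = fit_ma1(series)
```

Each pass solves a linear least-squares problem for the correction δ and then re-linearizes around the updated value, which is the sense in which the estimates are determined iteratively.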
TESTS OF MODEL ADEQUACY
PROPERTIES OF φ(B) AND Θ(B) POLYNOMIALS (STATIONARITY, INVERTIBILITY, COMMON FACTORS)
TESTS OF SIGNIFICANCE OF MODEL PARAMETERS
OVERFITTING
TEST RESIDUALS FOR “WHITENESS”
INFORMATION CRITERIA (E.G., AKAIKE INFORMATION CRITERION, BAYES INFORMATION CRITERION)
DEVELOP MODEL FOR DIFFERENT PERIODS OF DATA
DIAGNOSTIC CHECKING OF RESIDUALS
SIGNIFICANCE OF VARIOUS STATISTICS COMPUTED FROM RESIDUALS:
· MEAN (t-TEST)
· PACF (t-TEST ON EACH VALUE)
· ACF (t-TEST ON EACH VALUE, χ² TEST ON ENTIRE FUNCTION)
· SPECTRUM (GRENANDER-ROSENBLATT TEST, KOLMOGOROV-SMIRNOV TEST)
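The chi-square test on the residual autocorrelation function as a whole is commonly carried out with the Box-Pierce portmanteau statistic, Q = n Σ r_k² summed over lags 1 to K, which is compared with a chi-square distribution on K − p − q degrees of freedom. The sketch below computes Q only; treating this as the test meant here is an assumption, and the function name `box_pierce_q` is hypothetical.

```python
def box_pierce_q(residuals, max_lag):
    """Box-Pierce statistic: n times the sum of squared residual
    autocorrelations at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(x * x for x in dev) / n
    total = 0.0
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        total += (ck / c0) ** 2
    return n * total


# Strongly alternating "residuals" are far from white, so Q is large.
q_stat = box_pierce_q([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0], 1)
```

A value of Q well above the tabulated chi-square percentage point indicates that the residuals are not white and that the fitted model is inadequate.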
MODEL IDENTIFICATION
THE PACF AND ACF CAN SUGGEST ADDITION OF φ AND θ COMPONENTS.
A PRELIMINARY FITTED MODEL IS:
THE $e_t$’s ARE CORRELATED; A MODEL FOR THE $e_t$’s IS:
THE MODIFIED MODEL IS:
EXAMPLE OF MODEL IDENTIFICATION
SUPPOSE THAT THE CORRECT MODEL IS OF ORDER (0,2,2), BUT THAT THE FITTED MODEL IS
SUPPOSE THAT THE MODEL SUGGESTED FOR THE RESIDUALS IS
THE COMBINED MODEL IS
THIS SUGGESTS THAT A MODEL OF ORDER (0,2,2) SHOULD BE EXAMINED.
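The reasoning can be written out under assumed forms for the two models. The (0,1,1) fitted model and the (0,1,1) residual model below are illustrative assumptions, not necessarily the equations on the original slide. Applying $(1 - B)$ to both sides of the fitted model and substituting the residual model gives the combined model:

```latex
\begin{aligned}
\text{fitted model:}   \quad & (1 - B)\, z_t = (1 - \theta_1 B)\, e_t \\
\text{residual model:} \quad & (1 - B)\, e_t = (1 - \theta_2 B)\, a_t \\
\text{combined model:} \quad & (1 - B)^2 z_t = (1 - \theta_1 B)(1 - \theta_2 B)\, a_t
\end{aligned}
```

The combined model has two differences and a second-degree moving average polynomial, which is the (0,2,2) form.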
MODEL SIMPLIFICATION
A MODEL OF THE FORM
MIGHT BE REDUCIBLE TO
IF θ IS CLOSE TO 1.
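A typical case of such a simplification, assumed here for illustration since the slide's own equations are not shown, is the near-cancellation of a differencing factor against a moving average factor:

```latex
\begin{aligned}
(1 - B)\, z_t &= (1 - \theta B)\, a_t \\
\theta \to 1: \quad (1 - B)\, z_t &\approx (1 - B)\, a_t
  \quad\Longrightarrow\quad z_t \approx c + a_t
\end{aligned}
```

Here $c$ is a constant level, so the series behaves like white noise about a fixed mean and the difference is unnecessary.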
CAUTION
ESTIMATED AUTOCORRELATIONS MAY BE HIGHLY CORRELATED, AND MAY HAVE LARGE VARIANCES.
USE THE ACF ONLY TO SUGGEST MODELS TO FIT (ESTIMATE).
RELY ON DIAGNOSTIC CHECKS TO ACCEPT OR REJECT FITTED MODELS.
OPTIMAL FORECASTER
THE OPTIMAL FORECASTER MINIMIZES THE MEAN SQUARED ERROR OF PREDICTION.
WHERE
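The minimum mean squared error forecast at lead time l is the conditional expectation of the future value given the observations up to the forecast origin. As a minimal sketch, assuming a zero-mean AR(1) model with known parameter φ and shock variance σ² (an illustrative special case, with a hypothetical function name), the forecasts and their error variances have closed forms:

```python
def ar1_forecast(z_t, phi, sigma2, max_lead):
    """Minimum-MSE forecasts and forecast error variances for a zero-mean
    AR(1) process, for lead times 1..max_lead from origin value z_t."""
    forecasts, variances = [], []
    variance = 0.0
    for lead in range(1, max_lead + 1):
        # Forecast: phi**lead * z_t, since future shocks have expectation zero.
        forecasts.append(phi ** lead * z_t)
        # Error variance accumulates sigma2 * phi**(2j) for j = 0..lead-1.
        variance += sigma2 * phi ** (2 * (lead - 1))
        variances.append(variance)
    return forecasts, variances


forecasts, variances = ar1_forecast(2.0, 0.5, 1.0, 3)
```

The forecasts decay toward the process mean while the error variance grows with lead time, which is why precision falls as the forecast horizon lengthens.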
SEASONALITY
A TENTATIVE MODEL IS
WHERE $e_t$ IS CORRELATED WITH $e_{t-1}, e_{t-2}, \ldots$.
SUPPOSE THAT WE CAN REPRESENT THE RESIDUALS BY
WHERE THE $a_t$ ARE WHITE.
THEN WE MAY COMBINE THESE RESULTS TO OBTAIN
EXPONENTIAL SMOOTHING FORECASTER
A SPECIAL CASE OF
WITH
I.E.,
THE LEAST-SQUARES FORECASTER IS:
OR
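The standard correspondence is that simple exponential smoothing is the one-step forecast of an integrated moving average process of order (0,1,1), with smoothing constant α = 1 − θ. The sketch below assumes that correspondence; the function name and the need to supply a starting forecast are illustrative choices.

```python
def exp_smoothing_forecasts(z, theta, initial_forecast):
    """One-step-ahead forecasts: new = (1 - theta) * z_t + theta * previous.
    Returns the forecast made after each observation."""
    alpha = 1.0 - theta  # smoothing constant
    forecast = initial_forecast
    out = []
    for z_t in z:
        forecast = alpha * z_t + (1.0 - alpha) * forecast
        out.append(forecast)
    return out


forecasts = exp_smoothing_forecasts([10.0, 12.0, 11.0], 0.5, 10.0)
```

A value of θ near 1 gives heavy smoothing (slow response to new data), while θ near 0 makes the forecast track the latest observation.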
MOVING AVERAGE FORECASTER
A SPECIAL CASE OF
WITH
THE LEAST-SQUARES FORECASTER IS
BOX-JENKINS SAMPLE PROBLEM
ACF OF SUCCESSIVE DIFFERENCES
TENTATIVE MODEL
MODEL FORM (ACF SPIKES AT LAGS 1 AND 12)
PARAMETER ESTIMATION: $\theta_1 = 0.40$, $\theta_{12} = 0.61$; SO FITTED MODEL IS:
OR
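If the moving average part of the fitted model has the multiplicative seasonal form (1 − θ₁B)(1 − θ₁₂B¹²), which is an assumption here since the text gives only the two estimates, the expanded polynomial can be obtained by multiplying the two factors. The helper name `poly_multiply` is hypothetical.

```python
def poly_multiply(a, b):
    """Multiply two polynomials in B given as coefficient lists
    (index = power of B)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


nonseasonal = [1.0, -0.40]                   # 1 - 0.40 B
seasonal = [1.0] + [0.0] * 11 + [-0.61]      # 1 - 0.61 B^12
combined = poly_multiply(nonseasonal, seasonal)
# Nonzero coefficients appear only at powers 0, 1, 12 and 13.
```

Under this assumed form, the product introduces a term in B¹³ whose coefficient is the product of the two estimates.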
DIAGNOSTIC CHECKS:
TOPICS NOT COVERED IN THIS BRIEFING
MULTIPLE VARIABLE (TRANSFER FUNCTION) MODELS
· CROSS CORRELATION FUNCTION (PREWHITENING OF INPUT)
· CROSS SPECTRAL ANALYSIS
· DESIGN OF EXPERIMENTS
MULTIVARIATE MODELS
· MULTIPLE CORRELATION COEFFICIENT
· MULTIPLE COHERENCY SPECTRUM
OPTIMAL CONTROL
· FEED FORWARD CONTROL
· FEED BACK CONTROL
PHYSICAL MODELS
· KALMAN FILTER
NONLINEAR MODELS
MODEL EXTENSIONS
· AUTOREGRESSIVE CONDITIONAL HETEROSKEDASTICITY (ARCH) AND GENERALIZED AUTOREGRESSIVE CONDITIONAL HETEROSKEDASTICITY (GARCH) MODELS
VII. PROGRAMS / DATA REQUIREMENTS / REFERENCES
THE FIRST COMMERCIALLY AVAILABLE GENERAL-PURPOSE BOX-JENKINS FORECASTING PROGRAM WAS TIMES, DEVELOPED BY THE AUTHOR IN 1968-70.
UNIVARIATE BOX-JENKINS ANALYSIS PROGRAMS ARE NOW INCLUDED IN ALL MAJOR GENERAL-PURPOSE STATISTICAL SOFTWARE PACKAGES (Stata, SAS, SPSS), AND MULTIVARIATE BOX-JENKINS ANALYSIS PROGRAMS ARE FREELY AVAILABLE IN R.
A MINIMUM OF ABOUT 100 DATA POINTS IS REQUIRED TO DEVELOP A BOX-JENKINS MODEL.
THE SEMINAL REFERENCE FOR THE BOX-JENKINS METHODOLOGY IS
BOX, G. E. P., AND GWILYM M. JENKINS, TIME SERIES ANALYSIS: FORECASTING AND CONTROL, FIRST EDITION, HOLDEN-DAY, 1970. THE LATEST EDITION IS THE 5TH EDITION, BY GEORGE E. P. BOX, GWILYM M. JENKINS, GREGORY C. REINSEL AND GRETA M. LJUNG, WILEY, 2016.
OTHER REFERENCES INCLUDE
CRYER, JONATHAN D., AND KUNG-SIK CHAN, TIME SERIES ANALYSIS WITH APPLICATIONS IN R, 2ND ED., SPRINGER, 2008.
TSAY, RUEY S., MULTIVARIATE TIME SERIES ANALYSIS WITH R AND FINANCIAL APPLICATIONS, WILEY, 2014.
LÜTKEPOHL, HELMUT, NEW INTRODUCTION TO MULTIPLE TIME SERIES ANALYSIS, SPRINGER, 2006.
HAMILTON, JAMES D., TIME SERIES ANALYSIS, PRINCETON UNIVERSITY PRESS, 1994.
A VERSION OF THE TIMES REFERENCE MANUAL IS POSTED AT INTERNET WEBSITE http://www.foundationwebsite.org/TIMESVol1TechnicalBackground.pdf .