UNIVARIATE TIME SERIES ANALYSIS

BRIEFING

1970

Joseph George Caldwell, PhD (Statistics)

1432 N Camino Mateo, Tucson, AZ 85745-3311 USA

Tel. (001)(520)222-3446, E-mail jcaldwell9@yahoo.com

(File converted to Microsoft Word March 25, 2018, with minor edits.)

Copyright © 2018 Joseph George Caldwell.  All rights reserved.


ROAD MAP

1.     DEFINITIONS / EXAMPLES / FRAMEWORK

2.     THE NATURE OF TIME SERIES MODELING

3.     TIME SERIES DESCRIPTORS

4.     BASIC TIME SERIES MODELS

5.     TIME SERIES MODEL BUILDING (ADDITIONAL DETAILS)

6.     BOX-JENKINS MODELS

7.     PROGRAMS / DATA REQUIREMENTS / REFERENCES


I.     DEFINITIONS / EXAMPLES / FRAMEWORK

·        DEFINITION OF TIME SERIES

·        EXAMPLES OF TIME SERIES

·        TYPES OF TIME SERIES CONSIDERED

·        DEFINITION OF TIME SERIES ANALYSIS


WHAT IS A TIME SERIES?


A SET OF OBSERVATIONS ARRANGED CHRONOLOGICALLY

A REALIZATION OF A STOCHASTIC PROCESS

            STOCHASTIC PROCESS = A FAMILY OF RANDOM VARIABLES: {X(t), t ∈ T}


EXAMPLES OF TIME SERIES

ECONOMIC

·        DAILY COMMODITY OR STOCK PRICES

·        MONTHLY INTEREST RATES

·        SALES AND PRICES

·        CASH BALANCES

PHYSICAL

·        DAILY RAINFALL, MEAN DAILY TEMPERATURE

·        INDUSTRIAL PROCESS YIELDS

·        AIRCRAFT VIBRATION

·        ACOUSTIC AND ELECTROMAGNETIC SIGNALS AND NOISE

·        TRACKS OF SHIPS, PLANES, RE-ENTRY VEHICLES

BIOLOGICAL

·        DAILY EGG PRODUCTION, POPULATION GROWTH

·        BRAIN WAVES


TYPES OF TIME SERIES CONSIDERED HERE

DISCRETE, EQUIDISTANT POINTS IN TIME

STOCHASTIC, NOT DETERMINISTIC

E.G., z_t = z_{t-1} + x_{t-1} + e_t, NOT z_t = a cos(2πbt)

“EMPIRICAL-THEORETICAL” MODELS

E.G., NO KALMAN FILTER (MODEL ESSENTIALLY KNOWN)

EMPHASIS ON FORECASTING (NOT CONTROL)

UNIVARIATE, NOT MULTIVARIATE



WHAT IS TIME SERIES ANALYSIS?

TIME SERIES ANALYSIS IS THE APPLICATION OF MATHEMATICAL TECHNIQUES TO DESCRIBE, PREDICT AND CONTROL TIME SERIES

WHICH TECHNIQUE TO USE DEPENDS ON THE TYPE OF ANSWER WANTED

TWO CATEGORIES OF TIME SERIES ANALYSIS TECHNIQUES:

·        SPECTRAL ANALYSIS

o   FREQUENCY RESPONSE STUDIES

·        MODEL BUILDING

o   FORECASTING

o   SIMULATION

o   CONTROL


II.    THE NATURE OF TIME SERIES MODELING

HOW TIME SERIES MODELS ARE USED

PROCEDURE FOR MODEL DEVELOPMENT

GENERAL CLASSES OF TIME SERIES MODELS

CRITERIA FOR SELECTING A TIME SERIES MODEL CLASS


HOW TIME SERIES MODELS ARE USED

REPRESENT IMPORTANT CHARACTERISTICS OF TIME SERIES BY MATHEMATICAL MODEL

DEVELOP OPTIMAL FORECAST OR CONTROL PROCEDURE FOR MODEL

APPLY MODEL-OPTIMAL PROCEDURE IN REAL-WORLD SITUATION


PROCEDURE FOR MODEL DEVELOPMENT

ITERATIVE APPROACH TO MODEL BUILDING:

[FIGURE NOT REPRODUCED]


GENERAL CLASSES OF TIME SERIES MODELS

UNIVARIATE

SINGLE VARIABLE (E.G., SALES OF A MINOR ITEM UNRELATED TO OTHERS)

MULTIPLE VARIABLES, ONE OF WHICH IS DEPENDENT ON THE OTHERS (E.G., CORPORATE QUARTERLY FORECAST; CONTROL PROBLEMS; LEADING INDICATORS)

MULTIVARIATE

MULTIPLE RELATED VARIABLES

FORECAST SALES AND PRICE SIMULTANEOUSLY

COORDINATES OF REENTRY VEHICLE:


CRITERIA FOR SELECTING A TIME SERIES MODEL CLASS

NATURE OF APPLICATION (E.G., FORECASTING, CONTROL, POLICY ANALYSIS)

FORECAST HORIZON (RANGE)

PRECISION REQUIREMENTS

COST (MODEL DEVELOPMENT AND IMPLEMENTATION, DATA COLLECTION)

DATA REQUIREMENTS AND AVAILABILITY

ALL OF THE PRECEDING CONSIDERATIONS ARE TAKEN INTO ACCOUNT IN SELECTING AN APPROPRIATE MODEL CLASS


NATURE OF APPLICATION

FORECASTING, CONTROL OR POLICY ANALYSIS

UNCONDITIONAL FORECASTING: NO REQUIREMENT FOR EXPLANATORY VARIABLES

CONDITIONAL FORECASTING; CONTROL, POLICY ANALYSIS: NEED EXPLANATORY / CONTROL VARIABLES


FORECAST HORIZON

SHORT RANGE: 1-3 PERIODS INTO THE FUTURE

MEDIUM RANGE: 4-12 PERIODS, 1-2 SEASONS

LONG RANGE: BEYOND 12 PERIODS OR 2 SEASONS


MEASUREMENT OF PRECISION

STANDARD ERROR OF THE ESTIMATE:

[FIGURE NOT REPRODUCED]

LONG-RANGE VS. SHORT-TERM MODELS:

[FIGURE NOT REPRODUCED]


MODEL COSTS

DATA COLLECTION COSTS

·        NUMBER OF VARIABLES, SCOPE

·        HISTORY REQUIRED

MODEL DEVELOPMENT COSTS

·        PROFESSIONAL TIME (STATISTICAL ANALYSIS; DATABASE DEVELOPMENT; WEB DEVELOPMENT)

·        COMPUTER SOFTWARE / HARDWARE / SYSTEMS

MODEL IMPLEMENTATION COSTS

·        PERSONNEL TIME

·        COMPUTER-RELATED


III.   TIME SERIES DESCRIPTORS

BASIC CHARACTERISTICS OF TIME SERIES

DESCRIPTORS OF STATIONARY TIME SERIES

ESTIMATION OF TIME SERIES DESCRIPTORS


BASIC CHARACTERISTICS OF TIME SERIES

CONTINUOUS VS. DISCRETE

EQUISPACED

DETERMINISTIC VS. STOCHASTIC COMPONENTS

(E.G., MEAN MONTHLY TEMPERATURE VS. SALES)

STATIONARY VS. NONSTATIONARY


STATIONARITY


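STRONG STATIONARITY (PROBABILITY DISTRIBUTION INDEPENDENT OF SHIFTS): FOR EVERY n, EVERY SET OF TIMES $t_1, \dots, t_n$ AND EVERY SHIFT k, THE JOINT DISTRIBUTION OF $(z_{t_1}, \dots, z_{t_n})$ IS THE SAME AS THAT OF $(z_{t_1+k}, \dots, z_{t_n+k})$.

WEAK STATIONARITY (2ND MOMENTS INDEPENDENT OF SHIFTS): $E[z_t] = \mu$ AND $\mathrm{Cov}(z_t, z_{t+k}) = \gamma_k$ FOR ALL t, SO THE MEAN IS CONSTANT AND THE AUTOCOVARIANCE DEPENDS ONLY ON THE LAG k.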

(NOTE: THIS PRESENTATION USES LOWER-CASE LETTERS BOTH FOR ABSTRACT RANDOM VARIABLES AND THEIR OBSERVED REALIZATIONS.)

DESCRIPTORS OF STATIONARY TIME SERIES

MEAN

VARIANCE

AUTOCORRELATION FUNCTION (ACF)

SPECTRAL DENSITY FUNCTION (SDF)

PARTIAL AUTOCORRELATION FUNCTION (PACF OR PAF)


AUTOCORRELATION FUNCTION (ACF)

(CORRELOGRAM)

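OBSERVATION AT TIME t: $z_t$

MEAN: $\mu = E[z_t]$

VARIANCE: $\sigma_z^2 = E[(z_t - \mu)^2] = \gamma_0$

AUTOCOVARIANCE AT LAG k: $\gamma_k = E[(z_t - \mu)(z_{t+k} - \mu)]$

AUTOCORRELATION AT LAG k: $\rho_k = \gamma_k / \gamma_0$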



EXAMPLE

MODEL:                      OR          

WHERE a_t, a_{t-1}, … ARE A SEQUENCE OF UNCORRELATED NORMAL RANDOM VARIABLES WITH MEAN 0 AND VARIANCE 1 (“WHITE NOISE”)



SPECTRAL DENSITY FUNCTION (SDF)

THE SPECTRAL DENSITY FUNCTION IS THE FOURIER COSINE TRANSFORM OF THE AUTOCORRELATION FUNCTION:

THE POWER SPECTRUM IS THE FOURIER COSINE TRANSFORM OF THE AUTOCOVARIANCE FUNCTION (CALLED THE PERIODOGRAM FOR SERIES WITH DETERMINISTIC COMPONENTS):



ESTIMATION OF THE AUTOCORRELATION FUNCTION (ACF)

SAMPLE:

SAMPLE MEAN:

SAMPLE VARIANCE:

SAMPLE AUTOCOVARIANCE FUNCTION:

SAMPLE AUTOCORRELATION FUNCTION:
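A MINIMAL PYTHON SKETCH OF THE SAMPLE ACF COMPUTATION FOLLOWS.  IT ASSUMES THE COMMON CONVENTION OF DIVIDING EVERY SAMPLE AUTOCOVARIANCE BY n (NOT n - k); THE FUNCTION NAME sample_acf IS ILLUSTRATIVE AND DOES NOT COME FROM ANY PARTICULAR PROGRAM.

```python
def sample_acf(z, max_lag):
    """Return the sample autocorrelations r_0, ..., r_max_lag of the series z."""
    n = len(z)
    mean = sum(z) / n                      # sample mean
    dev = [x - mean for x in z]            # deviations from the sample mean

    def autocov(k):
        # Sample autocovariance at lag k, divisor n (assumed convention).
        return sum(dev[t] * dev[t + k] for t in range(n - k)) / n

    c0 = autocov(0)                        # sample variance (divisor n)
    return [autocov(k) / c0 for k in range(max_lag + 1)]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
r = sample_acf(series, 2)                  # r[0] is always 1
```

FOR THIS SHORT LINEAR SERIES THE LAG-1 AND LAG-2 VALUES WORK OUT TO 0.4 AND -0.1.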


ESTIMATION OF THE SPECTRAL DENSITY FUNCTION (SDF)

FOR SERIES WITH DETERMINISTIC COMPONENTS (MIXTURES OF SINE AND COSINE WAVES AT FIXED FREQUENCIES, BURIED IN NOISE):

THE SAMPLE SPECTRUM IS THE ESTIMATE OF THE PERIODOGRAM:

FOR SERIES WITH RANDOM CHANGES OF FREQUENCY, AMPLITUDE AND PHASE:

THE SMOOTHED SPECTRUM IS THE ESTIMATE OF THE SPECTRUM:

WHERE λk ARE SUITABLY CHOSEN WEIGHTS, CALLED A LAG WINDOW.
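THE LAG-WINDOW IDEA CAN BE SKETCHED IN PYTHON AS FOLLOWS.  THIS IS AN ILLUSTRATION UNDER STATED ASSUMPTIONS, NOT THE BRIEFING'S OWN FORMULA: IT ASSUMES THE NORMALIZATION 2{c_0 + 2 Σ λ_k c_k cos(2πfk)} FOR FREQUENCIES 0 ≤ f ≤ 1/2, AND IT USES THE BARTLETT (TRIANGULAR) WINDOW AS ONE POSSIBLE CHOICE OF WEIGHTS.  ALL FUNCTION NAMES ARE HYPOTHETICAL.

```python
import math


def autocovariances(z, max_lag):
    """Sample autocovariances c_0..c_max_lag (divisor n, an assumed convention)."""
    n = len(z)
    mean = sum(z) / n
    dev = [x - mean for x in z]
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def bartlett_window(m):
    """Bartlett lag window: weights 1 - k/m for k = 0, ..., m-1."""
    return [1.0 - k / m for k in range(m)]


def smoothed_spectrum(c, weights, f):
    """Weighted cosine transform of the autocovariances c at frequency f.

    Assumed normalization: 2 * {c_0 + 2 * sum_k weights[k] * c[k] * cos(2*pi*f*k)}.
    With all weights equal to 1 this reduces to the unsmoothed sample spectrum.
    """
    total = c[0]
    for k in range(1, min(len(c), len(weights))):
        total += 2.0 * weights[k] * c[k] * math.cos(2.0 * math.pi * f * k)
    return 2.0 * total


z = [2.0, 4.0, 3.0, 5.0, 4.0, 6.0, 5.0, 7.0]
c = autocovariances(z, 3)
w = bartlett_window(4)
# Evaluate the estimate on a grid of frequencies from 0 to 0.5.
estimate = [smoothed_spectrum(c, w, j / 10) for j in range(6)]
```

A NARROWER WINDOW (SMALLER m) GIVES A SMOOTHER BUT MORE BIASED ESTIMATE; A WIDER WINDOW DOES THE OPPOSITE.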


REPRESENTATIONAL EQUIVALENCE

STATIONARY NORMAL STOCHASTIC PROCESSES ARE CHARACTERIZED BY:

·        THEORETICAL MODEL

·        MEAN, VARIANCE AND AUTOCORRELATION FUNCTION

·        MEAN, VARIANCE AND SPECTRAL DENSITY FUNCTION


IV.    BASIC TIME SERIES MODELS

EXAMPLES

SUMMARY STATISTICS


WHITE NOISE (WN) PROCESS

MODEL:



AUTOREGRESSIVE (AR) PROCESS

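MODEL: $z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \dots + \phi_p z_{t-p} + a_t$

OR $(1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p) z_t = a_t$

OR $\phi(B) z_t = a_t$

WHERE THE BACKWARD SHIFT OPERATOR, B, IS DEFINED BY $B z_t = z_{t-1}$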



PARTIAL AUTOCORRELATION FUNCTION (PACF) FOR AUTOREGRESSIVE PROCESSES

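THE PARTIAL AUTOCORRELATION OF LAG k IS THE SOLUTION $\phi_{kk}$ TO THE YULE-WALKER EQUATIONS $\rho_j = \phi_{k1}\rho_{j-1} + \phi_{k2}\rho_{j-2} + \dots + \phi_{kk}\rho_{j-k}$, $j = 1, \dots, k$.

FOR AN AUTOREGRESSIVE PROCESS OF ORDER p, $\phi_{kk} = 0$ FOR $k > p$.

ESTIMATE THE PACF BY THE LEAST-SQUARES ESTIMATE OF $\phi_{kk}$ IN THE LINEAR MODEL: $z_t = \phi_{k1} z_{t-1} + \phi_{k2} z_{t-2} + \dots + \phi_{kk} z_{t-k} + a_t$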
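AS AN ALTERNATIVE TO FITTING A SEPARATE LEAST-SQUARES MODEL FOR EACH LAG, THE PARTIAL AUTOCORRELATIONS CAN BE OBTAINED FROM A GIVEN SET OF AUTOCORRELATIONS BY SOLVING THE YULE-WALKER EQUATIONS RECURSIVELY (THE DURBIN-LEVINSON RECURSION).  THE SKETCH BELOW USES THAT SWAPPED-IN TECHNIQUE; THE FUNCTION NAME pacf_from_acf IS HYPOTHETICAL.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations phi_11, ..., phi_KK from rho = [1, rho_1, ..., rho_K].

    Uses the Durbin-Levinson recursion to solve the Yule-Walker equations
    of successive orders k = 1, ..., K.
    """
    K = len(rho) - 1
    pacf = []
    prev = []                              # prev[j] holds phi_{k-1, j+1}
    for k in range(1, K + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients: phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf


# Theoretical ACF of an AR(1) process with phi = 0.5 is rho_k = 0.5 ** k.
ar1_pacf = pacf_from_acf([1.0, 0.5, 0.25, 0.125])
```

FOR THE AR(1) AUTOCORRELATIONS ABOVE, THE RESULT IS 0.5 AT LAG 1 AND ZERO AT LAGS 2 AND 3, I.E., THE PACF CUTS OFF AFTER LAG p = 1.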


MOVING AVERAGE (MA) PROCESS

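MODEL: $z_t = a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \dots - \theta_q a_{t-q}$

OR $z_t = (1 - \theta_1 B - \theta_2 B^2 - \dots - \theta_q B^q) a_t$

OR $z_t = \Theta(B) a_t$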



MIXED AUTOREGRESSIVE MOVING AVERAGE (ARMA) PROCESS

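MODEL: $z_t = \phi_1 z_{t-1} + \dots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q}$

OR $(1 - \phi_1 B - \dots - \phi_p B^p) z_t = (1 - \theta_1 B - \dots - \theta_q B^q) a_t$

OR $\phi(B) z_t = \Theta(B) a_t$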



SUMMARY STATISTICS

PROCESS | ACF | PACF | SDF
WN | 0 (ALL LAGS k ≥ 1) | 0 (ALL LAGS k ≥ 1) | CONSTANT (FLAT)
AR | TAILS OFF | CUTS OFF | VARIES
MA | CUTS OFF | TAILS OFF | VARIES
ARMA | TAILS OFF | TAILS OFF | VARIES
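THE “CUTS OFF” VERSUS “TAILS OFF” BEHAVIOR OF THE ACF CAN BE CHECKED NUMERICALLY.  THE SKETCH BELOW COMPUTES THE THEORETICAL ACF OF AN MA(q) PROCESS AND OF AN AR(1) PROCESS.  IT ASSUMES THE BOX-JENKINS SIGN CONVENTION z_t = a_t - θ_1 a_{t-1} - … - θ_q a_{t-q}; THE FUNCTION NAMES ARE ILLUSTRATIVE.

```python
def ma_acf(thetas, max_lag):
    """Theoretical autocorrelations rho_1..rho_max_lag of an MA(q) process.

    Assumed convention: z_t = a_t - theta_1 a_{t-1} - ... - theta_q a_{t-q}.
    """
    psi = [1.0] + [-t for t in thetas]     # weights applied to a_t, a_{t-1}, ...
    q = len(thetas)
    gamma0 = sum(p * p for p in psi)       # variance in units of the noise variance
    acf = []
    for k in range(1, max_lag + 1):
        if k > q:
            acf.append(0.0)                # the ACF cuts off after lag q
        else:
            gamma_k = sum(psi[j] * psi[j + k] for j in range(q - k + 1))
            acf.append(gamma_k / gamma0)
    return acf


def ar1_acf(phi, max_lag):
    """Theoretical autocorrelations of an AR(1) process: rho_k = phi ** k."""
    return [phi ** k for k in range(1, max_lag + 1)]


ma1 = ma_acf([0.5], 3)     # one non-zero value at lag 1, then zeros
ar1 = ar1_acf(0.5, 3)      # geometric decay, never exactly zero
```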


V.     TIME SERIES MODEL BUILDING (ADDITIONAL DETAILS)

·        ITERATIVE APPROACH

·        REVIEW OF MODEL CLASSES (“FORECASTING METHODS”)

·        COMPARISONS BETWEEN METHODS


ITERATIVE APPROACH TO MODEL BUILDING

[FIGURE NOT REPRODUCED]

THE THEORY-BASED ITERATIVE APPROACH TO DEVELOPMENT OF TIME SERIES MODELS WAS DEVELOPED AND PROMOTED BY G. E. P. BOX AND GWILYM JENKINS, STARTING IN THE 1960s.  THE DESCRIPTOR “BOX-JENKINS” IS APPLIED EITHER TO DESCRIBE THEIR APPROACH TO DEVELOPING TIME-SERIES MODELS, OR TO DESCRIBE THE CLASS OF AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODELS EMPLOYED IN THEIR APPROACH.


TYPES OF FORECASTING MODELS

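Horizon and accuracy columns: ST = short term, MT = medium term, LT = long term.

General model class | Model type | Usual forecast horizon | Accuracy, short term (ST) | Accuracy, medium term (MT) | Accuracy, long term (LT) | Cost to develop | Cost to implement
Qualitative | Delphi | LT | MED | MED | MED | HI | HI
Qualitative | Market research | MT | HI | MED | MED | HI | HI
Qualitative | Panel consensus | LT | LO | LO | LO | HI | HI
Qualitative | Visionary | LT | LO | LO | LO | HI | HI
Qualitative | Historical analogy | LT | LO | MED | MED | HI | HI
Single variable | Moving average | ST | LO | LO | LO | LO | VLO
Single variable | Exponential smoothing | ST | MED | LO | LO | LO | VLO
Single variable | Univariate time series models (“Box-Jenkins,” AR, MA, ARMA, ARIMA) | ST | HI | MED | LO | MED | VLO
Single variable | X-11 | ST | MED | MED | LO | LO | LO
Single variable | Trend projections | ST | MED | MED | MED | LO | VLO
Single variable | Neural network | ST | HI | MED | LO | MED | LO
Multiple variable univariate | Regression model (BJ) | MT | HI | MED | LO | MED | LO
Multiple variable univariate | Econometric model | MT | MED | HI | MED | HI | LO
Multiple variable univariate | Anticipation survey | MT | MED | LO | LO | HI | HI
Multiple variable univariate | Input-output model | LT | LO | MED | MED | VHI | MED
Multiple variable univariate | Economic input-output | LT | LO | HI | HI | VHI | MED
Multiple variable univariate | Diffusion index | MT | LO | LO | LO | HI | LO
Multiple variable univariate | Leading indicator (BJ) | ST | HI | MED | LO | MED | LO
Multiple variable univariate | Life-cycle analysis | MT | LO | LO | LO | HI | HI
Multiple variable univariate | Neural network | ST | HI | MED | LO | MED | LO
Multivariate | Econometric simultaneous-equation model (“SEM”) | MT | HI | MED | LO | VHI | HI
Multivariate | Multivariate time series models (“Box Jenkins,” VAR, VARMA, VARX, MARMA, MARIMA, Kalman filter) | MT | HI | MED | LO | MED | LO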


FORECASTS BASED ON UNVALIDATED MODELS

MODEL IS “FITTED” TO DATA – LITTLE OR NO DIAGNOSTIC CHECKING (MODEL VALIDATION)

·        MOVING AVERAGE

·        EXPONENTIAL SMOOTHING

·        REGRESSION ON COMPONENTS

o   TREND, SEASONAL PATTERNS, FOURIER COMPONENTS

·        NEURAL NETWORK


FORECASTS BASED ON VALIDATED MODELS

·        SINGLE VARIABLE

o   BOX-JENKINS

·        MULTIPLE VARIABLE (THEORETICAL OR EMPIRICAL MODEL)

o   BOX-JENKINS

o   ECONOMETRIC


SOME FORECASTING ACCURACY COMPARISONS

STOCHASTIC VS. INTUITIVE

Forecast Error Variance

Lead time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
MSE (Brown) | 102 | 158 | 218 | 256 | 363 | 452 | 554 | 669 | 799 | 944
MSE (BJ) | 42 | 91 | 136 | 180 | 222 | 266 | 317 | 371 | 427 | 483

ECONOMETRIC VS. STOCHASTIC

Theil Coefficient

Model | Price | Quantity
Econometric | 0.80 | 0.65
Box-Jenkins | 1.00 | 0.70
Random walk | 1.00 | 1.00
Mean | 18.23 | 0.96
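THE BRIEFING DOES NOT STATE WHICH FORM OF THE THEIL COEFFICIENT WAS USED.  BECAUSE THE RANDOM-WALK (NO-CHANGE) FORECAST SCORES EXACTLY 1.00 IN THE TABLE, THE SKETCH BELOW ASSUMES THE VARIANT THAT DIVIDES THE ROOT SUM OF SQUARED FORECAST ERRORS BY THAT OF THE NO-CHANGE FORECAST.  THE FUNCTION NAME theil_u AND THE DATA ARE ILLUSTRATIVE ONLY.

```python
import math


def theil_u(actual, forecasts):
    """Theil coefficient relative to the no-change forecast (assumed variant).

    forecasts[i] is the one-step-ahead forecast of actual[i + 1].
    A value of 1 means "no better than a random walk"; 0 means perfect.
    """
    num = sum((f - y) ** 2 for f, y in zip(forecasts, actual[1:]))
    den = sum((actual[i + 1] - actual[i]) ** 2 for i in range(len(actual) - 1))
    return math.sqrt(num / den)


prices = [10.0, 12.0, 11.0, 15.0]
u_no_change = theil_u(prices, prices[:-1])   # forecast = last observed value
u_perfect = theil_u(prices, prices[1:])      # forecast = value that occurred
```

UNDER THIS ASSUMPTION A COEFFICIENT BELOW 1 INDICATES A FORECAST THAT BEATS THE RANDOM WALK, AND A COEFFICIENT ABOVE 1 (AS FOR THE MEAN FORECAST OF PRICE) INDICATES ONE THAT DOES WORSE.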


IBM STOCK PRICE SERIES WITH COMPARISON OF LEAD-3 FORECASTS OBTAINED FROM BEST IMA(0,1,1) PROCESS AND BROWN’S QUADRATIC FORECAST FOR A PERIOD BEGINNING JULY 11, 1960.  (REF. BJ p. 168)

[FIGURE NOT REPRODUCED]


HYBRID MODELS

·        TREND REMOVAL

·        REGRESSION + BOX-JENKINS


VI.    BOX-JENKINS MODELS

·        MODEL IDENTIFICATION

·        PARAMETER ESTIMATION

·        DIAGNOSTIC CHECKING

·        FORECASTING

·        EXAMPLE


BOX-JENKINS MODEL CLASS

AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODEL:

OR

WHERE

AND

IS STATIONARY.  THAT IS,

WHERE


MODEL IDENTIFICATION

APPLY SUCCESSIVE DIFFERENCING (OR OTHER PROCEDURE) TO ORIGINAL DATA UNTIL STATIONARITY IS ACHIEVED (ACF DIES OUT):

EXAMINE THE ACF AND PACF OF w_t TO SUGGEST A TENTATIVE MODEL (I.E., IDENTIFY THE DEGREE AND STRUCTURE OF THE φ(B) AND Θ(B) POLYNOMIALS).
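SUCCESSIVE DIFFERENCING IS SIMPLE TO CARRY OUT.  THE FOLLOWING SKETCH (FUNCTION NAME difference IS ILLUSTRATIVE) APPLIES d DIFFERENCES AT A GIVEN LAG; LAG 1 GIVES ORDINARY DIFFERENCING, AND A LAG EQUAL TO THE SEASONAL PERIOD GIVES SEASONAL DIFFERENCING.

```python
def difference(z, d=1, lag=1):
    """Apply the differencing operator (1 - B**lag) to the series z, d times."""
    w = list(z)
    for _ in range(d):
        # Each pass shortens the series by `lag` observations.
        w = [w[t] - w[t - lag] for t in range(lag, len(w))]
    return w


quadratic = [1, 4, 9, 16, 25]
first = difference(quadratic, d=1)     # [3, 5, 7, 9]: still trending
second = difference(quadratic, d=2)    # [2, 2, 2]: trend removed
```

IN PRACTICE THE SAMPLE ACF OF EACH DIFFERENCED SERIES IS INSPECTED, AND DIFFERENCING STOPS AS SOON AS THE ACF DIES OUT QUICKLY.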


IDENTIFICATION EXAMPLES

BEHAVIOR OF THE AUTOCORRELATION FUNCTIONS FOR THE d-th DIFFERENCE OF AN ARIMA PROCESS OF ORDER (p,d,q).  REFERENCE: BJ p. 176.

[FIGURE NOT REPRODUCED]


PARAMETER ESTIMATION

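IF THE MODEL IS AUTOREGRESSIVE (NO θs), WE HAVE A LINEAR STATISTICAL MODEL: $z = X\phi + a$

WHERE $z = (z_n, z_{n-1}, \dots, z_{p+1})'$, $\phi = (\phi_1, \dots, \phi_p)'$, $a = (a_n, a_{n-1}, \dots, a_{p+1})'$, AND X IS THE $(n-p) \times p$ MATRIX WHOSE ROW FOR TIME t IS $(z_{t-1}, z_{t-2}, \dots, z_{t-p})$, $t = n, n-1, \dots, p+1$,

SO THE LEAST-SQUARES ESTIMATE OF φ IS $\hat{\phi} = (X'X)^{-1}X'z$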

IF θs ARE PRESENT, THE MODEL IS A NONLINEAR STATISTICAL MODEL:

I.E.,

IS A LINEAR MODEL WITH PARAMETER δ = β − β_0.  DETERMINE THE PARAMETER ESTIMATES ITERATIVELY.
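FOR THE PURELY AUTOREGRESSIVE CASE, THE LEAST-SQUARES ESTIMATE CAN BE COMPUTED DIRECTLY FROM THE NORMAL EQUATIONS.  THE SKETCH BELOW IS A MINIMAL PURE-PYTHON ILLUSTRATION, ASSUMING A SERIES MEASURED AS DEVIATIONS FROM ITS MEAN; THE HELPER NAMES solve AND fit_ar ARE HYPOTHETICAL, AND A PRODUCTION PROGRAM WOULD NORMALLY USE A NUMERICAL LIBRARY INSTEAD.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_ar(z, p):
    """Least-squares estimate of (phi_1, ..., phi_p) for an AR(p) model."""
    # Each row of X holds the p lagged values (z_{t-1}, ..., z_{t-p}).
    rows = [[z[t - j] for j in range(1, p + 1)] for t in range(p, len(z))]
    y = [z[t] for t in range(p, len(z))]
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)


# Noise-free AR(2) series with phi = (0.5, -0.3), used only to check the code.
z = [1.0, 2.0]
for _ in range(8):
    z.append(0.5 * z[-1] - 0.3 * z[-2])
phi_hat = fit_ar(z, 2)
```

WHEN θs ARE PRESENT, THE SAME LINEAR SOLVE IS REPEATED INSIDE AN ITERATION, EACH TIME ON A MODEL LINEARIZED ABOUT THE CURRENT PARAMETER VALUES.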


TESTS OF MODEL ADEQUACY

PROPERTIES OF φ(B) AND Θ(B) POLYNOMIALS (STATIONARITY, INVERTIBILITY, COMMON FACTORS)

TESTS OF SIGNIFICANCE OF MODEL PARAMETERS

OVERFITTING

TEST RESIDUALS FOR “WHITENESS”

INFORMATION CRITERIA (E.G., AKAIKE INFORMATION CRITERION, BAYES INFORMATION CRITERION)

DEVELOP MODEL FOR DIFFERENT PERIODS OF DATA


DIAGNOSTIC CHECKING OF RESIDUALS

SIGNIFICANCE OF VARIOUS STATISTICS COMPUTED FROM RESIDUALS:

·        MEAN (t-TEST)

·        PACF (t-TEST ON EACH VALUE)

·        ACF (t-TEST ON EACH VALUE, χ² TEST ON ENTIRE FUNCTION)

·        SPECTRUM (GRENANDER-ROSENBLATT TEST, KOLMOGOROV-SMIRNOV TEST)
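ONE WIDELY USED χ² TEST ON THE ENTIRE RESIDUAL ACF IS THE BOX-PIERCE PORTMANTEAU STATISTIC, Q = n Σ r_k² SUMMED OVER THE FIRST K LAGS, WHICH IS COMPARED WITH A χ² DISTRIBUTION ON K - p - q DEGREES OF FREEDOM.  THE BRIEFING DOES NOT NAME THE STATISTIC IT HAS IN MIND, SO TREAT THE SKETCH BELOW AS ONE REASONABLE CHOICE; THE FUNCTION NAME box_pierce_q IS ILLUSTRATIVE.

```python
def box_pierce_q(residuals, max_lag):
    """Box-Pierce statistic Q = n * sum of squared residual autocorrelations."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev) / n
    q = 0.0
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        q += (ck / c0) ** 2                # squared autocorrelation at lag k
    return n * q


# A strongly alternating residual series gives a large lag-1 autocorrelation.
q_stat = box_pierce_q([1.0, -1.0, 1.0, -1.0], max_lag=1)
```

A LARGE VALUE OF Q RELATIVE TO THE χ² CRITICAL VALUE INDICATES THAT THE RESIDUALS ARE NOT WHITE AND THAT THE FITTED MODEL SHOULD BE REVISED.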


MODEL IDENTIFICATION

THE PACF AND ACF CAN SUGGEST ADDITION OF φ AND θ COMPONENTS.

A PRELIMINARY FITTED MODEL IS:

THE e_t’s ARE CORRELATED; A MODEL FOR THE e_t’s IS:

THE MODIFIED MODEL IS:


EXAMPLE OF MODEL IDENTIFICATION

SUPPOSE THAT THE CORRECT MODEL IS OF ORDER (0,2,2), BUT THAT THE FITTED MODEL IS

SUPPOSE THAT THE MODEL SUGGESTED FOR THE RESIDUALS IS

THE COMBINED MODEL IS

THIS SUGGESTS THAT A MODEL OF ORDER (0,2,2) SHOULD BE EXAMINED.


MODEL SIMPLIFICATION

A MODEL OF THE FORM

MIGHT BE REDUCIBLE TO

IF θ IS CLOSE TO 1.


CAUTION

ESTIMATED AUTOCORRELATIONS MAY BE HIGHLY CORRELATED, AND MAY HAVE LARGE VARIANCES.

USE THE ACF ONLY TO SUGGEST MODELS TO FIT (ESTIMATE).

RELY ON DIAGNOSTIC CHECKS TO ACCEPT OR REJECT FITTED MODELS.


OPTIMAL FORECASTER

THE OPTIMAL FORECASTER MINIMIZES THE MEAN SQUARED ERROR OF PREDICTION.

WHERE


SEASONALITY

A TENTATIVE MODEL IS

WHERE e_t IS CORRELATED WITH e_{t-1}, e_{t-2}, ….

SUPPOSE THAT WE CAN REPRESENT THE RESIDUALS BY

WHERE THE a_t ARE WHITE.

THEN WE MAY COMBINE THESE RESULTS TO OBTAIN


EXPONENTIAL SMOOTHING FORECASTER

A SPECIAL CASE OF

WITH

I.E.,

THE LEAST-SQUARES FORECASTER IS:

OR
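IT IS A STANDARD RESULT THAT FOR THE ARIMA(0,1,1) MODEL (1 - B)z_t = (1 - θB)a_t, THE MINIMUM MEAN-SQUARED-ERROR ONE-STEP FORECAST IS AN EXPONENTIALLY WEIGHTED MOVING AVERAGE WITH SMOOTHING CONSTANT 1 - θ.  THE SKETCH BELOW USES THE RECURSIVE FORM, NEW FORECAST = (1 - θ) × LATEST OBSERVATION + θ × PREVIOUS FORECAST.  THE START-UP VALUE (THE FIRST OBSERVATION) AND THE FUNCTION NAME ARE ASSUMPTIONS MADE FOR ILLUSTRATION.

```python
def ewma_forecasts(z, theta):
    """One-step-ahead exponential smoothing forecasts.

    The value at position t is the forecast of z[t + 1] made at time t.
    """
    forecast = z[0]                        # assumed start-up value
    out = []
    for obs in z:
        # Weight 1 - theta on the newest observation, theta on the old forecast.
        forecast = (1.0 - theta) * obs + theta * forecast
        out.append(forecast)
    return out


forecasts = ewma_forecasts([10.0, 12.0, 11.0], theta=0.5)
```

A VALUE OF θ NEAR 1 GIVES HEAVY SMOOTHING (SLOW RESPONSE TO NEW DATA); A VALUE NEAR 0 MAKES THE FORECAST FOLLOW THE LATEST OBSERVATION.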


MOVING AVERAGE FORECASTER

A SPECIAL CASE OF

WITH

THE LEAST-SQUARES FORECASTER IS


BOX-JENKINS SAMPLE PROBLEM

[FIGURE NOT REPRODUCED]


ACF OF SUCCESSIVE DIFFERENCES

[FIGURE NOT REPRODUCED]


[FIGURE NOT REPRODUCED]


TENTATIVE MODEL

MODEL FORM (ACF SPIKES AT LAGS 1 AND 12)

PARAMETER ESTIMATION: θ1 = .40, θ12 = .61;  SO THE FITTED MODEL IS:

OR

DIAGNOSTIC CHECKS:

[FIGURE NOT REPRODUCED]
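THE FITTED EQUATION DID NOT SURVIVE CONVERSION OF THIS BRIEFING.  ASSUMING THE USUAL MULTIPLICATIVE SEASONAL MOVING-AVERAGE FORM (1 - θ1 B)(1 - θ12 B^12) ACTING ON a_t, THE SKETCH BELOW EXPANDS THE OPERATOR WITH THE ESTIMATES θ1 = .40 AND θ12 = .61 TO SHOW THE EQUIVALENT NON-FACTORED FORM.  THE MULTIPLICATIVE STRUCTURE AND THE FUNCTION NAME ARE ASSUMPTIONS, NOT STATEMENTS FROM THE ORIGINAL SLIDE.

```python
def multiply_polynomials(p, q):
    """Multiply two polynomials in B given as coefficient lists (index = power of B)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


nonseasonal = [1.0, -0.40]                     # 1 - 0.40 B
seasonal = [1.0] + [0.0] * 11 + [-0.61]        # 1 - 0.61 B^12
ma_operator = multiply_polynomials(nonseasonal, seasonal)

# Non-zero terms of the expanded operator, keyed by the power of B.
terms = {k: round(c, 3) for k, c in enumerate(ma_operator) if abs(c) > 1e-12}
```

UNDER THAT ASSUMPTION THE EXPANDED OPERATOR HAS TERMS AT LAGS 1, 12 AND 13, THE LAG-13 COEFFICIENT BEING THE PRODUCT OF THE TWO ESTIMATES.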


TOPICS NOT COVERED IN THIS BRIEFING

MULTIPLE VARIABLE (TRANSFER FUNCTION) MODELS

·        CROSS CORRELATION FUNCTION (PREWHITENING OF INPUT)

·        CROSS SPECTRAL ANALYSIS

·        DESIGN OF EXPERIMENTS

MULTIVARIATE MODELS

·        MULTIPLE CORRELATION COEFFICIENT

·        MULTIPLE COHERENCY SPECTRUM

OPTIMAL CONTROL

·        FEED FORWARD CONTROL

·        FEED BACK CONTROL

PHYSICAL MODELS

·        KALMAN FILTER

NONLINEAR MODELS

MODEL EXTENSIONS

·        AUTOREGRESSIVE CONDITIONAL HETEROSKEDASTICITY (ARCH) AND GENERALIZED AUTOREGRESSIVE CONDITIONAL HETEROSKEDASTICITY (GARCH) MODELS


VII.   PROGRAMS / DATA REQUIREMENTS / REFERENCES

THE FIRST COMMERCIALLY AVAILABLE GENERAL-PURPOSE BOX-JENKINS FORECASTING PROGRAM WAS TIMES, DEVELOPED BY THE AUTHOR IN 1968-70.

UNIVARIATE BOX-JENKINS ANALYSIS PROGRAMS ARE NOW INCLUDED IN ALL MAJOR GENERAL-PURPOSE STATISTICAL SOFTWARE PACKAGES (Stata, SAS, SPSS), AND MULTIVARIATE BOX-JENKINS ANALYSIS PROGRAMS ARE FREELY AVAILABLE IN R.

A MINIMUM OF ABOUT 100 DATA POINTS IS REQUIRED TO DEVELOP A BOX-JENKINS MODEL.

THE SEMINAL REFERENCE FOR THE BOX-JENKINS METHODOLOGY IS

BOX, G. E. P., AND GWILYM M. JENKINS, TIME SERIES ANALYSIS: FORECASTING AND CONTROL, FIRST EDITION, HOLDEN-DAY, 1970.  THE LATEST EDITION IS THE 5TH EDITION, BY GEORGE E. P. BOX, GWILYM M. JENKINS, GREGORY C. REINSEL AND GRETA M. LJUNG, WILEY, 2016.

OTHER REFERENCES INCLUDE

CRYER, JONATHAN D., AND KUNG-SIK CHAN, TIME SERIES ANALYSIS WITH APPLICATIONS IN R, 2ND ED., SPRINGER, 2008.

TSAY, RUEY S., MULTIVARIATE TIME SERIES ANALYSIS WITH R AND FINANCIAL APPLICATIONS, WILEY, 2014.

LÜTKEPOHL, HELMUT, NEW INTRODUCTION TO MULTIPLE TIME SERIES ANALYSIS, SPRINGER, 2006.

HAMILTON, JAMES D., TIME SERIES ANALYSIS, PRINCETON UNIVERSITY PRESS, 1994.

A VERSION OF THE TIMES REFERENCE MANUAL IS POSTED AT INTERNET WEBSITE http://www.foundationwebsite.org/TIMESVol1TechnicalBackground.pdf .
