SMSP

Introduction to Smoothing Splines
Tongtong Wu
Feb 29, 2004
Outline
- Introduction
- Linear and polynomial regression, and interpolation
- Roughness penalties
- Interpolating and smoothing splines
- Cubic splines
- Interpolating splines
- Smoothing splines
- Natural cubic splines
- Choosing the smoothing parameter
- Available software
Key Words
- roughness penalty
- penalized sum of squares
- natural cubic splines
Motivation
[Figure, shown over four slides: y18 plotted against Index (x-axis ticks 5 to 15, y-axis ticks 2 to 10); the last slide overlays the curve Spline(y18).]
Introduction
- Linear and polynomial regression:
  - Global influence: each observation affects the fitted curve everywhere
  - The polynomial degree can only be increased in discrete steps; it cannot be controlled continuously
- Interpolation:
  - An interpolating curve is unsatisfactory as an explanation of the given data
Roughness penalty approach
- A method for relaxing the model assumptions of classical linear regression, along lines a little different from polynomial regression.
- Aims of curve fitting:
  - A good fit to the data
  - A curve estimate that does not display too much rapid fluctuation
- Basic idea: make a necessary compromise between these two rather different aims in curve estimation.
- Quantifying the roughness of a curve. An intuitive way is the integrated squared second derivative
  $$\int_a^b \{g''(t)\}^2 \, dt$$
  (g: a twice-differentiable curve)
- Motivation from a formalization of a mechanical device: if a thin piece of flexible wood, called a spline, is bent to the shape of the graph of g, then the leading term in the strain energy is proportional to $\int g''^2$.
- Penalized sum of squares:
  $$S(g) = \sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(t)\}^2 \, dt$$
  - g: any twice-differentiable function on [a,b]
  - α: smoothing parameter ('rate of exchange' between residual error and local variation)
- Penalized least squares estimator:
  $$\hat{g} = \arg\min_g S(g)$$
Curve for a large value of α
[Figure: y18 against Index with the fitted curve for a large value of α.]
Curve for a small value of α
[Figure: y18 against Index with the fitted curve for a small value of α.]
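The effect of the smoothing parameter shown in these two figures can be reproduced with R's smooth.spline. This is only a minimal sketch: it assumes that the `lambda` argument plays the role of α up to the function's internal scaling, and the two values 1e3 and 1e-5 are arbitrary illustrations, not the values used for the slides. The data vector y18 is the one defined in Example 1 of the deck.

```r
# y18 as defined in Example 1; the index 1..18 serves as t
y18 <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))
t <- seq_along(y18)

# Illustrative (assumed) penalty weights: one heavy, one light
fit_large <- smooth.spline(t, y18, lambda = 1e3)
fit_small <- smooth.spline(t, y18, lambda = 1e-5)

# Residual sum of squares of a fit at the observed points
rss <- function(fit) sum((y18 - predict(fit, t)$y)^2)

# A heavier penalty gives fewer equivalent degrees of freedom and a larger RSS
c(df_large = fit_large$df, df_small = fit_small$df)
c(rss_large = rss(fit_large), rss_small = rss(fit_small))
```

Plotting `lines(fit_large)` and `lines(fit_small)` over `plot(t, y18)` should show a nearly straight line for the heavy penalty and a curve passing close to every point for the light one.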
Interpolating and Smoothing Splines
- Cubic splines
- Interpolating splines
- Smoothing splines
- Choosing the smoothing parameter
Cubic Splines
- Given $a < t_1 < t_2 < \dots < t_n < b$, a function g is a cubic spline if
  1. on each interval $(a,t_1), (t_1,t_2), \dots, (t_n,b)$, g is a cubic polynomial;
  2. the polynomial pieces fit together at the points $t_i$ (called knots) so that g itself and its first and second derivatives are continuous at each $t_i$, and hence on the whole of [a,b].
Cubic Splines
- How to specify a cubic spline:
  $$g(t) = d_i (t - t_i)^3 + c_i (t - t_i)^2 + b_i (t - t_i) + a_i \quad \text{for } t_i \le t \le t_{i+1},$$
  for $i = 0, \dots, n$, with $t_0 = a$ and $t_{n+1} = b$.
- g is a natural cubic spline (NCS) if its second and third derivatives are zero at a and b, which implies $d_0 = c_0 = d_n = c_n = 0$, so that g is linear on the two extreme intervals $[a,t_1]$ and $[t_n,b]$.
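Natural Cubic Splines
Value-second derivative representation
- We can specify an NCS by giving its value and second derivative at each knot $t_i$.
- Define
  $$\mathbf{g} = (g_1, \dots, g_n)', \quad \text{where } g_i = g(t_i),$$
  $$\boldsymbol{\gamma} = (\gamma_2, \dots, \gamma_{n-1})', \quad \text{where } \gamma_i = g''(t_i),$$
  which specify the curve g completely (the natural boundary conditions give $\gamma_1 = \gamma_n = 0$).
- However, not all possible vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ represent a natural spline!
- Theorem 2.1
  The vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ specify a natural spline g if and only if
  $$Q'\mathbf{g} = R\boldsymbol{\gamma}.$$
  Then the roughness penalty satisfies
  $$\int_a^b \{g''(t)\}^2 \, dt = \boldsymbol{\gamma}' R \boldsymbol{\gamma} = \mathbf{g}' K \mathbf{g}.$$
- Here $h_i = t_{i+1} - t_i$ for $i = 1, \dots, n-1$, and
  $$Q = \begin{pmatrix}
  h_1^{-1} & 0 & \cdots & 0 \\
  -h_1^{-1} - h_2^{-1} & h_2^{-1} & \cdots & 0 \\
  h_2^{-1} & -h_2^{-1} - h_3^{-1} & \cdots & 0 \\
  0 & h_3^{-1} & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & h_{n-1}^{-1}
  \end{pmatrix}_{n \times (n-2)}$$
  $$R = \begin{pmatrix}
  \tfrac{1}{3}(h_1 + h_2) & \tfrac{1}{6} h_2 & \cdots & 0 \\
  \tfrac{1}{6} h_2 & \tfrac{1}{3}(h_2 + h_3) & \cdots & 0 \\
  \vdots & \vdots & \ddots & \vdots \\
  0 & 0 & \cdots & \tfrac{1}{3}(h_{n-2} + h_{n-1})
  \end{pmatrix}_{(n-2) \times (n-2)}$$
- R is strictly diagonally dominant, i.e. $|r_{ii}| > \sum_{j \ne i} |r_{ij}|$ for all i.
- R is positive definite, so we can define
  $$K = Q R^{-1} Q'.$$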
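The band matrices Q and R and the penalty matrix K can be built directly from the knots. The following is a minimal sketch, not an efficient implementation: `ncs_matrices` is a hypothetical helper name, it stores everything as dense matrices, and it forms K with a general solve rather than exploiting the band structure.

```r
# Hypothetical helper: build Q (n x (n-2)), R ((n-2) x (n-2)) and K = Q R^{-1} Q'
ncs_matrices <- function(t) {
  n <- length(t)
  stopifnot(n >= 3, all(diff(t) > 0))
  h <- diff(t)                       # h[i] = t[i+1] - t[i], i = 1, ..., n-1
  Q <- matrix(0, n, n - 2)
  R <- matrix(0, n - 2, n - 2)
  for (j in 2:(n - 1)) {             # slide column index j is stored as column k = j - 1
    k <- j - 1
    Q[j - 1, k] <- 1 / h[j - 1]
    Q[j,     k] <- -1 / h[j - 1] - 1 / h[j]
    Q[j + 1, k] <- 1 / h[j]
    R[k, k] <- (h[j - 1] + h[j]) / 3
    if (j < n - 1) {                 # off-diagonal entries h_j / 6
      R[k, k + 1] <- h[j] / 6
      R[k + 1, k] <- h[j] / 6
    }
  }
  list(Q = Q, R = R, K = Q %*% solve(R, t(Q)))
}

# Usage on an arbitrary (assumed) set of unequally spaced knots
t <- c(0, 1, 2.5, 4, 4.5, 7)
m <- ncs_matrices(t)
min(eigen(m$R, symmetric = TRUE)$values)  # positive when R is positive definite
max(abs(m$K %*% cbind(1, t)))             # near zero: straight lines have no roughness
```

The last line checks a consequence of the construction: for the values of any straight line, Q'g is a vector of second divided differences and therefore vanishes, so K annihilates constants and linear functions.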
Interpolating Splines
- Goal: find a smooth curve that interpolates the points $(t_i, z_i)$, i.e. $g(t_i) = z_i$ for all i.
- Theorem 2.2
  Suppose $n \ge 2$ and $t_1 < \dots < t_n$. Given any values $z_1, \dots, z_n$, there is a unique natural cubic spline g with knots $t_i$ satisfying
  $$g(t_i) = z_i \quad \text{for } i = 1, \dots, n.$$
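Theorem 2.2 can be illustrated numerically with R's splinefun, whose `method = "natural"` option returns a natural cubic spline interpolant. This is a small sketch that reuses the y18 data from Example 1 as the values $z_i$, a choice made here only for illustration.

```r
# Values to interpolate: y18 from Example 1, at knots t = 1, ..., 18
y18 <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))
t <- seq_along(y18)

# Natural cubic spline interpolant through (t_i, z_i)
g <- splinefun(t, y18, method = "natural")

max(abs(g(t) - y18))      # interpolation error at the knots
g(range(t), deriv = 2)    # second derivative at the two end knots
```

The first quantity should be zero up to rounding, and the second derivatives at the end knots should vanish, which is the natural boundary condition.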
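Interpolating Splines
- The natural cubic spline interpolant is the unique minimizer of $\int g''^2$ over all functions in $S_2[a,b]$ that interpolate the data. ($S_2[a,b]$ is the space of functions that are differentiable on [a,b] with an absolutely continuous first derivative.)
- Theorem 2.3
  Suppose g is the natural cubic spline interpolant to the values $z_1, \dots, z_n$, and $\tilde{g} \in S_2[a,b]$ with $\tilde{g}(t_i) = z_i$ for $i = 1, \dots, n$. Then
  $$\int \tilde{g}''^2 \ge \int g''^2.$$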
Smoothing Splines
- Penalized sum of squares:
  $$S(g) = \sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(t)\}^2 \, dt$$
  - g: any twice-differentiable function on [a,b]
  - α: smoothing parameter ('rate of exchange' between residual error and local variation)
- Penalized least squares estimator:
  $$\hat{g} = \arg\min_g S(g)$$
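Smoothing Splines
1. The curve estimator $\hat{g}$ is necessarily a natural cubic spline with knots at $t_i$, for $i = 1, \dots, n$.
   Proof: let $\tilde{g}$ be any function in $S_2[a,b]$, and suppose g is the NCS that interpolates the values $\tilde{g}(t_i)$. Then
   $$\sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 = \sum_{i=1}^{n} \{Y_i - \tilde{g}(t_i)\}^2,$$
   $$\int_a^b \{g''(t)\}^2 \, dt \le \int_a^b \{\tilde{g}''(t)\}^2 \, dt \quad \text{(Theorem 2.3)},$$
   $$\Rightarrow S(g) \le S(\tilde{g}).$$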
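Smoothing Splines
2. Existence and uniqueness
   Let $\mathbf{Y} = (Y_1, \dots, Y_n)'$; then
   $$\sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 = (\mathbf{Y} - \mathbf{g})'(\mathbf{Y} - \mathbf{g}),$$
   since $\mathbf{g}$ is precisely the vector of values $g(t_i)$.
   Expressing $\int g''^2 = \mathbf{g}' K \mathbf{g}$,
   $$S(g) = (\mathbf{Y} - \mathbf{g})'(\mathbf{Y} - \mathbf{g}) + \alpha \mathbf{g}' K \mathbf{g} = \mathbf{g}'(I + \alpha K)\mathbf{g} - 2\mathbf{Y}'\mathbf{g} + \mathbf{Y}'\mathbf{Y}.$$
   The minimum is achieved by setting $\mathbf{g} = (I + \alpha K)^{-1}\mathbf{Y}$.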
Smoothing Splines
2. Theorem 2.4
   Let $\hat{g}$ be the natural cubic spline with knots at $t_i$ for which $\mathbf{g} = (I + \alpha K)^{-1}\mathbf{Y}$. Then for any g in $S_2[a,b]$,
   $$S(\hat{g}) \le S(g).$$
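The form of the minimizer in Theorem 2.4 follows from setting the gradient of the quadratic form to zero. The short derivation below is a sketch of that step, using the same symbols as the slides:

```latex
\begin{align*}
S(\mathbf{g}) &= \mathbf{g}'(I + \alpha K)\mathbf{g} - 2\mathbf{Y}'\mathbf{g} + \mathbf{Y}'\mathbf{Y}, \\
\frac{\partial S}{\partial \mathbf{g}} &= 2(I + \alpha K)\mathbf{g} - 2\mathbf{Y} = \mathbf{0}
\;\Longrightarrow\;
\hat{\mathbf{g}} = (I + \alpha K)^{-1}\mathbf{Y}.
\end{align*}
```

Since $\mathbf{g}'K\mathbf{g} = \int g''^2 \ge 0$, the matrix K is non-negative definite, so $I + \alpha K$ is positive definite for $\alpha > 0$. The quadratic form is therefore strictly convex and the stationary point is its unique minimum.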
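Smoothing Splines
3. The Reinsch algorithm
   $$\mathbf{Y} = (I + \alpha K)\mathbf{g} = (I + \alpha Q R^{-1} Q')\mathbf{g}$$
   $$\Rightarrow \mathbf{g} = \mathbf{Y} - \alpha Q R^{-1} Q'\mathbf{g} = \mathbf{Y} - \alpha Q \boldsymbol{\gamma} \quad (\text{since } Q'\mathbf{g} = R\boldsymbol{\gamma})$$
   $$\Rightarrow Q'\mathbf{Y} = (R + \alpha Q'Q)\boldsymbol{\gamma} \quad (\text{premultiplying by } Q')$$
   The matrix $(R + \alpha Q'Q)$ has bandwidth 5 and is symmetric and strictly positive definite; therefore it has a Cholesky decomposition
   $$R + \alpha Q'Q = LDL',$$
   where L is a lower triangular band matrix with unit diagonal and D is a diagonal matrix with positive entries.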
Smoothing Splines
3. The Reinsch algorithm for spline smoothing
   Step 1: Evaluate the vector $Q'\mathbf{Y}$.
   Step 2: Find the non-zero diagonals of $R + \alpha Q'Q$, and hence the Cholesky decomposition factors L and D.
   Step 3: Solve $LDL'\boldsymbol{\gamma} = Q'\mathbf{Y}$ for $\boldsymbol{\gamma}$ by forward and back substitution.
   Step 4: Find $\mathbf{g}$ by $\mathbf{g} = \mathbf{Y} - \alpha Q\boldsymbol{\gamma}$.
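The four steps can be sketched in R as follows. This is an illustration under stated simplifications rather than the banded algorithm itself: `reinsch_smooth` is a hypothetical function name, Q and R are stored as dense matrices, and R's `chol()` (which returns an upper triangular U with U'U equal to the matrix) stands in for the banded LDL' factorization, so the cost is not linear in n.

```r
# Hypothetical dense sketch of the Reinsch steps; alpha is the smoothing parameter
reinsch_smooth <- function(t, Y, alpha) {
  n <- length(t)
  stopifnot(n >= 3, length(Y) == n, all(diff(t) > 0), alpha > 0)
  h <- diff(t)
  Q <- matrix(0, n, n - 2)
  R <- matrix(0, n - 2, n - 2)
  for (j in 2:(n - 1)) {
    k <- j - 1
    Q[(j - 1):(j + 1), k] <- c(1 / h[j - 1], -1 / h[j - 1] - 1 / h[j], 1 / h[j])
    R[k, k] <- (h[j - 1] + h[j]) / 3
    if (j < n - 1) R[k, k + 1] <- R[k + 1, k] <- h[j] / 6
  }
  QtY <- crossprod(Q, Y)                    # Step 1: Q'Y
  M <- R + alpha * crossprod(Q)             # Step 2: R + alpha Q'Q ...
  U <- chol(M)                              #   ... factored as U'U (dense stand-in for LDL')
  gamma <- backsolve(U, backsolve(U, QtY, transpose = TRUE))  # Step 3: two triangular solves
  g <- Y - alpha * Q %*% gamma              # Step 4: g = Y - alpha Q gamma
  list(g = drop(g), gamma = drop(gamma), Q = Q, R = R)
}

# Usage on the y18 data from Example 1, with an arbitrary (assumed) alpha = 1
y18 <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))
fit <- reinsch_smooth(seq_along(y18), y18, alpha = 1)

# Cross-check against the direct formula g = (I + alpha K)^{-1} Y
K <- fit$Q %*% solve(fit$R, t(fit$Q))
all.equal(fit$g, drop(solve(diag(length(y18)) + K, y18)))
```

A production version would store only the non-zero diagonals, which is what makes the algorithm run in O(n) operations.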
Smoothing Splines
4. Some concluding remarks
- The minimizing curve $\hat{g}$ essentially does not depend on a and b, as long as all the data points lie between a and b.
- If n = 2, then for any α, setting $\hat{g}$ to be the straight line through the two points $(t_1,Y_1)$ and $(t_2,Y_2)$ reduces S(g) to zero.
- If n = 1, the minimizer is no longer unique, since any straight line through $(t_1,Y_1)$ yields S(g) = 0.
Choosing the Smoothing Parameter
- Two different philosophical approaches:
  - Subjective choice
  - Automatic method: chosen by the data
    - Cross-validation
    - Generalized cross-validation
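Choosing the Smoothing Parameter
- Cross-validation: choose α to minimize
  $$CV(\alpha) = n^{-1} \sum_{i=1}^{n} \{Y_i - \hat{g}^{(-i)}(t_i; \alpha)\}^2 = n^{-1} \sum_{i=1}^{n} \left( \frac{Y_i - \hat{g}(t_i)}{1 - A_{ii}(\alpha)} \right)^2,$$
  where $\hat{g}^{(-i)}$ is the curve fitted with the i-th observation left out. The second equality holds if $\hat{g}$ is the spline smoother with parameter α, with hat matrix $A(\alpha) = (I + \alpha K)^{-1}$.
- Generalized cross-validation: choose α to minimize
  $$GCV(\alpha) = \frac{n^{-1} \sum_{i=1}^{n} \{Y_i - \hat{g}(t_i)\}^2}{\{1 - n^{-1} \operatorname{tr} A(\alpha)\}^2} = \frac{n \times \text{residual sum of squares}}{(\text{equivalent df})^2},$$
  where the equivalent df is $n - \operatorname{tr} A(\alpha)$.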
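Both criteria can be evaluated by hand from a fitted smooth.spline object, which returns the leverages $A_{ii}$ in its `lev` component. The sketch below assumes distinct, equally weighted t values (true for the y18 data from Example 1), so that the fitted values in `fit$y` line up with the observations. Note that the `df` reported by smooth.spline is the sum of the leverages, i.e. tr A, not the equivalent df $n - \operatorname{tr} A$ used in the GCV formula above.

```r
# y18 from Example 1, with index 1..18 as t (no ties)
y18 <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))
t <- seq_along(y18)
n <- length(y18)

fit <- smooth.spline(t, y18)   # smoothing parameter chosen by GCV (the default)
res <- y18 - fit$y             # residuals Y_i - ghat(t_i)
lev <- fit$lev                 # leverages A_ii at the chosen smoothing parameter

cv  <- mean((res / (1 - lev))^2)            # leave-one-out shortcut formula
gcv <- mean(res^2) / (1 - mean(lev))^2      # replaces each A_ii by tr(A) / n
c(CV = cv, GCV = gcv, trA = sum(lev), equivalent_df = n - sum(lev))
```

Passing `cv = TRUE` to smooth.spline asks it to choose the smoothing parameter by ordinary leave-one-out cross-validation instead of GCV.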
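Available Software
smooth.spline in R
- Description:
  Fits a cubic smoothing spline to the supplied data.
- Usage:
# 'cars' ships with R; attach() makes speed and dist visible by name
attach(cars)
plot(speed, dist)
cars.spl <- smooth.spline(speed, dist)            # smoothing parameter chosen by GCV
cars.spl2 <- smooth.spline(speed, dist, df = 10)  # fixed equivalent degrees of freedom
lines(cars.spl, col = "blue")
lines(cars.spl2, lty = 2, col = "red")
detach(cars)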
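Available Software
Example 1
# smooth.spline() is in the 'stats' package (formerly 'modreg'), loaded by default
y18 <- c(1:3, 5, 4, 7:3, 2*(2:5), rep(10, 4))
xx <- seq(1, length(y18), length.out = 201)   # fine grid for drawing smooth curves
(s2 <- smooth.spline(y18))                    # smoothing parameter chosen by GCV
(s02 <- smooth.spline(y18, spar = 0.2))       # fixed smoothing parameter
plot(y18, main = deparse(s2$call), col.main = 2)
lines(s2, col = "blue")
lines(s02, col = "orange")
lines(predict(s2, xx), col = 2)               # same fits evaluated on the fine grid
lines(predict(s02, xx), col = 3)
mtext(deparse(s02$call), col = 3)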
Available Software
Example 1
[Figure: plot produced by Example 1.]
Available Software
Example 2
data(cars)   ## N = 50, n (# of distinct x) = 19
attach(cars)
plot(speed, dist, main = "data(cars) & smoothing splines")
cars.spl <- smooth.spline(speed, dist)
cars.spl2 <- smooth.spline(speed, dist, df = 10)
lines(cars.spl, col = "blue")
lines(cars.spl2, lty = 2, col = "red")
lines(smooth.spline(cars, spar = 0.1))
## spar: smoothing parameter (alpha) in (0,1]
legend(5, 120,
       c(paste("default [C.V.] => df =", round(cars.spl$df, 1)), "s( * , df = 10)"),
       col = c("blue", "red"), lty = 1:2, bg = "bisque")
detach()
Available Software
Example 2
[Figure: plot produced by Example 2.]
Extensions of
- Semiparametric modeling: a simple application to multiple regression
  $$Y = g(t) + x'\beta + \varepsilon$$
- Generalized linear models (GLM)
- Allowing all the explanatory variables to enter nonlinearly
  $$Y = g(t) + \varepsilon$$
- Additive model approach
  $$Y = \sum_{j=1}^{d} g_j(t_j) + \varepsilon$$
Reference
- P.J. Green and B.W. Silverman (1994). Nonparametric Regression and Generalized Linear Models. London: Chapman & Hall.
