#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass article
\begin_preamble
\usepackage{ragged2e}
\RaggedRight
\setlength{\parindent}{1 em}
\end_preamble
\language english
\inputencoding auto
\fontscheme pslatex
\graphics default
\paperfontsize 12
\spacing single
\papersize Default
\paperpackage a4
\use_geometry 1
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\leftmargin 1in
\topmargin 1in
\rightmargin 1.25in
\bottommargin 1in
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default
\layout Title
Approximations
\layout Author
Paul E.
Johnson
\layout Section
Linear Approximation
\layout Subsection
Linear equation
\layout Standard
We start here because it is particularly easy.
\layout Standard
\added_space_bottom vfill
Suppose
\begin_inset Formula $y=f(x)=a+bx.$
\end_inset
That's a straight line.
Draw a graph:
\layout Standard
The derivative of that line,
\begin_inset Formula $\frac{dy}{dx}=f'(x)=b$
\end_inset
, the same quantity you ordinarily know as the
\begin_inset Quotes eld
\end_inset
slope
\begin_inset Quotes erd
\end_inset
.
\layout Standard
Consider you want to build an approximation to the value
\begin_inset Formula $y=f(x)$
\end_inset
but you have to base your approximation from a
\begin_inset Quotes eld
\end_inset
starting estimate
\begin_inset Quotes erd
\end_inset
at a point
\begin_inset Formula $x_{0}$
\end_inset
.
We want to
\begin_inset Quotes eld
\end_inset
estimate
\begin_inset Quotes erd
\end_inset
the value of
\begin_inset Formula $y$
\end_inset
at some point,
\begin_inset Formula $x$
\end_inset
.
\layout Standard
Quite obviously, our
\begin_inset Quotes eld
\end_inset
best estimate
\begin_inset Quotes erd
\end_inset
is
\begin_inset Formula \begin{equation}
\hat{y}=f(x)=f(x_{0})+b\cdot(x-x_{0})\label{eq:linapprox1}\end{equation}
\end_inset
\layout Standard
The
\begin_inset Quotes eld
\end_inset
estimate
\begin_inset Quotes erd
\end_inset
using this formula is always exactly right, of course, because
\begin_inset Formula $f(x)$
\end_inset
is a straight line.
\layout Subsection
Nonlinear equation
\layout Standard
Now imagine that
\begin_inset Formula $f(x)$
\end_inset
is not a straight line.
Begin at
\begin_inset Formula $x_{0}$
\end_inset
and you can calculate
\begin_inset Formula $f(x_{0})$
\end_inset
.
Then the linear approximation to
\begin_inset Formula $f(x_{0})$
\end_inset
is
\begin_inset Formula \begin{equation}
\hat{y}=f(x_{0})+f'(x)(x-x_{0})\label{eq:linapprox2}\end{equation}
\end_inset
\layout Standard
\added_space_bottom vfill
Note that, depending on how sharply the curve changes, the linear approximation
may be either a good or bad estimate.
Make a picture:
\layout Section
\added_space_top vfill
Taylor Series Approximation
\layout Standard
Here is an intuition.
Suppose your approximation,
\begin_inset Formula $\hat{y}$
\end_inset
is too low.
That happened because the curve
\begin_inset Formula $f(x)$
\end_inset
is concave upwards--the slope is rapidly increasing.
That means the second derivative,
\begin_inset Formula $f''(x)>0$
\end_inset
.
\layout Standard
So, to improve the approximation, one might add another term that includes
\begin_inset Formula $f''(x)$
\end_inset
.
Suppose, for example, you tried:
\begin_inset Formula \begin{equation}
\hat{y}=f(x_{0})+f'(x_{0})(x-x_{0})+f''(x_{0})(x-x_{0})\label{eq:approx3}\end{equation}
\end_inset
\newline
That might get you closer to the mark.
It could be that the guess becomes too large after that correction.
\layout Standard
The problem with
\begin_inset LatexCommand \ref{eq:approx3}
\end_inset
is that it is completely unsystematic.
There's no
\begin_inset Quotes eld
\end_inset
guarantee
\begin_inset Quotes erd
\end_inset
that the error,
\begin_inset Formula $\hat{y}-f(x)$
\end_inset
will be within any given limits.
\layout Standard
That's where Taylor's theorem enters the picture.
The theorem holds that you can keep adding terms to the approximation and
it will get
\begin_inset Quotes eld
\end_inset
closer and closer
\begin_inset Quotes erd
\end_inset
to the correct value.
In fact, if you set your tolerable error level, you can add enough terms
to make the actual error smaller than the tolerable level.
\layout Standard
What is the magical formula? A Taylor series is:
\begin_inset Formula \begin{equation}
\hat{y}=f(x_{0})+f'(x_{0})(x-x_{0})+(\frac{1}{2})f''(x_{0})(x-x_{0})^{2}+(\frac{1}{6})f'''(x_{0})(x-x_{0})^{3}+\label{eq:approx3}\end{equation}
\end_inset
\begin_inset Formula \[
+...+(\frac{1}{N!})f^{N}(x_{0})(x-x_{0})^{N}\]
\end_inset
\layout Standard
For many well-behaved functions (insert boring math details here), as
\begin_inset Formula $N\rightarrow\infty$
\end_inset
, the series converges--approaches a finite value.
That means, for
\begin_inset Formula $x$
\end_inset
near to the starting estimate
\begin_inset Formula $x_{0}$
\end_inset
, then the value of
\begin_inset Formula $\hat{y}$
\end_inset
is equal to
\begin_inset Formula $f(x)$
\end_inset
.
In fact, it gets
\begin_inset Quotes eld
\end_inset
arbitrarily close
\begin_inset Quotes erd
\end_inset
.
\layout Standard
In most applied problems, it is not necessary to raise
\begin_inset Formula $N$
\end_inset
to a very high number.
In fact, most of the time, applications only increase
\begin_inset Formula $N$
\end_inset
to 2 or 3.
\layout Section
Newton's method of finding roots
\layout Standard
Given a function
\begin_inset Formula $g(x),$
\end_inset
we want to find a maximum or minimum value.
Assuming the function is differentiable, we want a point where
\begin_inset Formula $\frac{\partial y}{\partial x}=g'(x)=0$
\end_inset
.
\layout Subsection
Relabel the derivative at
\begin_inset Formula $x$
\end_inset
as
\begin_inset Formula $h(x).$
\end_inset
\layout Standard
Since I get tired of typing
\begin_inset Formula $g'(x)$
\end_inset
or the symbol
\begin_inset Formula $\frac{\partial y}{\partial x}$
\end_inset
, I will now just assign a new letter to the derivitive.
In the past, I confused people by re-using
\begin_inset Formula $f$
\end_inset
here, so I'll try to avoid that mistake by introducing the letter
\begin_inset Formula $h$
\end_inset
.
From this point forward,
\begin_inset Formula \[
h(x)=g'(x)\]
\end_inset
\layout Subsection
Roots.
\layout Standard
In order to find a critical point--max or min--we need to find a value of
\begin_inset Formula $x$
\end_inset
such that
\begin_inset Formula $h(x)=0.$
\end_inset
Such values of
\begin_inset Formula $x$
\end_inset
are called
\begin_inset Quotes eld
\end_inset
roots.
\begin_inset Quotes erd
\end_inset
\layout Subsection
Crude guess-timation
\layout Standard
Suppose
\begin_inset Formula $g(x)$
\end_inset
is
\begin_inset Quotes eld
\end_inset
U
\begin_inset Quotes erd
\end_inset
shaped and suppose the derivative
\begin_inset Formula $g'(x)$
\end_inset
(same as
\begin_inset Formula $h(x)$
\end_inset
) can be calculated.
Select some point
\begin_inset Formula $x_{0}$
\end_inset
and find out whether the slope
\begin_inset Formula $g'(x)$
\end_inset
(same as
\begin_inset Formula $h(x)$
\end_inset
)is positive or negative.
Suppose
\begin_inset Formula $g'(x_{0})<0$
\end_inset
(meaning
\begin_inset Formula $h(x_{0})<0$
\end_inset
), then we should move to the right to a point
\begin_inset Formula $x_{1}$
\end_inset
and calculate
\begin_inset Formula $g'(x_{1}).$
\end_inset
Keep going, over and over.
As
\begin_inset Formula $x_{N}$
\end_inset
gets closer and closer to the bottom of the
\begin_inset Quotes eld
\end_inset
U
\begin_inset Quotes erd
\end_inset
shape, then
\begin_inset Formula $g'(x_{N})$
\end_inset
will get smaller and smaller, until, when the exact bottom of the curve
is found,
\begin_inset Formula $g'(x)=0$
\end_inset
.
That is, we find where
\begin_inset Formula $h(x)=0$
\end_inset
.
\layout Standard
That might get boring, unless you have a good algorithm with which to choose
each successive guess.
One of the first proposals was offered by Newton, one of the inventors
of the calculus of infinite differences.
\layout Subsection
Newton's method
\layout Standard
Newton's method of finding the roots of a function, such as
\begin_inset Formula $h(x)$
\end_inset
, is rather simple.
\layout Standard
We think of
\begin_inset Formula $x_{0}$
\end_inset
as the initial guess and then
\begin_inset Formula $x_{1}$
\end_inset
is the
\begin_inset Quotes eld
\end_inset
new improved guess.
\begin_inset Quotes erd
\end_inset
So, repeatedly the following calculation is made.
First, calculate a
\begin_inset Quotes eld
\end_inset
newer improved guess
\begin_inset Quotes erd
\end_inset
with this formula:
\layout Standard
\begin_inset Formula \[
new\, guess=x_{0}-\frac{h(x_{0})}{h'(x_{0})}\]
\end_inset
\begin_inset Formula \[
x_{0}=x_{1}\]
\end_inset
\begin_inset Formula \[
x_{1}=new\, guess\]
\end_inset
\layout Standard
Make that calculation repeatedly, until
\begin_inset Formula $|x_{1}-x_{0}|$
\end_inset
is smaller than the target value.
Often, one will see the stoping criterion stated as a tolerance parameter,
such as
\begin_inset Formula \[
\frac{|x_{1}-x_{0}|}{x_{0}}<10^{-6}\]
\end_inset
\layout Standard
Note that, if the difference between
\begin_inset Formula $x_{1}$
\end_inset
and
\begin_inset Formula $x_{0}$
\end_inset
is very very small, then that must mean that
\begin_inset Formula $h(x_{0})$
\end_inset
has become very small.
\layout Standard
How to explain this algorithm? Perhaps a picture is worth a thousand words.
\layout Standard
\begin_inset Graphics
filename NewtonMethod.eps
width 5.5in
\end_inset
\layout Standard
It is not too difficult to justify this approach.
If one calculates a linear approximation to forecast
\begin_inset Formula $h(x_{1})$
\end_inset
from a point
\begin_inset Formula $x_{0}$
\end_inset
,
\begin_inset Formula \[
\hat{y}=\hat{h(x_{1})}=h(x_{0})+h'(x_{0})(x_{1}-x_{0})\]
\end_inset
Recall we are looking for the point where the function,
\begin_inset Formula $h(x)$
\end_inset
, equals 0.
\layout Standard
So think backwards for a minute.
\layout Standard
Don't estimate
\begin_inset Formula $y$
\end_inset
, but rather set
\begin_inset Formula $\hat{y}=0.$
\end_inset
\layout Standard
Solve for
\begin_inset Formula $x_{1}$
\end_inset
, the value for which
\begin_inset Formula $\hat{y}=0.$
\end_inset
\layout Standard
\begin_inset Formula \[
0=h(x_{0})+h'(x_{0})(x_{1}-x_{0})\]
\end_inset
\newline
which implies:
\layout Standard
\begin_inset Formula \[
-h(x_{0})=h'(x_{0})(x_{1}-x_{0})\]
\end_inset
\begin_inset Formula \[
-\frac{h(x_{0})}{h'(x_{0})}=x_{1}-x_{0}\]
\end_inset
\layout Standard
\begin_inset Formula \[
x_{1}=x_{0}-\frac{h(x_{0})}{h'(x_{0})}\]
\end_inset
\layout Section
Think backwards again.
\layout Standard
We were talking about adjusting
\begin_inset Formula $x$
\end_inset
to find the critical points.
If you replace
\begin_inset Formula $x$
\end_inset
in the above with a parameter, say
\begin_inset Formula $\theta$
\end_inset
or
\begin_inset Formula $\beta$
\end_inset
, then everything works OK.
You are finding optimal estimates.
\layout Standard
In statistics, we have some function--a maximum likelihood equation or a
sum of squared residuals.
The input data, the
\begin_inset Formula $y$
\end_inset
's and
\begin_inset Formula $x$
\end_inset
's, is just seen as
\begin_inset Quotes eld
\end_inset
numerical constants
\begin_inset Quotes erd
\end_inset
in the formula.
The variables that we are adjusting to maximize or minimize things, are
the values of the parameters.
\layout Standard
So, if you translate that idea into the above story, we would not be maximizing
\begin_inset Formula $g(x)$
\end_inset
, but rather a function that depends on
\begin_inset Formula $\beta$
\end_inset
, as in maximum likelihood model, where the MLE is the value
\begin_inset Formula $\hat{\beta}$
\end_inset
that maximizes this:
\begin_inset Formula \[
lnL(\beta|X,y)=\sum_{i=1}^{N}prob(y|X,\beta)\]
\end_inset
Or, in least squares analysis, the value
\begin_inset Formula $\hat{b}$
\end_inset
that minimizes the sum of squared errors between the observed
\begin_inset Formula $y_{i}$
\end_inset
and the predicted value
\begin_inset Formula $f(X,b)$
\end_inset
:
\begin_inset Formula \[
S(b|y,x)=\sum_{i=1}^{N}(y_{i}-f(X_{i},b))'(y_{i}-f(X_{i},b))\]
\end_inset
\layout Standard
In the OLS case, where we assume the linear model
\begin_inset Formula \[
y=f(X,b)=Xb+e\]
\end_inset
it is not difficult to find the minimum of the sum of squares.
\layout Standard
It is very difficult to find the minimum of the sum of squares if you put
some other function in place of
\begin_inset Formula $f(X,b)$
\end_inset
.
And, in the maximum likelihood case, if the probability model is something
not Normal, or the role of the parameters is complicated, then maximization
is difficult.
\layout Standard
When
\begin_inset Formula $lnL()$
\end_inset
or
\begin_inset Formula $S()$
\end_inset
can't be solved with algebra, then the maxima and minima have to be found
by numerical approximation.
And that's where approximations and algorithms like Newton's come into
play.
\the_end