\batchmode
\makeatletter
\def\input@path{{/home/pauljohn/SVN/SVN-guides/stat/Distributions/DistributionWriteups/Beta//}}
\makeatother
\documentclass[12pt,english]{article}
\usepackage{mathptmx}
\usepackage[T1]{fontenc}
\usepackage{geometry}
\geometry{verbose,tmargin=1in,bmargin=1in,lmargin=1in,rmargin=1in}
\makeatletter
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Textclass specific LaTeX commands.
\usepackage{Sweavel}
<>=
if(exists(".orig.enc")) options(encoding = .orig.enc)
@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
\usepackage{lmodern}
\usepackage{graphicx}
\usepackage{listings}
\lstset{tabsize=2, breaklines=true,style=Rstyle}
\makeatother
\usepackage{babel}
\begin{document}
\title{Beta Distribution}
\author{Paul Johnson and Matt Beverlin }
\date{June 10, 2013}
\maketitle
<>=
dir.create("plots", showWarnings=F)
@\fvset{listparameters={\setlength{\topsep}{0em}}}
\SweaveOpts{prefix.string=plots/t,split=T,ae=F,height=4,width=6}
\def\Sweavesize{\normalsize}
\def\Rcolor{\color{black}}
\def\Rbackground{\color[gray]{0.95}}
<>=
options(width=180, prompt=" ", continue=" ")
options(useFancyQuotes = FALSE)
set.seed(12345)
op <- par()
options(SweaveHooks=list(fig=function() par(ps=12)))
pdf.options(onefile=F,family="Times",pointsize=12)
@
\section{Description}
How likely is it that the Communist Party will win the next elections
in Russia?
In my view, the probability is 0.33. You may think the chance is 0.22.
Your brother thinks the chances are 0.6 and his sister-in-law thinks
the chances are 0.1.
People disagree and we would like a way to summarize the chances that
a person will say 0.4 or 0.5 or whatnot.
The Beta density function is a very versatile way to represent outcomes
like proportions or probabilities. It is defined on the continuum
between 0 and 1. There are two parameters which work together to determine
if the distribution has a mode in the interior of the unit interval
and whether it is symmetrical.
The Beta can be used to describe not only the variety observed across
people, but it can also describe your subjective degree of belief
(in a Bayesian sense). If you are not entirely sure that the probability
is 0.22, but rather you think that is the most likely value but that
there is some chance that the value is higher or lower, then maybe
your personal beliefs can be described as a Beta distribution.
\section{Mathematical Definition}
The standard $Beta$ distribution gives the probability density of
a value $x$ on the interval (0,1):
\begin{equation}
Beta(\alpha,\beta):\,\, prob(x|\alpha,\beta)=\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}\label{eq:BetaDensity}
\end{equation}
where $B$ is the beta function
\[
B(\alpha,\beta)=\int_{0}^{1}t^{\alpha-1}(1-t)^{\beta-1}dt
\]
\subsection{Don't let all of those betas confuse you.}
It is disappointingly confusing, but the word ``beta'' is used for
3 completely different meanings.
\begin{enumerate}
\item $Beta(\alpha,\beta)$ ``Beta'' is the name of the probability distribution
\item $B(\alpha,\beta)$ ``Beta'' is the name of a function that appears
in the denominator of the density function
\item $\beta$ ``Beta'' is the name of the second parameter in the density
function
\end{enumerate}
\subsection{About the Beta function $B$}
The Beta function $B$ in the denominator plays the role of a ``normalizing
constant'' which assures that the total area under the density curve
equals 1.
The Beta function is equal to a ratio of Gamma functions:
\[
B(\alpha,\beta)=\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}
\]
\\
Keeping in mind that for integers, $\Gamma(k)=(k-1)!$, one can do
some checking and get an idea of what the shape might be.
A 3 dimensional graph of the Beta function can be found in Figure
\ref{cap:The-Beta-Function}.
\begin{figure}
<>=
N <- 30
alphaseq <- seq(0.2, 2 , length.out=N)
betaseq <- seq(0.2, 2 , length.out=N)
myProbX <- function(x1, x2){
beta(x1,x2)
}
mybeta<- outer(alphaseq,betaseq, FUN="myProbX")
persp(alphaseq,betaseq, mybeta, theta = 120, phi = 7, col = "pink", zlim=c(0,10),zlab="Beta Function", xlab="alpha", ylab="beta",ticktype="detailed")
@
\includegraphics[width=6.5in]{plots/t-graph1.pdf}
\caption{The Beta Function\label{cap:The-Beta-Function}}
\end{figure}
\section{Moments of the Beta}
The expected value of a variable that is Beta distributed is:
\begin{equation}
E(x)=\mu=\frac{\alpha}{\alpha+\beta}\label{eq:BetaMean}
\end{equation}
and the variance is
\begin{equation}
Variance(x)=\frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}\label{eq:BetaVariance}
\end{equation}
People who are familiar with the Generalized Linear Model will notice
that
\[
V(\mu)=\frac{\beta}{(\alpha+\beta)(\alpha+\beta+1)}\cdot\mu
\]
is a variance function, $V(\mu)$, which indicates the dependence
of the observed variance on the mean. For a fixed pair of parameters
$(\alpha,\beta)$, the variance is proportional to $\mu$. A graph
illustrating the Variance function is presented in Figure \ref{cap:The-Variance-Function}.
The third and fourth moments are:
\begin{equation}
Skewness(x)=\frac{2(\beta-\alpha)\sqrt{1+\alpha+\beta}}{\sqrt{\alpha+\beta}(2+\alpha+\beta)}\label{eq:BetaSkewness}
\end{equation}
\begin{equation}
Kurtosis(x)=\frac{6[\alpha^{3}+\alpha^{2}(1-2\beta)+\beta^{2}(1+\beta)-2\alpha\beta(2+\beta)]}{\alpha\beta(\alpha+\beta+2)(\alpha+\beta+3)}
\end{equation}
\begin{figure}
<>=
N <- 30
alphaseq <- seq(0.2, 2 , length.out=N)
betaseq <- seq(0.2, 2 , length.out=N)
VarMu <- function(x1, x2){
v <- x2/((x1+x2)*(x1+x2+1))
}
myvar <- outer(alphaseq,betaseq, FUN="VarMu")
mypersp <- persp(alphaseq,betaseq, myvar, theta = 125, phi = 7, col = "pink", zlim=c(0,0.5),zlab="Multiplier", xlab="alpha", ylab="beta",ticktype="detailed")
text(trans3d(1.2,0.1,0.4, mypersp), label=expression(Var(x) == mu %*% Multiplier ))
@
\includegraphics[width=6.5in]{plots/t-graph2.pdf}
\caption{The Ratio of Variance to the Mean \label{cap:The-Variance-Function} }
\end{figure}
\subsection{The Mode}
If $\alpha>1$and $\beta>1$, the peak of the density is in the interior
of {[}0,1{]} and mode of the $Beta$ distribution is
\begin{equation}
mode=\gamma=\frac{\alpha-1}{\alpha+\beta-2}\label{eq:BetaMode}
\end{equation}
If $\alpha$ or $\beta<1$, the mode may be at an edge.
As we will illustrate below, if $\alpha=\beta=1$, then the Beta is
identical to a Uniform distribution.
\section{Illustration}
One advantage of the Beta distribution is that it can take on many
different shapes. If one believed that all scores were equally likely,
then one could set the parameters $\alpha=1$ and $\beta=1$, as illustrated
in Figure \ref{Fig:Uniform}, this gives a ``flat'' probability
density function.
\begin{figure}
<>=
x <- seq(0,1,length=21)
myvar <- dbeta (x,1,1)
plot (x,myvar,main="",xlab="x",ylab="probability",ylim=c(0,1),type="l")
@
\includegraphics[width=6in]{plots/t-graph3.pdf}
\caption{Beta(1,1) is the Uniform distribution\label{Fig:Uniform}}
\end{figure}
In models of elections, one may need a distribution of ideal points
to resemble a single-peaked distribution on the interval {[}0,1{]}.
The Beta can be very useful in this kind of exercise. Consider Figure
\ref{cap:Some-Pleasant-Single-Peaked}.
\begin{figure}
<>=
#PJ 2004-05-04
#For project with erik herron to demonstrate Beta parameters
#
sampleSize <- 100
# Suppose you want to generate 10 distributions
# and display them in a single picture
par(mfrow=c(5,1))
# x11(height=3.0,width=5)
createDist <-function(mode){
alphaMin <- 3
B <- (alphaMin-mode*(alphaMin-2)-1)/mode
A <- (mode*alphaMin-2*mode+1)/(1-mode)
if (mode <= 0.5) {
alpha1 <- alphaMin
alpha2 <- B
} else
{
alpha2 <- alphaMin
alpha1 <- A
}
z <- dbeta(seq(0,1,length=100),shape1=alpha1,shape2=alpha2)
plot(1:100,z, type="l",ylim=c(0,2.5),main=paste("Beta(",round(alpha1,2),",",round(alpha2,2),")"), xlab="x",ylab="probability density")
}
par(mfrow=c(3,1))
createDist(.3)
createDist(.5)
createDist(.7)
@
\includegraphics[width=6in]{plots/t-graph4.pdf}
\caption{Some Pleasant Single-Peaked Betas\label{cap:Some-Pleasant-Single-Peaked}}
\end{figure}
At one point, it fascinated me that the mode did not equal the mean
and that the variance ends up characterizing the ``slack'' between
those two things. Various densities in Figure \ref{cap:ModeMeanVariance}
might be entertaining. In these examples, the Beta parameters are
chosen to keep the mode constant at 0.3. Note how the mean and variance
change across the illustrations.
\begin{figure}
\caption{Beta Distributions with Mode=0.3\label{cap:ModeMeanVariance}}
<>=
#PJ 2004-05-04
#For project with erik herron do demonstrate Beta parameters
#
sampleSize <- 500
createDist <-function(alphaMin, mode) {
B <- (alphaMin-mode*(alphaMin-2)-1)/mode
A <- (mode*alphaMin-2*mode+1)/(1-mode)
if (mode <= 0.5) {
alpha1 <- alphaMin
alpha2 <- B
} else {
alpha2 <- alphaMin
alpha1 <- A
}
mean1 <- (alpha1)/(alpha1+alpha2)
var1 <- (alpha1*alpha2)/(((alpha1+alpha2)^2)*(alpha1+alpha2+1));
z <- dbeta(seq(0,1,length=100),shape1=alpha1,shape2=alpha2);
plot(1:100,z, type="l",ylim=c(0,3.5),main=paste("Beta(",round(alpha1,2),", ", round(alpha2,2),")"," mode=",mode," mean=",round(mean1,2),", var=",round(var1,3),sep=""), xlab="ideal point",ylab="density")
}
makeDisp <- function(aseq, mode){
par(mfrow=c(5,2))
for (i in 1:length(aseq))
createDist(aseq[i], mode)
}
myseq <- seq(1.1, 7, length=10)
makeDisp(myseq,.3)
@
\includegraphics[width=6in]{plots/t-graph5.pdf}
\end{figure}
Perhaps, as they say, the versatility of the Beta is a strength, but
there is some reason to be cautious. If the parameters wander into
the (0,1) range, then some pretty surprising shapes can appear. Consider
Figure \ref{cap:Some-Unpleasant-Betas}.
\begin{figure}
<>=
createDist2 <-function(alpha1,alpha2) {
mean1 <- (alpha1)/(alpha1+alpha2)
var1 <-(alpha1*alpha2)/(((alpha1+alpha2)^2)*(alpha1+alpha2+1));
inx <- seq(0,1,length.out=100)
z <- dbeta(inx,shape1=alpha1,shape2=alpha2);
mytitle <- paste("Beta(",alpha1,",",alpha2," mean=",round(mean1,2)," var=",round(var1,2),sep="")
plot(inx,z,type="l",ylim=c(0,3.5),xlab="x",ylab="density",main=mytitle)
}
makeDisp <- function() {
myalpha<- c(0.2, 0.5, 0.75 , 1.1)
par(mfcol=c(4,2))
for (j in 1:4){
createDist2(0.7,myalpha[j])
}
for (j in 1:4){
createDist2(1.2,myalpha[j])
}
}
makeDisp()
@
\includegraphics[width=6in]{plots/t-graph6.pdf}
\caption{Some Unpleasant Betas\label{cap:Some-Unpleasant-Betas}}
\end{figure}
\section{About the connection between the mean, the mode, and the variance}
In the pictures displaying the Beta density, one's eye is drawn to
the peak of the frequency distribution, which is the mode. We can
set the Beta's parameters in order to generate a distribution with
a desired mode. Let the mode be represented by $\gamma$.
Here's a simple starting point: Suppose the mode is .50. That is the
same as the mean (its symmetric), and the mode formula (\ref{eq:BetaMode})
implies:
\begin{equation}
.50=\frac{\alpha-1}{\alpha+\beta-2}
\end{equation}
and
\[
.50\alpha+.50\beta-1=\alpha-1
\]
\[
.5\alpha=.5\beta
\]
\begin{equation}
\alpha=\beta
\end{equation}
If one wants the mode to be in the middle, one can choose any value
for $\alpha,$ as long as one chooses the same value for $\beta$.
(Whew! What a relief. This exactly matched my intuition.)
If the mode is in the center, we know $\alpha$ and $\beta$ are equal,
but we don't know their values. The selection, it turns out, depends
on how much diversity there is. If one wants a distribution to have
points ``tightly bunched'' around the mode, then one should choose
a large value for $\alpha,$ say 10.0,
\begin{equation}
variance\, of\, Beta(10,10)=0.01190
\end{equation}
\\
In contrast, if $\alpha=1.5$, the variance is much greater:
\begin{equation}
variance\, of\, Beta(1.5,1.5)=0.0625
\end{equation}
\\
Seen in this light, the parameter $\alpha$ is a ``homogeneity indicator.''
As $\alpha$ gets bigger, the distribution collapses around the mode.
Although this particular calculation works only for a mode in the
center, it does outline the process that we can use to assign $\alpha$
and $\beta$ for all other values of the mode.
Suppose the mode is .4. From equation \ref{eq:BetaMode}
\[
.40=\frac{\alpha-1}{\alpha+\beta-2}
\]
\[
.40\alpha+.40\beta-0.8=\alpha-1
\]
\[
.4\beta=.6\alpha-0.2
\]
\begin{equation}
\beta=\frac{3}{2}\alpha-\frac{1}{2}
\end{equation}
or
\[
.60\alpha=.2+.40\beta
\]
\begin{equation}
\alpha=\frac{1}{3}+\frac{2}{3}\beta
\end{equation}
\\
It is quite possible to calculate one parameter as a function of another,
after specifying the mode, even if the mode is off center.
Generally speaking, for any value of the mode, $\gamma\in(0,1)$ (keeping
in mind the original stipulation that $\alpha,\beta>1$):
\begin{equation}
\gamma=\frac{\alpha-1}{\alpha+\beta-2}
\end{equation}
\begin{equation}
\gamma\alpha+\gamma\beta-2\gamma=\alpha-1
\end{equation}
\begin{equation}
(1-\gamma)\alpha=\gamma\beta-2\gamma+1
\end{equation}
\begin{equation}
\alpha=\frac{\gamma\beta-2\gamma+1}{(1-\gamma)}=\frac{\beta-2+\frac{1}{\gamma}}{(\frac{1}{\gamma}-1)}=\frac{\gamma}{1-\gamma}\beta-\frac{2\gamma-1}{1-\gamma}\label{eq:AlphaFnMode}
\end{equation}
\\
So $\alpha$ is a linear function of $\beta$. (Note: 2013-10-25;
reader notified me of typographical error in equation \ref{eq:AlphaFnMode}.
Sorry!)
And
\begin{equation}
\gamma\beta=\alpha-1-\gamma\alpha+2\gamma
\end{equation}
\begin{equation}
\beta=\frac{\alpha-\gamma\alpha+2\gamma-1}{\gamma}=\frac{\alpha-\gamma(\alpha-2)-1}{\gamma}=\frac{(1-\gamma)}{\gamma}\alpha-\frac{1-2\gamma}{\gamma}\label{eq:BetaFnMode}
\end{equation}
This indicates that if we begin with the mode, and then take as given
either $\alpha$ or $\beta$, we can calculate the missing parameter
($\beta$ or $\alpha$, as the case may be). As a result, instead
of thinking of the Beta's shape as determined by parameters $\alpha$
and $\beta$, sometimes it is easier to think of it in terms of the
mode (most likely value) and the homogeneity.
\end{document}