Distribution Overview

Paul E. Johnson

2015-02-04

Visualize (Whirled Peas)

Quick PDF sketch

Guess that Distribution

I’m Not that

length(x1)
## [1] 2000
x1[1:10]
##  [1]  178.86 -225.95  371.74  133.33  311.81  177.53  368.16  363.94
##  [9]   73.85  335.23
rockchalk::summarize(x1)
## $numerics
##            x1
## 0%    -549.42
## 25%    -49.50
## 50%     87.25
## 75%    223.28
## 100%   757.48
## mean    84.84
## sd     196.60
## var  38653.11
## NA's     0.00
## N     2000.00
## 
## $factors
## NULL

I am not: a) Poisson b) Normal c) Uniform d) Beta e) Binomial

I’m Not that

[Figure: plot of x1]

I am not a) Poisson b) Normal c) Gamma d) Beta e) Binomial

Answer:

x1 <- rnorm(2000, 88, 200)
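One way to play the game in R is to check each candidate's support against the sample. This is only a sketch: the seed and the helper name `couldBe` are assumptions of mine, not part of the original slides, and the checks can only rule candidates out, never confirm one.

```r
set.seed(1234)  # arbitrary seed, only so the sketch is reproducible
x1 <- rnorm(2000, mean = 88, sd = 200)

## Hypothetical helper: which candidates survive simple support checks?
couldBe <- function(x) {
    isCount <- all(x == round(x))            # counts have no fractional part
    c(Poisson  = isCount && all(x >= 0),     # non-negative integers
      Binomial = isCount && all(x >= 0),     # integers from 0 to size
      Gamma    = all(x > 0),                 # strictly positive reals
      Beta     = all(x > 0 & x < 1),         # reals between 0 and 1
      Normal   = TRUE)                       # any real value is possible
}
couldBe(x1)
```

Negative, non-integer values knock out every candidate with a restricted support. Uniform is left out of the helper on purpose, because a support check alone cannot exclude it.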

I’m Not that 2

x2[1:10]
##  [1] 2.2661 0.9260 0.6397 0.3697 0.8037 4.2529 2.6449 1.1851 0.4970 0.7945
summarize(x2)
## $numerics
##            x2
## 0%      0.027
## 25%     0.964
## 50%     1.714
## 75%     2.680
## 100%   10.756
## mean    1.991
## sd      1.377
## var     1.896
## NA's    0.000
## N    2000.000
## 
## $factors
## NULL

I am not a) Poisson b) Normal c) Uniform d) Beta

I’m Not that 2

[Figure: plot of x2]

I am not a) Poisson b) Logistic c) Uniform d) Beta

Answer

x2 <- rgamma(2000, 2, 1)
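A quick check of the answer against the summary, assuming R's positional arguments rgamma(n, shape, rate), so shape \(\alpha = 2\) and rate \(\beta = 1\):

\[ E[x] = \frac{\alpha}{\beta} = 2, \qquad Var[x] = \frac{\alpha}{\beta^{2}} = 2 \]

The sample mean (1.991) and variance (1.896) reported above are close to both.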

I’m Not that 3

x3[1:10]
##  [1] 2 2 3 3 2 5 3 3 4 3
rockchalk::summarize(x3)
## $numerics
##            x3
## 0%      0.000
## 25%     2.000
## 50%     3.000
## 75%     4.000
## 100%    9.000
## mean    2.959
## sd      1.447
## var     2.094
## NA's    0.000
## N    2000.000
## 
## $factors
## NULL

I am not: a) Poisson b) Normal c) Uniform d) Beta e) Binomial

I’m Not that 3

[Figure: plot of x3]

I am not a) Poisson b) Normal c) Gamma d) Beta

Answer:

Actually, I was rbinom(2000, 10, prob = c(0.3))
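The binomial answer can be checked against its theoretical moments, size * prob and size * prob * (1 - prob). A minimal sketch; the seed and the object names `theory` and `observed` are my own choices, and a fresh draw will not reproduce the exact numbers in the summary above.

```r
set.seed(2015)  # arbitrary seed for reproducibility
size <- 10
prob <- 0.3
x3 <- rbinom(2000, size = size, prob = prob)

## Theoretical moments of a binomial versus the sample estimates
theory   <- c(mean = size * prob, var = size * prob * (1 - prob))
observed <- c(mean = mean(x3), var = var(x3))
round(rbind(theory, observed), 3)
```

The theoretical values are 3 and 2.1, which is where the 2.959 and 2.094 in the summary come from.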

Harvesting from R/WorkingExamples

plot-histogramWithLinesAndLegend.R (html)

drawHist(x1)

[Figure: histogram of x1 drawn by drawHist(x1)]

ex 2

drawHist(x2)

[Figure: histogram of x2 drawn by drawHist(x2)]

ex 3

drawHist(x3)

[Figure: histogram of x3 drawn by drawHist(x3)]
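The definition of drawHist() is not shown in these slides. Going only by the file name (a histogram with lines and a legend), a stand-in might look like the sketch below. The name `drawHistSketch`, its arguments, and the choice of overlays are all assumptions, not the code from WorkingExamples.

```r
## Hypothetical stand-in for drawHist(); the original function is not shown here.
drawHistSketch <- function(x, breaks = 30) {
    ## Density-scaled histogram so the curves share the y axis
    hist(x, breaks = breaks, freq = FALSE,
         main = "Histogram with density lines", xlab = "x")
    ## Line 1: kernel density estimate of the sample
    lines(density(x), lty = 1, lwd = 2)
    ## Line 2: normal curve with the same mean and sd as the sample
    xseq <- seq(min(x), max(x), length.out = 200)
    lines(xseq, dnorm(xseq, mean = mean(x), sd = sd(x)), lty = 2, lwd = 2)
    legend("topright",
           legend = c("Kernel density", "Normal, same mean and sd"),
           lty = c(1, 2), lwd = 2)
    invisible(NULL)
}

drawHistSketch(rnorm(2000, 88, 200))
```

Overlaying a normal curve with matching moments makes it easy to see when a sample, such as the skewed gamma draw, does not fit.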

Expected Values

EV: Simple idea with complicated jargon

Probability weighted sum of outcomes

Calculate the Expected Value: ?

0.5 * 1 + 0.4 * 2 + 0.1 * 3

Easy!
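The same probability weighted sum in R, as a small sketch. The vector names `outcomes` and `probs` are mine.

```r
outcomes <- c(1, 2, 3)
probs    <- c(0.5, 0.4, 0.1)

## Probabilities must sum to 1 before the weighted sum means anything
stopifnot(isTRUE(all.equal(sum(probs), 1)))

ev <- sum(probs * outcomes)
ev
## [1] 1.6
```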

Do you think terminology is the major problem?

Even if EV is not subjectively informative…

Got R?

x1 <- rpois(5000, lambda = 0.2)
hist(x1, main = "My EV is 0.2")
x2 <- rpois(5000, lambda = 10)
hist(x2, main = "My EV is 10")
mean(x1)
mean(x2)
sd(x1)
sd(x2)

I got

x1 <- rpois(5000, lambda = 0.2)
hist(x1, main = "My EV is 0.2")

[Figure: histogram of x1, titled “My EV is 0.2”]

x2 <- rpois(5000, lambda = 10)
hist(x2, main = "My EV is 10")

[Figure: histogram of x2, titled “My EV is 10”]
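For reference when reading the mean() and sd() results: a Poisson variable has variance equal to its expected value, so

\[ E[x] = \lambda, \qquad Var[x] = \lambda, \qquad sd[x] = \sqrt{\lambda} \]

With \(\lambda = 0.2\) the theoretical sd is about 0.447; with \(\lambda = 10\) it is about 3.162.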


Sample average is one way to estimate EV

\[ \bar{x} = \frac{\text{sum of observed values}}{N} \]

- Need criteria to say if \(\bar{x}\) is a “good” estimate of \(E[x]\)
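One informal criterion is whether \(\bar{x}\) gets closer to \(E[x]\) as N grows. The sketch below tries that with Poisson draws; the seed, the sample sizes, and the object names are assumptions chosen for illustration.

```r
set.seed(42)    # arbitrary seed for reproducibility
lambda <- 10    # true expected value of the Poisson draws
N <- c(10, 100, 1000, 10000, 100000)

## One sample average per sample size
xbar <- sapply(N, function(n) mean(rpois(n, lambda = lambda)))

## The error column should tend to shrink as N grows
data.frame(N = N, xbar = xbar, error = xbar - lambda)
```

A single run like this is only suggestive. Judging an estimator properly means looking at its distribution over many repeated samples.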

Probabilities. Who Likes Calculus?

Continuous variables

Extreme scores

I hate the word population

Data Generating Process

I want the distribution of an estimate

Theoretical Variance

Variance

Terminology

How to calculate sample variance estimates

But it is not unbiased.

\[ s1 = \frac{\text{Sum of } (x - E[x])^2}{N-1} \]
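The bias can be seen by simulation. This sketch compares the N and N - 1 divisors over many small samples. Note the assumptions: it centers on the sample mean, mean(x), because \(E[x]\) is usually unknown in practice, and the seed, sample size, and names are mine.

```r
set.seed(99)     # arbitrary seed for reproducibility
N <- 5           # small samples make the bias easy to see
reps <- 20000    # number of repeated samples
trueVar <- 4     # variance of rnorm(..., sd = 2)

est <- replicate(reps, {
    x <- rnorm(N, mean = 0, sd = 2)
    ss <- sum((x - mean(x))^2)    # sum of squared deviations from the sample mean
    c(divN = ss / N, divNminus1 = ss / (N - 1))
})

## Average of each estimator across the repeated samples
rowMeans(est)
```

Dividing by N should average near trueVar * (N - 1) / N = 3.2, while dividing by N - 1 should average near 4.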