Random Numbers
Probability Theory and Statistics
• Random Numbers
• Distributions
• Samples
• Confidence Intervals
• ref: Law & Kelton 1991, Chapter 4
• Random number (satunnaisluku)
• Sample space (otosavaruus)
  – discrete: S = {1, 2, 3, 4, 5, 6}
  – continuous: S = [0, 1] or S = [0, 1)
• Random variable (satunnaismuuttuja)
  – one way to generate random numbers
  – determined by a function or rule: R → S
  – random variables: X, Y = 5X, Z = g(X, Y)
28.10.1998
Copyright Teemu Kerola 1998
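The random-variable notation above can be tried out directly; a minimal Python sketch (the die-valued X and the function g below are illustrative choices, not from the slides):

```python
import random

random.seed(1)  # illustrative seed for reproducibility

# X: a random variable over the discrete sample space S = {1, 2, 3, 4, 5, 6}
def X():
    return random.randint(1, 6)

def g(a, b):
    # an illustrative g; any function of random variables is a new random variable
    return a + b

x = X()
y = 5 * x        # Y = 5X: a random variable derived from X
z = g(x, y)      # Z = g(X, Y)
print(x, y, z)
```

Each call of X() draws a new value; Y and Z then follow deterministically from that draw, which is exactly the "function or rule" view of a random variable.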
Random Variable X
• Distribution function (kertymäfunktio): F(x) = P(X ≤ x)
• Density function (tiheysfunktio): f(x) = P(X = x) = P(x)
• Both F(x) and f(x) determine the (same) distribution
• X discrete?
  – P(a < X ≤ b) = Σ_{a < x ≤ b} f(x) = F(b) − F(a)
  – dice: P(X = 6) = 1/6
• X continuous?
  – P(a < X ≤ b) = ∫_a^b f(x) dx = F(b) − F(a)
[Figures: example densities f(x) and distribution functions F(x)]
• Discrete distribution, X = Random[1,6]: density f(x) = 1/6 at x = 1, ..., 6 (sum = 1); F(x) is a step function rising from 0 to 1
• Continuous distribution, X = Random[1,6]: density f(x) = 1/5 on [1, 6] (area = 1); F(x) rises linearly from 0 to 1
• Continuous distribution, X = Random[0,1): density f(x) = 1 on [0, 1) (area = 1); F(x) = x on [0, 1)
• temperature: P(20 ≤ X ≤ 24) = 80%
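The dice and uniform examples above can be checked by simulation; a minimal Python sketch (seed and sample size are illustrative):

```python
import random

random.seed(42)  # illustrative seed for reproducibility
n = 100_000

# Discrete: X = Random[1,6] (a fair die); estimate f(6) = P(X = 6) = 1/6
rolls = [random.randint(1, 6) for _ in range(n)]
p6 = rolls.count(6) / n                      # should be close to 1/6

# P(a < X <= b) = F(b) - F(a) for the die, with a = 2, b = 5
def F(x):
    # distribution function of a fair die
    return sum(1 for k in range(1, 7) if k <= x) / 6

p_2_5 = sum(1 for r in rolls if 2 < r <= 5) / n   # empirical estimate
exact = F(5) - F(2)                               # 3/6 = 0.5

# Continuous: X = Random[0,1) has density 1, so P(0.2 < X <= 0.7) = 0.5
us = [random.random() for _ in range(n)]
p_u = sum(1 for u in us if 0.2 < u <= 0.7) / n
print(p6, p_2_5, exact, p_u)
```

With 100 000 draws the empirical frequencies land within about ±0.005 of the exact probabilities.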
Continuous distribution X = Expon(a)
• density f(x) = (1/a) e^(−x/a) for x ≥ 0 (area = 1)
• memoryless property (unohtavaisuusominaisuus)

Properties for Random Var. X
• median (mediaani) X0.5: F(X0.5) = 0.5
  – 50% percentile = X0.5
• mean, expected value (keskiarvo): EX = E(X) = µ
  – X continuous: EX = ∫ x f(x) dx over (−∞, ∞)
  – X discrete: EX = Σ_x x P(x)
• percentiles (fraktiili)
  – 5% percentile = X0.05, 95% percentile = X0.95
• [Figure: f(x) with strips P(X ∈ [x, x+∆x]) = 5% marking X0.05 and X0.95; F(x) rising from 0 to 1]

Variance and Deviation (varianssi, hajonta)
• mean E(X) = EX = µ = ∫ x f(x) dx
  – describes where the values are located
• variance Var(X) = σ² = E[(X − µ)²] = E(X²) − µ²
  – describes how close to the average the values are
• standard deviation σ = √σ²
• [Figure: densities around µ with very small, small, and large variance]

Standard Normal Distribution
• Normal distribution Norm(µ, σ²)
• Standard Normal Distribution = Z-distribution = Norm(0, 1) = N(0, 1)
• Z-percentiles: Z0.95 = 1.645 = Z_{1−α/2} for α = 10%
• [Figure: N(0, 1) density; 90% of the probability mass lies in (−1.65, 1.65), 60% in (−0.84, 0.84)]

Covariance (kovarianssi)
• X and Y are random variables
  – Does X depend on Y, or vice versa, or what?
• Covariance Cov(X, Y) = C_XY = E[(X − µ_X)(Y − µ_Y)] = E(XY) − µ_X µ_Y
• C_XY = C_YX
• C_XX = Cov(X, X) = Var(X) = σ²_X
  – C_XY = 0: X, Y uncorrelated (X large ⇒ Y ???)
  – C_XY > 0: X, Y positively correlated (X large ⇒ Y large)
  – C_XY < 0: X, Y negatively correlated (X large ⇒ Y small)

Correlation (korrelaatio)
• Problem: C_XY varies much in magnitude
  – cannot deduce much from its magnitude alone
• Solution: scale by the (geometric average of the) variances
• Correlation Cor(X, Y) = ρ_XY = C_XY / √(σ²_X σ²_Y) ∈ [−1, 1]
  – ρ_XY = 0: X, Y uncorrelated
  – ρ_XY > 0: X, Y positively correlated
  – ρ_XY < 0: X, Y negatively correlated
  – ρ_XY = 0.98: X, Y highly positively correlated
• Correlation does not mean causal dependence
  – Fig. on homework vs. exam results
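The covariance and correlation formulas above translate directly into code; a minimal sketch over paired samples (the data values are made up for illustration):

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    # Cov(X, Y) = E(XY) - muX * muY, estimated from paired samples
    mx, my = mean(x), mean(y)
    return mean([xi * yi for xi, yi in zip(x, y)]) - mx * my

def cor(x, y):
    # Cor(X, Y) = C_XY / sqrt(Var(X) * Var(Y)); note C_XX = Var(X)
    return cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.9]   # roughly y = 2x, so rho is close to +1
print(cov(x, y), cor(x, y))
```

The raw covariance (about 3.9 here) says little by itself; after scaling, the correlation of about 0.999 makes the near-linear dependence obvious, which is exactly the point of the scaling step.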
Independent X and Y? (riippumattomuus)
• Random variables X, Y
• Joint probability density function (yhteisjakauman tiheysfunktio) f(x, y):
  P(X ∈ A, Y ∈ B) = ∫_A ∫_B f(x, y) dx dy
• X and Y are independent if
  f(x, y) = f_X(x) · f_Y(y)
  where f_X(x) = ∫ f(x, y) dy and f_Y(y) = ∫ f(x, y) dx (over (−∞, ∞))
  are the marginal probability density functions

Set of Random Variables
• Many random variables Xi, i ∈ [1, N] or i ∈ [1, ∞)
• E(Xi) = EXi = µ_Xi = µi
• Var(Xi) = σ²_Xi = σi²
• Cov(Xi, Xj) = C_Xi,Xj = C_ij
• Cor(Xi, Xj) = ρ_Xi,Xj = ρ_ij
• All comparisons pair-wise!

IID Set of Random Variables
• Random variables X1, X2, ...
• Xi's identically distributed?
  – EXi = µi = µ for all i = 1, 2, ...
  – Var(Xi) = σi² = σ² for all i = 1, 2, ...
  – f_Xi(x) = f(x) for all i = 1, 2, ...
• Xi's independent?
  – Xi and Xj are independent for all i ≠ j ∈ {1, 2, ...}
• Now, the Xi's are IID
  – independent and identically distributed (IID)
  – almost never the case in real life
  – almost always needed for the math to work out

Sample
• Problem: What is the distribution of X?
  – What are EX = µ and Var(X) = σ²?
• Solution: try it out and see
• Sample: one set of values for X
  – roll a die 100 times? 10 000 times?
  – measure the response time from 100 trials?
  – write down the sales sum for the next 10 years?
• New question: what can we tell about X from the sample?

Sample Properties
• Sample set (otosjoukko) S = { Xi | i = 1, ..., n }
• Sample points (otospisteet) Xi
  – should be a set of IID random variables
  – Xi ~ X, i.e., Xi has the same (unknown) distribution as X
  – EXi = EX = µ and Var(Xi) = Var(X) = σ²
• Sample mean (otoskeskiarvo)
  X̄(n) = (1/n) Σ_i Xi
  – X̄(n) is an unbiased estimator (harhaton estimaatti) for EX = µ: E X̄(n) = µ
  – with large n, the sample mean is close to µ
  – how close? how large an n?

Sample Properties (cont.)
• Variance of the sample mean: Var(X̄(n)) = σ² / n
  – problem: Var(X) = σ² is unknown
  – solution: use the sample variance (otosvarianssi) s²(n) instead:
    s²(n) = Σ_i [Xi − X̄(n)]² / (n − 1) = ( Σ_i Xi² − n X̄²(n) ) / (n − 1)
  – now,
    Vâr(X̄(n)) = s²(n) / n = Σ_i [Xi − X̄(n)]² / ( n(n − 1) )
    is an unbiased estimator for Var(X̄(n))
  – we now know how good an estimator the sample mean is
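The sample mean and both forms of the sample variance can be sketched as follows (the observations are made up for illustration); the two forms of s²(n) give the same value:

```python
def sample_mean(xs):
    # X_bar(n) = (1/n) * sum of Xi
    return sum(xs) / len(xs)

def sample_var(xs):
    # s^2(n) = sum of [Xi - X_bar(n)]^2 / (n - 1)
    n, xb = len(xs), sample_mean(xs)
    return sum((x - xb) ** 2 for x in xs) / (n - 1)

def sample_var_alt(xs):
    # equivalent form: (sum of Xi^2 - n * X_bar^2(n)) / (n - 1)
    n, xb = len(xs), sample_mean(xs)
    return (sum(x * x for x in xs) - n * xb * xb) / (n - 1)

xs = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7]
xb = sample_mean(xs)
s2 = sample_var(xs)
var_of_mean = s2 / len(xs)   # estimated Var(X_bar(n)) = s^2(n) / n
print(xb, s2, var_of_mean)
```

The second form avoids a second pass over the data, which is why it appears on the slide; numerically the subtraction can lose precision for large values, so the first form is usually preferred in code.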
Central Limit Theorem (CLT) (keskeinen raja-arvolause)
• "When n becomes large, the normalized sample mean becomes normally distributed":
  ( X̄(n) − µ ) / √(σ² / n) → N(0, 1) as n → ∞

How to Use the Central Limit Theorem?
• Problem: what if n is not very large?
• Solution: use the Student distribution instead:
  ( X̄(n) − µ ) / √(s²(n) / n) ≈ T(n − 1)
  – use the unbiased estimator s²(n) for σ²
  – n − 1 is the number of degrees of freedom (vapausaste) for T
• Percentiles for N(0, 1) and T(k) are tabulated

Confidence Interval (CI) (luottamusväli)
• Use the CLT for percentiles:
  P( −z_{1−α/2} ≤ (X̄(n) − µ) / √(σ² / n) ≤ z_{1−α/2} ) = 1 − α
  or
  P( −z_{1−α/2} ≤ (X̄(n) − µ) / √(s²(n) / n) ≤ z_{1−α/2} ) = 1 − α
• Solve for µ to get a confidence interval for µ
• (1 − α) confidence interval:
  P( µ ∈ X̄(n) ± z_{1−α/2} √(s²(n) / n) ) = 1 − α
• 90% confidence interval, α = 10%:
  P( µ ∈ X̄(n) ± z_{0.95} √(s²(n) / n) ) = 90%  (N(0, 1), n > 30?)
  or
  P( µ ∈ X̄(n) ± t_{n−1, 0.95} √(s²(n) / n) ) = 90%  (T(n − 1), n ≤ 30?)

Confidence Interval
• [Figure: density of the normalized mean; 90% of the probability mass in (−1.65, 1.65) gives the 90% confidence interval, the narrower 60% confidence interval spans (−0.84, 0.84)]
• Distribution for the normalized µ is based on the sample
• The 60% CI is narrower, but more likely to be wrong, i.e., there is a 40% chance that it does not contain the true mean

Effect of Sample Size, Strong Law of Large Numbers (suurten lukujen laki)
• Distribution N(µ, s²) for µ (the true, unknown mean) is based on the sample
  – large sample: s² small
  – small sample: s² large
• Strong Law of Large Numbers:
  X̄(n) → µ with probability 1, as n → ∞

Example A
• Question: X is normally distributed. What is the mean µ of X?
• Answer:
  – Take a sample of 10 observations of X:
    1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09
  – compute X̄(10) and s²(10): 1.34 and 0.17
  – get the percentile t_{9, 0.95} = 1.83 of the T-distribution with 9 degrees of freedom
  – compute the 90% confidence interval for µ:
    X̄(10) ± t_{9, 0.95} √(s²(10) / 10) = (1.34 − 0.24, 1.34 + 0.24) = (1.10, 1.58)
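Example A can be reproduced directly; a minimal sketch (t_{9, 0.95} = 1.83 is the tabulated value quoted on the slide):

```python
import math

data = [1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09]
n = len(data)
xbar = sum(data) / n                                   # X_bar(10), about 1.34
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)      # s^2(10), about 0.17
t = 1.83                                               # t_{9, 0.95} from a T table
half = t * math.sqrt(s2 / n)                           # half-width, about 0.24
ci = (xbar - half, xbar + half)
print(ci)
```

Running this gives an interval of roughly (1.11, 1.58); the slide's (1.10, 1.58) differs only because it rounds the mean to 1.34 before subtracting.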
Example B
• Question: I know the distribution. Is the mean µ = 5? ("Two-tailed hypothesis" H0)
• Answer:
  – Take a sample of 10, compute the test factor (testisuure) t9:
    t9 = ( X̄(10) − 5 ) / √(s²(10) / 10)
  – get the 90% confidence interval for the T-distribution with 9 degrees of freedom:
    (−t_{9, 0.95}, t_{9, 0.95}) = (−1.83, 1.83)
  – if the test factor t9 is not in there, reject H0, i.e., say "no"
    • 10% chance of being wrong! (why?)
  – if t9 is in there, accept H0, i.e., say "yes"

Example C
• Question: do X and Y have the same mean?
• Answer:
  – Take samples of X and Y, with sample averages X̄(n) and Ȳ(m)
  – get the test factor t_{n+m−2}:
    t_{n+m−2} = ( X̄(n) − Ȳ(m) ) / ( s √(1/n + 1/m) )
    where s² = ( Σ_i (Xi − X̄)² + Σ_i (Yi − Ȳ)² ) / (n + m − 2)
  – get the 90% confidence interval for the T-distribution
  – if t_{n+m−2} is not in there, reject H0, i.e., say "no"
    • 10% chance of being wrong
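Example C's pooled test factor can be sketched as follows (the samples are made up, and 1.75 approximates the tabulated t percentile for 15 degrees of freedom):

```python
import math

def pooled_t(xs, ys):
    """Test factor for H0: X and Y have the same mean (pooled variance)."""
    n, m = len(xs), len(ys)
    xb, yb = sum(xs) / n, sum(ys) / m
    ssx = sum((x - xb) ** 2 for x in xs)
    ssy = sum((y - yb) ** 2 for y in ys)
    s2 = (ssx + ssy) / (n + m - 2)          # pooled sample variance
    return (xb - yb) / (math.sqrt(s2) * math.sqrt(1.0 / n + 1.0 / m))

# made-up samples for illustration
xs = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.9]
ys = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.2, 5.1, 4.9]
t = pooled_t(xs, ys)
# 90% interval for T with n + m - 2 = 15 degrees of freedom: about (-1.75, 1.75)
accept_h0 = -1.75 < t < 1.75    # "yes": the data do not contradict equal means
print(t, accept_h0)
```

Here the two made-up samples overlap almost completely, so the test factor is small and H0 is accepted.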
Usual Assumptions That Usually Do Not Apply
• Normal distribution
  – central limit theorem
  – strong law of large numbers
• Independent samples
  – X1, X2, ...
• The assumptions that probability theory and statistics require in order to work do not usually apply
• Results may still be usable
• Watch out, and always check how far off you are from the required assumptions