Statistical Design of
Experiments
SECTION I
Probability Theory
Review
Dr. Gary Blau, Sean Han
Monday, Aug 13, 2007
PROBABILITY
• For any real system or phenomenon, there will be a certain amount of variability associated with data generated by the system.
• Probability is the language used to characterize and interpret the variability of this data.
PROBABILITY
• Probability is the most widely used formalism for quantifying uncertainty.
• There are two views of probability: the frequentist view and the subjective view.
PROBABILITY CONCEPTS
• Random experiment: an experiment in which the outcomes can be different even though the experiment is run under the same conditions.
• Sample space: the set of possible outcomes of a random experiment.
• Event: a specified subset of sample outcomes.
FREQUENTIST APPROACH
• Frequentist (Classical) View of Probability:
The probability of an event occurring in a particular trial is the frequency with which the event occurs in a long sequence of similar trials.
FREQUENTIST APPROACH
If a random experiment can result in any one of N equally likely outcomes, and exactly n of these outcomes correspond to event A, then:
P(A) = n/N
This definition is not very useful for real-world decision making (outside games of chance), since it is usually not possible to repeat an experiment many times.
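A minimal simulation sketch of the frequentist view (assuming Python and a hypothetical fair six-sided die, not part of the original slides): the relative frequency of an event approaches n/N as the number of trials grows.

    import random

    # Long-run relative frequency of "even" on a fair six-sided die.
    # The classical answer is n/N = 3/6 = 0.5.
    trials = 100_000
    hits = sum(1 for _ in range(trials) if random.randint(1, 6) % 2 == 0)
    print(hits / trials)  # close to 0.5 for large 'trials'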
BAYESIAN APPROACH
• Personalistic View of Probability, also known as Subjectivist or Bayesian Probability:
P(A | I)
The probability of an event A is the degree of belief that a person has that the event will occur, given all the relevant information, I, known to that person. In this case the probability is a function not only of the event but also of the current state of knowledge of the decision maker.
BAYESIAN APPROACH
Since different people have different information relative to an event, and may acquire new information at different rates as time progresses, in the Bayesian approach there is no such thing as “the” probability of an event.
AXIOMS OF PROBABILITY
(These axioms apply to both the frequentist and subjective views of probability.)
If an event A is defined on a sample space S, then:
(i) P(A) = the sum of the probabilities of all elements in A
(ii) if A = S, then P(A) = P(S) = 1
(iii) 0 ≤ P(A) ≤ 1
(iv) if Ac is the complement of A, then P(Ac) = 1 – P(A)
ADDITION RULES
(i) P(A∪B) = P(A) + P(B) – P(AB)
(ii) If A and B are mutually exclusive (i.e. P(AB) = 0), then P(A∪B) = P(A) + P(B)
(iii) If A1, A2, A3, …, An are mutually exclusive and A1∪A2∪A3∪…∪An = S, this is said to be an exhaustive collection.
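As an illustrative check of rule (i), here is a small Python sketch (the die events A and B are hypothetical, not from the slides):

    from fractions import Fraction

    # One roll of a fair die: A = "even", B = "greater than 3".
    S = {1, 2, 3, 4, 5, 6}
    A = {2, 4, 6}
    B = {4, 5, 6}
    P = lambda E: Fraction(len(E), len(S))     # equally likely outcomes
    assert P(A | B) == P(A) + P(B) - P(A & B)  # P(A∪B) = P(A) + P(B) - P(AB)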
CONDITIONAL PROBABILITY
The probability of an event A occurring when it is known that some other event B has already occurred is called a conditional probability, denoted P(A|B) and read “the probability that event A occurs given that B has occurred”. If the joint probability of A and B is known, the conditional probability may be calculated from the relationship:
P(A|B) = P(AB)/P(B)
MULTIPLICATION RULE
Since:
P(A|B) = P(AB)/P(B)
and:
P(B|A) = P(AB)/P(A)
we have:
P(AB) = P(A|B)P(B) = P(B|A)P(A)
[Figure: Venn diagram of events A and B with their intersection AB]
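A short Python sketch checking the conditional-probability definition and the multiplication rule on the same hypothetical die events used above:

    from fractions import Fraction

    S = {1, 2, 3, 4, 5, 6}
    A, B = {2, 4, 6}, {4, 5, 6}                  # hypothetical events
    P = lambda E: Fraction(len(E), len(S))
    P_A_given_B = P(A & B) / P(B)                # P(A|B) = P(AB)/P(B)
    assert P(A & B) == P_A_given_B * P(B)        # P(AB) = P(A|B)P(B)
    assert P(A & B) == (P(A & B) / P(A)) * P(A)  # P(AB) = P(B|A)P(A)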
TOTAL PROBABILITY RULE
If A1, A2, A3, …, An are an exhaustive collection of sets, then:
P(B) = ∑_{i=1}^{n} P(BAi)
Or equivalently:
P(B) = ∑_{i=1}^{n} P(B|Ai)P(Ai)
[Figure: Venn diagram of B overlapping the partition A1, A2, A3, A4, A5]
EXAMPLE FOR PROBABILITY
RULES
A manager is trying to determine the probability of successfully meeting a deadline for producing 1000 grams of a new active ingredient for clinical trials. He knows that the probability of success is conditional on the amount of support from management (manpower and facilities). Having been around for a while, he can also estimate the probability of getting different levels of support from his management. Calculate the probability of successfully meeting the deadline.
EXAMPLE (DATA)
• Let Ai be the event that management provides a given amount of support ($).
• Let B be the event of successfully meeting the deadline.
Subjective Probabilities:
Management Support   P(Ai)   P(B|Ai)
$1MM                 0.5     0.7
$2MM                 0.4     0.8
$3MM                 0.1     0.9
EXAMPLE (SOLUTION)
P(B) = ∑_{i=1}^{n} P(B|Ai)P(Ai) = (.5)(.7) + (.4)(.8) + (.1)(.9) = .76
There is a 76% chance that the manager will be able to make the material on time.
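The same calculation, as a small Python sketch (numbers taken from the data slide above):

    # Total probability rule: P(B) = sum_i P(B|Ai) P(Ai)
    P_A = [0.5, 0.4, 0.1]          # P(Ai) for $1MM, $2MM, $3MM of support
    P_B_given_A = [0.7, 0.8, 0.9]  # P(B|Ai): success given that support level
    P_B = sum(pb * pa for pb, pa in zip(P_B_given_A, P_A))
    print(P_B)  # 0.76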
INDEPENDENT EVENTS
(i) Two Events: If P(A|B) = P(A) and P(B|A) = P(B) for two events A and B (i.e. neither event is influenced by the outcome of the other being known), then A and B are said to be independent. Therefore:
P(AB) = P(A)P(B)
(ii) Multiple Events: A1, A2, …, An are independent events if and only if, for any subset Ai1, …, Aik of A1, A2, …, An:
P(Ai1 Ai2 … Aik) = P(Ai1)P(Ai2)…P(Aik)
ON-SPEC PRODUCT EXAMPLE
Based on historical data, the probability of an off-spec batch of material from a processing unit is .01. What is the probability of producing 10 successive batches of on-spec material?
ON-SPEC PRODUCT EXAMPLE
(SOLUTION)
Probability of on-spec product in a batch: P(Ai) = 1 − .01 = .99. Since the batches are independent, the probability of 10 successive on-spec batches is:
P(A1A2…A10) = P(A1)P(A2)…P(A10) = (.99)^10
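As a quick Python check of this product of independent probabilities:

    # Ten independent on-spec batches, each with probability 0.99
    p_on_spec = 1 - 0.01
    print(p_on_spec ** 10)  # approximately 0.904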
BAYES’ THEOREM
Suppose the probability of an event A, P(A), is known before an experiment is conducted; this is called the prior probability. The experiment is then conducted and we wish to determine the “new” or updated probability of A. Let B be some event conditioned on A; P(B|A) is called the likelihood function. Since
P(A|B) = P(AB)/P(B) and P(B|A) = P(AB)/P(A),
it follows (Bayes’ theorem) that:
P(A|B) = P(B|A)P(A)/P(B)
posterior ∝ likelihood × prior
DIAGNOSING A DISEASE
EXAMPLE
The analytical group in your development department has developed a new test for detecting a particular disease in humans. You wish to determine the probability that a person really has the disease if the test is positive.
A is the event that an individual has the disease.
B is the event that an individual tests positive.
DATA FOR EXAMPLE
(Prior Information)
Probability that an individual has the disease: P(A) = .01
Probability that an individual does not have the disease: P(Ac) = .99
(Likelihood)
Probability of a positive test result if the person has the disease: P(B|A) = .90
Probability of a positive test result even if the person does not have the disease (false positive): P(B|Ac) = .05
DETERMINE POSTERIOR
PROBABILITY
(Posterior, Calculated probability)
P(A|B) = P(B|A)P(A) / P(B)
= P(B|A)P(A) / (P(B|A)P(A)+P(B|Ac)P(Ac))
= (.90)(.01)/((.90)(.01) + (.05)(.99))
≈ .154
This is a rather amazing result: there is only about a 15% chance of having the disease when the test is positive, even though there is a 90% chance of testing positive if one has the disease!
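The same posterior calculation as a Python sketch, using the prior and likelihood values from the data slide:

    P_A = 0.01           # prior: individual has the disease
    P_B_given_A = 0.90   # likelihood: test positive given disease
    P_B_given_Ac = 0.05  # false-positive rate
    P_B = P_B_given_A * P_A + P_B_given_Ac * (1 - P_A)  # total probability rule
    print(P_B_given_A * P_A / P_B)                       # approximately 0.154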
RANDOM VARIABLES AND
PROBABILITY DISTRIBUTIONS
• A random variable X assigns numerical values to the outcomes of a random experiment.
• Each of these outcomes is considered an event and has an associated probability.
• The formalism for describing the probabilities of all the outcomes is a probability distribution.
DISCRETE PROBABILITY
DISTRIBUTIONS
When the number of outcomes from an experiment is countable, the random variable has a discrete probability distribution p(x).
[Figure: bar chart of a discrete distribution p(x) versus x]
CONTINUOUS PROBABILITY
DISTRIBUTIONS
When the number of outcomes from a random experiment is infinite (not countable for practical purposes), the random variable has a continuous probability distribution f(x), i.e. a probability density function.
P(l ≤ x ≤ u) = area under the curve from l to u.
[Figure: density curve f(x) versus x with the area between l and u shaded]
MOMENTS OF DISTRIBUTIONS
• Central Tendency (Mean)
μ = E(X) = ∑_{all xi} xi p(xi)   if X is discrete
μ = E(X) = ∫_{−∞}^{∞} x f(x) dx   if X is continuous
• Scatter (Variance)
σ² = ∑_{all xi} (xi − μ)² p(xi)   if X is discrete
σ² = ∫_{−∞}^{∞} (x − μ)² f(x) dx   if X is continuous
DISCRETE PROBABILITY
DISTRIBUTION EXAMPLE
The discrete probability distribution of X is given by:
x:    0     1     2
p(x): 0.25  0.5   0.25
Calculate its mean and variance.
EXAMPLE (SOLUTION)
μX = ∑ x p(x) = (0)(.25) + (1)(.5) + (2)(.25) = 1
σ² = ∑ (x − μX)² p(x) = (0 − 1)²(.25) + (1 − 1)²(.5) + (2 − 1)²(.25) = .5
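The same mean and variance computed with a few lines of Python (values from the example table):

    xs = [0, 1, 2]
    ps = [0.25, 0.5, 0.25]
    mean = sum(x * p for x, p in zip(xs, ps))               # E(X)
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))  # variance
    print(mean, var)  # 1.0 0.5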
CONTINUOUS PROBABILITY
DISTRIBUTION EXAMPLE
In a controlled lab experiment, the error in measuring the reaction temperature (in °C) is given by:
f(x) = x²/3 for −1 ≤ x ≤ 2, and f(x) = 0 elsewhere.
(1) Is f(x) a probability density function?
(2) What is the probability that the error is between 0 and 1?
EXAMPLE (SOLUTION)
(1) ∫_{−1}^{2} x²/3 dx = [x³/9] from −1 to 2 = 8/9 − (−1/9) = 1
So it is a probability density function.
(2) ∫_{0}^{1} x²/3 dx = [x³/9] from 0 to 1 = 1/9 − 0 = 1/9
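A numerical check of both integrals, as a Python sketch (assumes scipy is available):

    from scipy.integrate import quad

    f = lambda x: x ** 2 / 3
    total, _ = quad(f, -1, 2)   # should integrate to 1 over [-1, 2]
    p01, _ = quad(f, 0, 1)      # P(0 <= X <= 1)
    print(total, p01)           # approximately 1.0 and 0.111 (= 1/9)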
UNIFORM DISTRIBUTION
The probability density function of a continuous uniform distribution is:
f(x) = 1/(b − a),   a ≤ x ≤ b
Its cumulative distribution function is:
F(x) = ∫_{a}^{x} f(u) du = ∫_{a}^{x} 1/(b − a) du = (x − a)/(b − a)
UNIFORM DISTRIBUTION
 =
x

a
xfx
 u  du
1
a b - adu
b +a
=
2
x
=

2
=
=

x

x
a
a
x
(x -  )2 fx  u  du
(x -  )2
1
du
b -a
(b-a)2
=
12
Dr. Gary Blau, Sean Han
Monday, Aug 13, 2007
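A simulation sketch checking the uniform mean and variance formulas (assumes numpy; the endpoints a and b are arbitrary illustration values):

    import numpy as np

    a, b = 2.0, 10.0
    samples = np.random.uniform(a, b, size=1_000_000)
    print(samples.mean(), (a + b) / 2)       # both near 6.0
    print(samples.var(), (b - a) ** 2 / 12)  # both near 5.33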
NORMAL DISTRIBUTION
The Normal (Gaussian) distribution is the most frequently occurring distribution in the entire field of statistics. Gauss found that such a distribution is represented by the probability density function:
f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)),   −∞ < x < ∞
with:
E(X) = μ and V(X) = σ²
STANDARD NORMAL
DISTRIBUTION
If the parameters of the normal distribution are μ = 0 and σ² = 1, the resulting random variable is called the standard normal random variable Z, with probability density function:
f(z) = (1/√(2π)) exp(−z²/2),   −∞ < z < ∞
The approximate values of the cumulative distribution function are listed in the Z table:
Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} (1/√(2π)) exp(−u²/2) du
STANDARD NORMAL
DISTRIBUTION
The values of the cumulative standard normal distribution function for the standard normal random variable Z can be used to find the corresponding probabilities for a normal random variable X with E(X) = μ and V(X) = σ², using the following transformation to convert X to Z:
Z = (X − μ)/σ
NORMAL DISTRIBUTION
EXAMPLE
If X ~ N(50, 100) [read “the random variable X is distributed normally with mean 50 and variance 100”], find P(42 ≤ X ≤ 62).
P(42 ≤ X ≤ 62) = P((42 − 50)/10 ≤ Z ≤ (62 − 50)/10)
= Φ(1.2) − Φ(−0.8)
= .885 − .212
= .673
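The same probability computed with scipy, as a quick verification sketch (not part of the original slides):

    from scipy.stats import norm

    # X ~ N(50, 100): mean 50, standard deviation 10
    p = norm.cdf(62, loc=50, scale=10) - norm.cdf(42, loc=50, scale=10)
    print(p)  # approximately 0.673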
TRIANGULAR DISTRIBUTION
 2( x  a)
 (b  a)(c  a) ,

f x = 
 2(b  x) ,
 (b  a)(b  c)
a  x  c
c  x  b
abc

3
2
2
2
a

b

c
 ab  ac  bc
2
 
18
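A simulation sketch checking the triangular mean and variance formulas (assumes numpy; a, c, and b below are arbitrary minimum, mode, and maximum values):

    import numpy as np

    a, c, b = 0.0, 2.0, 10.0                 # minimum, mode, maximum
    samples = np.random.triangular(a, c, b, size=1_000_000)
    print(samples.mean(), (a + b + c) / 3)   # both near 4.0
    var_formula = (a**2 + b**2 + c**2 - a*b - a*c - b*c) / 18
    print(samples.var(), var_formula)        # both near 4.67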