Introducing probability
PSLS chapter 9
© 2009 W. H. Freeman and Company
Objectives (PSLS chapter 9)
Introducing probability

Randomness and probability

Probability models

Discrete vs continuous sample spaces

Probability rules

Random variables

Meaning of a probability

Risk and odds
Randomness and probability
A phenomenon is random if individual
outcomes are uncertain, but there is
nonetheless a regular distribution of
outcomes in a large number of repetitions.
The probability of any outcome of a random phenomenon can be
described as the proportion of times the outcome would occur in a very
long series of repetitions, i.e., long-run relative frequency.
Example: Coin toss
Probability of heads is 0.5
= proportion of times you get
heads in many repeated trials
[Figure: proportion of heads versus number of tosses, shown for a first series of tosses and a second series]
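The long-run relative frequency idea can be illustrated with a small simulation. This is only a sketch: the function name, the number of tosses, and the seeds are assumptions chosen for illustration, not part of the lecture.

```python
import random

def running_proportion_of_heads(n_tosses, p_heads=0.5, seed=None):
    """Simulate coin tosses and return the proportion of heads after each toss."""
    rng = random.Random(seed)  # seeded generator so a series can be reproduced
    heads = 0
    proportions = []
    for toss_number in range(1, n_tosses + 1):
        if rng.random() < p_heads:  # counts as heads with probability p_heads
            heads += 1
        proportions.append(heads / toss_number)
    return proportions

# Two independent series of tosses (the seeds are arbitrary assumptions).
first_series = running_proportion_of_heads(10_000, seed=1)
second_series = running_proportion_of_heads(10_000, seed=2)

# Early proportions can differ a lot; late proportions should both be near 0.5.
print("after 10 tosses:    ", first_series[9], second_series[9])
print("after 10,000 tosses:", first_series[-1], second_series[-1])
```

The two series typically start out quite differently but settle near the same value, which is what the long-run definition of probability describes.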
Probability models
Probability models mathematically describe the outcome of random
processes. They consist of three parts:
1) S = Sample Space: This is a set, or list, of all possible
outcomes of a random process.
2) Events: An event is a subset of the sample space.
3) A probability for each possible event in the sample space S.
Example: Probability Model for a Coin Toss
S = {Head, Tail}
Events: e.g., {Head}, e.g., {Tail}
Probability of heads = P{Head} = 0.5
Probability of tails = P{Tail} = 0.5
Sample space
Important: It's the question that determines the sample space.
A. A couple wants 3 children. What are the possible sequences of
boys (B) and girls (G)?
S = { BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG }
Note: 8 elements, 2^3 (each of the 3 births branches into B or G)
B. A couple wants 3 children. What is the number of girls
they could have?
S = { 0, 1, 2, 3 }
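As a rough sketch of how the question determines the sample space, the same three births can be enumerated in two ways. The variable names are assumptions made for this example.

```python
from itertools import product

# Question A: ordered sequences of boys (B) and girls (G) for 3 children.
sequences = ["".join(births) for births in product("BG", repeat=3)]

# Question B: the number of girls, derived from the same sequences.
numbers_of_girls = sorted({seq.count("G") for seq in sequences})

print(sequences)         # the 2**3 = 8 ordered outcomes
print(numbers_of_girls)  # the 4 possible counts of girls
```

Both sample spaces describe the same random process; only the level of detail recorded for each outcome changes.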
C. A researcher designs a new maze for lab rats. What are the possible
outcomes for the time to finish the maze (in minutes)?
S = (0, ∞) = (all numbers > 0)
Discrete vs. continuous sample spaces
Finite sample spaces deal with discrete variables that can take on
only certain values (e.g., whole numbers such as counts, or qualitative
categories).
Blood types
For a random person:
S = {O+, O-, A+, A-, B+, B-, AB+, AB-}
and the probability of each event
reflects, indeed is, the population
relative frequencies or proportions.
Probability = Population Proportion
Discrete variables contrast with continuous variables that can take
on any one of an infinite number of possible values over an interval.
Categorical variables are necessarily discrete.
E.g., the variable Color (red, green, …) has events such as {red}, {green}, …,
or {Color = red}, {Color = green}, …, or {C = red}, {C = green} …
Quantitative variables can be discrete or continuous.
Discrete quantitative variables have only a finite or countably infinite set of
values, usually counts, e.g., X = number of children in a family (0, 1, 2,
…), with events of interest such as {X = 0}, {X = 1}, {X > 0}, {0 < X < 10},
…
Continuous quantitative variables have an uncountably infinite set of possible
values and can take on any value in an interval, e.g., Y = age (years), W = weight (g).
Events of interest are always intervals, e.g., {Y < 17.5}, {17.5 < Y < 21.5},
{Y > 21.5}. Singleton events such as {Y = 17.5} and {Y = 21.5} are never
of interest, because measured values of continuous variables are always rounded off.
Continuous sample spaces contain an uncountably infinite number of
outcomes.
We use density curves to model continuous probability distributions.
Density curves come in all
imaginable shapes.
Some are well known
mathematically and others aren’t.
Events are defined over intervals of values.
Probabilities are computed as areas under the corresponding density
curve.
The total area under a density curve represents the whole population
(sample space) and equals 1 (100%).
Shaded area = proportion (%) of individuals in the population with
values of X between x1 and x2.
Shaded area = probability of drawing 1 individual at random
with value between x1 and x2 = P( x1 < X < x2 )
→ probability = relative frequency in population = population proportion
The probability of a single-point event, e.g., {Y = 1}, is zero in a
continuous sample space, so it carries no useful information.
Only intervals, e.g., {0 < Y < 0.5}, can have a non-zero probability,
represented by the area under the density curve for that interval.
For a uniform density curve of height 1:
P(0 ≤ y ≤ 0.5) = (0.5 − 0)*1 = 0.5
P(0 < y < 0.5) = (0.5 − 0)*1 = 0.5
P(0 ≤ y < 0.5) = (0.5 − 0)*1 = 0.5
The probability of a single-point event is zero:
P(y = 1) = (1 − 1)*1 = 0
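The area calculations above can be sketched in code. This assumes the uniform density sits on the interval from 0 to 1 (which a height of 1 implies); the function name and default arguments are illustrative assumptions.

```python
def uniform_probability(a, b, low=0.0, high=1.0):
    """P(a < Y < b) for Y uniform on [low, high], computed as width * height.

    Assumption: the density is flat with height 1 / (high - low) on [low, high]
    and zero elsewhere.
    """
    height = 1.0 / (high - low)
    left = max(a, low)    # clip the interval to where the density is non-zero
    right = min(b, high)
    width = max(right - left, 0.0)
    return width * height

p_interval = uniform_probability(0.0, 0.5)  # area of a rectangle 0.5 wide, 1 high
p_single_point = uniform_probability(1.0, 1.0)  # zero width, so zero area

print(p_interval, p_single_point)
```

Because a single point has zero width, including or excluding the endpoints of an interval does not change its probability.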
Probability rules
Probabilities range from 0
(no chance of the event) to
1 (the event has to happen).
For any event A, 0 ≤ P(A) ≤ 1
Probability of being type O+
P(O+) = 0.38
The probability of the complete
sample space must equal 1.
P(sample space) = 1
P(all blood types) = .38 + .07 + .34
+ .06 + .09 + .02 + .03 + .01 = 1
The probability of an event not
occurring is 1 minus the
probability that it does occur.
P(not A) = 1 – P(A)
P(not A+) = 1 – P(A+)
= 1 – 0.34 = 0.66
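A minimal sketch of these rules using the blood-type probabilities. The pairing of each probability with a blood type is an assumption based on the order in which the types and the numbers are listed in the slides; the helper name is hypothetical.

```python
import math

# Assumed pairing: types in the order O+, O-, A+, A-, B+, B-, AB+, AB-.
blood_types = {
    "O+": 0.38, "O-": 0.07, "A+": 0.34, "A-": 0.06,
    "B+": 0.09, "B-": 0.02, "AB+": 0.03, "AB-": 0.01,
}

# Rule 1: every probability lies between 0 and 1.
all_in_range = all(0.0 <= p <= 1.0 for p in blood_types.values())

# Rule 2: the probabilities over the whole sample space sum to 1
# (compared with a tolerance because of floating-point rounding).
total = sum(blood_types.values())
sums_to_one = math.isclose(total, 1.0)

# Rule 3: complement rule, P(not A) = 1 - P(A).
def probability_not(blood_type):
    return 1.0 - blood_types[blood_type]

p_not_a_positive = probability_not("A+")

print(all_in_range, sums_to_one, round(p_not_a_positive, 2))
```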
Two events are disjoint if they have no outcomes in common
and can never happen together.
[Figure: two Venn diagrams, one in which events A and B are disjoint and one in which events A and B are NOT disjoint]
People may have black, brown, red, or blond hair. But a single
individual can (naturally) only have one hair color. The events {black},
{brown}, {red}, and {blond} are all disjoint.
A person can have both brown hair and blue eyes, or brown hair and
brown eyes. Hair and eye colors are NOT disjoint.
A and B disjoint
When two events A and B are disjoint, the
probability that A OR B occurs is the sum of their
individual probabilities.
P(A or B) = P(A ∪ B) = P(A) + P(B)
This is the addition rule for disjoint events.
The probability that a random person is “blood group O” is
P(O) = P(O+ or O-) = P(O+) + P(O-) = .38 + .07 = .45
The probability that a random person is “rhesus neg” is:
P(O- or A- or B- or AB-) = .07 + .06 + .02 + .01 = 0.16
General addition rule for any two
events A and B:
The probability that A occurs,
or B occurs, or both events occur is:
P(A or B) = P(A) + P(B) – P(A and B)
The probability that a random person is either
“blood group O” or “rhesus neg” is
P(O or -) = P(O) + P(neg) - P(“O-”)
= .45 + .16 - .07 = .54
(blood group and rhesus are NOT disjoint)
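The general addition rule can be checked against a direct sum over the outcomes involved. This is a sketch only; the pairing of probabilities with blood types is again assumed from the order of the listings, and the variable names are illustrative.

```python
import math

# Assumed pairing: types in the order O+, O-, A+, A-, B+, B-, AB+, AB-.
blood_types = {
    "O+": 0.38, "O-": 0.07, "A+": 0.34, "A-": 0.06,
    "B+": 0.09, "B-": 0.02, "AB+": 0.03, "AB-": 0.01,
}

# Each event is a set of disjoint outcomes, so its probability is a plain sum.
p_group_o = sum(p for t, p in blood_types.items() if t.startswith("O"))
p_rhesus_neg = sum(p for t, p in blood_types.items() if t.endswith("-"))
p_both = blood_types["O-"]  # the only outcome in both events

# General addition rule: subtract the overlap so it is not counted twice.
p_either_by_rule = p_group_o + p_rhesus_neg - p_both

# Direct check: add up every outcome that is group O, rhesus negative, or both.
p_either_direct = sum(
    p for t, p in blood_types.items() if t.startswith("O") or t.endswith("-")
)

print(round(p_group_o, 2), round(p_rhesus_neg, 2), round(p_either_by_rule, 2))
```

Both routes give the same value because the rule only removes the double-counted overlap, here the O- group.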
A couple wants 3 children.

What are the arrangements (ordered sequences) of boys (B) and girls (G)?
Genetics tells us that the probability that a baby is a boy or a girl is the same, 0.5.
→ Sample space: {BBB, BBG, BGB, GBB, GGB, GBG, BGG, GGG}
→ All eight outcomes in the sample space are equally likely.
→ The probability of each is thus 1/8.

What are the numbers (X) of girls they could have?
The same genetic laws apply. We can use the probabilities above to calculate the
probability for each possible number of girls.
→ Sample space {0, 1, 2, 3}
→ P(X = 0) = P(BBB) = 1/8
→ P(X = 1) = P(BBG or BGB or GBB) = P(BBG) + P(BGB) + P(GBB) = 3/8
→ P(X = 2) = P(BGG or GBG or GGB) = P(BGG) + P(GBG) + P(GGB) = 3/8
→ P(X = 3) = P(GGG) = 1/8
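The probability for each number of girls can be built up from the eight equally likely sequences. The following is a small sketch with illustrative variable names; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# The eight equally likely ordered outcomes, each with probability 1/8.
outcomes = ["".join(births) for births in product("BG", repeat=3)]
p_each = Fraction(1, len(outcomes))

# Addition rule for disjoint events: add 1/8 for every sequence
# that contains a given number of girls.
distribution = {}
for outcome in outcomes:
    x = outcome.count("G")
    distribution[x] = distribution.get(x, Fraction(0)) + p_each

for x in sorted(distribution):
    print(f"P(X = {x}) = {distribution[x]}")
```

The four probabilities add up to 1, as they must for a complete sample space.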
We generate two random numbers between 0 and 1 and take Y to be their sum.
Y can take any value between 0 and 2. The density curve for Y is:
Height = 1 because the base = 2, and
the area under the curve has to equal 1
by definition.
Area of a triangle = ½ (base*height)
[Figure: triangular density curve for Y, with the horizontal axis marked at 0, 1, and 2]
Probability that Y is greater than 1? P(Y > 1) = 0.5
Probability that Y is less than 0.5? P(Y < 0.5) = 0.125
Probability that Y is either less than 0.5 or greater than 1?
P(Y < 0.5 or Y > 1) = 0.125 + 0.5 = 0.625
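These areas can be reproduced from the triangle geometry and checked by simulation. This is a sketch under the assumption that the two random numbers are independent and uniform on 0 to 1; the function names, sample size, and seed are illustrative.

```python
import random

def triangular_cdf(y):
    """P(Y <= y) for Y = sum of two independent uniform(0, 1) numbers."""
    if y <= 0:
        return 0.0
    if y <= 1:
        return 0.5 * y * y               # left triangle: base y, height y
    if y < 2:
        return 1.0 - 0.5 * (2 - y) ** 2  # 1 minus the remaining right triangle
    return 1.0

p_greater_than_1 = 1.0 - triangular_cdf(1.0)
p_less_than_half = triangular_cdf(0.5)
# The two events are disjoint, so their probabilities add.
p_either = p_less_than_half + p_greater_than_1

def simulate(n=100_000, seed=0):
    """Estimate the same two probabilities by generating n sums."""
    rng = random.Random(seed)
    sums = [rng.random() + rng.random() for _ in range(n)]
    return (
        sum(y > 1 for y in sums) / n,
        sum(y < 0.5 for y in sums) / n,
    )

estimate_gt_1, estimate_lt_half = simulate()
print(p_greater_than_1, p_less_than_half, p_either)
print(estimate_gt_1, estimate_lt_half)
```

The simulated proportions should land close to the exact areas, which ties the density-curve picture back to long-run relative frequency.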
Meaning of a probability

Theoretical probability
→ From understanding the phenomenon and symmetries in the
problem


Six-sided fair die: Each side has the same chance of turning up;
therefore, each has a probability 1/6.

Genetic laws of inheritance, based on the process of meiosis.
Empirical probability
→ From our knowledge of numerous similar past events

Mendel discovered the probabilities of inheritance of a given trait from
experiments on peas, without knowing about genes or DNA.

Predicting the weather: A 30% chance of rain today means that it rained
on 30% of all days with similar atmospheric conditions.

Personal (subjective) probability
→ From subjective considerations, typically about unique events

Probability of a large meteorite hitting the Earth. Probability of life on
Mars. These do not make sense in terms of frequency.
A personal probability represents an individual’s personal degree of
belief based on prior knowledge.

We may say “there is a 40% chance of life on Mars.” In fact, either there
is or there isn’t life on Mars. The 40% probability is our degree of belief,
how confident we are about the presence of life on Mars based on what
we know about life requirements, pictures of Mars, and probes we sent.
Personal probabilities may be based on personal experiences. For
instance, a long-time resident of a town may state that the probability of
snow is 20% based on his or her long-term observations.
Risk and odds
In the health sciences, probability concepts are often expressed in terms
of risk and odds.

The risk of an undesirable outcome of a random phenomenon is the
probability of that undesirable outcome.
risk(event A) = P(event A)

The odds of any outcome of a random phenomenon is the ratio of
the probability of that outcome to the probability of that outcome
not occurring.
odds(A) = P(event A) / [1 − P(event A)]
Sickle-cell anemia is a serious, inherited blood disease affecting the shape of
red blood cells. Individuals carrying only one copy of the defective gene (“sickle-cell
trait”) are generally healthy but may pass on the gene to their offspring.
If a couple learns from prenatal tests that they both carry the sickle-cell trait,
genetic laws of inheritance tell us that there is a 25% chance that they could
conceive a child who will suffer from sickle-cell anemia. What are the
corresponding risk and odds?
The risk of conceiving a child who will suffer from sickle-cell anemia is the
probability, so
risk of sickle-cell anemia = 0.25.
The odds is the ratio of two probabilities, so
odds of sickle-cell anemia = 0.25/(1 − 0.25) ≈ 0.333,
which can also be written as odds of 1 to 3 (1:3).
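The risk and odds calculation can be written as a short sketch. The function names are assumptions for illustration, and the reverse conversion from odds back to probability is added here as a convenience, not taken from the lecture.

```python
from fractions import Fraction

def odds_from_probability(p):
    """odds(A) = P(A) / (1 - P(A)); undefined when P(A) = 1."""
    if not 0 <= p < 1:
        raise ValueError("probability must satisfy 0 <= p < 1")
    return p / (1 - p)

def probability_from_odds(odds):
    """Inverse conversion (added for illustration): P(A) = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)

risk = 0.25                                       # the risk is the probability itself
odds = odds_from_probability(risk)                # about 0.333
exact_odds = odds_from_probability(Fraction(1, 4))  # exactly 1/3, i.e. 1 to 3

print(risk, round(odds, 3), exact_odds)
```

Using exact fractions makes the "1 to 3" reading visible directly: the numerator and denominator of the odds are the two sides of the ratio.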