Probability and Random Variables (Rees: §5.1 - §5.4)
Earlier in this course, we looked at methods of describing the data in a sample. Next we
would like to have models for the ways in which data can arise. Before we can develop these
models, we need some probability theory.
There are many situations (experiments) in which we do not know in advance what
will happen.
Examples
• A toss of a fair coin.
• Genetics.
• The weather in Edinburgh tomorrow.
• The winner of a particular horse race.
The result of an experiment is called an outcome. An event is a collection of the
possible outcomes from an experiment.
Examples
A = “Get a head from a toss of a fair coin”.
B = “Get cystic fibrosis if parents are carriers”.
C = “It will rain or snow in Edinburgh tomorrow”.
D = “Fat Chance wins the 3.00 at Kempton”.
Probability uses a numeric scale to describe how likely these events are.
Let E be an event – something that can happen! We will assign a probability Pr(E) to
that event such that 0 ≤ Pr(E) ≤ 1.
• Pr(E) = 0 means “(almost) never happens”.
• Pr(E) = 1 means “(almost) always happens”.
For event A, there are 2 possible outcomes and both are equally likely, so that Pr(A) = 0.5.
For event B, there are 4 equally likely outcomes and only one of these leads to the disease,
so that Pr(B) = 0.25.
For events C and D, or if the coin is not fair, the meaning of probability is less clear. One
approach is to consider it to be the proportion of successes in a very large number of trials.
Thus for event C, one could consider weather records from previous years for tomorrow’s
date.
For event D, this approach is impractical, but we can still do the corresponding ‘thought
experiment’. This approach works well in practice.
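The long-run proportion idea can be illustrated with a short simulation. This is only a sketch, assuming Python and its standard random module (neither appears in the original notes); the helper name proportion_of_heads and the seed value are arbitrary choices.

```python
import random

def proportion_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses

# The proportion should settle near 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, proportion_of_heads(n))
```

For a coin that is not fair, the same loop with a different threshold in place of 0.5 would give the corresponding long-run proportion.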
If there are n outcomes which are equally likely, then each outcome has probability 1/n.
If the event E occurs for r of these outcomes, then
Pr(E) = r/n = (number of favourable outcomes) / (number of possible outcomes)
Probability satisfies various rules:
• 0 ≤ Pr(A) ≤ 1
• Pr(A) + Pr(Not A) = 1
(Either A occurs or it does not)
Betting odds compare Pr(E) with Pr(Not E).
Example: Throw a fair die.
Probability of getting a 6 is 1/6.
Odds are 5-1 against getting 6.
Odds are 5-1 on getting ‘Not 6’.
Most people find these easier to understand as they avoid small decimal numbers, but
they are difficult to combine.
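The link between a probability and betting odds can be sketched in a few lines. This assumes Python with exact fractions; the function names odds_against and probability_from_odds_against are made up for this illustration and are not from Rees or the notes.

```python
from fractions import Fraction

def odds_against(p):
    """Return (a, b), meaning odds of a-b against an event of probability p.

    p must be a Fraction strictly between 0 and 1.
    """
    ratio = (1 - p) / p  # Pr(Not E) compared with Pr(E)
    return ratio.numerator, ratio.denominator

def probability_from_odds_against(a, b):
    """Invert the conversion: odds of a-b against give Pr(E) = b / (a + b)."""
    return Fraction(b, a + b)

print(odds_against(Fraction(1, 6)))         # (5, 1)
print(probability_from_odds_against(5, 1))  # 1/6
```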
If E and F are events, then:
(E And F) are all the outcomes that are in both E and F
(E Or F) are all the outcomes that are in E or in F or in both.
These events can be shown on a Venn diagram.
[Venn diagram: two overlapping regions labelled E and F inside an outer rectangle]
The outer rectangle represents all the possible outcomes. The regions labelled E and F
represent the outcomes in those events. The overlap region of E and F represents the event
“E And F”. The larger outline that contains both letters represents the event “E Or F”.
In the Venn diagram, it is possible to draw all of the events so that the area corresponds
to the probability of an event. This implies that the area enclosed by the outer rectangle is
one unit.
It is clear from the diagram that:
Pr(E OR F) = Pr(E) + Pr(F) − Pr(E AND F)
The subtraction of the ‘AND’ term is because adding the two probabilities means the overlap
region has been counted twice.
This is called the “ADDITION LAW” or the “OR LAW” (see Rees §5.10)
If E and F cannot both occur together, they are said to be disjoint or mutually
exclusive events. On a Venn diagram, this corresponds to there being no overlap of their
shapes. Then, the addition law simplifies to:
Pr(E OR F ) = Pr(E) + Pr(F )
Example: Athletes data.
Distance    USA    GB   Kenya   Total
Sprinter      7     5       0      12
400m          2     2       0       4
Middle        2     4       7      13
Long          1     1       5       7
Total        12    12      12      36
If an athlete is picked at random:
Pr(400m) = 4/36 = 1/9
Pr(GB) = 12/36 = 1/3
Pr(USA or GB) = Pr(USA) + Pr(GB) because mutually exclusive
= 12/36 + 12/36
= 2/3
Pr(USA or Sprinter)
= Pr(USA) + Pr(Sprinter) − Pr(USA and Sprinter)
= 12/36 + 12/36 − 7/36
= 17/36
Note that the events “Sprinter” and “Kenyan” are mutually exclusive, because there are
no athletes who belong to both events.
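The athletes calculations can be checked by counting directly. The following is a sketch in Python (an assumption, since the notes use no programming language); the dictionary layout and the helper name pr are illustrative only, while the counts are those of the athletes table.

```python
from fractions import Fraction

# Counts from the athletes table: counts[distance][country]
counts = {
    "Sprinter": {"USA": 7, "GB": 5, "Kenya": 0},
    "400m":     {"USA": 2, "GB": 2, "Kenya": 0},
    "Middle":   {"USA": 2, "GB": 4, "Kenya": 7},
    "Long":     {"USA": 1, "GB": 1, "Kenya": 5},
}
total = sum(sum(row.values()) for row in counts.values())  # 36 athletes

def pr(predicate):
    """Probability that a randomly picked athlete satisfies predicate(distance, country)."""
    favourable = sum(n for d, row in counts.items()
                     for c, n in row.items() if predicate(d, c))
    return Fraction(favourable, total)

pr_usa = pr(lambda d, c: c == "USA")
pr_sprinter = pr(lambda d, c: d == "Sprinter")
pr_both = pr(lambda d, c: c == "USA" and d == "Sprinter")

# Addition law compared with a direct count of "USA or Sprinter"
by_law = pr_usa + pr_sprinter - pr_both
direct = pr(lambda d, c: c == "USA" or d == "Sprinter")
print(by_law, direct)  # 17/36 17/36
```

The direct count and the addition law agree because the 7 USA sprinters are subtracted once after being counted in both Pr(USA) and Pr(Sprinter).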
In the previous example, each of the individuals was equally likely to be selected. Then:
Probability = (Number of favourable outcomes) / (Total number of outcomes)
Example: Gregor Mendel carried out breeding experiments on peas. Each pea was
classified as (Round or Wrinkled) and (Yellow or Green).
The results from one of these experiments were summarised as:
            Round   Wrinkled   Total
Yellow        315        101     416
Green         108         32     140
Total         423        133     556
For peas from this experiment:
Pr(Green) = 140/556 = 0.252
Pr(Wrinkled) = 133/556 = 0.239
Mendel noticed that both of these values are close to 1/4 = 0.25.
Pr(Green AND Wrinkled) = 32/556 = 0.058
This is close to 1/16 = 0.0625.
This suggested a theoretical model for his results.
            Yellow   Green
Round        9/16    3/16
Wrinkled     3/16    1/16
We now know that the Wrinkled and Green traits are controlled by recessive genes and
these are on different chromosomes. Genetic theory would therefore predict these probabilities.
This justification of the theoretical model was not available in Mendel’s time.
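To see how close the observed counts are to the theoretical model, the expected counts under the model can be listed next to the data. This is a sketch in Python (not part of the original notes); the variable names are illustrative, and no formal test of agreement is attempted here.

```python
from fractions import Fraction

# Observed counts from the experiment: (shape, colour) -> number of peas
observed = {
    ("Round", "Yellow"): 315,
    ("Wrinkled", "Yellow"): 101,
    ("Round", "Green"): 108,
    ("Wrinkled", "Green"): 32,
}
# Probabilities from the theoretical model in the notes
model = {
    ("Round", "Yellow"): Fraction(9, 16),
    ("Wrinkled", "Yellow"): Fraction(3, 16),
    ("Round", "Green"): Fraction(3, 16),
    ("Wrinkled", "Green"): Fraction(1, 16),
}
n = sum(observed.values())  # 556 peas in total

# Number of peas the model would predict in each category
expected = {kind: n * p for kind, p in model.items()}

for kind, count in observed.items():
    print(kind, "observed", count, "expected", float(expected[kind]),
          "proportion", round(count / n, 3), "model", float(model[kind]))
```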
Problem: How should we decide if a data set is consistent with a possible theoretical
model? This will be considered later in the course.
Example: Roulette
There are 37 numbers on the wheel: 0 – 36.
18 are red (9 odd and 9 even)
18 are black (9 odd and 9 even)
1 is green (zero)
We can display this in a table:
          Odd   Even   Zero   Total
Red         9      9      0      18
Black       9      9      0      18
Green       0      0      1       1
Total      18     18      1      37
Consider a spin of the wheel and assume it is fair, so that all numbers are equally likely.
Let A = ‘Red’ and B = ‘Odd’.
Pr(Red) = Pr(A) = (Number of reds) / (Number of outcomes) = 18/37
Pr(Odd) = Pr(B) = (Number of odds) / (Number of outcomes) = 18/37
Pr(Red AND Odd) = (Number Red AND Odd) / (Number of outcomes) = 9/37
For ‘OR’ questions, it is usually best to use the Addition Law:
Pr(Red OR Odd) = Pr(Red) + Pr(Odd) − Pr(Red AND Odd)
= 18/37 + 18/37 − 9/37
= 27/37
Complementary Events
If A is an event, then the event “Not A” is said to be the complementary event. We have:
Pr(A) + Pr(Not A) = 1
The use of a complementary event can make the calculation of a probability easier.
Example: In the Mendel pea experiments, ‘Round’ and ‘Wrinkled’ were complementary
events.
Example: Two dice are thrown. It helps to suppose that one die is Red while the other
is Blue. Possible outcomes, written as (Red, Blue), are:
                          Blue
           1      2      3      4      5      6
Red  1   (1,1)  (1,2)  (1,3)  (1,4)  (1,5)  (1,6)
     2   (2,1)  (2,2)  (2,3)  (2,4)  (2,5)  (2,6)
     3   (3,1)  (3,2)  (3,3)  (3,4)  (3,5)  (3,6)
     4   (4,1)  (4,2)  (4,3)  (4,4)  (4,5)  (4,6)
     5   (5,1)  (5,2)  (5,3)  (5,4)  (5,5)  (5,6)
     6   (6,1)  (6,2)  (6,3)  (6,4)  (6,5)  (6,6)
There are 36 possible outcomes – all equally likely.
Pr(Total = 3) = 2/36
Pr(Total = 7) = 6/36
Pr(Both odd) = 9/36
Pr(Red = Blue) = 6/36
Pr(Red > Blue) = 15/36
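The two-dice probabilities can be confirmed by listing all 36 outcomes and counting. This is a minimal sketch in Python (an assumption, not part of the notes); the helper name pr is illustrative. Note that Python prints fractions in lowest terms, so 2/36 appears as 1/18.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes, written as (red, blue)
outcomes = list(product(range(1, 7), repeat=2))

def pr(event):
    """Pr(E) = favourable outcomes / possible outcomes for the two dice."""
    favourable = sum(1 for red, blue in outcomes if event(red, blue))
    return Fraction(favourable, len(outcomes))

print(pr(lambda r, b: r + b == 3))                    # 1/18 (= 2/36)
print(pr(lambda r, b: r + b == 7))                    # 1/6  (= 6/36)
print(pr(lambda r, b: r % 2 == 1 and b % 2 == 1))     # 1/4  (= 9/36)
print(pr(lambda r, b: r == b))                        # 1/6  (= 6/36)
print(pr(lambda r, b: r > b))                         # 5/12 (= 15/36)
```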
Example: For 2 fair dice let:
E be the event ‘Sum is 4’
F be the event ‘Dice show same even number’.
Find the probability that at least one of these events occurs.
P(E) = P(1, 3) + P(2, 2) + P(3, 1) = 3/36
P(F) = P(2, 2) + P(4, 4) + P(6, 6) = 3/36
These are not mutually exclusive because
P(E And F) = P(2, 2) = 1/36
Using the addition rule of probability:
P(E Or F) = P(E) + P(F) − P(E And F)
= 3/36 + 3/36 − 1/36
= 5/36
Example: For 2 fair dice (Red and Blue) let E be the event ‘At least one 6’.
P(E) = P(Red 6) + P(Blue 6) − P(Both 6)
= 1/6 + 1/6 − 1/36
= 11/36
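The same answer follows from the complementary event ‘No 6 at all’, which is often the easier route. The sketch below assumes Python and counts outcomes: with n dice there are 6^n equally likely outcomes, of which 5^n contain no 6. The function name at_least_one_six is illustrative.

```python
from fractions import Fraction

def at_least_one_six(n_dice):
    """Pr(at least one 6) = 1 - Pr(no 6), counting 5**n of the 6**n outcomes."""
    pr_no_six = Fraction(5 ** n_dice, 6 ** n_dice)
    return 1 - pr_no_six

p_addition = Fraction(1, 6) + Fraction(1, 6) - Fraction(1, 36)  # addition law
p_complement = at_least_one_six(2)                              # complement
print(p_addition, p_complement)  # 11/36 11/36
```

The complement version extends to more than two dice without any extra terms, whereas the addition law needs further corrections for each overlap.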
Conditional Probability
Knowledge about the occurrence of one event may affect the probability of another event.
Example: Using straws to draw lots.
Five students use 4 long straws and one short straw to decide who will buy the next round
of drinks. Before the draw, each student is equally likely to draw the short straw. However,
if you are last to draw and the other four were long, it must be you.
Example: A standard pack of 52 cards has four suits (clubs, diamonds, hearts and spades)
with 13 cards in each. Draw 2 cards at random without replacement.
P(First card is Club) = 13/52 = 0.250
P(Second card is Club) = 13/52 = 0.250
But, if we know that the first card is a Club, then:
P(Second card is Club) = 12/51 = 0.235
If we know that the first card is not a Club, then:
P(Second card is Club) = 13/51 = 0.255
Note: It is information that leads to a change in the probability, not the drawing process.
Example: Throw 2 fair dice, one Red and one Blue. The table gives the probabilities of
having values for the Red die if we know the sum of the 2 dice.
              Value on Red die
Sum      1     2     3     4     5     6
 2       1     0     0     0     0     0
 3      1/2   1/2    0     0     0     0
 4      1/3   1/3   1/3    0     0     0
 5      1/4   1/4   1/4   1/4    0     0
 6      1/5   1/5   1/5   1/5   1/5    0
 7      1/6   1/6   1/6   1/6   1/6   1/6
 8       0    1/5   1/5   1/5   1/5   1/5
 9       0     0    1/4   1/4   1/4   1/4
10       0     0     0    1/3   1/3   1/3
11       0     0     0     0    1/2   1/2
12       0     0     0     0     0     1
The various probabilities in the previous examples are Conditional Probabilities. We
write:
P(Second card is Club | First is club) = 0.235
P(Red die shows 3 | Sum is 5) = 1/4 = 0.25
In speech, the vertical bar is read as “given that”.
To calculate a Conditional Probability
• Find all of the outcomes consistent with both events
• Find all of the outcomes consistent with the condition
Then the Conditional Probability is the ratio of the corresponding probabilities.
So P(E | C) = P(E And C) / P(C)
Or P(E And C) = P(E | C) P(C)
Example: 2 fair dice, Red and Blue.
Calculate P(Red die shows 1 Or 2 | Sum is 6)
• 2 outcomes are consistent with both events:
these are (1, 5) and (2, 4)
So probability = 2/36
• 5 outcomes are consistent with a sum of 6:
these are (1, 5) (2, 4) (3, 3) (4, 2) (5, 1)
So probability = 5/36
So P(Red 1 Or Red 2 | Sum is 6) = (2/36) / (5/36) = 2/5
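The two-step recipe for a conditional probability translates directly into code. The sketch below assumes Python; the helper names pr and conditional are illustrative, and the condition is assumed to have non-zero probability.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes, written as (red, blue)
outcomes = list(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event, by counting equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def conditional(event, condition):
    """P(E | C) = P(E And C) / P(C); assumes P(C) > 0."""
    p_both = pr(lambda o: event(o) and condition(o))
    return p_both / pr(condition)

# P(Red die shows 1 Or 2 | Sum is 6)
p_red_1_or_2 = conditional(lambda o: o[0] in (1, 2), lambda o: sum(o) == 6)
# P(Red die shows 3 | Sum is 5)
p_red_3 = conditional(lambda o: o[0] == 3, lambda o: sum(o) == 5)
print(p_red_1_or_2, p_red_3)  # 2/5 1/4
```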
Example: Draw cards from standard pack without replacement.
P(First 2 cards are clubs)
= P(first club) P(second club | first club)
= 13/52 × 12/51
= 1/17
P(First 3 cards are clubs)
= P(first 2 clubs) P(3rd club | first 2 clubs)
= 13/52 × 12/51 × 11/50
= 0.0129
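The chain of conditional probabilities can be written as a loop in which each draw removes one club and one card from the pack. This is a sketch in Python under the assumptions stated in the example (13 clubs in a 52-card pack, no replacement); the function name all_clubs is illustrative.

```python
from fractions import Fraction

def all_clubs(k, clubs=13, deck=52):
    """Probability that the first k cards drawn without replacement are all clubs."""
    p = Fraction(1)
    for i in range(k):
        # After i clubs have gone, clubs - i clubs remain among deck - i cards
        p *= Fraction(clubs - i, deck - i)
    return p

print(all_clubs(2))                   # 1/17
print(round(float(all_clubs(3)), 4))  # 0.0129
```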
Example: Win jackpot on National Lottery.
Draw 6 numbers out of 49 without replacement.
P(Win) = 6/49 × 5/48 × 4/47 × 3/46 × 2/45 × 1/44
= 7.2 × 10^−8
Odds against winning = 13983815 : 1
With 2 draws a week and one ticket per draw, you will win on average about once every 134 thousand years!
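The lottery product can be evaluated exactly. The sketch below assumes Python and, as a cross-check, uses the standard library function math.comb to count the ways of choosing 6 numbers from 49; that counting argument is not developed at this point in the notes.

```python
from fractions import Fraction
from math import comb

p_win = Fraction(1)
for i in range(6):
    # With i numbers already matched, 6 - i winning numbers remain among 49 - i balls
    p_win *= Fraction(6 - i, 49 - i)

print(p_win)         # 1/13983816
print(float(p_win))  # roughly 7.2e-08
print(comb(49, 6))   # 13983816 ways to choose 6 numbers from 49
```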
Example: Birthdays can be considered as being drawn from the 365 possibilities with approximately equal probabilities.
P(2 out of 2 different) = 364/365 = 0.9973
P(3 out of 3 different) = 364/365 × 363/365 = 0.9918
P(4 out of 4 different) = 364/365 × 363/365 × 362/365 = 0.9836
P(23 out of 23 different) = 364/365 × 363/365 × · · · × 343/365 = 0.4927 < 0.5
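The birthday products are tedious by hand but short in code. This is a sketch in Python, assuming 365 equally likely birthdays as in the example; the function name all_different is illustrative.

```python
def all_different(k, days=365):
    """Probability that k people all have different birthdays."""
    p = 1.0
    for i in range(k):
        # Person i + 1 must avoid the i birthdays already taken
        p *= (days - i) / days
    return p

for k in (2, 3, 4, 23):
    print(k, round(all_different(k), 4))
```

With 23 people the probability that all birthdays differ drops below one half, so a shared birthday is more likely than not.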
In the last two examples, the drawing was done without replacement. The probabilities of
interest were for either all matching or none matching respectively. If the required probability
is for a partial match, the calculations are rather more complicated.
Example: Ten bats out of a colony of 50 bats have been ringed. Five bats are caught at
random from this colony.
Assume that each bat is equally likely to be caught. Then the probability that the first
four bats caught are unmarked and the last one caught has a ring:
40/50 × 39/49 × 38/48 × 37/47 × 10/46 = 0.08627
In a similar way, the probability could be calculated that only the first bat of five already
had a ring. You should check that the probability comes out to be the same!
This is true for any other position in the sequence, so the probability that only one bat
from five trapped at this colony already has a ring is:
5 × 0.08627 = 0.431
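The bat calculation can be checked exactly, including the claim that the position of the ringed bat does not matter. This sketch assumes Python; the comparison with a count of combinations via math.comb is an added cross-check that anticipates the ‘number of ways of choosing’ idea, and is not taken from the notes.

```python
from fractions import Fraction
from math import comb

# Ringed bat caught last: four unringed bats, then one ringed bat
ringed_last = (Fraction(40, 50) * Fraction(39, 49) * Fraction(38, 48)
               * Fraction(37, 47) * Fraction(10, 46))
# Ringed bat caught first: one ringed bat, then four unringed bats
ringed_first = (Fraction(10, 50) * Fraction(40, 49) * Fraction(39, 48)
                * Fraction(38, 47) * Fraction(37, 46))

# The ringed bat can be in any of the 5 positions
exactly_one = 5 * ringed_last

# Cross-check: choose 1 of the 10 ringed and 4 of the 40 unringed bats
by_counting = Fraction(comb(10, 1) * comb(40, 4), comb(50, 5))

print(round(float(ringed_last), 5))  # 0.08627
print(round(float(exactly_one), 3))  # 0.431
```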
Example: Four playing cards, two red and two black. Two cards are chosen at random.
What is the probability that they are the same colour?
Answer: Suppose that one card has been chosen. There are 3 possibilities for the second
card. Only one of these is the same colour as the first card, so the probability is 1/3.
This can also be considered as the number of ways of choosing 2 items out of 4 – see
later.
Example: 3 cards. One card is blank on both sides, one has X on both sides and the
third has a side of each type. A card is selected at random and one side shown. If this is
blank, what is the probability that the other side is blank?
Answer: There are 3 sides that are blank out of the 6 possible.
So P(Blank side chosen) = 3/6 = 1/2.
However, of these 3 sides, 2 have blank sides on the reverse.
So P(Blank on reverse | Blank side chosen) = 2/3.
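The three-card answer is easy to doubt, so it is worth listing the six equally likely (shown side, hidden side) pairs. This is a sketch in Python; the variable names are illustrative.

```python
from fractions import Fraction

# Each card is a pair of sides
cards = [("blank", "blank"), ("X", "X"), ("blank", "X")]

# Six equally likely views: either side of any card may be shown
views = [(card[i], card[1 - i]) for card in cards for i in (0, 1)]

# Keep only the views consistent with the condition "a blank side is shown"
blank_shown = [(shown, hidden) for shown, hidden in views if shown == "blank"]
blank_reverse = sum(1 for shown, hidden in blank_shown if hidden == "blank")

p = Fraction(blank_reverse, len(blank_shown))
print(len(blank_shown), p)  # 3 2/3
```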
Example: TV game show (Monty Hall).
There are 3 doors with a good prize behind only one door. The game show host knows
which door conceals the prize. The show winner selects one of the 3 doors. The host opens
one of the other 2 doors and shows that it does not hide a prize.
Should the winner:
• Stay with the selected door.
• Switch to the unopened door.
• Make random choice of 2 doors.
Answer: Probability of the good prize for the selected door is 1/3. Opening one of the
other doors does not change this, so the probability for the other unopened door is 2/3.
So the winner should always switch!
You can play this game at:
http://math.ucsd.edu/~crypto/Monty/monty.html
Wikipedia has more information.
The situation is not the same as that in the TV programme ‘Deal or No Deal’.
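The Monty Hall answer can be checked by listing the 9 equally likely combinations of prize door and first choice. This sketch assumes Python and assumes the host always opens a door that is neither the chosen door nor the prize door, as described in the example; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

doors = [0, 1, 2]
cases = list(product(doors, doors))  # (prize door, chosen door), equally likely

stay_wins = 0
switch_wins = 0
for prize, pick in cases:
    # Host opens a door hiding no prize. When pick == prize there are two
    # such doors; which one is opened does not change whether switching wins.
    opened = next(d for d in doors if d != pick and d != prize)
    switched = next(d for d in doors if d != pick and d != opened)
    stay_wins += (pick == prize)
    switch_wins += (switched == prize)

p_stay = Fraction(stay_wins, len(cases))
p_switch = Fraction(switch_wins, len(cases))
print(p_stay, p_switch)  # 1/3 2/3
```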
Example: Suppose there are 2 identical opaque bags A and B.
Bag A contains 3 red counters and 1 blue counter.
Bag B contains 2 red counters and 2 blue counters.
One of the bags is chosen at random and a counter removed. If the selected counter is red,
what is the probability that the chosen bag was bag A?
Suppose we did 200 trials of this experiment.
What results would we expect to get?
          Red   Blue   Total
Bag A      75     25     100
Bag B      50     50     100
Total     125     75     200
Thus we would expect to get 125 red counters, and in 75 of these cases it would come
from bag A.
So P(Bag A | Red counter) = 75/125 = 0.6
Note: If we actually did 200 trials, we are very unlikely to get the exact table above.
Finding the expected numbers is just a device to make the calculations easier to understand.
A Tree Diagram (Rees §5.13) is a good way of representing small problems.
Bag A (probability 1/2):
    Red (3/4):   P = 3/8
    Blue (1/4):  P = 1/8
Bag B (probability 1/2):
    Red (1/2):   P = 2/8
    Blue (1/2):  P = 2/8
P(Bag A | Red counter) = (3/8) / (3/8 + 2/8) = 3/5 = 0.6
This is an example of the use of Bayes Rule. The calculations above using a table of
expected numbers or using a tree diagram can be written as a formula. Most students find
the formula unhelpful and prefer to use one of these other methods.
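For those who would like to see the tree-diagram calculation in code, here is a sketch in Python for the two-bag example. The dictionary names prior, pr_red_given_bag and joint are illustrative choices, not notation from Rees.

```python
from fractions import Fraction

# First branch of the tree: which bag is chosen
prior = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
# Second branch: probability of a red counter from each bag
pr_red_given_bag = {"A": Fraction(3, 4), "B": Fraction(2, 4)}

# Multiply along each branch that ends in Red
joint = {bag: prior[bag] * pr_red_given_bag[bag] for bag in prior}

# Add the Red branches, then take the share that came from bag A
pr_red = sum(joint.values())
pr_a_given_red = joint["A"] / pr_red

print(joint["A"], pr_red, pr_a_given_red)  # 3/8 5/8 3/5
```

The last two lines are the formula in disguise: the numerator is one branch of the tree and the denominator is the sum of all branches consistent with the observation.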
Example: Cystic Fibrosis is the most serious of the human genetic diseases in the UK.
5% of the population are carriers but have no symptoms. 25% of children of 2 carriers have the
disease.
There is a diagnostic test which is positive with probability 0.85 for carriers and is always
negative for non-carriers. If someone tests negative, what is the probability of being a carrier?
From 2000 individuals we would expect:
100 to be carriers (= 5%) and
85 of these to test positive.
We expect the rest (1900) to be non-carriers and to test negative.
               Positive   Negative   Total
Carriers             85         15     100
Non-carriers          0       1900    1900
Total                85       1915    2000
So P(Carrier | Negative) = 15/1915 = 0.0078
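The expected-numbers method used for the carrier test can be sketched in Python as follows. The function name carrier_given_negative and its parameter names are illustrative, and the code relies on the assumption stated in the example that non-carriers always test negative.

```python
from fractions import Fraction

def carrier_given_negative(population, pr_carrier, pr_positive_given_carrier):
    """P(Carrier | Negative) from a table of expected numbers."""
    carriers = population * pr_carrier
    non_carriers = population - carriers
    # Carriers who are missed by the test
    carriers_negative = carriers * (1 - pr_positive_given_carrier)
    # Non-carriers always test negative in this example
    negatives = carriers_negative + non_carriers
    return carriers_negative / negatives

p = carrier_given_negative(2000, Fraction(5, 100), Fraction(85, 100))
print(p, round(float(p), 4))  # 3/383 0.0078
```

The population size cancels in the final ratio, so 2000 is only a convenient choice that gives whole numbers in the table.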
Example: A factory makes equal numbers of cars on each weekday.
P(Fault | Monday) = 0.20
P(Fault | Friday) = 0.15
P(Fault | Other day) = 0.10
P(Monday car | Fault)
= 0.20 / (0.20 + 0.10 + 0.10 + 0.10 + 0.15)
= 20/65
= 0.308
A Venn diagram for this:
[Venn diagram with regions labelled Mon, Tue, Wed, Thu and Fri]