Scientific Methods 1
‘Scientific evaluation, experimental design
& statistical methods’
COMP80131
Lecture 4: Statistical Methods-Probability
Barry & Goran
www.cs.man.ac.uk/~barry/mydocs/myCOMP80131
16 Nov 2011
Probability
There are two useful definitions of probability:
1. Bayesian probability is a person’s belief in the truth of a
statement S, quantified on a scale from 0 (definitely not true)
to 1 (definitely true).
2. Experimental (or frequentist) probability is determined by the
number, M, of times that a statement S is found to be
true when it is tested a large number, N, of times.
The probability, P(S), may then be defined as the limit of
M / N as more and more experiments are carried out & N
tends to infinity.
Different language
• By either definition, probability P(S) is a number in the range 0 to 1.
• Multiply by 100 to express as a percentage.
• Or express as odds:
e.g. ‘4 to 1 against’ means 1/5 = 0.2 = 20%.
• What do odds of ‘4 to 1 on’ mean?
• What does ‘50-50’ mean ?
Calculating probability
• The 2 definitions of probability usually mean the same thing.
• By examining a coin, we could give ourselves good reason for
believing that tossing it just once will give an even chance of
getting heads, i.e. that
the Bayesian definition of P(S) = 0.5 where S = ‘get heads’.
• If the coin is then tossed N = 100 times we would expect about
M = 50 occurrences of heads, meaning that M/N ≈ 0.5.
• Increasing N to 1000 and then to 1,000,000 would be expected to
produce closer & closer approximations to P(S) = 0.5.
• If this does not happen, our ‘a-priori’ belief may be wrong.
• The coin may be ‘weighted’ after all.
Random process
• Tossing a coin is a random process.
• It generates a ‘random variable’: Heads or Tails.
• It is random because the outcome cannot be predicted exactly.
• If 1 = heads and 0 = tails we have a random binary number.
• Throwing a dice generates a random integer in the range 1-6.
• Spinning a Roulette wheel generates a random number in the range 0-36.
• Setting & marking an exam produces random numbers in the range 0-100.
• These are all random processes producing discrete variables.
• Some random processes produce continuous variables,
e.g. measuring people’s heights.
Simulating random process
• MATLAB has functions that generate pseudo-random numbers.
• ‘rand’ produces a pseudo-random number ‘uniformly distributed’
in the range 0 to 1.
• May be considered ‘continuous’ since floating-point resolution is very fine.
• Calling ‘rand’ repeatedly produces numbers evenly distributed
across the range 0 to 1.
• They are ‘pseudo-random’ because if we know the algorithm
used, we can predict the numbers.
• So we pretend we do not know the algorithm.
• ‘rand’ may be considered to simulate some random process that
generates truly random numbers, uniformly distributed.
Simulating coin tossing in MATLAB
for n=1:20
R = rand;
if R > 0.5, Heads(n)=1; else Heads(n)=0; end;
end; % of n loop
Heads
10110001110101011101 - 12 heads & 8 tails
When I changed 20 to 10,000, I got 5066 heads: P(Heads) ≈ 0.5066
When I ran it again, I got 4918 heads: P(Heads) ≈ 0.4918
Using an unfair coin
for n=1:20
R = rand;
if R > 0.4, Heads(n)=1; else Heads(n)=0; end;
end; % of n loop
Heads
00101001110101010101 - 10 heads & 10 tails
• When I changed 20 to 10,000, I got 6012 heads: P(Heads) ≈ 0.6012
• When I ran it again, I got 5979 heads: P(Heads) ≈ 0.5979
Estimating probability experimentally
• We cannot measure probability with 100% accuracy.
• All measurements are estimates that may be slightly or totally
wrong.
• According to the experimental definition, we would have to perform
an experiment an infinite number of times to measure a probability.
• This is clearly impossible.
• In practice, we have to perform the experiment a finite number of
times (we cannot spend all our lives tossing coins).
• Accept the resulting measurement as an estimate of the true probability.
Bayesian Definition
• According to the Bayesian definition of probability, a person’s
belief in the truth of a statement may be affected by one or
more assumptions (hypotheses).
• “I assume it is a fair coin”
• Different people may have different beliefs.
• We can only estimate probability using the information we have at
hand, though we can modify this estimate later if we get new
information.
Conditional probability
• P(S | S1) means the probability of statement S being true given
that we know that another statement, S1, is definitely true.
• If S stands for ‘get heads’ we may at first believe that P(S) = 0.5.
• But what if someone tells us that the statement S1:
‘coin is weighted with heavier metal on one side’,
is true?
• We may change our estimate of the probability to P(S | S1).
• P(S) is then referred to as the ‘prior’ probability.
• P(S | S1) is the ‘conditional’ or ‘posterior’ probability.
Bayes Theorem
• Expresses the probability of some fact ‘A’ being true when we
know that some other fact ‘B’ is true:

P(A | B) = P(B | A) × P(A) / P(B)

• P(A) is ‘prior’ as it does not take into account any information
about B.
• Similarly P(B) is ‘prior’.
• P(A | B) and P(B | A) are conditional or ‘posterior’ probabilities.
• Let A = ‘coin is fair’ & B = ‘getting 12 heads out of 20’:
P(A | B) = P(B | A) × P(A) / P(B)
What is prob of getting 12 heads out of 20?
clear all;
% WITH FAIR COIN
HIS=zeros(21,1);
for rep=1:1000
for n=1:20
R = rand; % Unif random number between 0 & 1
if R > 0.5, Heads(n)=0; else Heads(n)=1; end;
end; % of n loop
Count = sum(Heads);
HIS(1+Count) = HIS(1+Count)+1;
end; % of rep loop
figure(1); stem(0:20,HIS);
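To turn the histogram counts into the probability-distribution estimate shown on the following slides, a minimal sketch is simply to divide by the number of trials:
figure(2); stem(0:20, HIS/1000); % estimated P(0..20 heads out of 20 tosses)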
Histogram for 1000 trials
FAIR COIN
[Figure: histogram; x-axis: Number of Heads obtainable with 20 coin-tosses (0-20); y-axis: Frequency out of 1000 trials (0-200).]
Estimate of probability distribution
FAIR COIN
[Figure: estimate of prob distribution based on 1000 trials; x-axis: Number of Heads obtainable with 20 coin-tosses (0-20); y-axis: estimated probability (0-0.2).]
Probability estimate (fair coin)
Estimated probabilities:
for 0:9 heads
0 0 0 0 0.008 0.011 0.024 0.087 0.119 0.160
for 10:19 heads
0.194 0.157 0.115 0.076 0.03 0.012 0.003 0.003 0.001 0
for 20 heads
0
So our estimate of the probability of getting 12 heads out of 20
with a fair coin is 0.115.
What is prob of getting 12 heads out of 20?
clear all;
%WITH 60-40 WEIGHTED COIN
HIS=zeros(21,1);
for rep=1:1000
for n=1:20
R = rand; % Unif random number between 0 & 1
if R > 0.4, Heads(n)=1; else Heads(n)=0; end;
end; % of n loop
Count = sum(Heads);
HIS(1+Count) = HIS(1+Count)+1;
end; % of rep loop
figure(1); stem(0:20,HIS);
HISTOGRAM for ‘60-40’ weighted coin
[Figure: histogram; x-axis: Number of Heads obtainable with 20 coin-tosses (0-20); y-axis: Frequency out of 1000 trials (0-200).]
Prob distribution estimate for ‘60-40’ weighted coin
[Figure: estimate of prob distribution based on 1000 trials; x-axis: Number of Heads obtainable with 20 coin-tosses (0-20); y-axis: estimated probability (0-0.2).]
Estimate Cumulative Prob Distrib
CDF(1)= HIS(1)/1000;
for n=2:21,
CDF(n)=CDF(n-1)+HIS(n)/1000;
end;
figure(3); stem(0:20,CDF);
The CDF is easily derived from a histogram or prob distribution.
CDF(1+n) estimates the prob of getting between 0 and n Heads.
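For example, a minimal usage sketch (assuming CDF has just been built from the fair-coin HIS counts above):
P_atMost12 = CDF(1+12) % estimated P(0 to 12 heads); roughly 0.87 for a fair coin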
Estimate of Cumulative Prob Dist
FAIR COIN
[Figure: estimate of cumulative prob dist based on 1000 trials; x-axis: Number of Heads obtainable with 20 coin-tosses (0-20); y-axis: cumulative probability (0-1). Usually an S-shaped function.]
4 coin-tosses: how many possible outcomes?
0000  0001  0010  0011
0100  0101  0110  0111
1000  1001  1010  1011
1100  1101  1110  1111
How many with 0 heads? 1
How many with 1 head? 4 = 4C1
How many with 2 heads? 6 = 4C2 = (4×3)/(2!)
How many with 3 heads? 4 = 4C3
How many with 4 heads? 1
Combinations:
nCr = no of ways of choosing r from n
= n(n-1) …(n-r+1) / (r!)
Binomial Prob Distribution
• Distributions have up to now been estimated.
• For random processes with just 2 outputs, we can derive a true
distribution:
• If p = prob(Heads), the prob of getting Heads exactly r times in n
independent coin-tosses is:
nCr p^r (1-p)^(n-r)
• For a fair coin, p = 0.5, so this becomes nCr / 2^n
• For a fair dice, the prob of throwing 3 sixes in five throws is:
[5×4×3 / (3×2×1)] × (1/6)^3 × (5/6)^2
Implementing formula (fair coin)
p = 0.5; % for fair coin tossing
n=20;
for r=0:n
nCr = prod(n:-1:(n-r+1))/prod(1:r);
P(1+r) = nCr * (p^r) * (1-p)^(n-r);
end;
figure(4); stem(0:20,P);
axis([0 20 0 0.2]); grid on;
True prob distribution (n=20)
Fair coin
[Figure: true probability of getting each no of heads; x-axis: No of heads obtainable with n coin-tosses (0-20); y-axis: true probability (0-0.2).]
True probability from formula
For 0-9 heads:
0 0 0.0002 0.0011 0.0046 0.015 0.037 0.074 0.12 0.16
For 10-19 heads:
0.176 0.16 0.12 0.074 0.037 0.015 0.0046 0.0011 0.0002 0
For 20 heads:
0
True prob of getting 12 heads with a fair coin is 0.12.
Changing p to 0.6 (the heads probability of the ‘60-40’ weighted
coin), we find that the true probability of getting 12 heads out
of 20 is 0.18.
Back to Bayes Theorem
• There are 2 coins: a fair one & a ‘60-40’ weighted one.
• We choose a coin at random & toss it 20 times.
• What is the probability of having a weighted coin when I get
12 heads out of 20?
• A = ‘coin is weighted 60-40’ & B = ‘get 12 heads out of 20’.
• We know that P(B | fair coin) is 0.12 & P(B | A) is 0.18.
• Each coin is equally likely to be chosen, so P(A) = 0.5 and
P(B) will be the average of 0.12 & 0.18 = 0.15.
• P(A | B) = P(B | A) × P(A) / P(B)
= 0.18 × 0.5 / 0.15 = 0.6
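A minimal MATLAB sketch of this calculation (probabilities taken from the slides above; variable names are illustrative):
pA = 0.5; % prior: coin chosen at random, so P(weighted) = 0.5
pB_A = 0.18; % P(12 heads out of 20 | weighted coin)
pB_fair = 0.12; % P(12 heads out of 20 | fair coin)
pB = pA*pB_A + (1-pA)*pB_fair; % total probability of B: 0.15
pA_B = pB_A * pA / pB % Bayes: P(weighted | 12 heads) = 0.6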
Further illustration of Bayes Theorem
• At a college there are:
10 students from France: 5 girls & 5 boys
15 from the UK: 5 girls & 10 boys
20 from Canada: 5 girls & 15 boys
Calculation
• If we choose a student at random, the a-priori probability that
this student is French is P(French) = 10/45 = 2/9  0.22
• Now if we notice that this student is a boy, how does this
change the probability that the student is French?
• Use Bayes’ Theorem as follows:
P(French | Boy) = P(Boy | French) × P(French) / P(Boy)
= 0.5 × (10/45) / (30/45) = 1/6 ≈ 0.167
• The fact that we notice that the chosen student is a boy
gives us additional information that changes the probability
that the student chosen at random will be French.
Check the calculation
• We can check the previous result by common sense, noticing
that, of the 30 boys in the college, 5 are from France.
Therefore, P(F | B) = 5/30 = 1/6.
Usefulness of Bayes Theorem
• In general Bayes’ theorem allows us to take additional
information into account when calculating probabilities.
Without the additional information, we have a ‘prior’
probability and with it we have a ‘conditional’ or
‘posterior’ probability.
Bayes Theorem in medicine
• A patient goes to a doctor with a bad cough & a fever.
The doctor needs to decide whether he has ‘swine flu’.
• Let statement S = ‘has bad cough and fever’ and
statement F = ‘has swine flu’.
• The doctor consults his medical books and finds that
about 40% of patients with swine-flu have these same
symptoms.
• Assuming that, currently, about 1% of the population is
suffering from swine-flu and that currently about 5%
have bad cough and fever (due to many possible causes
including swine-flu), we can apply Bayes theorem to
estimate the probability of this particular patient having
swine-flu.
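Carrying these numbers through Bayes’ theorem as a worked example:
P(F | S) = P(S | F) × P(F) / P(S) = 0.4 × 0.01 / 0.05 = 0.08
So, on these figures, the probability of this patient having swine-flu is about 8%.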
Another problem to solve
• A doctor in another country knows from his text-books that for
40% of patients with swine-flu, the statement S, ‘has bad
cough and fever’ is true. He sees many patients and comes to
believe that the probability that a patient with ‘bad cough and
fever’ actually has swine-flu is about 0.1 or 10%. If there were
reason to believe that, currently, about 1% of the population
have a bad cough and fever, what percentage of the population
is likely to be suffering from swine-flu?
Some questions from Lecture 2
• Analyse the fictitious exam results & comment on features.
• Compute means, stds & vars for each subject & histograms for the
distributions.
• Make observations about performance in each subject & overall
• Do marks support the hypothesis that people good at Music are also
good at Maths?
• Do they support the hypothesis that people good at English are also
good at French?
• Do they support the hypothesis that people good at Art are also good
at Maths?
• If you have access to only 50 rows of this data, investigate the same
hypotheses
• What conclusions could you draw, and with what degree of certainty?
Continuous random processes
• Characterised by probability density functions (pdf)
Uniform pdf: the prob of the random variable x lying between a and b
(within the unit-height interval shown) is:
Prob = ∫ (from a to b) pdf(x) dx = b − a

Gaussian (Normal) pdf with mean m & std dev σ:
pdf(x) = [1 / (σ√(2π))] exp[−½((x − m)/σ)²]
Prob = ∫ (from a to b) pdf(x) dx
About 68% of the probability lies within m ± σ;
95.5% for m ± 2σ; 99.7% for m ± 3σ.
[Figure: sketches of the uniform pdf between a and b, and of the Gaussian pdf with the region from m − σ to m + σ marked.]
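A small MATLAB sketch checking the 68% figure numerically (a hedged example; ‘integral’ is available in newer MATLAB versions):
m = 0; sigma = 1; % standard Normal
pdf = @(x) exp(-0.5*((x-m)/sigma).^2) / (sigma*sqrt(2*pi));
Prob = integral(pdf, m-sigma, m+sigma) % ≈ 0.6827, i.e. about 68%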
pdf & Histograms
Ru = rand(10000,1); %10000 unif samples
hist(Ru,20);
Rg=randn(10000,1); %Gaussian with m=0, std=1
hist(Rg,20);
[Figure: two 20-bin histograms of the 10000 samples; left: uniform samples on 0 to 1, roughly flat at about 500 per bin; right: Gaussian samples (m=0, std=1), bell-shaped, spanning roughly −4 to 4.]
Converting histogram to estimate of pdf
• Divide each column by the number of samples.
• Then divide by the bin width (for the unit range of ‘rand’, this
is the same as multiplying by the number of bins).
• For a better approximation, increase the number of bins (and the
number of samples).
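A short MATLAB sketch of this normalisation for the uniform samples above (variable names are illustrative):
Ru = rand(10000,1); % 10000 uniform samples
[counts, centres] = hist(Ru, 20); % 20-bin histogram
binwidth = centres(2) - centres(1);
pdf_est = counts / (10000 * binwidth); % ≈ 1 everywhere for ‘rand’
figure; stem(centres, pdf_est);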
Concept of a ‘null-hypothesis’
• A null-hypothesis is an assumption that is made and then
tested by a set of experiments designed to reveal that it is
likely to be false, if it is false.
• Testing is done by considering how probable the results are,
assuming the null hypothesis is true.
• If the results appear very improbable the researcher may
conclude that the null-hypothesis is likely to be false.
• This is usually the outcome the researcher hopes for when he
or she is trying to prove that a new technique is likely to have
some value.
An example
• Assume we wish to find out if a proposed technique designed
to benefit users of a system is likely to have any value.
• Divide the users into two groups and offer the proposed
technique to one group and something different to the other
group.
• The null-hypothesis would be that the proposed technique
offers no measurable advantage over the other techniques.
The testing
• This would be carried out by looking for differences between
the sets of results obtained for each of the two groups.
• Careful experimental design will try to eliminate differences
not caused by the techniques being compared.
• Must take a large number of users in each group & randomize
the way the users are assigned to groups.
• Once other differences have been eliminated as far as possible,
any remaining difference will hopefully be indicative of the
effectiveness of the techniques being investigated.
• The vital question is whether they are likely to be due to the
advantages of the new technique, or the inevitable random
variations that arise from the other factors.
• Are the differences statistically significant?
• Can employ a statistical significance test to find out.
Failure of the experiment
• If the results are not found to look improbable under the
null-hypothesis, i.e. if the differences between the two groups
are not statistically significant, then no conclusion can be made.
• The null-hypothesis could be true, or it could still be false.
• It would be a mistake to conclude that the ‘null-hypothesis’
has been proved likely to be true in this circumstance.
• It is quite possible that the results of the experiment give
insufficient evidence to make any conclusions at all.
P-Value
• Probability of obtaining a test result at least as extreme as the
one that was actually observed, assuming that the null
hypothesis is true.
• Reject the null hypothesis if the p-value is less than some
value α (significance level) which is often 0.05 or 0.01.
• When the null-hypothesis is rejected, the result is said to be
statistically significant.
Checking whether a coin is fair
Suppose we obtain heads 14 times out of 20 flips.
The p-value for this test result would be the probability of a fair
coin landing on heads at least 14 times out of 20 flips. This is:
(20C14 + 20C15 + 20C16 + 20C17 + 20C18 + 20C19 + 20C20) / 2^20 ≈ 0.058
This is probability that a fair coin would give a result as extreme
or more extreme than 14 heads out of 20.
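A one-line MATLAB sketch of this p-value calculation:
pval = sum(arrayfun(@(r) nchoosek(20,r), 14:20)) / 2^20 % ≈ 0.0577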
Significance test
• Reject null hypothesis if p-value < α.
• If α= 0.05, the rejection of the null hypothesis is at the 5%
(significance) level.
• The probability of wrongly rejecting the null-hypothesis
(Type 1 error) will be equal to α.
• This is considered sufficiently low.
• In this case, p-value > 0.05, therefore observation is consistent
with null hypothesis and we cannot reject it.
• Cannot conclude that coin is likely to be unfair.
• But we have NOT proved that coin is likely to be fair.
• 14 heads out of 20 flips can be ascribed to chance alone
• It falls within the range of what could happen 95% of the time
with a fair coin.