Discrete Probability
CSC-2259 Discrete Structures
Konstantin Busch - LSU
Introduction to Discrete Probability

Unbiased die

Sample space: the set of all possible outcomes
S = {1, 2, 3, 4, 5, 6}
Event: any subset of the sample space
E1 = {3}
E2 = {2, 5}

Experiment: a procedure that yields events
Example: throw the die
Probability of event E:

p(E) = |E| / |S|    (size of the event set over the size of the sample space)

Note that 0 ≤ p(E) ≤ 1, since 0 ≤ |E| ≤ |S|.
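To make the definition concrete, here is a minimal Python sketch (the function name is illustrative, not from the slides):

from fractions import Fraction

def probability(event, sample_space):
    # p(E) = |E| / |S| under the uniform distribution
    return Fraction(len(set(event) & set(sample_space)), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
print(probability({3}, die))     # 1/6
print(probability({2, 5}, die))  # 1/3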
What is the probability that a die shows 3?

Event: E = {3}
Sample space: S = {1, 2, 3, 4, 5, 6}
Probability: p(E) = |E| / |S| = 1/6
What is the probability that a die shows 2 or 5?

Event: E = {2, 5}
Sample space: S = {1, 2, 3, 4, 5, 6}
Probability: p(E) = |E| / |S| = 2/6 = 1/3
Two unbiased dice

Sample space: 36 possible outcomes
S = {(1,1), (1,2), (1,3), ..., (6,6)}
Each outcome is an ordered pair: (first die, second die).
What is the probability that two dice show (1,1)?

Event: E = {(1,1)}
Sample space: S = {(1,1), (1,2), (1,3), ..., (6,6)}
Probability: p(E) = |E| / |S| = 1/36
What is the probability that two dice show the same number?

Event: E = {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)}
Sample space: S = {(1,1), (1,2), (1,3), ..., (6,6)}
Probability: p(E) = |E| / |S| = 6/36 = 1/6
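The same answer can be checked by enumerating all 36 ordered pairs; a small sketch:

from itertools import product

S = list(product(range(1, 7), repeat=2))        # all 36 ordered pairs
E = [pair for pair in S if pair[0] == pair[1]]  # doubles
print(len(E), len(S), len(E) / len(S))          # 6 36 0.1666...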
Game with unordered numbers

The game authority selects a set of 6 winning numbers out of 40 (number choices: 1, 2, 3, ..., 40).
Example winning numbers: 4, 7, 16, 25, 33, 39

The player picks a set of 6 numbers (order is irrelevant).
Example player numbers: 8, 13, 16, 23, 33, 40

What is the probability that a player wins?
Winning event: E = {{4, 7, 16, 25, 33, 39}}, |E| = 1
(a single set with the 6 winning numbers)

Sample space:
S = {all subsets of 6 numbers out of 40}
  = {{1,2,3,4,5,6}, {1,2,3,4,5,7}, {1,2,3,4,5,8}, ...}

|S| = C(40, 6) = 3,838,380
Probability that the player wins:

p(E) = |E| / |S| = 1 / C(40, 6) = 1 / 3,838,380
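A quick check with Python's standard library:

from math import comb

print(comb(40, 6))      # 3838380
print(1 / comb(40, 6))  # ≈ 2.6e-07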
A card game

A deck has 52 cards: 13 kinds (2, 3, 4, 5, 6, 7, 8, 9, 10, a, k, q, j), each kind in 4 suits (h, d, c, s).
The player is dealt a hand of 4 cards.

What is the probability that the player's cards are all of the same kind?
Event: E = {{2h,2d,2c,2s}, {3h,3d,3c,3s}, ..., {jh,jd,jc,js}}
|E| = 13 (each set of 4 cards is of the same kind)

Sample space:
S = {all possible sets of 4 cards out of 52}
  = {{2h,2d,2c,2s}, {2h,2d,2c,3h}, {2h,2d,2c,3d}, ...}

|S| = C(52, 4) = 52! / (4! 48!) = (52 · 51 · 50 · 49) / (4 · 3 · 2) = 270,725
Probability that the hand has 4 cards of the same kind:

p(E) = |E| / |S| = 13 / C(52, 4) = 13 / 270,725
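Again straightforward to confirm numerically:

from math import comb

print(13 / comb(52, 4))  # ≈ 4.8e-05, i.e. exactly 1 in 20,825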
Game with ordered numbers

The game authority selects 5 balls from a bin, in some order; the balls are labeled with numbers 1...50 (number choices: 1, 2, 3, ..., 50).
Example winning numbers: 37, 4, 16, 33, 9

The player picks a sequence of 5 numbers (order is important).
Example player numbers: 40, 16, 13, 25, 33

What is the probability that a player wins?
Sampling without replacement: after a ball is selected, it is not returned to the bin.

Sample space size: 5-permutations of 50 balls
|S| = P(50, 5) = 50! / (50 - 5)! = 50! / 45! = 50 · 49 · 48 · 47 · 46 = 254,251,200

Probability of success: p(E) = |E| / |S| = 1 / 254,251,200
Sampling with replacement: after a ball is selected, it is returned to the bin.

Sample space size: 5-permutations of 50 balls with repetition
|S| = 50^5 = 312,500,000

Probability of success: p(E) = |E| / |S| = 1 / 312,500,000
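Both sample-space sizes in one short sketch:

from math import perm

print(perm(50, 5))  # 254251200  (without replacement)
print(50 ** 5)      # 312500000  (with replacement)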
Probability of the complement:

p(Ē) = 1 - p(E)

Proof: Ē = S - E, so |Ē| = |S| - |E|, and therefore

p(Ē) = |Ē| / |S| = (|S| - |E|) / |S| = 1 - |E| / |S| = 1 - p(E)

End of proof.
Example: What is the probability that a binary string of 8 bits contains at least one 0?

E = {01111111, 10111111, ..., 00111111, ..., 00000000}
Ē = {11111111}

p(E) = 1 - p(Ē) = 1 - |Ē| / |S| = 1 - 1/2^8
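A quick enumeration check of the complement rule:

from itertools import product

S = ["".join(bits) for bits in product("01", repeat=8)]
E = [s for s in S if "0" in s]
print(len(E) / len(S), 1 - 1 / 2**8)  # both 0.99609375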
Probability of union: for events E1, E2 ⊆ S,

p(E1 ∪ E2) = p(E1) + p(E2) - p(E1 ∩ E2)

Proof: |E1 ∪ E2| = |E1| + |E2| - |E1 ∩ E2|, so

p(E1 ∪ E2) = |E1 ∪ E2| / |S|
           = (|E1| + |E2| - |E1 ∩ E2|) / |S|
           = |E1| / |S| + |E2| / |S| - |E1 ∩ E2| / |S|
           = p(E1) + p(E2) - p(E1 ∩ E2)

End of proof.
Example: What is the probability that a binary string of 8 bits starts with 0 or ends with 11?

Strings that start with 0:
E1 = {00000000, 00000001, ..., 01111111}
|E1| = 2^7 (all binary strings of the form 0xxxxxxx)

Strings that end with 11:
E2 = {00000011, 00000111, ..., 11111111}
|E2| = 2^6 (all binary strings of the form xxxxxx11)
Strings that start with 0 and end with 11:
E1 ∩ E2 = {00000011, 00000111, ..., 01111111}
|E1 ∩ E2| = 2^5 (all binary strings of the form 0xxxxx11)

Strings that start with 0 or end with 11:

p(E1 ∪ E2) = p(E1) + p(E2) - p(E1 ∩ E2)
           = |E1| / |S| + |E2| / |S| - |E1 ∩ E2| / |S|
           = 2^7/2^8 + 2^6/2^8 - 2^5/2^8
           = 1/2 + 1/4 - 1/8
           = 5/8
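The same 5/8 falls out of direct enumeration; a short sketch:

from itertools import product

S = ["".join(bits) for bits in product("01", repeat=8)]
E = [s for s in S if s.startswith("0") or s.endswith("11")]
print(len(E) / len(S))  # 0.625 = 5/8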
Probability Theory

Sample space: S = {x1, x2, ..., xn}

A probability distribution function p satisfies:
0 ≤ p(xi) ≤ 1
Σ_{i=1}^{n} p(xi) = 1
Notice that we can have p(xi) ≠ p(xj).

Example: Biased coin
Heads (H) with probability 2/3, tails (T) with probability 1/3.

Sample space: S = {H, T}
p(H) = 2/3, p(T) = 1/3
p(H) + p(T) = 2/3 + 1/3 = 1
Uniform probability distribution: p(xi) = 1/n for sample space S = {x1, x2, ..., xn}.

Example: Unbiased coin
Heads (H) or tails (T), each with probability 1/2:
S = {H, T}, p(H) = p(T) = 1/2
Probability of event E = {x1, x2, ..., xk} ⊆ S:

p(E) = Σ_{i=1}^{k} p(xi)

For the uniform probability distribution this reduces to p(E) = |E| / |S|.
Example: Biased die, S = {1, 2, 3, 4, 5, 6}
p(1) = p(2) = p(3) = p(4) = p(5) = 1/7, p(6) = 2/7

What is the probability that the die outcome is 2 or 6?

E = {2, 6}
p(E) = p(2) + p(6) = 1/7 + 2/7 = 3/7
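Under a non-uniform distribution the event probability is the sum over its outcomes; a small sketch of this example:

from fractions import Fraction

# p(1) = ... = p(5) = 1/7, p(6) = 2/7
p = {x: Fraction(1, 7) for x in range(1, 6)}
p[6] = Fraction(2, 7)
print(sum(p[x] for x in {2, 6}))  # 3/7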
Combinations of events:

Complement: p(Ē) = 1 - p(E)
Union: p(E1 ∪ E2) = p(E1) + p(E2) - p(E1 ∩ E2)
Union of disjoint events: p(⋃_i Ei) = Σ_i p(Ei)
Conditional Probability

Three tosses of an unbiased coin, e.g. Tails, Heads, Tails.

Condition: the first coin is tails.
Question: What is the probability that there is an odd number of tails, given that the first coin is tails?
Sample space:
S = {TTT, TTH, THT, THH, HTT, HTH, HHT, HHH}

Restricted sample space given the condition (first coin is tails):
F = {TTT, TTH, THT, THH}
Event without the condition (odd number of tails):
E = {TTT, THH, HTH, HHT}

Event with the condition (first coin is tails):
E_F = E ∩ F = {TTT, THH}
F  {TTT , TTH , THT , THH }
EF  E  F  {TTT , THH }
Given condition,
the sample space changes to F
| E  F | | E  F | / | S | p( E  F ) 2 / 8
p( EF ) 



 0.5
|F|
|F|/|S|
p( F )
4/8
(the coin is unbiased)
Konstantin Busch - LSU
33
Notation for the event with a condition: E_F = E | F, the event E given F.

p(E_F) = p(E | F) = p(E ∩ F) / p(F)
Conditional probability definition (for an arbitrary probability distribution):

Given a sample space S with events E and F (where p(F) > 0), the conditional probability of E given F is:

p(E | F) = p(E ∩ F) / p(F)
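The three-toss example above, checked by enumeration:

from itertools import product

S = ["".join(t) for t in product("HT", repeat=3)]
F = [s for s in S if s[0] == "T"]              # condition: first toss is T
E_and_F = [s for s in F if s.count("T") % 2]   # odd number of tails
print(len(E_and_F) / len(F))                   # 0.5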
Example: What is the probability that a family of two children has two boys, given that one child is a boy? Assume equal probability to have a boy or a girl.

Sample space: S = {BB, BG, GB, GG}
Condition: F = {BB, BG, GB} (one child is a boy)
Event: E = {BB} (both children are boys)

Conditional probability of the event:

p(E | F) = p(E ∩ F) / p(F) = p({BB}) / p({BB, BG, GB}) = (1/4) / (3/4) = 1/3
Independent Events

Events E1 and E2 are independent iff:
p(E1 ∩ E2) = p(E1) p(E2)

Equivalent definition (if p(E2) ≠ 0):
p(E1 | E2) = p(E1)
Example: 4-bit uniformly random strings

E1: the string begins with 1
E2: the string contains an even number of 1s

E1 = {1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111}
E2 = {0000, 0011, 0101, 0110, 1001, 1010, 1100, 1111}
E1 ∩ E2 = {1111, 1100, 1010, 1001}

|E1| = |E2| = 8, |E1 ∩ E2| = 4
p(E1) = p(E2) = 8/16 = 1/2
p(E1 ∩ E2) = 4/16 = 1/4 = (1/2)(1/2) = p(E1) p(E2)

Events E1 and E2 are independent.
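The independence condition is mechanical to verify; a sketch:

from itertools import product

S = ["".join(b) for b in product("01", repeat=4)]
E1 = {s for s in S if s[0] == "1"}
E2 = {s for s in S if s.count("1") % 2 == 0}
p1, p2, p12 = len(E1) / 16, len(E2) / 16, len(E1 & E2) / 16
print(p12 == p1 * p2)  # True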
Bernoulli trial: an experiment with two outcomes, success or failure.

Success probability: p
Failure probability: q = 1 - p

Example: Biased coin
Success = heads: p = p(H) = 2/3
Failure = tails: q = p(T) = 1/3
Independent Bernoulli trials: the outcomes of successive Bernoulli trials do not depend on each other. Example: successive coin tosses.
Throw the biased coin 5 times. What is the probability of getting 3 heads?

Heads probability (success): p = 2/3
Tails probability (failure): q = 1/3
HHHTT
HTHHT
HTHTH
THHTH
...

Total number of ways to arrange 3 heads in a sequence of 5 tosses: C(5, 3)
The probability that any particular sequence has 3 heads and 2 tails in specified positions is p^3 q^2. For example:

HHHTT: p · p · p · q · q = p^3 q^2
HTHHT: p · q · p · p · q = p^3 q^2
HTHTH: p · q · p · q · p = p^3 q^2
Probability of having 3 heads:

p^3 q^2 + p^3 q^2 + ... + p^3 q^2 = C(5, 3) p^3 q^2

(one term per successful sequence with 3 heads: the 1st sequence, the 2nd sequence, ..., up to the C(5,3)-th sequence)
Throw the biased coin 5 times. The probability of exactly 3 heads:

C(5, 3) p^3 q^2 = (5! / (3! 2!)) (2/3)^3 (1/3)^2 = 10 · (8/27) · (1/9) = 80/243 ≈ 0.329

Here p^3 q^2 is the probability of 3 heads and 2 tails in specified sequence positions, and C(5, 3) counts all possible ways to arrange 3 heads in a sequence of 5 tosses.
Theorem: The probability of exactly k successes in n independent Bernoulli trials is

C(n, k) p^k q^(n-k)

This is also known as the binomial probability distribution:

b(k; n, p) = C(n, k) p^k q^(n-k)
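A direct sketch of the formula (the helper name is illustrative):

from math import comb

def binomial_pmf(k, n, p):
    # b(k; n, p) = C(n,k) p^k (1-p)^(n-k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(binomial_pmf(3, 5, 2/3))  # ≈ 0.329, the 3-heads example above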
Proof:

C(n, k) p^k q^(n-k)

C(n, k) is the total number of sequences with k successes and n - k failures; p^k q^(n-k) is the probability that a sequence has k successes and n - k failures in specified positions (example: SFSFFS...SSF).

End of proof.
Example: Random binary strings where the probability of a 0 bit is 0.9 and the probability of a 1 bit is 0.1.

What is the probability of exactly 8 zero bits out of 10 bits (e.g. 0100001000)?

p = 0.9, q = 0.1, k = 8, n = 10

b(k; n, p) = C(n, k) p^k q^(n-k) = C(10, 8) (0.9)^8 (0.1)^2 = 0.1937102445
Birthday Problem

Birthday collision: two people have their birthday on the same day.

Problem: How many people should be in a room so that the probability of a birthday collision is at least 1/2?

Assumption: equal probability of being born on any day.
There are 366 days in a year. If the number of people is 367 or more, then a birthday collision is guaranteed by the pigeonhole principle.

Assume that we have n ≤ 366 people.
We will compute p_n, the probability that n people all have different birthdays. This gives us 1 - p_n, the probability that there is a birthday collision among n people.
Sample space: the Cartesian product

S = {1, 2, ..., 366} × {1, 2, ..., 366} × ... × {1, 2, ..., 366}

with one factor for each person's birthday choices (1st person, 2nd person, ..., nth person):

S = {(1,1,...,1), (2,1,...,1), ..., (366,366,...,366)}

Sample space size: |S| = 366 · 366 ··· 366 = 366^n
Event: each person's birthday is different (the entries for the 1st, 2nd, ..., nth person are all distinct)

E = {(1, 2, ..., 366), (366, 1, ..., 365), ..., (366, 365, ..., 1)}

Event size: counting the number of birthday choices left for each person,

|E| = P(366, n) = 366! / (366 - n)! = 366 · 365 · 364 ··· (366 - n + 1)
Probability of no birthday collision:

p_n = |E| / |S| = (366 · 365 · 364 ··· (366 - n + 1)) / 366^n

Probability of a birthday collision: 1 - p_n

n = 22: 1 - p_n ≈ 0.475
n = 23: 1 - p_n ≈ 0.506
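A short sketch that reproduces both values:

def collision_probability(n, days=366):
    # 1 - p_n, where p_n = days · (days-1) ··· (days-n+1) / days^n
    p_no_collision = 1.0
    for i in range(n):
        p_no_collision *= (days - i) / days
    return 1 - p_no_collision

print(collision_probability(22))  # ≈ 0.475
print(collision_probability(23))  # ≈ 0.506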
Therefore n = 23 people have probability at least 1/2 of a birthday collision, since 1 - p_23 ≈ 0.506.
The birthday problem analysis can be used to determine appropriate hash table sizes that minimize collisions.

Hash function collision: h(k1) = h(k2) for distinct keys k1 ≠ k2.
Randomized algorithms: algorithms that make randomized choices (example: quicksort).

Las Vegas algorithms: randomized algorithms whose output is always correct (e.g. quicksort).

Monte Carlo algorithms: randomized algorithms whose output is correct with some probability (they may produce wrong output).
A Monte Carlo algorithm:

Primality_Test(n, k) {
    for (i = 1 to k) {
        b = random_number(1, ..., n)
        if (Miller_Test(n, b) == failure)
            return(false)    // n is not prime
    }
    return(true)             // most likely n is prime
}
Miller_Test(n, b) {
    write n - 1 = 2^s · t, where s, t ≥ 0, s ≤ log n, and t is odd
    for (j = 0 to s - 1) {
        if (b^t ≡ 1 (mod n) or b^(2^j · t) ≡ -1 (mod n))
            return(success)
    }
    return(failure)
}
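A runnable Python sketch of the two procedures above, using the standard equivalent form of the Miller test; the range chosen for the random base b is an implementation choice, not from the slides:

import random

def miller_test(n, b):
    # Write n - 1 = 2^s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    x = pow(b, t, n)
    if x == 1 or x == n - 1:      # b^t ≡ ±1 (mod n)
        return True
    for _ in range(s - 1):        # check b^(2^j t) ≡ -1 (mod n)
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False                  # b witnesses that n is composite

def primality_test(n, k):
    for _ in range(k):
        b = random.randint(2, n - 2)   # assumed base range (n > 3, odd)
        if not miller_test(n, b):
            return False               # n is certainly not prime
    return True                        # prime with probability >= 1 - (1/4)^k

print(primality_test(104729, 20))  # True: 104729 is prime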
A prime number n passes the Miller test for every b in the range 1 < b < n.

A composite number n passes the Miller test for fewer than n/4 of the values b in the range 1 < b < n.

So each trial gives a false positive with probability less than 1/4.
If the primality test algorithm returns false, then the number is certainly not prime.

If the algorithm returns true, then the answer is correct (the number is prime) with high probability:

1 - (1/4)^k ≥ 1 - 1/n   for k ≥ log_4 n
Bayes' Theorem

For p(E) > 0 and p(F) > 0:

p(F | E) = p(E | F) p(F) / (p(E | F) p(F) + p(E | F̄) p(F̄))

Applications: machine learning, spam filters.
Bayes' Theorem proof:

p(F | E) = p(E ∩ F) / p(E), so p(E ∩ F) = p(F | E) p(E)
p(E | F) = p(E ∩ F) / p(F), so p(E ∩ F) = p(E | F) p(F)

Therefore p(F | E) p(E) = p(E | F) p(F), which gives

p(F | E) = p(E | F) p(F) / p(E)
E  (E  F )  (E  F )
p( E )  p( E  F )  p( E  F )
(E  F )  (E  F )  
p( E  F )  p( E | F ) p( F )
p( E  F )  p( E | F ) p( F )
p( E )  p( E | F ) p( F )  p( E | F ) p( F )
Konstantin Busch - LSU
65
Substituting p(E) = p(E | F) p(F) + p(E | F̄) p(F̄) into p(F | E) = p(E | F) p(F) / p(E):

p(F | E) = p(E | F) p(F) / (p(E | F) p(F) + p(E | F̄) p(F̄))

End of proof.
Example: Select a random box, then select a random ball from that box.

(figure: Box 1 and Box 2, each containing red and green balls)

Question: If a red ball is selected, what is the probability it was taken from box 1?
E: select a red ball    Ē: select a green ball
F: select box 1         F̄: select box 2

Question probability: p(F | E) = ?
Bayes' Theorem:

p(F | E) = p(E | F) p(F) / (p(E | F) p(F) + p(E | F̄) p(F̄))

We only need to compute p(F), p(F̄), p(E | F), and p(E | F̄).
p(F) = 1/2 = 0.5 (probability of selecting box 1)
p(F̄) = 1/2 = 0.5 (probability of selecting box 2)
p(E | F) = 7/9 ≈ 0.777 (probability of selecting a red ball from box 1)
p(E | F̄) = 3/7 ≈ 0.428 (probability of selecting a red ball from box 2)
Putting it together:

p(F | E) = p(E | F) p(F) / (p(E | F) p(F) + p(E | F̄) p(F̄))
         = (0.777 · 0.5) / (0.777 · 0.5 + 0.428 · 0.5)
         = 0.777 / (0.777 + 0.428)
         ≈ 0.644

This is the final result.
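The whole computation in exact arithmetic (variable names are illustrative):

from fractions import Fraction

p_F = Fraction(1, 2)        # p(F): choose box 1
p_E_F = Fraction(7, 9)      # p(E|F): red ball from box 1
p_E_notF = Fraction(3, 7)   # p(E|not F): red ball from box 2

posterior = (p_E_F * p_F) / (p_E_F * p_F + p_E_notF * (1 - p_F))
print(posterior, float(posterior))  # 49/76 ≈ 0.6447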
What if we had more boxes? Generalized Bayes' Theorem:

p(Fj | E) = p(E | Fj) p(Fj) / Σ_{i=1}^{n} p(E | Fi) p(Fi)

where the sample space is S = F1 ∪ F2 ∪ ... ∪ Fn and the events Fi are mutually exclusive.
Spam Filters

Training set: spam (bad) emails B and good emails G. A user classifies each email in the training set as good or bad.
Find the words that occur in B and G:

nB(w): number of spam emails that contain word w
nG(w): number of good emails that contain word w

p(w) = nB(w) / |B|   (probability that a spam email contains w)
q(w) = nG(w) / |G|   (probability that a good email contains w)
A new email X arrives.

S: event that X is spam
E: event that X contains word w

What is the probability that X is spam, given that it contains word w? p(S | E) = ?

Reject the email if this probability is at least 0.9.
p(S | E) = p(E | S) p(S) / (p(E | S) p(S) + p(E | S̄) p(S̄))

We only need to compute:

p(S) = p(S̄) = 0.5   (simplifying assumption)
p(E | S) = p(w) = nB(w) / |B|   (computed from the training set)
p(E | S̄) = q(w) = nG(w) / |G|   (computed from the training set)
Example: Training set for the word "Rolex":

"Rolex" occurs in 250 of 2000 spam emails.
"Rolex" occurs in 5 of 1000 good emails.

If a new email contains the word "Rolex", what is the probability that it is spam?
"Rolex" occurs in 250 of 2000 spam emails:
nB(Rolex) = 250, so p(Rolex) = nB(Rolex) / |B| = 250 / 2000 = 0.125

"Rolex" occurs in 5 of 1000 good emails:
nG(Rolex) = 5, so q(Rolex) = nG(Rolex) / |G| = 5 / 1000 = 0.005
If a new email X contains the word "Rolex", what is the probability that it is spam?

S: event that X is spam
E: event that X contains the word "Rolex"

p(S | E) = ?
p(S | E) = p(E | S) p(S) / (p(E | S) p(S) + p(E | S̄) p(S̄))

We only need:

p(S) = p(S̄) = 0.5   (simplifying assumption)
p(E | S) = p(Rolex) = 0.125   (from the training set)
p(E | S̄) = q(Rolex) = 0.005   (from the training set)
0.125  0.5
0.125
p( S | E ) 

 0.961...
0.125  0.5  0.005  0.5 0.13
New email is considered to be spam because:
p( S | E )  0.961  0.9 spam threshold
Konstantin Busch - LSU
82
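The same computation as a small function; note that with the simplifying assumption p(S) = p(S̄) = 0.5 the priors cancel:

def spam_probability(n_spam_w, n_spam, n_good_w, n_good):
    p_w = n_spam_w / n_spam   # p(E|S), from the training set
    q_w = n_good_w / n_good   # p(E|not S), from the training set
    return p_w / (p_w + q_w)  # equal priors cancel out

print(spam_probability(250, 2000, 5, 1000))  # ≈ 0.9615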
Better spam filters use two words:

p(S | E1 ∩ E2) = p(E1 | S) p(E2 | S) / (p(E1 | S) p(E2 | S) + p(E1 | S̄) p(E2 | S̄))

Assumption: E1 and E2 are independent, i.e. the two words appear independently of each other.