Chapter 5 Sets, Etc.
A set is a collection of distinguishable objects, called members or elements
x ∈ S
S = ∅ (the empty set)
Z = {…, -2, -1, 0, 1, 2, …}
N = {0, 1, 2, …}
R ≣ The set of real numbers
A ⊆ B ≣ A is a subset of B.
A ⊂ B ≣ A is a proper subset of B
A = B iff A ⊆ B and B ⊆ A
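Idempotency
A ∩ A = A, A ∪ A = A
Commutativity
A ∩ B = B ∩ A, A ∪ B = B ∪ A
Associativity
(A ∩ B) ∩ C = A ∩ (B ∩ C), (A ∪ B) ∪ C = A ∪ (B ∪ C)
Distributivity
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
DeMorgan’s Law
A - (B ∩ C) = (A - B) ∪ (A - C)
A - (B ∪ C) = (A - B) ∩ (A - C)
When all the sets under consideration are subsets of some larger set U, called the
universe, we can define the complement of a set A as $\overline{A} = U - A$. DeMorgan’s Law then reads
$\overline{A \cap B} = \overline{A} \cup \overline{B}$
$\overline{A \cup B} = \overline{A} \cap \overline{B}$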
A partition of a set S is a collection {S1, S2, …, Sn} of subsets of S, s.t.
(1) Si ∩ Sj = ∅ for all i ≠ j
(2) S1 ∪ S2 ∪ … ∪ Sn = S
An infinite set that can be put into a one-to-one correspondence with N is called a
countable set.
e.g. N is countable
Z is countable, via the correspondence K ↦ 2K - 1 if K > 0, and K ↦ -2K if K ≤ 0
|A ∪ B| = |A| + |B| − |A ∩ B|
The set of all subsets of a set S, denoted 2^S, is called the power set of S
e.g. 2^{a, b} = {∅, {a}, {b}, {a, b}}
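The power set of a small finite set can be enumerated directly. The following is a minimal Python sketch, not part of the original notes; the helper name power_set is illustrative, and subsets are returned as frozensets so that they can themselves be set members.

```python
from itertools import chain, combinations

def power_set(s):
    """Return all subsets of s as a list of frozensets."""
    items = list(s)
    # Subsets of size 0, 1, ..., |s|, chained into one sequence.
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

subsets_ab = power_set({"a", "b"})
print(len(subsets_ab))  # 4
```

A set with |S| elements has 2^|S| subsets, which is one way to read the notation 2^S.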
An ordered pair of 2 elements a and b is denoted (a, b), and can be defined
formally as (a, b) = {a, {a, b}}. Thus (a, b) ≠ (b, a) whenever a ≠ b
The Cartesian product of sets A and B, denoted A × B, is
A × B = {(a, b) : a ∈ A and b ∈ B}
The Cartesian product of n sets A1, A2, …, An is the set of n-tuples
A1 × A2 × … × An = {(a1, a2, …, an) : ai ∈ Ai, i = 1, 2, …, n}
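As a hedged illustration, a Cartesian product of finite sets can be enumerated with Python's itertools.product. The sets A and B below are made-up examples, and ordered pairs are modelled as Python tuples.

```python
from itertools import product

A = {1, 2}
B = {"x", "y", "z"}

# A × B as a set of ordered pairs (tuples).
AxB = set(product(A, B))

# A three-fold product A × B × A: each element is a 3-tuple.
triples = set(product(A, B, A))

print(len(AxB), len(triples))  # 6 12
```

For finite sets the sizes multiply: |A × B| = |A| · |B|.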
Relations
A binary relation R on 2 sets A and B is a subset of the Cartesian product A × B
If (a, b) ∈ R, we sometimes write aRb
R is a binary relation on A if R is a subset of A × A
e.g. “≤” is a binary relation on N : {(a, b) : a, b ∈ N and a ≤ b}
Properties of a binary relation R on a set A
R is reflexive if aRa for all a ∈ A
R is symmetric if aRb ⇒ bRa
R is transitive if aRb, bRc ⇒ aRc
R is an equivalence relation if R is reflexive, symmetric and transitive
The equivalence class of a ∈ A is {b ∈ A : aRb}
e.g. R = {(a, b) : a, b ∈ N and a + b is even}
R forms 2 equivalence classes
{1, 3, 5, …}
{0, 2, 4, 6, …}
The equivalence classes of an equivalence relation on A form a partition of A, and conversely every partition of A determines an equivalence relation on A
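The three defining properties and the resulting classes can be checked by brute force on a finite set. The sketch below is an illustration only: it restricts the "a + b is even" relation to {0, …, 7} as a finite stand-in for N, and all function names are hypothetical.

```python
def is_reflexive(A, R):
    return all((a, a) in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    # Whenever aRb and bRd both hold, aRd must hold too.
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def equivalence_classes(A, R):
    """Group the elements of A into classes {b in A : aRb}."""
    classes = []
    for a in sorted(A):
        cls = frozenset(b for b in A if (a, b) in R)
        if cls not in classes:
            classes.append(cls)
    return classes

# Finite stand-in for N (assumption: the relation in the notes is on all of N).
A = set(range(8))
R = {(a, b) for a in A for b in A if (a + b) % 2 == 0}

classes = equivalence_classes(A, R)
```

Here classes holds the even numbers and the odd numbers of A; the two classes are disjoint and their union is A, i.e. they partition A.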
R is antisymmetric if aRb and bRa ⇒ a = b
A relation that is reflexive, antisymmetric and transitive is a partial order
e.g. the “is a descendant of” relation on people is a partial order, if each person is counted as their own descendant
A partial order R on a set A is total if for all a, b ∈ A, we have aRb or bRa
e.g. “≤” is a total order
Exercises
5.2-2, 5.2-5
Functions
A function f is a binary relation on A and B s.t. for all a ∈ A, there exists
precisely one b ∈ B s.t. (a, b) ∈ f. A is the domain of f and B is the codomain.
If (a, b) ∈ f, we write b = f(a)
e.g. f = {(a, b) : a ∈ N and b = a mod 2} is a function f : N → {0, 1}
e.g. g = {(a, b) : a, b ∈ N and a + b is even} is NOT a function. Why?
When the domain of a function f is a Cartesian product f : A1 × A2 × … × An → B,
we write f(a1, a2, …, an) = b instead of f((a1, a2, …, an)) = b
The range of f is defined as {b ∈ B : b = f(a) for some a ∈ A}
Properties
A function f is a surjection if range of f = codomain of f
A function f is an injection if f(a) ≠ f(b) for all a, b ∈ A with a ≠ b
A function is a bijection, also called a one-to-one correspondence, if it is injective and
surjective
e.g. f(n) = (-1)^n ⌈n/2⌉ is a bijection from N to Z
When a function f is bijective, its inverse f^-1 is defined by
f^-1(b) = a iff f(a) = b
e.g. for f(n) = (-1)^n ⌈n/2⌉,
f^-1(m) = 2m if m ≥ 0
f^-1(m) = -2m - 1 if m < 0
Q: f(k) = 2k - 1 if k > 0
   f(k) = -2k if k ≤ 0
   what is f^-1?
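The worked pair f(n) = (-1)^n ⌈n/2⌉ and its inverse can be checked mechanically on a finite range. This is a minimal Python sketch with illustrative names f and f_inv; it covers the worked example only and leaves the Q above open.

```python
def f(n):
    """f(n) = (-1)^n * ceil(n/2), mapping N = {0, 1, 2, ...} to Z."""
    half_up = (n + 1) // 2          # ceil(n / 2) for integers n >= 0
    return half_up if n % 2 == 0 else -half_up

def f_inv(m):
    """Inverse of f: 2m if m >= 0, and -2m - 1 if m < 0."""
    return 2 * m if m >= 0 else -2 * m - 1

values = [f(n) for n in range(7)]
print(values)  # [0, -1, 1, -2, 2, -3, 3]
```

Composing in both directions returns the starting value, which is what being a bijection with inverse f^-1 requires.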
Graph
A directed graph G is an ordered pair (V, E), where V is a finite set and E is a
binary relation on V
In an undirected graph, E is a set of 2-element subsets of V (i.e. E consists of unordered
pairs)
In a directed graph, we say an edge (u, v) is incident from u and incident to v. We
can also say v is adjacent to u. For an undirected graph (V, E), we say an edge (u,
v) ∈ E is incident on u and v.
The degree of a vertex in an undirected graph is the # of edges incident on it.
For a vertex in a directed graph, we distinguish the degree as in-degree and
out-degree.
A path of length k is a sequence <v0, v1, …, vk> s.t. (vi-1, vi) ∈ E for i = 1, …, k.
If there is a path from u to v, we say v is reachable from u.
A simple path is one in which all vertices are distinct.
A cycle is a path <v0, v1, …, vk> s.t. v0 = vk.
A cycle <v0, v1, …, vk> is simple if v1, v2, …, vk are distinct.
A graph with no cycles is acyclic.
An undirected graph is connected if every pair of vertices is connected by a path.
A directed graph is strongly connected if every 2 vertices are reachable from
each other.
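The definitions of degree, reachability and strong connectivity translate into a few lines of code. The sketch below assumes a directed graph stored as an adjacency dictionary (an implementation choice, not something the notes prescribe) and uses breadth-first search for reachability; all names are illustrative.

```python
from collections import deque

def make_graph(V, E):
    """Adjacency dict: u -> set of v with (u, v) in E."""
    adj = {v: set() for v in V}
    for u, v in E:
        adj[u].add(v)
    return adj

def out_degree(adj, u):
    return len(adj[u])

def in_degree(adj, v):
    return sum(1 for u in adj if v in adj[u])

def reachable(adj, s):
    """Vertices reachable from s (s itself included, via the length-0 path)."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def strongly_connected(adj):
    # Every vertex must reach every vertex.
    return all(reachable(adj, s) == set(adj) for s in adj)

V = {1, 2, 3, 4}
E = {(1, 2), (2, 3), (3, 1), (3, 4)}
G = make_graph(V, E)
```

In this example every vertex is reachable from 1, but nothing other than 4 is reachable from 4, so G is not strongly connected.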
2 graphs G = (V, E) and G’ = (V’, E’) are isomorphic if there exists a bijection
f : V → V’ s.t. (u, v) ∈ E iff (f(u), f(v)) ∈ E’. See Fig. 5.3 on p.89.
A graph G’ = (V’, E’) is a subgraph of G = (V, E) if V’ ⊆ V and E’ ⊆ E.
A complete graph is an undirected graph in which every pair of vertices is
adjacent.
A bipartite graph is an undirected graph G = (V, E) in which V can be partitioned
into 2 sets V1 and V2 s.t. for all (u, v) ∈ E, either u ∈ V1, v ∈ V2 or u ∈ V2, v ∈ V1.
A multigraph is like an undirected graph, but it allows multiple edges between
vertices and self-loops.
A hypergraph is like an undirected graph, but each edge is allowed to connect
more than 2 vertices.
Exercise
5.4-7 on page 90
Chapter 6. Counting and Probability
Q: Argue that
$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$$
Q: Argue that
$$\binom{n}{j+k} \le \binom{n}{j}\binom{n-j}{k}$$
Def. A sample space is a set of elementary events, each of which is a possible
outcome of an experiment.
e.g. Flipping 2 distinguishable coins
S = {HH, HT, TH, TT}
Def. An event is a subset of sample space S
Def. A probability distribution Pr{} on a sample space S is a mapping from events of
S to real numbers s.t.
1. Pr{A} ≥ 0 for any event A
2. Pr{S} = 1
3. Pr{A ∪ B} = Pr{A} + Pr{B} for any 2 mutually exclusive events A and B
Def. A probability distribution is discrete if it is defined over a finite or countably
infinite sample space
We often deal with the uniform probability distribution. That is,
Pr{s} = 1/|S|, where s ∈ S
In that case, we say we pick an element of S at random
e.g. Flipping a fair coin n times
S = {H, T}^n
Def. The continuous uniform probability distribution is defined over a closed interval
[a, b] of reals, where a < b, s.t.
Pr{[c, d]} = (d - c)/(b - a), for any interval [c, d] where a ≤ c ≤ d ≤ b
A discrete random variable X is a function from a finite or countably infinite
sample space S to real numbers
e.g. Flipping a coin n times
S = {H, T}^n
X ≡ # of heads appeared
For n = 3, X maps each outcome in S to a number:
X = 3: HHH
X = 2: HHT, HTH, THH
X = 1: HTT, THT, TTH
X = 0: TTT
We define X = x as {s ∈ S : X(s) = x}
That is, X = 2 is equivalent to {HHT, HTH, THH}
f(2) = Pr{X = 2} = 3/8
In general,
$$f(x) = \Pr\{X = x\} = \binom{n}{x}\left(\frac{1}{2}\right)^{x}\left(\frac{1}{2}\right)^{n-x}$$
f(x) is the probability density function of X
e.g. Tossing a pair of 6-sided dice
X ≡ maximum of 2 values
S: the 36 ordered pairs (1, 1), (1, 2), …, (1, 6), (2, 1), …, (6, 6)
Pr{X = 3} = Pr{(1, 3), (2, 3), (3, 3), (3, 2), (3, 1)} = 5/36
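The dice example can be reproduced by enumerating the sample space and counting. This is a small Python sketch using exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two distinguishable 6-sided dice.
S = list(product(range(1, 7), repeat=2))

# X(s) = maximum of the two values; count the outcomes for each value of X.
counts = Counter(max(s) for s in S)
pmf = {x: Fraction(c, len(S)) for x, c in counts.items()}

print(pmf[3])  # 5/36
```

The same count gives Pr{X = x} = (2x - 1)/36 for x = 1, …, 6, since 2x - 1 pairs have maximum exactly x.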
We can define several random variables on the same sample space
Def. Two random variables X and Y are independent if Pr{X = x|Y = y} = Pr{X = x}
I.e. Pr{X = x and Y = y} = Pr{X = x} Pr{Y = y}
Def. The expected value of X is E[X] = Σ_x x · Pr{X = x}
e.g. Tossing a fair coin 4 times
X ≡ # of heads came up
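$$
\begin{aligned}
E[X] &= 0 \cdot \Pr\{X=0\} + 1 \cdot \Pr\{X=1\} + 2 \cdot \Pr\{X=2\} + 3 \cdot \Pr\{X=3\} + 4 \cdot \Pr\{X=4\} \\
&= 0 + \binom{4}{1}\left(\frac{1}{2}\right)^{4} + 2\binom{4}{2}\left(\frac{1}{2}\right)^{4} + 3\binom{4}{3}\left(\frac{1}{2}\right)^{4} + 4\binom{4}{4}\left(\frac{1}{2}\right)^{4} \\
&= \left(\frac{1}{2}\right)^{4}\left(\binom{4}{1} + 2\binom{4}{2} + 3\binom{4}{3} + 4\binom{4}{4}\right) \\
&= \left(\frac{1}{2}\right)^{4} \cdot 4 \cdot (1+1)^{3} \\
&= 2
\end{aligned}
$$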
E[X + Y] = E[X] + E[Y]
E[g(X)] = Σ_x g(x) · Pr{X = x}
E[aX] = aE[X]
E[aX + Y] = aE[X] + E[Y]
For any 2 independent random variables X and Y
E[XY] = E[X] · E[Y]
Def. The variance of X is
Var[X] = E[(X - E[X])^2]
= E[X^2 - 2XE[X] + E^2[X]]
= E[X^2] - 2E[XE[X]] + E^2[X]
= E[X^2] - E^2[X]
When X and Y are independent
Var[X+Y] = Var[X] + Var[Y]
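These identities can be verified exactly on a small uniform sample space. The sketch below reuses the two-dice space from the earlier example and takes X and Y to be the values of the first and second die (an assumed choice of independent variables, not from the notes); E and Var are illustrative helper names.

```python
from fractions import Fraction
from itertools import product

# Uniform sample space: two fair 6-sided dice, each outcome has probability 1/36.
S = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(S))

def E(g):
    """Expected value of the random variable g over the uniform space S."""
    return sum(g(s) * p for s in S)

def Var(g):
    """Var[g] = E[g^2] - E[g]^2."""
    return E(lambda s: g(s) ** 2) - E(g) ** 2

X = lambda s: s[0]   # value of the first die
Y = lambda s: s[1]   # value of the second die, independent of X

print(E(X), Var(X))  # 7/2 35/12
```

With these definitions E[X + Y] = E[X] + E[Y] holds for any X and Y, while E[XY] = E[X]·E[Y] and Var[X + Y] = Var[X] + Var[Y] rely on the independence of the two dice.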
Bayes’s Theorem
$$\Pr\{A \mid B\} = \frac{\Pr\{A\}\,\Pr\{B \mid A\}}{\Pr\{B\}} = \frac{\Pr\{A\}\,\Pr\{B \mid A\}}{\Pr\{A\}\,\Pr\{B \mid A\} + \Pr\{\overline{A}\}\,\Pr\{B \mid \overline{A}\}}$$
Ex. Given a fair coin and a biased coin that always comes up heads. Suppose you
chose one coin at random, tossed it twice, and this coin came up heads twice.
What is the probability that it is biased?
Let X = 1 be choosing biased coin
X = 0 be choosing fair coin
Y be # of heads came up (= 2)
Pr(X = 0) = 1/2, Pr(X = 1) = 1/2
Pr(Y = 2 | X = 1) = 1, Pr(Y = 2 | X = 0) = 1/4
$$
\begin{aligned}
\Pr(X=1 \mid Y=2) &= \frac{\Pr(X=1)\,\Pr(Y=2 \mid X=1)}{\Pr(Y=2)} \\
&= \frac{\Pr(X=1)\,\Pr(Y=2 \mid X=1)}{\Pr(X=0)\,\Pr(Y=2 \mid X=0) + \Pr(X=1)\,\Pr(Y=2 \mid X=1)} \\
&= \frac{\frac{1}{2} \cdot 1}{\frac{1}{2} \cdot \frac{1}{4} + \frac{1}{2} \cdot 1} = \frac{1/2}{5/8} = \frac{4}{5}
\end{aligned}
$$
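The two-coin calculation follows directly from the second form of Bayes's theorem. A minimal Python sketch with exact fractions; the function name posterior and its parameter names are illustrative.

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_given_not):
    """Pr{A | B} from Pr{A}, Pr{B | A} and Pr{B | not A} via Bayes's theorem."""
    numerator = prior * likelihood
    evidence = numerator + (1 - prior) * likelihood_given_not
    return numerator / evidence

# A = "the biased coin was chosen", B = "two heads in two tosses".
p_biased = posterior(
    prior=Fraction(1, 2),
    likelihood=Fraction(1),                # the biased coin always shows heads
    likelihood_given_not=Fraction(1, 4),   # fair coin: (1/2)^2
)
print(p_biased)  # 4/5
```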
Def. A Bernoulli trial is an experiment with outcome “success” of probability p, and
outcome “failure” of probability q = 1-p
Q: We have a sequence of Bernoulli trials. How many trials do we need for the
1st success?
Def. A geometric distribution g(p) is
Pr{X = k} = q^(k-1) · p, see Fig 6.1 for its graphical representation.
E[X] = 1/p
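The value E[X] = 1/p can be sanity-checked numerically by truncating the infinite sum Σ k · q^(k-1) · p. This is only a floating-point sketch, not a proof; the truncation point (2000 terms) and the choice p = 0.25 are arbitrary assumptions.

```python
def geometric_pmf(k, p):
    """Pr{X = k} = q^(k-1) * p for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

def truncated_mean(p, terms=2000):
    """Partial sum of k * Pr{X = k}; approaches E[X] = 1/p as terms grows."""
    return sum(k * geometric_pmf(k, p) for k in range(1, terms + 1))

approx = truncated_mean(0.25)   # close to 1 / 0.25 = 4
```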
We have n Bernoulli trials. How many successes occur in n trials?
Def. A binomial distribution b(k; n, p) is
$$b(k; n, p) = \binom{n}{k} p^{k} (1-p)^{n-k}$$
$$\sum_{k=0}^{n} b(k; n, p) = 1$$
see Fig. 6.2 for its graphical representation.
$$E[X] = \sum_{i=1}^{n} E[X_i] = np$$
where X_i describes the # of successes in the ith trial
$$\mathrm{Var}[X] = \sum_{i=1}^{n} \mathrm{Var}[X_i] = \sum_{i=1}^{n} pq = npq$$
Theorem
$$\Pr\{X \ge k\} \le \binom{n}{k} p^{k}$$
X ≡ # of successes in n Bernoulli trials
proof
see P.121
Corollary
Pr{X ≤ k} ≡ the probability of at most k successes
≡ the probability of at least n-k failures
$$\le \binom{n}{n-k}(1-p)^{n-k} = \binom{n}{k}(1-p)^{n-k}$$
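The binomial facts above (total probability 1, mean np, variance npq, and the two tail bounds) can be checked exactly for a particular n and p. The sketch below uses n = 10 and p = 1/3 as arbitrary illustrative values; the function names mirror the notation b(k; n, p) but are otherwise hypothetical.

```python
from fractions import Fraction
from math import comb

def b(k, n, p):
    """Binomial probability b(k; n, p) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def tail_at_least(k, n, p):
    """Pr{X >= k} = sum of b(i; n, p) for i = k..n."""
    return sum(b(i, n, p) for i in range(k, n + 1))

n, p = 10, Fraction(1, 3)
q = 1 - p

pmf = [b(k, n, p) for k in range(n + 1)]
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k * k * pk for k, pk in enumerate(pmf)) - mean**2
```

A check for one (n, p) pair is of course not a proof; it only confirms that the reconstructed formulas are mutually consistent.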
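Probability analysis:
1. The birthday paradox : How many people must be in a room before there is a
good chance (> 1/2) that two of them were born on the same day?
Suppose there are k people, with birthdays being b1, b2, …, bk, and there are
n (= 365) days in a year
Pr{bi = bj} = 1/n for i ≠ j
Ai ≡ the event that person (i+1)’s birthday is different from person j’s for all j ≤ i
Bk ≡ A1 ∩ A2 ∩ … ∩ Ak-1 ≡ k people have different birthdays
$$
\begin{aligned}
\Pr\{B_k\} &= \Pr\{B_{k-1}\} \cdot \Pr\{A_{k-1} \mid B_{k-1}\} \\
&= \Pr\{B_1\} \cdot \Pr\{A_1 \mid B_1\} \cdot \Pr\{A_2 \mid B_2\} \cdots \Pr\{A_{k-1} \mid B_{k-1}\} \\
&= 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-k+1}{n} \\
&= 1 \cdot \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right)
\end{aligned}
$$
Since 1 + x ≤ e^x, we have 1 - 1/n ≤ e^(-1/n), and so
$$\Pr\{B_k\} \le e^{-\left(\frac{1}{n} + \frac{2}{n} + \frac{3}{n} + \cdots + \frac{k-1}{n}\right)} = e^{-\frac{k(k-1)}{2n}}$$
To let Pr{Bk} ≤ 1/2, it suffices that
$$-\frac{k(k-1)}{2n} \le \ln\left(\frac{1}{2}\right)$$
k(k - 1) ≥ 2n · ln 2, for n = 365, k ≥ 23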
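The threshold k = 23 can be confirmed by evaluating the exact product for Pr{Bk} instead of the exponential bound. A minimal Python sketch in floating point; the function names are illustrative.

```python
from math import exp

def pr_all_distinct(k, n=365):
    """Pr{B_k}: probability that k people all have different birthdays."""
    prob = 1.0
    for i in range(1, k):
        prob *= 1 - i / n
    return prob

def smallest_k(n=365):
    """Smallest k with Pr{B_k} <= 1/2, using the exact product."""
    k = 1
    while pr_all_distinct(k, n) > 0.5:
        k += 1
    return k

print(smallest_k())  # 23
```

For n = 365 the bound e^(-k(k-1)/2n) first drops to 1/2 or below at the same k, so the approximation loses nothing here.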
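2. Toss identical balls into b bins
Suppose you toss n balls, how many balls fall in a given bin? n · (1/b) = n/b on average
How many balls must one toss until every bin contains at least one ball?
[Figure: a sequence of tosses, ○ = hit (the ball lands in an empty bin), × = miss; consecutive hits are labelled …, i, i+1, …]
X_i ≡ # of balls you toss after hit i, up to and including hit i+1
$$E[X_0] = 1, \quad E[X_1] = \frac{b}{b-1}, \quad E[X_i] = \frac{b}{b-i}, \quad E[X_{b-1}] = b$$
$$E[X] = \sum_{i=0}^{b-1} E[X_i] = b\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{b}\right) = b(\ln b + O(1))$$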
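The sum Σ b/(b - i) is easy to evaluate exactly for small b, which gives a feel for how close it is to b · ln b. A short Python sketch with an illustrative function name:

```python
from fractions import Fraction
from math import log

def expected_tosses(b):
    """E[X] = sum_{i=0}^{b-1} b / (b - i) = b * (1 + 1/2 + ... + 1/b)."""
    return sum(Fraction(b, b - i) for i in range(b))

e10 = expected_tosses(10)
print(e10, float(e10), 10 * log(10))
```

For b = 10 the exact value is 7381/252, roughly 29.3 tosses, against b · ln b of roughly 23.0; the gap is the O(1) term multiplied by b.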
Exercise
6-2 on page 133
Z-transform
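$$F(Z) = \sum_{n=0}^{\infty} f_n Z^{n}, \quad \text{where } f_n = \Pr[X = n]$$
ex. f_n = 1/n!
$$F(Z) = \sum_{n=0}^{\infty} \frac{Z^{n}}{n!} = e^{Z}$$
$$F(Z) = E[Z^{X}]$$
$$\left.\frac{dF(Z)}{dZ}\right|_{Z=1} = \left.\sum_{n=0}^{\infty} n Z^{n-1} f_n\right|_{Z=1} = \overline{X}$$
$$
\begin{aligned}
\left.\frac{d^{2}F(Z)}{dZ^{2}}\right|_{Z=1} &= \left.\sum_{n=0}^{\infty} n(n-1) Z^{n-2} f_n\right|_{Z=1} \\
&= \left.\sum_{n=0}^{\infty} (n^{2} - n) Z^{n-2} f_n\right|_{Z=1} \\
&= E[X^{2} - X] \\
&= \overline{X^{2}} - \overline{X}
\end{aligned}
$$
$$\sigma_X^{2} = F''(1) + F'(1) - \{F'(1)\}^{2}$$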
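Let X = X_1 + X_2 + … + X_n where the X_i are independent
$$
\begin{aligned}
F_X(Z) &= E[Z^{X_1 + X_2 + \cdots + X_n}] \\
&= E(Z^{X_1}) \cdot E(Z^{X_2}) \cdots E(Z^{X_n}) \\
&= F_{X_1}(Z) \cdot F_{X_2}(Z) \cdots F_{X_n}(Z)
\end{aligned}
$$
e.g. Poisson distribution, where X_i has parameter λ_i
$$F_{X_1}(Z) = \sum_{k=0}^{\infty} \frac{\lambda_1^{k}}{k!} e^{-\lambda_1} Z^{k} = e^{-\lambda_1} \sum_{k=0}^{\infty} \frac{(\lambda_1 Z)^{k}}{k!} = e^{\lambda_1 (Z-1)}$$
$$F_X(Z) = e^{(\lambda_1 + \lambda_2 + \cdots + \lambda_n)(Z-1)}$$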
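For a distribution with finitely many values, F(Z) is a polynomial, so the product rule for independent sums becomes polynomial multiplication. The sketch below is an illustration under that assumption: it stores F as its coefficient list [f_0, f_1, …], uses a fair die as a made-up example (not from the notes), and reads the mean and variance off F'(1) and F''(1). All function names are hypothetical.

```python
from fractions import Fraction

def poly_mul(f, g):
    """Coefficients of F(Z) * G(Z), where f[n] and g[n] multiply Z^n."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, c in enumerate(g):
            out[i + j] += a * c
    return out

def mean(f):
    """F'(1) = sum of n * f_n."""
    return sum(n * fn for n, fn in enumerate(f))

def variance(f):
    """F''(1) + F'(1) - F'(1)^2, with F''(1) = sum of n(n-1) f_n."""
    second = sum(n * (n - 1) * fn for n, fn in enumerate(f))
    return second + mean(f) - mean(f) ** 2

# One fair die: f_n = 1/6 for n = 1..6 (and f_0 = 0).
die = [Fraction(0)] + [Fraction(1, 6)] * 6

# Transform of the sum of two independent dice: F_X1(Z) * F_X2(Z).
two_dice = poly_mul(die, die)
print(two_dice[7], mean(two_dice), variance(two_dice))  # 1/6 7 35/6
```

The coefficient of Z^7 in the product is Pr{X1 + X2 = 7}, and the variance of the sum is twice the variance of one die, as expected for independent variables.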