Complex Systems, “GA and Walsh Functions, Part I: A Gentle Introduction”
› David E. Goldberg, 1989, pages 129-152.
› Dept of Engineering Mechanics, Univ of Alabama, USA.
Chapter 3: A Primer on Bethke’s Walsh Schema Transform
› William A. Green
› CSD, Univ of New-Orleans, USA.

Part I: An algebraic introduction ←
Part II: Overview of the Walsh Transform
Part III: Walsh Analysis of Fitness
Part IV: Walsh Coefficients
Part V: Summation

Part I covers:
› Vector spaces
› The vector space of functions
› Basis of a vector space
› The inner product of a vector space
› Orthonormal basis

Vector spaces
Given a field $F$, a vector space $V$ is a set of elements called vectors, with two operations:
› Vector addition $V \times V \to V$: for every $v, w \in V$, $v + w \in V$.
› Scalar multiplication $F \times V \to V$: for every $v \in V$ and $\alpha \in F$, $\alpha v \in V$.
The following conditions are met:
› There is a zero vector $0 \in V$ such that $0 + v = v$ for all $v \in V$.
› Vector addition is commutative and associative.
› For all $v \in V$ there is $-v \in V$ such that $v + (-v) = 0$.
› For all $v, w \in V$ and $\alpha, \beta \in F$:
  • $(\alpha\beta)v = \alpha(\beta v)$
  • $(\alpha + \beta)v = \alpha v + \beta v$
  • $\alpha(v + w) = \alpha v + \alpha w$
  • $1v = v$

Basis of a vector space
$V$ is called a finitely generated vector space if there exists a finite set $S \subseteq V$ such that
$$V = \operatorname{sp} S = \{ v \in V \mid v \text{ is a linear combination of elements of } S \}.$$
If $V$ over $F$ is a finitely generated vector space and $B = \{b_1, b_2, b_3, \dots, b_n\} \subseteq V$ is an ordered set, then $B$ is a basis of $V$ $\iff$
1. $B$ spans $V$: for every $v \in V$, $v = \lambda_1 b_1 + \lambda_2 b_2 + \lambda_3 b_3 + \dots + \lambda_n b_n$ with $\lambda_i \in F$.
2. $B$ is linearly independent: $\lambda_1 b_1 + \lambda_2 b_2 + \lambda_3 b_3 + \dots + \lambda_n b_n = 0_V \implies \lambda_1 = \lambda_2 = \lambda_3 = \dots = \lambda_n = 0_F$.
› $|B| = n$ is called the dimension of $V$, and every basis of $V$ has exactly $n$ vectors.
› Unique representation: every $v \in V$ has a unique representation as a linear combination of the vectors in $B$.

The vector space of functions
The field is $R$, the field of reals. $X$ is the set of bit strings of length $l$; for $l = 3$, $X = \{000, 001, 010, \dots, 111\}$.
$F(X)$ is the set of all functions $f: X \to R$, with the following point-wise operations:
› Vector addition: for all $f, g \in F(X)$ and all $x \in X$, $(f + g)(x) = f(x) + g(x)$.
› Scalar multiplication: for all $f \in F(X)$, $\alpha \in R$ and all $x \in X$, $(\alpha f)(x) = \alpha f(x)$.

We shall look at the following set of functions:
› $B = \{ \delta_x \mid x \in X \}$, where $\delta_x(y) = 1$ if $x = y$ and $0$ otherwise.
$B$ is a basis for $F(X)$:
› $F(X) = \operatorname{sp} B$. Proof: if $f \in F(X)$, then $f(y) = \sum_{x \in X} \lambda_x \delta_x(y)$ with $\lambda_x = f(x)$.
› $B$ is linearly independent. Proof: if $\sum_{x \in X} \lambda_x \delta_x(y) = 0$ for every $y \in X$, then for each $w \in X$, $0 = \sum_{x \in X} \lambda_x \delta_x(w) = \lambda_w \delta_w(w) = \lambda_w$.
Therefore $\dim F(X) = |X| = 2^l$.

Example: $\mathit{fitness}(x) = (x_1 \lor x_2) \land (x_2 \lor x_3)$

| $j$ | $x = x_3 x_2 x_1$ | $\mathit{fitness}(x)$ |
|---|---|---|
| 0 | 000 | 1 |
| 1 | 001 | 2 |
| 2 | 010 | 2 |
| 3 | 011 | 2 |
| 4 | 100 | 0 |
| 5 | 101 | 1 |
| 6 | 110 | 1 |
| 7 | 111 | 2 |

$$\mathit{fitness}(x) = \delta_0(x) + 2\delta_1(x) + 2\delta_2(x) + 2\delta_3(x) + \delta_5(x) + \delta_6(x) + 2\delta_7(x)$$
The GA looks at $X$, a set of $l$-bit strings, and evaluates each string by a fitness function $f \in F(X)$, $f: X \to R$. $f$ is a linear combination of the basis of $F(X)$:
$$f(j) = \sum_{i=0}^{2^l - 1} f_i\, \delta_i(j), \qquad f_i = f(i) \in R$$
($i$ and $j$ are treated either as integers or as strings), and $\dim F(X) = 2^l$.

The inner product of a vector space
The inner product $\langle \cdot, \cdot \rangle : V \times V \to R$ is a function of two arguments over the vector space into the field. It must satisfy the following properties:
› Linearity: $\langle \alpha v + \beta w, u \rangle = \alpha \langle v, u \rangle + \beta \langle w, u \rangle$
› Symmetry: $\langle v, u \rangle = \langle u, v \rangle$
› Positive definiteness: $\langle u, u \rangle \ge 0$, and $\langle u, u \rangle = 0 \iff u = 0$
On $F(X)$ we use
$$\langle f, g \rangle = \frac{1}{|X|} \sum_{x \in X} f(x) g(x) = \frac{1}{2^l} \sum_{x \in X} f(x) g(x).$$

Orthonormal basis
A set $S = \{f_1, \dots, f_k\}$ is said to be orthonormal if
$$\langle f_i, f_j \rangle = \begin{cases} 0 & i \ne j \\ 1 & i = j \end{cases}$$
Since $\dim F(X) = 2^l$, any orthonormal set of $2^l$ vectors in $F(X)$ is a basis of $F(X)$.
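The constructions of Part I can be tried out on the running $l = 3$ example. The following is a minimal sketch, not part of the original slides: the names `delta`, `inner` and `expand` are hypothetical helpers, and the fitness values are copied from the example table rather than computed from the clause formula.

```python
from itertools import product

L = 3  # string length l, as in the running example
X = ["".join(bits) for bits in product("01", repeat=L)]  # 000, 001, ..., 111

# Fitness values copied from the example table (x written as x3 x2 x1).
fitness = dict(zip(X, [1, 2, 2, 2, 0, 1, 1, 2]))

def delta(x):
    """Indicator function delta_x: 1 at y == x, 0 elsewhere."""
    return lambda y: 1 if y == x else 0

def inner(f, g):
    """Inner product <f, g> = (1 / 2^l) * sum over x of f(x) * g(x)."""
    return sum(f(x) * g(x) for x in X) / len(X)

def expand(y):
    """Rebuild f(y) from the delta basis: sum over x of f(x) * delta_x(y)."""
    return sum(fitness[x] * delta(x)(y) for x in X)

# The delta expansion reproduces every table entry.
assert all(expand(y) == fitness[y] for y in X)
# Distinct indicators are orthogonal under this inner product.
assert inner(delta("001"), delta("010")) == 0
```

Under this inner product the indicators are orthogonal, but each has $\langle \delta_x, \delta_x \rangle = 1/2^l$ rather than 1, so they form a basis that is not orthonormal.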
Part II: Overview of the Walsh Transform

$\mathit{fitness}(x) = (x_1 \lor x_2) \land (x_2 \lor x_3)$, with $x = x_3 x_2 x_1$ and values $1, 2, 2, 2, 0, 1, 1, 2$ for $x = 000, 001, \dots, 111$.
The GA receives an $l$-bit string and decides on its fitness with a fitness function.
› $\mathit{fitness}(x): X \to R$
› The GA is looking for maximum fitness.
In max-sat problems, the fitness would be the number of satisfied clauses.
› $\mathit{fitness\_of\_max\_sat}(x): X \to R$

We shall define a special set of functions:
› $W = \{\psi_0, \dots, \psi_{2^l - 1}\}$
› $\psi_j: X \to R$
$W$ is a set of $2^l$ orthonormal vectors. The members of $W$ are called Walsh functions.

The function $\psi_j(x)$ masks $x$ with the bit representation of $j$ and returns $-1$ or $+1$, depending on the number of 1s in the bit-wise AND $j \land x$:
$$\psi_j(x) = \begin{cases} -1 & \text{the number of 1s in } j \land x \text{ is odd} \\ 1 & \text{the number of 1s in } j \land x \text{ is even} \end{cases}$$
› $x = 111$, $j = 101$: $j \land x = 101$ has two 1s, which is even, so $\psi_5(111) = 1$.
› $x = 110$, $j = 101$: $j \land x = 100$ has one 1, which is odd, so $\psi_5(110) = -1$.

We shall define a linear function of each bit:
$$y_i(x_i) = 1 - 2x_i = \begin{cases} 1 & x_i = 0 \\ -1 & x_i = 1 \end{cases}$$
› $y_i$ is linear (affine) in $x_i$, and therefore one-to-one from $\{0, 1\}$ onto $\{1, -1\}$.
› The inverse function is linear as well, one-to-one and onto: $x_i = y_i^{-1}(y_i) = \frac{1}{2}(1 - y_i)$.

With these,
$$\psi_j(x) = \prod_{i=1}^{l} \big(y_i(x_i)\big)^{j_i},$$
where $j_i$ is the $i$-th bit of the number $j$.
Examples of Walsh functions for $l = 3$:
› $\psi_{1 = 001}(x) = (y_1(x_1))^{j_1 = 1} (y_2(x_2))^{j_2 = 0} (y_3(x_3))^{j_3 = 0} = y_1(x_1)$
› $\psi_{5 = 101}(x) = (y_1(x_1))^{j_1 = 1} (y_2(x_2))^{j_2 = 0} (y_3(x_3))^{j_3 = 1} = y_1(x_1)\, y_3(x_3)$

| $j$ | $j_3 j_2 j_1$ | $\psi_j(x)$ | short form |
|---|---|---|---|
| 0 | 000 | $1$ | $1$ |
| 1 | 001 | $y_1(x_1)$ | $y_1$ |
| 2 | 010 | $y_2(x_2)$ | $y_2$ |
| 3 | 011 | $y_1(x_1) y_2(x_2)$ | $y_1 y_2$ |
| 4 | 100 | $y_3(x_3)$ | $y_3$ |
| 5 | 101 | $y_1(x_1) y_3(x_3)$ | $y_1 y_3$ |
| 6 | 110 | $y_2(x_2) y_3(x_3)$ | $y_2 y_3$ |
| 7 | 111 | $y_1(x_1) y_2(x_2) y_3(x_3)$ | $y_1 y_2 y_3$ |

$F(X)$ is a vector space over $R$, and in this vector space there is an inner product:
$$\langle f, g \rangle = \frac{1}{|X|} \sum_{x \in X} f(x) g(x) = \frac{1}{2^l} \sum_{x \in X} f(x) g(x).$$
We want to show that $W = \{\psi_0, \dots, \psi_{2^l - 1}\} \subseteq F(X)$ is orthonormal:
$$\langle \psi_i, \psi_j \rangle = \begin{cases} 0 & i \ne j \\ 1 & i = j \end{cases}$$

Easy part: $\langle \psi_j, \psi_j \rangle = 1$. Since $\psi_j(x) = \pm 1$,
$$\langle \psi_j, \psi_j \rangle = \frac{1}{2^l} \sum_{x \in X} \psi_j(x) \psi_j(x) = \frac{1}{2^l} \sum_{x \in X} \psi_j(x)^2 = \frac{1}{2^l} \sum_{x \in X} 1 = 1.$$

If $j \ne k$, they differ at some bit position $i$. Meaning:
› $\psi_j(x)$ masks position $i$, and
› $\psi_k(x)$ does not (or the other way around).
Define a function
› $c_i: X \to X$
› that returns the same bit string $x$ with position $i$ flipped.
› $c_i$ is one-to-one and onto,
› and therefore invertible: $x = c_i(c_i(x))$.
Because exactly one of $\psi_j$, $\psi_k$ depends on bit $i$, flipping that bit changes the sign of exactly one factor, so $\psi_j(c_i(x)) \psi_k(c_i(x)) = -\psi_j(x) \psi_k(x)$. Hence
$$\langle \psi_j, \psi_k \rangle = \frac{1}{2^l} \sum_{x \in X} \psi_j(x) \psi_k(x) = \frac{1}{2^l} \sum_{\substack{x \in X \\ \text{bit } i \text{ of } x = 0}} \psi_j(x) \psi_k(x) + \frac{1}{2^l} \sum_{\substack{x \in X \\ \text{bit } i \text{ of } x = 1}} \psi_j(x) \psi_k(x)$$
$$= \frac{1}{2^l} \sum_{\substack{x \in X \\ \text{bit } i \text{ of } x = 0}} \Big( \psi_j(x) \psi_k(x) + \psi_j(c_i(x)) \psi_k(c_i(x)) \Big) = 0.$$

$W$ is a set of $2^l$ orthonormal vectors in $F(X)$ $\Rightarrow$ $W$ is a basis of $F(X)$.
› Proof: $|W| = 2^l = \dim F(X)$, so it is enough to show linear independence.
› Linear independence: suppose $\lambda_0 \psi_0 + \dots + \lambda_{2^l - 1} \psi_{2^l - 1} = 0$. Then for every $j$,
$$0 = \langle 0, \psi_j \rangle = \langle \lambda_0 \psi_0 + \dots + \lambda_{2^l - 1} \psi_{2^l - 1}, \psi_j \rangle = \lambda_0 \langle \psi_0, \psi_j \rangle + \dots + \lambda_j \langle \psi_j, \psi_j \rangle + \dots + \lambda_{2^l - 1} \langle \psi_{2^l - 1}, \psi_j \rangle = \lambda_j.$$

For the example, $\mathit{fitness}(x) = (x_1 \lor x_2) \land (x_2 \lor x_3) = w_0 \psi_0(x) + \dots + w_{2^3 - 1} \psi_{2^3 - 1}(x) = \sum_{j=0}^{2^l - 1} w_j \psi_j(x)$.
Therefore, for every $f \in F(X)$:
› there are unique $w_0, \dots, w_{2^l - 1} \in R$ such that $w_0 \psi_0 + \dots + w_{2^l - 1} \psi_{2^l - 1} = f$;
› $w_j = \langle f, \psi_j \rangle$, because $\langle f, \psi_j \rangle = \langle w_0 \psi_0 + \dots + w_{2^l - 1} \psi_{2^l - 1}, \psi_j \rangle = w_0 \langle \psi_0, \psi_j \rangle + \dots + w_j \langle \psi_j, \psi_j \rangle + \dots + w_{2^l - 1} \langle \psi_{2^l - 1}, \psi_j \rangle = w_j$.
Walsh polynomial:
$$f(x) = \sum_{j=0}^{2^l - 1} w_j \psi_j(x)$$

Before we go on, a related term: the Hadamard matrix.
• It is a square matrix.
• Its entries are either $+1$ or $-1$.
• Its rows are mutually orthogonal.
• Geometric interpretation: each pair of rows represents two perpendicular vectors.
• Combinatorial interpretation: each pair of rows has matching entries in exactly half of the columns and mismatched entries in the remaining columns.

Part III: Walsh Analysis of Fitness

Schema: a schema is a similarity subset, that is, a set of strings with a well-defined similarity.
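The definition of a schema as a set of strings can be made concrete by enumerating the strings that match a pattern. This is only an illustrative sketch: the function names `schema_members` and `order` are hypothetical, and a schema is assumed to be written as a string over `*`, `0`, `1`, with `*` marking a free position.

```python
from itertools import product

def schema_members(h):
    """Return, in lexicographic order, all bit strings that match schema h.

    h is a string over '*', '0', '1'; '*' marks a free (unfixed) position.
    """
    choices = ["01" if c == "*" else c for c in h]
    return ["".join(bits) for bits in product(*choices)]

def order(h):
    """Order of a schema: the number of fixed (non-'*') positions."""
    return sum(c != "*" for c in h)

print(schema_members("*1*"))  # ['010', '011', '110', '111']
print(order("*1*"))           # 1
```

A schema with $k$ free positions therefore has $2^k$ members, and a fully fixed schema contains exactly one string.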
Example: the schema *1* = {010, 011, 110, 111}.
For a schema $h$, its average fitness is
$$f(h) = \frac{1}{|h|} \sum_{x \in h} f(x).$$

| $j$ | $x = x_3 x_2 x_1$ | $\mathit{fitness}(x)$ |
|---|---|---|
| 0 | 000 | 1 |
| 1 | 001 | 2 |
| 2 | 010 | 2 |
| 3 | 011 | 2 |
| 4 | 100 | 0 |
| 5 | 101 | 1 |
| 6 | 110 | 1 |
| 7 | 111 | 2 |

$$f(*1*) = \tfrac{1}{4}(2 + 2 + 1 + 2) = 1.75$$

Each $j$ defines a partition, written with f at the positions that $\psi_j$ masks and * elsewhere. A schema $h$ belongs to partition $j$ when $h$ is fixed at every f position of the partition.

| $j$ | partition $j$ | Set of schemata |
|---|---|---|
| 0 | *** | all schemata |
| 1 | **f | **1, **0, *01, *11, *10, *00, 0*0, 0*1, 1*0, 1*1, and all fixed schemata |
| 2 | *f* | *1*, *0*, *01, *11, *10, *00, 00*, 01*, 10*, 11*, and all fixed schemata |
| 3 | *ff | *00, *01, *10, *11, and all fixed schemata |
| 4 | f** | 1**, 0**, 0*1, 1*1, 1*0, 0*0, 00*, 01*, 10*, 11*, and all fixed schemata |
| 5 | f*f | 0*0, 0*1, 1*0, 1*1, and all fixed schemata |
| 6 | ff* | 00*, 01*, 10*, 11*, and all fixed schemata |
| 7 | fff | all the fixed schemata |

For a schema $h$, its average fitness can be written with the Walsh polynomial:
$$f(h) = \frac{1}{|h|} \sum_{x \in h} f(x) = \frac{1}{|h|} \sum_{x \in h} \sum_{j=0}^{2^l - 1} w_j \psi_j(x) = \frac{1}{|h|} \sum_{j=0}^{2^l - 1} w_j \sum_{x \in h} \psi_j(x)$$
Explanation 1:
$$\sum_{x \in h} \psi_j(x) = \begin{cases} \pm |h| & \text{if } h \in \text{partition } j \\ 0 & \text{if } h \notin \text{partition } j \end{cases}$$
For example, for the schema *1*:

| $x \in$ *1* | $\psi_2(x) = y_2$ (partition 2 = *f*) | $\psi_3(x) = y_2 y_1$ (partition 3 = *ff) |
|---|---|---|
| 010 | $-1$ | $-1$ |
| 011 | $-1$ | $1$ |
| 110 | $-1$ | $-1$ |
| 111 | $-1$ | $1$ |

The $\psi_2$ column sums to $-|h| = -4$, while the $\psi_3$ column sums to $0$. Substituting,
$$f(h) = \frac{1}{|h|} \sum_{j=0}^{2^l - 1} w_j \begin{cases} \pm |h| & h \in \text{partition } j \\ 0 & h \notin \text{partition } j \end{cases} = \sum_{j=0}^{2^l - 1} \begin{cases} \pm w_j & h \in \text{partition } j \\ 0 & h \notin \text{partition } j \end{cases} = \sum_{j=0}^{2^l - 1} w_j\, \mathit{sign}(h, j)$$
Explanation 2:
$$\mathit{sign}(h, j) = \begin{cases} 0 & h \notin \text{partition } j \\ 1 & \text{the number of 1s in } h \land j \text{ is even} \\ -1 & \text{the number of 1s in } h \land j \text{ is odd} \end{cases}$$

The case $h \notin$ partition $j$. For $j = 1$, partition 1 is **f; schemata that are not fixed as partition 1 requires include *1*, *0*, 00*, 01*, 10*.
If $h \notin$ partition $j$, there exists a position $i$ that is fixed in partition $j$ (an f) but is not fixed in $h$. Because $\psi_j$ masks position $i$, $\psi_j(c_i(x)) = -\psi_j(x)$, so
$$\sum_{x \in h} \psi_j(x) = \sum_{\substack{x \in h \\ \text{bit } i \text{ of } x = 0}} \psi_j(x) + \sum_{\substack{x \in h \\ \text{bit } i \text{ of } x = 1}} \psi_j(x) = \sum_{\substack{x \in h \\ \text{bit } i \text{ of } x = 0}} \psi_j(x) + \sum_{\substack{x \in h \\ \text{bit } i \text{ of } x = 0}} \psi_j(c_i(x)) = 0.$$

The case $h \in$ partition $j$:
› All the f positions in partition $j$ are fixed in $h$,
› and they match the $y_i$ appearing in $\psi_j(x) = \prod_{i=1}^{l} (y_i(x_i))^{j_i}$, so $\psi_j$ is constant on $h$:
$$\sum_{x \in h} \psi_j(x) = \sum_{x \in h} \prod_{i=1}^{l} \big(y_i(x_i)\big)^{j_i} = \begin{cases} -|h| & h \land j \text{ has an odd number of fixed 1s} \\ |h| & h \land j \text{ has an even number of fixed 1s} \end{cases}$$

In summary, $\mathit{sign}(h, j) = \frac{1}{|h|} \sum_{x \in h} \psi_j(x)$:
› $h \notin$ partition $j$ $\Rightarrow$ $\mathit{sign}(h, j) = 0$
› $h \in$ partition $j$ $\Rightarrow$ $\mathit{sign}(h, j) = -1$ for an odd number of 1s in $h \land j$, and $1$ for an even number.

Example: $l = 3$, partition 3 = *ff, $\psi_3(x) = y_1(x_1)\, y_2(x_2)$.
› Schema $h$ = *11 $\in$ *ff: $\mathit{sign}(h, 3) = \frac{1}{2}\big(y_1(1) y_2(1) + y_1(1) y_2(1)\big) = 1$
› Schema $h$ = *01 $\in$ *ff: $\mathit{sign}(h, 3) = \frac{1}{2}\big(y_1(1) y_2(0) + y_1(1) y_2(0)\big) = -1$

$$f(h) = \sum_{j=0}^{2^l - 1} w_j\, \mathit{sign}(h, j) = \sum_{j \,:\, h \in \text{partition } j} w_j\, \mathit{sign}(h, j)$$

Schema 11* $\in$ partitions 0, 2, 4, 6,
› because 11* is fixed at every f position of
› ***, *f*, f**, ff*.
$$f(11*) = w_0\, \mathit{sign}(11*, {*}{*}{*}) + w_2\, \mathit{sign}(11*, {*}f{*}) + w_4\, \mathit{sign}(11*, f{*}{*}) + w_6\, \mathit{sign}(11*, ff{*}) = w_0 - w_2 - w_4 + w_6$$

| Schema | Fitness average as a sum of Walsh coefficients |
|---|---|
| *** | $w_0$ |
| **0 | $w_0 + w_1$ |
| **1 | $w_0 - w_1$ |
| 1** | $w_0 - w_4$ |
| 11* | $w_0 - w_2 - w_4 + w_6$ |
| 101 | $w_0 - w_1 + w_2 - w_3 - w_4 + w_5 - w_6 + w_7$ |

Part IV: Walsh Coefficients

In the table of schema averages above, low-order schemata have few terms and high-order schemata have many terms.

$\Delta f$ is defined recursively:
› $\Delta f$ of *** is $w_0$.
› $\Delta f$ of $h$ is the difference between the fitness of schema $h$ and the estimate built from its lower-order $\Delta f$ terms.
Example:
› $f(***) = w_0$
› $f(**1) = w_0 - w_1$
› $f(*0*) = w_0 + w_2$
› $f(*01) = w_0 - w_1 + w_2 - w_3$
› For the schema *01, the first-order estimate is $w_0 - w_1 + w_2$ and the second-order correction is $-w_3$, so $\Delta f$ of *01 is $-w_3$.
Interpretation: a Walsh coefficient represents the difference between a higher-order approximation of the schema fitness and the next lower-order approximation.

An alternative way to get the Walsh coefficients $w_j$.
› Directly: $w_j = \langle f, \psi_j \rangle$, because $\langle w_0 \psi_0 + \dots + w_{2^l - 1} \psi_{2^l - 1}, \psi_j \rangle = w_0 \langle \psi_0, \psi_j \rangle + \dots + w_j \langle \psi_j, \psi_j \rangle + \dots + w_{2^l - 1} \langle \psi_{2^l - 1}, \psi_j \rangle = w_j$.
› Alternatively, start from the schema average $f(h) = \frac{1}{|h|} \sum_{x \in h} f(x)$. For every fully fixed schema $h = \{x\}$ we have $|h| = 1$, so $f(h) = f(x)$, which gives one linear equation in the $w_j$ per string:

| Schema | Fitness average as a sum of Walsh coefficients |
|---|---|
| 000 | $w_0 + w_1 + w_2 + w_3 + w_4 + w_5 + w_6 + w_7$ |
| 001 | $w_0 - w_1 + w_2 - w_3 + w_4 - w_5 + w_6 - w_7$ |
| 101 | $w_0 - w_1 + w_2 - w_3 - w_4 + w_5 - w_6 + w_7$ |

Solving the $2^l$ equations of this kind, one for each $x \in \{000, 001, \dots, 111\}$, yields the $w_j$.

Example: $f(x_3 x_2 x_1) = 10 + 5x_1 - 10x_2 + 0.1x_3$, $x_i \in \{0, 1\}$.
A linear bit function:
› It attains its maximum at 101; the constant 10 keeps its values from going negative.
› Each bit contributes to the maximum on its own.
› Two bits together do not contribute differently than each one by itself.
› Therefore $w_j = 0$ for every $j$ with two or more bits set.

| $j$ | $x$ | $f(x)$ | $w_j$ |
|---|---|---|---|
| 0 | 000 | 10 | 7.55 |
| 1 | 001 | 15 | −2.5 |
| 2 | 010 | 0 | 5 |
| 3 | 011 | 5 | 0 |
| 4 | 100 | 10.1 | −0.05 |
| 5 | 101 | 15.1 | 0 |
| 6 | 110 | 0.1 | 0 |
| 7 | 111 | 5.1 | 0 |

Part V: Summation

Walsh functions are a basis of the vector space $F(X)$ of functions $f: X \to R$.
› Walsh polynomial: $f(x) = \sum_{j=0}^{2^l - 1} w_j \psi_j(x)$
› Fitness average of a schema: $f(h) = \frac{1}{|h|} \sum_{x \in h} f(x) = \sum_{j \,:\, h \in \text{partition } j} w_j\, \mathit{sign}(h, j)$
› Fast Walsh transform

| Schema | Fitness average as a sum of Walsh coefficients |
|---|---|
| *** | $w_0$ |
| 1** | $w_0 - w_4$ |
| 11* | $w_0 - w_2 - w_4 + w_6$ |
| 101 | $w_0 - w_1 + w_2 - w_3 - w_4 + w_5 - w_6 + w_7$ |

Example: $f(x) = x^2$. The maximum, at 111, is 49.

| $j$ | $x$ | $f(x)$ | $w_j$ |
|---|---|---|---|
| 0 | 000 | 0 | 17.5 |
| 1 | 001 | 1 | −3.5 |
| 2 | 010 | 4 | −7 |
| 3 | 011 | 9 | 1 |
| 4 | 100 | 16 | −14 |
| 5 | 101 | 25 | 2 |
| 6 | 110 | 36 | 4 |
| 7 | 111 | 49 | 0 |

First-order estimate, using only the coefficients of order at most one ($j$ = 000, 001, 010, 100):
$$f^{(1)}(111) = w_0 - w_1 - w_2 - w_4 = 17.5 + 3.5 + 7 + 14 = 42$$
Second-order estimate:
$$f^{(2)}(111) = w_0 - w_1 - w_2 + w_3 - w_4 + w_5 + w_6 = 17.5 + 3.5 + 7 + 1 + 14 + 2 + 4 = 49$$
$\Delta f^{(2)} = 7$ is positive, so it only adds to the previous sum. This indicates that this type of problem will be easy for a GA to solve.

Deception
We shall explain the term by an example. Let us consider a 2-bit problem, whose schemata are shown below. An example of deception is a fitness function for which:
› the maximum is at 11,
› but $f(*0) > f(*1)$ or $f(0*) > f(1*)$.
› The deception would be stronger if both could hold, but that is impossible.

| Order of schema | Schemata |
|---|---|
| 0 | ** |
| 1 | *0, *1, 0*, 1* |
| 2 | 00, 01, 10, 11 ← maximum |

$f(*0) > f(*1) \implies w_0 + w_1 > w_0 - w_1 \implies w_1 > 0$
$f(0*) > f(1*) \implies w_0 + w_2 > w_0 - w_2 \implies w_2 > 0$
Both together are impossible, because
$f(11) > f(00) \implies w_0 - w_1 - w_2 + w_3 > w_0 + w_1 + w_2 + w_3 \implies w_1 + w_2 < 0$
We will pick the deception $f(*0) > f(*1) \implies w_1 > 0$.
The other conditions:
$f(11) > f(01) \implies w_0 - w_1 - w_2 + w_3 > w_0 - w_1 + w_2 - w_3 \implies w_2 - w_3 < 0$
$f(11) > f(10) \implies w_0 - w_1 - w_2 + w_3 > w_0 + w_1 - w_2 - w_3 \implies w_1 - w_3 < 0$

Building blocks: a building block is a component that fits with others to form a whole. The Walsh coefficients fit the intuitive understanding of building blocks, namely that higher-order schemata are built from their lower-order schemata.

Schema: a schema is a similarity subset, that is, a set of strings with a well-defined similarity.
Example: the schema *1* = {010, 011, 110, 111}.
We shall denote a schema as
› $h = h_l \dots h_1$, $h_i \in \{*, 0, 1\}$
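The schema-average formula $f(h) = \sum_j w_j\, \mathit{sign}(h, j)$ can be checked numerically on the running $l = 3$ example. What follows is a small sketch under stated assumptions rather than the method of the original paper: the function names are hypothetical, strings $x_3 x_2 x_1$ are encoded as the integers 0 to 7, the fitness values are taken from the example table, and the coefficients are computed by the direct $O(4^l)$ sum instead of a fast Walsh transform.

```python
def psi(j, x):
    """Walsh function: +1 if (j AND x) has an even number of 1 bits, else -1."""
    return -1 if bin(j & x).count("1") % 2 else 1

def walsh_coefficients(f, l):
    """w_j = <f, psi_j> = (1 / 2^l) * sum_x f(x) * psi_j(x), for j = 0 .. 2^l - 1."""
    n = 2 ** l
    return [sum(f(x) * psi(j, x) for x in range(n)) / n for j in range(n)]

def sign(h, j):
    """0 if h is free at a position that j masks, else (-1)^(number of 1s in h AND j)."""
    ones = 0
    for i, c in enumerate(reversed(h)):  # i = 0 is the rightmost position, x_1
        if (j >> i) & 1:                 # psi_j masks this position
            if c == "*":
                return 0                 # h is not in partition j
            ones += c == "1"
    return -1 if ones % 2 else 1

def schema_average(h, w):
    """Average fitness of schema h from the Walsh coefficients w."""
    return sum(w[j] * sign(h, j) for j in range(len(w)))

table = [1, 2, 2, 2, 0, 1, 1, 2]  # fitness of 000 .. 111 from the example table
w = walsh_coefficients(lambda x: table[x], 3)
print(schema_average("*1*", w))   # 1.75
```

The printed value agrees with the direct average $\tfrac{1}{4}(2 + 2 + 1 + 2) = 1.75$ over the members of *1*; only $w_0$ and $w_2$ contribute, since *1* belongs to partitions 0 and 2 only.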