Computational Intelligence - Chair 11: Algorithm Engineering

Lecture 01

Computational Intelligence
Winter Term 2009/10
Prof. Dr. Günter Rudolph
Lehrstuhl für Algorithm Engineering (LS 11)
Fakultät für Informatik
TU Dortmund

Plan for Today
• Organization (Lectures / Tutorials)
• Overview CI
• Introduction to ANN
• McCulloch-Pitts Neuron (MCP)
• Minsky/Papert Perceptron (MPP)
Organizational Issues

Who are you?
either
• studying "Automation and Robotics" (Master of Science), Module "Optimization"
or
• studying "Informatik"
  - BA-Modul "Einführung in die Computational Intelligence"
  - Hauptdiplom-Wahlvorlesung (SPG 6 & 7)

Who am I?
Günter Rudolph
Fakultät für Informatik, LS 11
Guenter.Rudolph@tu-dortmund.de  ← best way to contact me
OH-14, R. 232                   ← if you want to see me
office hours: Tuesday, 10:30–11:30 am, and by appointment
Organizational Issues

Lectures:   Wednesday, 10:15–11:45, OH-14, R. E23
Tutorials:  group 1: Wednesday, 16:15–17:00, OH-14, R. 304
            group 2: Thursday,  16:15–17:00, OH-14, R. 304
Tutor:      Nicola Beume, LS 11

Prerequisites
Knowledge about
• mathematics,
• programming,
• logic
is helpful.
But what if something is unknown to me? ... don't hesitate to ask!

Information
Slides:     covered in the lecture  → see web
Literature: pointers to literature  → see web
http://ls11-www.cs.uni-dortmund.de/people/rudolph/teaching/lectures/CI/WS2009-10/lecture.jsp
Overview "Computational Intelligence"

What is CI?
⇒ umbrella term for computational methods inspired by nature
• term "computational intelligence" coined by John Bezdek (FL, USA)
• originally intended as a demarcation line
  ⇒ establish a border between artificial and computational intelligence
• nowadays: blurring border

backbone:
• artificial neural networks
• evolutionary algorithms
• fuzzy systems

new developments:
• swarm intelligence
• artificial immune systems
• growth processes in trees
• ...

our goals:
1. know what CI methods are good for!
2. know when to refrain from CI methods!
3. know why they work at all!
4. know how to apply and adjust CI methods to your problem!
Introduction to Artificial Neural Networks

Biological Prototype
● Neuron
  - information gathering    (dendrites, D)
  - information processing   (cell body, C)
  - information propagation  (axon A / synapse S)
• human being: 10^12 neurons
• electricity in mV range
• speed: 120 m/s
[Figure: biological neuron with nucleus / cell body, dendrites, axon, synapse]

Abstraction
dendrite (D)           → signal input
cell body (C), nucleus → signal processing
axon (A), synapse (S)  → signal output
Introduction to Artificial Neural Networks

1943: Warren McCulloch / Walter Pitts
● description of neurological networks
  → model: McCulloch-Pitts neuron (MCP)
● basic idea:
  - a neuron is either active or inactive
  - skills result from connecting neurons
● considered static networks
  (i.e. connections had been constructed, not learnt)

Model
x1, x2, …, xn → function f → f(x1, x2, …, xn)

McCulloch-Pitts neuron (1943):
xi ∈ {0, 1} =: B
f: B^n → B
Introduction to Artificial Neural Networks

McCulloch-Pitts Neuron
• n binary input signals x1, …, xn
• threshold θ > 0
⇒ can be realized:
  - boolean OR:  x1, …, xn feed a neuron with θ = 1
  - boolean AND: x1, …, xn feed a neuron with θ = n

• in addition: m binary inhibitory signals y1, …, ym
  - if at least one yj = 1, then output = 0
  - otherwise: if the sum of inputs ≥ threshold, then output = 1, else output = 0
⇒ NOT can be realized: input x1 wired as inhibitory signal y1, threshold θ = 0
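These gates are small enough to execute directly. Below is a minimal sketch (my illustration, not part of the slides) of an MCP neuron with excitatory and inhibitory binary inputs; the function names are made up for the example.

```python
# Minimal sketch of a McCulloch-Pitts neuron (illustrative, not from the slides).
def mcp(excitatory, threshold, inhibitory=()):
    """Binary MCP neuron: output 0 if any inhibitory signal fires,
    otherwise 1 iff the sum of excitatory inputs reaches the threshold."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

# boolean OR: theta = 1;  boolean AND: theta = n;  NOT: theta = 0 plus one inhibitory input
OR  = lambda *x: mcp(x, threshold=1)
AND = lambda *x: mcp(x, threshold=len(x))
NOT = lambda x:  mcp((), threshold=0, inhibitory=(x,))

assert OR(0, 1, 0) == 1 and AND(1, 1, 0) == 0 and NOT(0) == 1
```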
Introduction to Artificial Neural Networks

Analogons
Neurons             → simple MISO processors (with parameters: e.g. threshold)
Synapse             → connection between neurons (with parameters: synaptic weight)
Topology            → interconnection structure of the net
Propagation         → working phase of the ANN: processes input to output
Training / Learning → adaptation of the ANN to certain data
Introduction to Artificial Neural Networks

Assumption:
Inputs are also available in inverted form, i.e. ∃ inverted inputs.

[Figure: x1 and x2 feed a neuron with threshold θ; it fires iff x1 + x2 ≥ θ]

Theorem:
Every logical function F: B^n → B can be simulated with a two-layered McCulloch-Pitts net.

Example:
[Figure: three clause neurons (over x1, x2, x3 / x1, x2, x3 / x1, x4) with thresholds ≥ 3, ≥ 3, ≥ 2, all feeding an output neuron with threshold ≥ 1]
Introduction to Artificial Neural Networks

Proof (by construction):
Every boolean function F can be transformed into disjunctive normal form
⇒ 2 layers (AND - OR)
1. Every clause gets a decoding neuron with θ = n
   ⇒ output = 1 only if the clause is satisfied (AND gate)
2. All outputs of the decoding neurons are inputs of a neuron with θ = 1 (OR gate)
q.e.d.

Generalization: inputs with weights
[Figure: neuron with inputs x1, x2, x3, weights 0.2, 0.4, 0.3 and threshold 0.7]
fires 1 if  0.2 x1 + 0.4 x2 + 0.3 x3 ≥ 0.7
⇔ (multiply by 10)  2 x1 + 4 x2 + 3 x3 ≥ 7
⇒ duplicate inputs!
[Figure: unweighted neuron with threshold ≥ 7; x1 fed in twice, x2 four times, x3 three times]
⇒ equivalent!
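The AND-OR construction from the proof above is easy to run. The snippet below is a sketch of mine (not the lecture's code): it builds a two-layer MCP net from a DNF given as clauses of literals and checks it on XOR = (x1 ∧ ¬x2) ∨ (¬x1 ∧ x2), relying on the stated assumption that inverted inputs are available.

```python
# Two-layer MCP net from a DNF (sketch; assumes inverted inputs are available).
# A clause is a list of literals: +i means x_i, -i means the inverted input of x_i.
def dnf_net(clauses, x):
    # layer 1: one decoding (AND) neuron per clause, threshold = clause length
    decoded = []
    for clause in clauses:
        inputs = [x[i - 1] if i > 0 else 1 - x[-i - 1] for i in clause]
        decoded.append(1 if sum(inputs) >= len(clause) else 0)
    # layer 2: OR neuron with threshold 1
    return 1 if sum(decoded) >= 1 else 0

# XOR = (x1 AND NOT x2) OR (NOT x1 AND x2)
xor_clauses = [[1, -2], [-1, 2]]
for x1 in (0, 1):
    for x2 in (0, 1):
        assert dnf_net(xor_clauses, (x1, x2)) == (x1 ^ x2)
```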
Introduction to Artificial Neural Networks

Theorem:
Weighted and unweighted MCP nets are equivalent for weights ∈ Q+.

Proof:
"⇒": Let the weights be w_i = a_i / b_i with a_i, b_i ∈ N and the threshold be a_0 ∈ N.
Multiplication with b_1 b_2 … b_n yields an inequality with coefficients in N.
Duplicate input x_i such that we get a_i b_1 b_2 … b_{i-1} b_{i+1} … b_n inputs.
Threshold θ = a_0 b_1 … b_n.
"⇐": Set all weights to 1.
q.e.d.

Conclusion for MCP nets
+ feed-forward: able to compute any Boolean function
+ recursive: able to simulate a DFA
− very similar to conventional logical circuits
− difficult to construct
− no good learning algorithm available
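The "⇒" direction of the equivalence theorem above can be traced numerically. This is a minimal sketch (illustrative, not from the slides) that assumes the weights and threshold are given as positive rationals; `to_unweighted` is a helper name of my own.

```python
from fractions import Fraction
from math import lcm

# Sketch: turn a weighted MCP neuron with positive rational weights into an
# unweighted one by clearing denominators and duplicating inputs.
def to_unweighted(weights, theta):
    fracs = [Fraction(w) for w in weights] + [Fraction(theta)]
    scale = lcm(*(f.denominator for f in fracs))    # smallest common denominator
    copies = [int(f * scale) for f in fracs[:-1]]   # feed input x_i in that many times
    return copies, int(fracs[-1] * scale)           # new natural-valued threshold

# example from the previous slide: 0.2 x1 + 0.4 x2 + 0.3 x3 >= 0.7
copies, theta = to_unweighted(["0.2", "0.4", "0.3"], "0.7")
print(copies, theta)   # [2, 4, 3] and 7: duplicate x1 twice, x2 four times, x3 three times
```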
Introduction to Artificial Neural Networks

Perceptron (Rosenblatt 1958)
→ complex model → reduced by Minsky & Papert to what is "necessary"
→ Minsky-Papert perceptron (MPP), 1969

What can a single MPP do?
A single MPP outputs 1 iff  w1 x1 + w2 x2 ≥ θ.
Isolation of x2 yields a separating line ⇒ the line separates R^2 into 2 classes.
[Figures: separating lines in the unit square for AND (θ = 2), NAND, OR (θ = 1), NOR (θ = 0); for XOR no separating line exists (?)]

Example: XOR
x1  x2  xor
0   0   0   ⇒ 0 < θ
0   1   1   ⇒ w2 ≥ θ
1   0   1   ⇒ w1 ≥ θ
1   1   0   ⇒ w1 + w2 < θ

w1, w2 ≥ θ > 0  ⇒  w1 + w2 ≥ 2θ > θ, but also w1 + w2 < θ  ⇒  contradiction!
⇒ XOR cannot be realized by a single MPP.

1969: Marvin Minsky / Seymour Papert
● book "Perceptrons" → analysis of mathematical properties of perceptrons
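The contradiction can also be checked mechanically. The following sketch (illustrative, not part of the lecture) searches a small grid of weights and thresholds for a single MPP reproducing each gate; AND, OR, NAND and NOR are found, while XOR yields nothing on the searched grid, consistent with the argument above.

```python
from itertools import product

def mpp(w1, w2, theta, x1, x2):
    return 1 if w1 * x1 + w2 * x2 >= theta else 0

def realizable(truth_table, grid):
    """Search the grid for (w1, w2, theta) reproducing the truth table."""
    for w1, w2, theta in product(grid, repeat=3):
        if all(mpp(w1, w2, theta, x1, x2) == y for (x1, x2), y in truth_table.items()):
            return (w1, w2, theta)
    return None

grid = [i / 2 for i in range(-8, 9)]            # -4.0, -3.5, ..., 4.0
gates = {
    "AND":  {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1},
    "OR":   {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1},
    "NAND": {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0},
    "NOR":  {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0},
    "XOR":  {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0},
}
for name, table in gates.items():
    print(name, realizable(table, grid))        # XOR yields None, as the contradiction predicts
```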
Introduction to Artificial Neural Networks

● disillusioning result:
  perceptrons fail to solve a number of trivial problems!
  - XOR problem
  - parity problem
  - connectivity problem
● "conclusion": All artificial neurons have this kind of weakness!
  ⇒ research in this field is a scientific dead end!
● consequence: research funding for ANN cut down extremely (~ 15 years)

How to leave the "dead end":
1. Multilayer perceptrons:
   [Figure: two-layer net of MPPs over inputs x1, x2] ⇒ realizes XOR
2. Nonlinear separating functions:
   g(x1, x2) = 2 x1 + 2 x2 - 4 x1 x2 - 1  with θ = 0
   g(0,0) = -1,  g(0,1) = +1,  g(1,0) = +1,  g(1,1) = -1  ⇒ XOR
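Both escape routes are easy to verify. The sketch below is my own illustration; the two-layer wiring is one possible choice and not necessarily the exact one drawn on the slide, while the nonlinear function g is taken from the slide.

```python
# Sketch (illustrative): two ways out of the XOR dead end.

# 1) One possible two-layer perceptron: XOR(x1, x2) = (x1 OR x2) AND NOT (x1 AND x2).
def xor_two_layer(x1, x2):
    h1 = 1 if x1 + x2 >= 1 else 0          # OR neuron, theta = 1
    h2 = 1 if x1 + x2 >= 2 else 0          # AND neuron, theta = 2
    return 1 if h1 - h2 >= 1 else 0        # output neuron, weights (1, -1), theta = 1

# 2) Nonlinear separating function from the slide, threshold theta = 0.
def xor_nonlinear(x1, x2):
    g = 2 * x1 + 2 * x2 - 4 * x1 * x2 - 1
    return 1 if g >= 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        assert xor_two_layer(x1, x2) == xor_nonlinear(x1, x2) == (x1 ^ x2)
```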
Introduction to Artificial Neural Networks

How to obtain weights wi and threshold θ?

as yet: by construction

example: NAND gate
x1  x2  NAND
0   0   1   ⇒ 0 ≥ θ
0   1   1   ⇒ w2 ≥ θ
1   0   1   ⇒ w1 ≥ θ
1   1   0   ⇒ w1 + w2 < θ

requires solution of a system of linear inequalities (∈ P)
(e.g.: w1 = w2 = -2, θ = -3)

now: by "learning" / training

Perceptron Learning
Assumption: test examples with correct I/O behavior are available
Principle:
(1) choose initial weights in an arbitrary manner
(2) feed in a test pattern
(3) if the output of the perceptron is wrong, then change the weights
(4) goto (2) until the output is correct for all test patterns
graphically: → translation and rotation of the separating lines
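The example solution can be checked directly against the four inequalities; a tiny sketch (illustrative):

```python
# Check the slide's example solution w1 = w2 = -2, theta = -3 against the
# NAND truth table (a single MPP with negative weights and threshold).
w1, w2, theta = -2, -2, -3
nand = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
for (x1, x2), y in nand.items():
    out = 1 if w1 * x1 + w2 * x2 >= theta else 0
    assert out == y
print("w =", (w1, w2), "theta =", theta, "realizes NAND")
```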
Introduction to Artificial Neural Networks

Perceptron Learning
P: set of positive examples
N: set of negative examples
threshold as a weight: w = (-θ, w1, w2)', with each input extended by a leading 1,
i.e. x = (1, x1, x2)', so that  w1 x1 + w2 x2 ≥ θ  ⇔  w'x ≥ 0

1. choose w0 at random, t = 0
2. choose an arbitrary x ∈ P ∪ N
3. if x ∈ P and wt'x > 0 then goto 2
   if x ∈ N and wt'x ≤ 0 then goto 2        (I/O correct!)
4. if x ∈ P and wt'x ≤ 0 then wt+1 = wt + x; t++; goto 2
   (w'x ≤ 0, but should be > 0:  (w+x)'x = w'x + x'x > w'x)
5. if x ∈ N and wt'x > 0 then wt+1 = wt - x; t++; goto 2
   (w'x > 0, but should be ≤ 0:  (w-x)'x = w'x - x'x < w'x)
6. stop? if I/O correct for all examples!

remark: the algorithm converges and is finite; worst case: exponential runtime

Example
suppose the initial vector of weights is w(0) = (1, -1, 1)'
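A direct transcription of steps 1-6 into Python follows (a sketch of mine; the NAND training data is my own encoding of the earlier truth table, since the slide only gives the initial weight vector). The threshold is folded into the weight vector and every example carries a leading constant 1, as described above.

```python
import random

def perceptron_learning(P, N, w0, max_steps=10_000):
    """Perceptron learning as on the slides: P positive, N negative examples,
    each example already extended by a leading 1 (threshold folded into w)."""
    w = list(w0)
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    for _ in range(max_steps):
        # step 6: stop as soon as the I/O behaviour is correct for all examples
        if all(dot(w, x) > 0 for x in P) and all(dot(w, x) <= 0 for x in N):
            return w
        x = random.choice(P + N)                 # step 2: arbitrary example
        if x in P and dot(w, x) <= 0:            # step 4: should fire but does not
            w = [wi + xi for wi, xi in zip(w, x)]
        elif x in N and dot(w, x) > 0:           # step 5: fires but should not
            w = [wi - xi for wi, xi in zip(w, x)]
    return w

# Illustrative data (my choice): learn NAND, x = (1, x1, x2).
P = [(1, 0, 0), (1, 0, 1), (1, 1, 0)]            # NAND = 1
N = [(1, 1, 1)]                                  # NAND = 0
w = perceptron_learning(P, N, w0=(1, -1, 1))     # initial vector from the slide
print("learned w = (-theta, w1, w2) =", w)
```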
Introduction to Artificial Neural Networks

We know what a single MPP can do.
What can be achieved with many MPPs?

Single MPP              ⇒ separates the plane into two half planes
Many MPPs in 2 layers   ⇒ can identify convex sets
Many MPPs in 3 layers   ⇒ can identify arbitrary sets
Many MPPs in > 3 layers ⇒ not really necessary!

Many MPPs in 2 layers ⇒ can identify convex sets
1. How? Combine several half planes ⇐ 2 layers!
2. Convex? X is convex iff ∀ a, b ∈ X: λ a + (1-λ) b ∈ X for λ ∈ (0,1)
[Figure: points A, B and the connecting segment inside a convex set]

arbitrary sets:
1. partition the nonconvex set into several convex sets
2. a two-layered subnet for each convex set
3. feed the outputs of the two-layered subnets into an OR gate (third layer)
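To make the three-layer recipe concrete, here is a minimal sketch with geometry chosen by me (two axis-aligned strips): layer 1 computes half planes, layer 2 ANDs them into convex pieces, and layer 3 ORs the pieces into a nonconvex union.

```python
# Sketch: three-layer MPP net for a nonconvex set (union of two convex strips).
# Layer 1: half planes, layer 2: AND per convex piece, layer 3: OR of the pieces.

def half_plane(w1, w2, theta):
    return lambda x, y: 1 if w1 * x + w2 * y >= theta else 0

def convex_piece(half_planes):
    # two-layer subnet: AND of its half planes (threshold = number of inputs)
    return lambda x, y: 1 if sum(h(x, y) for h in half_planes) >= len(half_planes) else 0

# vertical strip 0 <= x <= 1 and horizontal strip 0 <= y <= 1 (my example)
strip_a = convex_piece([half_plane(1, 0, 0), half_plane(-1, 0, -1)])
strip_b = convex_piece([half_plane(0, 1, 0), half_plane(0, -1, -1)])

def union(x, y):
    # third layer: OR neuron over the convex pieces
    return 1 if strip_a(x, y) + strip_b(x, y) >= 1 else 0

print(union(0.5, 5), union(5, 0.5), union(5, 5))   # 1, 1, 0 -> a nonconvex "cross"
```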