Computational Intelligence - Chair 11: Algorithm Engineering

Lecture 01

Computational Intelligence
Winter Term 2009/10
Prof. Dr. Günter Rudolph
Lehrstuhl für Algorithm Engineering (LS 11)
Fakultät für Informatik
TU Dortmund

Plan for Today
• Organization (Lectures / Tutorials)
• Overview CI
• Introduction to ANN
• McCulloch-Pitts Neuron (MCP)
• Minsky/Papert Perceptron (MPP)
Organizational Issues

Who are you?
either
• studying "Automation and Robotics" (Master of Science), Module "Optimization"
or
• studying "Informatik"
  - BA-Modul "Einführung in die Computational Intelligence"
  - Hauptdiplom-Wahlvorlesung (SPG 6 & 7)

Who am I?
Günter Rudolph
Fakultät für Informatik, LS 11
Guenter.Rudolph@tu-dortmund.de  ← best way to contact me
OH-14, R. 232                   ← if you want to see me
office hours: Tuesday, 10:30–11:30 am, and by appointment
Organizational Issues

Lectures:   Wednesday, 10:15–11:45, OH-14, R. E23
Tutorials:  group 1: Wednesday, 16:15–17:00, OH-14, R. 304
            group 2: Thursday,  16:15–17:00, OH-14, R. 304
Tutor:      Nicola Beume, LS 11

Prerequisites
Knowledge about
• mathematics,
• programming,
• logic
is helpful.
But what if something is unknown to me? ... don't hesitate to ask!

Information
Slides:     covered in the lecture  → see web
Literature: pointers to literature  → see web
http://ls11-www.cs.uni-dortmund.de/people/rudolph/teaching/lectures/CI/WS2009-10/lecture.jsp
Overview "Computational Intelligence"

What is CI?
⇒ umbrella term for computational methods inspired by nature
• term "computational intelligence" coined by John Bezdek (FL, USA)
• originally intended as a demarcation line
  ⇒ establish a border between artificial and computational intelligence
• nowadays: blurring border

backbone:
• artificial neural networks
• evolutionary algorithms
• fuzzy systems

new developments:
• swarm intelligence
• artificial immune systems
• growth processes in trees
• ...

our goals:
1. know what CI methods are good for!
2. know when to refrain from CI methods!
3. know why they work at all!
4. know how to apply and adjust CI methods to your problem!
Introduction to Artificial Neural Networks

Biological Prototype
● Neuron
  - information gathering    (dendrites, D)
  - information processing   (cell body, C)
  - information propagation  (axon A / synapse S)
• human being: 10^12 neurons
• electricity in mV range
• speed: 120 m/s
[Figure: biological neuron with nucleus / cell body, dendrites, axon, synapse]

Abstraction
dendrite (D)           → signal input
cell body (C), nucleus → signal processing
axon (A), synapse (S)  → signal output
Introduction to Artificial Neural Networks

1943: Warren McCulloch / Walter Pitts
● description of neurological networks
  → model: McCulloch-Pitts neuron (MCP)
● basic idea:
  - a neuron is either active or inactive
  - skills result from connecting neurons
● considered static networks
  (i.e. connections had been constructed, not learnt)

Model
x1, x2, …, xn → function f → f(x1, x2, …, xn)

McCulloch-Pitts neuron (1943):
xi ∈ {0, 1} =: B
f: B^n → B
Introduction to Artificial Neural Networks

McCulloch-Pitts Neuron
• n binary input signals x1, …, xn
• threshold θ > 0
⇒ can be realized:
  - boolean OR:  x1, …, xn feed a neuron with θ = 1
  - boolean AND: x1, …, xn feed a neuron with θ = n

• in addition: m binary inhibitory signals y1, …, ym
  - if at least one yj = 1, then output = 0
  - otherwise: if the sum of inputs ≥ threshold, then output = 1, else output = 0
⇒ NOT can be realized: input x1 wired as inhibitory signal y1, threshold θ = 0
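These gates are small enough to execute directly. Below is a minimal sketch (my illustration, not part of the slides) of an MCP neuron with excitatory and inhibitory binary inputs; the function names are made up for the example.

```python
# Minimal sketch of a McCulloch-Pitts neuron (illustrative, not from the slides).
def mcp(excitatory, threshold, inhibitory=()):
    """Binary MCP neuron: output 0 if any inhibitory signal fires,
    otherwise 1 iff the sum of excitatory inputs reaches the threshold."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

# boolean OR: theta = 1;  boolean AND: theta = n;  NOT: theta = 0 plus one inhibitory input
OR  = lambda *x: mcp(x, threshold=1)
AND = lambda *x: mcp(x, threshold=len(x))
NOT = lambda x:  mcp((), threshold=0, inhibitory=(x,))

assert OR(0, 1, 0) == 1 and AND(1, 1, 0) == 0 and NOT(0) == 1
```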
Introduction to Artificial Neural Networks

Analogons
Neurons             → simple MISO processors (with parameters: e.g. threshold)
Synapse             → connection between neurons (with parameters: synaptic weight)
Topology            → interconnection structure of the net
Propagation         → working phase of the ANN: processes input to output
Training / Learning → adaptation of the ANN to certain data
Introduction to Artificial Neural Networks

Assumption:
Inputs are also available in inverted form, i.e. ∃ inverted inputs.

[Figure: x1 and x2 feed a neuron with threshold θ; it fires iff x1 + x2 ≥ θ]

Theorem:
Every logical function F: B^n → B can be simulated with a two-layered McCulloch-Pitts net.

Example:
[Figure: three clause neurons (over x1, x2, x3 / x1, x2, x3 / x1, x4) with thresholds ≥ 3, ≥ 3, ≥ 2, all feeding an output neuron with threshold ≥ 1]
Introduction to Artificial Neural Networks

Proof (by construction):
Every boolean function F can be transformed into disjunctive normal form
⇒ 2 layers (AND - OR)
1. Every clause gets a decoding neuron with θ = n
   ⇒ output = 1 only if the clause is satisfied (AND gate)
2. All outputs of the decoding neurons are inputs of a neuron with θ = 1 (OR gate)
q.e.d.

Generalization: inputs with weights
[Figure: neuron with inputs x1, x2, x3, weights 0.2, 0.4, 0.3 and threshold 0.7]
fires 1 if  0.2 x1 + 0.4 x2 + 0.3 x3 ≥ 0.7
⇔ (multiply by 10)  2 x1 + 4 x2 + 3 x3 ≥ 7
⇒ duplicate inputs!
[Figure: unweighted neuron with threshold ≥ 7; x1 fed in twice, x2 four times, x3 three times]
⇒ equivalent!
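The AND-OR construction from the proof above is easy to run. The snippet below is a sketch of mine (not the lecture's code): it builds a two-layer MCP net from a DNF given as clauses of literals and checks it on XOR = (x1 ∧ ¬x2) ∨ (¬x1 ∧ x2), relying on the stated assumption that inverted inputs are available.

```python
# Two-layer MCP net from a DNF (sketch; assumes inverted inputs are available).
# A clause is a list of literals: +i means x_i, -i means the inverted input of x_i.
def dnf_net(clauses, x):
    # layer 1: one decoding (AND) neuron per clause, threshold = clause length
    decoded = []
    for clause in clauses:
        inputs = [x[i - 1] if i > 0 else 1 - x[-i - 1] for i in clause]
        decoded.append(1 if sum(inputs) >= len(clause) else 0)
    # layer 2: OR neuron with threshold 1
    return 1 if sum(decoded) >= 1 else 0

# XOR = (x1 AND NOT x2) OR (NOT x1 AND x2)
xor_clauses = [[1, -2], [-1, 2]]
for x1 in (0, 1):
    for x2 in (0, 1):
        assert dnf_net(xor_clauses, (x1, x2)) == (x1 ^ x2)
```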
Introduction to Artificial Neural Networks

Theorem:
Weighted and unweighted MCP nets are equivalent for weights ∈ Q+.

Proof:
"⇒": Let the weights be w_i = a_i / b_i with a_i, b_i ∈ N and the threshold be a_0 ∈ N.
Multiplication with b_1 b_2 … b_n yields an inequality with coefficients in N.
Duplicate input x_i such that we get a_i b_1 b_2 … b_{i-1} b_{i+1} … b_n inputs.
Threshold θ = a_0 b_1 … b_n.
"⇐": Set all weights to 1.
q.e.d.

Conclusion for MCP nets
+ feed-forward: able to compute any Boolean function
+ recursive: able to simulate a DFA
− very similar to conventional logical circuits
− difficult to construct
− no good learning algorithm available
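The "⇒" direction of the equivalence theorem above can be traced numerically. This is a minimal sketch (illustrative, not from the slides) that assumes the weights and threshold are given as positive rationals; `to_unweighted` is a helper name of my own.

```python
from fractions import Fraction
from math import lcm

# Sketch: turn a weighted MCP neuron with positive rational weights into an
# unweighted one by clearing denominators and duplicating inputs.
def to_unweighted(weights, theta):
    fracs = [Fraction(w) for w in weights] + [Fraction(theta)]
    scale = lcm(*(f.denominator for f in fracs))    # smallest common denominator
    copies = [int(f * scale) for f in fracs[:-1]]   # feed input x_i in that many times
    return copies, int(fracs[-1] * scale)           # new natural-valued threshold

# example from the previous slide: 0.2 x1 + 0.4 x2 + 0.3 x3 >= 0.7
copies, theta = to_unweighted(["0.2", "0.4", "0.3"], "0.7")
print(copies, theta)   # [2, 4, 3] and 7: duplicate x1 twice, x2 four times, x3 three times
```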
Introduction to Artificial Neural Networks

Perceptron (Rosenblatt 1958)
→ complex model → reduced by Minsky & Papert to what is "necessary"
→ Minsky-Papert perceptron (MPP), 1969

What can a single MPP do?
A single MPP outputs 1 iff  w1 x1 + w2 x2 ≥ θ.
Isolation of x2 yields a separating line ⇒ the line separates R^2 into 2 classes.
[Figures: separating lines in the unit square for AND (θ = 2), NAND, OR (θ = 1), NOR (θ = 0); for XOR no separating line exists (?)]

Example: XOR
x1  x2  xor
0   0   0   ⇒ 0 < θ
0   1   1   ⇒ w2 ≥ θ
1   0   1   ⇒ w1 ≥ θ
1   1   0   ⇒ w1 + w2 < θ

w1, w2 ≥ θ > 0  ⇒  w1 + w2 ≥ 2θ > θ, but also w1 + w2 < θ  ⇒  contradiction!
⇒ XOR cannot be realized by a single MPP.

1969: Marvin Minsky / Seymour Papert
● book "Perceptrons" → analysis of mathematical properties of perceptrons
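The contradiction can also be checked mechanically. The following sketch (illustrative, not part of the lecture) searches a small grid of weights and thresholds for a single MPP reproducing each gate; AND, OR, NAND and NOR are found, while XOR yields nothing on the searched grid, consistent with the argument above.

```python
from itertools import product

def mpp(w1, w2, theta, x1, x2):
    return 1 if w1 * x1 + w2 * x2 >= theta else 0

def realizable(truth_table, grid):
    """Search the grid for (w1, w2, theta) reproducing the truth table."""
    for w1, w2, theta in product(grid, repeat=3):
        if all(mpp(w1, w2, theta, x1, x2) == y for (x1, x2), y in truth_table.items()):
            return (w1, w2, theta)
    return None

grid = [i / 2 for i in range(-8, 9)]            # -4.0, -3.5, ..., 4.0
gates = {
    "AND":  {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1},
    "OR":   {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1},
    "NAND": {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0},
    "NOR":  {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0},
    "XOR":  {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0},
}
for name, table in gates.items():
    print(name, realizable(table, grid))        # XOR yields None, as the contradiction predicts
```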
Introduction to Artificial Neural Networks

● disillusioning result:
  perceptrons fail to solve a number of trivial problems!
  - XOR problem
  - parity problem
  - connectivity problem
● "conclusion": All artificial neurons have this kind of weakness!
  ⇒ research in this field is a scientific dead end!
● consequence: research funding for ANN cut down extremely (~ 15 years)

How to leave the "dead end":
1. Multilayer perceptrons:
   [Figure: two-layer net of MPPs over inputs x1, x2] ⇒ realizes XOR
2. Nonlinear separating functions:
   g(x1, x2) = 2 x1 + 2 x2 - 4 x1 x2 - 1  with θ = 0
   g(0,0) = -1,  g(0,1) = +1,  g(1,0) = +1,  g(1,1) = -1  ⇒ XOR
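Both escape routes are easy to verify. The sketch below is my own illustration; the two-layer wiring is one possible choice and not necessarily the exact one drawn on the slide, while the nonlinear function g is taken from the slide.

```python
# Sketch (illustrative): two ways out of the XOR dead end.

# 1) One possible two-layer perceptron: XOR(x1, x2) = (x1 OR x2) AND NOT (x1 AND x2).
def xor_two_layer(x1, x2):
    h1 = 1 if x1 + x2 >= 1 else 0          # OR neuron, theta = 1
    h2 = 1 if x1 + x2 >= 2 else 0          # AND neuron, theta = 2
    return 1 if h1 - h2 >= 1 else 0        # output neuron, weights (1, -1), theta = 1

# 2) Nonlinear separating function from the slide, threshold theta = 0.
def xor_nonlinear(x1, x2):
    g = 2 * x1 + 2 * x2 - 4 * x1 * x2 - 1
    return 1 if g >= 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        assert xor_two_layer(x1, x2) == xor_nonlinear(x1, x2) == (x1 ^ x2)
```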
Introduction to Artificial Neural Networks

How to obtain weights wi and threshold θ?

as yet: by construction

example: NAND gate
x1  x2  NAND
0   0   1   ⇒ 0 ≥ θ
0   1   1   ⇒ w2 ≥ θ
1   0   1   ⇒ w1 ≥ θ
1   1   0   ⇒ w1 + w2 < θ

requires solution of a system of linear inequalities (∈ P)
(e.g.: w1 = w2 = -2, θ = -3)

now: by "learning" / training

Perceptron Learning
Assumption: test examples with correct I/O behavior are available
Principle:
(1) choose initial weights in an arbitrary manner
(2) feed in a test pattern
(3) if the output of the perceptron is wrong, then change the weights
(4) goto (2) until the output is correct for all test patterns
graphically: → translation and rotation of the separating lines
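The example solution can be checked directly against the four inequalities; a tiny sketch (illustrative):

```python
# Check the slide's example solution w1 = w2 = -2, theta = -3 against the
# NAND truth table (a single MPP with negative weights and threshold).
w1, w2, theta = -2, -2, -3
nand = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
for (x1, x2), y in nand.items():
    out = 1 if w1 * x1 + w2 * x2 >= theta else 0
    assert out == y
print("w =", (w1, w2), "theta =", theta, "realizes NAND")
```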
Introduction to Artificial Neural Networks

Perceptron Learning
P: set of positive examples
N: set of negative examples
threshold as a weight: w = (-θ, w1, w2)', with each input extended by a leading 1,
i.e. x = (1, x1, x2)', so that  w1 x1 + w2 x2 ≥ θ  ⇔  w'x ≥ 0

1. choose w0 at random, t = 0
2. choose an arbitrary x ∈ P ∪ N
3. if x ∈ P and wt'x > 0 then goto 2
   if x ∈ N and wt'x ≤ 0 then goto 2        (I/O correct!)
4. if x ∈ P and wt'x ≤ 0 then wt+1 = wt + x; t++; goto 2
   (w'x ≤ 0, but should be > 0:  (w+x)'x = w'x + x'x > w'x)
5. if x ∈ N and wt'x > 0 then wt+1 = wt - x; t++; goto 2
   (w'x > 0, but should be ≤ 0:  (w-x)'x = w'x - x'x < w'x)
6. stop? if I/O correct for all examples!

remark: the algorithm converges and is finite; worst case: exponential runtime

Example
suppose the initial vector of weights is w(0) = (1, -1, 1)'
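A direct transcription of steps 1-6 into Python follows (a sketch of mine; the NAND training data is my own encoding of the earlier truth table, since the slide only gives the initial weight vector). The threshold is folded into the weight vector and every example carries a leading constant 1, as described above.

```python
import random

def perceptron_learning(P, N, w0, max_steps=10_000):
    """Perceptron learning as on the slides: P positive, N negative examples,
    each example already extended by a leading 1 (threshold folded into w)."""
    w = list(w0)
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    for _ in range(max_steps):
        # step 6: stop as soon as the I/O behaviour is correct for all examples
        if all(dot(w, x) > 0 for x in P) and all(dot(w, x) <= 0 for x in N):
            return w
        x = random.choice(P + N)                 # step 2: arbitrary example
        if x in P and dot(w, x) <= 0:            # step 4: should fire but does not
            w = [wi + xi for wi, xi in zip(w, x)]
        elif x in N and dot(w, x) > 0:           # step 5: fires but should not
            w = [wi - xi for wi, xi in zip(w, x)]
    return w

# Illustrative data (my choice): learn NAND, x = (1, x1, x2).
P = [(1, 0, 0), (1, 0, 1), (1, 1, 0)]            # NAND = 1
N = [(1, 1, 1)]                                  # NAND = 0
w = perceptron_learning(P, N, w0=(1, -1, 1))     # initial vector from the slide
print("learned w = (-theta, w1, w2) =", w)
```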
Introduction to Artificial Neural Networks

We know what a single MPP can do.
What can be achieved with many MPPs?

Single MPP              ⇒ separates the plane into two half planes
Many MPPs in 2 layers   ⇒ can identify convex sets
Many MPPs in 3 layers   ⇒ can identify arbitrary sets
Many MPPs in > 3 layers ⇒ not really necessary!

Many MPPs in 2 layers ⇒ can identify convex sets
1. How? Combine several half planes ⇐ 2 layers!
2. Convex? X is convex iff ∀ a, b ∈ X: λ a + (1-λ) b ∈ X for λ ∈ (0,1)
[Figure: points A, B and the connecting segment inside a convex set]

arbitrary sets:
1. partition the nonconvex set into several convex sets
2. a two-layered subnet for each convex set
3. feed the outputs of the two-layered subnets into an OR gate (third layer)
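To make the three-layer recipe concrete, here is a minimal sketch with geometry chosen by me (two axis-aligned strips): layer 1 computes half planes, layer 2 ANDs them into convex pieces, and layer 3 ORs the pieces into a nonconvex union.

```python
# Sketch: three-layer MPP net for a nonconvex set (union of two convex strips).
# Layer 1: half planes, layer 2: AND per convex piece, layer 3: OR of the pieces.

def half_plane(w1, w2, theta):
    return lambda x, y: 1 if w1 * x + w2 * y >= theta else 0

def convex_piece(half_planes):
    # two-layer subnet: AND of its half planes (threshold = number of inputs)
    return lambda x, y: 1 if sum(h(x, y) for h in half_planes) >= len(half_planes) else 0

# vertical strip 0 <= x <= 1 and horizontal strip 0 <= y <= 1 (my example)
strip_a = convex_piece([half_plane(1, 0, 0), half_plane(-1, 0, -1)])
strip_b = convex_piece([half_plane(0, 1, 0), half_plane(0, -1, -1)])

def union(x, y):
    # third layer: OR neuron over the convex pieces
    return 1 if strip_a(x, y) + strip_b(x, y) >= 1 else 0

print(union(0.5, 5), union(5, 0.5), union(5, 5))   # 1, 1, 0 -> a nonconvex "cross"
```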