ecs236 Winter 2006:
Intrusion Detection
#3: Anomaly Detection
Dr. S. Felix Wu
Computer Science Department
University of California, Davis
http://www.cs.ucdavis.edu/~wu/
sfelixwu@gmail.com
01/04/2006
Intrusion Detection Model

(Figure: input event sequence → Intrusion Detection (pattern matching) → results.)
Scalability of Detection

• Number of signatures, amount of analysis
• Unknown exploits/vulnerabilities
Anomaly vs. Signature

• Signature Intrusion (Bad things happen!!)
– Misuse produces an observable bad effect
– Specify and look for bad behaviors

• Anomaly Intrusion (Good things did not happen!!)
– We know what our normal behavior is
– Look for a deviation from the normal behavior and raise an early warning
Reasons for “AND”

• Unknown attacks (insider threat)
• Better scalability
– AND (anomaly detection) → targets/vulnerabilities
– SD (signature detection) → exploits
Another definition…

• Signature-based detection
– Predefine signatures of anomalies
– Pattern matching
(Convert our limited/partial understanding/modeling about the target system or protocol into detection heuristics, i.e., BUTTERCUP signatures.)

• Statistics-based detection
– Build a statistics profile for expected behaviors
– Compare testing behaviors with the expected behaviors
– Significant deviation
(Based on our experience, select a set of “features” that will likely distinguish expected from unexpected behavior.)
What is “vulnerability”?
What is “vulnerability”?

• Signature Detection: create “effective/strong/scalable” signatures
• Anomaly Detection: detect/discover “unknown vulnerabilities”
AND (ANomaly Detection)

• Unknown Vulnerabilities/Exploits
• Insider Attacks
• Understand how and why these things happened
• Understand the limit of AND from both sides
What is an anomaly?
Input Events

For each sample of the statistic measure X:

(0, 1]: 40%    (1, 3]: 30%    (3, 15]: 20%    (15, +∞): 10%

(SAND figure: state q_i → q_{i+1}.)
“But, which feature(s) to profile??”

(Figure: raw events → function F → long-term profile → quantify the anomalies → threshold control → alarm generation.)
What is an anomaly?

(Figure: events and an expected behavior model feed into anomaly detection.)
What is an anomaly?

(Figure: events and an expected behavior model, built from knowledge about the target, feed into anomaly detection.)
Model vs. Observation

(Figure: observations are checked against the Model by anomaly detection; conflicts → anomalies.)

It could be an attack, but it might well be a misunderstanding!!
Statistic-based ANomaly Detection (SAND)

• choose a parameter (a random variable, hopefully without any assumption about its probabilistic distribution)
• record its statistical “long-term” profile
• check how much, quantitatively, its short-term behavior deviates from its long-term profile
• set the right threshold on the deviation to raise alarms
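Below is a minimal Python sketch of the SAND loop just outlined (the class name, bin edges, and threshold are illustrative assumptions, not the course's tool). It uses the chi-square-style deviation Q defined on the later Q-Training slides; for simplicity the short-term counts just accumulate here, whereas the deck's weighted-sum scheme with decay is sketched later.

# Minimal SAND sketch: long-term bin probabilities for one measure,
# short-term counts, and an alarm when the deviation Q exceeds a threshold.
class SAND:
    def __init__(self, bin_edges, long_term_probs, q_threshold):
        self.bin_edges = bin_edges              # e.g. [1, 3, 15] for (0,1], (1,3], (3,15], (15,+inf)
        self.p = long_term_probs                # long-term profile P1..Pk, sums to 1
        self.q_threshold = q_threshold          # deviation considered abnormal (illustrative)
        self.y = [0.0] * len(long_term_probs)   # short-term counts Y1..Yk

    def _bin(self, x):
        for i, edge in enumerate(self.bin_edges):
            if x <= edge:
                return i
        return len(self.bin_edges)

    def observe(self, x):
        """Add one sample of the statistic measure and return (Q, alarm)."""
        self.y[self._bin(x)] += 1
        n = sum(self.y)
        q = sum((yi - n * pi) ** 2 / (n * pi) for yi, pi in zip(self.y, self.p))
        return q, q > self.q_threshold

# Example with the bins from the "Input Events" slide:
detector = SAND([1, 3, 15], [0.4, 0.3, 0.2, 0.1], q_threshold=10.0)
for sample in [0.5, 2.0, 40.0, 55.0, 60.0]:
    q, alarm = detector.observe(sample)
    print(f"Q = {q:.2f}, alarm = {alarm}")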
(Figure: SAND pipeline — raw events update a long-term profile under timer control (update, decay, clean); compute the deviation of recent behavior from the profile; threshold control; alarm generation.)
Statistical Profiling

• Long-Term profile: capture the long-term behavior of a particular statistic measure
– e.g., update once per day
– half-life: 30 updates
– the most recent 30 updates contribute 50%, updates 31–60 contribute 25%
– the newer contributes more
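A small worked example (my assumption, consistent with the weighted-sum scheme described later in this deck): with a per-update decay of 2^(−α), a 30-update half-life means α = 1/30, so the most recent 30 updates carry about 50% of the total weight and updates 31–60 about 25%.

# Half-life vs. fading factor, assuming exponential decay of 2**(-alpha) per update.
alpha = 1.0 / 30                      # fading factor for a 30-update half-life

def weight_share(first, last, alpha):
    """Fraction of total (infinite-horizon) weight carried by updates first..last,
    counting the most recent update as 1."""
    total = 1.0 / (1.0 - 2 ** (-alpha))                  # geometric series sum
    chunk = sum(2 ** (-alpha * (i - 1)) for i in range(first, last + 1))
    return chunk / total

print(round(weight_share(1, 30, alpha), 3))   # ~0.5  (most recent 30 updates)
print(round(weight_share(31, 60, alpha), 3))  # ~0.25 (updates 31-60)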
Statistical Pros and Cons

• Slower to detect – averaging window
• Very good for unknown attacks – as long as “relevant measures” are chosen
• Environment (protocol, user, etc.) dependency
– Need good choices of statistical measures
– Statistical profiles might be hard to build
– Thresholds might be hard to set
Long-term Profile

• Category, C-Training: learn the aggregate distribution of a statistic measure
• Q Statistics, Q-Training: learn how much deviation is considered normal
• Threshold
Long-term Profile: C-Training

For each sample of the statistic measure X:

(0, 50]: 20%    (50, 75]: 30%    (75, 90]: 40%    (90, +∞): 10%

• k bins
• Expected Distribution P1, P2, ..., Pk, where Σ_{i=1..k} pi = 1
• Training time: months
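A sketch of C-Training as described above (the helper name and bin edges are illustrative assumptions): estimate the expected distribution P1..Pk from a long run of training samples of the statistic measure X.

# C-Training sketch: empirical bin probabilities from training samples.
def c_training(samples, bin_edges):
    """Return the empirical bin probabilities P1..Pk (k = len(bin_edges) + 1)."""
    counts = [0] * (len(bin_edges) + 1)
    for x in samples:
        i = next((j for j, edge in enumerate(bin_edges) if x <= edge), len(bin_edges))
        counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]

# Example with the bins from the slide: (0,50], (50,75], (75,90], (90,+inf)
training = [10, 30, 45, 60, 70, 72, 80, 82, 85, 95]
print(c_training(training, [50, 75, 90]))   # [0.3, 0.3, 0.3, 0.1]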
Long-term Profile: Q-Training (1)

For each sample of the statistic measure X:

(0, 50]: 20%    (50, 75]: 40%    (75, 90]: 20%    (90, +∞): 20%

• k bins, Yi = number of samples falling into the i-th bin
• N samples in total (Σ_{i=1..k} Yi = N)
• Weighted Sum Scheme with the fading factor α
Threshold

• Predefined threshold ε

Q = Σ_{i=1..k} (Yi − N·pi)² / (N·pi)

• If Prob(Q > q) < ε, raise alarm

(Figure: long-term Q distribution (probability vs. Q bins, 0–30), with thresholds TH_yellow and TH_red.)
Long-term Profile: Q-Training (2)

• Deviation:

Q = Σ_{i=1..k} (Yi − N·pi)² / (N·pi)

• Example (N = 10, p = (0.2, 0.3, 0.4, 0.1), Y = (2, 4, 2, 2)):

Q = (2 − 10·0.2)²/(10·0.2) + (4 − 10·0.3)²/(10·0.3) + (2 − 10·0.4)²/(10·0.4) + (2 − 10·0.1)²/(10·0.1) ≈ 2.33

• Qmax: the largest value among all Q values
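A quick check of the worked example above (a sketch, not the original tool): with long-term probabilities p and short-term counts Y over N = 10 samples, Q is the chi-square-like deviation from the expected distribution.

# Worked check of the Q example from the slide.
p = [0.2, 0.3, 0.4, 0.1]          # long-term profile P1..Pk
y = [2, 4, 2, 2]                  # short-term bin counts Y1..Yk
n = sum(y)                        # N = 10

q = sum((yi - n * pi) ** 2 / (n * pi) for yi, pi in zip(y, p))
print(round(q, 2))                # 2.33, matching the slide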
Long-term Profile: Q-Training (3)

• Q Distribution
– [0, Qmax) is equally divided into 31 bins and the last bin is [Qmax, +∞)
– distribute all Q values into the 32 bins
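A sketch of Q-Training (3) together with the threshold test from the previous slide (function names are mine; the original toolkit is not shown): build the 32-bin Q histogram from training Q values, then alarm when the tail probability Prob(Q > q) drops below the predefined threshold ε.

# Q histogram over 31 equal bins on [0, q_max) plus a last bin [q_max, +inf),
# and a tail-probability alarm test against it.
def q_histogram(q_values, q_max, nbins=32):
    counts = [0] * nbins
    width = q_max / (nbins - 1)                   # width of the 31 equal bins
    for q in q_values:
        i = min(int(q / width), nbins - 1)        # values >= q_max go to the last bin
        counts[i] += 1
    total = len(q_values)
    return [c / total for c in counts], width

def tail_prob(q, probs, width):
    """Approximate Prob(Q > q): probability mass at or above q's bin (conservative)."""
    i = min(int(q / width), len(probs) - 1)
    return sum(probs[i:])

def alarm(q, probs, width, epsilon=0.01):
    return tail_prob(q, probs, width) < epsilon

training_q = [0.5, 1.2, 2.0, 2.3, 3.1, 4.0, 5.5, 7.2]    # illustrative training Q values
probs, width = q_histogram(training_q, q_max=10.0)       # Qmax seen in training (assumed)
print(alarm(9.0, probs, width, epsilon=0.05))            # True: 9.0 lies far in the tail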
Weighted Sum Scheme

• Problems of the Sliding Window Scheme
– keep the most recent N pieces of audit records
– required resources and computing time are O(N)

• Assume
– K: number of bins
– Yi: count of audit records falling into the i-th bin
– N: total number of audit records
– α: fading factor

• When event Ei occurs, update
– Yi ← Yi · 2^(−α) + 1
– Yj ← Yj · 2^(−α), for j ≠ i
– N ← Σ_{i=1..K} Yi = N · 2^(−α) + 1
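A minimal sketch of the weighted-sum update above (using α as the fading-factor symbol, which is my reconstruction of the slide's lost symbol): per event, every bin count decays by 2^(−α) and the event's own bin gains 1, so the profile needs O(K) state instead of the sliding window's O(N).

# Weighted-sum (exponential decay) update of the bin counts.
def weighted_sum_update(y, event_bin, alpha):
    """Decay all bin counts and credit the bin the new audit record falls into."""
    decay = 2 ** (-alpha)
    y = [yi * decay for yi in y]
    y[event_bin] += 1
    return y

y = [0.0, 0.0, 0.0, 0.0]
for b in [0, 1, 1, 3, 2]:          # a short stream of audit records (bin indices)
    y = weighted_sum_update(y, b, alpha=1/30)
print(y, sum(y))                   # N = sum(y) also follows N <- N*2**(-alpha) + 1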
FTP Servers and Clients

• FTP Client: SHANG
• FTP Servers: Heidelberg, NCU, SingNet, UIUC
Q-Measure

• Deviation:

Q = Σ_{i=1..k} (Yi − N·pi)² / (N·pi)

• Example (N = 10, p = (0.2, 0.3, 0.4, 0.1), Y = (2, 4, 2, 2)):

Q = (2 − 10·0.2)²/(10·0.2) + (4 − 10·0.3)²/(10·0.3) + (2 − 10·0.4)²/(10·0.4) + (2 − 10·0.1)²/(10·0.1) ≈ 2.33

• Qmax: the largest value among all Q values
(Figure: SAND state transition, q_i → q_{i+1}.)
Threshold

• Predefined threshold ε

Q = Σ_{i=1..k} (Yi − N·pi)² / (N·pi)

• If Prob(Q > q) < ε, raise alarm

(Figure: long-term Q distribution (probability vs. Q bins, 0–30) with thresholds TH_yellow and TH_red; the region beyond the threshold is labeled “False positive”.)
(Figure: long-term Q distributions (probability vs. Q bins, 0–35) for the four FTP servers — Heidelberg, NCU, SingNet, and UIUC.)
Mathematics

• Many other techniques:
– training/learning
– detection
(Figure: SAND pipeline, repeated — raw events update the long-term profile under timer control (update, decay, clean); compute the deviation; threshold control; alarm generation.)
Dropper Attacks

• Intentional or unintentional??

(Figure: dropping P% of the packets with three patterns — Per (K, I, S), Ret (K, S), Ran (K).)
Periodical Packet Dropping

• Parameters (K, I, S)
– K: the total number of dropped packets in a connection
– I: the interval between two consecutive dropped packets
– S: the position of the first dropped packet

• Example (5, 10, 4)
– 5 packets dropped in total
– 1 every 10 packets
– start from the 4th packet
– the 4th, 14th, 24th, 34th, and 44th packets will be dropped (see the sketch below)
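A small sketch of the PerPD(K, I, S) pattern above (the helper name is mine): enumerate which packet positions an attack agent would drop.

# Positions of the K dropped packets: S, S+I, S+2I, ...
def perpd_positions(k, i, s):
    return [s + n * i for n in range(k)]

print(perpd_positions(5, 10, 4))   # [4, 14, 24, 34, 44], as in the example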
Retransmission Packet Dropping

• Parameters (K, S)
– K: the number of times the packet's retransmissions are dropped
– S: the position of the dropped packet

• Example (5, 10)
– first, drop the 10th packet
– then, drop the retransmissions of the 10th packet 5 times
Random Packet Dropping

• Parameters (K)
– K: the total number of packets to be dropped in a connection

• Example (5)
– randomly drop 5 packets in a connection
Experiment Setting

(Figure: an FTP client downloads xyz.zip (5.5 MB) from an FTP server over the Internet; an attack agent on a divert socket sits on the path and manipulates the data packets.)
Impacts of Packet Dropping on Session Delay

(Figure: bar chart of session delay in seconds at Heidelberg, NCU, SingNet, and UIUC under Normal, RanPD(7), PerPD(7, 4, 5), and RetPD(7, 5); delays range from 23.6 s to 260.3 s.)
Compare Impacts of Dropping Patterns

(Figure: four panels — Heidelberg, NCU, SingNet, UIUC — plotting session delay against the number of victim packets (0–40) for PerPD, RanPD, and RetPD; PerPD uses I=4, S=5 and RetPD uses S=5.)
(Figure: testbed topology with FTP server “fire”, FTP client “redwing”, and networks 152.1.75.0, 172.16.0.0 (“bone”), 192.168.1.0 (“light”), and “air”; TFN agents controlled by a TFN master launch a UDP flood against a TFN target, causing congestion on the FTP data path.)
(Figure: number of lost packets vs. time (0–100 s) for four flooding patterns: “flood 1, stop 5”, “flood 1, stop 20”, “flood 5, stop 10”, and “flood 5, stop 2”.)
TDSAM Experiment Setting

(Figure: same FTP setup — client downloading xyz.zip (5.5 MB) from the server over the Internet, with an attack agent on a divert socket manipulating the data packets; TDSAM observes the packet stream (e.g., p1, p2, p3, p5, p4) and performs reordering counting against the running max.)
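A hedged sketch of one plausible reading of the “max / reordering counting” labels above (the slide does not spell out the exact TDSAM metric): count a packet as reordered when its sequence number is smaller than the maximum sequence number already seen.

# Plausible reordering count: packets arriving after a later packet.
def reordering_count(seq_numbers):
    max_seen = 0
    count = 0
    for s in seq_numbers:
        if s < max_seen:
            count += 1          # arrived after a packet with a higher sequence number
        else:
            max_seen = s
    return count

print(reordering_count([1, 2, 3, 5, 4]))   # 1: packet 4 arrives after packet 5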
(Figure: Q distributions (probability vs. Q bins, 0–35) for Heidelberg, NCU, SingNet, and UIUC.)
(Figure: a second set of Q distributions (probability vs. Q bins, 0–35) for Heidelberg, NCU, SingNet, and UIUC.)
Results: Position Measure

Position (nbin=5)           Heidelberg       NCU              SingNet          UIUC
                            DR      MR       DR      MR       DR      MR       DR      MR
Normal*                     4.0%    -        5.4%    -        3.5%    -        6.5%    -
PerPD (10, 4, 5)            99.7%   0.3%     100%    0%       100%    0.0%     100%    0%
PerPD (20, 4, 5)            100%    0%       98.1%   1.9%     99.2%   0.8%     100%    0%
PerPD (40, 4, 5)            96.6%   3.4%     100%    0%       100%    0%       98.5%   1.5%
PerPD (20, 20, 5)           100%    0%       100%    0%       100%    0%       100%    0%
PerPD (20, 100, 5)          98.9%   1.1%     99.2%   0.8%     99.6%   0.4%     99.1%   0.9%
PerPD (20, 200, 5)          0%      100%     76.5%   23.5%    1.5%    98.5%    98.3%   1.7%
PerPD (100, 40, 5)          0.2%    99.8%    0%      100%     0%      100%     100%    0%
RetPD (5, 5)                84.9%   15.1%    81.1%   18.9%    94.3%   5.7%     97.4%   2.6%
RanPD 10                    0%      100%     42.3%   57.7%    0%      100%     0%      100%
RanPD 40                    0%      100%     0%      100%     0%      100%     0%      100%
Intermittent (10, 4, 5), 5  98.6%   1.4%     100%    0%       98.2%   1.8%     100%    0%
Intermittent (10, 4, 5), 50 34.1%   65.9%    11.8%   88.2%    89.4%   10.6%    94.9%   5.1%

(DR = detection rate, MR = miss rate. * For Normal traffic, the value shown is the false alarm rate.)
Results: Delay Measure

Delay (nbin=3)              Heidelberg       NCU              SingNet          UIUC
                            DR      MR       DR      MR       DR      MR       DR      MR
Normal*                     1.6%    -        7.5%    -        2.1%    -        7.9%    -
PerPD (10, 4, 5)            97.4%   2.6%     95.2%   4.8%     94.5%   5.5%     99.2%   0.8%
PerPD (20, 4, 5)            99.2%   0.8%     98.5%   1.5%     100%    0%       100%    0%
PerPD (40, 4, 5)            100%    0%       100%    0%       100%    0%       100%    0%
PerPD (20, 20, 5)           96.3%   3.7%     100%    0%       92.6%   7.4%     98.9%   1.1%
PerPD (20, 100, 5)          100%    0%       95.3%   4.7%     98.7%   1.3%     100%    0%
PerPD (20, 200, 5)          98.6%   1.4%     99%     1%       97.1%   2.9%     100%    0%
PerPD (100, 40, 5)          100%    0%       100%    0%       100%    0%       100%    0%
RetPD (5, 5)                100%    0%       100%    0%       100%    0%       100%    0%
RanPD 10                    74.5%   25.5%    26.8%   73.2%    67.9%   32.1%    99.5%   0.5%
RanPD 40                    100%    0%       100%    0%       100%    0%       100%    0%
Intermittent (10, 4, 5), 5  25.6%   74.4%    0%      100%     0%      100%     97.3%   2.7%
Intermittent (10, 4, 5), 50 0%      100%     24.9%   75.1%    0%      100%     3.7%    96.3%
Results: NPR Measure

NPR (nbin=2)                Heidelberg       NCU              SingNet          UIUC
                            DR      MR       DR      MR       DR      MR       DR      MR
Normal*                     4.5%    -        5.8%    -        8.2%    -        2.9%    -
PerPD (10, 4, 5)            0%      100%     14.4%   85.6%    29.1%   70.9%    100%    0%
PerPD (20, 4, 5)            83.1%   16.9%    94.2%   5.8%     95.2%   4.8%     100%    0%
PerPD (40, 4, 5)            100%    0%       97.4%   2.6%     100%    0%       100%    0%
PerPD (20, 20, 5)           91.6%   8.4%     92%     8%       93.5%   6.5%     100%    0%
PerPD (20, 100, 5)          94.3%   5.7%     92.2%   7.8%     96.4%   3.6%     100%    0%
PerPD (20, 200, 5)          0%      100%     96.5%   3.5%     94.8%   5.2%     100%    0%
PerPD (100, 40, 5)          100%    0%       100%    0%       100%    0%       100%    0%
RetPD (5, 5)                0%      100%     84.7%   15.3%    23.9%   76.1%    46.5%   53.5%
RanPD 10                    0%      100%     0%      100%     100%    0%       100%    0%
RanPD 40                    100%    0%       100%    0%       100%    0%       100%    0%
Intermittent (10, 4, 5), 5  0%      100%     0%      100%     82.2%   17.8%    100%    0%
Intermittent (10, 4, 5), 50 0%      100%     1%      99%      40%     60%      64.8%   35.2%
Results (good and bad)

• False Alarm Rate
– less than 10% in most cases; the highest is 17.4%

• Detection Rate
– Position: good on RetPD and most PerPD patterns
  e.g., at NCU, 98.7% for PerPD(20, 4, 5), but 0% for PerPD(100, 40, 5), in which the dropped packets are evenly distributed
– Delay: good on patterns that significantly change the session delay, e.g., RetPD and PerPD with a large value of K
  e.g., at SingNet, 100% for RetPD(5, 5), but 67.9% for RanPD(10)
– NPR: good on patterns that drop many packets
  e.g., at Heidelberg, 0% for RanPD(10), but 100% for RanPD(40)
Performance Analysis

• Sites with stable and small session delay or packet reordering give high detection rates
– e.g., using the Delay Measure for RanPD(10): UIUC (99.5%) > Heidelberg (74.5%) > SingNet (67.9%) > NCU (26.8%)

• The best choice of nbin is site-specific
– e.g., using the Position Measure, the lowest false alarm rate occurs with nbin=5 at Heidelberg (4.0%) and NCU (5.4%), nbin=10 at UIUC (4.5%), and nbin=20 at SingNet (1.6%)
(Figure: SAND pipeline again — raw events update the long-term profile under timer control (update, decay, clean); compute the deviation; threshold control; alarm generation.)
Information Visualization Toolkit

(Figure: a variant of the pipeline in which raw events build a cognitive profile (update, decay, clean); the deviation is identified cognitively, leading to alarm identification rather than automatic alarm generation.)
What is an anomaly?
What is an anomaly?

• The observation of a target system is somewhat inconsistent with the expected conceptual model of the same system
What is an anomaly?

• The observation of a target system is somewhat inconsistent with the expected conceptual model of the same system

• And, this conceptual model can be ANYTHING.
– statistical, logical, or something else
Model vs. Observation

(Figure: observations are checked against the Model by anomaly detection; conflicts → anomalies.)

It could be an attack, but it might well be a misunderstanding!!
The Challenge

(Figure: events and the expected behavior model, built from knowledge about the target, feed into anomaly detection, which produces both false positives and false negatives.)
Challenge

• We know that the detected anomalies can be either true positives or false positives.
• We try our best to resolve the puzzle by examining all the information available to us.
• But the “ground truth” of these anomalies is very hard to obtain
– even with human intelligence
Problems with AND

• We are not sure about whatever we want to detect…
• We are not sure either when something is caught…
• We are still in the dark… at least in many cases…
Anomaly Explanation

• How will a human resolve the conflict?

• The Power of Reasoning and Explanation
– We detected something we really want to detect → reducing false negatives
– Our model can be improved → reducing false positives
Without Explanation

• AND is not as useful??
• Knowledge is the power to utilize information!
– Unknown vulnerabilities
– Root cause analysis
– Event correlation
Anomaly Explanation

(Figure: anomaly detection against the Model is followed by anomaly analysis and explanation (EBL), explaining both the attack and the normal behavior.)
Explanation

(Figure: explanation via experiments, observation, or simulation, compared against the model; conflicts → anomalies.)
(Figure: observed system events feed SBL-based anomaly detection; model-based event analysis and example selection feed Explanation-Based Learning, which updates the Model and produces analysis reports.)
AND → EXPAND

• Anomaly Detection
– Detect
– Analysis and Explanation
– Application