Towards a Practical Approach to Discover Internal Dependencies in Rule-Based Knowledge Bases
Roman Simiński, Agnieszka Nowak-Brzezińska,
Tomasz Jach, and Tomasz Xiȩski
University of Silesia, Institute of Computer Science
ul. Bȩdzińska 29, 41-200 Sosnowiec, Poland
{roman.siminski,agnieszka.nowak,
tomasz.jach,tomasz.xieski}@us.edu.pl
Abstract. In this paper we introduce the concept of discovering knowledge about rules stored in large rule-based knowledge bases, both generated automatically and acquired from human experts in the classical way. The paper presents a preliminary study of a new project in which we combine two approaches: the hierarchical decomposition of large rule bases using cluster analysis and the decision units concept. Our goal is to discover useful, potentially implicit and not directly readable information from large rule sets.
Keywords: rule knowledge bases, inference, decision unit, cluster analysis, data mining.
1 Introduction
The last decade has brought significant development of data mining methods, tools and applications. Data mining and knowledge discovery in databases is the process of automatically searching large volumes of data for useful patterns, typically rules [1, 2]. We assume that the result of data mining is a set of rules, hereinafter referred to as a knowledge base. From our point of view the resulting knowledge base is more interesting than the process of acquiring rules from data, although we usually have in mind automatic rule generation methods based on rough set theory [3–5].
Once rule sets have been induced from data, they can be applied in several ways, for example to classify unseen cases in classifiers or to build the knowledge base of a decision support system. In many applications the process of rule induction is in fact the discovery of previously unknown and potentially surprising knowledge. Several hundred, and sometimes thousands, of rules are a frequent result of a data mining process applied to real-world data sets. The dependencies between the rules in such large rule sets are typically neither simple nor clear.
An interesting paradox arises: data mining, which was meant to bring out knowledge hidden in large databases, often discovers and provides large sets of rules
containing knowledge that is hard to understand and directly useless for domain experts. Thus, in many cases the methods of "the nontrivial extraction of implicit ... and potentially useful information from ... large data sets" produce large rule sets with nontrivial, probably useful, but unreadable knowledge for the human experts who attempt to verify it.
The main goal of this paper is to introduce the concept of discovering knowledge about rules stored in large rule-based knowledge bases, both generated automatically and acquired from human experts in the classical way. The paper presents a preliminary study of a new project in which we combine two approaches: the hierarchical decomposition of large rule bases using cluster analysis and the decision units concept. The proposed approach assumes that we reorganize any attributive rule knowledge base from a set of unrelated rules into groups of similar rules (using cluster analysis) [8] or decision units [6]. These structures will be a source for extracting relevant information about the rules. Such information helps in understanding the internal structure of the knowledge base, as it describes the knowledge about the knowledge stored in the rule sets. We can use it to extract the decision model implicitly stored in the rule sets, as well as to formally verify and validate the rules against expert knowledge.
We plan to build a modular, hierarchically organized rule base using the cluster analysis method and decision units. These methods have been successfully used in the optimization of the inference task [6] thanks to the analysis of the internal properties discovered in the rule sets [8, 12]. In this work we introduce an extension of clustered rules and decision units oriented towards the extraction of additional knowledge from rule-based knowledge bases. The practical goal of the project is the implementation of a visually oriented software tool for knowledge engineering: a new, second version of the kbBuilder system [12] and the HKB_Builder system [8].
2 Preliminaries and Basic Notation
The proposed approach is dedicated to rule knowledge bases containing Horn clause rules whose literals are coded as attribute-value pairs. This form of rule representation is very popular and widely used in data mining. Let KB be the rule knowledge base containing m rules, KB = {r1, r2, . . . , rm}, where each rule has the form r : l1 ∧ l2 ∧ . . . ∧ ln → c, n is the number of conditional literals in the rule r, li denotes the i-th conditional literal of the rule r and c denotes its conclusion literal. Let A = {a1, a2, . . . , an} be a non-empty finite set of conditional and decision attributes, and for every a ∈ A let Va = {v1^a, v2^a, . . . , vk^a} denote the value set of a.
Each attribute a ∈ A may be a conditional and/or a decision attribute. Literals of the rules from KB are pairs (a, vi^a), where a ∈ A and vi^a ∈ Va, and the notations (a, vi^a) and a = vi^a are treated as equivalent. We make no assumptions about the rules in KB. This set may be the result of a data mining process as well as the work of a human expert or a knowledge engineer; it is therefore possible
that KB contains errors and anomalies. We will use such a rule set for forward and backward inference, which will probably create deep inference paths.
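To make the notation concrete, the following minimal Python sketch (our own illustration, not code from the systems discussed later; the class and attribute names are assumptions) shows one possible representation of such attribute-value rules:

    from dataclasses import dataclass
    from typing import List, Tuple

    # A literal is an attribute-value pair (a, v), written a = v in the text.
    Literal = Tuple[str, str]

    @dataclass(frozen=True)
    class Rule:
        """Horn clause rule: l1 AND l2 AND ... AND ln -> c."""
        conditions: Tuple[Literal, ...]   # conditional literals l1, ..., ln
        conclusion: Literal               # conclusion literal c

    # A small illustrative rule base KB = {r1, r2, ...}
    KB: List[Rule] = [
        Rule((("fever", "high"), ("cough", "yes")), ("disease", "flu")),
        Rule((("fever", "no"),), ("disease", "none")),
    ]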
3 Decision Units as a Decision Model for Rule Base
The concept of decision units originally came into existence as a tool for rule base verification; decision units can also be considered a simple tool for rule knowledge base modeling. The concept of a decision unit is described in [10–12] and, due to limited space, cannot be presented here in detail. The idea of decision units allows us to divide a set of rules into subsets according to a simple criterion: dividing by the elementary decision (the rules in a subset have exactly the same attribute-value pair (a, v) in the conclusion), dividing by the decision (the rules in a subset have exactly the same attribute a in the attribute-value pair (a, v) in the conclusion), or dividing by the general decision (grouped as above, but the information about the values appearing in the literals of the rules is not taken into account).
A decision unit is a triple (I, O, R), where R is the subset of rules (R ⊆ KB) created according to one of the three criteria described above, O denotes the set of output entries of the decision unit (the conclusions of the rules from R), and I denotes the set of input entries of the decision unit (the conditions of the rules from R) [12]. According to the selected rule-dividing criterion, we obtain three kinds of decision units [11]: elementary decision units, denoted as triples U^e = (I^e, O^e, R^e); ordinal decision units, denoted as triples U = (I, O, R); and general decision units, related to ordinal decision units but considering only information about attributes, denoted as triples U^g = (I^g, O^g, R^g). In the case of the first two kinds of decision units, elementary and ordinal, the sets I^e, I and O^e, O contain attribute-value pairs (a, v). For general decision units we consider only the attributes, therefore I^g and O^g are sets of attributes.
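As a rough illustration only (a sketch of ours, not code from the kbBuilder or HKB_Builder systems; the helper names are assumptions), the rules defined in Section 2 can be grouped into elementary and ordinal decision units by their conclusions:

    from collections import defaultdict

    def elementary_decision_units(kb):
        """Group rules sharing exactly the same conclusion pair (a, v)."""
        groups = defaultdict(list)
        for rule in kb:
            groups[rule.conclusion].append(rule)
        units = []
        for conclusion, rules in groups.items():
            inputs = {lit for r in rules for lit in r.conditions}   # I^e
            outputs = {conclusion}                                  # O^e
            units.append((inputs, outputs, rules))                  # (I^e, O^e, R^e)
        return units

    def ordinal_decision_units(kb):
        """Group rules sharing the same conclusion attribute a (values may differ)."""
        groups = defaultdict(list)
        for rule in kb:
            groups[rule.conclusion[0]].append(rule)
        units = []
        for _, rules in groups.items():
            inputs = {lit for r in rules for lit in r.conditions}   # I
            outputs = {r.conclusion for r in rules}                 # O
            units.append((inputs, outputs, rules))                  # (I, O, R)
        return units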
An attribute usually represents in the knowledge base a particular concept or property from the real world. The value of an attribute expresses a determined meaning of the concept or a state of the property. If we consider a specific (a, v) pair, we think about a particular kind of concept or about a particular property state. Because in the considered case the pair (a, v) is a conclusion literal, the attribute a is a decision attribute. On a high level of abstraction an elementary decision unit U^e is similar to the i-th decision class X_A^i in the decision table [4]. O^e represents such a concrete concept state, and R^e describes reasoning about it: we can confirm (or reject) a hypothesis about the concrete state using goal-driven backward inference, or we can check its availability based on currently known facts using data-driven forward inference. We can say that an elementary decision unit is a model of an elementary decision about a concrete state of a real-world concept. An ordinal decision unit can be considered a model of decision making about a particular concept or property from the real world. We consider decisions about concepts to be on a higher level of abstraction than decisions about the meanings of concepts. Thus the rule set R of an ordinal unit can be constructed as the composition of the rule sets of its elementary decision units U^e. A decision unit is similar to a decision table DT = (U, A ∪ {d}) with a single decision attribute d.
It is sometimes interesting to know on which attributes a particular attribute a depends. This is especially useful when we attempt to discover a hidden decision model in an extensive data set that is not known at an early stage. Connections between attributes can express relations between concepts. For this reason we introduce the general decision units U^g. The connections between conclusion and conditional literals are usually hidden in rule bases, and only during inference do we discover how deep the inference chains sometimes are. Such knowledge bases generate a number of decision units with connections between input and output entries; in this way we obtain a decision units net. An example of using the properties of decision units in the inference optimization task is presented in [6, 7]; related issues concerning rule base modeling are discussed in [12].
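One way to picture such a decision units net, as a hedged sketch under the assumption that a connection exists whenever an output attribute of one unit occurs among the input attributes of another, is the following fragment (reusing the unit tuples built in the previous sketch):

    def decision_units_net(units):
        """Link unit u -> w when some output attribute of u appears
        among the input attributes of w (a possible chaining point)."""
        edges = []
        for i, (_, outputs_u, _) in enumerate(units):
            out_attrs = {a for (a, _) in outputs_u}
            for j, (inputs_w, _, _) in enumerate(units):
                if i == j:
                    continue
                in_attrs = {a for (a, _) in inputs_w}
                if out_attrs & in_attrs:
                    edges.append((i, j))   # inference may chain from unit i to unit j
        return edges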
4 Rules Clusters as a Decision Model for Rule Base
Clustering techniques have to be used whenever the size of the input data exceeds the capabilities of traditional methods of data analysis. This is especially the case when we deal with a large number of rules in the knowledge base of a typical modern decision support system; there we should cluster the rules in order to increase the overall efficiency. Effective inference over such structures can be a tremendous task, usually overwhelming classical systems. This is why the authors try to find previously unknown connections between the rules which can help to create more effective systems.
At first the rules are organized into groups of similar rules. Clustering is considered optimal if each cluster consists of very similar rules and if different clusters are easily distinguishable from one another. There is a vast number of methods to achieve this goal, among them hierarchical and non-hierarchical clustering algorithms. Both try to divide the data set into clusters, but only the former produces a hierarchical structure of the rules. By organizing the rules into clusters, several additional benefits are achieved. First, additional information about the analyzed data is acquired. The groups of rules can be further analyzed in order to observe additional patterns, which can be used to simplify the structure of the knowledge base either by simplifying the rules or by reducing their number. Such information would not be found if the division into groups were not established.
Second, clustering the rules is a faster way of finding relevant rules in the knowledge base. Cluster analysis methods allow us to look for a relevant rule much faster than browsing through the entire knowledge base. Here the hierarchical methods are better, because a relevant rule can be found by traversing the tree of rules built by a hierarchical algorithm: AHC as well as mAHC [6–8]. Every level of the tree produces a more accurate answer and, additionally, the number of comparisons needed is much smaller than in classical systems. The concept of the new hierarchical model of the knowledge base is described in [6–8] and, due to the limits of this paper, cannot be presented in detail. Let us just recall that it is represented by Tree = {w1, . . . , w_{2n−1}} or Tree = {w1, . . . , wk}, where
k ≤ 2n − 1 (if we apply the mAHC algorithm, which creates a smaller tree). It is essential to note that each node of that tree, wi = (di, ci, f, i, j) (where i, j are the numbers of the clustered groups), is the representative of the conditional parts ci as well as the decisions di of the rules belonging to it. The similarity values f : KB × KB → [0, 1] of the rules forming the cluster wi with respect to each other are also stored. Clustering the rules from the knowledge base allows us to agglomerate a set of rules into subsets according to a simple criterion of their inner similarity. Each representative ci consists of all (or the most frequent) literals of the conditional parts of the rules clustered in cluster i, ci : l1 ∧ l2 ∧ . . . ∧ lk → dj (k is the number of conditional literals). At the first step of the algorithm, given the set of rules from the knowledge base KB, each rule ri ∈ KB represents a node wi in the created Tree structure. In the second step the two most similar rules (determined by comparing the sets of conditional attributes and their values) are found and linked into a cluster, which creates a new node at the higher level of the Tree. We may say that all clusters from non-leaf nodes represent some kind of concepts. That is why the description of the rules forming a cluster is just a conjunction of all (or the most frequent) pairs (a, v) from the conditional parts of those rules. The process ends when there is only one cluster with every rule from the system in it (AHC) or at a desired, previously set level (mAHC) [6, 8].
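The agglomerative scheme itself can be conveyed by a deliberately naive Python sketch; the Jaccard similarity and the single-linkage-style merging below are our assumptions for illustration only, not the measures or the implementation of the cited AHC/mAHC algorithms:

    def rule_similarity(r1, r2):
        """Jaccard similarity of the conditional attribute-value pairs."""
        s1, s2 = set(r1.conditions), set(r2.conditions)
        return len(s1 & s2) / len(s1 | s2) if s1 | s2 else 0.0

    def ahc(rules, max_merges=None):
        """Naive agglomeration: repeatedly merge the two most similar clusters;
        stop at a single cluster (AHC) or after max_merges steps (mAHC-like)."""
        clusters = [[r] for r in rules]          # each rule starts as a leaf node
        merges = 0
        while len(clusters) > 1 and (max_merges is None or merges < max_merges):
            best = None
            for i in range(len(clusters)):
                for j in range(i + 1, len(clusters)):
                    sim = max(rule_similarity(a, b)
                              for a in clusters[i] for b in clusters[j])
                    if best is None or sim > best[0]:
                        best = (sim, i, j)
            _, i, j = best
            clusters[i] = clusters[i] + clusters[j]   # new node at a higher level
            del clusters[j]
            merges += 1
        return clusters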
Having the rules organized into clusters shortens the time needed to find the relevant ones. Because the created structure is a kind of binary tree, the time complexity of searching it is O(log2 n). This means that the rule interpreter of a given decision support system does not have to search the whole knowledge base with a time complexity of O(n), as it does for classical knowledge bases. However, in order to have the hierarchical structure computed (needed for fast lookup of queries), the clustering algorithm has to be executed first. The time complexity of typical hierarchical algorithms varies, but is often in the O(n^2) class (note that this algorithm has to be executed only once).
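The logarithmic lookup can be pictured as a descent through the cluster tree, at each node following the child whose representative overlaps the known facts the most; the sketch below uses a hypothetical Node structure of our own, not the actual data structures of the cited systems:

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Node:
        representative: set                         # all (or most frequent) (a, v) pairs of the subtree
        rules: List = field(default_factory=list)   # rules kept in leaf nodes
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    def find_relevant_rules(node, facts):
        """Descend the binary cluster tree towards the more similar child."""
        while node.left is not None and node.right is not None:
            left_score = len(node.left.representative & facts)
            right_score = len(node.right.representative & facts)
            node = node.left if left_score >= right_score else node.right
        return node.rules    # candidate rules handed to the inference engine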
5 Conclusions
The main purpose of this study was to introduce the foundations of a practical approach to the discovery of internal dependencies in rule-based knowledge bases. We have presented the motivation and objectives of the project and then introduced the concept of representing knowledge using cluster analysis and decision units. We plan to build a modular, hierarchically organized rule base using these methods. Paraphrasing the classical definition of data mining, we introduce the term knowledge mining: the clusters of rules can be analyzed in order to discover hidden patterns in the knowledge base, and the decision units allow us to discover, verify and improve the current decision model. These methods have been successfully used in the optimization of the inference task. We believe that the techniques described in this paper will allow us to extract a new kind of metaknowledge from large knowledge bases. This information will be useful both in knowledge engineering and in the
utilisation of rule bases within decision support systems. In the first case we expect that the proposed approach will result in an improved quality of the knowledge base; in the second case, an increase in inference efficiency is expected. The practical goal of the project is the implementation of a visually oriented software tool for knowledge engineering: a new, second version of the kbBuilder system and the HKB_Builder system.
References
1. Frawley, W., Piatetsky-Shapiro, G., Matheus, C.: Knowledge Discovery in Databases: An Overview. AI Magazine, 213–228 (Fall 1992)
2. Hand, D., Mannila, H., Smyth, P.: Principles of Data Mining. MIT Press, Cambridge (2001)
3. Nguyen, H.S., Skowron, A.: Rough Set Approach to KDD (Extended Abstract). In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A., Yao, Y. (eds.) RSKT 2008. LNCS (LNAI), vol. 5009, pp. 19–20. Springer, Heidelberg (2008)
4. Skowron, A., Komorowski, H.J., Pawlak, Z., Polkowski, L.T.: A rough set perspective on data and knowledge. In: Handbook of Data Mining and Knowledge Discovery, pp. 134–149. Oxford University Press, Oxford (2002)
5. Moshkov, M., Skowron, A., Suraj, Z.: On Minimal Rule Sets for Almost All Binary Information Systems. Fundamenta Informaticae 80 (2007)
6. Nowak, A., Simiński, R., Wakulicz-Deja, A.: Towards modular representation of knowledge base. In: Advances in Soft Computing, pp. 421–428. Physica-Verlag, Springer Verlag Company, Heidelberg (2006)
7. Nowak, A., Simiński, R., Wakulicz-Deja, A.: Two-way optimizations of inference for rule knowledge bases. In: Proceedings of the International Conference CSP 2008, Concurrency, Specification and Programming, September 29–October 1, vol. 3, pp. 398–409. Humboldt-Universität, Berlin (2008)
8. Nowak, A., Wakulicz-Deja, A.: The way of rules representation in composited knowledge bases. In: Advances in Intelligent and Soft Computing, Man-Machine Interactions, pp. 175–182. Springer, Heidelberg (2009)
9. Nowak-Brzezińska, A., Jach, T., Xiȩski, T.: Wybór algorytmu grupowania a efektywność wyszukiwania dokumentów (The choice of a clustering algorithm and the effectiveness of document retrieval, in Polish). Studia Informatica, Zeszyty Naukowe Politechniki Śląskiej 31(2A(89)), 147–162 (2010)
10. Simiński, R., Wakulicz-Deja, A.: Verification of Rule Knowledge Bases Using Decision Units. In: Advances in Soft Computing, Intelligent Information Systems, pp. 185–192. Physica-Verlag, Springer Verlag Company, Heidelberg (2000)
11. Simiński, R., Wakulicz-Deja, A.: Decision units as a tool for rule base modeling and verification. In: Advances in Soft Computing, Information Processing and Web Mining, pp. 553–556. Springer Verlag Company, Heidelberg (2003)
12. Simiński, R.: Decision units approach in knowledge base modeling. In: Recent Advances in Intelligent Information Systems, pp. 597–606. Academic Publishing House EXIT, New Jersey (2009)