Download Integration of Deduction and Induction for Mining Supermarket

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Nonlinear dimensionality reduction wikipedia , lookup

Transcript
,QWHJUDWLRQRI'HGXFWLRQDQG,QGXFWLRQIRU0LQLQJ
6XSHUPDUNHW6DOHV'DWD
)RVFD*LDQQRWWL*LXVHSSH0DQFR
0LUFR1DQQL'LQR3HGUHVFKL)UDQFR7XULQL
CNUCE – CNR
Via S. Maria 36, I-56126 Pisa, Italy
F.Giannotti@cnuce.cnr.it,
G.Manco@cnuce.cnr.it
Dipartimento di Informatica, Università di Pisa
Corso Italia 40, I-56125 Pisa, Italy
nnanni@di.unipi.it, pedre@di.unipi.it,
turini@di.unipi.it
$EVWUDFW
Datasift, a prototype system for the analysis of supermarket sales data based
on data mining techniques, is presented. In market basket analysis, mining
knowledge on customer behavior, which is actually useful to support
marketing actions, is a difficult task, which requires non-trivial methods of
employing and combining the data mining tools. To this purpose, we
propose an architecture that integrates the deductive capabilities of a logicbased database language, LDL++, with the inductive capabilities of diverse
data mining algorithms and tools, notably association rules. This paper
presents an application of the integrated system to market basket analysis,
and illustrates the high degree of expressiveness reached in dealing with
high-level business rules.
027,9$7,216$1'2%-(&7,9(6
Market basket analysis is rapidly becoming a key factor of success in the highly competing scene
of big supermarket retailers. The reason of this fact lies in the consideration that a clear-cut
knowledge of customers and their purchasing behavior brings potentially huge added value to
retail companies. Consider the chart of fig.1 (source: Databank), which shows the population of
the light, medium and top classes of customers, compared to the economical relevance of the
same classes. As an example, we can observe that less of 20% of customers are worth 80% of the
company’s revenue: a precise characterization of the high-spending classes of customers, as well
as other classes and niches, is therefore an essential piece of knowledge to develop effective
marketing strategies.
/LJKW
0HGLXP
KRZPDQ\FXVWRPHUV
7RS
KRZPXFKWKH\VSHQG
)LJXUHDistribution of Purchases.
However, knowledge on purchasing behavior can hardly be discovered using only traditional
aggregates or OLAP tools. Any analysis based on aggregated information on sold/delivered
goods abstracts away from the essential information contained in each basket – the cash register
transaction which represents the collection of items in a single purchase of a customer. A more
effective approach is based on data mining algorithms and tools, notably association rules, which
allow discovering relations between items purchased in the same basket [3, 2]. A famous example
of association rule mined from supermarket sales data is that beer and diapers are often likely to
be together in the same basket [5]. This form of information is more useful for describing the
different purchasing habits of customers. Still, however, association rules are often too low-level
to be directly used as a support of marketing decisions. Market analysts expect answers to more
general questions, such as “Is supermarket assortment adequate for the company’s target class of
customers?” “Is a promotional campaign effective in establishing a desired purchasing habit in
the target class of customers?” These are business rules, and association rules are necessary,
albeit insufficient, basic mechanisms for their construction. Business rules require also the ability
of combining association rule mining with deduction, or reasoning: reasoning on the temporal
dimension, reasoning at different levels of granularity of products, reasoning on association rules
themselves.
The objective of this paper is precisely to demonstrate how a suitable integration of deductive
reasoning, such as that supported by logic database languages, and inductive reasoning, such as
that supported by association rules, provides a viable solution to many high-level problems in
market basket analysis. We briefly present Datasift, a prototype system for the analysis of
supermarket sales data, based on an architecture that integrates the deductive capabilities of a
logic-based database language, LDL++ [1], with the inductive capabilities of diverse data mining
algorithms and tools, notably association rules. The effectiveness of the proposed system in
dealing with the various aspects of market basket analysis is discussed through a series of
examples, covering data preparation, rule extraction, postprocessing, and business rules. The
application has been developed in collaboration with COOP Toscana Lazio, a division of one of
the major Italian supermarket companies.
The deductive/inductive approach taken in this paper exhibits similarities with the various
proposals of integration of data mining with databases by extending SQL to support mining
operators. For instance, the query language DMQL [8] extends SQL with a collection of
operators for mining different forms of rules. Other similar approaches include the M-SQL
proposal [9], the mine operator of [11], as well as the query flocks proposal of [13]. However, we
argue that combining association rules with the deductive capabilities of a logic query language
yields a higher degree of expressiveness and flexibility, which is crucial in addressing problems
such as those posed by market basket analysis. The examples discussed later in this paper show
how this extra expressiveness is essentially due to the fact that deductive rules provide a uniform
logic-based notation for data, queries, constraints, association rules and business rules, which
support sophisticated forms of reasoning, and helps in bridging the gap between association rule
mining and market basket analysis.
Inductive
Actions
Deductive
Actions
Knowledge
Base
DMBS
)LJXUHintegration of deduction and induction for intelligent data analysis.
29(59,(:2)6<67(0$5&+,7(&785(
The Datasift system adopts a hierarchical client-server web-based architecture, in which:
• the client component (8VHU ,QWHUIDFH) queries the knowledge base (by mean of the query
engine) and builds the suitable representation metaphors (either graphical or textual);
• a first-level server component (4XHU\(QJLQH) maintains the knowledge base, by integrating
multiple heterogeneous data sources and by using the other server components to update the
local database by means of the available analysis services;
• a second-level server component consisting of the data mining modules and the integrated
inductive-deductive query language;
• a third-level server component supporting the access to external databases (DB2 Universal
Database Server and Oracle 8 Server.)
&OLHQW$JHQW
The main role of the client component (8VHU ,QWHUIDFH) is to build suitable representation
metaphors. Two kinds of possible client interactions are provided: through a web browser (that
provides a textual representation of the results), and through a Microsoft Excel application (that is
used for both textual and graphical representation). In both cases, the interaction with the server
is realized by HTTP connections.
User Inteface
Query Engine
DBMS
Data Mining
Modules
LDL++
Server
Inductive
Modules
)LJXUHarchitecture of Datasift.
The interaction with the query engine is provided by means of CGI scripts, that provide the
requested data. The following classes of possible interactions are supported in the current version
of the prototype:
• Extraction of association rules from a database.
• Computation of time series and time evolution of association rules.
• Computation of association rules on the basis of economical relevance.
• Clustering.
We chose to develop a web-based interaction for two main reasons:
1. It is platform-independent, and allows different clients to run on different hosts.
2. It allows the use of standard components, such as the HTTP protocol for the communication
between client and server, thus making it easier to develop an interface.
)LJXUHWeb interface (main page).
An example snapshot of the current user interface is given in fig.4. On the left side a list of the
available analyses is provided. This interface is trivially extendible e.g. by inserting java applets,
which provides some navigation capabilities over output data.
An example of user interface for the input of parameters for the analysis is the following, used in
the computation of association rules (whose meaning will be made clearer in later sections).
)LJXUHParameters input interface (association rules page)
However, a more accurate representation metaphor is currently implemented by means of a
Microsoft Excel application that, in addition to the web-based interaction capabilities previously
described, allows us to use standard representation metaphors, whose flexibility is well known. In
this perspective, the Excel application simply downloads the results of the analyses in some pivot
tables, and builds standard visualization metaphors on the basis of such tables.
6HUYHU$JHQW
The 4XHU\(QJLQH is an interface to the clients via Web, whose role is to maintain
• a set of relevant statistics over the main database, obtained by directly querying the data, and
• the results of the rule extraction queries committed to the deductive-inductive modules.
The query engine plays a crucial role in the implementation of caching strategies, especially
when the analyses are related to massive amount of data. Notice that the maintenance of a local
database of results is very important in order to implement incremental mining refinement
techniques [4], that is planned as a future extension.
In order to populate the database of the current rules, the query engine dispatches to the relative
modules the necessary extraction queries. These include specialized data mining algorithms,
which do not need preprocessing/refinement steps, and the Deductive-Inductive module, based on
a deductive database language.
Deductive databases are database management systems whose query languages and storage
structures are designed around a logical model of data. The underlying technology is an extension
to relational databases that increases the power of the query language. Among the other features,
the rule-based extensions support the specification of queries using recursion and negation.
In Datasift we adopted the LDL++ deductive database system, which provides, in addition to the
typical deductive features, a highly expressive query language [6, 15] obtained by introducing the
non-deterministic choice operator and XY-stratified negation [14], still preserving its declarative
nature [7].
A general way of dealing with data mining in a deductive framework is to directly define queries
LDL++, that is particularly appropriate as a general purpose description language. For example,
in order to define a typical (two-level) association rule query, the following LDL++ toy program
can be defined:
SDLU,,FRXQW7!←EDVNHW7,EDVNHW7,
UXOHV,,←SDLU,,&&≥
(1)
The first rule generates and counts all the possible pairs, and the second one selects the pairs with
sufficient support (i.e., at least 2). For example, with the following definitions of EDVNHW,
EDVNHWILVK
EDVNHWEUHDG
EDVNHWEUHDG
EDVNHWPLON
EDVNHWRQLRQV
EDVNHWILVK
EDVNHWEUHDG
EDVNHWRUDQJH
EDVNHWPLON
By querying UXOHV;< we obtain the answers UXOHVPLONEUHDG,
UXOHVEUHDGPLON and UXOHVILVKEUHDG and UXOHVEUHDGILVK.
From the expressiveness viewpoint, the LDL++ language is flexible enough to let define a degree
of interestingness of the rules based not necessarily on the notions of support and confidence (as
usually defined in the standard frameworks). Better, other interesting measures can be defined,
such as financial measures or general rules that are not intended to model frequent marginals in a
discrete (multinomial) probability distribution. For example, if we are interested in discovering
the associations between the cities and the products where the sales decreased of more than 30%
w.r.t. the average, we can define the following query:
DYHUDJHDYJ6DOHV!←VDOHV&LW\3URGXFW'DWH6DOHV
DYJBFS&LW\3URGXFWDYJ6DOHV!←
VDOHV&LW\3URGXFW'DWH6DOHV
DQVZHU&LW\3URGXFW←DYHUDJH$DYJBFS&LW\3URGXFW3
3≥×$
The first rule computes the average on the whole sales. The second rule computes the averages
related to the tuples &LW\3URGXFW! , and the third rule selects the relevant rules.
In addition to the above-exemplified features, the LDL++ system has an open architecture, able
to interface with a variety of external systems and components. This feature reveals very useful
for integrating the core language with new functionalities, such as induction algorithms.
In particular, we propose a technique that allows using specialized algorithms (and hence
specialized data structures), but still allows an easy integration with the language. The proposed
architecture considers some inductive modules as aggregates, when the formal specification of
the mining task is compatible with such hypothesis. For example, in the case of association rules,
which we consider in this paper, we write rules of the form:
UXOHVSDWWHUQVPLQBVXSSPLQBFRQI;«;Q!←T;«;P
where the aggregate SDWWHUQV is implemented with some typical algorithm for the computation
of the association rules, and PLQBVXSS and PLQBFRQI are required to be bound. The predicate
UXOHV is then instantiated by quadruples /56& where / and 5 represent the rule /⇒5
with (normalized) support 6 and (normalized) confidence &, such that 6≥PLQBVXSS and &≥
PLQBFRQI.
The following program that computes the two-dimensional association rules gives a
straightforward example:
UXOHVSDWWHUQV,,!←EDVNHW7,EDVNHW7,
(2)
Here, by querying UXOHV;<6&, we obtain the following tuples:
UXOHVPLONEUHDG UXOHVEUHDGPLON
UXOHVILVKEUHDG UXOHVEUHDGILVK
In the last example, differently from the previous one, an ad hoc induction algorithm directly
computes the patterns, and the deductive engine has the only task of exchanging the data to the
inductive engine on demand.
The main advantage of the approach is that an external ad hoc induction algorithm computes the
patterns (by applying ad hoc optimizations), while the deductive engine has the only task of
exchanging the data with the inductive engine on demand. It is important to stress that the use of
logic covers representation of datasets, algorithm invocation, and result representation, while the
algorithmic issues are left to the external induction tools. Rule (2) computes association rules over
the selected columns of the view defined in the right part of the rule, by using an ad hoc
algorithm (such as, e.g., Apriori [3]). Such an approach is quite different from that exemplified by
rule (1), where the algorithmic issues are specified at a higher abstraction level, but with obvious
performance decay.
$0$5.(7%$6.(7$1$/<6,6$33/,&$7,21
We now instantiate the proposed deductive/inductive system in the specific domain of market
basket analysis. After discussing the database schema, we concentrate on the data preparation, or
pre-processing, phase, the rule extraction phase, and the post-processing phase. We finally see
how these phases can be combined to perform more complex analysis, which we call EXVLQHVV
UXOHV.
'DWDEDVHVFKHPD
Figure 6 illustrates the entities involved in the domain of market basket analysis. The main entity
is the cash-register transaction, corresponding to a basket of items (products) purchased by a
customer at a given time in a given location (store). Time, location and, if available, purchaser are
dimensions which allow multiple granularities and aggregations. Analogously, items are also
organized into hierarchies: single products are grouped into families, families are grouped into
sectors, sectors are grouped into departments, and so on.
Category Attrs.
Time Dimension
Cash Transactions
Many Time Attributes
Category Key
Product Dimension
Product Key
Purchaser Dimension
Purchaser Attributes
Location Dimension
Purchaser Key
Purchaser
Key
Location Key
Transaction Key
Measures
Total
Country
Location Key
Unit
Sales
unit_sales
Region
Location Key
)LJXUHdatabase schema
3UHSURFHVVLQJGDWDSUHSDUDWLRQ
By exploiting the deductive framework, we can easily express many useful queries that partition
and group data in order to focus the analysis on a particular subset and/or a particular view:
•
6HOHFWLRQZUWYDOXHVGLPHQVLRQV.
We can partition data in segments, according to some specified values or in order to
enhance/hide some particular properties of a specific dimension. For example the following
rules specify a partition w.r.t. a time ontology (where the LQWHUYDO facts specify a
predefined granularity we are interested to).
LQWHUYDOODEHO
LQWHUYDOODEHOQ
SDUWLWLRQ/DEHO7LG,WHP←
FDVKBWUDQVDFWLRQ7LG'DWH«,WHP
LQWHUYDO/DEHO6WDUW(QG
6WDUW≤'DWH'DWH≤(QG
Analogously, we can cut off from the transactions a set of uninteresting items specified by
some constraints (e.g., ,WHP≠SODVWLFEDJ).
•
+LHUDUFKLHV.
Usually, single items in a retail context are grouped into categories, such as RUDQJHV, ELVFXLWV
and PLON, which in turn are grouped in super-categories, such as IUXLW, EDNHU\ and PLON DQG
GHULYDWLYHV, and so on. In general, hierarchies allow the creation of abstraction of items, and
consequently of the transactions built on such items. The discovered information can be
generalized/instantiated, by navigating along such hierarchy. Additional significant
knowledge can be obtained by comparing generalizations with instantiations [12].
An abstraction can be easily modeled in a logic-based language, as the following example
shows. Suppose the relation FDWHJRU\;< expresses the membership of ; to category
<:
FDWHJRU\RUDQJHIUXLW
FDWHJRU\EDQDQDIUXLW
FDWHJRU\IUXLWIRRG
FDWHJRU\IRRGDOO
LWHPVBDEVWUDFWLRQ7LG,WHP←EDVNHW7LG,WHP
LWHPVBDEVWUDFWLRQ,7LG$EV,WHP←
LWHPVBDEVWUDFWLRQ,7LG,WHP
FDWHJRU\,WHP$EV,WHP
Now, we can mine knowledge (in our context, association rules) at a given abstraction level , by
focusing on LWHPVBDEVWUDFWLRQ,«.
Another way of exploiting the existing hierarchy is to focus the analysis to a given category of
products. We can obtain this by simply selecting the items that fall in the category, or in a subcategory which in turn falls in the first one. In the following example we define an LVBD relation
as the transitive closure of FDWHJRU\, and then exploit it in order to focus on the IUXLW
category:
LVBD;<←FDWHJRU\;<
LVBD;<←FDWHJRU\;=LVBD=<
IRFXVHGBGDWDVHW7LG,WHP←EDVNHW7LG,WHP
LVBD,WHPIUXLW
5XOHH[WUDFWLRQ
Once the dataset is ready to be mined, we can apply the new mining aggregation operator. We
adopt a slight modification of the schema presented in sect.2.2, in order to deal with itemsets of
general size:
WUDQVDFWLRQBVHWV,G,WHP!←EDVNHW,G,WHP
UXOHVSDWWHUQVPLQBVXSSPLQBFRQI , 7VHW!←
WUDQVDFWLRQBVHWV,G7VHW
The first rule collects all the baskets from the EDVNHW relation. The second rule extracts the
relevant rules from the collection of baskets, by applying the induction algorithm. The picture in
fig.7 displays the visualization performed by the client engine on a sample dataset. The bar chart
represents the one-to-one association rules, where the height of the bar is proportional to the
support of the rule.
4
3,5
3
2,5
6XSSRUW
2
1,5
1
0,5
Butter
Tuna Fish
Chocolate
Milk
Sugar 2
Bread Specialties
Tinned Tomatoes
Pasta
/HIW
Christmas Cake
Lemon
0
5LJKW
)LJXUHvisualization of Association Rules.
3RVWSURFHVVLQJ
Since the computed association rules are tuples of a relation, we can manipulate them in the usual
ways. We can then, for example, select only the one-to-one rules that involve items from the
same category:
ORFDOBUXOHV/HIW5LJKW6XSS&RQI←
UXOHV^/HIW`^5LJKW`6XSS&RQI
FDWHJRU\/HIW&DWHJRU\
FDWHJRU\5LJKW&DWHJRU\.
%XVLQHVVUXOHV
By a tighter coupling of the data preparation and post-processing steps, we can specify business
high-level rules in a straightforward way. We discuss below some interesting examples.
•
:KLFKUXOHVVXUYLYHGHFD\XSRUGRZQWKHSURGXFWKLHUDUFK\"
To extract the rules which are preserved in an abstraction step, we can compute rules
separately at each abstraction level, and select those which occur both at a level , and at level
,:
LWHPVHWBDEVWUDFWLRQ,7LG,WHP!←
LWHPVBDEVWUDFWLRQ,7LG,WHP
UXOHVBDWBOHYHO,SDWWHUQ6&,WHPVHW!←
LWHPVHWBDEVWUDFWLRQ,7LG,WHPVHW
SUHVHUYHGBUXOHV/HIW5LJKW←
UXOHVBDWBOHYHO,/HIW5LJKWBB
UXOHVBDWBOHYHO,/HIW5LJKWBB
VHWBLVBD/HIW/HIWVHWBLVBD5LJKW5LJKW
•
Here, VHWBLVBD is a relation between sets which checks whether each element in the first
argument is in a LVBD relation with one element in the second argument. As an example, we
could have the rule 2UDQJHV→6SDJKHWWL as answer if it holds and also its abstraction
)UXLW→3DVWDholds
:KDWKDSSHQVDIWHUSURPRWLQJVRPHSURGXFW?
We can give an answer by finding those rules which have been established by the promotion,
i.e. rules which did not hold before the promotion, which were raised during the promotion
and persisted after the promotion:
LQWHUYDOEHIRUH∞
LQWHUYDOSURPRWLRQ
LQWHUYDODIWHU∞
LWHPVHWVBSDUWLWLRQ/DEHO7LG,WHP!←
SDUWLWLRQ/DEHO7LG,WHP
UXOHVBSDUWLWLRQ/DEHOSDWWHUQ6&,WHPVHW!←
LWHPVHWVBSDUWLWLRQ/DEHOB,WHPVHW
SUHVHUYHGBUXOHV/HIW5LJKW←
UXOHVBSDUWLWLRQSURPRWLRQ/HIW5LJKWBB
¬UXOHVBSDUWLWLRQEHIRUH/HIW5LJKWBB
UXOHVBSDUWLWLRQDIWHU/HIW5LJKWBB
•
+RZGRUXOHVFKDQJHDORQJWLPH"
One way to answer is given by the rules computed separately on each time interval, as
explained in the previous subsections. Another, slightly different, consists of computing rules
valid in the whole dataset and then checking their support and confidence in the intervals. We
then obtain a description of the evolution of rules in time, which can be used for instance to
check whether a rule holds uniformly along time or has some peak in an interval, or shows
some kind of periodicity.
FKHFNBVXSSRUW6HW/DEHOFRXQW7LG!←
LWHPVHWVBSDUWLWLRQ/DEHO7LG,WHPVHW
VXEVHW6HW,WHPVHW
UXOHVBHYROXWLRQ/HIW5LJKW/DEHO6XSS&RQI←
UXOHV/HIW5LJKWBB
XQLRQ/HIW5LJKW$OO
FKHFNBVXSSRUW$OO/DEHO6XSS
FKHFNBVXSSRUW/HIW/DEHO/6XSS
&RQI 6XSS/6XSS
In figure 8 for example, we see that the rule 3DVWD→)UHVK &KHHVH holds in the whole
database, but in a not uniform way, since it has many peaks (in particular a positive one on
December 1st 1997 and a negative one on December 2nd).
35
30
25
6XSSRUW
20
Pasta => Fresh Cheese 14
15
Bread Subsidiaries => Fresh Cheese 28
Biscuits => Fresh Cheese 14
10
Fresh Fruit => Fresh Cheese 14
Frozen Food => Fresh Cheese 14
5
05/12/97
04/12/97
03/12/97
02/12/97
01/12/97
30/11/97
29/11/97
28/11/97
27/11/97
25/11/97
26/11/97
0
)LJXUHvisualization of rules evolution.
•
:KLFKUXOHVLQYROYHWKHJUHDWHVWHFRQRPLFDOYDOXH"
We can compute a PRQHWDU\VXSSRUW for each rule, defining it as the sum of the totals of the
transactions to which the rule applies.
PRQH\BVXSSRUW/HIW5LJKWVXP7RWDO!←
UXOHV/HIW5LJKWBBLWHPVHWV7LG,WHPVHW
XQLRQ/HIW5LJKW$OOVXEVHW$OO,WHPVHW
FDVKBWUDQVDFWLRQ7LG«7RWDO«
An example of result is shown in figure 9, where rules are named by means of numbers. As
we can see, rule 1 seems to be much more interesting of rule 21, since the former involves
purchases for more than twice the amount of the latter.
1400000
1200000
1000000
800000
)LQDQFLDO9DOXH
600000
400000
200000
0
1
2
3
4
5
6
7
8
9
10 11 12 13 14 15 16 16 17 18 20 21
5XOH
)LJXUHfinancial support of association rules.
&21&/86,216
In this paper we briefly presented a system for the analysis of supermarket sales data, and
illustrated its functionalities in addressing high-level business rules for market basket analysis. At
the present stage, the Datasift system is a prototype, which served as a demonstrator to validate
with the end user COOP the viability of the approach. The system revealed as an effective tool
for the market analysts, which, using the predefined business rules, is able to visualize the answer
according to some adequate graphical metaphors, within standard desktop tools. Moreover, we
are experiencing that an expert knowledge engineer is generally able to translate quickly the
analyst’s questions into deductive/inductive rules.
A problem in the current prototype is the extensibility of the user interface. While, in fact, it is
simple to add new analyses to the deductive-inductive component in a modular way (and to
report such additions to the query engine), there is not a uniform way of describing a
corresponding representation metaphor. Hence, each time a new analysis (e.g., a business rule) is
added, the user interface needs to be modified in order to support such analysis.
Another main concern is efficiency, a typical problem with deductive database systems, such as
LDL++, and a crucial problem when analyzing massive amounts of data. For this reason, in the
planned future engineering of the Datasift system, we intend to adopt suitable data reduction
techniques.
$&.12:/('*(0(176
We are indebted with our collaborators at COOP Toscana Lazio for their support during the
Datasift project: Massimo Palla, Giuseppe Lallai and Walter Fabbri. Thanks are also owing to
Nando Gallo from Intecs Sistemi, Pisa, for his help in designing the visualization component. The
project reported in this paper has been supported by Regione Toscana under a grant "Rete
Telematica ad Alta Tecnologia".
5()(5(1&(6
[1] N.Arni, K.Ong, S.Tsur, C.Zaniolo. LDL++: A Second Generation Deductive Databases
Systems. Technical report, MCC Corporation, 1993.
[2] R.Agrawal, S.Sarawagi, S.Thomas. Integrating Association Rule Mining with Relational
Database Systems: Alternatives and Implications. In Procs. of ACM-SIGMOD’98, 1998.
[3] R.Agrawal, R.Srikant. Fast Algorithms for Mining Association Rules. In Procs. of 20th Int’l
Conference on Very Large Databases, 1994.
[4] E.Baralis, G.Psaila. Incremental Refinement of Association Rule Mining. In Procs. of
SEBD’98, 1998.
[5] M. J. A. Berry, G.Linoff. Data Mining Techniques for Marketing Sales, and Customer
Support. Wiley, 1997.
[6] F. Giannotti, G. Manco, M. Nanni, D. Pedreschi. Query Answering in Nondeterministic,
Nonmonotonic, Logic Databases. In Procs. of the Workshop on Flexible Query Answering,
number 1395 in LNAI, 1998.
[7] F. Giannotti, G. Manco, M. Nanni, D. Pedreschi. On the Effective Semantics of Temporal,
Non-monotonic, Non-deterministic Logic Languages. In Procs Int’l Conference on
Computer Science Logic, LNCS, 1998.
[8] J.Han, Y.Fu, K.Koperski, W.Wang, O.Zaiane. DMQL: A Data Mining Query Language for
Relational Databases. In SIGMOD'96 Workshop on Research Issues on Data Mining and
Knowledge Discovery (DMKD'96), 1996.
[9] T.Imielinski, A.Virmani, A.Abdulghani. Discovery Board Application Programming
Interface and Query Language for Database Mining. In Procs.2nd Int’l Conference on
Knowledge Discovery and Data Mining. 1996.
[10] R.Kimball, K.Strehlo. Why Decision Support Fails and How to Fix it. SIGMOD Record,
24(3), 1995.
[11] R.Meo, G.Psaila, S.Ceri. A Tightly-Coupled Architecture for Data Mining. In International
Conference on Data Engineering (ICDE98), pages 316-323, 1998.
[12] R.Srikant, R.Agrawal. Mining Generalized Association Rules. In Procs. of the 21st Int'l
Conference on Very Large Databases, 1995.
[13] D.Tsur and others. Query Flocks: A Generalization of Association-Rule Mining. In Procs.
of ACM SIGMOD’98, pages 1-12, 1998.
[14] C.Zaniolo, N.Arni, K.Ong. Negation and Aggregates in Recursive Rules: The LDL++
Approach. In Procs. 3rd Int. Conf. on Deductive and Object-Oriented Databases
(DOOD93), volume 760 of LNCS, 1993.
[15] C.Zaniolo, H.Wang. Logic-Based User-Defined Aggregates for the Next Generation of
Database Systems. In The Logic Programming Paradigm: Current Trends and Future
Directions. Springer Verlag, 1998.