Evaluation in Web mining

Tutorial at ECML/PKDD 2004

Pisa, Italy; 20th September, 2004

Bettina Berendt, Myra Spiliopoulou, Ernestina Menasalvas

In conjunction with the Workshop on Statistical Approaches for Web Mining

http://www.wiwi.hu-berlin.de/~berendt/evaluation04/

 


Description

Web mining has become a critical tool for competitive application intelligence. Understanding the behavior of a site's visitors requires creative extensions of KDD techniques for e-commerce and clickstream data: patterns must be discovered from a variety of data sources, and these patterns must be interpreted and transformed into actionable knowledge for redesigns that bring revenue. Redesigns encompass general improvements to information architecture and navigation options, as well as the offering of personalized recommendations and services. At the same time, reliable discovery and interpretation of patterns cannot ignore the Web content itself. This leads to challenges in Web content mining, including text categorisation, content analysis and extraction of implicit semantics.

These issues are already broadly recognized: research on Web mining is intensive and, in some cases, goes hand in hand with deployment in the market. This leads to the challenge of incorporating Web mining into the internal evaluation processes of the site operator. Web mining can be used to derive indicators that describe marketing success, the appropriateness of distribution channel mixes, or other aspects of a site's or service's success. At the same time, Web mining itself constitutes a major investment and therefore needs to be subjected to a cost-benefit evaluation. Both of these aspects, "Web mining for evaluation" and "Evaluation of Web mining", require systematic methods and a context of project management. The owners of Web sites and Web applications need a complete evaluation framework in order to make well-informed decisions about the extent to which they use Web mining as a tool for data analysis and deploy its results in site and service design.

In this tutorial, we investigate the current state of Web mining evaluation from both viewpoints: evaluating a Web site and evaluating Web mining projects themselves. In particular, we address

The tutorial draws on the core domains of KDD, covering issues of data preparation, pattern discovery, and pattern analysis. We also draw on Web marketing, which contributes the requirements and the economic measures; on human-computer interaction for user-centric success evaluation; and on project management, which addresses the gaps that must be filled in order to evaluate the impact of a Web mining project and to measure its success.

Target Audience

ECML/PKDD participants with an interest in e-applications in general and in Web mining in particular. This group includes


OUTLINE OF THE TUTORIAL

Part I. Foundations and Principles of Web mining

Part II. Web mining as a project

Part III. Evaluation methods and measures

Part IV. Case study

      Evaluating the distribution policy of a multi-channel e-retailer

Part V. Infrastructure for Web mining deployment

Part VI. Outlook


Tutorial slides (PDF)

About the Organizers

Myra Spiliopoulou
Research group KMD: Knowledge Management & Discovery in Information Systems
Institute of Technical and Business Information Systems
Faculty of Computer Science
Otto-von-Guericke-Universitaet Magdeburg
PO Box 4120, D-39016 Magdeburg, Germany
http://omen.cs.uni-magdeburg.de/itikmd/Myra_Spiliopoulou.62.0.html


Myra Spiliopoulou is professor of business information systems in the Faculty of Computer Science of the Otto-von-Guericke-University Magdeburg. Her research spans the fields of knowledge discovery and knowledge management. In the area of knowledge discovery, she works on preparation, discovery and evaluation methods for web usage mining, on text mining and the extraction of semantics from implicitly structured texts, and on pattern maintenance and evolution. Her teaching curriculum includes courses on data mining and e-business. She has been co-chair of the web mining workshops WEBKDD'99, WEBKDD'2000, WEBKDD'01 and WEBKDD'02 of the ACM/SIGKDD conference series. She has presented tutorials on web mining in the ECML/PKDD conference series: the tutorial at ECML/PKDD'99 emphasised KDD methodologies, while the tutorial at ECML/PKDD'2000 focussed more on evaluation methods. The web mining tutorials at ECML/PKDD'01 and ECML/PKDD'02 were given in cooperation with Bamshad Mobasher (DePaul Univ. Chicago) and Bettina Berendt (HU Berlin) and focussed on personalisation and e-business applications. At ECML/PKDD'03, she was co-chair of the workshop European Web Mining Forum (EWMF'03). Under the auspices of the KDNet European Network of Excellence, she is co-organiser of the Web Mining Forum initiative, which brings together researchers on web content mining, web usage mining and Semantic Web mining.

Bettina Berendt
Institute of Information Systems
Humboldt-Universitaet zu Berlin
Spandauer Str. 1, D-10178 Berlin, Germany
http://www.wiwi.hu-berlin.de/~berendt


Bettina Berendt is Assistant Professor of Information Systems at Humboldt University Berlin. Her research interests include web usage mining, psychological methods of web navigation analysis, and visualization. She served as the director of "SchulWeb" (http://www.schulweb.de/), a large non-commercial German web server. Her teaching experience includes seminars and tutorials on Web mining, AI and Cognitive Science, and Visualization on the Web. She has been a co-organizer of the ECML/PKDD workshops Semantic Web Mining 2001, Semantic Web Mining 2002, and the First European Web Mining Forum (2003), as well as the AAAI workshop on Semantic Web Personalization (2004). Together with Bamshad Mobasher and Myra Spiliopoulou, she has presented tutorials on Web mining with emphasis on personalization and E-Business applications at ECML/PKDD in 2001 and 2002.

Ernestina Menasalvas
Departamento de Lenguajes y Sistemas Informaticos e Ingenieria del Sw
d. 4303. Facultad de Informatica
Universidad Politecnica de Madrid
Campus de Montegancedo
28660 Boadilla del Monte, Madrid, Spain
http://pluton.ls.fi.upm.es/~ernes


Ernestina Menasalvas is professor of Data Bases and DataWarehouse at the Facultad de Informática, Universidad Politecnica de Madrid, where she coordinates the data mining laboratory. Her current research includes web usage mining, data mining as an engineering process, cost estimation for data mining projects, and foundations of data mining. Her teaching experience includes courses on data mining and data warehousing, web mining, and engineering the process of data mining. She organized the first Atlantic conference on Web Intelligence (AWIC'03) and is co-chair of AWIC'04. Together with Myra Spiliopoulou and Bettina Berendt, she participated in the First European Web Mining Forum (2003).


last updated on 2004-10-19 by Bettina Berendt