Web Data Mining
Speaker: Ricardo Baeza-Yates – Palo Alto, CA, United States
Topic(s): Information Systems, Search, Information Retrieval, Database Systems, Data Mining, Data Science
Abstract
The Web continues to grow and evolve very fast, changing our daily lives. This activity represents the collaborative work of the millions of institutions and people that contribute content to the Web as well as the three billion people that use it. In this ocean of hyperlinked data, there is explicit and implicit information and knowledge. Web mining is the task of analyzing this data and extracting information and knowledge for many different purposes. The data comes in three main flavors: content (text, images, etc.), structure (hyperlinks), and usage (navigation, queries, etc.), which call for different techniques such as text, graph, or log mining. Each case reflects the wisdom of some group of people that can be used to make the Web better, for example, user-generated tags on social media sites. The tutorial covers (a) the main concepts behind Web mining, the different data found on the Web, and typical applications; (b) the mining process: data collection, data cleaning, data warehousing, and data analysis, including crawling in the case of content mining and privacy issues in the case of usage mining; (c) the main techniques used for the different data types; and (d) use cases of the three types (content, structure, and usage mining), ranging from website design to search engines.
About this Lecture
Number of Slides: 300+
Duration: 60 - 180 minutes
Languages Available: English, Portuguese, Spanish
Last Updated:
Request this Lecture
To request this particular lecture, please complete this online form.
Request a Tour
To request a tour with this speaker, please complete this online form.
All requests will be sent to ACM headquarters for review.