Do your data behave gently with your machine learning algorithms? What if not?
Speaker: Swagatam Das – Kolkata, India
Topic(s): Artificial Intelligence, Machine Learning, Computer Vision, Natural Language Processing
Abstract
Many machine learning systems rely on implicit assumptions regarding the regularity of data. For instance, several classifiers assume that all classes have an equal number of representatives, that sub-concepts within classes are characterized by an equal number of representatives, and that all classes exhibit similar class-conditional distributions. Additionally, both classifiers and clustering methods presuppose that all features are defined and observed for every data instance. However, numerous real-world datasets violate one or more of these assumptions, resulting in data irregularities that can introduce unwarranted bias in learning systems or render them unsuitable for the data at hand.
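As a minimal sketch of the bias described above (with hypothetical, hard-coded data), the snippet below shows how a degenerate classifier that always predicts the most frequent class can look accurate on an imbalanced dataset while never detecting the minority class:

```python
# Minimal sketch (hypothetical data) of how class imbalance can bias a
# learning system: a majority-vote baseline achieves high overall accuracy
# while completely ignoring the minority class.
from collections import Counter

# 95 majority-class labels (0) and 5 minority-class labels (1)
labels = [0] * 95 + [1] * 5

# A degenerate classifier that always predicts the most frequent class
majority_class = Counter(labels).most_common(1)[0][0]
predictions = [majority_class] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
minority_recall = sum(
    p == y for p, y in zip(predictions, labels) if y == 1
) / labels.count(1)

print(accuracy)         # 0.95 — looks good overall
print(minority_recall)  # 0.0 — the minority class is never detected
```

The gap between overall accuracy and minority-class recall illustrates why such irregularities can introduce unwarranted bias that simple accuracy metrics fail to expose.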
Commencing with a taxonomy of these various data irregularities, this presentation will delve into the significant practical challenges encountered by learning systems when handling one or a combination of such irregularities, especially when pre-processing alone cannot rectify them. Furthermore, we will underscore some fundamental theoretical obstacles in analyzing the behavior of learning systems, such as deriving test error bounds for classifiers on imbalanced datasets, in the presence of irregular data.
About this Lecture
Number of Slides: 90
Duration: 75 minutes
Languages Available: English
Last Updated:
Request this Lecture
To request this particular lecture, please complete this online form.
Request a Tour
To request a tour with this speaker, please complete this online form.
All requests will be sent to ACM headquarters for review.