WORLDCOMP'09 / DMIN'09 Tutorial: Prof. Nitesh V. Chawla
Data Mining with Sensitivity to Rare Events and Class Imbalance
Prof. Nitesh V. Chawla
University of Notre Dame
Date: July 13, 2009
Time: 6:00-9:00 PM
Location: Gold Room
Recent years have brought increased interest in applying data mining techniques to difficult 'real-world' problems, many of which are characterized by imbalanced learning data, where at least one class is much rarer than the others. Examples include (but are not limited to): fraud/intrusion detection, risk management, medical diagnosis/monitoring, bioinformatics, text categorization, and personalization of information. The problem of imbalanced data is also often associated with asymmetric costs of misclassifying elements of different classes. Additionally, the distribution of the test data may differ from that of the learning sample, and the true misclassification costs may be unknown at learning time. Predictive accuracy, a popular choice for evaluating the performance of a classifier, is not appropriate when the data is imbalanced and/or the costs of different errors vary markedly.
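As a rough illustration of why accuracy can mislead on imbalanced data, the following minimal sketch compares a classifier that always predicts the majority class with a hypothetical detector. The class counts, the detector's predictions, and the cost values (cost_fp, cost_fn) are assumptions chosen only for illustration; they are not taken from the tutorial.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fp, tn, fn):
    """Fraction of rare-class (positive) examples that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def total_cost(fp, fn, cost_fp=1.0, cost_fn=50.0):
    """Total misclassification cost under assumed asymmetric costs."""
    return fp * cost_fp + fn * cost_fn


# Hypothetical data set: 990 common-class examples, 10 rare-class examples.
y_true = [0] * 990 + [1] * 10

# Baseline: always predict the common class.
always_negative = [0] * 1000

# Hypothetical detector: 30 false alarms, 8 of the 10 rare cases caught.
detector = [1] * 30 + [0] * 960 + [1] * 8 + [0] * 2

for name, y_pred in [("always_negative", always_negative), ("detector", detector)]:
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    print(name,
          "accuracy=%.3f" % accuracy(tp, fp, tn, fn),
          "recall=%.2f" % recall(tp, fp, tn, fn),
          "cost=%.0f" % total_cost(fp, fn))
```

Under these assumed numbers, the baseline scores higher on accuracy even though it never detects a rare-class example and incurs the larger total cost, which is the kind of mismatch that motivates metrics other than accuracy.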
This tutorial will introduce the problem of class imbalance, survey the range of available solutions, present and contrast appropriate metrics for evaluating performance, and discuss applications through case studies.