Unbalanced data sets, a problem often found in real-world applications, can seriously degrade the classification performance of machine learning algorithms. There have been many attempts at dealing with the classification of unbalanced data sets. In this paper we present a brief review of existing solutions to the class-imbalance problem proposed at both the data and algorithmic levels. Although a common practice for handling imbalanced data is to rebalance it artificially by oversampling and/or undersampling, some researchers have shown that modified support vector machines, rough-set-based minority-class-oriented rule learning methods, and cost-sensitive classifiers perform well on imbalanced data sets. Current research on the imbalanced data problem is observed to be moving toward hybrid algorithms. (Author/publisher)
Abstract