Variable target values neural network for dealing with extremely imbalanced datasets
Schizas, Christos N.
14th Mediterranean Conference on Medical and Biological Engineering and Computing, MEDICON 2016
An original classification algorithm is proposed for dealing with the extremely imbalanced datasets that often appear in biomedical problems. Its originality lies in the way a neural network is trained to obtain a usable hypothesis from a dataset that comprises a very large majority class and a very small minority class. This situation is especially likely when building machine learning databases that describe rare medical conditions. The algorithm is tested on a large dataset to predict the risk of preeclampsia in pregnant women. Conventional machine learning algorithms tend to produce poor hypotheses for extremely imbalanced datasets because they favor the majority class. The proposed algorithm is not trained on the basis of the mean squared error objective function and thus avoids the overwhelming effect of the highly asymmetric class sizes. The methodology achieves a preeclampsia detection rate of 49% and a normal-case detection rate slightly above 76%. © Springer International Publishing Switzerland 2016.
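
The abstract's point about mean squared error favoring the majority class can be illustrated with a minimal sketch. It does not reproduce the proposed algorithm, which the abstract does not specify. The class sizes (990 versus 10), the 0.5 decision threshold, and the function names are assumptions chosen only for illustration.

```python
# Illustrative only: a hypothetical imbalanced dataset, not the preeclampsia data.
# Label 0 = majority (normal) class, label 1 = minority (rare condition) class.

def mean_squared_error(targets, outputs):
    """Average squared difference between target labels and model outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def detection_rates(targets, outputs, threshold=0.5):
    """Return (minority detection rate, majority detection rate).

    An output at or above the threshold is read as a minority prediction.
    """
    predicted = [1 if o >= threshold else 0 for o in outputs]
    minority_hits = sum(1 for t, p in zip(targets, predicted) if t == 1 and p == 1)
    majority_hits = sum(1 for t, p in zip(targets, predicted) if t == 0 and p == 0)
    n_minority = sum(targets)
    n_majority = len(targets) - n_minority
    return minority_hits / n_minority, majority_hits / n_majority

# Assumed class sizes: 990 majority cases and 10 minority cases.
targets = [0] * 990 + [1] * 10

# A degenerate model that always outputs the majority label.
always_majority = [0.0] * len(targets)

mse = mean_squared_error(targets, always_majority)
minority_rate, majority_rate = detection_rates(targets, always_majority)

print(f"MSE: {mse:.3f}")
print(f"Minority detection rate: {minority_rate:.0%}")
print(f"Majority detection rate: {majority_rate:.0%}")
```

In this sketch the degenerate model has a very low mean squared error while detecting none of the minority cases, which is why the two per-class detection rates quoted in the abstract are more informative than a single aggregate error.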