2 The role of the different class size configurations

2.1 Decision Tree, datasets with 3 classes

OC Ratio obtained by the decision tree classifier on the triangle datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

Recall obtained by the decision tree classifier on the triangle datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).
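As a reading aid for the Recall plots, the following minimal sketch shows how per-class recall can be computed from true and predicted labels, assuming the standard definition (correctly classified examples of a class divided by all examples of that class). The function name `per_class_recall` and the toy labels are illustrative assumptions, not taken from the experiments.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {class: recall}, where recall is the share of a class's
    examples that received the correct label."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical toy labels: 8 majority examples, 2 minority examples.
y_true = ["maj"] * 8 + ["min"] * 2
# The classifier gets every majority example right but misses one minority example.
y_pred = ["maj"] * 8 + ["maj", "min"]

recalls = per_class_recall(y_true, y_pred)
print(recalls)
```

In this toy case overall accuracy is 0.9 while the minority recall is only 0.5, which illustrates why recall is inspected class by class on imbalanced data.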

2.2 K-nearest neighbors, datasets with 3 classes

OC Ratio obtained by the kNN classifier on the triangle datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

Recall obtained by the kNN classifier on the triangle datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).
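The kind of experiment summarized in these plots can be sketched roughly as follows. Everything about the data generator is an assumption made for illustration: three Gaussian classes centred on the vertices of an equilateral triangle, one majority class, and two minority classes whose size is the majority size divided by IR. The function names, the side length, and k = 3 are likewise hypothetical and do not reproduce the setup actually used.

```python
import math
import random
from collections import Counter


def make_triangle_dataset(n_majority, ir, std, side=10.0, seed=0):
    """Hypothetical generator: class 0 is the majority class, classes 1 and 2
    are minority classes with n_majority // ir examples each."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (side, 0.0), (side / 2, side * math.sqrt(3) / 2)]
    sizes = [n_majority, n_majority // ir, n_majority // ir]
    points, labels = [], []
    for label, ((cx, cy), n) in enumerate(zip(centers, sizes)):
        for _ in range(n):
            # A larger std spreads each class out and increases overlap.
            points.append((rng.gauss(cx, std), rng.gauss(cy, std)))
            labels.append(label)
    return points, labels


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def class_recalls(train, test, k=3):
    """Per-class recall of kNN trained on `train` and evaluated on `test`."""
    (train_x, train_y), (test_x, test_y) = train, test
    pred = [knn_predict(train_x, train_y, q, k) for q in test_x]
    return {c: sum(p == t == c for p, t in zip(pred, test_y)) / test_y.count(c)
            for c in sorted(set(test_y))}


# Independent train and test samples drawn with different seeds.
train = make_triangle_dataset(n_majority=200, ir=10, std=1.0, seed=1)
test = make_triangle_dataset(n_majority=200, ir=10, std=1.0, seed=2)
print(class_recalls(train, test))
```

Repeating the last three lines for several values of `ir` and `std` would give one curve per class, comparable in spirit (not in exact values) to the panels described above.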

2.3 Bagging with 30 decision trees, datasets with 3 classes

OC Ratio obtained by the bagging classifier on the triangle datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

Recall obtained by the bagging classifier on the triangle datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

2.4 Decision Tree, datasets with 4 classes

OC Ratio obtained by the decision tree classifier on the square datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

Recall obtained by the decision tree classifier on the square datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

2.5 K-nearest neighbors, datasets with 4 classes

OC Ratio obtained by the kNN classifier on the square datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

Recall obtained by the kNN classifier on the square datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

2.6 Bagging with 30 decision trees, datasets with 4 classes

OC Ratio obtained by the bagging classifier on the square datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

Recall obtained by the bagging classifier on the square datasets as the imbalance ratio (IR) grows from 1 to 20, for three levels of class overlap (std in {1, 3, 10}; columns) and three types of class distribution (rows).

3 Overlapping and interrelation between different class types

3.1 Decision Tree

Recall on the triangle datasets (IR = 8), where only the overlap with the selected class was increased. Each row presents the results for one dataset configuration (the distribution of class cardinalities is given in the plot title). The first two plots in each row (from the left) show the class recalls when the selected class (green line) overlaps with one of the two other classes. The third plot (right column) compares the green lines from the first two plots for the selected class.

OC Ratio on the triangle datasets (IR = 8), where only the overlap with the selected class was increased. Each row presents the results for one dataset configuration (the distribution of class cardinalities is given in the plot title). The first two plots in each row (from the left) show the per-class OC Ratio values when the selected class (green line) overlaps with one of the two other classes. The third plot (right column) compares the green lines from the first two plots for the selected class.

What makes multi-class imbalanced problems difficult? An experimental study.

Mateusz Lango, Jerzy Stefanowski

1 Impact of class overlapping and imbalanced ratio on the classifiers performance

1.1 Decision Tree, datasets with 3 classes

1.2 K-nearest neighbors, datasets with 3 classes

1.3 Bagging with 30 decision trees, datasets with 3 classes

1.4 Decision Tree, datasets with 4 classes

1.5 K-nearest neighbors, datasets with 4 classes

1.6 Bagging with 30 decision trees, datasets with 4 classes

2 The role of the different class size configurations

2.1 Decision Tree, datasets with 3 classes

2.2 K-nearest neighbors, datasets with 3 classes

2.3 Bagging with 30 decision trees, datasets with 3 classes

2.4 Decision Tree, datasets with 4 classes

2.5 K-nearest neighbors, datasets with 4 classes

2.6 Bagging with 30 decision trees, datasets with 4 classes

3 Overlapping and interrelation between different class types

3.1 Decision Tree

3.2 K-nearest neighbors

3.3 Bagging with 30 decision trees

4 Increasing the number of classes

4.1 Decision Tree

4.2 K-nearest neighbors

4.3 Bagging with 30 decision trees