The goal of this experiment is to check whether clustering can be used as a feature extraction method for classification. The basic premise is to cluster the dataset into k clusters and use each cluster as a new feature or create a single additional feature reflecting the cluster assignment. The values of these features would be either binary (example assigned to a cluster or not), continuous (degree of cluster membership), or discrete (cluster number, in case of adding a single feature). We would like to investigate:


  1. Experiment settings
supportedDatasets = c("wine", "breast-cancer-wisconsin", "yeast", "glass", "ecoli",
             "vowel-context", "iris", "pima-indians-diabetes", "sonar.all",
             "image-segmentation", "ionosphere", "letter", "magic", "optdigits",
             "pendigits", "spectrometer", "statlog-satimage", "statlog-vehicle")

supportedClassifiers = c("PART", "multinom", "pda", "gbm", "bayesglm", "rpart", "knn", "svmLinear", "svmRadial")

dataset.name = "wine"
classifier = "svmLinear"

feature.types = c("factor", "binary", "distance", "binaryFS", "binaryDist", "revDistSquared", "distFS", "membership")
feature.type = feature.types[8]   # "membership"
measures = c("euclidean", "mahalanobis")
measure = measures[1]             # "euclidean"
clusteringPerClass = FALSE
newFeaturesOnly = TRUE
number_of_clusters = 10
scaling = FALSE

train_testSplitRatio = 0.5
folds = 5
repeats = 2

set.seed(23)
  1. Read data

For now, let’s use the wine dataset.

##  Class        V2                 V3                V4          
##  1:59   Min.   :-2.42739   Min.   :-1.4290   Min.   :-3.66881  
##  2:71   1st Qu.:-0.78603   1st Qu.:-0.6569   1st Qu.:-0.57051  
##  3:48   Median : 0.06083   Median :-0.4219   Median :-0.02375  
##         Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.00000  
##         3rd Qu.: 0.83378   3rd Qu.: 0.6679   3rd Qu.: 0.69615  
##         Max.   : 2.25341   Max.   : 3.1004   Max.   : 3.14745  
##        V5                  V6                V7                 V8         
##  Min.   :-2.663505   Min.   :-2.0824   Min.   :-2.10132   Min.   :-1.6912  
##  1st Qu.:-0.687199   1st Qu.:-0.8221   1st Qu.:-0.88298   1st Qu.:-0.8252  
##  Median : 0.001514   Median :-0.1219   Median : 0.09569   Median : 0.1059  
##  Mean   : 0.000000   Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.0000  
##  3rd Qu.: 0.600395   3rd Qu.: 0.5082   3rd Qu.: 0.80672   3rd Qu.: 0.8467  
##  Max.   : 3.145637   Max.   : 4.3591   Max.   : 2.53237   Max.   : 3.0542  
##        V9               V10                V11               V12          
##  Min.   :-1.8630   Min.   :-2.06321   Min.   :-1.6297   Min.   :-2.08884  
##  1st Qu.:-0.7381   1st Qu.:-0.59560   1st Qu.:-0.7929   1st Qu.:-0.76540  
##  Median :-0.1756   Median :-0.06272   Median :-0.1588   Median : 0.03303  
##  Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.00000  
##  3rd Qu.: 0.6078   3rd Qu.: 0.62741   3rd Qu.: 0.4926   3rd Qu.: 0.71116  
##  Max.   : 2.3956   Max.   : 3.47527   Max.   : 3.4258   Max.   : 3.29241  
##       V13               V14         
##  Min.   :-1.8897   Min.   :-1.4890  
##  1st Qu.:-0.9496   1st Qu.:-0.7824  
##  Median : 0.2371   Median :-0.2331  
##  Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.7864   3rd Qu.: 0.7561  
##  Max.   : 1.9554   Max.   : 2.9631

The number of classes in this dataset is 3.
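
The settings fix the train/test split ratio at 0.5, and the classification output further down reports 90 training samples. A minimal sketch of a stratified split in base R is given here; the report's own splitting helper is not shown, so `stratified_split` is a hypothetical name, and taking the ceiling of the per-class share is an assumption that happens to reproduce those sample counts for class sizes 59, 71 and 48.

```r
# Stratified train/test split in base R (a sketch, not the report's own helper).
stratified_split <- function(classes, ratio) {
  # For every class, draw ceiling(ratio * n_class) row indices for training.
  train_idx <- unlist(lapply(split(seq_along(classes), classes), function(idx) {
    idx[sample.int(length(idx), ceiling(ratio * length(idx)))]
  }))
  sort(unname(train_idx))
}

set.seed(23)
# Class sizes taken from the summary above: 59, 71 and 48 examples.
classes <- factor(rep(1:3, times = c(59, 71, 48)))
train_idx <- stratified_split(classes, ratio = 0.5)
test_idx <- setdiff(seq_along(classes), train_idx)

length(train_idx)          # 90 training rows
table(classes[test_idx])   # 29, 35 and 24 test rows per class
```

Sampling within each class keeps the class proportions of the full dataset in both halves, which matters for the smaller classes.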

  1. Select the number of clusters

Since our goal is to use the clusters as attributes for classification, it makes sense to use more clusters than there are classes in the dataset.

The number of clusters found in the wine dataset is 4.

  1. Clustering

K-means clustering

Clustering dataset into 10 clusters.
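
As an illustration of how k-means clusters can become features, the sketch below fits the clustering on a training half and derives distance, binary and factor features for the test half. It uses only base R and the built-in `iris` data; the helper `dist_to_centers`, the choice of `k = 5` and the simple random split are assumptions made for the example and are not taken from the report's `addClusteringFeatures`.

```r
# Sketch: turn k-means clusters into features (base R only).
set.seed(23)
X <- scale(as.matrix(iris[, 1:4]))           # standardised numeric attributes
train_rows <- sample(nrow(X), nrow(X) / 2)
X_train <- X[train_rows, ]
X_test  <- X[-train_rows, ]

k <- 5                                        # illustrative number of clusters
km <- kmeans(X_train, centers = k, nstart = 10)

# Euclidean distance from every row of X to every cluster centre.
dist_to_centers <- function(X, centers) {
  d <- sapply(seq_len(nrow(centers)), function(j) {
    sqrt(rowSums(sweep(X, 2, centers[j, ])^2))
  })
  colnames(d) <- paste0("cluster", seq_len(nrow(centers)))
  d
}

distance_train <- dist_to_centers(X_train, km$centers)   # continuous features
distance_test  <- dist_to_centers(X_test,  km$centers)

# Hard assignment = nearest centre; one-hot encode it for binary features.
assign_test <- max.col(-distance_test, ties.method = "first")
binary_test <- diag(k)[assign_test, , drop = FALSE]
factor_test <- factor(assign_test, levels = seq_len(k))  # single discrete feature
```

The centres are estimated on the training rows only, so the test features are computed without refitting the clustering.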

Affinity propagation clustering

Spectral clustering

Fuzzy c-means features

# Build the augmented train/test sets first, then evaluate the classifier on them
tmp.cm = addClusteringFeatures("ap", trainSet, testSet, feature.type, scaling, number_of_clusters, measure, clusteringPerClass, newFeaturesOnly, FALSE)
list[trainAccuracy, testAccuracy] = trainTestEvaluate(tmp.cm$train, tmp.cm$test, classifier, 5, 2)
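
The "membership" feature type selected in the settings suggests continuous degrees of cluster membership. How the report computes them is not shown; the sketch below uses the standard fuzzy c-means membership formula for fixed centres, with fuzzifier `m = 2` as an assumed default. The function name `fuzzy_membership` is hypothetical, and the centres come from `kmeans` purely for illustration rather than from a fitted fuzzy c-means model.

```r
# Sketch: fuzzy c-means style membership degrees for given cluster centres.
fuzzy_membership <- function(X, centers, m = 2) {
  # Squared Euclidean distances, one column per centre.
  d2 <- sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(X, 2, centers[j, ])^2)
  })
  d2 <- pmax(d2, .Machine$double.eps)   # guard against division by zero
  w <- d2^(-1 / (m - 1))                # inverse-distance weights
  w / rowSums(w)                        # normalise so each row sums to 1
}

set.seed(23)
X <- scale(as.matrix(iris[, 1:4]))
centers <- kmeans(X, centers = 3, nstart = 10)$centers   # stand-in centres
U <- fuzzy_membership(X, centers)
head(round(U, 3))
```

Each row of `U` can be used directly as a set of continuous features, one per cluster.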

  1. Classification

Without new features

  • Training
## Support Vector Machines with Linear Kernel 
## 
## 90 samples
## 13 predictors
##  3 classes: '1', '2', '3' 
## 
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 2 times) 
## Summary of sample sizes: 71, 73, 72, 72, 72, 71, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9947368  0.9920502
## 
## Tuning parameter 'C' was held constant at a value of 1
  • Testing
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  1  2  3
##          1 29  5  0
##          2  0 30  2
##          3  0  0 22
## 
## Overall Statistics
##                                          
##                Accuracy : 0.9205         
##                  95% CI : (0.843, 0.9674)
##     No Information Rate : 0.3977         
##     P-Value [Acc > NIR] : < 2.2e-16      
##                                          
##                   Kappa : 0.8795         
##                                          
##  Mcnemar's Test P-Value : NA             
## 
## Statistics by Class:
## 
##                      Class: 1 Class: 2 Class: 3
## Sensitivity            1.0000   0.8571   0.9167
## Specificity            0.9153   0.9623   1.0000
## Pos Pred Value         0.8529   0.9375   1.0000
## Neg Pred Value         1.0000   0.9107   0.9697
## Prevalence             0.3295   0.3977   0.2727
## Detection Rate         0.3295   0.3409   0.2500
## Detection Prevalence   0.3864   0.3636   0.2500
## Balanced Accuracy      0.9576   0.9097   0.9583
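
The headline statistics can be recomputed by hand from the confusion matrix above. The sketch below does this in base R for accuracy, Cohen's kappa and per-class sensitivity; the variable names are illustrative.

```r
# Test-set confusion matrix for the original features
# (rows = prediction, columns = reference).
cm <- matrix(c(29,  5,  0,
                0, 30,  2,
                0,  0, 22),
             nrow = 3, byrow = TRUE,
             dimnames = list(Prediction = c("1", "2", "3"),
                             Reference  = c("1", "2", "3")))

n <- sum(cm)
accuracy <- sum(diag(cm)) / n                        # observed agreement
expected <- sum(rowSums(cm) * colSums(cm)) / n^2     # agreement expected by chance
cohen_kappa <- (accuracy - expected) / (1 - expected)
sensitivity <- diag(cm) / colSums(cm)                # per-class recall

round(c(accuracy = accuracy, kappa = cohen_kappa), 4)   # 0.9205 and 0.8795
round(sensitivity, 4)                                   # 1.0000 0.8571 0.9167
```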

With new k-means features

  • Training
## Support Vector Machines with Linear Kernel 
## 
## 90 samples
## 10 predictors
##  3 classes: '1', '2', '3' 
## 
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 2 times) 
## Summary of sample sizes: 72, 72, 72, 72, 72, 72, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9833333  0.9748837
## 
## Tuning parameter 'C' was held constant at a value of 1
  • Testing
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  1  2  3
##          1 28  2  0
##          2  1 32  1
##          3  0  1 23
## 
## Overall Statistics
##                                           
##                Accuracy : 0.9432          
##                  95% CI : (0.8724, 0.9813)
##     No Information Rate : 0.3977          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.9139          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 1 Class: 2 Class: 3
## Sensitivity            0.9655   0.9143   0.9583
## Specificity            0.9661   0.9623   0.9844
## Pos Pred Value         0.9333   0.9412   0.9583
## Neg Pred Value         0.9828   0.9444   0.9844
## Prevalence             0.3295   0.3977   0.2727
## Detection Rate         0.3182   0.3636   0.2614
## Detection Prevalence   0.3409   0.3864   0.2727
## Balanced Accuracy      0.9658   0.9383   0.9714
  • Variable importance

With new affinity propagation features

  • Training
## Support Vector Machines with Linear Kernel 
## 
## 90 samples
##  8 predictor
##  3 classes: '1', '2', '3' 
## 
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 2 times) 
## Summary of sample sizes: 71, 72, 72, 72, 73, 71, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9774166  0.9656101
## 
## Tuning parameter 'C' was held constant at a value of 1
  • Testing
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  1  2  3
##          1 28  3  0
##          2  1 31  2
##          3  0  1 22
## 
## Overall Statistics
##                                          
##                Accuracy : 0.9205         
##                  95% CI : (0.843, 0.9674)
##     No Information Rate : 0.3977         
##     P-Value [Acc > NIR] : < 2.2e-16      
##                                          
##                   Kappa : 0.8793         
##                                          
##  Mcnemar's Test P-Value : NA             
## 
## Statistics by Class:
## 
##                      Class: 1 Class: 2 Class: 3
## Sensitivity            0.9655   0.8857   0.9167
## Specificity            0.9492   0.9434   0.9844
## Pos Pred Value         0.9032   0.9118   0.9565
## Neg Pred Value         0.9825   0.9259   0.9692
## Prevalence             0.3295   0.3977   0.2727
## Detection Rate         0.3182   0.3523   0.2500
## Detection Prevalence   0.3523   0.3864   0.2614
## Balanced Accuracy      0.9573   0.9146   0.9505
  • Variable importance

With new spectral clustering features

  • Training
## Support Vector Machines with Linear Kernel 
## 
## 90 samples
## 10 predictors
##  3 classes: '1', '2', '3' 
## 
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 2 times) 
## Summary of sample sizes: 72, 72, 72, 72, 72, 72, ... 
## Resampling results:
## 
##   Accuracy   Kappa    
##   0.9888889  0.9833333
## 
## Tuning parameter 'C' was held constant at a value of 1
  • Testing
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  1  2  3
##          1 28  2  0
##          2  1 32  2
##          3  0  1 22
## 
## Overall Statistics
##                                           
##                Accuracy : 0.9318          
##                  95% CI : (0.8575, 0.9746)
##     No Information Rate : 0.3977          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.8964          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 1 Class: 2 Class: 3
## Sensitivity            0.9655   0.9143   0.9167
## Specificity            0.9661   0.9434   0.9844
## Pos Pred Value         0.9333   0.9143   0.9565
## Neg Pred Value         0.9828   0.9434   0.9692
## Prevalence             0.3295   0.3977   0.2727
## Detection Rate         0.3182   0.3636   0.2500
## Detection Prevalence   0.3409   0.3977   0.2614
## Balanced Accuracy      0.9658   0.9288   0.9505
  • Variable importance

  1. Experiments

5.1. Comparative evaluation

Let us now run an experiment with different classifiers over multiple datasets. For each setting we test classification without added features and with features generated using affinity propagation and k-means. The experimental methodology is as follows. Each dataset is scaled and split into training and testing sets with a split ratio of 0.5. Next, new features are added to the datasets using both clustering algorithms. Then, for each classifier, three models are trained: on the original, the affinity propagation-enriched, and the k-means-enriched training sets. Training uses 5-fold cross-validation repeated 2 times. Finally, the trained models are tested on the corresponding test sets and evaluated using accuracy.
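
To make the resampling scheme concrete, the following sketch builds the training index sets for 5-fold cross-validation repeated 2 times in base R. The function `repeated_kfold` and the fold names are hypothetical; the folds here are not stratified by class, which is a simplification compared with what a resampling library may do.

```r
# Sketch: index sets for 5-fold cross-validation repeated 2 times (base R).
repeated_kfold <- function(n, folds = 5, repeats = 2) {
  out <- list()
  for (r in seq_len(repeats)) {
    # Shuffle fold labels so every row lands in exactly one fold per repeat.
    fold_id <- sample(rep(seq_len(folds), length.out = n))
    for (f in seq_len(folds)) {
      out[[sprintf("Fold%d.Rep%d", f, r)]] <- which(fold_id != f)  # training rows
    }
  }
  out
}

set.seed(23)
train_sets <- repeated_kfold(n = 90)
lengths(train_sets)   # ten training subsets of 72 rows each
```

With 90 training rows, each of the ten resamples trains on 72 rows and validates on the remaining 18.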

## [1] "---------------------------------------------------------------------"
## [1] "multinom"
## [1] "---------------------------------------------------------------------"
## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.61237, df = 17.969, p-value = 0.548
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01105099  0.02014190
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9772727          0.9727273 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.68869, df = 17.998, p-value = 0.4998
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.005412156  0.010690748
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9697947          0.9671554 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.85884, df = 15.04, p-value = 0.4039
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.005612108  0.013189916
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5917456          0.5879567 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 3.5893, df = 14.46, p-value = 0.002827
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.02221850 0.08771347
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6780952          0.6231293 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.64006, df = 15.505, p-value = 0.5315
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01118396  0.02082251
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8524096          0.8475904 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 43.061, df = 14.154, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.2524383 0.2788748
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9210101          0.6553535 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.67557, df = 16.381, p-value = 0.5087
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01421367  0.02754701
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9546667          0.9480000 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.2542, df = 17.973, p-value = 0.8022
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01447758  0.01135258
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7627604          0.7643229 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.963, df = 17.769, p-value = 0.0655
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.002352436  0.068371853
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7747573          0.7417476 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.0114, df = 14.025, p-value = 0.0639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0003892559  0.0121641477
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9526407          0.9467532 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.799, df = 12.999, p-value = 0.01506
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.005215301 0.040498984
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9034286          0.8805714 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 3.9621, df = 13.07, p-value = 0.001608
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.003207474 0.010890070
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9690281          0.9619794 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 38.085, df = 16.176, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.03161707 0.03534087
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9869106          0.9534316 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.055465, df = 16.969, p-value = 0.9564
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02148896  0.02264916
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.4114235          0.4108434 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 26.988, df = 17.779, p-value = 7.064e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.04335165 0.05067820
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9021455          0.8551306 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.27621, df = 15.585, p-value = 0.786
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01268559  0.01647706
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7954976          0.7936019 
## 
## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display[result.display$Classifier == classifier, ]$orig and result.display[result.display$Classifier == classifier, ]$augm
## W = 106, p-value = 0.423
## alternative hypothesis: true location shift is not equal to 0
## 
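
The rank sum statistic reported for multinom can be checked against the per-dataset means printed in the Welch test outputs above. The sketch below copies those 16 pairs of means into two vectors; that the report's `result.display` holds exactly these means is an assumption, although the resulting statistic matches the reported value.

```r
# Mean test accuracies for multinom, copied from the Welch test outputs above
# (order: wine, breast-cancer-wisconsin, yeast, glass, ecoli, vowel-context, iris,
#  pima-indians-diabetes, sonar.all, image-segmentation, ionosphere, optdigits,
#  pendigits, spectrometer, statlog-satimage, statlog-vehicle).
orig <- c(0.9727273, 0.9671554, 0.5879567, 0.6231293, 0.8475904, 0.6553535,
          0.9480000, 0.7643229, 0.7417476, 0.9467532, 0.8805714, 0.9619794,
          0.9534316, 0.4108434, 0.8551306, 0.7936019)
augm <- c(0.9772727, 0.9697947, 0.5917456, 0.6780952, 0.8524096, 0.9210101,
          0.9546667, 0.7627604, 0.7747573, 0.9526407, 0.9034286, 0.9690281,
          0.9869106, 0.4114235, 0.9021455, 0.7954976)

# Unpaired rank sum test, as in the report: W counts pairs with orig > augm.
unpaired <- wilcox.test(orig, augm)
unpaired$statistic   # W = 106
unpaired$p.value
```

Because both vectors describe the same 16 datasets, a paired signed-rank test, `wilcox.test(orig, augm, paired = TRUE)`, may be worth considering as well; this is a suggestion and not something the report does.
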
## [1] "---------------------------------------------------------------------"
## [1] "pda"
## [1] "---------------------------------------------------------------------"
## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.19825, df = 17.716, p-value = 0.8451
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01092019  0.01319292
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9784091          0.9772727 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.4753, df = 17.227, p-value = 0.02398
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.001306504 0.016288804
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9697947          0.9609971 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.19961, df = 17.521, p-value = 0.8441
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01162571  0.01406143
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5882273          0.5870095 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 3.797, df = 13.058, p-value = 0.002204
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.02382342 0.08665277
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6685714          0.6133333 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.70235, df = 17.782, p-value = 0.4916
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03608941  0.01801713
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8493976          0.8584337 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 27.66, df = 12.06, p-value = 2.793e-12
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.2181276 0.2554078
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8204040          0.5836364 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.3254, df = 16.772, p-value = 0.7489
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01464151  0.01997485
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9653333          0.9626667 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.65942, df = 17.884, p-value = 0.518
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.016357486  0.008544986
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7585937          0.7625000 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.4895, df = 17.876, p-value = 0.1538
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.00838382  0.04916052
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7194175          0.6990291 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.96411, df = 17.898, p-value = 0.3478
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.008535151  0.003167186
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9128139          0.9154978 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.68751, df = 14.295, p-value = 0.5028
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01690898  0.03290898
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8697143          0.8617143 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 10.606, df = 17.955, p-value = 3.68e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.01492990 0.02230755
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9690637          0.9504450 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 71.17, df = 10.537, p-value = 1.763e-15
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.1042991 0.1109931
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9833788          0.8757328 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 6.177, df = 12.905, p-value = 3.449e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.05267266 0.10939784
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5171798          0.4361446 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 19.271, df = 17.32, p-value = 3.846e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.04046246 0.05039575
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8827736          0.8373445 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.4975, df = 17.557, p-value = 0.152
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.004132102  0.024511248
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7819905          0.7718009 
## 
## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display[result.display$Classifier == classifier, ]$orig and result.display[result.display$Classifier == classifier, ]$augm
## W = 107, p-value = 0.445
## alternative hypothesis: true location shift is not equal to 0
## 
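
The per-dataset comparisons above are Welch two-sample t-tests of `TestAccuracy` grouped by `Features`. A minimal sketch of such a test with a guard for degenerate input follows. The wrapper `safe_welch` is a hypothetical name, the accuracies are synthetic values generated for the example and not results from the report, and ten runs per group is an assumption that is consistent with the reported degrees of freedom of at most 18.

```r
# Sketch: Welch t-test per dataset with a guard for degenerate input.
safe_welch <- function(results) {
  tryCatch(
    t.test(TestAccuracy ~ Features, data = results),   # var.equal = FALSE by default
    error = function(e) paste("Error:", conditionMessage(e))
  )
}

# Hypothetical accuracies, 10 runs per feature set (NOT values from the report).
set.seed(23)
results <- data.frame(
  Features = rep(c("Augm", "Orig"), each = 10),
  TestAccuracy = c(rnorm(10, mean = 0.95, sd = 0.01),
                   rnorm(10, mean = 0.94, sd = 0.01))
)
res <- safe_welch(results)
res$method

# Identical accuracies in every run make the standard error zero, so t.test stops.
constant <- transform(results, TestAccuracy = 2 / 3)
safe_welch(constant)   # returns the error message instead of aborting the loop
```

Catching the error lets a loop over many datasets continue when one classifier produces the same accuracy in every run.
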
## [1] "---------------------------------------------------------------------"
## [1] "bayesglm"
## [1] "---------------------------------------------------------------------"
## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.66169, df = 12.234, p-value = 0.5204
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.007792505  0.014610687
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7170455          0.7136364 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.64606, df = 17.672, p-value = 0.5265
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.005293230  0.009985312
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9695015          0.9671554 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 11.26, df = 17.998, p-value = 1.399e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.04787999 0.06984667
## sample estimates:
## mean in group Augm mean in group Orig 
##         0.14018945         0.08132612 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.7527, df = 17.872, p-value = 0.09679
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.004175768  0.046080529
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5152381          0.4942857 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.8858, df = 15.774, p-value = 0.07788
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0006802667  0.0115236402
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6361446          0.6307229 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 8.233, df = 18, p-value = 1.625e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.01309071 0.02206080
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.1745455          0.1569697 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## [1] "Error: Error in t.test.default(x = c(0.666666666666667, 0.666666666666667, 0.666666666666667, : data are essentially constant\n"
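The "data are essentially constant" error above is raised when the accuracies in both groups show (almost) no variation, so Welch's statistic cannot be computed. A minimal sketch of a guard follows. It assumes a data frame with the columns `TestAccuracy` and `Features` named in the output; the helper `safe_t_test` and the toy data are hypothetical and not part of the original experiment code.

```r
# Hypothetical helper: run Welch's t-test, but return NULL with a message
# instead of failing when a group is too small or both groups are constant.
safe_t_test <- function(df) {
  groups <- split(df$TestAccuracy, df$Features)
  groups <- lapply(groups, function(g) g[!is.na(g)])  # t.test drops NAs too
  if (length(groups) != 2 || any(lengths(groups) < 2)) {
    message("skipped: need two groups with at least 2 observations each")
    return(NULL)
  }
  is_constant <- vapply(groups, function(g) diff(range(g)) == 0, logical(1))
  if (all(is_constant)) {
    message("skipped: accuracies are constant in both groups")
    return(NULL)
  }
  # Near-constant data can still fail inside t.test, so catch that as well.
  tryCatch(
    t.test(TestAccuracy ~ Features, data = df),
    error = function(e) {
      message("skipped: ", conditionMessage(e))
      NULL
    }
  )
}

# Toy data (made up for illustration): both groups are constant.
constant_df <- data.frame(
  TestAccuracy = rep(2 / 3, 20),
  Features = rep(c("Augm", "Orig"), each = 10)
)
res_constant <- safe_t_test(constant_df)

# Toy data with some spread, so the test runs.
set.seed(1)
varied_df <- data.frame(
  TestAccuracy = c(rnorm(10, 0.95, 0.01), rnorm(10, 0.93, 0.01)),
  Features = rep(c("Augm", "Orig"), each = 10)
)
res_varied <- safe_t_test(varied_df)
```

Returning `NULL` keeps a loop over datasets running and makes skipped comparisons easy to count afterwards. Whether a skipped dataset should be treated as "no difference" is a separate decision that this sketch does not make.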
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.47886, df = 17.974, p-value = 0.6378
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.015433643  0.009704476
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7617188          0.7645833 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.81802, df = 17.927, p-value = 0.4241
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01980348  0.04504620
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7708738          0.7582524 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.7424, df = 15.112, p-value = 0.1017
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0025015130  0.0002504307
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.2831169          0.2842424 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.7358, df = 17.115, p-value = 0.01402
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.004584246 0.035415754
## sample estimates:
## mean in group Augm mean in group Orig 
##              0.908              0.888 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.57514, df = 15.907, p-value = 0.5732
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0005847149  0.0010198242
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.1983308          0.1981132 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 31.426, df = 12.775, p-value = 1.761e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.008323063 0.009554236
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.2069543          0.1980157 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.88465, df = 17.865, p-value = 0.3881
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.002711764  0.001105338
## sample estimates:
## mean in group Augm mean in group Orig 
##        0.001204819        0.002008032 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.3043, df = 13.125, p-value = 0.2145
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0006718074  0.0027240462
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.3390858          0.3380597 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.2723, df = 17.058, p-value = 0.03629
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.0004080283 0.0109663793
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.4928910          0.4872038
## Warning in wilcox.test.default(result.display[result.display$Classifier == :
## cannot compute exact p-value with ties
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  result.display[result.display$Classifier == classifier, ]$orig and result.display[result.display$Classifier == classifier, ]$augm
## W = 121.5, p-value = 0.8211
## alternative hypothesis: true location shift is not equal to 0
## 
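The warning above appears because the two accuracy vectors contain tied values, in which case `wilcox.test` cannot compute an exact p-value and falls back to the normal approximation. The sketch below shows one way to request that approximation explicitly. The vectors `orig` and `augm` are made-up stand-ins for the `result.display` columns, and the paired variant is only a possible alternative, not what the output above uses.

```r
# Made-up per-dataset mean accuracies for one classifier (illustration only);
# the repeated values create ties between the two samples.
orig <- c(0.71, 0.97, 0.08, 0.49, 0.63, 0.16, 0.76, 0.76)
augm <- c(0.72, 0.97, 0.14, 0.52, 0.64, 0.17, 0.76, 0.77)

# With ties, the default exact = NULL triggers the warning seen above.
# exact = FALSE requests the normal approximation explicitly, so the test
# runs without that warning.
res_unpaired <- wilcox.test(orig, augm, exact = FALSE)

# Each dataset contributes one value to both vectors, so a paired
# (signed rank) test is a possible alternative. This is an assumption about
# the intended comparison.
res_paired <- wilcox.test(orig, augm, paired = TRUE, exact = FALSE)

print(res_unpaired$p.value)
print(res_paired$p.value)
```

The unpaired call keeps the comparison reported above and only silences the warning. The paired call tests a different hypothesis (a shift in the per-dataset differences), so its p-value is not directly comparable.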
## [1] "---------------------------------------------------------------------"
## [1] "rpart"
## [1] "---------------------------------------------------------------------"
## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 6.3403, df = 17.226, p-value = 6.947e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.05082653 0.10144620
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9409091          0.8647727 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 4.4351, df = 13.643, p-value = 0.0006005
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.01465576 0.04223573
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9671554          0.9387097 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.4125, df = 17.976, p-value = 0.1749
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.027265349  0.005343834
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5571042          0.5680650 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.9851, df = 14.01, p-value = 0.06707
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.075289829  0.002908877
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6200000          0.6561905 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.77585, df = 17.978, p-value = 0.4479
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03797485  0.01749292
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7921687          0.8024096 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.5203, df = 17.945, p-value = 0.02142
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.005641479 0.062237309
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5428283          0.5088889 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.2528, df = 17.66, p-value = 0.2266
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.006340694  0.025007360
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9426667          0.9333333 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.2051, df = 15.432, p-value = 0.2463
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01035025  0.03743359
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7395833          0.7260417 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.36979, df = 13.786, p-value = 0.7172
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03267897  0.04627120
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7019417          0.6951456 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -2.2607, df = 17.999, p-value = 0.03641
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0215484110 -0.0007892514
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9202597          0.9314286 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.78916, df = 17.393, p-value = 0.4406
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03983388  0.01811960
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8788571          0.8897143 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 11.666, df = 13.601, p-value = 1.825e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.0776449 0.1127431
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8008544          0.7056604 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 31.225, df = 13.213, p-value = 9.038e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.07707728 0.08851529
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8495176          0.7667213 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.56631, df = 17.964, p-value = 0.5782
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.05864356  0.03374396
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.4032129          0.4156627 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 4.2365, df = 17.933, p-value = 5e-04
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.007584358 0.022515145
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8479789          0.8329291 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.493, df = 16.189, p-value = 0.1547
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.008331649  0.048142075
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6571090          0.6372038 
## 
## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display[result.display$Classifier == classifier, ]$orig and result.display[result.display$Classifier == classifier, ]$augm
## W = 117, p-value = 0.6963
## alternative hypothesis: true location shift is not equal to 0
## 
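The per-dataset printouts above are long, and the same numbers can be collected into one row per dataset. The sketch below shows one way to do that. The `results` data frame, its `Dataset` column, and the helper `summarise_dataset` are hypothetical; only the `TestAccuracy` and `Features` column names and the `Augm`/`Orig` labels come from the output.

```r
# Made-up results in the shape suggested by the output above: one row per
# run, with the accuracy and whether augmented or original features were used.
set.seed(23)
results <- data.frame(
  Dataset = rep(c("wine", "glass"), each = 20),
  Features = rep(rep(c("Augm", "Orig"), each = 10), times = 2),
  TestAccuracy = c(rnorm(10, 0.94, 0.02), rnorm(10, 0.86, 0.02),
                   rnorm(10, 0.62, 0.03), rnorm(10, 0.65, 0.03))
)

# One row per dataset: group means, t statistic, and p-value of Welch's test.
summarise_dataset <- function(df) {
  tt <- t.test(TestAccuracy ~ Features, data = df)
  data.frame(
    Dataset = df$Dataset[1],
    mean_augm = unname(tt$estimate["mean in group Augm"]),
    mean_orig = unname(tt$estimate["mean in group Orig"]),
    t = unname(tt$statistic),
    p_value = tt$p.value
  )
}

summary_table <- do.call(
  rbind,
  lapply(split(results, results$Dataset), summarise_dataset)
)
rownames(summary_table) <- NULL
print(summary_table)
```

A table like this makes it easier to see at a glance for which datasets the augmented features help or hurt a given classifier.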
## [1] "---------------------------------------------------------------------"
## [1] "knn"
## [1] "---------------------------------------------------------------------"
## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.2439, df = 17.057, p-value = 0.2303
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.021442679  0.005533588
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9556818          0.9636364 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.53033, df = 17.659, p-value = 0.6025
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.005220565  0.008739627
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9700880          0.9683284 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -3.367, df = 17.973, p-value = 0.003439
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.032744568 -0.007580195
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5560217          0.5761840 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.097642, df = 13.24, p-value = 0.9237
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.04397092  0.04016140
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6457143          0.6476190 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -2.366, df = 18, p-value = 0.0294
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.035257026 -0.002092372
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8319277          0.8506024 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.6641, df = 16.662, p-value = 0.01655
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.005306429 0.046006702
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8143434          0.7886869 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.7975, df = 17.291, p-value = 0.08974
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.049237387  0.003904054
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9280000          0.9506667 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.1705, df = 17.999, p-value = 0.2571
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.026202734  0.007452734
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7304687          0.7398438 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 2.0025, df = 17.612, p-value = 0.06088
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.001480327  0.059732754
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7766990          0.7475728 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -5.1233, df = 16.082, p-value = 0.0001005
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02937363 -0.01218481
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9086580          0.9294372 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 6.5673, df = 15.585, p-value = 7.382e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.06223846 0.12176154
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9028571          0.8108571 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -14.983, df = 17.125, p-value = 2.82e-11
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02534061 -0.01908801
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9487006          0.9709149 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -9.7915, df = 15.132, p-value = 6.093e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.006671636 -0.004287766
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9844529          0.9899326 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## [1] "Error: Error in t.test.default(x = c(0.44578313253012, 0.409638554216867, 0.469879518072289: not enough 'y' observations\n"
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -2.9059, df = 17.966, p-value = 0.009438
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.012162347 -0.001954568
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8896144          0.8966729 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -5.2987, df = 14.527, p-value = 9.934e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.07881645 -0.03350582
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6270142          0.6831754
## Warning in wilcox.test.default(result.display[result.display$Classifier == :
## cannot compute exact p-value with ties
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  result.display[result.display$Classifier == classifier, ]$orig and result.display[result.display$Classifier == classifier, ]$augm
## W = 133.5, p-value = 0.8505
## alternative hypothesis: true location shift is not equal to 0
## 
## [1] "---------------------------------------------------------------------"
## [1] "svmRadial"
## [1] "---------------------------------------------------------------------"
## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.77667, df = 14.136, p-value = 0.4502
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02135805  0.00999441
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9704545          0.9761364 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.058069, df = 16.339, p-value = 0.9544
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01039443  0.01098094
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9592375          0.9589443 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.22728, df = 4.4175, p-value = 0.8303
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03284173  0.02769965
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5814614          0.5840325 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.66875, df = 13.893, p-value = 0.5146
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02735552  0.05211742
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.6904762          0.6780952 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.99087, df = 15.517, p-value = 0.337
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.022733973  0.008276142
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8469880          0.8542169 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -7.5356e-14, df = 14.437, p-value = 1
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.009452846  0.009452846
## sample estimates:
## mean in group Augm mean in group Orig 
##           0.960404           0.960404 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.1335, df = 14.297, p-value = 0.8957
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02004634  0.02271301
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9426667          0.9413333 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.74302, df = 15.336, p-value = 0.4687
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.019114619  0.009218785
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7549479          0.7598958 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -6.8366e-15, df = 17.945, p-value = 1
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03412523  0.03412523
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8359223          0.8359223 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.0821, df = 15.414, p-value = 0.2959
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.009242208  0.003008442
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9564502          0.9595671 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.4611, df = 16.807, p-value = 0.1624
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.025151619  0.004580191
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9285714          0.9388571 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -4.4895, df = 16.848, p-value = 0.0003297
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.009316787 -0.003356762
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9788537          0.9851905 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -5.8277, df = 15.584, p-value = 2.856e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.002831958 -0.001318780
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9932642          0.9953395 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.2425, df = 15.899, p-value = 0.2321
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01050633  0.04022521
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5461847          0.5313253 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.63777, df = 16.981, p-value = 0.5321
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.003301813  0.006162509
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9111318          0.9097015 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.45255, df = 17.527, p-value = 0.6564
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01607029  0.01038308
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7924171          0.7952607 
## 
## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display[result.display$Classifier == classifier, ]$orig and result.display[result.display$Classifier == classifier, ]$augm
## W = 132, p-value = 0.8965
## alternative hypothesis: true location shift is not equal to 0
## 
## [1] "---------------------------------------------------------------------"
## [1] "rf"
## [1] "---------------------------------------------------------------------"
## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.20934, df = 21.373, p-value = 0.8362
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.010344167  0.008450227
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9801136          0.9810606 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 0.38877, df = 16.176, p-value = 0.7025
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.006521948  0.009454500
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9682307          0.9667644 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -3.0538, df = 21.58, p-value = 0.005906
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.029172498 -0.005559121
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5898737          0.6072395 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.89474, df = 21.7, p-value = 0.3807
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.04478985  0.01780572
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7238095          0.7373016 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.926, df = 22, p-value = 0.06712
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.044830070  0.001657379
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8313253          0.8529116 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.2342, df = 21.89, p-value = 0.2302
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.025725351  0.006533431
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9106061          0.9202020 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.9363, df = 19.002, p-value = 0.06785
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0254335352  0.0009890908
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9422222          0.9544444 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.39045, df = 20.526, p-value = 0.7002
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01649385  0.01128552
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7502170          0.7528212 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.34815, df = 20.44, p-value = 0.7313
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.05084944  0.03628633
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.8098706          0.8171521 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -2.5345, df = 19.371, p-value = 0.02002
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.013033829 -0.001251885
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9629149          0.9700577 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = 1.0098, df = 21.904, p-value = 0.3236
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.006526118  0.018907070
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9290476          0.9228571 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -14.395, df = 17.198, p-value = 5.017e-11
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02333146 -0.01737104
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9584965          0.9788478 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -3.2692, df = 14.499, p-value = 0.005376
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0047573632 -0.0009954131
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9855634          0.9884398 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.9089, df = 4.3274, p-value = 0.4112
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.04990296  0.02473562
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.5287818          0.5413655 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -0.17167, df = 21.361, p-value = 0.8653
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.004413334  0.003739619
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.9082193          0.9085562 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Features
## t = -1.136, df = 19.895, p-value = 0.2694
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.021287402  0.006279503
## sample estimates:
## mean in group Augm mean in group Orig 
##          0.7298578          0.7373618 
## 
## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display[result.display$Classifier == classifier, ]$orig and result.display[result.display$Classifier == classifier, ]$augm
## W = 138, p-value = 0.724
## alternative hypothesis: true location shift is not equal to 0
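
The per-dataset comparisons above are Welch two-sample t-tests of `TestAccuracy` by `Features`, followed by a Wilcoxon rank sum test across datasets. A minimal sketch of how such output could be produced is shown below. The data frame `results` and the number of runs are hypothetical and the per-run accuracies are synthetic. The `orig` and `augm` vectors are rounded group means taken from the first six datasets of the rf output above.

```r
# Synthetic per-run accuracies for one dataset (illustration only).
set.seed(23)
runs <- 10  # hypothetical number of evaluation runs per variant
results <- data.frame(
  Features = factor(rep(c("Augm", "Orig"), each = runs)),
  TestAccuracy = c(rnorm(runs, mean = 0.95, sd = 0.01),
                   rnorm(runs, mean = 0.96, sd = 0.01))
)

# t.test() does not assume equal variances by default, i.e. it is Welch's test.
welch <- t.test(TestAccuracy ~ Features, data = results)
print(welch)

# Across-dataset comparison: one mean accuracy per dataset and variant
# (rounded rf means for wine, breast-cancer-wisconsin, yeast, glass,
# ecoli, and vowel-context).
orig <- c(0.9811, 0.9668, 0.6072, 0.7373, 0.8529, 0.9202)
augm <- c(0.9801, 0.9682, 0.5899, 0.7238, 0.8313, 0.9106)
rank_sum <- wilcox.test(orig, augm)
print(rank_sum)
```

Because the two vectors are matched by dataset, a paired test such as `wilcox.test(orig, augm, paired = TRUE)` would be a natural alternative. The sketch keeps the unpaired rank sum form to match the output above.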

5.2 Cluster representation test

After clustering the training examples, there are many ways to use the cluster information to create new features. In this experiment, we compare several of them to determine which works best. The two main options are to encode each cluster as a binary feature or as a numerical feature. In the binary case, the values indicate whether an example belongs to a given cluster (1) or not (0). In the numerical case, the values indicate the distance from each example to a given cluster representative. Several variations of these two schemes are also possible. In the output below, the compared representations are labeled bin.dist., binary, distance, inv.dist.2, and prob.
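
The binary and distance encodings described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the experiment: it uses base R `kmeans` on the built-in `iris` data, takes cluster centers as the cluster representatives, and uses a hypothetical helper `dist_to_centers`.

```r
set.seed(23)
x <- as.matrix(iris[, 1:4])
train_idx <- sample(nrow(x), size = nrow(x) / 2)
x_train <- x[train_idx, ]
x_test <- x[-train_idx, ]

# Cluster the training examples only; k = 3 is an arbitrary choice here.
k <- 3
km <- kmeans(x_train, centers = k, nstart = 10)

# Hypothetical helper: Euclidean distance from each row of `points`
# to each cluster center (one output column per cluster).
dist_to_centers <- function(points, centers) {
  d <- sapply(seq_len(nrow(centers)), function(j) {
    sqrt(rowSums(sweep(points, 2, centers[j, ])^2))
  })
  colnames(d) <- paste0("dist_c", seq_len(nrow(centers)))
  d
}

# Numerical encoding: distance of every test example to every center.
distance_features <- dist_to_centers(x_test, km$centers)

# Binary encoding: 1 for the nearest cluster, 0 for all others.
nearest <- max.col(-distance_features, ties.method = "first")
binary_features <- diag(k)[nearest, , drop = FALSE]
colnames(binary_features) <- paste0("in_c", seq_len(k))

head(distance_features)
head(binary_features)
```

Either matrix can then be used on its own or bound to the original attributes with `cbind` before training a classifier.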

## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Representation  4 0.02793 0.006983   19.56 2.18e-09 ***
## Residuals      45 0.01606 0.000357                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.65918   -       -        -         
## distance   3.3e-07   1.0e-06 -        -         
## inv.dist.2 0.00018   0.00080 0.04602  -         
## prob       3.3e-07   1.0e-06 1.00000  0.04602   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.6667 0.6185
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.94178, p-value = 0.01585
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
##                Df   Sum Sq   Mean Sq F value Pr(>F)
## Representation  4 0.000276 6.897e-05   0.752  0.562
## Residuals      45 0.004128 9.173e-05               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary distance inv.dist.2
## binary     0.95      -      -        -         
## distance   0.68      0.68   -        -         
## inv.dist.2 0.68      0.68   0.68     -         
## prob       0.68      0.68   0.95     0.68      
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.8776 0.4849
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97904, p-value = 0.512
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Representation  4 0.01995 0.004986   17.75 8.23e-09 ***
## Residuals      45 0.01264 0.000281                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.19445   -       -        -         
## distance   1.7e-07   8.1e-06 -        -         
## inv.dist.2 0.09926   0.67992 2.4e-05  -         
## prob       2.5e-06   0.00012 0.37199  0.00038   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.1782 0.9485
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97546, p-value = 0.3801
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## Representation  4 0.2721 0.06802    21.2 7.01e-10 ***
## Residuals      45 0.1444 0.00321                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.12554   -       -        -         
## distance   4.9e-05   4.1e-07 -        -         
## inv.dist.2 0.00032   0.02648 2.2e-10  -         
## prob       0.47877   0.03039 0.00032  4.9e-05   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  1.2684 0.2964
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97218, p-value = 0.2832
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Representation  4 0.02332 0.005831   9.945 7.47e-06 ***
## Residuals      45 0.02638 0.000586                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary distance inv.dist.2
## binary     0.3881    -      -        -         
## distance   0.0182    0.1022 -        -         
## inv.dist.2 0.0350    0.0057 3.1e-05  -         
## prob       0.0080    0.0505 0.6988   1.7e-05   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.7483 0.5644
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97904, p-value = 0.512
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
##                Df Sum Sq Mean Sq F value Pr(>F)    
## Representation  4 1.9464  0.4866    72.8 <2e-16 ***
## Residuals      45 0.3008  0.0067                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.89075   -       -        -         
## distance   2.1e-12   2.4e-12 -        -         
## inv.dist.2 0.00024   0.00018 < 2e-16  -         
## prob       1.3e-10   1.7e-10 0.17700  1.2e-15   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  4  9.8924 7.883e-06 ***
##       45                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.69586, p-value = 6.98e-09
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Representation  4 0.09402 0.023506   10.77 3.29e-06 ***
## Residuals      45 0.09822 0.002183                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.1147    -       -        -         
## distance   0.0087    0.2574  -        -         
## inv.dist.2 0.0065    6.1e-05 2.0e-06  -         
## prob       0.7997    0.0772  0.0065   0.0087    
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.4596 0.7649
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.95042, p-value = 0.03549
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Representation  4 0.03721 0.009303    47.7 1.26e-15 ***
## Residuals      45 0.00878 0.000195                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.0019    -       -        -         
## distance   9.4e-11   8.7e-15 -        -         
## inv.dist.2 0.0034    1.1e-07 1.8e-06  -         
## prob       1.1e-07   3.3e-12 0.0322   0.0019    
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  1.4998 0.2183
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.91429, p-value = 0.001476
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## Representation  4 0.1633 0.04083   10.98 2.68e-06 ***
## Residuals      45 0.1673 0.00372                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.17925   -       -        -         
## distance   0.00992   0.00021 -        -         
## inv.dist.2 0.01002   0.17925 5.4e-06  -         
## prob       0.14459   0.00760 0.20653  0.00020   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.4089 0.8012
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.98959, p-value = 0.9361
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## Representation  4 0.4853 0.12132   8.627 2.95e-05 ***
## Residuals      45 0.6328 0.01406                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary distance inv.dist.2
## binary     0.7665    -      -        -         
## distance   0.1627    0.2220 -        -         
## inv.dist.2 0.0033    0.0016 4.5e-05  -         
## prob       0.2220    0.3223 0.7665   6.1e-05   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  4  6.1258 0.0005048 ***
##       45                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.60945, p-value = 2.697e-10
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## Representation  4 0.2142 0.05356   13.61 2.38e-07 ***
## Residuals      45 0.1771 0.00394                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary distance inv.dist.2
## binary     0.6348    -      -        -         
## distance   0.0064    0.0180 -        -         
## inv.dist.2 0.0074    0.0029 1.1e-06  -         
## prob       0.0035    0.0099 0.7768   8.1e-07   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  4  6.8709 0.0002094 ***
##       45                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.8741, p-value = 7.443e-05
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
##                Df   Sum Sq   Mean Sq F value Pr(>F)    
## Representation  4 0.012428 0.0031071   117.2 <2e-16 ***
## Residuals      45 0.001193 0.0000265                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.1899    -       -        -         
## distance   < 2e-16   < 2e-16 -        -         
## inv.dist.2 < 2e-16   4.2e-15 0.0099   -         
## prob       < 2e-16   < 2e-16 0.1568   0.2114    
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.5514  0.699
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97185, p-value = 0.2748
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
##                Df   Sum Sq   Mean Sq F value Pr(>F)    
## Representation  4 0.003688 0.0009219   216.7 <2e-16 ***
## Residuals      45 0.000191 0.0000043                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.557     -       -        -         
## distance   < 2e-16   < 2e-16 -        -         
## inv.dist.2 < 2e-16   < 2e-16 6.4e-05  -         
## prob       < 2e-16   < 2e-16 0.036    0.032     
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  4  3.1226 0.02379 *
##       45                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.98452, p-value = 0.7509
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq Mean Sq F value Pr(>F)    
## Representation  4 0.31140 0.07785   121.4 <2e-16 ***
## Residuals      45 0.02886 0.00064                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.042     -       -        -         
## distance   < 2e-16   < 2e-16 -        -         
## inv.dist.2 4.0e-08   4.2e-05 7.4e-15  -         
## prob       < 2e-16   4.5e-15 8.3e-05  6.8e-09   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  0.8537  0.499
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97051, p-value = 0.2428
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq  Mean Sq F value Pr(>F)    
## Representation  4 0.01355 0.003387   88.12 <2e-16 ***
## Residuals      45 0.00173 0.000038                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.76      -       -        -         
## distance   6.7e-15   2.2e-15 -        -         
## inv.dist.2 2.2e-15   2.0e-15 0.76     -         
## prob       4.4e-15   2.2e-15 0.84     0.84      
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  1.3802 0.2559
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97923, p-value = 0.5201
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
##                Df  Sum Sq Mean Sq F value   Pr(>F)    
## Representation  4 0.13323 0.03331    18.1 6.34e-09 ***
## Residuals      45 0.08281 0.00184                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Representation 
## 
##            bin.dist. binary  distance inv.dist.2
## binary     0.91198   -       -        -         
## distance   6.5e-08   6.5e-08 -        -         
## inv.dist.2 0.30243   0.28470 1.9e-06  -         
## prob       0.00085   0.00076 0.00518  0.01417   
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  4  1.5019 0.2177
##       45               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.68079, p-value = 3.801e-09

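The ANOVA detects significant differences between the representations on every dataset except breast-cancer-wisconsin (p = 0.562). Because the Levene and Shapiro-Wilk tests indicate that the ANOVA assumptions are violated on several datasets (e.g. vowel-context, image-segmentation, and ionosphere), we also performed the non-parametric Friedman test with the post-hoc Nemenyi test. The distance-based representation obtains the best average rank (4.750) and is significantly better than the bin.dist., inv.dist.2, and binary representations (p < 0.001 in each case). Its advantage over the prob representation (average rank 4.000) is not statistically significant (p = 0.6651).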

##  bin.dist. inv.dist.2       bin.       prob      dist. 
##      1.875      2.125      2.250      4.000      4.750
## 
##  Friedman rank sum test
## 
## data:  friedmanData
## Friedman chi-squared = 42.6, df = 4, p-value = 1.253e-08
## 
##  Pairwise comparisons using Nemenyi multiple comparison test 
##              with q approximation for unreplicated blocked data 
## 
## data:  friedmanData 
## 
##            bin.dist. inv.dist.2 bin.    prob  
## inv.dist.2 0.9917    -          -       -     
## bin.       0.9627    0.9994     -       -     
## prob       0.0014    0.0071     0.0150  -     
## dist.      2.7e-06   2.6e-05    7.6e-05 0.6651
## 
## P value adjustment method: none
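
The ranking and the Friedman test above can be reproduced in outline with base R. The following is a minimal sketch, not the original analysis code: the accuracy matrix is randomly generated, only the name friedmanData and the representation labels are taken from the output, and the ranking direction (higher accuracy receives a higher rank) is an assumption about how the ranks above were computed.

```r
# Hypothetical accuracies: one row per dataset (block), one column per
# feature representation (treatment). All values are made up for illustration.
set.seed(23)
representations <- c("bin.dist.", "inv.dist.2", "bin.", "prob", "dist.")
friedmanData <- matrix(runif(16 * 5, min = 0.6, max = 0.9), nrow = 16,
                       dimnames = list(NULL, representations))
friedmanData[, "dist."] <- friedmanData[, "dist."] + 0.1  # assumed advantage

# Rank the representations within each dataset (rank 1 = lowest accuracy),
# then average the ranks over datasets.
ranks_per_dataset <- t(apply(friedmanData, 1, rank))
mean_ranks <- sort(colMeans(ranks_per_dataset))
print(mean_ranks)

# Friedman rank sum test: rows are blocks, columns are treatments.
ft <- friedman.test(friedmanData)
print(ft)

# The post-hoc table above has the format printed by the Nemenyi test in the
# PMCMR package; it is not called here so that the sketch needs base R only.
```

With five representations and no ties, the mean ranks always sum to 15, which is a quick sanity check on the ranking step.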

5.3 Comparison of clustering algorithms

## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)  
## Clustering   3 0.002213 0.0007377   2.618 0.0679 .
## Residuals   32 0.009018 0.0002818                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap    cm    km   
## cm 0.274 -     -    
## km 0.653 0.236 -    
## sc 0.272 0.064 0.317
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.5215 0.6706
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.9668, p-value = 0.3447
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
##             Df    Sum Sq   Mean Sq F value Pr(>F)
## Clustering   3 0.0000471 1.569e-05    0.16  0.922
## Residuals   32 0.0031375 9.805e-05               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap  cm  km 
## cm 0.9 -   -  
## km 0.9 0.9 -  
## sc 0.9 0.9 0.9
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.8851 0.4593
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.98277, p-value = 0.8344
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq Mean Sq F value   Pr(>F)    
## Clustering   3 0.03407 0.01136   33.39 5.65e-10 ***
## Residuals   32 0.01088 0.00034                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap      cm      km  
## cm 3.9e-08 -       -   
## km 0.26    3.9e-09 -   
## sc 0.26    2.9e-08 0.88
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.4506 0.7186
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.98151, p-value = 0.7952
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value Pr(>F)  
## Clustering   3 0.01853 0.006178   3.165 0.0377 *
## Residuals   32 0.06246 0.001952                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap    cm    km   
## cm 0.091 -     -    
## km 0.774 0.114 -    
## sc 0.474 0.048 0.411
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.6883 0.5658
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.95503, p-value = 0.1505
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)
## Clustering   3 0.002886 0.0009619   1.888  0.151
## Residuals   32 0.016303 0.0005095               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap   cm   km  
## cm 0.26 -    -   
## km 0.72 0.18 -   
## sc 0.82 0.26 0.82
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  3  3.3372 0.03147 *
##       32                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97775, p-value = 0.669
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
##             Df Sum Sq Mean Sq F value Pr(>F)    
## Clustering   3 1.5223  0.5074   996.7 <2e-16 ***
## Residuals   32 0.0163  0.0005                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap     cm     km  
## cm <2e-16 -      -   
## km 0.39   <2e-16 -   
## sc 0.65   <2e-16 0.65
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value   Pr(>F)   
## group  3  6.2322 0.001867 **
##       32                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96635, p-value = 0.3345
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value Pr(>F)  
## Clustering   3 0.01083 0.003610   2.827 0.0541 .
## Residuals   32 0.04086 0.001277                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap    cm    km   
## cm 0.063 -     -    
## km 0.869 0.063 -    
## sc 0.869 0.063 0.869
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.2705 0.8462
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96089, p-value = 0.2289
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq  Mean Sq F value   Pr(>F)    
## Clustering   3 0.021183 0.007061   33.78 4.92e-10 ***
## Residuals   32 0.006689 0.000209                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap    cm    km  
## cm 1e-08 -     -   
## km 0.87  1e-08 -   
## sc 0.50  1e-08 0.50
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value   Pr(>F)   
## group  3  4.9791 0.006015 **
##       32                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.95827, p-value = 0.1899
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Clustering   3 0.3201 0.10671   56.01 7.83e-13 ***
## Residuals   32 0.0610 0.00191                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap      cm      km  
## cm 1.0e-11 -       -   
## km 0.83    2.1e-11 -   
## sc 0.85    2.8e-10 0.85
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.8642 0.4697
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.98203, p-value = 0.8116
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Clustering   3 0.01369 0.004563   38.33 1.05e-10 ***
## Residuals   32 0.00381 0.000119                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap      cm      km  
## cm 3.9e-09 -       -   
## km 0.61    1.4e-09 -   
## sc 0.61    5.8e-09 0.81
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  1.6778 0.1914
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.94638, p-value = 0.08042
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)  
## Clustering   3 0.004644 0.0015480   3.121 0.0395 *
## Residuals   32 0.015869 0.0004959                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap    cm    km   
## cm 0.083 -     -    
## km 0.744 0.050 -    
## sc 0.805 0.083 0.805
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.4384 0.7271
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.98146, p-value = 0.7936
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
##             Df Sum Sq Mean Sq F value Pr(>F)    
## Clustering   3 1.6721  0.5574   555.3 <2e-16 ***
## Residuals   32 0.0321  0.0010                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap     cm     km  
## cm <2e-16 -      -   
## km 0.98   <2e-16 -   
## sc 0.98   <2e-16 0.98
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  3  11.854 2.204e-05 ***
##       32                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.68614, p-value = 1.725e-07
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq  Mean Sq F value Pr(>F)    
## Clustering   3 0.025723 0.008574    1500 <2e-16 ***
## Residuals   32 0.000183 0.000006                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap     cm     km  
## cm <2e-16 -      -   
## km 0.85   <2e-16 -   
## sc 0.77   <2e-16 0.77
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  1.2316 0.3143
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.78154, p-value = 7.147e-06
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Clustering   3 0.1187 0.03958   52.76 1.73e-12 ***
## Residuals   32 0.0240 0.00075                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap      cm      km  
## cm 2.1e-10 -       -   
## km 0.20    1.0e-11 -   
## sc 0.42    3.3e-10 0.70
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.8042 0.5008
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.92051, p-value = 0.01303
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)    
## Clustering   3 0.006261 0.0020871   129.6 <2e-16 ***
## Residuals   32 0.000515 0.0000161                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap      cm      km  
## cm < 2e-16 -       -   
## km 0.89    < 2e-16 -   
## sc 0.89    6.6e-15 0.92
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  0.4058 0.7498
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97162, p-value = 0.4714
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
##             Df Sum Sq Mean Sq F value Pr(>F)    
## Clustering   3 0.4243 0.14142     413 <2e-16 ***
## Residuals   32 0.0110 0.00034                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$TestAccuracy and test.results[test.results$Dataset == dataset, ]$Clustering 
## 
##    ap     cm     km  
## cm <2e-16 -      -   
## km 0.39   <2e-16 -   
## sc 0.72   <2e-16 0.34
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  3  1.4021 0.2601
##       32               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96816, p-value = 0.3775
##    cm    ap    km    sc 
## 1.125 2.625 2.875 3.375
## 
##  Friedman rank sum test
## 
## data:  friedmanData
## Friedman chi-squared = 27, df = 3, p-value = 5.887e-06
## 
##  Pairwise comparisons using Nemenyi multiple comparison test 
##              with q approximation for unreplicated blocked data 
## 
## data:  friedmanData 
## 
##    cm      ap      km     
## ap 0.00560 -       -      
## km 0.00073 0.94719 -      
## sc 4.9e-06 0.35432 0.69233
## 
## P value adjustment method: none
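
Each per-dataset block above follows the same pattern: a one-way ANOVA, pairwise t tests with pooled SD and BH adjustment, Levene's test (center = median), and a Shapiro-Wilk test on the ANOVA residuals. The sketch below illustrates that sequence in base R under stated assumptions: the data frame is synthetic, the column names mirror those visible in the output, and Levene's test is written out by hand (an ANOVA on absolute deviations from the group medians) rather than calling the function that produced the output above.

```r
# Hypothetical data: 9 accuracy values for each of four clustering algorithms,
# which gives the 3 and 32 degrees of freedom seen in the tables above.
set.seed(23)
results <- data.frame(
  Clustering   = factor(rep(c("ap", "cm", "km", "sc"), each = 9)),
  TestAccuracy = c(rnorm(9, 0.95, 0.02), rnorm(9, 0.90, 0.02),
                   rnorm(9, 0.95, 0.02), rnorm(9, 0.96, 0.02))
)

# One-way ANOVA: does mean accuracy differ between clustering algorithms?
fit <- aov(TestAccuracy ~ Clustering, data = results)
print(summary(fit))

# Pairwise t tests with pooled SD and Benjamini-Hochberg adjustment.
pw <- pairwise.t.test(results$TestAccuracy, results$Clustering,
                      p.adjust.method = "BH", pool.sd = TRUE)
print(pw)

# Levene's test with center = median, written out by hand:
# a one-way ANOVA on absolute deviations from each group's median.
abs_dev <- with(results,
                abs(TestAccuracy - ave(TestAccuracy, Clustering, FUN = median)))
levene <- anova(lm(abs_dev ~ results$Clustering))
print(levene)

# Shapiro-Wilk normality test on the ANOVA residuals.
aov_residuals <- residuals(fit)
sw <- shapiro.test(aov_residuals)
print(sw)
```

The last two tests check the ANOVA assumptions (equal variances and normal residuals). When either is rejected, as for several datasets above, the ANOVA p-values should be read with caution, which is one reason to add a rank-based test such as Friedman's.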

5.4 Global vs local clustering

The intuition behind our approach is that clustering training examples regardless of their class could help generalization through the use of global information. We refer to this approach as global. However, one could argue for an alternative in which clustering is performed separately within each class. We call this approach local. It still adds some global information about the similarity of distant objects, with the additional potential benefit of modeling the space occupied by each class. To determine which approach works better, we compared the two empirically.

To make the comparison meaningful, both approaches must use the same number of clusters, so that the results depend only on the generated clusters and not on their quantity. The experiment was therefore performed as follows. First, we clustered each class separately using affinity propagation, which determines the number of clusters automatically. Next, we repeated the experiment with the global approach, using k-means with the number of clusters set to the total number of clusters found across all classes. The results of this experiment are presented below.
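
The procedure can be sketched as follows. This is an illustration under loudly stated assumptions, not the code used in the experiment: k-means stands in for affinity propagation in the per-class step, the per-class cluster counts in k_per_class are hypothetical values replacing the counts affinity propagation would find, distance-based features are used, and clustering is run on the whole iris dataset rather than on a training split.

```r
# Sketch only: k-means replaces affinity propagation in the local step, and
# k_per_class is a hypothetical stand-in for automatically detected counts.
set.seed(23)
X <- as.matrix(iris[, 1:4])
y <- iris$Species
k_per_class <- c(setosa = 2, versicolor = 3, virginica = 2)  # assumed counts

# Local approach: cluster each class separately and stack the centers.
local_centers <- do.call(rbind, lapply(levels(y), function(cl) {
  kmeans(X[y == cl, , drop = FALSE], centers = k_per_class[[cl]],
         nstart = 10)$centers
}))

# Global approach: ignore the class and use the same total number of clusters.
k_total <- sum(k_per_class)
global_centers <- kmeans(X, centers = k_total, nstart = 10)$centers

# Distance-based features: Euclidean distance from each example to each center.
distance_features <- function(X, centers) {
  d <- sapply(seq_len(nrow(centers)), function(j) {
    sqrt(rowSums(sweep(X, 2, centers[j, ])^2))
  })
  colnames(d) <- paste0("cluster", seq_len(nrow(centers)))
  d
}

local_features  <- distance_features(X, local_centers)
global_features <- distance_features(X, global_centers)
```

Both feature matrices have the same number of columns, so any difference in downstream classifier accuracy can be attributed to where the cluster centers lie rather than to how many there are.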

## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 5.9959, df = 13.802, p-value = 3.481e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.03427849 0.07253969
## sample estimates:
## mean in group global  mean in group local 
##            0.9613636            0.9079545 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 0.72127, df = 17.253, p-value = 0.4804
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.005635996  0.011501099
## sample estimates:
## mean in group global  mean in group local 
##            0.9633431            0.9604106 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = -1.0681, df = 17.011, p-value = 0.3004
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.016909103  0.005542391
## sample estimates:
## mean in group global  mean in group local 
##            0.5926928            0.5983762 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 4.4209, df = 17.688, p-value = 0.0003427
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.02745658 0.07730533
## sample estimates:
## mean in group global  mean in group local 
##            0.6847619            0.6323810 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 0.31944, df = 16.244, p-value = 0.7535
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0169526  0.0229767
## sample estimates:
## mean in group global  mean in group local 
##            0.8554217            0.8524096 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 9.0132, df = 17.746, p-value = 4.855e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.05467333 0.08795294
## sample estimates:
## mean in group global  mean in group local 
##            0.9072727            0.8359596 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 7.1795, df = 17.954, p-value = 1.121e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.05941482 0.10858518
## sample estimates:
## mean in group global  mean in group local 
##                0.952                0.868 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = -0.10756, df = 15.922, p-value = 0.9157
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01618464  0.01462214
## sample estimates:
## mean in group global  mean in group local 
##            0.7617188            0.7625000 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 3.0262, df = 17.042, p-value = 0.007601
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.01705914 0.09556221
## sample estimates:
## mean in group global  mean in group local 
##            0.7601942            0.7038835 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 4.9222, df = 12.51, p-value = 0.000312
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.007796851 0.020081937
## sample estimates:
## mean in group global  mean in group local 
##            0.9451082            0.9311688 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 2.2086, df = 14.541, p-value = 0.04371
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.0008487463 0.0517226822
## sample estimates:
## mean in group global  mean in group local 
##            0.9314286            0.9051429 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 2.0559, df = 13.042, p-value = 0.06038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0001347205  0.0054746991
## sample estimates:
## mean in group global  mean in group local 
##            0.9760413            0.9733713 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = -2.1648, df = 16.351, p-value = 0.04552
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.124070e-03 -2.411837e-05
## sample estimates:
## mean in group global  mean in group local 
##            0.9900601            0.9911342 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 1.2103, df = 2.4397, p-value = 0.3303
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.05989874  0.11960423
## sample estimates:
## mean in group global  mean in group local 
##            0.5425703            0.5127175 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = 6.5571, df = 17.996, p-value = 3.68e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.009551438 0.018558015
## sample estimates:
## mean in group global  mean in group local 
##            0.9009639            0.8869092 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Clustering
## t = -0.069655, df = 17.94, p-value = 0.9452
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01477215  0.01382428
## sample estimates:
## mean in group global  mean in group local 
##            0.7194313            0.7199052

The results of this experiment do not show a clear winner, although the global approach achieves higher mean accuracy than the local approach on 12 of the 16 datasets. Nevertheless, the one-sided Wilcoxon rank sum test shown below did not find a significant difference between the two approaches at alpha = 0.05 (p = 0.27), so we treat both approaches as equally valid. Given the above, we lean towards the global approach, as it generally detects a smaller number of clusters and therefore generates fewer new features, which in turn helps generalization.

## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display$global and result.display$local
## W = 145, p-value = 0.2696
## alternative hypothesis: true location shift is greater than 0
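
The output above is an unpaired, one-sided rank sum test on the per-dataset mean accuracies. Because both approaches were evaluated on the same datasets, a paired signed rank test is a natural companion. The sketch below runs both variants on the means reported in the Welch t-test output above; the vector names global_acc and local_acc are introduced here for illustration and do not appear in the original code.

```r
# Mean test accuracies per dataset, copied from the Welch t-test output above
# (dataset order: wine, breast-cancer-wisconsin, yeast, glass, ecoli,
# vowel-context, iris, pima-indians-diabetes, sonar.all, image-segmentation,
# ionosphere, optdigits, pendigits, spectrometer, statlog-satimage,
# statlog-vehicle).
global_acc <- c(0.9613636, 0.9633431, 0.5926928, 0.6847619, 0.8554217,
                0.9072727, 0.9520000, 0.7617188, 0.7601942, 0.9451082,
                0.9314286, 0.9760413, 0.9900601, 0.5425703, 0.9009639,
                0.7194313)
local_acc  <- c(0.9079545, 0.9604106, 0.5983762, 0.6323810, 0.8524096,
                0.8359596, 0.8680000, 0.7625000, 0.7038835, 0.9311688,
                0.9051429, 0.9733713, 0.9911342, 0.5127175, 0.8869092,
                0.7199052)

# Unpaired, one-sided rank sum test: the form shown in the output above.
rank_sum <- wilcox.test(global_acc, local_acc, alternative = "greater")
print(rank_sum)

# Paired, one-sided signed rank test: treats each dataset as a matched pair.
signed_rank <- wilcox.test(global_acc, local_acc, paired = TRUE,
                           alternative = "greater")
print(signed_rank)
```

The two tests answer different questions and need not agree. The rank sum test compares the two sets of accuracies as independent samples, so large accuracy differences between datasets can mask a consistent per-dataset difference; the signed rank test works on the within-dataset differences and is typically more sensitive in that situation.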

5.5 Supervised vs semi-supervised learning

Because the method discussed in this research creates new features without using the decision attribute, it can easily be applied in a semi-supervised setting. In this experiment, we therefore check whether clustering on both the training and the test data produces better features than clustering on the training data alone.
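
The difference between the two settings can be sketched as follows. This is a minimal illustration under assumptions not stated in the source: k-means as the clustering algorithm, distance-based features, the iris data, a 50/50 split, and a hypothetical k = 5. It also assumes that the supervised setting clusters the training examples only, while the semi-supervised setting clusters training and test examples together; class labels are never used in either case.

```r
set.seed(23)
X <- scale(as.matrix(iris[, 1:4]))   # features only; labels are never used
train_idx <- sample(nrow(X), size = nrow(X) %/% 2)
k <- 5                               # hypothetical number of clusters

# Supervised setting: clusters are found on the training examples only.
centers_train <- kmeans(X[train_idx, ], centers = k, nstart = 10)$centers

# Semi-supervised setting: clusters are found on training and test examples.
centers_all <- kmeans(X, centers = k, nstart = 10)$centers

# Distance-based features: Euclidean distance from each example to each center.
distance_features <- function(X, centers) {
  n_centers <- nrow(centers)
  all_dist <- as.matrix(dist(rbind(centers, X)))
  all_dist[-seq_len(n_centers), seq_len(n_centers), drop = FALSE]
}

test_features_supervised <- distance_features(X[-train_idx, ], centers_train)
test_features_semi       <- distance_features(X[-train_idx, ], centers_all)
```

In both settings the classifier would be trained on features of the training examples and evaluated on features of the test examples; only the data used to place the cluster centers differs.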

## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 0.20328, df = 9.9817, p-value = 0.843
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01887067  0.02265855
## sample estimates:
## mean in group Full mean in group Semi 
##          0.9715909          0.9696970 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = -0.89332, df = 5.8177, p-value = 0.4071
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01617229  0.00757014
## sample estimates:
## mean in group Full mean in group Semi 
##          0.9609971          0.9652981 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = -0.047385, df = 10.038, p-value = 0.9631
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01732001  0.01659831
## sample estimates:
## mean in group Full mean in group Semi 
##          0.5972936          0.5976545 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 0.36697, df = 8.8116, p-value = 0.7223
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03620936  0.05017762
## sample estimates:
## mean in group Full mean in group Semi 
##          0.7038095          0.6968254 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 1.031, df = 13.606, p-value = 0.3205
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.009812663  0.027884952
## sample estimates:
## mean in group Full mean in group Semi 
##          0.8554217          0.8463855 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = -2.8507, df = 13.118, p-value = 0.01354
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.024612036 -0.003401432
## sample estimates:
## mean in group Full mean in group Semi 
##          0.8977778          0.9117845 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 2.664, df = 13.662, p-value = 0.01882
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.004976059 0.046579497
## sample estimates:
## mean in group Full mean in group Semi 
##          0.9346667          0.9088889 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = -0.86017, df = 9.8588, p-value = 0.4101
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02652851  0.01177156
## sample estimates:
## mean in group Full mean in group Semi 
##          0.7565104          0.7638889 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = -0.73758, df = 7.9018, p-value = 0.4821
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.06821805  0.03520834
## sample estimates:
## mean in group Full mean in group Semi 
##          0.7893204          0.8058252 
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 0.91768, df = 8.5292, p-value = 0.384
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.006089771  0.014286019
## sample estimates:
## mean in group Full mean in group Semi 
##          0.9388745          0.9347763 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 2.082, df = 13.23, p-value = 0.0573
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.000654393  0.037225822
## sample estimates:
## mean in group Full mean in group Semi 
##          0.9354286          0.9171429 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 1.1522, df = 12.744, p-value = 0.2704
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.001397495  0.004577749
## sample estimates:
## mean in group Full mean in group Semi 
##          0.9751869          0.9735968 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 0.038616, df = 12.294, p-value = 0.9698
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.001677079  0.001737762
## sample estimates:
## mean in group Full mean in group Semi 
##          0.9909885          0.9909582 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 8.1245e-15, df = 9.0049, p-value = 1
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.03091005  0.03091005
## sample estimates:
## mean in group Full mean in group Semi 
##          0.5421687          0.5421687 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = 1.5716, df = 10.528, p-value = 0.1456
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.001616296  0.009535035
## sample estimates:
## mean in group Full mean in group Semi 
##          0.8965796          0.8926202 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Supervision
## t = -1.4466, df = 12.778, p-value = 0.1721
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.031348696  0.006230212
## sample estimates:
## mean in group Full mean in group Semi 
##          0.7090047          0.7215640
## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display$Semi and result.display$Full
## W = 124, p-value = 0.8965
## alternative hypothesis: true location shift is not equal to 0
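
The output above follows one pattern: a Welch t-test on the repeated accuracy measurements for each dataset, then a Wilcoxon rank sum test on the per-dataset mean accuracies. A minimal sketch of that pattern is given below. It assumes a long-format data frame with columns `Dataset`, `Supervision` and `TestAccuracy`; the name `results` and the synthetic accuracies are illustrative assumptions, not the data behind this report.

```r
set.seed(23)

# Synthetic stand-in for the experiment results (assumption: 8 runs per group).
results <- expand.grid(Run = 1:8,
                       Supervision = c("Full", "Semi"),
                       Dataset = c("wine", "iris", "glass", "ecoli"),
                       stringsAsFactors = FALSE)
results$TestAccuracy <- rnorm(nrow(results), mean = 0.9, sd = 0.02)

# One Welch two-sample t-test per dataset (t.test() does not assume equal
# variances by default, which is what makes it the Welch variant).
per_dataset <- lapply(split(results, results$Dataset), function(d) {
  t.test(TestAccuracy ~ Supervision, data = d)
})
p_values <- sapply(per_dataset, function(tt) tt$p.value)
print(round(p_values, 4))

# Mean accuracy per dataset and group, then a rank sum test across datasets.
mean_acc <- tapply(results$TestAccuracy,
                   list(results$Dataset, results$Supervision), mean)
overall <- wilcox.test(mean_acc[, "Semi"], mean_acc[, "Full"])
print(overall)
```

Because the two columns of `mean_acc` are matched by dataset, a paired signed-rank test (`paired = TRUE`) would be a natural alternative; the output in this report shows the unpaired rank sum variant.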

5.6 Distance measure

## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = -0.47108, df = 17.754, p-value = 0.6433
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01862804  0.01180986
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.9693182                 0.9727273 
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 1.0327, df = 17.841, p-value = 0.3155
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.00425198  0.01246312
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.9662757                 0.9621701 
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 2.9695, df = 17.809, p-value = 0.008281
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.005610075 0.032820237
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.6027064                 0.5834912 
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 3.2122, df = 16.744, p-value = 0.00519
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.01956735 0.09471837
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.6761905                 0.6190476 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = -1.194, df = 11.55, p-value = 0.2564
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.032422719  0.009531152
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.8530120                 0.8644578 
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 12.114, df = 14.886, p-value = 4.148e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.06241872 0.08909643
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.8686869                 0.7929293 
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = -4.4353, df = 11.058, p-value = 0.0009901
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.07978259 -0.02688408
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.9093333                 0.9626667 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 0.25762, df = 17.692, p-value = 0.7997
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01306204  0.01670787
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.7690104                 0.7671875 
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 10.645, df = 16.505, p-value = 8.331e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.1579356 0.2362392
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.7563107                 0.5592233 
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 0.57522, df = 17.056, p-value = 0.5727
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01523945  0.02666803
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.9371429                 0.9314286 
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = -3.1449, df = 13.971, p-value = 0.007179
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.0046509238 -0.0008789207
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.9739765                 0.9767414 
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 12.77, df = 17.994, p-value = 1.85e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.008563161 0.011935656
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.9896960                 0.9794466 
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 34.129, df = 13.111, p-value = 3.394e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.3163897 0.3591123
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.5449799                 0.2072289 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = 10.985, df = 16.369, p-value = 5.75e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.01187471 0.01754072
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.8893657                 0.8746580 
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
## 
##  Welch Two Sample t-test
## 
## data:  TestAccuracy by Distance
## t = -11.619, df = 13.605, p-value = 1.914e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.10194123 -0.07009668
## sample estimates:
##   mean in group Euclidean mean in group Mahalanobis 
##                 0.7080569                 0.7940758
## 
##  Wilcoxon rank sum exact test
## 
## data:  result.display$Maha and result.display$Eucl
## W = 126, p-value = 0.9556
## alternative hypothesis: true location shift is not equal to 0

5.7 Sensitivity test

In this experiment we examine how the number of clusters influences classification quality. We analyze only the new features and discard the original ones. In addition to test set accuracy, we also report training set accuracy to see when the model starts overfitting due to the high dimensionality of the new feature space. We vary the number of clusters from 1 to a ridiculous 200, just to observe exactly what impact this has on classification quality. Let us begin with a linear SVM on the pima-indians-diabetes dataset.

## Warning: Removed 1 row(s) containing missing values (geom_path).

As we can see, test accuracy reaches approximately 78% at around 20 clusters and then stops improving. From around k=35 it starts diverging from the training accuracy, which keeps improving; this is a clear sign of overfitting. Since SVM is a reasonably robust algorithm, the effect isn’t as dramatic as one might expect, so to emphasize the issue let us use a simple logistic regression classifier to get a clearer indication of where the overfitting actually begins.

Now we can observe this effect even more clearly, with the first noticeable differences appearing at around k=30 and the two lines clearly starting to diverge after the k=40 mark. Interestingly, affinity propagation picked 35 as the number of clusters for this dataset, which (judging by these plots) seems just about right!
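
The sweep over k described in this section can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses a two-class subset of the built-in `iris` data instead of pima-indians-diabetes, k-means as the clustering method, Euclidean distances to the cluster centres as the new features, and `glm` logistic regression as the classifier. The helpers `distance_features` and `accuracy_for_k` are hypothetical names, not functions from the code behind this report.

```r
set.seed(23)

# Two-class subset of iris (assumption: stand-in for the report's datasets).
d <- iris[iris$Species != "setosa", ]
X <- as.matrix(d[, 1:4])
y <- as.integer(d$Species == "virginica")
train_idx <- sample(nrow(X), size = nrow(X) / 2)

# New features: Euclidean distance from every row of X to every centre.
distance_features <- function(X, centers) {
  dists <- sapply(seq_len(nrow(centers)), function(j) {
    sqrt(rowSums(sweep(X, 2, centers[j, ])^2))
  })
  dists <- matrix(dists, nrow = nrow(X))  # keep matrix shape when k = 1
  colnames(dists) <- paste0("c", seq_len(nrow(centers)))
  as.data.frame(dists)
}

# Cluster the training half, encode both halves, fit on the new features only.
accuracy_for_k <- function(k) {
  Xtr <- X[train_idx, , drop = FALSE]
  Xte <- X[-train_idx, , drop = FALSE]
  km <- kmeans(Xtr, centers = k, nstart = 5)
  train_df <- cbind(distance_features(Xtr, km$centers), y = y[train_idx])
  test_df  <- cbind(distance_features(Xte, km$centers), y = y[-train_idx])
  # Separation warnings are expected for larger k and are silenced here.
  fit <- suppressWarnings(glm(y ~ ., data = train_df, family = binomial))
  predict_class <- function(df) {
    p <- suppressWarnings(predict(fit, newdata = df, type = "response"))
    as.integer(p > 0.5)
  }
  c(k = k,
    train = mean(predict_class(train_df) == train_df$y),
    test  = mean(predict_class(test_df) == test_df$y))
}

ks <- c(1, 2, 3, 5, 8, 10)
sensitivity <- as.data.frame(t(sapply(ks, accuracy_for_k)))
print(sensitivity)
```

Plotting the `train` and `test` columns against `k` gives the kind of curve discussed above; a widening gap between the two is the overfitting signal.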

Now let’s see how this experiment turns out for the other datasets with a linear SVM.

## [1] "wine"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "breast-cancer-wisconsin"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "yeast"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "glass"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "ecoli"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "vowel-context"

## [1] "iris"

## [1] "pima-indians-diabetes"

## [1] "sonar.all"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "image-segmentation"

## [1] "ionosphere"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "optdigits"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "pendigits"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "spectrometer"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "statlog-satimage"
## Warning: Removed 1 row(s) containing missing values (geom_path).

## [1] "statlog-vehicle"
## Warning: Removed 1 row(s) containing missing values (geom_path).

5.8 Is there any difference between high and low quality features?

Since our sensitivity experiment has already established that the number of clusters has a clear influence on classification quality, let us now check whether the quality of the individual new features makes any difference. To do so, we cluster the dataset into a certain number of clusters, encode the clusters as new features, and evaluate the quality of each new feature using the Fisher score. Next, we add the new features one by one, in order of both increasing and decreasing quality, to observe the effect they have on classification accuracy. Again, we use a linear SVM and the pima-indians-diabetes dataset with k=35 (as determined by affinity propagation).


Ultimately, what our approach does is select points in n-dimensional space, calculate the distances between all data points and these new points, and encode those distances as new features. This plot shows (at least to some degree) that the choice of these points matters and has a high impact on classification quality. On the diagram, the blue line represents classification quality when new features are added in descending order of Fisher score, while the green line represents the same in ascending order. The lines obviously meet at the end, since in both cases all features are eventually used for classification. However, what happens before that is a clear indication that some points (clusters) hold more information than others.
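
The feature ranking used in this experiment can be sketched as follows. The sketch assumes one common definition of the Fisher score, the class-size-weighted between-class scatter of a feature divided by its class-size-weighted within-class variance; the exact variant used for the plot is not shown in this report. The function name `fisher_score` is hypothetical, and the original `iris` columns stand in for the cluster-derived features.

```r
# Fisher score of a single numeric feature x for class labels y:
#   sum_c n_c * (mean_c - mean)^2  /  sum_c n_c * var_c
# where var_c is the (population) variance of x within class c.
fisher_score <- function(x, y) {
  y <- as.factor(y)
  n_c   <- tapply(x, y, length)
  mu_c  <- tapply(x, y, mean)
  var_c <- tapply(x, y, function(v) mean((v - mean(v))^2))
  sum(n_c * (mu_c - mean(x))^2) / sum(n_c * var_c)
}

# Score every feature, then build the two orders in which features are added.
scores <- sapply(iris[, 1:4], fisher_score, y = iris$Species)
descending <- names(sort(scores, decreasing = TRUE))  # best features first
ascending  <- rev(descending)                         # worst features first

print(round(scores, 3))
print(descending)
```

Fitting the classifier on the first 1, 2, ..., k features of `descending` and of `ascending` yields the two accuracy curves compared above.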

5.9 All vs new vs original features
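
For each dataset, the output in this section consists of a one-way ANOVA of accuracy by feature set, pairwise t-tests with pooled SD and BH adjustment, Levene’s test for homogeneity of variance, and a Shapiro-Wilk test on the ANOVA residuals. A minimal base-R sketch of that sequence is given below. The data frame `acc` and its synthetic accuracies are illustrative assumptions. The report’s output comes from `leveneTest` (the `car` package); here the median-centred Levene test is computed by hand as an ANOVA on absolute deviations from the group medians, so that the sketch needs no extra packages.

```r
set.seed(23)

# Synthetic stand-in: 10 accuracy measurements per feature set (assumption).
acc <- data.frame(
  Features = factor(rep(c("Both", "New", "Original"), each = 10)),
  Accuracy = c(rnorm(10, 0.70, 0.04), rnorm(10, 0.68, 0.04), rnorm(10, 0.62, 0.04))
)

# One-way ANOVA: does mean accuracy differ between the feature sets?
fit <- aov(Accuracy ~ Features, data = acc)
anova_table <- summary(fit)[[1]]
print(anova_table)

# Post-hoc pairwise t-tests with pooled SD and Benjamini-Hochberg adjustment.
pairwise <- pairwise.t.test(acc$Accuracy, acc$Features, p.adjust.method = "BH")
print(pairwise)

# Levene's test (center = median): ANOVA on absolute deviations from medians.
abs_dev <- abs(acc$Accuracy - ave(acc$Accuracy, acc$Features, FUN = median))
levene <- anova(lm(abs_dev ~ acc$Features))
print(levene)

# Shapiro-Wilk normality test on the ANOVA residuals.
normality <- shapiro.test(residuals(fit))
print(normality)
```

The last two tests check the ANOVA assumptions (equal variances and normally distributed residuals); a small p-value in either suggests reading the corresponding ANOVA result with caution.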

## [1] "---------------------------------------------------------------------"
## [1] "wine"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)
## Features     2 0.000370 0.0001851     0.9  0.418
## Residuals   27 0.005553 0.0002057               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both New 
## New      0.86 -   
## Original 0.45 0.45
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.3424 0.7131
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96749, p-value = 0.473
## 
## [1] "---------------------------------------------------------------------"
## [1] "breast-cancer-wisconsin"
## [1] "---------------------------------------------------------------------"
##             Df    Sum Sq   Mean Sq F value Pr(>F)
## Features     2 0.0001967 9.833e-05    0.88  0.426
## Residuals   27 0.0030151 1.117e-04               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both New 
## New      0.59 -   
## Original 0.59 0.67
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.7833  0.467
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.98226, p-value = 0.882
## 
## [1] "---------------------------------------------------------------------"
## [1] "yeast"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)
## Features     2 0.000601 0.0003004   1.115  0.343
## Residuals   27 0.007273 0.0002694               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both New 
## New      0.84 -   
## Original 0.37 0.37
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.8032 0.4583
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.9734, p-value = 0.6357
## 
## [1] "---------------------------------------------------------------------"
## [1] "glass"
## [1] "---------------------------------------------------------------------"
##             Df Sum Sq  Mean Sq F value   Pr(>F)    
## Features     2 0.0364 0.018198   11.02 0.000317 ***
## Residuals   27 0.0446 0.001652                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both   New   
## New      0.2012 -     
## Original 0.0003 0.0046
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  2.0531 0.1479
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.87356, p-value = 0.002013
## 
## [1] "---------------------------------------------------------------------"
## [1] "ecoli"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)
## Features     2 0.001749 0.0008746   2.388  0.111
## Residuals   27 0.009889 0.0003663               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both New 
## New      0.33 -   
## Original 0.33 0.11
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  1.1061 0.3454
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97262, p-value = 0.613
## 
## [1] "---------------------------------------------------------------------"
## [1] "vowel-context"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq Mean Sq F value Pr(>F)    
## Features     2 0.10140 0.05070   173.1  4e-16 ***
## Residuals   27 0.00791 0.00029                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both    New    
## New      0.47    -      
## Original 3.9e-15 6.1e-15
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.4048 0.6711
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96362, p-value = 0.382
## 
## [1] "---------------------------------------------------------------------"
## [1] "iris"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Features     2 0.01593 0.007964    14.9 4.37e-05 ***
## Residuals   27 0.01444 0.000535                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both    New  
## New      0.00048 -    
## Original 0.31145 6e-05
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  2  4.0645 0.02864 *
##       27                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.94517, p-value = 0.1254
## 
## [1] "---------------------------------------------------------------------"
## [1] "pima-indians-diabetes"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value Pr(>F)
## Features     2 0.000848 0.0004241   1.188   0.32
## Residuals   27 0.009637 0.0003569               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both New 
## New      0.47 -   
## Original 0.40 0.47
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  2  2.6448 0.08934 .
##       27                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.93299, p-value = 0.05898
## 
## [1] "---------------------------------------------------------------------"
## [1] "sonar.all"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value Pr(>F)
## Features     2 0.00683 0.003415    2.16  0.135
## Residuals   27 0.04270 0.001581               
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both New 
## New      0.36 -   
## Original 0.36 0.14
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  1.8938   0.17
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97885, p-value = 0.7942
## 
## [1] "---------------------------------------------------------------------"
## [1] "image-segmentation"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value   Pr(>F)    
## Features     2 0.004130 0.0020651   30.24 1.28e-07 ***
## Residuals   27 0.001844 0.0000683                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both    New    
## New      9.2e-08 -      
## Original 0.013   4.7e-05
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  1.5225 0.2363
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97807, p-value = 0.7723
## 
## [1] "---------------------------------------------------------------------"
## [1] "ionosphere"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Features     2 0.01475 0.007374   14.81 4.56e-05 ***
## Residuals   27 0.01345 0.000498                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both  New    
## New      0.016 -      
## Original 0.012 2.8e-05
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.1139 0.8928
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96776, p-value = 0.4799
## 
## [1] "---------------------------------------------------------------------"
## [1] "optdigits"
## [1] "---------------------------------------------------------------------"
##             Df    Sum Sq   Mean Sq F value  Pr(>F)   
## Features     2 0.0001173 5.864e-05   5.927 0.00735 **
## Residuals   27 0.0002671 9.890e-06                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both  New  
## New      0.015 -    
## Original 0.689 0.012
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  2  4.6172 0.01885 *
##       27                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96918, p-value = 0.5171
## 
## [1] "---------------------------------------------------------------------"
## [1] "pendigits"
## [1] "---------------------------------------------------------------------"
##             Df    Sum Sq   Mean Sq F value   Pr(>F)    
## Features     2 0.0005513 2.757e-04   109.1 1.16e-13 ***
## Residuals   27 0.0000682 2.530e-06                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both    New    
## New      0.35    -      
## Original 7.6e-13 2.1e-12
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)  
## group  2  2.7838 0.07958 .
##       27                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.95426, p-value = 0.2196
## 
## [1] "---------------------------------------------------------------------"
## [1] "spectrometer"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Features     2 0.01054 0.005271    11.2 0.000287 ***
## Residuals   27 0.01270 0.000470                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both    New    
## New      0.00057 -      
## Original 0.93462 0.00057
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  1.0135 0.3763
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.958, p-value = 0.2752
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-satimage"
## [1] "---------------------------------------------------------------------"
##             Df   Sum Sq   Mean Sq F value   Pr(>F)    
## Features     2 0.004302 0.0021508     100 3.28e-13 ***
## Residuals   27 0.000581 0.0000215                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both    New    
## New      0.053   -      
## Original 9.2e-13 2.1e-11
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.3275 0.7235
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.97185, p-value = 0.591
## 
## [1] "---------------------------------------------------------------------"
## [1] "statlog-vehicle"
## [1] "---------------------------------------------------------------------"
##             Df  Sum Sq  Mean Sq F value   Pr(>F)    
## Features     2 0.06253 0.031263   119.1 4.03e-14 ***
## Residuals   27 0.00709 0.000262                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  test.results[test.results$Dataset == dataset, ]$Accuracy and test.results[test.results$Dataset == dataset, ]$Features 
## 
##          Both    New    
## New      1.5e-13 -      
## Original 0.089   1.8e-12
## 
## P value adjustment method: BH
## Warning in leveneTest.default(y = y, group = group, ...): group coerced to
## factor.
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  1.1009  0.347
##       27               
## 
##  Shapiro-Wilk normality test
## 
## data:  aov_residuals
## W = 0.96752, p-value = 0.4736
##    New   Orig   Both 
## 1.7500 1.8125 2.4375
## 
##  Friedman rank sum test
## 
## data:  friedmanData
## Friedman chi-squared = 4.625, df = 2, p-value = 0.09901
## 
##  Pairwise comparisons using Nemenyi multiple comparison test 
##              with q approximation for unreplicated blocked data 
## 
## data:  friedmanData 
## 
##      New  Orig
## Orig 0.98 -   
## Both 0.13 0.18
## 
## P value adjustment method: none
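
The per-dataset output above follows one pattern: a one-way ANOVA of accuracy by feature set, pairwise t tests with BH adjustment, a normality check on the residuals, and finally a Friedman test across datasets. Below is a minimal base-R sketch of that pattern. The names `test.results`, `aov_residuals`, and `friedmanData` are taken from the output; all values are synthetic stand-ins, not the report's data, and the ranking direction is an assumption. Levene's test (from the car package) and the Nemenyi post-hoc test are left out so the sketch needs base R only.

```r
# Synthetic stand-in for the report's test.results data frame (values are made up).
set.seed(23)
test.results <- data.frame(
  Dataset  = "toy",
  Features = rep(c("Original", "New", "Both"), each = 10),
  Accuracy = c(rnorm(10, 0.85, 0.02), rnorm(10, 0.90, 0.02), rnorm(10, 0.88, 0.02))
)

dataset <- "toy"
d <- test.results[test.results$Dataset == dataset, ]
d$Features <- factor(d$Features)

# One-way ANOVA: does mean accuracy differ between the three feature sets?
fit <- aov(Accuracy ~ Features, data = d)
print(summary(fit))

# Post-hoc pairwise t tests with pooled SD and Benjamini-Hochberg adjustment.
posthoc <- pairwise.t.test(d$Accuracy, d$Features, p.adjust.method = "BH")
print(posthoc)

# ANOVA assumption check: normality of the residuals.
aov_residuals <- residuals(fit)
print(shapiro.test(aov_residuals))

# Across datasets: one row per dataset, one column per feature set.
# Here the matrix is filled with random numbers purely for illustration.
friedmanData <- matrix(runif(16 * 3, 0.7, 1), nrow = 16,
                       dimnames = list(NULL, c("New", "Orig", "Both")))

# Mean rank per feature set (in this sketch rank 1 = lowest value in a row).
mean_ranks <- colMeans(t(apply(friedmanData, 1, rank)))
print(mean_ranks)
print(friedman.test(friedmanData))
```

With 10 accuracy values per feature set, the ANOVA table has 2 and 27 degrees of freedom, matching the tables above.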

5.10 Measuring the influence of datasets

We’ll measure the influence of 4 factors:

- the number of features,
- the dataset difficulty, i.e., how well separated the classes are,
- the number of classes,
- the class distribution.
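
Three of these factors can be read directly off a dataset. The sketch below computes them for the built-in `iris` data. `describe_dataset` is a hypothetical helper, and summarising the class distribution with normalised Shannon entropy is an assumption, not necessarily the measure used in this report. Dataset difficulty is left out because it depends on how class separation is measured.

```r
# Hypothetical helper: summarise a dataset by simple characteristics.
# x: data frame or matrix of features, y: vector of class labels.
describe_dataset <- function(x, y) {
  y <- factor(y)
  p <- as.numeric(table(y)) / length(y)  # class proportions
  p <- p[p > 0]                          # ignore empty factor levels
  k <- length(p)
  c(
    n_features    = ncol(x),
    n_classes     = k,
    # Normalised Shannon entropy of the class distribution:
    # 1 = perfectly balanced classes, values near 0 = heavily skewed.
    class_balance = if (k > 1) -sum(p * log(p)) / log(k) else 0
  )
}

stats <- describe_dataset(iris[, 1:4], iris$Species)
print(stats)
```

For `iris` this gives 4 features, 3 classes, and a class balance of 1, since each class has 50 examples.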

## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

5.11 Single vs double averaging

The purpose of this experiment is to verify whether our averaging strategy has any impact on the measured outcome (a possible averaging bias). One of the reviewers pointed out that kmeans can produce different results on each run over the same data, which we verified to be the case. It therefore makes sense to check whether averaging the results over multiple runs of kmeans on a single split of the dataset gives different results than simply averaging single runs, one on each split.
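
The two strategies can be sketched as follows. This is an illustration under several assumptions, not the pipeline used in this report: it uses `iris`, distance-to-centroid features, a 3-nearest-neighbour classifier from the class package, and a simple random 50/50 split. `run_once`, `n_splits`, and `n_runs` are hypothetical names.

```r
library(class)  # provides knn(); a recommended package shipped with standard R installs

# Accuracy of a k-NN classifier trained on distance-to-centroid features
# obtained from ONE k-means run on the training part of a split.
run_once <- function(x_train, y_train, x_test, y_test, k = 5) {
  centers <- kmeans(x_train, centers = k)$centers
  # Euclidean distance from every row of x to every cluster center.
  dist_to_centers <- function(x) {
    sapply(seq_len(nrow(centers)), function(j)
      sqrt(rowSums(sweep(x, 2, centers[j, ])^2)))
  }
  pred <- knn(dist_to_centers(x_train), dist_to_centers(x_test),
              cl = y_train, k = 3)
  mean(as.character(pred) == as.character(y_test))
}

set.seed(23)
x <- as.matrix(iris[, 1:4])
y <- iris$Species
n_splits <- 10  # number of train/test splits
n_runs   <- 5   # k-means runs per split for double averaging

acc_single <- numeric(n_splits)
acc_double <- numeric(n_splits)
for (s in seq_len(n_splits)) {
  idx <- sample(nrow(x), size = nrow(x) / 2)  # random 50/50 split
  acc <- replicate(n_runs, run_once(x[idx, ], y[idx], x[-idx, ], y[-idx]))
  acc_single[s] <- acc[1]     # single averaging: one k-means run per split
  acc_double[s] <- mean(acc)  # double averaging: mean over runs within the split
}

cat("single averaging:", mean(acc_single), "\n")
cat("double averaging:", mean(acc_double), "\n")
```

Both estimates average over the same splits. The only difference is whether the variability of kmeans within a split is averaged out first.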

5.12 Clustering-generated vs random reference points

In this section we’re about to find out whether clustering makes any sense here at all. After all, when you think about it, it’s just a fancy way of selecting reference points, so we were wondering whether there’s any difference between clustering-generated points and points selected at random. Let’s find out!

I think there is an important takeaway here. There is more to clustering than just generating random points; in some cases, however, not that much more! This hints that other methods for generating reference points may work even better.
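
The comparison in this section can be sketched as follows. The sketch is illustrative only and rests on several assumptions: it uses `iris`, it takes "random points" to mean randomly chosen training examples, it uses Euclidean distances to the reference points as features, and it uses a 3-nearest-neighbour classifier from the class package. `dist_features` and `accuracy_with` are hypothetical helpers, and the numbers it prints say nothing about the results of this report.

```r
library(class)  # provides knn(); a recommended package shipped with standard R installs

# Euclidean distance from every row of x to every reference point (row of refs).
dist_features <- function(x, refs) {
  sapply(seq_len(nrow(refs)), function(j)
    sqrt(rowSums(sweep(x, 2, refs[j, ])^2)))
}

# Test accuracy of k-NN trained on distances to the given reference points.
accuracy_with <- function(refs, x_train, y_train, x_test, y_test) {
  pred <- knn(dist_features(x_train, refs), dist_features(x_test, refs),
              cl = y_train, k = 3)
  mean(as.character(pred) == as.character(y_test))
}

set.seed(23)
x <- as.matrix(iris[, 1:4])
y <- iris$Species
k <- 5          # number of reference points
n_splits <- 10  # number of random 50/50 splits

acc <- t(replicate(n_splits, {
  idx  <- sample(nrow(x), nrow(x) / 2)
  x_tr <- x[idx, ]
  x_te <- x[-idx, ]
  # Reference points from clustering: the k-means centers.
  clustered <- kmeans(x_tr, centers = k)$centers
  # Reference points chosen at random: k training examples.
  random <- x_tr[sample(nrow(x_tr), k), , drop = FALSE]
  c(clustering = accuracy_with(clustered, x_tr, y[idx], x_te, y[-idx]),
    random     = accuracy_with(random,    x_tr, y[idx], x_te, y[-idx]))
}))

print(colMeans(acc))
```

Both variants share the same splits and the same classifier, so any difference in mean accuracy comes from how the reference points were chosen.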