Comparing machine learning classifiers based on their hyperplanes or decision boundaries
In Japanese version of this blog, I've written a series of posts about how each kind of machine learning classifiers draws various classification hyperplanes or decision boundaries.
So in this post I want to show you a summary of the series and how their hyperplanes or decision boundaries vary (translated from Japanese version). It must be interesting and help you understand a nature of each classifier. Here I chose some representative classifiers as follows: decision tree (DT), logistic regression (LR: only for linearly separable cases), support vector machine (SVM), neural networks (NN: back-propagation multi-layer perceptron) and random forest (RF). They are all supervised learning methods and easy to import in R*1.
I'm still new to this field and just a "package-user", not serious expert in machine learning and its scientific basis*2. For such people, explaining meanings of algorithms or theorems is not helpful for understanding how they work -- instead, visualized feature (= hyperplanes or decision boundaries) well help us, I believe.
*1:Functions and packages used here were: rpart(){mvpart} for decision tree, glm(){stats} or vglm(){MASS} for logistic regression or multinomial logit, svm(){e1071} for SVM, nnet(){nnet} for neural networks and randomForest(){randomForest} for random forest
*2:Even I'm Ph. D. in a certain experimental research field