R
This post is reproduced from a post on my Japanese blog. A friend of mine, an academic researcher in the machine learning field, tweeted as below. I was studying how to deal with imbalanced data, and [Wallace et al. ICDM'11] https://t.co/ltQ942lKP…
Two years ago, I published a book -- written in Japanese, so I'm afraid most readers here can't read it :'( The book was written as a summary of 10 major data science methods. But now that two years have passed, the content of the book …
I've known about MXnet for weeks as one of the most popular libraries among Kagglers, but only recently did I hear that the bug fixing is almost done, and some friends say the latest version looks stable, so at last I installed it. MXn…
In the previous post, we successfully estimated a model with a nonlinear trend by using Stan. But please remember that this is a time series dataset. Does it include any other kinds of nonlinear components? Yes, we have to be careful about seasona…
The previous post reviewed how to estimate a simple hierarchical Bayesian model. You can see more complicated cases in the great textbook "The BUGS Book". But personally I find hierarchical Bayesian modeling most useful for time-series anal…
In the two previous posts, you learned what Bayesian modeling and Stan are and how to install Stan. Now you are ready to try it on some very Bayesian problems that many people love, such as hierarchical Bayesian models. Definition of hierarchica…
The previous post gave an overview of what Stan is and how it works with R. Bayesian modeling with R and Stan (1): Overview - Data Scientist in Ginza, Tokyo Are you ready now? OK, this post reviews how to install Stan. Let's start here! :) In principle this p…
Although I've written a series of posts titled "Machine Learning for package uses in R", I usually don't run machine learning in my daily analytic work because my current role covers so-called ad-hoc analysis. Instead of machine learning,…
In many digital marketing cases, especially online ones, marketers and analysts love to apply A/B tests to find the metric with the most influence on KGIs/KPIs from a huge set of explanatory metrics, such as creative component…
As far as I know, Xgboost is the most successful classifier in several machine learning competitions, e.g. Kaggle or the KDD Cup. Indeed, the team that won the Higgs Boson competition used Xgboost, and below is their code rel…
Random Forest is still one of the strongest supervised learning methods, although these days many people love to use Deep Learning or Convolutional NNs. Of course, because of its simple architecture and its many implementations in various enviro…
These days almost everybody appears to love one variation of the Neural Network (NN) -- Deep Learning. I already discussed how Deep Learning works and what kinds of parameters characterize it in the previous post. What kind of decision bounda…
Actually, the support vector machine (SVM) is the one I love most among the various machine learning classifiers... because of its strong generalization and beautiful decision boundaries (in high-dimensional space). Although there are other …
I think a lot of people love logistic regression because it's pretty light and fast. But we know it's just a linear classifier -- I mean, it only suits linearly separable patterns, not linearly non-separable ones. Its primary ide…
Notice: The {mvpart} package has been removed from CRAN because its support expired. To install it, 1) download the latest (but expired) package archive from the old archive site and 2) install it following the procedur…
Below is the most popular post on this blog, which recorded an enormous number of page views and received a lot of comments both here and outside this blog. Comparing machine learning classifiers based on their hyperplanes or decision boundaries - Da…
In the previous post we saw how Deep Learning with {h2o} works and how Deep Belief Nets implemented by h2o.deeplearning draw decision boundaries for XOR patterns. What kind of decision boundaries does Deep Learning (Deep Belief Net) draw? P…
For a while (at least the several months since many people began to implement it with Python and/or Theano, PyLearn2 or something like that), I had nearly given up practicing Deep Learning with R and felt I was left alone much further away…
On Apr 17, I joined Global TokyoR #1 and gave the talk below. Visualization of Supervised Learning with {arules} + {arulesViz} from Takashi J Ozaki (Note: please install the {igraph} package before installing {arulesViz}) By the way, th…
(The original Japanese posts are here and here.) In Japan, in my own experience, there may be a dichotomy between "analytics" and "data science". It has been said that real business matters require rapid analyses and rapid act…
In the Japanese version of this blog, I've written a series of posts about how each kind of machine learning classifier draws its classification hyperplanes or decision boundaries. So in this post I want to show you a summary of the serie…