KazAnova on Stacking: leveraging multiple machine learning algorithms for better predictive models

Machine learning can be a powerful tool in the creation of predictive models. But it doesn’t provide a magic bullet. In the end, effective machine learning works very much like other high-value human endeavors. It requires experimentation, evaluation, lots of work, and a measure of hard-earned wisdom.

As Kaggle Competitions Grandmaster Marios Michailidis (AKA KazAnova) explains:

No model is perfect. Almost every time, models make mistakes. Plus, each model has different advantages and disadvantages, and they tend to see the data from different angles. Leveraging the uniqueness of each model is of the essence for building very predictive models.

To help with this process, David H. Wolpert introduced the concept of stacked generalization in a 1992 paper.

Michailidis explains the process as follows:

Stacking or Stacked Generalization … normally involves a four-stage process. Consider 3 datasets A, B, C. For A and B we know the ground truth (or in other words the target variable y). We can use stacking as follows:

  1. We train various machine learning algorithms (regressors or classifiers) on dataset A.
  2. We make predictions with each of the algorithms for datasets B and C, and we create new datasets B1 and C1 that contain only these predictions. So if we ran 10 models, then B1 and C1 have 10 columns each.
  3. We train a new machine learning algorithm (often referred to as Meta learner or Super learner) using B1.
  4. We make predictions using the Meta learner on C1.
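To make these four steps concrete, here is a minimal sketch in Python using scikit-learn. The dataset names mirror the quote above; the toy data, the particular base models, and the logistic-regression meta learner are illustrative assumptions, not anything prescribed in the interview.

```python
# A minimal sketch of the four stacking steps described above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Build a toy problem and split it into A (train), B (meta-train), C (test).
X, y = make_classification(n_samples=3000, n_features=20, random_state=42)
X_A, X_rest, y_A, y_rest = train_test_split(X, y, test_size=0.5, random_state=42)
X_B, X_C, y_B, y_C = train_test_split(X_rest, y_rest, test_size=0.5, random_state=42)

# Step 1: train several base models on dataset A.
base_models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=200, random_state=42),
    GradientBoostingClassifier(random_state=42),
]
for model in base_models:
    model.fit(X_A, y_A)

# Step 2: predict on B and C; the predictions become new datasets B1 and C1,
# with one column per base model.
B1 = np.column_stack([m.predict_proba(X_B)[:, 1] for m in base_models])
C1 = np.column_stack([m.predict_proba(X_C)[:, 1] for m in base_models])

# Step 3: train a meta learner on B1 (the targets are the known labels of B).
meta_learner = LogisticRegression(max_iter=1000)
meta_learner.fit(B1, y_B)

# Step 4: the meta learner makes the final predictions on C1.
final_predictions = meta_learner.predict(C1)
print("Meta-learner accuracy on C:", meta_learner.score(C1, y_C))
```

Because the meta learner is trained on B1 rather than on A, it learns how to weigh and combine the base models' predictions instead of simply memorizing their training behavior.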

As part of his PhD work, Michailidis developed a framework named StackNet to speed up this process.

Marios Michailidis describes StackNet in this way:

StackNet is a computational, scalable and analytical framework, implemented in Java, that resembles a feedforward neural network and uses Wolpert’s stacked generalization in multiple levels to improve accuracy in classification problems. In contrast to feedforward neural networks, rather than being trained through back propagation, the network is built iteratively, one layer at a time (using stacked generalization), with each layer using the final target as its target.
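StackNet itself is a Java framework, but the layer-by-layer idea can be roughly approximated in Python by nesting scikit-learn's StackingClassifier, so that one stacked level feeds the next while every level is still fit against the original target. The models and data below are placeholders for illustration, not part of StackNet.

```python
# A rough approximation of multi-level stacked generalization with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Level 1: base models whose out-of-fold predictions feed the next level.
level_one = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("knn", KNeighborsClassifier()),
]

# Level 2: another stacking layer sitting on top of level one; its own meta
# learner produces the final prediction. Both levels are fit against the
# original target y, echoing how StackNet builds its layers.
level_two = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

model = StackingClassifier(estimators=level_one, final_estimator=level_two, cv=5)
model.fit(X_train, y_train)
print("Accuracy of the two-level stack:", model.score(X_test, y_test))
```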

StackNet is available on GitHub under the MIT license.

Be sure to read the full interview with Michailidis about stacking and StackNet on the Kaggle blog.
