Watch or follow along with our demo, which analyzes the success of bank telemarketing using a dataset provided by Satoshida Tamoto through Kaggle to predict whether a bank client will subscribe to a term deposit. We will use a subset of the original dataset that has 45,210 rows.
Follow along with our video demo or jump over to the terminal to run Brainome with us. If you have not used Brainome before, follow our quickstart tutorial to get started.
First, we run Brainome with -h to see all the options available to us. We will use -f to force the model type, -e to increase effort, and -split to set how our dataset is divided between training and validation.
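To make the options concrete, here is a minimal Python sketch that only assembles a command line from the three flags named above; it does not run Brainome. The file name bank.csv, the helper brainome_command, and the value formats (RF for Random Forest, a plain number for the split) are assumptions for illustration, so check the -h output for the exact spelling.

```python
import shlex


def brainome_command(data_file, model=None, effort=None, split=None):
    """Build a Brainome command line as a list of arguments (nothing is executed).

    Hypothetical helper: only the flags -f, -e and -split come from the text;
    the value formats are assumptions.
    """
    args = ["brainome", data_file]
    if model is not None:
        args += ["-f", model]        # force the model type
    if effort is not None:
        args += ["-e", str(effort)]  # increase effort
    if split is not None:
        args += ["-split", str(split)]  # training share of the dataset
    return args


# bank.csv is a placeholder file name, not the demo's actual file.
print(shlex.join(brainome_command("bank.csv", model="RF", effort=10, split=80)))
```

Building the command as a list keeps each flag next to its value, which makes it easy to change one setting between runs.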
In our first run, we simply run Brainome in AutoML mode to analyze the data. We ignore the column “Index” from the start by using the -ignorecolumns option. Running in AutoML mode lets us compare four types of models and then use this information to force the model to the one we prefer. In this first run, Random Forest is the recommended model, and the dataset was split 50% for training and 50% for validation. A summary of the results appears on the right-hand side of the video.
For the second run, we increase the effort to 10, force the model to Random Forest, and set the split to 80%. Let’s take a look at the results.
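To get a feel for what the split setting means in rows, here is a small arithmetic sketch for our 45,210-row subset. It assumes the percentage is the training share and that the row count is rounded down; how Brainome actually rounds and which rows it selects are not shown in this demo.

```python
def split_sizes(n_rows, train_percent):
    """Return (training rows, validation rows) for a given training percentage.

    Assumption: the training count is rounded down to a whole row.
    """
    n_train = n_rows * train_percent // 100
    return n_train, n_rows - n_train


# Compare the 50% split of the first run with the 80% split of the second run.
for pct in (50, 80):
    n_train, n_val = split_sizes(45210, pct)
    print(f"{pct}% split: {n_train} training rows, {n_val} validation rows")
```

Under these assumptions, moving from 50% to 80% gives the model about 13,500 more rows to train on, while validation accuracy is measured on a smaller set.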
The run time is a little longer here, but we see a slight improvement in our results from increasing the effort and the split: about a 1% increase in training accuracy and a little over half a percent in validation accuracy. Let’s try another run with even higher effort to compare the results.
We increase the effort to 50 and run again.
This run took much longer, just over 8 minutes. The results are mixed: a very slight increase in training accuracy and a decrease in validation accuracy.
The real question here is whether more effort is actually better.
We can see that the middle model, from the second run, is the most useful. People tend to focus only on accuracy, but as we see here, the model capacity has also increased. The model that was picked has a slightly lower validation accuracy because its MEC (memory equivalent capacity) is reduced.
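One way to express the idea of not looking at accuracy alone is the sketch below. It is not Brainome's selection logic: the rule (among runs within a small tolerance of the best validation accuracy, prefer the smallest capacity), the RunSummary type, and the tolerance are all assumptions, and the numbers are made-up placeholders, not the results from the video.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    label: str
    validation_accuracy: float  # percent
    capacity_bits: float        # model capacity (MEC), in bits


def prefer_smaller_model(runs, tolerance=0.5):
    """Among runs close to the best validation accuracy, pick the lowest capacity.

    Assumed rule for illustration only; the tolerance is in percentage points.
    """
    best = max(r.validation_accuracy for r in runs)
    close = [r for r in runs if best - r.validation_accuracy <= tolerance]
    return min(close, key=lambda r: r.capacity_bits)


# Placeholder values, NOT taken from the demo runs.
runs = [
    RunSummary("A", validation_accuracy=90.0, capacity_bits=400.0),
    RunSummary("B", validation_accuracy=90.4, capacity_bits=900.0),
    RunSummary("C", validation_accuracy=90.1, capacity_bits=2500.0),
]
print(prefer_smaller_model(runs).label)
```

The point of the design is that a small gain in accuracy bought with a much larger model is often not worth it, since the larger model has more room to memorize the training data.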
Get started with pip install brainome, and try a dataset of your own.