Hyperparameter tuning is one of the most important steps in machine learning: most ML algorithms will not produce their highest accuracy out of the box, so you need to tune their hyperparameters to get the best results. You can follow any one of the strategies below to find the best parameters.
- Manual Search
- Grid Search CV
- Random Search CV
- Bayesian Optimization
In this post, I will discuss Bayesian Optimization.
GridSearchCV tries out ALL the parameter combinations, while RandomizedSearchCV tries only a few 'random' combinations. Bayesian Optimization makes an intelligent guess about the next combination to try by looking at the results of the previous combinations: whichever set of hyperparameters produced better results, it moves towards those values, thereby optimizing the selection of hyperparameters.
Hence, Bayesian Optimization also tries only a few combinations out of all the possible ones, but it chooses the next set of parameters by extrapolating from the results of the previous choices.
Let's say the Bayesian optimizer has tried the parameter n_estimators = 100, 150, and 200 so far. It observes that n_estimators=150 produced the highest accuracy, so the next values it chooses will be around 150, and so on.
As a result, some of the hyperparameter values you provide may never be tried at all, because the optimizer chose to move around a specific value.
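For contrast, here is a minimal sketch of the two automated alternatives in scikit-learn; the parameter grid below is purely illustrative:

```python
# A minimal sketch of grid search vs. random search in scikit-learn.
# The parameter grid is illustrative, not tuned for any dataset.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

param_grid = {'n_estimators': [50, 100, 150, 200],
              'max_depth': [2, 5, 10]}

# Grid search: fits a model for every one of the 4 x 3 = 12 combinations
grid = GridSearchCV(RandomForestClassifier(), param_grid,
                    cv=5, scoring='accuracy')

# Random search: samples only n_iter=5 of those combinations at random
rand = RandomizedSearchCV(RandomForestClassifier(), param_grid,
                          n_iter=5, cv=5, scoring='accuracy')
# Both are then fitted with .fit(X, y) and expose best_params_ afterwards
```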
For Bayesian Optimization in Python, you need to install a library called hyperopt.
```bash
# installing the library for Bayesian optimization
pip install hyperopt
```
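Before the full example, here is the smallest possible hyperopt workflow, minimizing a toy quadratic, just to show how the search space, the objective function, and fmin() fit together (the function and search range are purely illustrative):

```python
from hyperopt import hp, fmin, tpe

# Toy objective: fmin() treats the returned number as the loss to minimize
def toy_objective(x):
    return (x - 3) ** 2

# Search x in [0, 6]; tpe.suggest picks each new x based on earlier results
best = fmin(fn=toy_objective,
            space=hp.uniform('x', 0, 6),
            algo=tpe.suggest,
            max_evals=50)
print(best)  # a dict such as {'x': 2.98...}, close to the true minimum at x=3
```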
In the code snippet below, Bayesian optimization is performed on three hyperparameters of a Random Forest classifier: n_estimators, max_depth, and criterion.
```python
###################################################################
#### Create Loan Data for Classification in Python ####
import pandas as pd
import numpy as np

ColumnNames=['CIBIL','AGE', 'SALARY', 'APPROVE_LOAN']
DataValues=[[480, 28, 610000, 'Yes'],
            [480, 42, 140000, 'No'],
            [480, 29, 420000, 'No'],
            [490, 30, 420000, 'No'],
            [500, 27, 420000, 'No'],
            [510, 34, 190000, 'No'],
            [550, 24, 330000, 'Yes'],
            [560, 34, 160000, 'Yes'],
            [560, 25, 300000, 'Yes'],
            [570, 34, 450000, 'Yes'],
            [590, 30, 140000, 'Yes'],
            [600, 33, 600000, 'Yes'],
            [600, 22, 400000, 'Yes'],
            [600, 25, 490000, 'Yes'],
            [610, 32, 120000, 'Yes'],
            [630, 29, 360000, 'Yes'],
            [630, 30, 480000, 'Yes'],
            [660, 29, 460000, 'Yes'],
            [700, 32, 470000, 'Yes'],
            [740, 28, 400000, 'Yes']]

# Create the Data Frame
LoanData=pd.DataFrame(data=DataValues, columns=ColumnNames)
LoanData.head()

# Separate Target Variable and Predictor Variables
TargetVariable='APPROVE_LOAN'
Predictors=['CIBIL','AGE', 'SALARY']
X=LoanData[Predictors].values
y=LoanData[TargetVariable].values

##############################################################
# Bayesian hyperparameter optimization
from hyperopt import hp, fmin, tpe, STATUS_OK, Trials, anneal
from sklearn.model_selection import cross_val_score

# Random Forest (Bagging of multiple Decision Trees)
from sklearn.ensemble import RandomForestClassifier

# Defining the hyperparameter space as a dictionary
parameter_space = {
    'n_estimators': hp.quniform('n_estimators', 5, 50, 5),
    'max_depth': hp.quniform('max_depth', 2, 10, 1),
    'criterion': hp.choice('criterion', ['gini', 'entropy'])
}

# Defining a cost function which the Bayesian algorithm will optimize
def objective(params):
    # Build the Random Forest with the hyperparameter values suggested
    # by hyperopt (hp.quniform returns floats, so cast them to int)
    RF = RandomForestClassifier(n_estimators=int(params['n_estimators']),
                                max_depth=int(params['max_depth']),
                                criterion=params['criterion'])
    # Average accuracy obtained by cross validation of the data
    # See other scoring methods with sklearn.metrics.SCORERS.keys()
    Accuracy = cross_val_score(RF, X, y, cv=5, scoring='accuracy').mean()
    # fmin() minimizes the loss, so return the negative accuracy
    return {'loss': -Accuracy, 'status': STATUS_OK}

import warnings
warnings.filterwarnings('ignore')

# Finding out which set of hyperparameters gives the highest accuracy
trials = Trials()
best_params = fmin(fn=objective,
                   space=parameter_space,
                   #algo=tpe.suggest,
                   algo=anneal.suggest,  # the logic which chooses the next parameters to try
                   max_evals=100,
                   trials=trials)
```
Sample Output
[Output screenshot: fmin() progress over the 100 evaluations]
How to access the best hyperparameters?
The best hyperparameters are returned by the fmin() function; we stored them in the best_params variable. The Trials object records every combination tried, which we can use to inspect and visualize the search.
```python
print('The best parameters are:', best_params)

# Dataframe of results from the optimization
search_results = pd.DataFrame({'loss': trials.losses(),
                               'n_estimators': trials.vals['n_estimators'],
                               'max_depth': trials.vals['max_depth']})

# Visualizing all the parameter trials
%matplotlib inline
import matplotlib.pyplot as plt
fig, subPlots = plt.subplots(nrows=1, ncols=2, figsize=(15,3))
search_results.sort_values(by='n_estimators').plot(x='n_estimators', y='loss', ax=subPlots[0])
search_results.sort_values(by='max_depth').plot(x='max_depth', y='loss', ax=subPlots[1])
```
Sample Output
[Output screenshot: best parameters printed, with loss plotted against n_estimators and max_depth]
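One caveat when reading best_params: for hp.choice parameters, fmin() returns the index of the chosen option (e.g. 0 for 'gini' rather than the string itself), and hp.quniform values come back as floats. hyperopt's space_eval() maps this raw result back to actual parameter values, which can then be used to train the final model. A minimal sketch, reusing parameter_space and best_params from above:

```python
from hyperopt import space_eval

# Convert fmin()'s raw output (an index for hp.choice, floats for hp.quniform)
# back into the actual parameter values
best = space_eval(parameter_space, best_params)
print(best)  # e.g. {'criterion': 'gini', 'max_depth': 4.0, 'n_estimators': 25.0}

# Train the final model on the full data with the tuned hyperparameters
FinalRF = RandomForestClassifier(n_estimators=int(best['n_estimators']),
                                 max_depth=int(best['max_depth']),
                                 criterion=best['criterion'])
FinalRF.fit(X, y)
```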