How to tune hyperparameters automatically using Bayesian optimization

Hyperparameter tuning is one of the most important steps in machine learning, because ML algorithms rarely produce their highest accuracy out of the box. You need to tune their hyperparameters to achieve the best accuracy. Common strategies for finding the best parameters are grid search, random search, and Bayesian optimization.

In this post, I will discuss Bayesian Optimization.

GridSearchCV tries out ALL the parameter combinations, while RandomizedSearchCV tries only a few ‘random’ combinations. Bayesian Optimization makes an informed guess about the next combination to try by looking at the results of previous combinations. It moves towards whichever hyperparameter values have produced better results so far, and in this way optimizes the selection of hyperparameters.
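To make the difference in cost concrete, here is a small sketch that only counts combinations. It does not train any model, and the search space is a hypothetical one chosen for illustration.

```python
import itertools
import random

# Hypothetical search space, for illustration only.
param_grid = {
    "n_estimators": [100, 150, 200, 250, 300],
    "max_depth": [2, 4, 6, 8, 10],
    "criterion": ["gini", "entropy"],
}

# Grid search: every combination of every value (5 * 5 * 2).
names = list(param_grid)
all_combinations = [
    dict(zip(names, values))
    for values in itertools.product(*param_grid.values())
]

# Random search: a fixed budget of combinations drawn at random.
rng = random.Random(42)
random_combinations = rng.sample(all_combinations, k=10)

print(len(all_combinations))     # 50
print(len(random_combinations))  # 10
```

Each combination costs one full model fit (times the number of cross-validation folds), so the size of these lists is what drives the running time.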

Like random search, Bayesian Optimization tries only a few of all the possible combinations, but it chooses each next set of parameters by extrapolating from the results of previous choices.

Let’s say the Bayesian optimizer has tried n_estimators = 100, 150, and 200 so far. It observes that the accuracy for n_estimators=150 is the highest, so the next values it chooses will tend to be around 150, and so on.

As a result, some of the hyperparameter values you have provided may never be tried at all, because the optimizer chose to concentrate around a specific value.
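The behaviour described above can be imitated with a deliberately simplified toy loop. This is not a real Bayesian optimizer (there is no probabilistic model of the results), and `toy_score` is a made-up stand-in for accuracy that peaks at 150. It only shows how proposing new values near the best one seen so far leaves other regions of the range unexplored.

```python
import random


def toy_score(n_estimators):
    # Hypothetical stand-in for cross-validated accuracy; peaks at 150.
    return 0.90 - ((n_estimators - 150) / 1000) ** 2


def search_near_best(initial_values, n_rounds=10, step=25, seed=0):
    rng = random.Random(seed)
    # Evaluate the starting values first.
    history = {n: toy_score(n) for n in initial_values}
    for _ in range(n_rounds):
        # Find the best value tried so far ...
        best_n = max(history, key=history.get)
        # ... and propose the next candidate close to it.
        candidate = max(1, best_n + rng.randint(-step, step))
        history[candidate] = toy_score(candidate)
    return max(history, key=history.get), history


best_n, history = search_near_best([100, 150, 200])
print(best_n)           # 150
print(sorted(history))  # values tried; new ones stay within 25 of 150
```

Apart from the starting points 100 and 200, nothing below 125 or above 175 is ever evaluated in this toy run.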

For Bayesian Optimization in Python, you can use a library called hyperopt, which can be installed with ‘pip install hyperopt’.

In the code snippet below, Bayesian optimization is performed on three hyperparameters: n_estimators, max_depth, and criterion.

Sample Output

[Image: Bayesian hyperparameter optimization]

How to access the best hyperparameters?

The best hyperparameters are returned by the function ‘fmin()’. We have stored the results in the ‘best_params’ variable.

Sample Output

[Image: Best hyperparameters in Bayesian optimization]

Author Details
Lead Data Scientist
Farukh is an innovator in solving industry problems using artificial intelligence. His expertise is backed by 10 years of industry experience. As a senior data scientist, he is responsible for designing AI/ML solutions that provide maximum gains for clients. As a thought leader, his focus is on solving the key business problems of the CPG industry. He has worked across different domains like Telecom, Insurance, and Logistics. He has worked with global tech leaders including Infosys, IBM, and Persistent Systems. His passion to teach inspired him to create this website!
