How to tune hyperparameters using Random Search CV in Python

Hyperparameter tuning is one of the most important steps in machine learning, because ML algorithms rarely produce their highest accuracy out of the box. You need to tune their hyperparameters to achieve the best accuracy. Several strategies exist for finding the best parameters, such as grid search and random search.

In this post, I will discuss Random Search CV, which scikit-learn implements as RandomizedSearchCV. The CV stands for cross-validation.

What is the difference between GridSearchCV and RandomizedSearchCV?

The main difference between these two techniques is the obligation to try all parameters. GridSearchCV has to try ALL the parameter combinations, whereas RandomizedSearchCV tries only a few randomly chosen combinations out of all the available ones.

For example, if the parameter options yield 20 combinations in total, GridSearchCV will try all 20. With RandomizedSearchCV, you specify how many of them to try through a parameter called “n_iter”. Setting n_iter=5 means that 5 randomly chosen combinations will be tried.
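The difference in the number of candidates can be checked directly. The sketch below is illustrative only: the parameter names and values in param_options are assumptions chosen to give 5 x 2 x 2 = 20 combinations, and ParameterGrid and ParameterSampler are the scikit-learn helpers that enumerate or sample candidates.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Hypothetical search space (not from the original post): 5 x 2 x 2 = 20 combinations
param_options = {
    "n_estimators": [100, 200, 300, 400, 500],
    "criterion": ["gini", "entropy"],
    "max_depth": [2, 3],
}

# A grid search would evaluate every one of these combinations
all_combinations = list(ParameterGrid(param_options))
print("Grid search candidates:", len(all_combinations))

# A random search evaluates only n_iter randomly drawn combinations
random_subset = list(ParameterSampler(param_options, n_iter=5, random_state=42))
print("Random search candidates:", len(random_subset))
for candidate in random_subset:
    print(candidate)
```

Because every option here is a plain list, the sampler draws without replacement, so the 5 candidates are distinct members of the full grid.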

Exhaustive Combinations of hyperparameters

Here, RandomizedSearchCV is configured with n_iter=5, so it will try 5 randomly chosen combinations of hyperparameters.
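A minimal sketch of such a search is shown here. The dataset (iris), the estimator (RandomForestClassifier), the contents of param_options, and the random_state values are all assumptions made for illustration; only n_iter=5, cv=5, n_jobs=1 and verbose=5 come from the text.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Illustrative data; any feature matrix X and target y would do
X, y = load_iris(return_X_y=True)

# Hypothetical search space with 5 x 2 x 2 = 20 combinations
param_options = {
    "n_estimators": [10, 20, 30, 40, 50],
    "criterion": ["gini", "entropy"],
    "max_depth": [2, 3],
}

random_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions=param_options,
    n_iter=5,         # try 5 random combinations
    cv=5,             # 5-fold cross-validation for each combination
    n_jobs=1,         # number of parallel jobs
    verbose=5,        # amount of progress detail printed while fitting
    random_state=42,  # assumed here so the sampled combinations are reproducible
)

# Fits and cross-validates each sampled combination
random_search.fit(X, y)
```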

We have specified cv=5. This means each combination is tested (cross-validated) 5 times: the data is divided into 5 parts, and each part in turn serves as the testing data while the other four serve as training data. The final score for each combination of hyperparameters (accuracy, by default, for a classifier) is the average over these five iterations.

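Hence the total number of times the model is fitted during the search is n_iter × cv = 5 × 5 = 25 times! By default, scikit-learn then refits the best combination once more on the full data.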
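This arithmetic, and the averaging across folds, can be verified from the cv_results_ attribute of a fitted search. The following is a sketch under assumed choices: the iris dataset, a DecisionTreeClassifier, and the search space are illustrative and not taken from the original post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hypothetical search space with 6 x 2 = 12 combinations
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions={
        "max_depth": [1, 2, 3, 4, 5, 6],
        "criterion": ["gini", "entropy"],
    },
    n_iter=5,
    cv=5,
    random_state=0,
)
search.fit(X, y)

results = search.cv_results_
n_candidates = len(results["params"])  # equals n_iter
n_folds = search.n_splits_             # equals cv

# Number of cross-validation fits: n_iter x cv
print("Cross-validation fits:", n_candidates * n_folds)

# One row per fold, one column per candidate combination
split_scores = np.array(
    [results[f"split{i}_test_score"] for i in range(n_folds)]
)

# The reported mean score is the average of the per-fold scores
print(np.allclose(split_scores.mean(axis=0), results["mean_test_score"]))
```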

n_jobs=1 specifies the number of jobs to run in parallel, and verbose=5 controls how much detail is printed while fitting the model: the higher the value, the more detail is printed.

Sample Output

Hyperparameter tuning using RandomizedSearchCV

How to access the best hyperparameters?

The best combination of hyperparameters is stored in the “best_params_” attribute of the fitted search object.
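A short sketch of reading the result follows. The dataset, estimator, and search space are assumptions for illustration; best_params_, best_score_ and best_estimator_ are attributes that scikit-learn sets on the search object after fitting.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

# Hypothetical search space, chosen only for this example
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        "n_estimators": [10, 20, 30, 40, 50],
        "criterion": ["gini", "entropy"],
        "max_depth": [2, 3],
    },
    n_iter=5,
    cv=5,
    random_state=42,
)
search.fit(X, y)

# Dictionary of the best-scoring hyperparameter combination
print(search.best_params_)

# Mean cross-validated score achieved by that combination
print(search.best_score_)

# Model refitted on the full data with the best hyperparameters (refit=True by default)
best_model = search.best_estimator_
print(best_model.predict(X[:3]))
```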

Sample Output

Accessing best hyperparameters for RandomizedSearchCV

Author Details
Lead Data Scientist
Farukh is an innovator in solving industry problems using Artificial intelligence. His expertise is backed with 10 years of industry experience. Being a senior data scientist he is responsible for designing the AI/ML solution to provide maximum gains for the clients. As a thought leader, his focus is on solving the key business problems of the CPG Industry. He has worked across different domains like Telecom, Insurance, and Logistics. He has worked with global tech leaders including Infosys, IBM, and Persistent systems. His passion to teach inspired him to create this website!
