# How to measure the correlation between two categorical variables in Python

This situation arises often in classification problems in machine learning. The target variable is categorical, while the predictors can be either continuous or categorical. When the target and a predictor are both categorical, a Chi-square test of independence can be used to check whether there is a relationship between them.

The Chi-square test produces a p-value for a null hypothesis (H0).

• Assumption (H0): The two columns are NOT related to each other
• Result of Chi-Sq Test: The p-value, i.e. the probability of seeing data at least this far from independence if H0 were true
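
To make the mechanics concrete, here is a minimal sketch of the calculation done by hand in plain Python. The counts are hypothetical and chosen only for illustration, and the closed-form p-value on the last lines is valid only for a 2x2 table (1 degree of freedom). No continuity correction is applied here.

```python
import math

# Hypothetical 2x2 table of observed counts (illustrative numbers only).
# Rows: GENDER (Female, Male); columns: LOAN_APPROVAL (No, Yes).
observed = [[20, 30],
            [25, 25]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under H0 (independence): row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (number of rows - 1) * (number of columns - 1).
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

# For 1 degree of freedom only, the upper-tail probability has a closed form.
p_value = math.erfc(math.sqrt(chi2_stat / 2))

print(f"chi2 = {chi2_stat:.4f}, dof = {dof}, p-value = {p_value:.4f}")
```

The larger the gap between observed and expected counts, the larger the statistic and the smaller the p-value.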

The test helps decide whether the two categorical variables are correlated with each other or not.

In the scenario below, we try to measure the correlation between GENDER and LOAN_APPROVAL.
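
A minimal sketch of such a test, assuming pandas and SciPy are available. The DataFrame below is hypothetical and built only for illustration; it is not the original dataset, so its numbers will not match the original output. `pd.crosstab` builds the contingency table and `scipy.stats.chi2_contingency` runs the test.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical data for illustration only (100 applicants).
df = pd.DataFrame({
    "GENDER": ["Female"] * 50 + ["Male"] * 50,
    "LOAN_APPROVAL": ["No"] * 20 + ["Yes"] * 30 + ["No"] * 25 + ["Yes"] * 25,
})

# Contingency table: counts of each GENDER / LOAN_APPROVAL combination.
crosstab = pd.crosstab(index=df["GENDER"], columns=df["LOAN_APPROVAL"])
print(crosstab)

# Returns the statistic, the p-value, the degrees of freedom,
# and the expected counts under H0.
chi2_stat, p_value, dof, expected = chi2_contingency(crosstab)

print(f"Chi-square statistic: {chi2_stat:.4f}")
print(f"Degrees of freedom:   {dof}")
print(f"P-value:              {p_value:.4f}")

if p_value > 0.05:
    print("Fail to reject H0: no evidence that the variables are related.")
else:
    print("Reject H0: the variables appear to be related.")
```

Note that for a 2x2 table `chi2_contingency` applies Yates' continuity correction by default; passing `correction=False` turns it off.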

Sample Output:

H0: The variables are not correlated with each other. This is the H0 used in the Chi-square test.

In the above example, the P-value came out higher than 0.05. Hence we fail to reject H0, which means the data give no evidence that the variables are correlated with each other.

Conversely, if two variables are correlated, the P-value will typically come out very close to zero.
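
As a rough illustration of the opposite case, the sketch below uses hypothetical data in which the approval outcome depends heavily on a made-up `GROUP` column. The column name and counts are assumptions chosen only to show a strongly related pair.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical data: group A is mostly approved, group B is mostly rejected.
strong = pd.DataFrame({
    "GROUP": ["A"] * 50 + ["B"] * 50,
    "LOAN_APPROVAL": ["Yes"] * 45 + ["No"] * 5 + ["Yes"] * 5 + ["No"] * 45,
})

table = pd.crosstab(index=strong["GROUP"], columns=strong["LOAN_APPROVAL"])

# Only the p-value is needed here; the other return values are ignored.
_, p_strong, _, _ = chi2_contingency(table)

print(f"P-value: {p_strong:.3g}")
```

With counts this lopsided, the p-value falls far below 0.05, so H0 would be rejected.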