How to visualize the relationship between two continuous variables in Python

A scatter plot is the chart to use when you want to visualize the relationship between two continuous variables. It is typically used in supervised machine learning (regression), where the target variable is continuous. If you want to check which continuous predictors have a clear relationship with the target variable, look at the scatter plot of each predictor against the target.

Consider the following scenario. The target variable is “Weight”, and we are trying to predict it from the number of hours a person works out at the gym and the number of calories they consume in a day.

If you plot the scatter chart of weight against calories, you can see an increasing trend. From this graph we can deduce that as calorie intake increases, weight tends to increase as well. This is known as a positive correlation. Because there is a “clear trend”, there is a relationship between weight and calories. In other words, the predictor variable calories can be used to predict weight.

Similarly, there is a clear decreasing trend between weight and hours: as the number of hours at the gym increases, weight tends to decrease. This is known as a negative correlation. Again, there is a “clear trend”, hence there is a relationship between weight and hours. In other words, hours can be used to predict weight.
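
The two plots described above can be reproduced with a short sketch. The data here is synthetic and invented purely for illustration: the coefficients, value ranges, variable names, and the output file name weight_scatter.png are all assumptions, not values from the original example. Matplotlib with NumPy is used as one common choice, not the only way to draw a scatter plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data (assumed values, for illustration only)
rng = np.random.default_rng(seed=42)
n = 100
calories = rng.uniform(1500, 3500, n)  # calories consumed per day
hours = rng.uniform(0, 10, n)          # gym hours

# Hypothetical relationship: weight rises with calories,
# falls with gym hours, plus some random noise
weight = 40 + 0.015 * calories - 3.0 * hours + rng.normal(0, 3, n)

fig, (ax_cal, ax_hrs) = plt.subplots(1, 2, figsize=(10, 4))

# Predictor on the x-axis, target on the y-axis
ax_cal.scatter(calories, weight)
ax_cal.set_xlabel("Calories")
ax_cal.set_ylabel("Weight")
ax_cal.set_title("Weight vs. Calories (increasing trend)")

ax_hrs.scatter(hours, weight)
ax_hrs.set_xlabel("Hours")
ax_hrs.set_ylabel("Weight")
ax_hrs.set_title("Weight vs. Hours (decreasing trend)")

fig.tight_layout()
fig.savefig("weight_scatter.png")
```

With your own data, replace the three synthetic arrays with the corresponding columns and keep the plotting calls unchanged.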

Sample Output:

Scatter plot showing a positive correlation

Scatter plot showing a negative correlation


What if there is no clear trend in the scatter plot?

If you cannot see any kind of trend (increasing or decreasing) in the scatter plot, the two variables show little or no correlation with each other. In that case, a useful model is unlikely to be built from those two variables alone.

For example, look at the scatter plot below of diamond prices against their depth. There is no clear increasing or decreasing trend, so no useful model can be built between depth and price. In other words, depth cannot be used to predict price.

Scatter plot showing no correlation
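
The visual impression of "no trend" can be backed up with a number. One option is the Pearson correlation coefficient, which is close to +1 or -1 for a strong linear trend and close to 0 when there is none. The sketch below uses synthetic values that merely stand in for depth and price (they are not the real diamonds data, and the distribution parameters are assumptions); the two arrays are generated independently, so the coefficient should come out near zero.

```python
import numpy as np

# Synthetic stand-ins for the two columns (assumed values, not real diamond data)
rng = np.random.default_rng(seed=0)
n = 500
depth = rng.normal(loc=61.7, scale=1.4, size=n)     # hypothetical depth values
price = rng.lognormal(mean=7.8, sigma=1.0, size=n)  # generated independently of depth

# Pearson correlation coefficient: off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(depth, price)[0, 1]
print(f"Pearson r between depth and price: {r:.3f}")
```

Note that Pearson's coefficient only measures linear association, so it complements the scatter plot and does not replace it.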

Author Details
Lead Data Scientist
Farukh is an innovator in solving industry problems using artificial intelligence. His expertise is backed by 10 years of industry experience. As a senior data scientist, he is responsible for designing AI/ML solutions that provide maximum gains for clients. As a thought leader, his focus is on solving the key business problems of the CPG industry. He has worked across different domains such as Telecom, Insurance, and Logistics, and with global tech leaders including Infosys, IBM, and Persistent Systems. His passion for teaching inspired him to create this website!
