How to treat outliers in data in Python

Outliers are treated either by deleting the affected rows or by replacing the outlier values with a logical value, chosen based on business knowledge and on similar records in the data.

Consider the scenario below, where the Salary column contains an outlier.

Sample Output

Outlier analysis in Python

Based on the above charts, you can easily spot the outlier point located beyond 4000000.
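The visual check can be complemented with a numeric one. The snippet below is a minimal sketch, not the code used to produce the charts: the `employee_data` DataFrame and its values are made up for illustration, and the 1.5 x IQR rule is a common convention assumed here, not something taken from this article.

```python
import pandas as pd

# Hypothetical sample data; the names and salaries are invented for illustration.
employee_data = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F", "G"],
    "Salary": [32000, 45000, 51000, 38000, 47000, 42000, 4500000],
})

# Assumption: flag values outside 1.5 * IQR from the quartiles as outliers.
q1 = employee_data["Salary"].quantile(0.25)
q3 = employee_data["Salary"].quantile(0.75)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Boolean mask marking the rows that fall outside the fences.
is_outlier = (employee_data["Salary"] < lower_fence) | (employee_data["Salary"] > upper_fence)
outlier_rows = employee_data[is_outlier]
print(outlier_rows)
```

On this sample, only the 4500000 salary falls outside the fences, which matches what a box plot or histogram of the column would show.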


Treating the outlier values

You can sort and filter the data around the outlier value to see which value is the closest logical value to the rest of the data.

Once you find the closest logical value, replace all the outlier points with that value.
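The sort, filter, and replace steps can be sketched as follows. This is one possible approach under stated assumptions: the `employee_data` DataFrame is hypothetical, the cutoff of 4000000 is the value read off the charts in this article, and "closest logical value" is interpreted here as the largest salary that is not beyond the cutoff.

```python
import pandas as pd

# Hypothetical sample data; the names and salaries are invented for illustration.
employee_data = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F", "G"],
    "Salary": [32000, 45000, 51000, 38000, 47000, 42000, 4500000],
})

# Cutoff taken from the charts: points beyond this value are treated as outliers.
outlier_threshold = 4000000

# Sort descending so the values nearest to the outlier come first.
sorted_salaries = employee_data["Salary"].sort_values(ascending=False)

# Assumption: the closest logical value is the largest salary not beyond the cutoff.
closest_logical_value = sorted_salaries[sorted_salaries <= outlier_threshold].iloc[0]

# Work on a copy so the original data stays available for comparison.
treated_data = employee_data.copy()
treated_data.loc[treated_data["Salary"] > outlier_threshold, "Salary"] = closest_logical_value
print(treated_data)
```

In practice the replacement value should be confirmed against business knowledge before it is applied; the largest remaining value is only a starting point.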

Sample Output

Treating outlier values in Python


Removing the outlier values

Remove outliers only when the number of outlier rows is much smaller than the total number of rows in the data, so that little information is lost.
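A minimal sketch of the removal step is shown below. The `employee_data` DataFrame is hypothetical, and the cutoff of 4000000 is the value read off the charts in this article; checking the share of outlier rows first is a suggested precaution, not a step from the original code.

```python
import pandas as pd

# Hypothetical sample data; the names and salaries are invented for illustration.
employee_data = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F", "G"],
    "Salary": [32000, 45000, 51000, 38000, 47000, 42000, 4500000],
})

# Cutoff taken from the charts: points beyond this value are treated as outliers.
outlier_threshold = 4000000
is_outlier = employee_data["Salary"] > outlier_threshold

# Share of rows that would be dropped; on real data this should be small.
outlier_share = is_outlier.mean()
print(f"Outlier rows: {is_outlier.sum()} of {len(employee_data)} ({outlier_share:.1%})")

# Keep only the non-outlier rows and renumber the index.
cleaned_data = employee_data[~is_outlier].reset_index(drop=True)
print(cleaned_data)
```

The sample here is deliberately tiny, so one row is a large share of it; on a real dataset the same check helps decide whether dropping rows is acceptable or whether replacing the values is the safer option.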

Sample Output

Removing outlier values from data in Python

Author Details
Lead Data Scientist
Farukh is an innovator in solving industry problems using Artificial intelligence. His expertise is backed with 10 years of industry experience. Being a senior data scientist he is responsible for designing the AI/ML solution to provide maximum gains for the clients. As a thought leader, his focus is on solving the key business problems of the CPG Industry. He has worked across different domains like Telecom, Insurance, and Logistics. He has worked with global tech leaders including Infosys, IBM, and Persistent systems. His passion to teach inspired him to create this website!
