Data Science Interview Questions for IT Industry Part-1: Statistics

Introduction and Most Important Statistical concepts

Facing an interview for your next data science job can be easy!

Just make sure that the basics are in place and your passion for solving problems is highlighted during the interaction with the interview panel.

6 out of 10 questions in a data science interview will be pure concept based or theoretical. Hence it is easy to score well in this area.

Statistics is a vast subject and no one knows all of it! But there are some bare minimum concepts which you must know as a Data Scientist.

I have listed here some of the important questions asked very frequently in data science interviews related to statistics. I will write about Machine Learning and Data Science in the next series of blogs.

If you need answers for any specific question let me know in the comments section! I will be happy to help.

Tell me about yourself
Tell me about your recent data science project
How did you measure your predictive model accuracy?
What is MAPE?
What is the Median APE?
What is RMSE?
Which one is better RMSE or MAPE?
How to measure accuracy for Regression Models?
What is R Squared value in Regression?
How R Squared value is calculated?
What is Adjusted R Squared Value?
How to create a Confusion Matrix?
What is Precision?
What is Recall?
What is F1-Score?
What is Sensitivity?
What is Specificity?
What is ROC?
What is AUC?
What is Hypothesis testing?
What is P-Value?
What is Alpha Value?
What is Confidence Interval?
What is Type-1 error?
What is Type-2 error?
What is Standard Deviation?
What is an Outlier?
What is Correlation?
What is Sampling?
What are the types of sampling?
Why do we use sampling in machine learning?
What is central limit theorem?
What is T-Test?
What is Z-Test?
What is F-Test?
What is ANOVA test?
What is Chi-Square test?
What is AIC?
What is BIC?
What is Entropy?
What is Information Gain?
What is Gini Index?
What is Multicollinearity?
How to remove Multicollinearity in Data?
What is VIF?

Let’s get started!

Q: Tell me about yourself

The most cliche question ever! But still, it is asked in every interview 😉

This is your opportunity to show how you have reinvented yourself as a data scientist: you learned the required concepts and then put them to work by creating a use case in your current project.

In the IT industry, you can’t control which project you get assigned to, so many of us keep doing whatever we were given as our first project. If you can show that you took control of your career path by finding out what you like to do and learning it, it shows that you are serious about your career and goals. This is a great booster because it creates the image of a self-learning, self-motivated individual.

Write, rehearse, and memorize this introduction, and give a hint of your data science transition with a line like “I learned data science from XYZ classes and then got an opportunity to work on predictive analytics. The first implementation I did was a supervised machine learning regression problem.”

Q: Tell me about your recent data science project

This is another fixed question which you must prepare well, since you know it will be asked! Almost every interviewer asks it because this is where the next set of questions will emerge from. So it is very important to have this ready in detail. It will control the flow of your interview.

A good way is to follow the flow below:

State the business problem
Explain your approach to solving it
Narrate which machine learning algorithm you used
Describe how the deployment was done
Explain how it helped to improve the business

This flow approach will show your maturity as an experienced professional. Never state the technical details first. Focus on the bigger picture and then if asked about specifics of the algorithms or deployment, then only go into details.

If you are asked to provide a more detailed flow then follow this blog post to get a step-by-step approach in order to narrate the story.

Q: How did you measure your predictive model accuracy?

Accuracy is measured differently for Regression and Classification predictive models. Depending on what type of project you did, mention how accuracy was measured and how the model was deployed in production.

Regression: 100-MAPE or 100- (Median APE)

Classification: F1-Score, Precision, Recall, AUC, ROC

All these terms are explained one by one in the following section.

Figure: Measuring predictive model accuracy

Q: What is MAPE?

Mean Absolute Percent Error.

This value tells you, on average, how large the error of the predictive model is for each prediction, expressed as a percentage of the original value.

APE is calculated by finding the absolute percentage difference between the predicted and original values. Take a look at the above example, in the first row the APE is 25% because the absolute difference between Original and Prediction is 5 and when we divide it by Original value 20 it gives 25%.

In a way you can say the prediction was 25% away from the original, hence the accuracy for this prediction was 100-25 = 75%. Now if we need to understand the overall accuracy, then we take the average of error for each of the predictions. This is known as Mean Absolute Percent Error.

Accuracy of the model is 100- MAPE
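The calculation can be sketched in a few lines of Python. This is only an illustration: the data is made up, and only the first row (original 20, prediction 25, APE 25%) mirrors the example described above. The function name ape is my own choice.

```python
def ape(original, predicted):
    """Absolute percent error of a single prediction, in percent."""
    return abs(original - predicted) / abs(original) * 100

# Hypothetical data; only the first pair follows the example in the text.
originals = [20, 50, 100, 80]
predictions = [25, 45, 110, 76]

apes = [ape(o, p) for o, p in zip(originals, predictions)]
mape = sum(apes) / len(apes)   # Mean Absolute Percent Error
accuracy = 100 - mape          # Accuracy as defined in this post

print("APE per row:", apes)
print("MAPE:", mape, "Accuracy:", accuracy)
```

Note that APE is undefined when the original value is 0, so such rows need separate handling.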

Q. What is the Median APE?

Median Absolute Percent Error.

You can see MAPE is higher because of the outlier ‘25.0’, which means there is one prediction which has 25% error.

The Median APE is used because the Mean APE is affected by outliers and can even go above 100%, which would make the accuracy value (100-MAPE) negative.

It is helpful for analyzing the central tendency of the error committed by the predictive model. For example, if the Median APE is 5% and there are 60 predictions in total, then about 30 of them have an error of 5% or less.
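Here is a small sketch of how one large error pulls the mean up while the median barely moves. The APE values are hypothetical, with 25.0 playing the role of the outlier.

```python
from statistics import mean, median

# Hypothetical APE values (in percent); 25.0 is the outlier.
apes = [25.0, 4.0, 5.0, 3.0, 6.0]

mean_ape = mean(apes)      # pulled upwards by the single large error
median_ape = median(apes)  # stays near the typical error

print("MAPE:", mean_ape)
print("Median APE:", median_ape)
```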

Q. What is RMSE?

Root Mean Squared Error

Find the difference between the original and predicted values for each row
Square the differences
Sum all the squared differences
Divide the sum by the number of rows to get the mean squared error
Take the square root of the mean squared error
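The steps above can be sketched as follows. The five original and predicted values are illustrative only, and the function name rmse is my own.

```python
import math

def rmse(originals, predictions):
    """Root Mean Squared Error, following the steps listed above."""
    squared_diffs = [(o - p) ** 2 for o, p in zip(originals, predictions)]
    mean_squared_error = sum(squared_diffs) / len(squared_diffs)
    return math.sqrt(mean_squared_error)

# Illustrative values, not taken from a real model.
originals = [10, 14, 18, 20, 11]
predictions = [12, 13, 15, 23, 15]

error = rmse(originals, predictions)
print("RMSE:", round(error, 2))
```

Unlike MAPE, the result is in the same unit as the target variable, not a percentage.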

Q. Which one is better RMSE or MAPE?

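In terms of interpretation, MAPE is better because it is easy to visualize. It represents the average percentage error committed by the predictive model. RMSE does not have such a clear interpretation.

In terms of penalizing large errors (outliers), RMSE is better: because every error is squared, a few big misses increase RMSE much more than many small ones. MAPE, on the other hand, gets distorted when the original values are close to zero, because even a small absolute error becomes a huge percentage.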

Q. How to measure accuracy for Regression Models?

Subtract the Mean Absolute Percent Error from 100 and the value is accuracy. All the below calculations will fetch accuracy values for the predictive model.

100-MAPE
100-(Median APE)

Q. What is R-Squared value in Regression?

R2 value measures the goodness of fit. It is NOT the Accuracy of the model. Accuracy is measured using 100-MAPE value.

It tells what proportion of the total variance in the target variable is explained by the model. That means variance explained by the model vs. total variance.

Max value of R2 is 1
Min value of R2 is 0 for a linear regression with an intercept, measured on its training data (it can go negative when a model fits worse than simply predicting the mean, for example on test data)

As a rule of thumb, a good range for the R2 value is 0.6 to 0.9. This means the predictive model is able to explain a good amount of variance in the data and can be taken into consideration for testing and accuracy calculation on Test Data.

R2 < 0.5 suggests the model is tending towards underfitting
R2 > 0.9 suggests the model may be tending towards overfitting, so verify it on Test Data

From a visual perspective, it reflects how closely the points sit around the line of best fit compared with how far they are spread around their mean.

Q. How R-Squared value is calculated?

Consider the example below of five original values and their predictions.

Original:   10, 14, 18, 20, 11
Prediction: 12, 13, 15, 23, 15

SSres means the Sum of Squared Residuals.

In the above example, SSres is 39.

SStot means the Sum of Squared distances of each original value from the mean of the original values (14.6 here).

In the above example, SStot is 75.2

The calculation for SSres and SStot can be seen below.

SSres = (10-12)² + (14-13)² + (18-15)² + (20-23)² + (11-15)² = 39
SStot = (10-14.6)² + (14-14.6)² + (18-14.6)² + (20-14.6)² + (11-14.6)² = 75.2

This equates to R2 = 1 - (SSres/SStot) = 1 - (39/75.2) = 0.48

This means, for the given data, the model was able to explain 48% of the total variance.
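The same calculation as a short Python sketch. I am assuming the fourth prediction is 23, since that is the value which reproduces SSres = 39; the function name r_squared is my own.

```python
def r_squared(originals, predictions):
    """Return (R2, SSres, SStot) for paired original and predicted values."""
    mean_original = sum(originals) / len(originals)
    ss_res = sum((o - p) ** 2 for o, p in zip(originals, predictions))
    ss_tot = sum((o - mean_original) ** 2 for o in originals)
    return 1 - ss_res / ss_tot, ss_res, ss_tot

originals = [10, 14, 18, 20, 11]
predictions = [12, 13, 15, 23, 15]  # fourth value assumed to be 23

r2, ss_res, ss_tot = r_squared(originals, predictions)
print("SSres:", ss_res, "SStot:", round(ss_tot, 1), "R2:", round(r2, 2))
```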

Q. What is Adjusted R-Squared Value?

Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)

p = Total number of explanatory variables in the model
n = number of rows in training data.

The adjusted R2 value is never higher than R2, and it can also be negative.

Adjusted R2 takes into account the addition of new predictors to the model. It adjusts the value and does not allow the variance explained to increase just because a new predictor was added.

The Adjusted R2 value increases only if the new predictor is genuinely helpful in predicting the target variable, whereas R2 increases (or at best stays the same) with every new predictor added to the model.

Hence the Adjusted R2 value is more reliable when judging the goodness of fit of regression models.
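A minimal sketch of the standard textbook formula, Adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), using the p and n defined above. The function name and the sample numbers are made up for illustration.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R2 for n training rows and p explanatory variables."""
    if n - p - 1 <= 0:
        raise ValueError("need more rows than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical cases: same data size, different number of predictors.
one_predictor = adjusted_r2(r2=0.48, n=5, p=1)
weak_model = adjusted_r2(r2=0.10, n=5, p=2)   # turns negative

print("Adjusted R2 with 1 predictor:", round(one_predictor, 3))
print("Adjusted R2 for a weak model:", round(weak_model, 3))
```

The second call shows how a low R2 combined with several predictors on few rows pushes the adjusted value below zero.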

Q. How to create a Confusion Matrix?

A Confusion matrix is created by comparing original values with predicted values in a classification model.

True Positive(TP): How many times Yes was predicted as Yes
True Negative(TN): How many times No was predicted as No
False Positive(FP): How many times No was predicted as Yes
False Negative(FN): How many times Yes was predicted as No

In the example below, all of the above have been counted and the resultant matrix is known as a confusion matrix.

Figure: How to create a confusion matrix
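The counting can be sketched in plain Python. The Yes/No labels below are hypothetical and the function name confusion_counts is my own; libraries such as scikit-learn offer ready-made equivalents.

```python
def confusion_counts(actual, predicted, positive="Yes"):
    """Count TP, TN, FP and FN by comparing actual and predicted labels."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted Yes: correct if actual is Yes, otherwise a false alarm
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted No: a miss if actual is Yes, otherwise correct
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Hypothetical labels for eight rows.
actual    = ["Yes", "Yes", "No", "No",  "Yes", "No", "Yes", "No"]
predicted = ["Yes", "No",  "No", "Yes", "Yes", "No", "Yes", "No"]

counts = confusion_counts(actual, predicted)
print(counts)
```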

Q. What is Precision?

How many correct predictions were done for a class out of all predictions for that class?

Precision for ‘Yes’ class will tell out of all the ‘Yes’ predicted by the algorithm, how many were correct? Similarly, Precision for ‘No’ class will tell out of all the ‘No’ predicted by the algorithm, how many were correct?

i.e. how precise the prediction is for that class.

A good range for precision is 0.7-0.9

Precision for the ‘Yes’ class = TP / (TP + FP)
Precision for the ‘No’ class = TN / (TN + FN)

Q. What is Recall?

How many actual values were correctly recalled by the model? In other terms, how many predictions were correct out of all the original values for that class.

Recall for ‘Yes’ will tell out of all the Actual ‘Yes’ values how many were correctly predicted by the model.

Recall for ‘No’ will tell out of all the Actual ‘No’ values how many were correctly predicted by the model.

A good range for the recall is 0.7-0.9.

Q. What is F1-Score?

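F1-Score is the harmonic mean of Precision and Recall, i.e. F1 = 2 * Precision * Recall / (Precision + Recall).

It is commonly used as the accuracy measure of a classification predictive model because it is high only when both precision and recall are high. It tells how efficient the model is at predicting Yes as Yes and No as No.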

A good range for F1-Score is 0.7-0.9
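Precision, Recall and F1-Score can all be derived from the confusion matrix counts. A rough sketch with hypothetical counts follows; the function names are my own.

```python
def precision(tp, fp):
    """Out of everything predicted as Yes, how much was actually Yes."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Out of everything that was actually Yes, how much was predicted Yes."""
    return tp / (tp + fn)

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Hypothetical confusion matrix counts for the 'Yes' class.
tp, fp, fn, tn = 40, 10, 20, 30

p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
recall_no = tn / (tn + fp)  # recall for the 'No' class

print("Precision:", round(p, 3), "Recall:", round(r, 3), "F1:", round(f1, 3))
print("Recall(No):", round(recall_no, 3))
```

Because it is a harmonic mean, F1 sits closer to the lower of the two values, so a model cannot score well by being strong on only one of them.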

Q. What is Sensitivity?

Recall for the ‘Yes’ class is also known as Sensitivity, or the True Positive Rate (TPR).

Sensitivity = TP / (TP + FN)

Q. What is Specificity?

Recall for the ‘No’ class is also known as Specificity, or the True Negative Rate (TNR).

Specificity = TN / (TN + FP)

Q. What is ROC?

The curve between True Positive Rates(TPR) in Y-Axis and False Positive Rates(FPR) in X-Axis is known as the ROC curve. ROC stands for Receiver Operating Characteristic.

The plot is generated by changing the probability threshold used to label a prediction as Yes, and capturing the (FPR, TPR) pair produced at each threshold.

Q. What is AUC?

Area Under the Curve (AUC)

The amount of area covered under the ROC curve. A perfect classifier will have a value of 1. As a rule of thumb, a good range for AUC is 0.6-0.9. It helps to understand the performance of the model: the higher the AUC, the better.

If the value of AUC is around 0.5, the predictive model is not able to discriminate between the classes and is no better than random guessing. A value below 0.5 means the model is doing worse than random, which usually indicates that the predictions are inverted.
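One common way to trace the ROC curve is to sweep the decision threshold over the predicted scores and record the (FPR, TPR) pair at each step; the area is then obtained with the trapezoidal rule. The sketch below does this in plain Python. The labels and scores are hypothetical, and both function names are my own.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by lowering the threshold step by step."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve using the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical true labels (1 = Yes, 0 = No) and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print("ROC points:", points)
print("AUC:", round(area, 3))
```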

Q. What is Hypothesis testing?

Hypothesis means assumption.

Hypothesis testing means checking, based on the given data, whether our assumption holds up.

Consider a scenario from a tire factory. The radius of the ideal tire must be 16 inches. However, a deviation of up to 8% is accepted. In this scenario, we can apply hypothesis testing as below, using some dummy values for the explanation.

Define the Null Hypothesis (H0): The radius of the tire = 16 Inch
Define the alternate Hypothesis (Ha): The radius of the tire != 16 Inch
Define the error tolerance limit (alpha): 8%
Conduct the test: Chosen T-Test
Look at the P-value generated by the test: P-value = 0.79
If the P-value is greater than the threshold (0.08 here; 0.05 is the common textbook choice), we fail to reject the Null Hypothesis, otherwise we reject it: 0.79 is above the threshold, so H0 is not rejected and the tires produced are treated as good quality

Q. What is P-Value?

P-Value is the probability of seeing data at least as extreme as what we observed, assuming H0 is True. It is not the probability that H0 itself is true.

The higher the P-value, the more consistent the data is with our assumption (H0). The textbook threshold to reject a Null Hypothesis is 5%. So, if the P-Value is less than 0.05, data like ours would show up less than 5% of the time if the Null Hypothesis were true, hence it is rejected. Otherwise, if the P-Value is more than 0.05, we fail to reject the Null Hypothesis (loosely speaking, it is “accepted”).

Q. What is Alpha-Value?

The acceptable error threshold, also known as the level of significance. It is the chance of rejecting H0 when it is actually true that we are willing to tolerate.

In the above tire example, the acceptable error amount was 8%

Q. What is Confidence Interval?

The range of values that is likely to contain the population mean, calculated from a sample and the chosen error threshold (Alpha Value).

In the above example the assumed population mean is 16, since we have assumed that all good tyres are produced with a 16 inch radius.

If we take a sample of 50 tyres, then we will have values like 16.2, 16.3, 15.98, 15.96, 15.99, 16.23 and so on.

For the sake of understanding let’s say the mean radius of those 50 tires came out to be 16.15. This is called Sample mean.

Now, based on this sample, we can calculate a range, i.e. the min and max values between which the mean of the population is expected to lie. The mean of the population is the mean of the radius of all the tyres.

So basically, we are trying to estimate, how the mean of all the tyres look based on the given sample. And instead of giving a single value answer, we are providing a range of values. This range is known as Confidence Interval.

The confidence interval is affected by the alpha value. For every alpha value, we find the value of a statistic which gets multiplied with the standard error.

Confidence Interval = [ Mean(Sample) - N*(SE), Mean(Sample) + N*(SE) ]

SE = Standard Error = Standard Deviation of sample / sqrt(number of values in the sample)
N = Value of the statistic for the given alpha value (probability of error margin). Use the Z-statistic when the sample is large or the population standard deviation is known, and the t-statistic when the sample is small and the standard deviation is estimated from the sample.

For example, let us choose an alpha value of 5%. Hence, we are 95% confident that the mean value of the population will fall within the confidence interval we find. Assuming a normal distribution, the value of N is 1.96 for alpha=5%. Similarly, the value of N is 2.58 for alpha=1%, and so on. These “N” values come from the Z-values of the standard normal distribution, the ideal bell curve.

Hence, to calculate a confidence interval for the population mean, we need a sample of values: we calculate its mean, we calculate its standard deviation, and we find the N-value based on the alpha level.

For the sake of explanation, assume below values were found for a sample of 50 tyres.

Sample Mean of radius=16.15
Standard deviation of 50 radius values=0.64
n=50
N=1.96 for alpha=5%

For the above values, the confidence interval will be calculated as [ 16.15 – 1.96*(0.64/sqrt(50)) , 16.15 + 1.96*(0.64/sqrt(50)) ].

Which comes out as approximately [15.97 , 16.33].

Hence, based on the given sample of 50 tyres, we are 95% confident that the mean value of the radius of all the tyres (population) will be somewhere between 15.97 and 16.33.
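The same arithmetic as a short Python sketch, using the sample values assumed above (mean 16.15, standard deviation 0.64, 50 tyres, N = 1.96). The variable names are my own.

```python
import math

sample_mean = 16.15   # mean radius of the 50 sampled tyres
sample_sd = 0.64      # standard deviation of the 50 radius values
n = 50                # sample size
z = 1.96              # Z-value for alpha = 5%

standard_error = sample_sd / math.sqrt(n)
lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print("Standard error:", round(standard_error, 4))
print("Confidence interval:", [round(lower, 2), round(upper, 2)])
```

Swapping z for the value that matches another alpha widens or narrows the interval accordingly.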

Q. What is Type-1 error?

A type-1 error, also known as an error of the first kind, occurs when the null hypothesis (H0) is really true but is rejected.

In terms of the confusion matrix, the False Positives(FP) are Type-1 errors.

Q. What is Type-2 error?

A type-2 error, also known as an error of the second kind, occurs when the null hypothesis is false but we fail to reject it.

In terms of the confusion matrix, the False Negatives(FN) are Type-2 errors.

Q. What is Standard Deviation?

Standard Deviation tells us the overall spread of the values. Roughly speaking, it is the typical distance of each point from the mean value.

To calculate it, take the square root of the variance, where the variance is the average of the squared distances of the points from the mean. If the standard deviation is large, the data is very scattered; if it is small, the values are close to each other. More details about standard deviation can be found in this blog post.

Q. What is an Outlier?

Values which are extremely low or extremely high compared to all the other values in a dataset are called outliers.

E.g. in (1, 2, 3, 4, 5, 6, 50), 50 is an outlier because it is abnormally large compared to the rest of the values in the dataset.

Q. What is Correlation?

Correlations are mathematical relationships between variables. You can identify correlations on a scatter diagram by the distinct patterns they form. The correlation is said to be linear if the scatter diagram shows the points lying in an approximately straight line. Let’s take a look at a few common types of correlation between two variables:

Positive linear correlation (r between 0 and 1)

Positive linear correlation is when low values on the x-axis correspond to low values on the y-axis, and higher values of x correspond to higher values of y. In other words, y tends to increase as x increases.

Negative linear correlation (r between -1 and 0)

Negative linear correlation is when low values on the x-axis correspond to high values on the y-axis, and higher values of x correspond to lower values of y. In other words, y tends to decrease as x increases.

No correlation (r close to 0)

If the values of x and y form a random pattern, then we say there’s no correlation.

Q. What is Sampling?

Sampling means selecting a subset of values from a larger group, usually at random.

Consider a jar filled with bubble gums of various colors.

If you ‘randomly’ select a few gums from the jar, it is very likely that the selected ones will have gums of all colors.

Hence, you can say that the randomly selected sample is a representative of all the gums present in the jar.

In statistical language, these randomly selected gums are known as the sample, and the jar is known as the population.

More details about Sampling Theory can be found in this blog post.

Q. What are the types of sampling?

There are four major types of sampling techniques listed below. More details can be found on this blog post

Simple Random Sampling
Stratified Sampling
Systematic Sampling
Biased sampling

Q. Why do we use sampling in machine learning?

Sampling Theory helps you to examine how good the predictive model will perform BEFORE it is deployed in production. Typically we keep 70% to train the model and 30% to test the model. However, this ratio can be changed to 80:20 or 75:25 and the results are observed. More details about it can be found in this blog post.

Q. What is central limit theorem?

If you repeatedly take large samples (more than 30 values each) of size n from a population, then the mean values of all those samples will approximately follow a normal distribution, even when the population itself is not normally distributed. i.e. if you plot the histogram of those sample means, it will form a bell curve.

Q. What is T-Test?

The T-Test is one of the many Tests employed in Hypothesis testing.

It is used to see if the mean of the population is statistically different from an assumed value (Null Hypothesis).

Consider the example below, where we select some gumballs at random from a jar.

Assumption: The average size of all gumballs inside the jar is 25mm (µ₀)

If you randomly select some 20 gumballs from the jar then the average size of those gumballs should be 25mm. However, it can vary a little bit due to manufacturing defects so let us say the average came out to be 24.3mm(X̄)

Assuming the Standard Deviation of the sizes of the selected gumballs: 0.1mm (s)

T-Value = (X̄ – µ₀) / (s / √n)

Here in our case: T-Value = (24.3 – 25) / (0.1 / √20) = -31.30

The higher the absolute T-Value, the more likely it is that the difference between the assumed population mean and the sample mean is statistically significant.

The lower the absolute T-Value, the less likely it is that the difference is statistically significant, i.e. the sample mean is consistent with the assumed population mean.
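The T-Value from the gumball example can be reproduced with a few lines of Python. This is a sketch of the one-sample formula only; the function name t_value is my own, and turning the T-Value into a P-value would additionally need the t-distribution with n-1 degrees of freedom.

```python
import math

def t_value(sample_mean, assumed_mean, sample_sd, n):
    """One-sample T-Value: (sample mean - assumed mean) / standard error."""
    return (sample_mean - assumed_mean) / (sample_sd / math.sqrt(n))

# Values from the gumball example above.
t = t_value(sample_mean=24.3, assumed_mean=25, sample_sd=0.1, n=20)
print("T-Value:", round(t, 2))
```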

The t-test is also used in Linear Regression to test which variable is helping to predict the target variable and which is not.

H0: The variable is not helping

The t-test is conducted for each of the variables and it produces a T-Value and a probability. If the probability (p-value) is less than 0.05 then we reject the hypothesis(H0). That means the variable is helping and our assumption was wrong. So we select the variable in the model.

Q. What is Z-Test?

The Z-Test is similar to the T-Test. We use the Z-Test when the sample size is MORE than 30 (or the population standard deviation is known), and the T-Test when the sample size is LESS than 30.

In the Z-test, if the population standard deviation σ is not known, it is approximated by the sample standard deviation (sd), which gives a standard error of sd/√(n).

Z-score can be calculated from the following formula.

z = (X – μ) / σ

Where z is the z-score, X is the value of the element, μ is the population mean, and σ is the standard deviation.

Q. What is F-Test?

F-Test is used to check if the VARIANCES are equal for two populations.

It is also used in Linear Regression, where the Null Hypothesis (H0) is that none of the predictors help, i.e. all the coefficients are zero and no useful model can be created. If the P-Value < 0.05, H0 is rejected and the model is accepted.

Q. What is ANOVA test?

ANOVA stands for Analysis of Variance. It is used when the means of more than 2 groups (populations) are to be compared. The T-Test cannot be used here since it can compare means from a maximum of 2 groups. Hence, when we have more than 2 groups, we use ANOVA, which is performed using the F-Test.

Q. What is Chi-Square test?

Chi-Square test is used to check if there is any relationship between two categorical variables. We cannot compute the correlation value between two categorical variables hence Chi-square test is used.

χ² = Σ (Observed – Expected)² / Expected, where the sum runs over every cell of the table of counts for the two variables.
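As an illustration, the statistic can be computed from a table of observed counts, where the expected count of each cell is (row total * column total) / grand total. The 2x2 table below is hypothetical and the function name chi_square is my own.

```python
def chi_square(table):
    """Chi-Square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = Gender (M, F), columns = Purchased (Yes, No).
observed = [[30, 10],
            [20, 40]]

stat = chi_square(observed)
print("Chi-Square statistic:", round(stat, 3))
```

The larger the statistic, the stronger the evidence of a relationship; the P-value comes from the Chi-Square distribution with (rows-1)*(columns-1) degrees of freedom.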

Q. What is AIC?

The Akaike Information Criterion (AIC) provides a method for assessing the quality of your predictive model through comparison of related models. The number itself is not meaningful. If you have more than one model, then you should select the model that has the smallest AIC.

AIC is used in Logistic Regression to perform goodness of fit test since there is no R2 for Logistic Regression.

AIC is computed using below equation

AIC = -2 * log-likelihood + K * nPar

log-likelihood: The log-likelihood of logistic regression
K : 2
nPar: Number of estimated parameters in the model (roughly the number of predictor columns, plus the intercept)

Q. What is BIC?

The formula for the Bayesian information criterion (BIC) is similar to the formula for AIC, but with a different penalty for the number of parameters. Unlike the AIC, the BIC penalizes free parameters more strongly.

BIC is computed using the equation below, which has the same form as AIC. The only difference is the value of K used in BIC.

BIC = -2 * log-likelihood + K * nPar

log-likelihood: The log-likelihood of logistic regression
K : log(number of Rows)
nPar: Number of estimated parameters in the model (roughly the number of predictor columns, plus the intercept)
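Both criteria can be sketched as two small functions that differ only in K. The log-likelihood, parameter count and row count below are hypothetical, and log here means the natural logarithm.

```python
import math

def aic(log_likelihood, n_params):
    """AIC = -2 * log-likelihood + 2 * number of parameters."""
    return -2 * log_likelihood + 2 * n_params

def bic(log_likelihood, n_params, n_rows):
    """BIC = -2 * log-likelihood + log(number of rows) * number of parameters."""
    return -2 * log_likelihood + math.log(n_rows) * n_params

# Hypothetical fitted model: log-likelihood -120.5, 4 parameters, 200 rows.
model_aic = aic(-120.5, 4)
model_bic = bic(-120.5, 4, 200)

print("AIC:", model_aic)
print("BIC:", round(model_bic, 2))
```

Since log(number of rows) is greater than 2 once there are 8 or more rows, BIC penalizes each extra parameter more than AIC on any realistic dataset.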

Q. What is Entropy?

Entropy means randomness. It is used to measure the randomness/impurity in a group.

In simple terms, if all the entities in a group are of the same type then the group is pure and has no randomness, hence the entropy is Zero.

Formula to calculate Entropy

Entropy = - Σ pi * log2(pi)

pi is the probability (proportion) of class i in the group.

The entropy of a group in which all examples belong to the same class is Zero. It means minimum randomness.

entropy = -1 * log2(1) = 0

The entropy of a two-class group in which 50% of the examples belong to each class is 1. It means maximum randomness.

entropy = -0.5 * log2(0.5) – 0.5 * log2(0.5) = 1

The Machine Learning algorithms ID3 (Iterative Dichotomiser 3), C4.5 and C5.0 all use Entropy to find the best root node and to split further nodes.
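A minimal sketch of the entropy calculation for a list of class labels. The labels are hypothetical and the function name entropy is my own. It uses the equivalent form p * log2(1/p), which is the same as -p * log2(p).

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (base 2) of a group, computed from its class proportions."""
    total = len(labels)
    proportions = [count / total for count in Counter(labels).values()]
    return sum(p * math.log2(1 / p) for p in proportions)

pure_group = ["Yes", "Yes", "Yes", "Yes"]   # all the same class
mixed_group = ["Yes", "Yes", "No", "No"]    # 50/50 split

print("Pure group:", entropy(pure_group))
print("Mixed group:", entropy(mixed_group))
```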

Q. What is Information Gain?

How much information is gained if a node is split in a decision tree? In other terms how much discriminative power is gained if the node is split.

Formally it is defined as the total entropy of the Parent node minus the weighted average of the Entropy of all child nodes, where each child is weighted by the share of the parent's rows it receives.

Information Gain = Entropy(Parent Node) – Weighted Average(Entropy(All Child Nodes))

In the best case scenario, the parent node will have highest Entropy=1 and all the child nodes will have Entropy=0. The information gain in this scenario will be the highest. The value will be equal to 1.
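Here is a rough sketch of the calculation, with each child node weighted by its share of the parent's rows as in the weighted average mentioned above. The labels are hypothetical and both function names are my own.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (base 2) of a group of class labels."""
    total = len(labels)
    return sum((c / total) * math.log2(total / c)
               for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["Yes"] * 5 + ["No"] * 5

# Best case: the split separates the two classes completely.
perfect = information_gain(parent, [["Yes"] * 5, ["No"] * 5])

# A weaker split: each child still contains one row of the other class.
partial = information_gain(parent, [["Yes"] * 4 + ["No"],
                                    ["Yes"] + ["No"] * 4])

print("Perfect split gain:", perfect)
print("Partial split gain:", round(partial, 3))
```

A decision tree evaluates candidate splits like these and picks the one with the highest gain.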

Machine Learning Algorithm CART (Classification and Regression Trees) uses Information Gain or Gini Index to find the best root node and split further nodes.

Q. What is Gini-Index?

Gini Index is similar to Entropy. If all values are the same, then the value of the Gini Index will be zero. Otherwise, it will be some positive value calculated using the formula below.

Gini = 1 - Σ pi², where pi is the proportion of class i

For example, if there are two classes (Binary Classification) and both of them are present 50/50 then the value calculated will be as below:

Gini = 1 - (1/2)^2 - (1/2)^2 = 0.5
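The same calculation as a small Python sketch, using Gini = 1 minus the sum of squared class proportions. The labels are hypothetical and the function name gini is my own.

```python
from collections import Counter

def gini(labels):
    """Gini Index of a group: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1 - sum((count / total) ** 2 for count in Counter(labels).values())

balanced = ["Yes", "No", "Yes", "No"]   # two classes present 50/50
pure = ["Yes", "Yes", "Yes"]            # all values the same

print("Balanced group:", gini(balanced))
print("Pure group:", gini(pure))
```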

Q. What is Multicollinearity?

Collinearity means a linear relationship between two variables.

Two variables are perfectly collinear if there is an exact linear relationship between them. For example, V2 = a* V1 +b. If there is such a relation, then V1 and V2 are collinear.

Multicollinearity refers to a situation in which two or more explanatory variables in a Multiple Regression model are highly linearly related.

More commonly, the issue of multicollinearity arises when there is an approximately linear relationship between two or more independent variables(Predictors).

In simple terms, two Predictor variables that have a high correlation value will generate Multicollinearity.

It is bad for a regression model because the correlated Predictors carry the same information, so the model ends up trying to explain the same variance twice. The coefficient estimates become unstable (their variances get inflated), and it becomes hard to tell which Predictor is actually driving the target variable.

Q. How to remove Multicollinearity in Data?

Check the VIF of all the Predictor variables using the vif() function from library(car) in R, or the variance_inflation_factor() function from the statsmodels library in Python
If any variable has VIF > 5, remove the one with the highest VIF from the regression equation
Re-check the VIF
Repeat steps 1-3 till all variables have VIF < 5

Q. What is VIF?

Variance Inflation Factor. It is used to detect multicollinearity in data.

Formula to calculate VIF

VIF(j) = 1 / (1 - R²j)

R²j is the R-Squared of the regression of Predictor j on all the other Predictors. Tolerance is the reciprocal of VIF, i.e. 1 - R²j.

A tolerance of less than 0.20 or 0.10 and/or a VIF of 5 or 10 and above indicates a multicollinearity problem.
If a high VIF is found for multiple Predictors, the Predictor with the highest VIF is removed and the test is conducted again.
library(car) in R has a function called vif() to calculate VIF for each of the predictors.
In Python, the statsmodels library provides the variance_inflation_factor() function.
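To see what the number means, here is a hand-rolled sketch limited to the simplest case of two Predictors, using VIF = 1 / (1 - R²j). With only two Predictors, R²j is just the squared correlation between them. The data and function names are made up for illustration; for real models with more Predictors, use the library functions mentioned above.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two Predictors (same value for both)."""
    r2 = pearson_r(x1, x2) ** 2   # R-Squared of regressing one on the other
    return 1 / (1 - r2)

# Hypothetical Predictor columns.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]

vif = vif_two_predictors(x1, x2)
print("VIF:", round(vif, 2), "Tolerance:", round(1 / vif, 2))
```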

Conclusion and Further Reading:

Statistics is the backbone of Data Science. All machine learning is possible because statistics are combined with programming.

In order to apply machine learning as data science, you must understand the statistics behind it because it helps to choose the right thing at the right place.

I highly recommend going through the resources below to dive deep into statistics.

Head First Statistics: This book makes statistics fun! Written in an easy to understand way for anyone who hates mathematics.
Khan Academy: This website takes you from very basic to advanced level concepts in a step by step way

If you feel there is any concept for which you need an explanation, please submit your question in the comments. I will add it to this list.

Author Details

Farukh Hashmi

Lead Data Scientist

Farukh is an innovator in solving industry problems using Artificial intelligence. His expertise is backed with 10 years of industry experience. Being a senior data scientist he is responsible for designing the AI/ML solution to provide maximum gains for the clients. As a thought leader, his focus is on solving the key business problems of the CPG Industry. He has worked across different domains like Telecom, Insurance, and Logistics. He has worked with global tech leaders including Infosys, IBM, and Persistent systems. His passion to teach inspired him to create this website!

https://thinkingneuron.com/

thinkingneuron@gmail.com

13 thoughts on “Data Science Interview Questions for IT Industry Part-1: Statistics”

R
May 29, 2019 at 4:36 pm

In Z-test section, Sample size should be less than 30 for T- test

1. Farukh Hashmi
  May 30, 2019 at 2:17 am
  
  Thank you for pointing!
  It is corrected now 🙂
  
Dinesh
June 28, 2020 at 9:37 pm

Hi FARUKH, I find the calculation of RMSE is wrong. Instead of taking the square of APE, it should be the square of the difference between the original and predicted value.

1. Farukh Hashmi
  June 29, 2020 at 7:22 am
  
  Hi Dinesh!
  Thank you for pointing that out!
  I have corrected the mistake in the image 🙂
  
Biswajit
July 6, 2021 at 4:11 am

Very Helpul and organised

1. Farukh Hashmi
  July 6, 2021 at 3:38 pm
  
  Thank you Biswajit!
  I am glad it was helpful to you!
  
Abhishek Belkar
July 4, 2022 at 5:41 pm

Ekdam awesome Content!!
I will come back here again and again

amol
August 15, 2022 at 6:51 am

can you explain what is end to end steps for ML project if possible make full video on end to end ml
project
ex. lone pridiction

1. Farukh Hashmi
  August 15, 2022 at 8:06 am
  
  Hi Amol
  
  Sure, please take a look at this video which helps to document the whole process with timelines for any ML project
  
  https://youtu.be/PVoPLqR9_dU
  
Debleena Mallick
October 1, 2022 at 10:11 am

Its awesome, thank you sir 🙂

Chinmaya Gokhale
August 21, 2023 at 5:30 am

Dear sir,
The content is awesome..
Can you please check FP & FN in confusion matrix.
Because i am often getting confused when i see the other contents.
On which side, FP and FN has to come..
Thank you.

Chinmaya Gokhale
August 21, 2023 at 6:02 am

My bad, mistake from my side..
The original and predicted sides had got interchanged..