Feature Selection
- How to select columns for machine learning?
- Bar Plots: categorical variable data quality check
- Histograms: continuous variable data quality check
Visual Correlation Analysis
- Scatter Plots: continuous vs continuous columns
- Box Plots: continuous vs categorical columns
- Grouped Bar Charts: categorical vs categorical columns
Statistical Correlation Analysis
- Correlation value: continuous vs continuous columns
- ANOVA test: continuous vs categorical columns
- Chi-Square test: categorical vs categorical columns
Data Pre-processing before Machine Learning
Supervised ML: Regression
Supervised ML: Classification
- Logistic Regression
- Decision Tree Classifier
- Random Forest Classifier
- AdaBoost Classifier
- XGBoost Classifier
- KNN Classifier
- SVM Classifier
Model Testing and Validation
Model Tuning and Hyperparameter Optimization
Unsupervised ML: Clustering
Unsupervised ML: Dimensionality Reduction
Association Rule Mining
Text Analysis
- Wordcloud
- Bigram/Trigram Wordcloud
- How to extract tweets in Python from the Twitter database
- Wordcloud analysis of tweets
- Sentiment Analysis of tweets
NLP
- Tokenization of text
- How to generate N-Grams
- How to do POS tagging of text
- How to do Chunking
- Named Entity Recognition (NER) using spaCy
- TF-IDF/Count Vectorization of text
- Word2Vec
- GloVe
- BERT sentiment analysis