Linear Regression
How It Works: Fits the linear relationship (a line, or a hyperplane with multiple features) that best describes how the dependent variable changes with the independent variables, typically by minimizing squared error.
Uses: Regression analysis to predict a continuous value.
Advantages: Simple, interpretable, fast.
Limitations: Assumes a linear relationship between variables.
Disadvantages: Can't model complex relationships without transformation.
Business Use Cases:
Predicting housing prices based on features.
Forecasting sales for the next quarter.
Estimating life expectancy based on health parameters.
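As a minimal sketch of the idea, the following fits a line to synthetic data with scikit-learn (the data and the true slope/intercept here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data following y = 3x + 5 plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + 5 + rng.normal(0, 0.1, size=100)

model = LinearRegression().fit(X, y)
# The fitted slope and intercept should land close to 3 and 5
slope, intercept = model.coef_[0], model.intercept_
```

For housing prices or sales forecasting, X would simply hold several feature columns instead of one.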
Logistic Regression
How It Works: Estimates the probability that a given instance belongs to a particular category.
Uses: Binary and multi-class classification problems.
Advantages: Probabilistic approach, interpretable, fast.
Limitations: Assumes linearity between features and log odds.
Disadvantages: Struggles with non-linear boundaries.
Business Use Cases:
Predicting customer churn.
Determining if a given email is spam or not.
Classifying loan applicants as low or high risk.
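A minimal sketch of the probabilistic output on toy data (the two features and the class rule are invented for the example):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: class 1 whenever the two features sum to a positive value
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X, y)
# predict_proba returns a probability per class, not just a hard label
p_class1 = clf.predict_proba([[2.0, 2.0]])[0, 1]
```

That probability output is what makes the model useful for ranking, e.g. sorting customers by churn risk rather than only flagging them.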
Decision Trees
How It Works: Constructs a tree where each node tests an attribute and branches on the answer, leading to further nodes or final decisions.
Uses: Classification and regression tasks.
Advantages: Easy to visualize and interpret, handles non-linear relationships.
Limitations: Prone to overfitting, can be sensitive to small changes in data.
Disadvantages: Might not generalize well without proper tuning.
Business Use Cases:
Credit scoring based on applicant features.
Deciding promotional offers for customers.
Predicting if a machine part will fail in the next week.
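A short sketch showing the interpretability advantage: scikit-learn can print the learned tree as plain if/else rules (the iris dataset stands in for real applicant or sensor data):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
# Capping depth is a simple guard against the overfitting noted above
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
rules = export_text(tree)  # human-readable decision rules
```

Printing `rules` shows exactly which thresholds drive each decision, which is why trees are popular for credit scoring, where decisions must be explainable.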
Random Forest
How It Works: An ensemble of decision trees, usually trained with the "bagging" method; individual tree predictions are averaged (regression) or majority-voted (classification).
Uses: Classification and regression tasks.
Advantages: Robust to overfitting, handles non-linearities, provides feature importance.
Limitations: Slower prediction time.
Disadvantages: More complex than single trees, harder to interpret.
Business Use Cases:
Fraud detection in financial transactions.
Predicting disease outbreaks based on health metrics.
Segmenting customers based on shopping behavior.
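A minimal sketch on synthetic data, highlighting the feature-importance output mentioned above (the generated features are placeholders for real transaction or health metrics):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic classification problem: 8 features, 4 actually informative
X, y = make_classification(n_samples=300, n_features=8,
                           n_informative=4, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Importances sum to 1.0; higher values mark features the trees rely on
importances = rf.feature_importances_
```

In a fraud-detection setting, ranking `importances` is a quick way to see which transaction attributes the model actually uses.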
Support Vector Machines (SVM)
How It Works: Finds the hyperplane that best separates the classes of data by maximizing the margin.
Uses: Classification and regression.
Advantages: Effective in high-dimensional spaces, kernel trick can model non-linear boundaries.
Limitations: Sensitive to hyperparameters, slower training time for large datasets.
Disadvantages: Not easily interpretable, requires good kernel choice.
Business Use Cases:
Text categorization in document classification.
Image classification.
Biometric identification (e.g., face or fingerprint recognition).
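A small sketch contrasting a linear SVM with the kernel trick on data a straight line cannot separate (the two-moons dataset is a stand-in for real image or text features):

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-circles: no straight line separates them cleanly
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
# The RBF kernel bends the decision boundary and should fit much better
```

The kernel choice and `C`/`gamma` values are exactly the hyperparameters the Limitations line warns about.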
K-Nearest Neighbors (KNN)
How It Works: Classifies a data point based on the majority class of its 'k' nearest neighbors.
Uses: Classification and regression.
Advantages: Simple; no explicit training phase (a "lazy" learner).
Limitations: Sensitive to irrelevant features, slow at query time.
Disadvantages: Requires feature scaling, computationally intensive for large datasets.
Business Use Cases:
Product recommendation based on similar users' preferences.
Predicting stock prices based on historical patterns.
Identifying likely voters in a political campaign.
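A minimal sketch that also addresses the feature-scaling requirement noted above by putting a scaler in front of the classifier (iris again stands in for real user-preference data):

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
# Scaling matters: distances would otherwise be dominated by
# whichever feature happens to have the largest numeric range
knn = make_pipeline(StandardScaler(),
                    KNeighborsClassifier(n_neighbors=5)).fit(X, y)
pred = knn.predict(X[:1])
```

Note that `fit` here only stores the data; all the distance computation happens at prediction time, which is why KNN is slow at query time on large datasets.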
Neural Networks/Deep Learning
How It Works: Composed of interconnected nodes (neurons) that transform input data through layers to produce an output. Weighted connections are adjusted via backpropagation.
Uses: Image recognition, language processing, etc.
Advantages: Can model complex, non-linear relationships.
Limitations: Requires a lot of data, computationally intensive.
Disadvantages: Black box; prone to overfitting without regularization.
Business Use Cases:
Image recognition for quality control in manufacturing.
Voice assistants and chatbots for customer service.
Diagnosing medical conditions from MRI scans or X-rays.
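As a minimal sketch, a small multi-layer perceptron learns the same non-linear two-moons boundary that defeats linear models (real image or speech tasks would use far larger networks and frameworks; the layer sizes here are arbitrary):

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# Two hidden layers of 16 neurons each; alpha adds L2 regularization
# as a guard against the overfitting noted above
mlp = MLPClassifier(hidden_layer_sizes=(16, 16), alpha=1e-3,
                    max_iter=2000, random_state=0).fit(X, y)
```

`fit` runs backpropagation internally, repeatedly adjusting the weighted connections to reduce the training loss.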
Gradient Boosted Trees (e.g., XGBoost, LightGBM)
How It Works: Builds trees one at a time, where each new tree corrects errors of the previous one. Gradient boosting focuses on minimizing the loss via gradient descent.
Uses: Classification and regression.
Advantages: High performance, handles missing data, provides feature importance.
Limitations: Prone to overfitting if not tuned well.
Disadvantages: Requires careful tuning, more complex than random forests.
Business Use Cases:
Predicting customer lifetime value for targeted marketing.
Energy consumption forecasting for utilities.
Predictive maintenance for machinery.
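A minimal sketch of the sequential error-correcting idea, using scikit-learn's built-in gradient boosting rather than XGBoost or LightGBM (the sine-shaped target is synthetic, standing in for a real consumption curve):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic non-linear target: a noisy sine wave
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(300, 1))
y = np.sin(X.ravel()) + rng.normal(0, 0.1, size=300)

# Each of the 200 shallow trees is fit to the residual errors
# left by the ensemble built so far; learning_rate shrinks each step
gbr = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1,
                                max_depth=2, random_state=0).fit(X, y)
```

The `n_estimators`/`learning_rate`/`max_depth` trio is the careful tuning the Disadvantages line refers to: more trees with a smaller learning rate generally trades speed for accuracy.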
Naive Bayes
How It Works: Based on Bayes' theorem, it assumes independence between features and calculates the probability of a particular class given the features.
Uses: Text classification, spam filtering.
Advantages: Simple, efficient, particularly effective with high dimensions.
Limitations: Assumes feature independence, which rarely holds in practice.
Disadvantages: Can be outperformed by more complex models.
Business Use Cases:
Spam email classification.
Sentiment analysis of product reviews.
Document categorization in large libraries.
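A minimal spam-filter sketch on a tiny invented corpus (the six messages and their labels are made up purely for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus; 1 = spam, 0 = ham
texts = ["win cash prize now", "free money offer", "claim your prize",
         "meeting at noon", "project status update", "lunch with the team"]
labels = [1, 1, 1, 0, 0, 0]

# Word counts become the features; NB multiplies per-word likelihoods,
# which is where the independence assumption enters
nb = make_pipeline(CountVectorizer(), MultinomialNB()).fit(texts, labels)
pred = nb.predict(["free prize money"])[0]
```

The bag-of-words feature space is high-dimensional and sparse, which is exactly the regime where Naive Bayes stays fast and effective.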
Principal Component Analysis (PCA)
How It Works: A dimensionality reduction technique that identifies the axes in the data space that maximize variance.
Uses: Dimensionality reduction, feature extraction.
Advantages: Reduces feature space, helps with visualization.
Limitations: Assumes linear correlations.
Disadvantages: Loss of information, purely variance-driven.
Business Use Cases:
Data visualization for better understanding of multi-dimensional data.
Anomaly detection in credit card transactions.
Preprocessing step before applying other ML algorithms to reduce training time.
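A minimal sketch on synthetic data constructed so that three features really live near a two-dimensional plane, which two principal components should capture almost entirely:

```python
import numpy as np
from sklearn.decomposition import PCA

# Third feature is (almost) the sum of the first two, so the data
# is effectively 2-D despite having 3 columns
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([base[:, 0], base[:, 1],
                     base[:, 0] + base[:, 1]
                     + rng.normal(0, 0.01, size=200)])

pca = PCA(n_components=2).fit(X)
X2 = pca.transform(X)  # reduced 2-D representation
ratio = pca.explained_variance_ratio_  # variance captured per component
```

Checking `explained_variance_ratio_` is the standard way to decide how many components to keep before feeding `X2` into a downstream model.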