Week 6 Prep: Classification
February 16, 2025 · 2 min
High-Level Overview
- Naive Bayes
- A probabilistic classifier based on Bayes’ Theorem
- Assumes features are independent (naive)
- Works well for text classification (e.g. spam detection, sentiment analysis)
- Fast and efficient, even with larger datasets
- K-Nearest Neighbors (KNN)
- A non-parametric, instance-based learning algorithm
- Classifies a data point based on the majority class of its k-nearest neighbors
- No training phase; all computation happens at prediction time
- Very simple but computationally expensive for larger datasets
- Support Vector Machine (SVM)
- Finds the optimal hyperplane that maximizes the margin between classes
- Can handle non-linear data using the kernel trick
- Works well for high-dimensional spaces
- Computationally expensive for larger datasets
- Random Forest
- An ensemble method that combines multiple decision trees
- Reduces overfitting compared to a single decision tree
- Works well with both categorical and numeric data
- Less interpretable compared to individual decision trees
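A minimal sketch of trying out all four of these classifiers side by side, using scikit-learn on its built-in Iris dataset (the dataset, hyperparameters, and split are illustrative choices, not recommendations):

```python
# Compare the four classifiers above on a small toy dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

models = {
    "Naive Bayes": GaussianNB(),
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.2f}")
```

Note how the API is identical across all four: `fit` on training data, `score` (or `predict`) on held-out data, which makes this kind of comparison cheap to run.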
Deeper Explanation of Naive Bayes
Bayes' Theorem
Naive Bayes is based on Bayes' Theorem, which states:

P(C | X) = P(X | C) · P(C) / P(X)

Where:
- P(C | X) is the posterior probability (i.e. probability of class C given feature X)
- P(X | C) is the likelihood (i.e. probability of feature X given class C)
- P(C) is the prior probability (i.e. probability of class C occurring)
- P(X) is the evidence (i.e. probability of feature X occurring)
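To make Bayes' Theorem concrete, here is a small worked example with made-up spam-filter numbers (the 20%, 60%, and 5% figures are illustrative assumptions, not real statistics):

```python
# Worked Bayes' Theorem example: P(spam | email contains "free").
# Assume 20% of emails are spam, the word "free" appears in 60% of
# spam emails and in 5% of non-spam ("ham") emails.
p_spam = 0.20               # prior P(spam)
p_free_given_spam = 0.60    # likelihood P("free" | spam)
p_free_given_ham = 0.05     # likelihood P("free" | ham)

# Evidence P("free") via the law of total probability
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Posterior P(spam | "free") by Bayes' Theorem
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(round(p_spam_given_free, 3))  # 0.75
```

Even though only 20% of emails are spam, seeing "free" pushes the posterior probability of spam up to 75%, because that word is twelve times more likely under the spam class.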
Classification Using Naive Bayes
For a given input X with multiple features x1, x2, ..., xn, the probability of it belonging to class C is:

P(C | x1, x2, ..., xn) = P(x1, x2, ..., xn | C) · P(C) / P(x1, x2, ..., xn)

Using the naive assumption that features are conditionally independent, the likelihood simplifies to:

P(x1, x2, ..., xn | C) = P(x1 | C) · P(x2 | C) · ... · P(xn | C)

To classify, we choose the class C that maximizes:

P(C) · P(x1 | C) · P(x2 | C) · ... · P(xn | C)
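This "pick the class that maximizes the prior times the product of per-feature likelihoods" rule can be sketched as a tiny from-scratch text classifier. The four-document corpus is invented for illustration; the sketch uses log probabilities (to avoid numeric underflow) and Laplace add-one smoothing (so unseen words don't zero out the product), both standard tricks in real implementations:

```python
import math
from collections import Counter

# Tiny invented "spam vs ham" corpus; individual words are the features.
train_data = [
    ("free money now", "spam"),
    ("free offer win", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon today", "ham"),
]

classes = {"spam", "ham"}
word_counts = {c: Counter() for c in classes}
class_counts = Counter()
for text, c in train_data:
    class_counts[c] += 1
    word_counts[c].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def classify(text):
    # argmax over classes of log P(C) + sum_i log P(x_i | C),
    # with Laplace (add-one) smoothing for unseen words.
    best, best_score = None, float("-inf")
    for c in classes:
        score = math.log(class_counts[c] / sum(class_counts.values()))
        total = sum(word_counts[c].values())
        for w in text.split():
            score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best

print(classify("free money"))  # spam
```

Working in log space turns the product of likelihoods into a sum, which is both numerically stable and slightly faster.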
Pros and Cons of Different Classification Models
| Algorithm | Pros | Cons | Best Use Cases |
|---|---|---|---|
| Naive Bayes | Fast, works well with high-dimensional data, handles missing values | Assumes independence of features, not good for complex decision boundaries | Text classification, spam filtering |
| KNN | Simple, no training phase, works well with non-linear data | Slow for large datasets, memory-intensive, sensitive to irrelevant features | Small datasets, recommendation systems |
| SVM | Handles high-dimensional data well, effective for complex classification tasks | Computationally expensive, hard to interpret | Image classification, bioinformatics |
| Random Forest | Reduces overfitting, handles mixed data types, robust | Less interpretable, slower training | General-purpose classification, fraud detection |