Posts

Showing posts from June, 2025

Kernelized Support Vector Machines

1. Introduction to SVMs

Support Vector Machines (SVMs) are supervised learning algorithms primarily used for classification (and for regression with SVR). They aim to find the optimal separating hyperplane that maximizes the margin between classes for linearly separable data. Basic (linear) SVMs operate in the original feature space, producing linear decision boundaries.

2. Limitations of Linear SVMs

Linear SVMs have limited flexibility because their decision boundaries are hyperplanes. Many real-world problems require more complex, non-linear decision boundaries that linear SVMs cannot provide.

3. Kernel Trick: Overcoming Non-linearity

To allow non-linear decision boundaries, SVMs exploit the kernel trick. The kernel trick implicitly maps input data into a higher-dimensional feature space where linear separation may be possible, without explicitly performing the costly mapping.

How the Kernel Trick Works: Instead of computing ...
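Although the preview is cut off, the identity behind the kernel trick can be checked numerically. The sketch below (plain Python; the helper names poly_kernel and phi are illustrative, not from the post) shows that a degree-2 polynomial kernel (x·z + 1)^2 equals an ordinary dot product taken after an explicit degree-2 feature map — i.e., the kernel yields the high-dimensional inner product without ever constructing the mapped vectors:

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel: k(x, z) = (x . z + 1)^2, computed in the original space."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + 1) ** 2

def phi(x):
    """Explicit degree-2 feature map for 2-D input:
    phi(x) = (x1^2, x2^2, sqrt(2)*x1*x2, sqrt(2)*x1, sqrt(2)*x2, 1)."""
    x1, x2 = x
    r2 = math.sqrt(2)
    return (x1 * x1, x2 * x2, r2 * x1 * x2, r2 * x1, r2 * x2, 1.0)

x, z = (1.0, 2.0), (3.0, 0.5)
implicit = poly_kernel(x, z)                           # one dot product in 2-D
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))  # dot product in the 6-D feature space
# the two agree: k(x, z) = <phi(x), phi(z)>
```

The same principle underlies the RBF kernel, whose implicit feature space is infinite-dimensional and so can only be accessed through the kernel.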

Ensembles of Decision Trees

1. What are Ensembles?

Ensemble methods combine multiple machine learning models to create more powerful and robust models. By aggregating the predictions of many models, ensembles typically achieve better generalization performance than any single model. In the context of decision trees, ensembles combine multiple trees to overcome limitations of single trees such as overfitting and instability.

2. Why Ensemble Decision Trees?

Single decision trees are easy to interpret but tend to overfit the training data, leading to poor generalization. They can also be unstable, because small variations in the data can change the structure of the tree significantly. Ensemble methods exploit the idea that many weak learners (trees that individually overfit or only capture partial patterns) can be combined into a strong learner by reducing variance and sometimes bias.

3. Two Main Types of Tree Ensembles

(a) Random Forests

Random forests are ensembles con...
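As a sketch of the variance-reduction idea, the toy code below implements bagging — the resampling half of a random forest (real random forests additionally randomize which features each split may consider). It trains one-level trees (decision stumps) on bootstrap samples and combines them by majority vote; all function names are illustrative:

```python
import random

def fit_stump(X, y):
    """Fit a one-level tree: pick the (feature, threshold, sign)
    with the fewest misclassifications on the given sample (labels 0/1)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                preds = [1 if sign * (x[f] - t) >= 0 else 0 for x in X]
                err = sum(p != yi for p, yi in zip(preds, y))
                if best is None or err < best[0]:
                    best = (err, f, t, sign)
    _, f, t, sign = best
    return lambda x: 1 if sign * (x[f] - t) >= 0 else 0

def fit_bagged_stumps(X, y, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap resample, predict by majority vote."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda x: 1 if sum(s(x) for s in stumps) * 2 >= len(stumps) else 0

# tiny linearly separable demo set
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
y = [0, 0, 0, 1, 1, 1]
predict = fit_bagged_stumps(X, y)
```

Each individual stump sees a slightly different resample and may pick a slightly different split; averaging their votes smooths out those fluctuations, which is exactly the variance reduction the ensemble argument promises.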

Decision Trees

1. What are Decision Trees?

Decision trees are supervised learning models used for classification and regression tasks. They model decisions as a tree structure, where each internal node corresponds to a decision (usually a test on a feature) and each leaf node corresponds to an output label or value. Essentially, the tree learns a hierarchy of if/else questions that partition the input space into regions associated with specific outputs.

2. How Decision Trees Work

The model splits the dataset based on feature values in a way that increases the purity of the partitions (i.e., groups that are more homogeneous with respect to the target). At each node, the algorithm evaluates possible splits on features and selects the one that best separates the data, according to a criterion such as Gini impurity, entropy (information gain), or mean squared error (for regression). The process recursively continues sp...
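The split-selection step described above can be sketched in a few lines. The code below (plain Python; the names gini and best_split are illustrative) computes Gini impurity and exhaustively scores every (feature, threshold) candidate by the weighted impurity of the two resulting children — the core of what runs at each node of a classification tree:

```python
def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(X, y):
    """Score every (feature, threshold) pair by the size-weighted Gini
    impurity of the left (<= t) and right (> t) children; return the best."""
    n = len(y)
    best = (None, None, float("inf"))
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            left = [yi for x, yi in zip(X, y) if x[f] <= t]
            right = [yi for x, yi in zip(X, y) if x[f] > t]
            if not left or not right:
                continue  # degenerate split, skip
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# four points, perfectly separable on feature 0 at threshold 3.0
X = [(2.0, 1.0), (3.0, 1.5), (6.0, 1.2), (7.0, 0.8)]
y = ["a", "a", "b", "b"]
f, t, score = best_split(X, y)  # both children become pure, so score is 0
```

A full tree builder would then recurse on the left and right subsets until the leaves are pure enough or a depth limit is hit.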

Naive Bayes Classifiers

1. What are Naive Bayes Classifiers?

Naive Bayes classifiers are a family of probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features. Despite their simplicity, they are very effective in many problems, particularly in text classification. They assume that the features are conditionally independent given the class. This "naive" assumption simplifies computation and makes learning extremely fast.

2. Theoretical Background: Bayes' Theorem

Given an instance x = (x_1, x_2, ..., x_n), the predicted class C_k is the one that maximizes the posterior probability:

Ĉ = argmax_{C_k} P(C_k | x) = argmax_{C_k} P(x | C_k) P(C_k) / P(x)

Since P(x) is the same for all classes, it can be ignored:

Ĉ = argmax_{C_k} P(x | C_k) P(C_k)

The naive assumption factors the likelihood as:

P(x | C_k) = ∏_{i=1}^{n} P(x_i | C_k)

This reduces the problem of modeling a joint distri...
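The argmax above translates almost directly into code. The sketch below (plain Python; the names fit_nb and predict are illustrative) estimates P(C_k) and each P(x_i | C_k) by counting over categorical features, then classifies by maximizing the sum of log-probabilities. Laplace (add-one) smoothing is added here — it is not mentioned in the excerpt, but it is the standard way to avoid zero probabilities for unseen feature values:

```python
import math
from collections import Counter, defaultdict

def fit_nb(X, y):
    """Naive Bayes on categorical features, with Laplace smoothing."""
    n = len(y)
    class_counts = Counter(y)
    feat_counts = Counter()       # (class, feature index, value) -> count
    values = defaultdict(set)     # distinct values seen per feature
    for x, c in zip(X, y):
        for i, v in enumerate(x):
            feat_counts[(c, i, v)] += 1
            values[i].add(v)

    def predict(x):
        best_c, best_lp = None, -math.inf
        for c, cc in class_counts.items():
            lp = math.log(cc / n)  # log P(C_k), the class prior
            for i, v in enumerate(x):
                # log P(x_i | C_k), add-one smoothed over the feature's value set
                lp += math.log((feat_counts[(c, i, v)] + 1) / (cc + len(values[i])))
            if lp > best_lp:
                best_c, best_lp = c, lp
        return best_c

    return predict

# toy "play tennis"-style data: features are (outlook, windy)
X = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"), ("rainy", "yes"), ("rainy", "no")]
y = ["play", "play", "stay", "stay", "play"]
predict = fit_nb(X, y)
```

Working in log space turns the product ∏ P(x_i | C_k) into a sum, which avoids numerical underflow when the number of features is large.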