Classification
Definition:
Classification is the supervised learning task of predicting a categorical
class label from input data. Each example in the dataset belongs to one of a
predefined set of classes.
Characteristics:
- Outputs are discrete.
- The goal is to assign each input to a class (exactly one class in
     the binary and multiclass settings).
- Classes can be binary (two classes) or multiclass (more
     than two classes).
Examples:
- Classifying emails as spam or not spam (binary
     classification).
- Classifying iris flowers into one of three species
     (multiclass classification).
Types of Classification:
- Binary Classification:
     Distinguishing between exactly two classes.
- Multiclass Classification:
     Distinguishing among more than two classes.
- Multilabel Classification:
     Assigning multiple class labels to each instance (less commonly covered in
     this book).
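The three types differ mainly in the shape of the target. The sketch below shows one plausible way to represent each; the label names are hypothetical and chosen only for illustration.

```python
# Hypothetical label sets, chosen only for illustration.
binary_classes = ["not spam", "spam"]
multiclass_classes = ["setosa", "versicolor", "virginica"]
multilabel_classes = ["sports", "politics", "technology"]

# Binary and multiclass: exactly one label per instance.
binary_target = "spam"
multiclass_target = "versicolor"

# Multilabel: any subset of the labels per instance,
# commonly stored as a 0/1 indicator vector.
multilabel_target = {"sports", "technology"}
indicator = [1 if c in multilabel_target else 0 for c in multilabel_classes]

print(indicator)  # [1, 0, 1]
```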
Key Concepts:
- The class labels are discrete and come from a finite set.
- Binary classification can often be phrased as a yes/no
     question (e.g., “Is this email spam?”).
- The predicted class labels are often encoded numerically
     but represent categories (e.g., 0, 1, 2 for iris species).
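As a minimal sketch of numeric label encoding, the snippet below maps the three iris species names to integer codes and back. The particular assignment of codes to species is an assumption made here; any consistent mapping would do, and the integers carry no ordering or magnitude.

```python
# Assumed mapping from species name to integer code (order is arbitrary).
species = ["setosa", "versicolor", "virginica"]
label_to_id = {name: i for i, name in enumerate(species)}
id_to_label = {i: name for name, i in label_to_id.items()}

labels = ["versicolor", "setosa", "virginica", "setosa"]

# Encode names as integers for the model, then decode predictions back.
encoded = [label_to_id[name] for name in labels]
decoded = [id_to_label[i] for i in encoded]

print(encoded)  # [1, 0, 2, 0]
print(decoded == labels)  # True
```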
Regression
Definition:
Regression is the supervised learning task of predicting a continuous
numerical value based on input features.
Characteristics:
- Outputs are continuous, typically real-valued numbers.
- The model predicts a numeric quantity rather than a
     class.
Examples:
- Predicting a person’s annual income from age, education,
     and location.
- Predicting crop yield given weather and other factors.
Key Concepts:
- Unlike classification, the output is a continuous value.
- The task is about estimating the underlying function that
     maps inputs to continuous outputs.
- Outputs can theoretically be any number within a range,
     reflecting real-world quantities.
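To make "estimating the underlying function" concrete, here is a minimal sketch of simple linear regression with one input feature, fitted by ordinary least squares. The data points are made-up toy numbers with no real-world meaning, and the helper name predict is introduced only for this example.

```python
# Toy data (invented for illustration): y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Ordinary least squares for a single feature:
# slope = covariance(x, y) / variance(x)
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
variance = sum((x - mean_x) ** 2 for x in xs)
slope = covariance / variance
intercept = mean_y - slope * mean_x

def predict(x):
    """Return the fitted line's continuous output for input x."""
    return slope * x + intercept

print(slope, intercept)  # 2.0 1.0
print(predict(5.0))      # 11.0
```

The prediction for an unseen input is a number on a continuous scale rather than a member of a fixed label set.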
Distinguishing Between Classification and Regression
An intuitive way to differentiate the two is based on the continuity of the
output:
- If the output is discrete
     (categorical classes), the problem is classification.
- If the output is continuous
     (numerical values), the problem is regression.
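The rule above can be sketched as a rough heuristic that inspects the target values. The function guess_task and its max_classes threshold are hypothetical, introduced only for this example. Integer-valued targets are genuinely ambiguous (they may be class codes or counts), so this is a guess, not a substitute for knowing what the target means.

```python
def guess_task(targets, max_classes=10):
    """Rough heuristic (not a rule): guess the task from the target values."""
    # Strings and booleans can only be category labels.
    if all(isinstance(t, (str, bool)) for t in targets):
        return "classification"
    # Any non-integer float suggests a continuous quantity.
    if any(isinstance(t, float) and not t.is_integer() for t in targets):
        return "regression"
    # Integer-valued targets: assume class codes if there are few distinct values.
    return "classification" if len(set(targets)) <= max_classes else "regression"

print(guess_task(["spam", "not spam", "spam"]))  # classification
print(guess_task([0, 1, 2, 1, 0]))               # classification
print(guess_task([251.3, 198.75, 310.0]))        # regression
```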
Practical Examples and Representations:
- The Iris dataset is a classic example for classification,
     with three species as classes.
- For regression, datasets might involve predicting house
     prices, temperatures, or yields, with outputs as continuous numbers.
- Input data can be numerical or categorical, but most models
     expect numeric inputs, so categorical variables must be encoded first
     (e.g., with one-hot encoding).
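One-hot encoding can be sketched in a few lines. The function one_hot and the region values below are hypothetical, used only to show the idea: each category becomes its own 0/1 column, so no artificial ordering is imposed on the categories.

```python
def one_hot(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column for this value's category
        rows.append(row)
    return categories, rows

categories, encoded = one_hot(["north", "south", "east", "north"])
print(categories)  # ['east', 'north', 'south']
print(encoded)     # [[0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
```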
Summary and Usage
- Classification and regression are foundational supervised
     learning tasks.
- Choosing the right algorithm depends on the nature of the
     output (categorical vs continuous).
- Preprocessing and feature representation are critical for
     both tasks to achieve good performance.
- Many algorithms can be adapted for either task, but the
     interpretation and training differ accordingly.
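As one illustration of an algorithm adapted to both tasks, the sketch below uses k-nearest neighbours, an example chosen here rather than one named in the text. The neighbour search is shared; only the final step differs (majority vote for classification, averaging for regression). The data is a made-up one-dimensional toy set.

```python
from collections import Counter

def k_nearest(train_x, train_y, query, k):
    """Return the targets of the k training points closest to query (1-D inputs)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))
    return [y for _, y in ranked[:k]]

def knn_classify(train_x, train_y, query, k=3):
    # Classification: majority vote among the neighbours' labels.
    return Counter(k_nearest(train_x, train_y, query, k)).most_common(1)[0][0]

def knn_regress(train_x, train_y, query, k=3):
    # Regression: average of the neighbours' numeric targets.
    neighbours = k_nearest(train_x, train_y, query, k)
    return sum(neighbours) / len(neighbours)

# Toy data, invented for illustration.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
class_labels = ["low", "low", "low", "high", "high", "high"]
numeric_targets = [1.5, 2.5, 3.5, 10.5, 11.5, 12.5]

print(knn_classify(xs, class_labels, 2.2))    # low
print(knn_regress(xs, numeric_targets, 2.2))  # 2.5
```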
 
