Overview of Probabilistic Interpretation
- Connection Between Likelihood and Cost Function: A common example is the interpretation of least-squares regression. Under specific probabilistic assumptions about the data, minimizing the least-squares cost corresponds exactly to maximizing the likelihood of the observed data.
- Suppose the data $(x^{(i)}, y^{(i)})$ are generated according to the model
$$y^{(i)} = \theta^T x^{(i)} + \epsilon^{(i)},$$
where the $\epsilon^{(i)} \sim \mathcal{N}(0, \sigma^2)$ are independent Gaussian noise terms.
- The likelihood of a single observation under parameters $\theta$ is
$$p(y^{(i)} \mid x^{(i)}; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{\left( y^{(i)} - \theta^T x^{(i)} \right)^2}{2\sigma^2} \right).$$
- Maximizing the likelihood over all data points (or equivalently, the log-likelihood) gives
$$\theta_{\mathrm{MLE}} = \arg\max_\theta \sum_{i=1}^{n} \log p(y^{(i)} \mid x^{(i)}; \theta).$$
Substituting the Gaussian density, the log-likelihood becomes
$$\ell(\theta) = n \log \frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left( y^{(i)} - \theta^T x^{(i)} \right)^2,$$
and since the first term and the factor $1/\sigma^2$ do not depend on $\theta$, maximizing $\ell(\theta)$ is equivalent to minimizing the squared-error loss:
$$\arg\min_\theta \frac{1}{2} \sum_{i=1}^{n} \left( y^{(i)} - \theta^T x^{(i)} \right)^2.$$
Hence, least-squares regression corresponds to Maximum Likelihood Estimation (MLE) under Gaussian noise assumptions; a numerical sanity check of this equivalence is sketched below.
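The following is a minimal sketch of that equivalence, assuming synthetic data drawn from the model above (the sample size, dimension, and parameter values are illustrative choices, not taken from the notes). It fits $\theta$ once by closed-form least squares and once by numerically maximizing the Gaussian log-likelihood, and checks that the two estimates agree.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative synthetic data: y^(i) = theta^T x^(i) + eps^(i), eps ~ N(0, sigma^2).
rng = np.random.default_rng(0)
n, d = 200, 3
theta_true = np.array([1.5, -2.0, 0.5])  # arbitrary ground-truth parameters
sigma = 0.3
X = rng.normal(size=(n, d))
y = X @ theta_true + rng.normal(0.0, sigma, size=n)

# 1) Closed-form least squares: argmin_theta (1/2) sum_i (y^(i) - theta^T x^(i))^2.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# 2) MLE: minimize the negative Gaussian log-likelihood.
def neg_log_likelihood(theta):
    resid = y - X @ theta
    return n * np.log(np.sqrt(2 * np.pi) * sigma) + np.sum(resid**2) / (2 * sigma**2)

theta_mle = minimize(neg_log_likelihood, x0=np.zeros(d)).x

print(np.allclose(theta_ls, theta_mle, atol=1e-4))  # True: the two fits coincide
```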
- Bayesian Perspective: Going beyond MLE, Bayesian statistics treats the model parameters $\theta$ as random variables with a prior distribution $p(\theta)$. This leads to the posterior distribution over parameters given data $S = \{(x^{(i)}, y^{(i)})\}_{i=1}^{n}$:
$$p(\theta \mid S) = \frac{p(S \mid \theta)\, p(\theta)}{p(S)} = \frac{\left( \prod_{i=1}^{n} p(y^{(i)} \mid x^{(i)}, \theta) \right) p(\theta)}{\int_{\theta} \left( \prod_{i=1}^{n} p(y^{(i)} \mid x^{(i)}, \theta) \right) p(\theta)\, d\theta},$$
where the posterior incorporates both the likelihood and the prior belief about $\theta$. This approach regularizes the problem, reduces overfitting, and allows for uncertainty quantification, as the sketch below illustrates.
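As a concrete sketch of these claims, the following assumes the conjugate setting of a Gaussian prior $\theta \sim \mathcal{N}(0, \tau^2 I)$ with Gaussian noise (all hyperparameter values here are illustrative). In that setting the posterior is Gaussian in closed form, its mean coincides with ridge-regularized least squares with $\lambda = \sigma^2 / \tau^2$ (the regularization effect), and the posterior covariance yields predictive error bars (the uncertainty quantification).

```python
import numpy as np

# Illustrative setup: Gaussian prior theta ~ N(0, tau^2 I), noise ~ N(0, sigma^2).
rng = np.random.default_rng(1)
n, d = 50, 3
sigma, tau = 0.3, 1.0
theta_true = np.array([1.5, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ theta_true + rng.normal(0.0, sigma, size=n)

# Closed-form Gaussian posterior:
#   Sigma_post = (X^T X / sigma^2 + I / tau^2)^{-1}
#   mu_post    = Sigma_post X^T y / sigma^2
Sigma_post = np.linalg.inv(X.T @ X / sigma**2 + np.eye(d) / tau**2)
mu_post = Sigma_post @ X.T @ y / sigma**2

# The posterior mean equals ridge regression with lambda = sigma^2 / tau^2,
# which is how the prior shows up as a regularizer.
lam = sigma**2 / tau**2
theta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
print(np.allclose(mu_post, theta_ridge))  # True

# Uncertainty quantification: predictive mean and variance at a new input x*.
x_star = np.array([1.0, 0.0, -1.0])
pred_mean = x_star @ mu_post
pred_var = x_star @ Sigma_post @ x_star + sigma**2  # parameter + noise uncertainty
print(pred_mean, np.sqrt(pred_var))
```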
- Summary:
  - The equivalence between least squares and MLE holds under Gaussian noise assumptions.
  - The Bayesian view treats parameters as random variables with priors, leading to posterior inference that naturally incorporates regularization effects.
  - Such probabilistic interpretations provide principled foundations for many classical and modern machine learning methods, such as logistic regression and generative models.
Additional Notes:
- Probabilistic interpretations enable the design of new models and cost functions by modeling $p(y \mid x; \theta)$ in flexible ways; one such variation is sketched after this list.
- They facilitate working with uncertainty, making predictions that include confidence estimates.
- Extensions include generalized linear models and generative learning algorithms, which build on these ideas.
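As one illustration of that flexibility (an assumed variation, not part of the original derivation): if the Gaussian noise model is replaced by Laplace noise, $\epsilon^{(i)} \sim \mathrm{Laplace}(0, b)$, so that $p(y^{(i)} \mid x^{(i)}; \theta) = \frac{1}{2b} \exp\!\left( -\lvert y^{(i)} - \theta^T x^{(i)} \rvert / b \right)$, the same MLE recipe produces the absolute-error (L1) cost instead of the squared-error cost:
$$\theta_{\mathrm{MLE}} = \arg\max_\theta \sum_{i=1}^{n} \left( -\log 2b - \frac{\lvert y^{(i)} - \theta^T x^{(i)} \rvert}{b} \right) = \arg\min_\theta \sum_{i=1}^{n} \lvert y^{(i)} - \theta^T x^{(i)} \rvert.$$
Choosing the noise model thus directly determines the cost function being minimized.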