Overview of non-parametric estimation and classification Chapter 4 in Duda et. al.
Overview of non-parametric estimation and classification Chapter 4 in Duda et. al.
Problem formulation
How can we find (learn) ? We can thus consider each class separately! Will thus talk about estimating from data We have c classes (categories)
Each class has a ”true” prior probability of occurenceand a true class conditional pdf
First we assume we have labeled data, i.e.,that we know are generated from
Further assume that gives no info about
The Histogram
The histogram
Bin size vs number of training samples
Small bin size
High resolution
Larger variance of the estimate
Large bins
Smoothed estimate
More reliable estimate
Parzen windows
Each training data vector contributes with a smooth kernel pdf
How choose the variance of the kernel pdf?
No simple answer for finite data set. Trial and error!
In the limit as n grows large
k-nearest neighbor estimation
Algorithm:
Like a histogram with adaptive bin sizes
How large should k be?
Large k makes the volumes larger => smoothed, reliable estimate
ensures convergence in the limit of many data points
In practice trial and error!
Estimation of the aposteriori pdf
x Given data from all classes, weestimate
How choose k?
See previous slide!
This defines a classifier!
The so-called k-nearest neighbor classification rule
Find the k nearest neighbors
Choose class by majority vote
Nearest neighbor classification
x Blue! A special case of k-nearest neighbor with k=1
Find nearest neighbor among the labeled training data to feature vector x
The nearest neighbor’s class label gives the output of the NN classifier
Can be shown to yield a classification error rate P(e)where P* is the Bayes risk (minimum possible error rate)
Pros and Cons
No assumptions made about the data
Computationally complex
All training data points are needed every time a pdf values needs to be calculated
There are techniques to reduce the complexity
Large data sets needed for accuracy
Especially for high dimensional feature spaces
Curse of dimensionality
Other topics in chapter 4
Distance measures
So far we have used Euclidean distance
Discussion on metrics that are invariant to transformations
Skip section 4.7 (unless particular interest in fuzzy classification)
Skim through sections on
RCE networks
Approximations by series expansion
Comments