Newest Viewed Downloaded

Overview of non-parametric estimation and classification Chapter 4 in Duda et. al.

Overview of non-parametric estimation and classification Chapter 4 in Duda et. al.

Problem formulation

How can we find (learn) ? We can thus consider each class separately! Will thus talk about estimating from data We have c classes (categories) Each class has a ”true” prior probability of occurence and a true class conditional pdf First we assume we have labeled data, i.e., that we know are generated from Further assume that gives no info about

The Histogram

The histogram Bin size vs number of training samples Small bin size High resolution Larger variance of the estimate Large bins Smoothed estimate More reliable estimate

Parzen windows

Each training data vector contributes with a smooth kernel pdf How choose the variance of the kernel pdf? No simple answer for finite data set. Trial and error! In the limit as n grows large

k-nearest neighbor estimation

Algorithm: Like a histogram with adaptive bin sizes How large should k be? Large k makes the volumes larger => smoothed, reliable estimate ensures convergence in the limit of many data points In practice trial and error!

Estimation of the aposteriori pdf

x Given data from all classes, we estimate How choose k? See previous slide! This defines a classifier! The so-called k-nearest neighbor classification rule Find the k nearest neighbors Choose class by majority vote

Nearest neighbor classification

x Blue! A special case of k-nearest neighbor with k=1 Find nearest neighbor among the labeled training data to feature vector x The nearest neighbor’s class label gives the output of the NN classifier Can be shown to yield a classification error rate P(e) where P* is the Bayes risk (minimum possible error rate)

Pros and Cons

No assumptions made about the data Computationally complex All training data points are needed every time a pdf values needs to be calculated There are techniques to reduce the complexity Large data sets needed for accuracy Especially for high dimensional feature spaces Curse of dimensionality

Other topics in chapter 4

Distance measures So far we have used Euclidean distance Discussion on metrics that are invariant to transformations Skip section 4.7 (unless particular interest in fuzzy classification) Skim through sections on RCE networks Approximations by series expansion

Showing 1 - 9 of 9 items Details

Name: 
overview_nonparam
Author: 
Jonas
Company: 
S3
Description: 
Overview of non-parametric estimation and classification Chapter 4 in Duda et. al.
Tags: 
data | estim | neighbor | class | nearest | pdf | classif | train
Created: 
5/17/2005 6:05:30 AM
Slides: 
9
Views: 
0
Downloads: 
0
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap