Newest Viewed Downloaded

VC theory, Support vectors and Hedged prediction technology

VC theory, Support vectors and Hedged prediction technology

Overfitting in classification

Assume a family C of classifiers of points in feature space F. A family of classifiers is a map from CF to {0,1} (Negative and positive class). For each subset X of F and each c in C, c(X) defines a partitioning of X into two classes. C shatters X if every partitioning of X is accomplished by some c in C If every point set X of size d is shattered by C, then the VC dimension is at least d. If a point set of d+1 elements cannot be shattered by C, then the VC-dimension is at most d.

VC-dimension of hyperplanes

The set of points on the line shatters any two points, but not three The set of lines in the plane shatters any three non-collinear points, but no four points. Any d+2 points in E^d can be partitioned into two blocks whose convex hulls intersect. VC-dimension of hyperplanes in E^d is thus d+1.

Why VC-dimension?

Elegant and pedagogical, not very useful. Bounds future error of classifier, PAC-learning. Exchangeable distribution of (xi, yi). For first N points, training error for c is observed error rate for c. Goodness of selecting from C a classifier with best performance on training set depends on VC-dimension h:

Why VC-dimension?

Classify with hyperplanes

Frank Rosenblatt (1928 – 1971) Pioneering work in classifying by hyperplanes in high-dimensional spaces. Criticized by Minsky-Papert, since real classes are not normally linearly separable. ANN research taken up again in 1980:s, with non-linear mappings to get improved separation. Predecessor to SVM/kernel methods

Find parallel hyperplanes

Separate examples by wide margin hyperplanes (classifications). Enclose examples between hyperplanes (regression). If necessary, non-linearly map examples to high-dimensional space where they are better separated.

Find parallel hyperplanes

Classification Red true separating plane. Blue: wide margin separation in sample Classify by plane between blue planes

Find parallel hyperplanes

Regression Red: true central plane. Blue: narrowest margin enclosing sample New xk : predict yk so (xk, yk) lies on mid- plane (dotted).

From vector to scalar product

Soft Margins

Soft Margins

Quadratic programming goes through also with soft margins. Specification of softness constant C is part of most packages. However, no prior rule for setting C is established, and experimentation is necessary for each application. Choice is between narrowing margin, allowing more outliers, and using a more liberal kernel (to be described).

SVM packages

Inputs xi, yi, and KERNEL and SOFTNESS information Only output is , non-zero coefficients indicate support vectors. Hyperplane obtained by

Kernel Trick

Kernel Trick

Showing 1 - 20 of 22 items Details

Name: 
kddlsvm
Author: 
Nada KTH
Company: 
Nada KTH
Description: 
VC theory, Support vectors and Hedged prediction technology
Tags: 
hyperplan | point | kernel | margin | dimens | classifi | plane | set
Created: 
1/31/2009 9:18:06 AM
Slides: 
22
Views: 
2
Downloads: 
0
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap