Financial Accounting Systems, Åbo Akademi 2010
Jaana Aaltonen & Ralf Östermark Financial classification models
Contents
The classification problem
Classification models
Discriminant analysis
Logistic regression
Recursive partitioning algorithm (RPA)
Mathematical programming
Linear programming models
Quadratic programming models
Neural network classifiers
Case: Bankruptcy prediction of Spanish banks
Some comments on hypothesis testing
References
1. The classification problem
In a traditional classification problem the main purpose is to assign one of k labels (or classes) to each of n objects in a way that is consistent with some observed data, i.e. to determine the class of an observation based on a set of variables known as predictors or input variables
Typical classification problems in finance include
Financial failure/bankruptcy prediction
Credit risk rating
Discriminant analysis
Discriminant analysis is the most common technique for classifying a set of observations into predefined classes
The model is built based on a set of observations for which the classes are known
This set of observations is sometimes referred to as the training set or estimation sample
Discriminant analysis...
Based on the training set, the technique constructs a set of linear functions of the predictors, known as discriminant functions, of the form
L = b1x1 + b2x2 + … + bnxn + c,
where the b's are discriminant coefficients, the x's are the input variables or predictors and c is a constant.
Discriminant functions
The discriminant functions are optimized to provide a classification rule that minimizes the probability of misclassification
In order to achieve optimal performance, some statistical assumptions about the data must be met
Each group must be a sample from a multivariate normal population
The population covariance matrices must all be equal
In practice, discriminant analysis has been shown to perform fairly well even when these assumptions about the data are violated
Discriminant functions
k-1 Canonical functions
k Fisher’s functions
[Figure: Distributions of the discriminant scores for two classes]
A discriminant function is optimized to minimize the common area (overlap) of the two distributions
Discriminant analysis...
The discriminant functions are used to predict the class of a new observation with unknown class
For a k class problem, k discriminant functions are constructed
Given a new observation, all k discriminant functions are evaluated and the observation is assigned to class i if the i-th discriminant function has the highest value.
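The assignment rule above can be sketched in a few lines of Python. This is only an illustration: the class labels, coefficients and constants below are made-up values, not estimates from any training set, and the helper names `discriminant_score` and `classify` are hypothetical.

```python
def discriminant_score(coeffs, const, x):
    """Evaluate L = b1*x1 + b2*x2 + ... + bn*xn + c for one observation x."""
    return sum(b * xi for b, xi in zip(coeffs, x)) + const


def classify(functions, x):
    """Evaluate every class's discriminant function and pick the highest score."""
    scores = {label: discriminant_score(b, c, x)
              for label, (b, c) in functions.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical coefficients (b's) and constants (c), one function per class.
functions = {
    "healthy": ([2.0, 1.0], -1.0),
    "failed": ([-1.0, 0.5], 0.5),
}

label, scores = classify(functions, [0.8, 0.3])
print(label)  # prints "healthy"
```

In practice the coefficients would come from fitting the model to the training set; only the evaluation and highest-score step is shown here.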
Variable selection: Analyzing group differences
Although the variables are interrelated and multivariate statistical techniques such as discriminant analysis incorporate these dependencies, it is often helpful to begin analyzing the differences between groups by examining univariate statistics
The first step is to compare the group means of the predictor variables
A significant inequality in group means indicates the predictor variable’s ability to separate between the groups
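The significance test for the equality of the group means is an F-test with g-1 and n-g degrees of freedom, where g is the number of groups and n the number of observations (1 and n-2 in the two-group case)
If the observed significance level is less than the chosen level, commonly 0.05, the hypothesis of equal group means is rejected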
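As a rough illustration of this test, the F statistic for one predictor can be computed from the between-groups and within-groups sums of squares. The data values and the function name `f_statistic` are invented for the example; the degrees of freedom are g-1 and n-g, which is 1 and n-2 when there are two groups.

```python
def mean(values):
    return sum(values) / len(values)


def f_statistic(groups):
    """One-way F statistic for equality of group means of a single predictor."""
    n = sum(len(grp) for grp in groups)   # total number of observations
    g = len(groups)                       # number of groups
    grand_mean = mean([x for grp in groups for x in grp])
    ss_between = sum(len(grp) * (mean(grp) - grand_mean) ** 2 for grp in groups)
    ss_within = sum((x - mean(grp)) ** 2 for grp in groups for x in grp)
    df_between, df_within = g - 1, n - g
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up values of one predictor for two groups of firms.
healthy = [1.0, 2.0, 3.0]
failed = [4.0, 5.0, 6.0]
f, df1, df2 = f_statistic([healthy, failed])
print(f, df1, df2)  # prints 13.5 1 4
```

The observed significance level is then read from the F distribution with these degrees of freedom, for instance from tables or, if SciPy is available, with `scipy.stats.f.sf(f, df1, df2)`.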
Analyzing group differences: Wilks’ Lambda
Another statistic used to analyze the univariate equality of group means is Wilks’ Lambda, sometimes called the U-statistic
Lambda is the ratio of the within-groups sum of squares to the total sum of squares
Lambda has values between 0 and 1
A lambda of 1 occurs when all observed group means are equal
Values close to 0 occur when within-groups variability is small compared to total variability
Large values of lambda indicate that group means do not appear to be different while small values indicate that group means do appear to be different
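A minimal sketch of the univariate lambda as the ratio of the within-groups sum of squares to the total sum of squares. The data and the function name `wilks_lambda_univariate` are assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def wilks_lambda_univariate(groups):
    """Within-groups sum of squares divided by the total sum of squares."""
    all_values = [x for grp in groups for x in grp]
    grand_mean = mean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_within = sum((x - mean(grp)) ** 2 for grp in groups for x in grp)
    return ss_within / ss_total


# Clearly separated group means give a small lambda (about 0.229 here).
separated = wilks_lambda_univariate([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Identical group means give a lambda of exactly 1.
identical = wilks_lambda_univariate([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])

print(round(separated, 3), identical)
```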
Multivariate Wilks’ Lambda statistic
In the case of several variables {X1, X2,...,Xp}, the total variability is expressed by the total sum of cross-products matrix T
The total sum of cross-products matrix T is decomposed into the within-group sum of cross-products matrix W and the between-group sum of cross-products matrix B such that
T = W + B
W = T - B
Multivariate Wilks’ Lambda statistic...
For the set of the X variables, the multivariate global Wilks’ Lambda is defined as
Lp = |W| / |W + B| = |W| / |T| ~ L(p,m,n)
where
|W| = the determinant of the within-group SSCP matrix
|B| = the determinant of the between-groups SSCP matrix
|T| = the determinant of the total sum of cross product matrix
L(p,m,n) = Wilks’ Lambda distribution
For large m, Bartlett's (1954) approximation allows Wilks' lambda to be approximated by a Chi-square distribution
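The determinant ratio can be illustrated for p = 2 variables with plain Python. The data are invented and the function names are hypothetical. The chi-square statistic uses a commonly quoted form of Bartlett's approximation, -(n - 1 - (p + g)/2) ln(lambda) with p(g-1) degrees of freedom, written in terms of the total sample size n, the number of variables p and the number of groups g. This parametrization is an assumption, since the parameters m and n of L(p,m,n) are not defined here.

```python
import math


def mean_vector(rows):
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]


def sscp(rows, center):
    """2x2 sum of squares and cross-products of rows about the given center."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(groups):
    """|W| / |T| for groups of two-variable observations."""
    all_rows = [r for grp in groups for r in grp]
    t = sscp(all_rows, mean_vector(all_rows))      # total SSCP matrix T
    w = [[0.0, 0.0], [0.0, 0.0]]                   # within-group SSCP matrix W
    for grp in groups:
        wg = sscp(grp, mean_vector(grp))
        for i in range(2):
            for j in range(2):
                w[i][j] += wg[i][j]
    return det2(w) / det2(t)


def bartlett_chi2(lam, n, p, g):
    """Assumed form of Bartlett's approximation; returns (statistic, df)."""
    return -(n - 1 - (p + g) / 2) * math.log(lam), p * (g - 1)


# Made-up observations (x1, x2) for two groups.
group_a = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
group_b = [(4.0, 5.0), (5.0, 4.0), (6.0, 6.0)]
lam = wilks_lambda([group_a, group_b])             # |W| = 12, |T| = 66
stat, df = bartlett_chi2(lam, n=6, p=2, g=2)
print(round(lam, 4), round(stat, 3), df)
```

With only six observations the chi-square approximation is of course not reliable; the numbers serve only to show the arithmetic.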
Variable selection: Correlations between predictor variables
Since interdependencies among the variables affect most multivariate analyses, it is worth examining the correlation matrix of the predictor variables
Including highly correlated variables in the analysis should be avoided as correlations between variables affect the magnitude and the signs of the coefficients
If correlated variables are included in the analysis, care should be exercised when interpreting the individual coefficients
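One simple way to examine the correlation matrix is to compute the pairwise Pearson correlations and flag the pairs that exceed some cutoff. The variable names, data values and the 0.8 threshold below are arbitrary choices made for illustration, not recommendations.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def high_correlations(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(predictors[a], predictors[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical predictor variables measured on five firms.
predictors = {
    "roa": [1.0, 2.0, 3.0, 4.0, 5.0],
    "roe": [2.0, 4.0, 6.0, 8.0, 11.0],
    "leverage": [5.0, 1.0, 4.0, 2.0, 3.0],
}

for a, b, r in high_correlations(predictors):
    print(a, b, round(r, 3))  # only the roa/roe pair is flagged
```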
Logistic Regression
Logistic regression is part of a category of statistical models called generalized linear models
Whereas discriminant analysis can only be used with continuous independent variables, logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these
Generally, the dependent or response variable is dichotomous, such as presence/absence or success/failure.
Logistic Regression...
Even though the dependent variable in logistic regression is usually dichotomous, that is, the dependent variable can take the value 1 with a probability of success q, or the value 0 with probability of failure 1-q, applications of logistic regression have also been extended to cases where the dependent variable has more than two categories
Logistic Regression...
The independent or predictor variables in logistic regression can take any form, i.e. logistic regression makes no assumption about the distribution of the independent variables
They do not have to be normally distributed, linearly related or of equal variance within each group
The relationship between the predictor and response variables is not a linear function; instead, the logistic regression function is used, which is based on the logit transformation of the probability q
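Logistic Regression...
The model:
logit(q) = ln(q / (1 - q)) = a + b1x1 + b2x2 + … + bnxn,
where a is the constant of the equation and the b's are the coefficients of the predictor variables
An alternative form of the logistic regression equation is:
q = exp(a + b1x1 + b2x2 + … + bnxn) / (1 + exp(a + b1x1 + b2x2 + … + bnxn))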
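The two forms of the logistic regression equation are inverses of each other, which a short sketch can make concrete. It assumes the usual logit form ln(q / (1 - q)) = a + b1x1 + … + bnxn; the constant, coefficients and predictor values are made up, and the function names are hypothetical.

```python
import math


def logit(q):
    """Logit transformation of a probability q, with 0 < q < 1."""
    return math.log(q / (1.0 - q))


def predicted_probability(a, b, x):
    """q = 1 / (1 + exp(-z)) with z = a + b1*x1 + ... + bn*xn.

    This is algebraically the same as exp(z) / (1 + exp(z)).
    """
    z = a + sum(bi * xi for bi, xi in zip(b, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical constant a, coefficients b and predictor values x.
a = -1.0
b = [2.0, -0.5]
x = [1.0, 1.0]

q = predicted_probability(a, b, x)   # z = 0.5, so q is about 0.62
print(round(q, 4), round(logit(q), 4))
```

Applying the logit to the predicted probability recovers the linear predictor z, so the model is linear on the logit scale even though q itself is a nonlinear function of the predictors.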
Logistic Regression...
The goal of logistic regression is to correctly predict the category of outcome for individual cases using the most parsimonious model
To accomplish this goal, a model is created that includes all predictor variables that are useful in predicting the response variable.
Different methods for model creation
Stepwise regression
Backward stepwise regression
Comments