
Bayesian Methods

Recall Decision Strategy

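Decisions are chosen to maximise expected utility. That is, choose the decision d to maximise E[u(d, q)] = ∫ u(d, q) P(q) dq, where the expectation is taken over the uncertain quantity q.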
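As a rough illustration of the expected-utility rule, the sketch below handles the simplest case, where the uncertain quantity q takes only a few discrete values, so the expectation is a sum. The decisions, outcomes and numbers are hypothetical and are not taken from the lecture.

```python
# Hypothetical example: the decisions, outcomes and numbers are illustrative only.

def expected_utility(utility, belief, decision):
    """Sum u(d, q) * P(q) over the possible values of the uncertain quantity q."""
    return sum(utility[decision][q] * prob for q, prob in belief.items())


def best_decision(utility, belief):
    """Return the decision d with the largest expected utility."""
    return max(utility, key=lambda d: expected_utility(utility, belief, d))


# Beliefs: probabilities for each value of q (they sum to 1).
belief = {"rain": 0.3, "dry": 0.7}

# Values: utility of each decision under each value of q.
utility = {
    "take umbrella": {"rain": 8, "dry": 6},
    "leave umbrella": {"rain": 0, "dry": 10},
}

choice = best_decision(utility, belief)
```

Here the beliefs and the utilities are kept in separate tables, so either one can be changed without touching the other.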

The inputs

To implement the strategy, we need two inputs: the utility function, u(.), and the pdf for that which is uncertain, q. Both are elicited from the decision maker. Because of this, the inference is said to be ‘subjective.’

Subjectivity

We define a function as subjective if it is allowed to differ between individuals. One can formalise this by saying that the quantity or function of interest is determined conditional upon the experience of the individual. Thus we may write P(q|E) rather than P(q).

Values vs Beliefs

A decision maker’s utility function tells us something about their values. For example, when we elicit this, we get to know something about the importance of money to them, or the relative importance of beer or ice cream etc. However, the assessment of the probability of an event occurring is telling us something about how likely they believe a particular outcome to be.

General Framework

Analyse the problem. Separate value and belief judgements. For values, elicit utility. For beliefs, elicit probability. For extra information, consider an experiment to improve knowledge; this will change your beliefs. Combine using expected utility. Consider the results.

The influence of Data

Given the results of an experiment or observation, we can often determine more about something which has been uncertain. These results are termed ‘Data.’ They arise from an experimental ‘Model.’ Because we are finding out more about that which was uncertain, we say we are updating our beliefs.

Formally

The data X depend on the uncertain quantity, q, through a probability model. Thus, we have P(X|q). We need to summarise belief about the unknown conditional on the data, P(q|X,E). Bayes' theorem gives: P(q|X,E) ∝ P(X|q) × P(q|E). Here P(q|X,E) is called the posterior distribution and P(q|E) is called the prior distribution.
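The proportionality can be made concrete with a small sketch in which q takes only a few discrete values: multiply the prior by the likelihood and rescale so the result sums to one. The two candidate values of q and the single observation are hypothetical choices made for illustration.

```python
def posterior(prior, likelihood):
    """Posterior over discrete values of q: normalise likelihood * prior."""
    unnormalised = {q: likelihood[q] * prior[q] for q in prior}
    total = sum(unnormalised.values())
    return {q: weight / total for q, weight in unnormalised.items()}


# Hypothetical prior P(q|E): q is the proportion of black socks,
# assumed to be either 0.2 or 0.6 with equal prior probability.
prior = {0.2: 0.5, 0.6: 0.5}

# Likelihood P(X|q) for the data X = "one sock drawn, and it is black".
likelihood = {q: q for q in prior}

post = posterior(prior, likelihood)
```

Seeing a black sock shifts belief towards the larger proportion: the posterior puts 0.25 on q = 0.2 and 0.75 on q = 0.6.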

Details in overview Simple examples

Inverse Probability

Given a particular probability model, and given some data, what can we say about the parameters of the model? For example, consider pulling socks from a washing machine and wishing to estimate the proportion of socks that are black. The appropriate model here is the binomial model, with unknown parameter p. The data consist of N socks, k of which are black.

Classical idea

Come up with a point estimate and an interval estimate for p. The interval is a function of the point estimate together with its standard error, and the standard error is itself estimated from the data. For example, k/N estimates p. The variance of k/N is p(1 – p)/N, so the interval estimate is k/N ± 2 sqrt((k/N)(1 – k/N)/N). An alternative approach is likelihood.
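A minimal sketch of the classical calculation, using the point estimate k/N plus or minus two estimated standard errors. The function name is an assumption made for this example.

```python
import math


def classical_interval(k, n):
    """Point estimate k/n and the interval k/n +/- 2 estimated standard errors."""
    p_hat = k / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - 2 * standard_error, p_hat + 2 * standard_error


# 3 black socks out of 10, and 30 black socks out of 100.
estimate_10, low_10, high_10 = classical_interval(3, 10)
estimate_100, low_100, high_100 = classical_interval(30, 100)
```

Both samples give the same point estimate, 0.3, but the interval is roughly (0.01, 0.59) for 3 out of 10 and roughly (0.21, 0.39) for 30 out of 100.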

What values of p are most supported by the data?

Statisticians talk about likelihood of data, given parameter values. The likelihood function tells us to what extent the different possible values of the parameter are supported by particular (observed) data. This is (proportional to) the probability of seeing what we actually saw, for all the possible values of the parameter. In order to get the ‘best’ estimate, it is common to take the value which maximises the likelihood.
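The idea can be sketched for the sock example by evaluating the binomial likelihood on a grid of candidate values of p and taking the value that maximises it. The grid spacing of 0.01 is an arbitrary choice for illustration.

```python
import math


def binomial_likelihood(p, k, n):
    """Probability of seeing k black socks out of n when the proportion is p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Candidate values of p: 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]

# Value of p on the grid that is best supported by "3 black out of 10".
mle = max(grid, key=lambda p: binomial_likelihood(p, 3, 10))
```

On this grid the maximum falls at p = 0.3, which agrees with the estimate k/N.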

A likelihood for 10 and 100 socks.

With 3 out of 10 socks black, the data support p = 0.3 but also many other values. With 30 out of 100, the support for p = 0.3 is stronger.

120 out of 400 socks.

A Likelihood Interpretation

It is clear that in either case, 3/10 or 30/100, the value of p which maximises the likelihood function is 0.3. The ‘information’ contained in the data is related to the curvature of the likelihood: the sharper the curve, the greater the ‘information’ about p. It is possible to create an interval estimate for p by ‘chopping’ the curve. Can we say all of these are probable values for p? The first attempts at doing so were by Bayes (1763) and Laplace (1812).
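One way to sketch the ‘chopping’ idea is to keep every value of p whose likelihood is at least some fraction of the maximum. The cutoff of 0.15 and the grid of 999 candidate values are assumptions made for this example, not values given in the lecture, and the sketch assumes 0 < k < n.

```python
import math


def log_likelihood(p, k, n):
    """Binomial log-likelihood of p, dropping the constant that does not depend on p."""
    return k * math.log(p) + (n - k) * math.log(1 - p)


def likelihood_interval(k, n, cutoff=0.15, steps=1000):
    """Range of grid values of p whose likelihood is at least cutoff * the maximum."""
    top = log_likelihood(k / n, k, n)
    inside = [
        i / steps
        for i in range(1, steps)
        if log_likelihood(i / steps, k, n) - top >= math.log(cutoff)
    ]
    return min(inside), max(inside)


low_10, high_10 = likelihood_interval(3, 10)
low_100, high_100 = likelihood_interval(30, 100)
low_400, high_400 = likelihood_interval(120, 400)
```

All three intervals contain 0.3, and each is narrower than the one before, which reflects the sharper curve obtained from more socks.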

A Bayesian interpretation …

Consider a statement of the form “Having seen 3 out of 10 heads for my experiment, the probability that p is between .1 and .5 is 95%.” This is allowed only in the Bayesian world; it is a very neat interpretation. However, it necessarily relies on quantifying the prior beliefs about p. Laplace considered a uniform prior to be natural in the above situation: P(p | E) = 1, 0 < p < 1. The posterior is then P(p | X, E) ∝ p^k (1 – p)^(N – k), 0 < p < 1, where the data X = “k from N”.
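Under the uniform prior, the posterior probability that p lies in an interval is the area under p^k (1 – p)^(N – k) over that interval, divided by the area over the whole range from 0 to 1. The sketch below approximates both areas with a simple midpoint rule; the function name and the number of steps are assumptions made for this example.

```python
def posterior_probability(k, n, lower, upper, steps=20_000):
    """P(lower < p < upper | k from n) under a uniform prior, by the midpoint rule."""

    def kernel(p):
        # Unnormalised posterior: likelihood times the uniform prior (which is 1).
        return p**k * (1 - p) ** (n - k)

    def integrate(a, b):
        width = (b - a) / steps
        return sum(kernel(a + (i + 0.5) * width) for i in range(steps)) * width

    return integrate(lower, upper) / integrate(0.0, 1.0)


# Posterior probability that p is below one half, having seen 3 out of 10.
below_half = posterior_probability(3, 10, 0.0, 0.5)

# Posterior probability for an interval of the kind quoted in the statement.
between = posterior_probability(3, 10, 0.1, 0.5)
```

Dividing by the area over the whole range supplies the normalising constant that the proportionality sign leaves out.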

Simplicity of Bayesian Interpretation

The prior summarises uncertainty before including information from a study or experiment. The likelihood is the information in the data. The posterior summarises uncertainty given what was ‘known’ before and the information from the experiment.


Name: Lecture 13 SPW -...
Author: walshc
Company: Trinity College Dublin
Description: Bayesian Methods
Created: 11/3/2001 8:05:17 PM
Slides: 52