Language modelling. María Fernández Pajares. Verarbeitung gesprochener Sprache (Spoken Language Processing)

Index:

1. Introduction 2. Regular grammars 3. Stochastic languages 4. N-gram models 5. Perplexity

Introduction: Language models

What is a language model? It is a method for defining the structure of a language so that recognition is restricted to the most probable sequences of linguistic units. Language models tend to be useful for applications with complex syntax and/or semantics. A good LM should accept (with high probability) only correct sentences and reject (or assign a low probability to) incorrect word sequences.
Classic models:
- N-grams
- Stochastic grammars

Introduction: general scheme of a system

(Block diagram) signal → measurement of parameters → comparison with models → decision rule → text. The acoustic and grammar models are the models used by the system.

Introduction: measuring task difficulty

Task difficulty is determined by the real flexibility of the admitted language.
Perplexity: the average number of options. There are finer measures that also take into account the difficulty of the words or of the acoustic models.
Speech recognizers seek the word sequence W that is most likely given the acoustic evidence A. Speech recognition involves acoustic processing, acoustic modelling, language modelling, and search.
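To make the "average number of options" reading of perplexity concrete, here is a minimal Python sketch. It assumes the standard definition of perplexity as the inverse geometric mean of the per-word probabilities; the function name `perplexity` and all probability values are illustrative and do not come from the slides.

```python
import math

def perplexity(word_probs):
    """Perplexity of a word sequence.

    word_probs[i] is P(w_i | history), as assigned by some language model.
    """
    n = len(word_probs)
    log_sum = sum(math.log2(p) for p in word_probs)
    return 2 ** (-log_sum / n)

# Hypothetical model: at every position, 10 words are equally likely.
# The perplexity is then approximately 10, the number of options.
uniform = [1 / 10] * 5
print(perplexity(uniform))

# A model that is more certain about each word has a lower perplexity.
peaked = [0.5, 0.9, 0.8, 0.5, 0.9]
print(perplexity(peaked))
```

Under this definition a lower perplexity means fewer effective choices per word, and therefore an easier recognition task.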

Language models (LMs) assign a probability estimate P(W) to word sequences W = {w1,...,wn}, subject to $\sum_{W} P(W) = 1$.
Language models help guide and constrain the search among alternative word hypotheses during recognition.
Huge vocabularies: the acoustic models and the language model are integrated into a single hidden Markov macro-model covering the whole language.
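The role of P(W) in the search can be made explicit with the usual Bayes decomposition of the recognition problem. This is the standard textbook formulation, given here as a sketch; the notation $\hat{W}$ for the recognized sequence is an assumption and does not appear in the slides.

```latex
\hat{W} = \arg\max_{W} P(W \mid A)
        = \arg\max_{W} \frac{P(A \mid W)\, P(W)}{P(A)}
        = \arg\max_{W} P(A \mid W)\, P(W)
```

Here $P(A \mid W)$ comes from the acoustic model and $P(W)$ from the language model; $P(A)$ does not depend on $W$ and can be dropped from the maximization.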

Introduction: dimensions of problem difficulty

- Connectivity
- Speakers
- Vocabulary and language complexity (+ noise, robustness)

Introduction: models based on grammars

* They represent language restrictions in a natural way.
* They allow dependencies to be modelled over as long a span as required.
* Defining these models is very difficult for tasks involving languages close to natural language (pseudo-natural languages).
* Integration with the acoustic model is not very natural.

Introduction: Kinds of grammars

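Consider a grammar G = (N, Σ, P, S), where N is the set of non-terminal symbols, Σ the set of terminal symbols, P the set of production rules, and S the start symbol. The Chomsky hierarchy classifies grammars by the restrictions placed on their rules:
0. Unrestricted: no restrictions on the rules; too complex to be useful.
1. Context-sensitive: too complex.
2. Context-free: used in experimental systems.
3. Regular, or finite-state.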

Grammars and automata

Every type of grammar is associated with a type of automaton that recognizes its languages:
Type 0 (unrestricted): Turing machine
Type 1 (context-sensitive): linear bounded automaton
Type 2 (context-free): pushdown automaton
Type 3 (regular): finite-state automaton

Regular grammars

Languages generated by regular grammars: regular languages
A regular grammar is any right-linear or left-linear grammar. Regular grammars generate regular languages.
Examples: (figure not preserved)
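As a hedged illustration of a right-linear grammar, the Python sketch below uses a hypothetical grammar (S → a S, S → b) that is not taken from the original slides. It checks whether a string can be derived and compares the result with the regular expression `a*b`, which describes the same regular language. The names `RULES` and `generates` are illustrative.

```python
import re

# Hypothetical right-linear grammar (not from the original slides):
#   S -> a S
#   S -> b
# Each rule body is a terminal followed by at most one non-terminal.
RULES = {"S": [("a", "S"), ("b", None)]}

def generates(word, symbol="S"):
    """Return True if `symbol` can derive `word` using RULES."""
    for terminal, next_symbol in RULES[symbol]:
        if not word.startswith(terminal):
            continue
        rest = word[len(terminal):]
        if next_symbol is None:
            if rest == "":
                return True
        elif generates(rest, next_symbol):
            return True
    return False

# The same language as a regular expression: zero or more a's, then one b.
PATTERN = re.compile(r"a*b")

for w in ["b", "ab", "aaab", "", "ba", "abb"]:
    assert generates(w) == bool(PATTERN.fullmatch(w))
    print(repr(w), generates(w))
```

Because every rule consumes one terminal and leaves at most one non-terminal at the right end, the derivation can be followed left to right with no memory beyond the current non-terminal, which is what a finite-state automaton does.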

Search space

An example:

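Stochastic grammars and languages

Add a probability to each of the production rules.
A stochastic grammar is a pair (G, p), where G is a grammar and p is a function $p: P \to [0,1]$ with the property
$\sum_{(A \to \alpha) \in P_A} p(A \to \alpha) = 1$ for every non-terminal $A \in N$,
where $P_A$ represents the set of grammar rules whose antecedent is A.
A stochastic language over an alphabet $\Sigma$ is a pair $(L, p)$, with $L \subseteq \Sigma^*$ and $p: \Sigma^* \to [0,1]$, that fulfils the following conditions:
$p(x) > 0$ if $x \in L$; $p(x) = 0$ if $x \notin L$; $\sum_{x \in L} p(x) = 1$.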
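The normalization property and the resulting string probabilities can be sketched in Python. The grammar and its probabilities below are hypothetical, chosen only for illustration, and the names `RULES`, `is_properly_normalised` and `string_probability` are assumptions rather than anything defined in the slides. The sketch only handles right-linear rules.

```python
import math
from collections import defaultdict

# Hypothetical stochastic right-linear grammar (illustrative values only).
# Each rule is (antecedent, terminal, next non-terminal or None, probability).
RULES = [
    ("S", "a", "S", 0.6),
    ("S", "b", None, 0.4),
]

def is_properly_normalised(rules):
    """Check that the probabilities of rules sharing an antecedent sum to 1."""
    totals = defaultdict(float)
    for antecedent, _terminal, _next_symbol, prob in rules:
        totals[antecedent] += prob
    return all(math.isclose(total, 1.0) for total in totals.values())

def string_probability(word, symbol="S"):
    """Probability that `symbol` derives `word`, summed over derivations."""
    total = 0.0
    for antecedent, terminal, next_symbol, prob in RULES:
        if antecedent != symbol or not word.startswith(terminal):
            continue
        rest = word[len(terminal):]
        if next_symbol is None:
            if rest == "":
                total += prob
        else:
            total += prob * string_probability(rest, next_symbol)
    return total

print(is_properly_normalised(RULES))
print(string_probability("aab"))  # product 0.6 * 0.6 * 0.4
print(string_probability("ba"))   # string outside the language
```

For this toy grammar the probabilities of the strings b, ab, aab, ... form a geometric series that sums to 1, so the pair (language, probability function) satisfies the conditions of a stochastic language.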

Example

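N-gram models

P(W) can be decomposed as:
$P(W) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1}) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-N+1}, \ldots, w_{i-1})$
When N = 2: bigrams. When N = 3: trigrams.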

Example: suppose that acoustic decoding assigns similar probabilities to the phrases "the pig dog" and "the big dog".
If P(pig | the) = P(big | the), then the choice between them depends on the word dog:
* P(the pig dog) = P(the) · P(pig | the) · P(dog | the pig)
* P(the big dog) = P(the) · P(big | the) · P(dog | the big)
Since P(dog | the big) > P(dog | the pig), the model helps to decode the sentence correctly.
Problems: a large number of training samples is needed, because the number of possible events grows with the vocabulary size |V|: unigram: |V|; bigram: |V|^2; trigram: |V|^3.
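The comparison in this example can be sketched in a few lines of Python. All probability values are hypothetical, chosen only so that P(pig | the) = P(big | the) and P(dog | the big) > P(dog | the pig); the table names and `trigram_sentence_probability` are illustrative.

```python
# Hypothetical probabilities, chosen only to illustrate the example above.
P_FIRST = {"the": 0.1}                                   # P(w1)
P_SECOND = {("the", "pig"): 0.01, ("the", "big"): 0.01}  # P(w2 | w1)
P_THIRD = {                                              # P(w3 | w1 w2)
    ("the", "pig", "dog"): 0.001,
    ("the", "big", "dog"): 0.05,
}

def trigram_sentence_probability(w1, w2, w3):
    """P(w1 w2 w3) = P(w1) * P(w2 | w1) * P(w3 | w1 w2)."""
    return P_FIRST[w1] * P_SECOND[(w1, w2)] * P_THIRD[(w1, w2, w3)]

candidates = [("the", "pig", "dog"), ("the", "big", "dog")]
best = max(candidates, key=lambda ws: trigram_sentence_probability(*ws))
print(" ".join(best))  # prints "the big dog"
```

Since the first two factors are equal for both candidates, the decision rests entirely on the last factor, which is the point the example makes.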

Advantages:
• Probabilities are based on data
• Parameters determined automatically from corpora
• Incorporate local syntax, semantics, and pragmatics
• Many languages have a strong tendency toward standard word order and are thus substantially local
• Relatively easy to integrate into forward search methods such as Viterbi (bigram) or A*
Disadvantages:
• Unable to incorporate long-distance constraints
• Not well suited for flexible word order languages
• Cannot easily accommodate
  – New vocabulary items
  – Alternative domains
  – Dynamic changes (e.g., discourse)
• Not as good as humans at tasks of
  – Identifying and correcting recognizer errors
  – Predicting following words (or letters)
• Do not capture meaning for speech understanding

Estimation of the Probabilities

Let us suppose that the N-gram model has been represented as a finite automaton:
Unigram: bigram w1w2: trigram w1w2w3: (automaton diagrams not preserved)
Suppose we have a training sample from which an N-gram model, represented as a finite automaton, is to be estimated. Let q be a state of the automaton, and let c(q) be the total number of events (N-grams) observed in the sample when the model is in state q.

c(w|q) is the number of times that the word w has been observed in the sample while the model is in state q.
P(w|q) is the probability of observing the word w conditioned on the state q.
Let $V_q$ denote the set of words observed in the sample when the model is in state q, and $V$ the total vocabulary of the language to be modelled.
For example, in a bigram the state q is the previous word $w_{i-1}$, so $P(w_i \mid w_{i-1}) = \frac{c(w_i \mid w_{i-1})}{c(w_{i-1})}$.
This approach assigns probability 0 to events that have not been seen, which causes coverage problems. The solution is to smooth the model. Smoothing can be flat, linear, non-linear, back-off, syntactic back-off, ...
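The relative-frequency estimate and the zero-probability problem can be sketched in Python for the bigram case. The toy corpus, the function names, the start marker `<s>` and the weight `lam` are all assumptions made for illustration. The smoothing shown, linear interpolation with a flat (uniform) distribution over the vocabulary, is only one possible reading of the "flat" and "linear" options listed above, not necessarily the method used in the slides.

```python
from collections import Counter

def train_bigram(sentences):
    """Count c(q) and c(w|q), where the state q is the previous word."""
    state_counts = Counter()  # c(q)
    pair_counts = Counter()   # c(w|q), keyed by (q, w)
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        for prev, word in zip(words, words[1:]):
            state_counts[prev] += 1
            pair_counts[(prev, word)] += 1
    return state_counts, pair_counts

def p_ml(word, prev, state_counts, pair_counts):
    """Relative-frequency estimate P(w|q) = c(w|q) / c(q)."""
    if state_counts[prev] == 0:
        return 0.0
    return pair_counts[(prev, word)] / state_counts[prev]

def p_flat(word, prev, state_counts, pair_counts, vocab, lam=0.9):
    """Interpolate the estimate above with a flat (uniform) distribution."""
    uniform = 1.0 / len(vocab)
    if state_counts[prev] == 0:  # state never seen: fall back to uniform
        return uniform
    return lam * p_ml(word, prev, state_counts, pair_counts) + (1 - lam) * uniform

corpus = ["the big dog", "the big cat", "the pig"]  # toy training sample
c_q, c_wq = train_bigram(corpus)
vocab = {w for s in corpus for w in s.split()}

print(p_ml("big", "the", c_q, c_wq))           # 2 of the 3 events after "the"
print(p_ml("pig", "big", c_q, c_wq))           # unseen event: probability 0
print(p_flat("pig", "big", c_q, c_wq, vocab))  # small but non-zero
```

With the unsmoothed estimate, any sentence containing "big pig" would receive probability 0 regardless of the acoustic evidence; the interpolated estimate keeps it possible while still summing to 1 over the vocabulary.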

Name: languagemodelling
Author: M F P
Created: 12/16/2006 7:01:43 PM
Slides: 25