Automatic Key Term Extraction from Spoken Course Lectures Using Branching Entropy and Prosodic/Semantic Features
Speaker: 黃宥、陳縕儂 National Taiwan University, Taiwan
Hello, everybody. I am Vivian Chen from National Taiwan University. Today I’m going to present my work on automatic key term extraction from spoken course lectures using branching entropy and prosodic/semantic features.
Outline
Introduction
Proposed Approach
Branching Entropy
Feature Extraction
Learning Method
Experiments & Evaluation
Conclusion
Key Term Extraction, NTU
Introduction
Definition
Key Term
  Higher term frequency
  Core content
Two types
  Keyword
  Key phrase
Advantage
  Indexing and retrieval
  The relations between key terms and segments of documents
First, I will define what a key term is. A key term is a term that has a higher term frequency and carries core content. There are two types of key terms. One is a phrase, which we call a key phrase; for example, “language model” is a key phrase. The other is a single word, which we call a keyword, like “entropy”. Key term extraction has two advantages. First, key terms help with indexing and retrieval. Second, we can construct the relations between key terms and segments of documents. Here’s an example.
Introduction
We can show some key terms related to “acoustic model”. If a key term and “acoustic model” co-occur in the same document, they are considered relevant, so we can show them to users.
Introduction
[Figure: the key term “acoustic model” shown with related key terms: language model, hmm, n gram, phone, hidden Markov model]
Then we can construct a key term graph to represent the relationships between these key terms, like this.
Introduction
[Figure: key term graph linking hmm, acoustic model, language model, n gram, hidden Markov model, phone, and bigram]
Target: extract key terms from course lectures
Similarly, we can construct the relations between “language model” and other terms. Then we can show the whole graph to see how the key terms are organized.
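The co-occurrence idea described in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_key_term_graph`, the toy documents, and the plain substring matching are all hypothetical and are not taken from the talk, which does not say how co-occurrence was actually computed.

```python
from collections import defaultdict
from itertools import combinations


def build_key_term_graph(documents, key_terms):
    """Link two key terms when they co-occur in at least one document.

    Assumption: a term "occurs" if it appears as a substring of the
    lowercased document text. A real system would match on tokens.
    """
    graph = defaultdict(set)
    for text in documents:
        lowered = text.lower()
        present = sorted(t for t in key_terms if t in lowered)
        # Every pair of terms found in the same document gets an edge.
        for a, b in combinations(present, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Toy documents, invented purely for illustration.
docs = [
    "the acoustic model is built from hmm states for each phone",
    "a language model such as a bigram is an n gram model",
    "the acoustic model and the language model are combined in decoding",
]
terms = ["acoustic model", "language model", "hmm", "phone", "n gram", "bigram"]

graph = build_key_term_graph(docs, terms)
print(sorted(graph["acoustic model"]))
print(sorted(graph["language model"]))
```

Looking up one term in `graph` gives the terms that would be shown next to it, and the whole dictionary corresponds to the full key term graph.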
Proposed Approach
Automatic Key Term Extraction
[Flow chart: original spoken documents; archive of spoken documents; ASR (speech signal, ASR trans); Branching Entropy; Feature Extraction; Learning Methods (K-means Exemplar, AdaBoost, Neural Network)]
Here’s the flow chart. We start with a large number of spoken documents.
Automatic Key Term Extraction
First using branching entropy to identify phrases
Automatic Key Term Extraction
Phrase Identification
Learning to extract key terms by some features
Key terms: entropy, acoustic model, ...
Branching Entropy
How to decide the boundary of a phrase?
“hidden” is almost always followed by the same word
[Figure: “hidden Markov model” followed by branches to words such as represent, is, can, of, in]
The target here is to decide the boundary of a phrase, but where is the boundary? We can first observe some characteristics. “Hidden” is almost always followed by “Markov”.
Branching Entropy
How to decide the boundary of a phrase?
“hidden” is almost always followed by the same word
“hidden Markov” is almost always followed by the same word
“hidden Markov model” is followed by many different words
[Figure: “hidden Markov model” with the phrase boundary marked]
Define branching entropy to decide possible boundary
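The observation on this slide can be checked by counting which words follow a given word sequence. The sketch below is a rough illustration only: the helper `followers` and the four-sentence toy corpus are hypothetical, chosen so that the pattern on the slide shows up.

```python
from collections import Counter


def followers(tokens, prefix):
    """Count the words that immediately follow each occurrence of `prefix`."""
    n = len(prefix)
    counts = Counter()
    for i in range(len(tokens) - n):
        if tokens[i:i + n] == prefix:
            counts[tokens[i + n]] += 1
    return counts


# Toy corpus, invented for illustration.
corpus = (
    "the hidden markov model is useful . "
    "a hidden markov model can represent speech . "
    "each hidden markov model of a phone . "
    "the hidden markov model in this course"
).split()

for prefix in (["hidden"], ["hidden", "markov"], ["hidden", "markov", "model"]):
    print(" ".join(prefix), "->", dict(followers(corpus, prefix)))
```

In this toy corpus the first two prefixes each have a single follower, while the full phrase has several different ones, which is the signal a boundary detector can exploit.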
Branching Entropy
Definition of Right Branching Entropy
Probability of children xi for X
Right branching entropy for X
[Figure: tree in which the pattern X is followed by its children xi]
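The equations on this slide did not survive in the text, so the sketch below assumes the usual formulation: the probability of a child xi is the number of times X is followed by xi divided by the number of times X is followed by any word, and the right branching entropy of X is the Shannon entropy (base 2) of that distribution. The function name `right_branching_entropy` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def right_branching_entropy(tokens, pattern):
    """Shannon entropy (in bits) of the words that follow `pattern`.

    Assumption: P(x_i) = count(pattern followed by x_i) / count(pattern
    followed by any word). The exact normalization used on the slide
    is not recoverable from the text.
    """
    n = len(pattern)
    children = Counter(
        tokens[i + n]
        for i in range(len(tokens) - n)
        if tokens[i:i + n] == pattern
    )
    total = sum(children.values())
    if total == 0:
        return 0.0
    # sum of p * log2(1 / p), written this way to avoid a negative zero
    return sum((c / total) * math.log2(total / c) for c in children.values())


# Toy corpus, invented for illustration.
corpus = (
    "the hidden markov model is useful . "
    "a hidden markov model can represent speech . "
    "each hidden markov model of a phone . "
    "the hidden markov model in this course"
).split()

for pattern in (["hidden"], ["hidden", "markov"], ["hidden", "markov", "model"]):
    print(" ".join(pattern), right_branching_entropy(corpus, pattern))
```

A low value means the next word is nearly fixed, so the pattern is probably still inside a phrase; a jump to a high value suggests a possible right boundary.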
Branching Entropy
Decision of Right Boundary
Find the right boundary located between X and xi where [equation not recovered]