Statistical Machine Translation with Rule Based Re-ordering of Source Sentences
Amit Sangodkar, Vasudevan N, Om P. Damani (CSE, IIT Bombay)

Illustration - Re-ordering

Dependency tree nodes: sung_have, poets, praise_in, songs, land_of, this, Many, Bengali
Dependency labels: nsubj, prep, dobj, amod, nn, prep, det
Output: Many Bengali poets this land of praise in

Illustration - Re-ordering

Dependency tree nodes: sung_have, poets, praise_in, songs, land_of, this, Many, Bengali
Dependency labels: nsubj, prep, dobj, amod, nn, prep, det
Output: Many Bengali poets this land of praise in songs

Illustration - Re-ordering

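Dependency tree nodes: sung_have, poets, praise_in, songs, land_of, this, Many, Bengali
Dependency labels: nsubj, prep, dobj, amod, nn, prep, det
Output: Many Bengali poets this land of praise in songs sung have
Hindi: कई बंगाली कवियों ने इस महान भूमि की प्रशंसा के गीत गाए हैं (English: Many Bengali poets have sung songs in praise of this great land)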
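The re-ordered output on this slide can be reproduced by a traversal that emits every dependent before its head. The sketch below is only an illustration under assumptions: the tree shape (which node hangs under which) is inferred from the node and label lists, the `Node` class and `head_final_order` function are hypothetical names, and the actual rule set used by the authors is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One dependency-tree node; merged tokens such as 'land_of' stay joined by '_'."""
    word: str
    children: list = field(default_factory=list)


def head_final_order(node):
    """Emit all dependents (left to right) before the head word itself."""
    words = []
    for child in node.children:
        words.extend(head_final_order(child))
    # A merged node like 'sung_have' is written out as separate words.
    words.extend(node.word.split("_"))
    return words


# Assumed tree for the slide's example sentence (attachment is an inference).
tree = Node("sung_have", [
    Node("poets", [Node("Many"), Node("Bengali")]),                         # nsubj; amod, nn
    Node("songs", [Node("praise_in", [Node("land_of", [Node("this")])])]),  # dobj; prep, prep, det
])

reordered = " ".join(head_final_order(tree))
print(reordered)
```

With this assumed tree, the traversal yields the same word order as the final Output line, which follows the verb-final, postpositional order of the Hindi sentence.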

Experimental Setup

Procedure:
1. Train Moses on the training data with a 6-gram language model.
2. Tune Moses on the development data.
3. Decode the test data with the trained Moses system.

The same procedure is run on both the original (pure) data and the re-ordered data.

Results

Corpus         Metric  Baseline          Re-ordered
                       Dev      Test     Dev      Test
EILMT          BLEU    0.1488   0.1450   0.1751   0.1601
               NIST    4.7600   4.7287   4.8539   4.6923
IIIT Data Set  BLEU    0.0815   0.0842   0.0836   0.0853
               NIST    3.9036   4.2426   3.7335   4.0140
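To make the size of the differences easier to read, the relative change from baseline to re-ordered can be computed per corpus and metric. This is a small sketch, not part of the original slides. It assumes the columns of the results slide are Baseline (Dev, Test) followed by Re-ordered (Dev, Test); the names `scores` and `relative_change` are illustrative.

```python
# Scores as read from the results slide. The column assignment
# (Baseline Dev/Test, Re-ordered Dev/Test) is an assumption.
scores = {
    ("EILMT", "BLEU"): {"baseline": (0.1488, 0.1450), "reordered": (0.1751, 0.1601)},
    ("EILMT", "NIST"): {"baseline": (4.7600, 4.7287), "reordered": (4.8539, 4.6923)},
    ("IIIT", "BLEU"): {"baseline": (0.0815, 0.0842), "reordered": (0.0836, 0.0853)},
    ("IIIT", "NIST"): {"baseline": (3.9036, 4.2426), "reordered": (3.7335, 4.0140)},
}


def relative_change(baseline, reordered):
    """Percentage change of the re-ordered score relative to the baseline score."""
    return (reordered - baseline) / baseline * 100.0


changes = {}
for (corpus, metric), row in scores.items():
    dev = relative_change(row["baseline"][0], row["reordered"][0])
    test = relative_change(row["baseline"][1], row["reordered"][1])
    changes[(corpus, metric)] = (dev, test)
    print(f"{corpus:<6}{metric:<6} dev {dev:+6.2f}%  test {test:+6.2f}%")
```

Under this reading of the columns, BLEU rises in all four cells while NIST falls in three of four, which is one reason to examine how well each metric fits this setting.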

Translation Example - I

Actual : इसी वर्ष नील व्यापार और नील उत्पादन के इतिहास में एक मोड़ आया. (English: In this very year, a turning point came in the history of the indigo trade and indigo production.)
Baseline : इस वर्ष में एक निर्धारित बिंदु रहे के इतिहास में नील व्यापार और नील उत्पादन.
Re-ordered : इस साल नील व्यापार और नील उत्पादन के इतिहास में यह एक रहा था.

Translation Example - II

Actual : वे गुलामी की जिंदगी से रिहाई चाहते हैं. (English: They want release from a life of slavery.)
Baseline : वे चाहते हैं कि deliverance का जीवन से गुलामी की है.
Re-ordered : वे गुलामी की जिंदगी से रिहाई चाहते हैं.
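The gap between the two outputs in this example can be quantified with clipped (modified) n-gram precision, the building block of BLEU. The sketch below is illustrative only: it is not full BLEU (no brevity penalty, no geometric mean over n-gram orders), it uses whitespace tokenization with the sentence-final period removed, and `ngrams` and `modified_precision` are names introduced here.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


# Sentences from Translation Example - II, without the final period.
reference = "वे गुलामी की जिंदगी से रिहाई चाहते हैं"
baseline = "वे चाहते हैं कि deliverance का जीवन से गुलामी की है"
reordered = "वे गुलामी की जिंदगी से रिहाई चाहते हैं"

for name, hypothesis in [("baseline", baseline), ("re-ordered", reordered)]:
    p1 = modified_precision(hypothesis, reference, 1)
    p2 = modified_precision(hypothesis, reference, 2)
    print(f"{name}: unigram precision {p1:.2f}, bigram precision {p2:.2f}")
```

The re-ordered output matches the reference exactly, so every n-gram precision is 1. The baseline shares several words with the reference but few word pairs, so its bigram precision drops well below its unigram precision.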

Conclusion

Using linguistic knowledge appears to improve SMT quality.
The applicability of the BLEU score in this context needs to be investigated.

Acknowledgements

We acknowledge the Department of IT (DIT), Government of India and the English-to-Indian Languages (EILMT) consortium for making the EILMT tourism dataset available.
IIIT Data Set: Data acquired during the DARPA TIDES MT project 2003 and later refined at LTRC, IIIT-H.

References

[Hieu2008] Hieu Hoang and Philipp Koehn. Design of the Moses Decoder for Statistical Machine Translation. ACL Workshop on Software Engineering, Testing, and Quality Assurance for NLP, 2008.
[Marie2006] Marie-Catherine de Marneffe, Bill MacCartney and Christopher D. Manning. Generating Typed Dependency Parses from Phrase Structure Parses. In Proceedings of LREC-06, 2006.
[Manual2008] Stanford Dependencies Manual. Available at http://nlp.stanford.edu/software/dependencies_manual.pdf
[Moses] Moses Tutorial. Available at http://www.statmt.org/moses/?n=Moses.Tutorial
[Singh2007] Smriti Singh, Mrugunk Dalal, Vishal Vachhani, Pushpak Bhattacharyya and Om P. Damani. Hindi Generation from Interlingua (UNL). Machine Translation Summit XI, 2007.

Name: iconSMT08
Created: 12/20/2008 12:54:57 AM