
Hindi Wordnet at IIT Bombay
Current Team: Pushpak Bhattacharyya, Prabhakar Pandey, Laxmi Kashyap, Salil Joshi, Arun Karthikeyan, Prachur Goel, and many previous PhD, Masters and Bachelor students and research staff


Great Language Diversity of India

Languages and the speaker population

Language    Population (2001 census; rounded to most significant digit)
Hindi       450,000,000
Marathi     72,000,000
Konkani     7,000,000
Sanskrit    6,000
Nepali      13,000,000

Languages and the speaker population (contd.)

Language    Population (2001 census; rounded to most significant digit)
Kashmiri    5,000,000
Assamese    13,000,000
Tamil       60,000,000
Malayalam   33,000,000
Manipuri    1,000,000
Bodo        1,000,000

Major Language Processing Initiatives

The IIT Bombay Natural Language Processing Group is heavily supported by Government and Industry.
Mostly from the Government: Ministry of IT, Ministry of Human Resource Development, Department of Science and Technology
Recently, a great drive from the industry, with NLP efforts focused on Indian languages: Google, Microsoft, IBM Research Lab, Yahoo, TCS

What is Hindi Wordnet?

Wordnet: a lexical database
Hindi Wordnet:
Inspired by the English WordNet
Built conceptually
Synsets (synonymy sets) are the basic building blocks
Different organizing principles for different syntactic categories

Example Entry in Hindi Wordnet

Synset: {गाय, गऊ, गैया, धेनु} {gaaya, gauu, gaiyaa, dhenu}, Cow
Gloss (text definition): सींगवाला एक शाकाहारी मादा चौपाया (siingwaalaa eka shaakaahaarii maadaa chaupaayaa) (a horned, herbivorous, four-legged female animal)
Example sentence: हिन्दू लोग गाय को गो माता कहते हैं एवं उसकी पूजा करते हैं। (hinduu loga gaaya ko go maataa kahate hain evam usakii puujaa karate hain) (Hindus call the cow "mother cow" and worship it.)
Note: on the original slide, only the items shown in red appear in Hindi Wordnet; the items shown in black do not.
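The entry above has three parts: a set of synonymous words, a gloss, and an example sentence. A minimal sketch of how such an entry could be held in memory follows. The `Synset` class and the `build_index` helper are hypothetical names chosen for illustration; they are not the Hindi Wordnet API, whose interface is not described in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """Hypothetical record for one wordnet entry (not the real API)."""
    words: tuple   # the synonymous words that make up the synset
    gloss: str     # text definition of the shared concept
    example: str   # sentence showing one of the words in use


# The cow entry from the slide, Hindi text only.
cow = Synset(
    words=("गाय", "गऊ", "गैया", "धेनु"),
    gloss="सींगवाला एक शाकाहारी मादा चौपाया",
    example="हिन्दू लोग गाय को गो माता कहते हैं एवं उसकी पूजा करते हैं।",
)


def build_index(synsets):
    """Map every word to the list of synsets that contain it."""
    index = {}
    for synset in synsets:
        for word in synset.words:
            index.setdefault(word, []).append(synset)
    return index


index = build_index([cow])
# Any member word leads back to the same synset.
print(index["धेनु"][0].gloss)
```

Indexing by word rather than by synset reflects how a wordnet is usually queried: the user starts from a word and asks for the concepts it can express.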

Relations in Wordnet

Synonymy
Hypernymy / Hyponymy
Antonymy
Meronymy / Holonymy
Gradation
Entailment
Troponymy

WordNet Sub-Graph: Hindi
Central synset: गाय, गऊ (gaaya, gauu): cow
Gloss: सींगवाला एक शाकाहारी मादा चौपाया (siingwaalaa eka shaakaahaarii maadaa chaupaayaa): a horned, herbivorous, four-legged female animal
Hypernym: चौपाया, पशु (chaupaayaa, pashu): four-legged animal
Hyponym: कामधेनु (kaamadhenu): a kind of cow; मैनी गाय (mainii gaaya): a kind of cow
Meronym: थन (thana): udder; पूँछ (puunchh): tail
Attribute: शाकाहारी (shaakaahaarii): herbivorous
Ability verb: पगुराना (paguraanaa): ruminate
Antonym: बैल (baila): ox
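The sub-graph above can be read as a set of labelled edges between synsets. The following is a minimal sketch of that reading, using the transliterated first word of each synset as a node name. `RelationGraph`, `INVERSE`, and the relation labels are assumptions made for this example and do not come from the Hindi Wordnet distribution.

```python
# Relations that come in pairs on the slide (hypernymy/hyponymy,
# meronymy/holonymy); antonymy is treated as its own inverse here.
INVERSE = {
    "hypernym": "hyponym",
    "hyponym": "hypernym",
    "meronym": "holonym",
    "holonym": "meronym",
    "antonym": "antonym",
}


class RelationGraph:
    """Hypothetical in-memory store of labelled synset relations."""

    def __init__(self):
        self._edges = {}

    def add(self, source, relation, target):
        """Record an edge and, when one is defined, its inverse edge."""
        self._edges.setdefault((source, relation), []).append(target)
        inverse = INVERSE.get(relation)
        if inverse is not None:
            self._edges.setdefault((target, inverse), []).append(source)

    def related(self, source, relation):
        """Return the targets reachable from source by one relation."""
        return list(self._edges.get((source, relation), []))

    def hypernym_chain(self, source):
        """Follow the first hypernym upward until none is left."""
        chain, seen, current = [], {source}, source
        while True:
            parents = self.related(current, "hypernym")
            if not parents or parents[0] in seen:
                return chain
            current = parents[0]
            seen.add(current)
            chain.append(current)


graph = RelationGraph()
graph.add("gaaya", "hypernym", "chaupaayaa")
graph.add("gaaya", "hyponym", "kaamadhenu")
graph.add("gaaya", "hyponym", "mainii gaaya")
graph.add("gaaya", "meronym", "thana")
graph.add("gaaya", "meronym", "puunchh")
graph.add("gaaya", "attribute", "shaakaahaarii")
graph.add("gaaya", "ability_verb", "paguraanaa")
graph.add("gaaya", "antonym", "baila")

print(graph.hypernym_chain("kaamadhenu"))
```

Storing the inverse edge at insertion time keeps lookups simple: asking for the holonym of थन (thana) or the hypernym of कामधेनु (kaamadhenu) needs no search over the whole graph.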

Statistics

Synsets                         33,500
Unique Words                    80,400
Related Synsets                 33,500
Hindi-English Linked Synsets    13,000
Hits                            260,000

Impact, Use and Visibility of Hindi Wordnet

Free download with API under the GPL
Available from the LDC (Linguistic Data Consortium), UPenn: the topmost linguistic data repository in the world
Commercial license purchased by Google for work on an Indian language search engine
To be available from ELRA: the language data repository of Europe
Available from LDC-IL: the LDC of India

Impact, Use and Visibility of created resources (continued)

Daily reference from all over the world
More than 2 lakh (200,000) hits since 2006
More than 3,000 downloads
Pivot for the wordnets of many Indian languages
Base resource used by many researchers for Indian language (IL) work on translation, summarization, and cross-lingual search

Figure: Hindi Wordnet giving rise to other Indian language wordnets
Wordnets shown alongside Hindi Wordnet: Dravidian Language Wordnet, North East Language Wordnet, Marathi Wordnet, Sanskrit Wordnet, English Wordnet, Bengali Wordnet, Punjabi Wordnet, Konkani Wordnet

Linked wordnets

Immense lexical resource
Great benefits to machine translation and cross-lingual search
Very useful for language teaching, pedagogy, and comparative linguistics
Akin to EuroWordNet, but with critical differences due to typical Indian language characteristics

Pan-India Dictionary Standard based on wordnet

Columns: Senses | Hindi | Marathi | Bangali | Oriya | Tamil

Senses:  (W1, W2, W3, W4, W5, W6)
Hindi:   (W1, W2, W3, W4, W5, W6)
Marathi: (W1, W2, W3)
Bangali: (W1, W2, W3)
Oriya:   (W1, W2, W3, W4)
Tamil:   (W1, W2, W3)

Senses:  (sun)
Hindi:   (सूर्य, सूरज, भानु, भास्कर, प्रभाकर, दिनकर, अंशुमान, अंशुमाली)
Marathi: (सूर्य, भानु, दिवाकर, भास्कर, रवि, दिनेश, दिनमणी)
Bangali: ...
Oriya:   ...
Tamil:   ...

Senses:  (cub, lad, laddie, sonny, sonny boy)
Hindi:   (लड़का, बालक, बच्चा, छोकड़ा, छोरा)
Marathi: (मुलगा, पोरगा, पोर, पोरगे)
Bangali: …
Oriya:   …
Tamil:   …

Senses:  (son, boy)
Hindi:   (पुत्र, बेटा, लड़का, लाल, सुत, बच्चा, सूत, नंदन, नन्दन, पूत, तनय)
Marathi: (मुलगा, पुत्र, लेक, चिरंजीव, तनय)
Bangali: …
Oriya:   …
Tamil:   …
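The table above aligns synsets of different languages on a shared sense, which is what makes a linked wordnet usable for translation and cross-lingual search. A minimal sketch of that lookup follows, filled only with the Hindi and Marathi cells given on the slide. The sense keys (`"son"`, `"lad"`, `"sun"`), the `LINKED` table, and both function names are hypothetical and chosen for this example.

```python
# One row per sense; each row maps a language to its synset.
# Languages whose cells are elided on the slide are simply absent.
LINKED = {
    "son": {
        "hindi": ["पुत्र", "बेटा", "लड़का", "लाल", "सुत", "बच्चा",
                  "सूत", "नंदन", "नन्दन", "पूत", "तनय"],
        "marathi": ["मुलगा", "पुत्र", "लेक", "चिरंजीव", "तनय"],
    },
    "lad": {
        "hindi": ["लड़का", "बालक", "बच्चा", "छोकड़ा", "छोरा"],
        "marathi": ["मुलगा", "पोरगा", "पोर", "पोरगे"],
    },
    "sun": {
        "hindi": ["सूर्य", "सूरज", "भानु", "भास्कर", "प्रभाकर",
                  "दिनकर", "अंशुमान", "अंशुमाली"],
        "marathi": ["सूर्य", "भानु", "दिवाकर", "भास्कर", "रवि",
                    "दिनेश", "दिनमणी"],
    },
}


def senses_of(word, language):
    """Return the keys of every sense whose synset contains the word."""
    return [sense for sense, row in LINKED.items()
            if word in row.get(language, [])]


def translate(word, source, target):
    """Collect target-language words that share a sense with the word."""
    candidates = []
    for sense in senses_of(word, source):
        for candidate in LINKED[sense].get(target, []):
            if candidate not in candidates:
                candidates.append(candidate)
    return candidates


print(senses_of("बच्चा", "hindi"))
print(translate("सूरज", "hindi", "marathi"))
```

A word such as बच्चा belongs to more than one Hindi synset, so `translate` returns candidates from every matching sense; choosing among them is a word sense disambiguation problem that this sketch does not address.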

Recognition

P. K. Patwardhan Award of IIT Bombay, 2008
Research grant from Microsoft Research India for multilingual database creation based on Hindi Wordnet
IBM India research grant for Unstructured Information Management with Hindi Wordnet as a component

International Global Wordnet Conference, Jan 31-Feb 4, 09
A major international event, granted to IIT Bombay because of the success of Hindi Wordnet


Name: Hindi-Wordnet_En-09
Author: Media Lab Asia
Company: Media Lab Asia
Created: 6/12/2006 3:12:44 AM
Slides: 17