A Computational Analysis of Nepali Morphology: A Model for Natural Language Processing


A COMPUTATIONAL ANALYSIS OF NEPALI MORPHOLOGY: A MODEL FOR NATURAL LANGUAGE PROCESSING



A Dissertation



Submitted to the Faculty of Humanities and Social Sciences of Tribhuvan University in Fulfillment of the Requirements for the Degree of



DOCTOR OF PHILOSOPHY in LINGUISTICS



By



BALARAM PRASAIN
Ph.D. Reg. No.: 19-2058 Magh
TU Reg. No.: 288-83
March 2011



ACKNOWLEDGEMENTS



My profound indebtedness is due to my supervisor Prof. Dr. Yogendra Prasad Yadava, the former head, Central Department of Linguistics, Tribhuvan University, Nepal, for his insistent encouragement, continuous guidance, valuable suggestions and insightful comments in accomplishing this dissertation. I would like to express my sincere gratitude to Professor Dr. Miriam Butt, Department of Linguistics, Konstanz University, Germany, for her constructive suggestions, proper guidance and insightful comments to improve this dissertation. I owe a great deal to Dr. Andrew Hardie, Lancaster University, for his valuable suggestions, for providing his articles, which helped me understand the basic concepts, and for helping me use the online NNC corpus. I would like to extend my thanks to Dr. Dan Raj Regmi, head of the Central Department of Linguistics, Tribhuvan University, for his encouragement, useful suggestions and comments on my study. I would like to extend my sincere gratitude to Prof. Madhav Prasad Pokharel, Central Department of Linguistics, Tribhuvan University, for his prompt answers to any queries regarding Nepali morphology and its structure. I would like to express my gratitude to Prof. Dr. Chuda Mani Bandhu and Prof. Dr. Tej R. Kansakar, former heads, Central Department of Linguistics, for their inspiration and encouragement given to this study. I extend my thanks to Krishna Prasad Chalise, Dubi Nanda Dhakal and Krishna Poudel, Central Department of Linguistics, for their valuable comments and active participation in discussion whenever a problem was raised. I am equally thankful to Ram Raj Lohani, Bhim Narayan Regmi, Karnakhar Khatiwada, Bhim Lal Gautam and Krishna Prasad Parajuli, Central Department of Linguistics, for their support and encouragement.






I am extremely thankful to Santa Bahadur Basnet for his tokenizer and a number of computational concepts. I would like to thank Madan Puraskar Library and its staff for their help at different points in time. My sincere thanks go to Dr. Tika Ram Poudel, Tina Bögel and Sebastian Sulger, Konstanz University, for their help. I would like to thank Tribhuvan University for providing me study leave for two years; SIL International for providing me travel grants while attending the workshops and institute at the University of California, in Bangkok, Thailand, and at IIT Hyderabad, India; the Bhashasanchar project for supporting me financially while attending the training on text-to-speech at Gothenburg University, Sweden; and the Department of Linguistics, University of Konstanz, for supporting me financially to attend the school of computational and natural language processing at Konstanz University, Germany. I would like to take this opportunity to express my sincere appreciation to my spouse Mrs. Nirmala Prasain, son Aryan Prasain and daughter Sanskriti Prasain for their tolerance. Finally, I express my thanks to all the Central Department’s non-teaching staff for their help whenever required.



BALARAM PRASAIN






ABSTRACT



The main goal of this study is to present a computational analysis of Nepali morphology in order to develop a model for natural language processing using the finite state approach. The morphological categories have been analyzed according to the principles of two-level morphology (Koskenniemi 1983), and these categories have been implemented using the Xerox Finite State Tool (Beesley and Karttunen 2003) to create a morphological analyzer. The study uses a finite state transducer, a version of the finite state automaton that handles the relation between two languages, namely the upper language and the lower language. The upper language is equivalent to the lexical level and the lower language to the surface level. The finite state transducer is bidirectional: moving from the surface level to the lexical level is analysis, and moving from the lexical level to the surface level is generation.

This study is organized into eight chapters. Chapter 1 presents the general morphological concepts, the objectives, the methodology, the significance and the limitations of the study. Chapter 2 presents the theoretical framework adopted for the study. Chapter 3 analyzes nouns, pronouns, adjectives, numerals and classifiers in Nepali. Chapter 4 analyzes Nepali verbs from a computational approach in the first part and verbal inflections in the second part. Chapter 5 deals with indeclinable words in Nepali. Chapter 6 analyzes the derivational process. Chapter 7 implements the outcome of the analysis in the previous chapters as a finite state transducer using the Xerox Finite State Tool. Chapter 8 summarizes the findings of the study.

This study has identified fourteen groups of nouns, eight groups of pronouns, four groups of adjectives, one group of cardinal numerals, two groups of ordinal numerals, three groups of classifiers, ten groups of verbs, seven groups of adverbs, two groups of conjunctions, three groups of postpositions, one group of particles and fifteen groups of derivations in Nepali. The phonological rules for each group have also been identified. A finite state transducer with the corresponding morphological tags and phonological rules has been created for each group, and all of them have been combined into a single transducer that can be used as a morphological analyzer for Nepali.
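The bidirectional relation between the lexical (upper) and surface (lower) levels described in this abstract can be illustrated with a small sketch. It is written in Python rather than in the Xerox tools used in the study, and it is only a toy: the arc table, the function name `transduce` and the lexical form घर+NOUN+PL are assumptions made for illustration, built around the word घर ‘house’ and the tags +NOUN, +SG and +PL from the list of abbreviations.

```python
# Toy finite state transducer (illustrative only, not the dissertation's network).
# Each arc is (source state, upper label, lower label, target state).
# An empty string stands for epsilon, i.e. nothing on that level.
ARCS = [
    (0, "घ", "घ", 1),
    (1, "र", "र", 2),
    (2, "+NOUN", "", 3),     # tag exists only on the lexical level
    (3, "+SG", "", 4),       # singular has no surface marker
    (3, "+PL", "हरू", 4),    # hypothetical plural arc: +PL surfaces as हरू
]
START = 0
FINAL = {4}


def transduce(text: str, direction: str) -> list[str]:
    """Map text through the network.

    direction="down": lexical -> surface (generation)
    direction="up":   surface -> lexical (analysis)
    """
    results = []
    # Each stack item: (current state, unread input, output built so far).
    stack = [(START, text, "")]
    while stack:
        state, rest, out = stack.pop()
        if state in FINAL and rest == "":
            results.append(out)
        for src, upper, lower, dst in ARCS:
            if src != state:
                continue
            # Read one side of the arc and write the other side.
            read, write = (upper, lower) if direction == "down" else (lower, upper)
            if rest.startswith(read):
                stack.append((dst, rest[len(read):], out + write))
    return results


if __name__ == "__main__":
    print(transduce("घर+NOUN+PL", "down"))  # generation
    print(transduce("घरहरू", "up"))          # analysis
    print(transduce("घर", "up"))             # analysis
```

The same arc table serves both directions; only the side that is read and the side that is written are swapped, which is the sense in which a transducer is bidirectional. The traversal assumes the network has no loops of epsilon arcs, which holds for this small example.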






TABLE OF CONTENTS



Recommendation Letter
Approval Letter
Declaration
Acknowledgments
Abstract
List of tables
List of figures
List of abbreviations

Chapter 1: Introduction
1.1 Background
1.2 Statement of the problem
1.3 Objectives of the study
1.4 Literature review
1.5 Significance of the study
1.6 Research methodology
1.7 Limitations
1.8 Organization of the study

Chapter 2: Theoretical framework
2.0 Outline
2.1 Computational concept
2.2 Regular expression
2.3 Finite state technology
2.4 Regular language
2.5 Finite state machine
2.5.1 Finite state automata (FSA)
2.5.2 Finite state transducer (FST)
2.5.3 Some important operations on FSTs
2.6 FST in computational morphology
2.7 Xerox finite state tool syntax (XFST)
2.7.1 LEXC grammar
2.7.2 XFST interface
2.8 Summary

Chapter 3: Nominal morphology
3.0 Outline
3.1 Nouns in Nepali
3.1.1 Characteristics of nouns in Nepali
3.2 Classification of nouns in Nepali
3.2.1 O-ending nouns
3.2.2 Non-o-ending nouns
3.3 Pronouns
3.3.1 Characteristics of pronouns in Nepali
3.3.2 Grouping of pronouns
3.4 Adjectives
3.4.1 Characteristics of adjectives in Nepali
3.4.2 Classification of adjectives
3.5 Numerals
3.5.1 Cardinal numbers
3.5.2 Ordinal number
3.5.3 Other numerals
3.6 Classifiers in Nepali
3.6.1 Numeral classifiers
3.6.2 Quasi classifiers
3.7 Summary

Chapter 4: Verbal morphology
4.0 Outline
4.1 Characteristics of verb in Nepali
4.1.1 Significant verb stem finals
4.1.2 Transitivity
4.1.3 Syllabicity
4.1.4 Sound आ a
4.2 Morphological processes
4.2.1 Causativization/transitivization
4.2.2 Passivization
4.2.3 Negativization
4.3 Stem formation
4.4 Grouping of verb stems
4.4.1 Intransitive verb stems
4.4.2 Transitive verb stem
4.4.3 Irregular verb stems
4.4.4 Suppletive verb stems
4.5 Verbal inflections
4.5.1 Auxiliary verbs in Nepali
4.5.2 Tense
4.5.3 Aspects
4.5.4 Moods
4.5.5 Participial forms
4.6 Summary

Chapter 5: Adverbs, conjunctions, postpositions and particles
5.0 Outline
5.1 Adverbs in Nepali
5.1.1 Temporal adverbs
5.1.2 Spatial adverbs
5.1.3 Amount adverbs
5.1.4 Manner adverbs
5.1.5 Frequency adverbs
5.1.6 Reason adverbs
5.1.7 Sentential adverbs
5.2 Conjunctions in Nepali
5.2.1 Coordinate conjunctions
5.2.2 Subordinate conjunctions
5.3 Postpositions in Nepali
5.3.1 Plural/collective marker
5.3.2 Case markers in Nepali
5.3.3 Adverbial postpositions
5.4 Particles and interjections in Nepali
5.4.1 Particles
5.4.2 Emphatic markers
5.4.3 Interjections in Nepali
5.5 Summary

Chapter 6: Derivational morphology
6.0 Outline
6.1 Prefixation
6.1.2 Noun to noun derivation
6.1.3 Noun to adjective derivation
6.1.4 Noun to adverb derivation
6.1.5 Adjective to adjective derivation
6.2 Suffixation
6.2.1 Noun to noun derivation
6.2.2 Noun to adjective derivation
6.2.3 Noun to noun/adjective derivation
6.2.4 Adjective to noun derivation
6.2.5 Adjective/noun to noun derivation
6.2.6 Verb to noun derivation
6.2.7 Verb to adjective derivation
6.2.8 Verb to adverb derivation
6.2.9 Adverb to adjective derivation
6.2.10 Verb to noun conversion
6.2.11 Verb to adjective/noun conversion
6.2.12 Verb to noun derivation
6.3 Summary

Chapter 7: Implementation
7.0 Outline
7.1 Morphotactics: syntax of morphemes
7.1.1 Morphological categories
7.1.2 Grammatical categories
7.2 Lexc grammar
7.2.1 Nouns
7.2.2 Pronouns
7.2.3 Verbs
7.2.4 Adjectives
7.2.5 Numerals and classifiers
7.2.6 Adverbs
7.2.7 Postpositions
7.2.8 Conjunctions, particles and interjections
7.2.9 Derivations
7.3 Realization: rules of alternations
7.3.1 Phonological rules for nouns
7.3.2 Phonological rules for pronouns
7.3.3 Phonological rules for verbs
7.3.4 Phonological rules for adjectives
7.3.5 Phonological rules for adverbs
7.3.6 Phonological rules for postpositions
7.3.7 Phonological rules for particles and interjections
7.3.8 Phonological rules for numerals and classifiers
7.3.9 Phonological rules for derivations
7.4 Summary

Chapter 8: Summary and conclusion

Annexes
Annex-1: Devanagari – IPA
Annex-2: Nepali nouns sample
Annex-3: Nepali pronouns
Annex-4: Adjectives in Nepali
Annex-5: Numerals and classifiers in Nepali
Annex-6: Adverbs in Nepali
Annex-7: Verbs in Nepali
Annex-8: Verbal Inflections in Nepali
Annex-9: Conjunctions and particles in Nepali
Annex-10: Postpositions in Nepali
Annex-11: Words and affixes for derivation in Nepali

References






List of Tables



Table 1.1: Simple, complex, compound and reduplicated words
Table 1.2: Free morphemes in Nepali
Table 1.3: Bound morphemes in Nepali
Table 1.4: Lexical and surface levels representation
Table 2.1: The sample regular expressions
Table 2.2: Some operators used in regular expressions
Table 2.3: Regular expressions and regular language
Table 2.4: The transition table for घर and घरहरू
Table 3.1: The o-ending and non-o-ending nouns
Table 3.2: Number: singular and plural
Table 3.3: Lexical gender
Table 3.4: Morphological gender
Table 3.5: Direct and oblique forms
Table 3.6: Honorificity: non-honorific and honorific
Table 3.7: Augmentative and diminutive
Table 3.8: NounType 1a
Table 3.9: NounType 1b
Table 3.10: NounType 1c
Table 3.11: NounType 1d
Table 3.12: NounType 21a
Table 3.13: NounType 21b
Table 3.14: NounType 21c
Table 3.15: NounType 21d
Table 3.16: NounType 22a
Table 3.17: NounType 22b
Table 3.18: NounType 22c
Table 3.19: NounType 22d
Table 3.20: NounType 22e
Table 3.21: NounType 22f
Table 3.22: Pronouns with respect to persons
Table 3.23: Persons number in number distinctions
Table 3.24: Form of pronouns: direct and oblique
Table 3.25: Honorific levels in Nepali pronouns
Table 3.26: First person singular pronouns
Table 3.27: First person plural pronouns
Table 3.28: Second person singular non-honorific pronouns
Table 3.29: Second person honorific pronouns
Table 3.30: Second person high honorific pronouns
Table 3.31: Second person royal honorific pronoun
Table 3.32: Third person pronoun ऊ u:
Table 3.33: Third person pronouns त्यो tjo and ती ti:
Table 3.34: Third person pronouns यो jo and यी ji:
Table 3.35: The reflexive pronouns
Table 3.36: The demonstrative pronouns यो jo and यी ji:
Table 3.37: The demonstrative pronouns त्यो tjo and ती ti:
Table 3.38: The demonstrative pronouns ऊ u:
Table 3.39: The remaining demonstrative pronouns
Table 3.40: The relative pronouns
Table 3.41a: The interrogative pronouns
Table 3.41b: The indefinite pronouns derived from interrogative pronouns
Table 3.42: The indefinite pronouns derived from relative pronouns
Table 3.43a: The definite pronouns
Table 3.43b: The definite pronoun अक
Table 3.44a: The reciprocal pronouns
Table 3.44b: The reciprocal pronouns
Table 3.45: O-ending and non-o-ending adjectives
Table 3.46: Number: singular and plural
Table 3.47: Gender: masculine and feminine
Table 3.48: Form: direct and oblique
Table 3.49: Honorificity: non-honorific and honorific
Table 3.50: Degree: positive, comparative and superlative
Table 3.51: O-ending adjectives
Table 3.52: Type 1 marked adjectives
Table 3.53: Type 2 marked adjectives
Table 3.54: Unmarked adjectives
Table 3.55: Some cardinal numbers
Table 3.56: Some regular ordinal numbers
Table 3.57: Irregular ordinal numbers of one
Table 3.58: Irregular ordinal numbers of two
Table 3.59: Irregular ordinal numbers of three
Table 3.60: Irregular ordinal numbers of four
Table 3.61: Some ordinal numbers from Sanskrit loan
Table 3.62: Frequency numerals (I)
Table 3.63: Frequency numerals (II)
Table 3.64: Frequency numerals (III)
Table 3.65: Frequency numerals (IV)
Table 3.66: Some portion numerals
Table 3.67: Numeral classifiers
Table 3.68: o-ending classifiers
Table 3.69: General non-o-ending classifiers
Table 4.1: i-ending intransitive verb stems
Table 4.2: i-ending transitive verb stems
Table 4.2a: i-ending transitive verb stems
Table 4.3: Alternative forms of i-ending verb stems
Table 4.4: a-ending verb stems (group 1)
Table 4.5: a-ending verb stems (group 2)
Table 4.6: o-ending verb stems
Table 4.7a: Change of o to u in o-ending verb stems
Table 4.7b: ʌ-ending verb stems
Table 4.7c: ʌ-ending verb stems
Table 4.7d: Verb stems ending with a voiceless consonant
Table 4.8: Alternative forms from stems ending with voiceless consonant
Table 4.9: Verb stems ending with voiced consonant
Table 4.10: Alternative forms from stems ending with voiced consonant
Table 4.11: Intransitive verbs
Table 4.12: Some transitive verbs
Table 4.13: Some ditransitive verbs
Table 4.14: Monosyllabic verb stems
Table 4.15: Polysyllabic verb stems
Table 4.16: Verb stems with a sound
Table 4.17: Verb stems without a sound
Table 4.18: Causative verb stems
Table 4.19: Verb stems forming causatives with -आ -a and -आल् -al
Table 4.20: Verb stems forming causatives by changing अ ʌ to आ a
Table 4.21a: Verb stems forming causatives by changing उ u to ओ o
Table 4.21b: Verb stems forming causatives by suffixing -आ -a
Table 4.22: Verb stems forming causatives by inserting a
Table 4.23: Some passive verb stems
Table 4.24: Negation by the prefixation of negative marker न- nʌ-
Table 4.25: Negation by the suffixation of negative marker -न -nʌ
Table 4.26: Pattern of the stem formation
Table 4.27: Type1a verb stems
Table 4.28: Type1b verb stems
Table 4.29: Type1c verb stems
Table 4.30: Type1d verb stems
Table 4.31: Type1e verb stems (i)
Table 4.32: Type1e verb stems (ii)
Table 4.33: Type2a verb stems (i)
Table 4.34: Type2a verb stems (ii)
Table 4.35: Type2b verb stems
Table 4.36: Type2c verb stems
Table 4.37: Type2d verb stems
Table 4.38: Irregular verb stems
Table 4.39: Suppletive verb stems
Table 4.39: Inflections for non-past existential verb छ chʌ ‘be’ (affirmative)
Table 4.40: Inflection for non-past existential verb छ chʌ ‘be’ (negative)
Table 4.41: Inflections for non-past identificational verb हो ɦo ‘be’ (affirmative)
Table 4.42: Inflection for non-past identificational verb हो ɦo ‘be’ (negative)
Table 4.43: Inflections for past existential verb थि tʰi ‘be’ (affirmative)
Table 4.44: Inflections for past existential verb थि tʰi ‘be’ (negative)
Table 4.45: Inflections for non-past tense (affirmative)
Table 4.46: Inflections for non-past tense negative 1
Table 4.47: Inflections for non-past tense negative 2
Table 4.48: Inflections for past tense (affirmative)
Table 4.49: Inflections for past tense (negative)
Table 4.50: Inflections for perfect aspect
Table 4.51: Inflections for imperfect aspect
Table 4.52: Inflections for past habitual aspect (affirmative)
Table 4.53: Inflections for habitual aspect (negative)
Table 4.54: Inflections for inferential aspect (affirmative)
Table 4.55: Inflections for inferential aspect (negative)
Table 4.56: Inflection for imperative mood
Table 4.57: Inflections for optative mood (affirmative)
Table 4.58: Inflections for potential mood (affirmative)
Table 4.59: Inflection for absolutive participle
Table 4.60: Inflections for infinitive participle
Table 4.61: Inflections for purposive participle
Table 4.62: Inflection for prospective participle
Table 4.63: Inflections for durative participle
Table 4.64: Inflections for conjunctive participle
Table 4.65: Inflection for conditional participle
Table 4.65: Inflection for perfective participle
Table 5.1: Temporal adverbs
Table 5.2: Spatial adverbs
Table 5.3: Amount adverbs
Table 5.4: Manner adverbs
Table 5.5: Frequency adverbs
Table 5.6: Reason adverbs
Table 5.7: Sentential adverbs
Table 5.8: Coordinate conjunctions
Table 5.9: Subordinate conjunctions
Table 5.10: Collective/plural marker
Table 5.11a: Case marker postpositions (i)
Table 5.11b: Case marker postpositions (ii)
Table 5.12a: Adverbial postpositions (a)
Table 5.12b: Adverbial postpositions (b)
Table 5.13: Particles in Nepali
Table 5.14: Interjections in Nepali
Table 6.1: Noun to noun derivation
Table 6.2: Noun to adjective derivation
Table 6.3: Noun to adverb derivation
Table 6.4: Adjective to adjective derivation
Table 6.5: Noun to noun derivation
Table 6.6: Noun to adjective derivation
Table 6.7: Noun to noun/adjective derivation
Table 6.8: Adjective to noun derivation
Table 6.9: Adjective/noun to noun derivation
Table 6.10: Verb to noun derivation
Table 6.11: Verb to adjective derivation
Table 6.12: Verb to adverb derivation
Table 6.13: Adverb to adjective derivation
Table 6.14: Verb to noun conversion
Table 6.15: Verb to adjective/noun conversion
Table 6.16: Verb to noun (vowel insertion)
Table 7.1: The open word classes
Table 7.2: The closed word classes
Table 7.3: The grammatical categories and features
Table 7.3: The arbitrary tags






List of Figures

Figure 1.1: Lexical and surface levels of Nepali word केटो ket̺o ‘boy’
Figure 2.1: A finite state automaton that accepts घर ‘house’ and घरहरू ‘houses’
Figure 2.2: A finite state transducer that transduces between घर ‘house’ and घर+NOUN+SG
Figure 2.3: FST unioned from three FSTs for nouns, adjectives and adverbs
Figure 2.4: A finite state transducer concatenated from two FSTs above
Figure 2.5: A finite state transducer from composing two FSTs above
Figure 2.6: The interrelation among language, regular expression and finite state network
Figure 2.7: The structure of Lexc grammar (Beesley and Karttunen 2003:205)
Figure 2.8: xfst interface can compile lexicon and rule and compose them into single FST (Karttunen 2000)
Figure 3.1: A finite state transducer for NounType 1a
Figure 3.2: A finite state transducer for NounType 1b
Figure 3.3: A finite state transducer for NounType 1c
Figure 3.4: A finite state transducer for NounType 1d
Figure 3.5: A finite state transducer for NounType 21a
Figure 3.6: A finite state transducer for NounType 21b
Figure 3.7: A finite state transducer for NounType 21d
Figure 3.8: A finite state transducer for NounType 22a
Figure 3.9: A finite state transducer for NounType 22b
Figure 3.10: A finite state transducer for NounType 22c
Figure 3.11: A finite state transducer for NounType 22d
Figure 3.12: A finite state transducer for NounType 22e
Figure 3.13: A finite state transducer for NounType 22f
Figure 3.14: A finite state transducer for first person singular pronouns
Figure 3.15: A finite state transducer for first person plural pronouns
Figure 3.16: A finite state transducer for second person singular non-honorific pronouns
Figure 3.17: A finite state transducer for second person honorific pronouns
Figure 3.18: A finite state transducer for second person higher honorific pronouns
Figure 3.19: A finite state transducer for second person highest honorific pronoun
Figure 3.20: A finite state transducer for third person uː
Figure 3.21: A finite state transducer for third person pronouns त्यो tjo and ती ti:
Figure 3.22: A finite state transducer for third person pronouns यो jo and यी ji:
Figure 3.23: A finite state transducer for reflexive pronouns
Figure 3.24: A finite state transducer for demonstrative pronouns यो jo and यी ji:
Figure 3.25: A finite state transducer for demonstrative pronouns त्यो tjo and ती ti:
Figure 3.26: A finite state transducer for demonstrative pronouns ऊ u:
Figure 3.27: A finite state transducer for remaining demonstrative pronouns
Figure 3.28: A finite state transducer for relative pronouns
Figure 3.29: A finite state transducer for interrogative pronouns
Figure 3.30: A finite state transducer for indefinite pronouns derived from interrogative pronouns
Figure 3.31: A finite state transducer for indefinite pronouns derived from relative pronouns
Figure 3.32a: A finite state transducer for definite pronouns
Figure 3.32b: A finite state transducer for definite pronouns
Figure 3.33a: A finite state transducer for reciprocal pronouns
Figure 3.33b: A finite state transducer for reciprocal pronouns
Figure 3.34: A finite state transducer for o-ending adjectives
Figure 3.35: A finite state transducer for Type 1 marked adjectives
Figure 3.36: A finite state transducer for Sanskrit loan adjectives
Figure 3.37: A finite state transducer for unmarked adjectives
Figure 3.38: A finite state transducer for cardinal numbers and regular ordinal numbers
Figure 3.39: A finite state transducer for irregular ordinal numerals
Figure 3.40: A finite state transducer for ordinal numerals from Sanskrit loan
Figure 3.41: A finite state transducer for frequency numerals
Figure 3.42: A finite state transducer for portion numerals
Figure 3.43: A finite state transducer for numeral classifiers
Figure 3.44: A finite state transducer for general classifier type 1
Figure 3.45: A finite state transducer for general classifier type 2
Figure 4.1: A finite state transducer for Type1a verb stems
Figure 4.2: A finite state transducer for Type1b verb stems
Figure 4.3: A finite state transducer for Type1c verb stems
Figure 4.4: A finite state transducer for Type1d verb stems
Figure 4.5: A finite state transducer for Type1e verb stems
Figure 4.6: A finite state transducer for Type2a verb stems
Figure 4.7: A finite state transducer for Type2b verb stems
Figure 4.8: A finite state transducer for Type2c verb stems
Figure 4.9: A finite state transducer for Type2d verb stems
Figure 4.10: A finite state transducer for inflections of non-past existential verb छ chʌ ‘be’ (affirmative)
Figure 4.11: A finite state transducer for inflections of non-past existential verb छ chʌ ‘be’ (negative)
Figure 4.12: A finite state transducer for inflections of non-past identificational verb हो ɦo ‘be’ (affirmative)
Figure 4.13: A finite state transducer for inflection of non-past identificational verb हो ɦo ‘be’ (negative)
Figure 4.14: A finite state transducer for inflections of past existential verb थि tʰi ‘be’ (affirmative)
Figure 4.15: A finite state transducer for inflections of past existential verb थि tʰi ‘be’ (negative)
Figure 4.16: A finite state transducer for inflections of non-past tense
Figure 4.17: A finite state transducer for inflections of non-past tense negative 1
Figure 4.17a: A finite state transducer for inflections of non-past tense negative 2
Figure 4.18: A finite state transducer for inflections of past tense (affirmative)
Figure 4.19: A finite state transducer for inflections of past tense (negative)
Figure 4.20: A finite state transducer for inflections of perfect aspect
Figure 4.21: A finite state transducer for inflections of imperfect aspect
Figure 4.22: A finite state transducer for inflections of habitual aspect (affirmative)
Figure 4.23: A finite state transducer for inflections of habitual aspect (negative)
Figure 4.24: A finite state transducer for inflections of inferential aspect (affirmative)
Figure 4.25: A finite state transducer for inflections of inferential aspect (negative)
Figure 4.26: A finite state transducer for inflections of imperative mood
Figure 4.27: A finite state transducer for inflections of optative mood
Figure 4.28: A finite state transducer for inflections of potential mood
Figure 4.29: A finite state transducer for inflection of absolutive form
Figure 4.30: A finite state transducer for inflections of infinitive participial form
Figure 4.31: A finite state transducer for inflections of purposive participial form
Figure 4.32: A finite state transducer for inflection of prospective participial form
Figure 4.33: A finite state transducer for inflections of durative participial forms
Figure 4.34: A finite state transducer for inflections of conjunctive participial form
Figure 4.35: A finite state transducer for conditional participial form
Figure 4.36: A finite state transducer for inflection of conditional participial form
Figure 5.1: A finite state transducer for temporal adverbs
Figure 5.2: A finite state transducer for spatial adverbs
Figure 5.3: A finite state transducer for amount adverbs
Figure 5.4: A finite state transducer for manner adverbs
Figure 5.5: A finite state transducer for frequency adverbs
Figure 5.6: A finite state transducer for reason adverbs
Figure 5.7: A finite state transducer for sentential adverbs
Figure 5.8: A finite state transducer for coordinate conjunctions
Figure 5.9: A finite state transducer for subordinate conjunctions
Figure 5.10: A finite state transducer for plural/collective marker
Figure 5.12a: A finite state transducer for adverbial postpositions that do not take emphatic marker
Figure 5.12b: A finite state transducer for adverbial postpositions that take emphatic marker
Figure 5.13: A finite state transducer for particles
Figure 5.14: A finite state transducer for interjections
Figure 6.1: A finite state transducer for noun to noun derivation
Figure 6.2: A finite state transducer for noun to adjective derivation
Figure 6.3: A finite state transducer for noun to adverb derivation
Figure 6.4: A finite state transducer for adjective to adjective derivation
Figure 6.5: A finite state transducer for noun to noun derivation
Figure 6.6: A finite state transducer for noun to adjective derivation
Figure 6.7: A finite state transducer for noun to noun/adjective derivation
Figure 6.8: A finite state transducer for noun to adjective derivation
Figure 6.9: A finite state transducer for noun/adjective to noun derivation
Figure 6.10: A finite state transducer for verb to noun derivation
Figure 6.11: A finite state transducer for verb to adjective derivation
Figure 6.12: A finite state transducer for verb to adverb derivation
Figure 6.13: A finite state transducer for noun to adjective derivation
Figure 6.14: A finite state transducer for verb to adverb derivation
Figure 6.15: A finite state transducer for verb to adverb derivation
Figure 6.16: A finite state transducer for verb to adverb derivation



xxiv



List of abbreviations

+ABL = Ablative
+ABS = Absolutive
+ADJ = Adjective
+ADV = Adverb
+AMOUNT = Amount
+AUG = Augmentative
+CARD = Cardinal
+CAUSE = Causative
+CCONJ = Coordinate Conjunction
+CLF = Classifier
+COM = Comitative
+COMP = Comparative
+COND = Conditional
+CONJUCT = Conjunctive
+DAT = Dative
+DEF = Definite
+DEM = Demonstrative
+DIM = Diminutive
+ALL = Directional
+DIRT = Direct
+DUR = Durative
+EMPH = Emphatic
+ERG = Ergative
+EXIST = Existential
+FEM = Feminine
+FREQ = Frequency
+GEN = Genitive
+HAB = Habitual
+HHON = High Honorific
+HON = Honorific
+ID = Identificational
+IMP = Imperative
+IMPERF = Imperfect
+INDEF = Indefinite
+INF = Infinitive
+INFER = Inferential
+INST = Instrument
+INTERJ = Interjection
+INTERRO = Interrogative
IPA = International Phonetic Alphabet
+LOC = Locative
+MANNER = Manner
+MASC = Masculine
+NHON = Non honorific
+NOUN = Noun
NP = Noun Phrase
+NPST = Non past
+NUM = Numeral
+OBL = Oblique
+OPT = Optative
+ORD = Ordinal
+PST = Past
+PARTICLE = Particle
+PASS = Passive
+PERF = Perfect
+PERFT = Perfective
+PL = Plural
+PLACE = Place
+PORT = Portion
POS = Parts of speech
+POSIT = Positive
+POSTP = Postposition
+POT = Potential
+PRON = Pronoun
+PROPER = Proper
+PROS = Prospective
+PROX = Proximate
+PURP = Purposive
+REASON = Reason
+RECIP = Reciprocal
+REFL = Reflexive
+REL = Relative
+RHON = Royal honorific
+SCONJ = Subordinate conjunction
+SENT = Sentential
+SG = Singular
+SPAC = Spatial
+SUPER = Superlative
+TEMP = Temporal
+VERB = Verb
+VOC = Vocative
1 = First person
2 = Second person
3 = Third person



CHAPTER 1
INTRODUCTION

1.1 Background

This study is an attempt to analyze the morphology of Nepali and to design a computational model for natural language processing within the framework of finite state technology in general and the 'two-level morphology' developed by Koskenniemi (1983) in particular. To implement the analyzed data and create the computational model, the Xerox Finite State Tool developed by Beesley and Karttunen (2003) has been employed.

Language, as a means of human communication, is a tool for expressing the greater part of human ideas and emotions. Language shapes human thought, has a structure and carries meaning. Learning and expressing new concepts and ideas through language is so natural that it is hardly realized how natural language is processed in the brain. Thus, it can be claimed that there must be some sort of language representation and a processing module in the human brain (Siddiqui and Tiwari 2008:1). This kind of content in the brain also helps to represent language in the real-time world. Whenever language activity takes place, very fast and accurate natural language processing is at work, and it results in a successful communicative event. To capture this reality, computational linguistics attempts to develop computational models of aspects of human language processing. Developing such automated tools for language processing and gaining a better understanding of human communication are the main reasons that have inspired linguists and computer scientists in this ever-growing field.

In fact, language is the outer form of the content it expresses. Therefore, language processing means processing the content it carries. Language is generally manifested in either written or spoken form, and both forms can be processed with the help of a computer.

To achieve this goal, a computational model of a particular language, based on a formal approach, must be designed and implemented on the computer. As computers are not able to understand natural language, computational models and methods are developed to map its content into a formal language. Such formal languages are extended to account for natural language phenomena at various levels of the language. Representing the whole body of knowledge of a language would be an ambitious project. Thus, language can be divided into various levels such as phonology, morphology, syntax, semantics and pragmatics. Each level is perceived and defined in different ways by people in different disciplines according to the goals they set. It is not possible to create a single large computational model that covers the entire language at once. Therefore, the main goal of this study is to represent and process the morphology of the written form of Nepali text and to design a computational model for it.

Nepali is an Indo-Aryan language characterized by agglutinating morphology in general. The verb is predominantly inflectional whereas the noun is heavily agglutinating. Attempts have not yet been made to analyze Nepali morphology for the development of a computational model. The basic linguistic concepts relating to morphology and its representation are briefly discussed in the following subsections in order to clarify the computational aspects of morphology.¹



I. Morphology

Words are considered to be the fundamental building blocks of language (O'Grady, Dobrovolsky and Aronoff 1997:118; Jurafsky and Martin 2000:45). Every human language has words (Matthews 1991:20), and they are taken for granted, at least in descriptive linguistics (Katamba 1993:17). Of all the units of linguistic analysis, the word is the most central, familiar and crucial. The smallest free forms found in a language are said to be words (Bloomfield 1933); however, a free form that can occur in isolation is not only atomic but may also be molecular in its structure. A word (i.e. word-form) can, in fact, be simple, complex, compound or reduplicated. Table 1.1 presents simple, complex, compound and reduplicated words in Nepali.



¹ The general concepts of the linguistic categories may need to be qualified and specified for computational purposes. Therefore, some basic concepts are briefly discussed in sections I and II.






Table 1.1: Simple, complex, compound and reduplicated words

Type | Word | IPA/gloss | Meaning
Simple | घर | gʰʌr 'house' | house
Complex | घरबाट | gʰʌr-batʌ 'house-ABL' | from house
Compound | घरपरिवार | gʰʌr-pʌriwar 'house-family' | family
Reduplication | घरघरै | gʰʌr-gʰʌr-ʌi 'house-house-EMPH' | each house



Table 1.1 shows that the word घर gʰʌr 'house' is simple; घरबाट gʰʌr-batʌ 'from house' is complex; घरपरिवार gʰʌr-pʌriwar 'family' is compound; and घरघरै gʰʌr-gʰʌr-ʌi 'each house' is reduplicated.



a. Morpheme: free and bound

The smallest (minimal) unit of grammar that carries information about meaning and function is said to be a morpheme (Bloomfield 1933; Katamba 1993; O'Grady, Dobrovolsky and Aronoff 1997; Matthews 1991). The morpheme is an abstract entity that may correspond to various forms at the surface level. A morpheme can be free or bound: it may or may not be a word by itself. Lexical categories such as nouns, verbs, adjectives and adverbs are free morphemes, whereas a morpheme that must be attached to another element (normally a free morpheme) is called a bound morpheme. Free morphemes are lexical and bound morphemes are generally grammatical. Table 1.2 lists some free morphemes in Nepali.

Table 1.2: Free morphemes in Nepali

Morpheme | IPA | Gloss
घर | gʰʌr | house
जा | dza | go
असल | ʌsʌl | good
आज | adzʌ | today



The morphemes घर gʰʌr 'house', जा dza 'go', असल ʌsʌl 'good' and आज adzʌ 'today' in Table 1.2 can stand as words. The bound morphemes in Nepali cannot stand by themselves. Table 1.3 lists some bound morphemes in Nepali.






Table 1.3: Bound morphemes in Nepali

Morpheme | IPA | Gloss
-नु | -nu | -INF
-एको | -eko | -PERF
-ला | -la | -POT
-आइ | -ai | -NML



The morphemes -नु -nu 'INF', -एको -eko 'PERF', -ला -la 'POT' and -आइ -ai 'NML' in Table 1.3 cannot stand alone. They appear with other free morphemes and express some grammatical functions.



b. Root, stem, base, affixes and word

A root is the ultimate, irreducible constituent common to all word-forms of the same family. It is not an abstract but a concrete form. The root constitutes the core of the word, carries the major component of its meaning and typically belongs to a lexical category such as noun, verb, adjective or adverb. Therefore, a root corresponds to a free morpheme (Katamba 1993:45; Payne 1997:25). For example, man, book and tea are roots. A stem is the part of the word that remains when the last affix (generally an inflectional one) is removed. The stem therefore consists minimally of a root, and it may contain further elements (Katamba 1993:45). For instance, cat is the stem in cats. A base is the word or part of the word to which an affix can be attached. It may be called a stem for inflectional purposes (Katamba 1993:45). In English, the word work can be a root, a stem and a base, whereas the word worker can only be a base for the word workers.

The definitions of root, stem and base have something in common from a linguistic perspective. A base can be a stem as well as a root, and a stem can be a root as well.² The concepts thus overlap from a theoretical point of view. From a computational perspective, however, the distinction among them makes no difference, especially in computer processing. So, the term 'stem' is used in this study to represent any one of them, that is, any sequence of characters to which other sequences of characters can be attached.



² I have not discussed bound roots; see Katamba (1993) for details.






An affix is a bound morpheme which, when added to the radical element (root, stem or base), changes the meaning or function of a word by creating a new word-form. Affixes are therefore basically involved in the inflectional and derivational phenomena of the language. Affixes can be of various kinds, but only prefixes and suffixes are discussed here for the present purpose. A prefix is an affix attached in front of the stem and a suffix is one attached at the end (Katamba 1993:44). In English, re-, un- and in- in the words reunion, unhappy and intolerable are prefixes, whereas -ly, -ing, -er and -ed in the words slowly, working, walker and walked are suffixes. Thus, from a computational point of view, a word minimally consists of a stem and optionally one or more affixes.³ In analysis, words are decomposed into their constituents and represented following a certain formalism, whereas in generation the process is reversed.
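The view of a word as a stem followed by optional affixes can be sketched in a few lines of Python. This is only an illustration, not the model developed in this study: the stems and suffixes are the ones quoted in Tables 1.1 to 1.3, while the dictionary layout and the function names `analyze` and `generate` are assumptions. The sketch ignores category restrictions and alternation rules, which a real analyzer would need.

```python
# Toy lexicon built from forms quoted in the text; the data layout and the
# function names are assumptions made for illustration only.
STEMS = {"घर": "+NOUN", "जा": "+VERB"}
SUFFIXES = {"बाट": "+ABL", "नु": "+INF"}


def _strip_suffixes(rest):
    """Return the tag list for a suffix sequence, or None if it cannot be parsed."""
    if rest == "":
        return []
    for suffix, tag in SUFFIXES.items():
        if rest.startswith(suffix):
            tail = _strip_suffixes(rest[len(suffix):])
            if tail is not None:
                return [tag] + tail
    return None


def analyze(word):
    """Decompose a surface word into a known stem plus zero or more suffixes."""
    results = []
    for stem, category in STEMS.items():
        if not word.startswith(stem):
            continue
        tags = _strip_suffixes(word[len(stem):])
        if tags is not None:
            results.append(stem + category + "".join(tags))
    return results


def generate(stem, suffixes):
    """Reverse of analysis: concatenate a stem with a sequence of suffixes."""
    return stem + "".join(suffixes)
```

Here `analyze` walks from the surface string to a tagged stem, and `generate` goes the other way by plain concatenation; words whose stem is not in the toy lexicon simply receive no analysis.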



II. Levels of representation: lexical and surface

Words in written or spoken texts represent the outer form, i.e. the surface form. But a word carries various kinds of information, which can be represented at two levels at least. The lexical level of a word consists of its canonical form (lemma) and a set of tags showing its syntactic category and morphological features, that is, its possible parts of speech and/or inflectional properties such as gender, number, person, tense, aspect and mood. Thus, the lexical level represents the sequence of morphemes in a certain fashion. The actual arrangement of morphemes is governed by language-specific rules. Table 1.4 presents lexical and surface level representations of words in Nepali.

Table 1.4: Lexical level and surface level representation

Lexical level | Surface level | Gloss of stem
केटो+NOUN+MASC+SG | केटो | boy
खा+VERB+P.3SG | खायो | eat
न्यून+ADJ+SUPER | न्यूनतम | least
र+CC | र | and
र+PART | र | uncertain

³ See Katamba (1993:17-23) for details.



Table 1.4 presents the lexical level representation, consisting of sequences of morphemes attached to the stem, which results in particular word forms at the surface level. The representation at the surface level in Nepali corresponds to the actual spelling of the word. In this study, an attempt has been made to represent Nepali words at these two levels. The pairing of lexical level and surface level can be taken as a relation between two languages, and it can be used as a morphological analyzer or as a generator simply by changing the direction of the transitions. Figure 1.1 illustrates the two levels and the processes of analysis and generation of words in Nepali.



LEXICAL LEVEL : केटो+NOUN+MASC+SG
SURFACE LEVEL : केटो

Figure 1.1: Lexical and surface levels of Nepali word केटो ket̺o 'boy'

In Figure 1.1, for instance, moving from the surface level केटो to the lexical level केटो+NOUN+MASC+SG represents morphological analysis, and moving in the reverse direction represents generation.
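The idea that one relation serves as both analyzer and generator can be illustrated with a small Python sketch. It stores lexical/surface pairs from Table 1.4 as a plain list and indexes them in either direction; it is not a finite state transducer, and the names `build`, `analyze` and `generate` are hypothetical. Treating the surface form of the two र entries as the bare stem is also an assumption of the sketch.

```python
from collections import defaultdict

# Lexical/surface pairs taken from Table 1.4. The surface form of the two
# र entries is assumed here to be the bare stem.
PAIRS = [
    ("केटो+NOUN+MASC+SG", "केटो"),
    ("खा+VERB+P.3SG", "खायो"),
    ("र+CC", "र"),
    ("र+PART", "र"),
]


def build(pairs, direction):
    """Index the relation by its lexical side ('down') or its surface side ('up')."""
    table = defaultdict(list)
    for lexical, surface in pairs:
        if direction == "down":
            table[lexical].append(surface)
        else:
            table[surface].append(lexical)
    return dict(table)


GENERATOR = build(PAIRS, "down")  # lexical -> surface
ANALYZER = build(PAIRS, "up")     # surface -> lexical


def generate(lexical):
    """Return every surface form paired with a lexical string."""
    return GENERATOR.get(lexical, [])


def analyze(surface):
    """Return every lexical string paired with a surface form."""
    return ANALYZER.get(surface, [])
```

Because both directions are built from the same list of pairs, nothing has to be written twice. The sketch also shows why analysis can be ambiguous: the surface form र is paired with two lexical strings, so `analyze` returns both.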



1.2 Statement of the problem

Nepali is a morphologically rich language. There exist a number of morphological studies of Nepali, most of which are descriptive in nature. There are also a few works from a computational point of view (see the review of literature in 1.3). However, the morphology of Nepali has not yet been fully analyzed from a computational perspective. The main problem of this study is to analyze the morphology of Nepali in a way that can be implemented computationally. The specific problems of this study are as follows:

a. What are the morphological categories in Nepali?
b. What are the morphological processes in the language?
c. What are the rules involved in the morphological processes?
d. What is the computational model for morphology in Nepali?






1.3 Objectives of the study

The main objective of this study is to analyze the morphology of Nepali from a computational perspective. The specific objectives of the study are as follows:

a. To identify the morphological categories in Nepali;
b. To identify the morphological processes in the language;
c. To formulate the phonological/orthographic rules in Nepali; and
d. To design and develop the computational model for Nepali morphology.



1.3 Review of literature

There are only a few works on Nepali morphology from a computational perspective. However, there are a number of works on Nepali morphology from traditional and descriptive perspectives. These works contribute to the understanding of the main problem of the study to some extent. Such works have been thematically reviewed in four groups.⁴

⁴ The literature available is not directly related to this study. However, it provides the knowledge needed to understand the research problems. Therefore, instead of evaluating these works critically as per the review style, their contributions to this study have been mentioned under four themes: (i) Nepali morphology, (ii) Nepali computational morphology, (iii) Nepali language and related NLP works, and (iv) NLP works in selected languages.

a. Nepali morphology

Pandit (2051 VS (1969 VS)) has classified word categories in Nepali from a traditional point of view. The categories include noun, pronoun, adjective, verb and indeclinable. Nouns are grouped into common, proper and abstract nouns. Each noun is discussed with respect to gender (masculine, feminine and neuter), number (singular and plural), and cases and case markers (subjective, objective, instrumental, dative, ablative and locative). He has also presented detailed inflectional paradigms of nouns and verbs.

Bandhu (1973) has analyzed the clause patterns of Nepali using a tagmemic approach. The basic clause patterns are sub-classified and illustrated with examples. Under the inflected patterns, he has analyzed the inflectional categories, inflectional system, mood and finite system, aspects and copulas, modals, negation and post-verbal particles. The paradigms for each inflectional category are presented along with illustrations in sentences.

Dahal (1974) is an extensive description of colloquial and literary Nepali. He has classified the stem formation process into three classes, namely derived stems, composite stems and reduplicated stems, and has discussed the derived stems as suffix-derived noun stems, suffix-derived adjective stems, suffix-derived verb stems, suffix-derived adverbs, modification-derived stems and prefix-derived stems. Inflectional categories and their realizations are described under two headings, namely nominal inflections and verbal inflections.

Adhikari (1980) has examined a set of Nepali verbs ending in ch [tsʰ] and the relation of ch with the time of speech. Other elements can occur between the verb stem and the element ch; ch always refers to the non-past tense and is always followed by a concord marker.

Sharma (1980) has presented the verbal structure, from a descriptive approach, in the formulation V → stem ((+Aspect + BE) + Tense) (+Neg) + concord. Of these, only the stem and the concord are obligatory; the others are controlled by other constraints. The verbal stems are divided into simple and complex. The morphophonemic changes that occur when the suffixes for tense, aspect, negation and concord are attached to simple and complex stems are discussed at length.

Wallace (1985) is an attempt to test Nepali data against Relational Grammar and Government and Binding theory. It notes that Nepali nouns show number and that case relations are indicated by postpositions. Finite verb forms indicate tense, aspect, and affirmation or negation, and they agree with the subject. Adjectives are discussed as noun phrase modifiers that agree with the head noun in number and gender.

Chapagain (2046 VS) has also classified Nepali words into nouns, pronouns, adjectives, verbs, adverbs, conjunction and from a traditional and descriptive point of view. Word formation processes such as prefixation, suffixation, compounding and reduplication are discussed in detail with illustrations.






Acharya (1991) is a corpus-based study. He has classified the form classes into inflected forms (noun, adjective, pronoun and verb) and uninflected forms (adverb, conjunction, postposition, interjection and nuance particle). Nouns are discussed with respect to number (singular and plural) and case (nominative, accusative, instrumental, dative, ablative, genitive and locative). Adjectives are discussed with reference to their endings. Verbs are discussed according to the inflectional suffixes for present, past and future tenses with their corresponding number, person, gender and honorificity. Verbs can have simple and compound stems. Adverbs are placed under the uninflected form class and described in terms of comparative and superlative structures with examples from the corpus. Conjunctions, postpositions, interjections and particles are also discussed.

Adhikari (1993) has classified Nepali words into nouns, pronouns, adjectives, verbs, adverbs, postpositions, conjunctions and interjections from a descriptive approach. The agreement system is extensively discussed with respect to gender, number, person and five levels of honorificity. The classification of verbs and the treatment of stem formation, inflection and derivation are of great help in designing the alternation rules.

Adhikari (2052 VS) is a study of the Nepali case system using the Fillmorean framework. He has classified Nepali cases into two main categories: core cases and peripheral cases. The former include semantic cases such as agent, affected, resultative, neutral, experiencer, recipient and essive; the latter include cases such as locative, instrumental, cause, ablative, beneficiary, purposive and comitative.

Pokharel (2054 VS) has presented an analysis, from a descriptive approach, of the morphological and syntactic levels of voice, causativization, tense, aspect and mood in simple verbs, compound verbs and negation. The classification of verbs, agreement, honorificity, various kinds of classifiers, gender, number and case in nouns, and the grammar of pronouns are described and illustrated.

Lohani (1999) has studied complex predicates in Nepali using the theoretical framework of Lexical Functional Grammar. He has classified complex predicates into nominal, verbal, adjectival and adverbial complex predicates.

Sharma (2056 VS) has discussed nine parts of speech, namely nouns, pronouns, adjectives, adverbs, verbs, postpositions, conjunctions, interjections and particles, extensively from a traditional and descriptive approach. Each class is further sub-classified and discussed with illustrations.

Acharya (2058 VS) has classified words, from a traditional approach, into various classes on different bases, viz. original and loan words; underived and derived; compound and reduplicated; declinable and indeclinable. Nouns, pronouns, adjectives, verbs and adverbs are discussed with illustrations.

Dhakal (2058 VS) has analyzed Nepali numerical words from a historical perspective. He has compared the Nepali word forms with Sanskrit and Prakrit and has shown the changes that occurred during their evolution. He has also analyzed numerical words into their component parts and observed the sound changes.

Pokharel (2010a) has analyzed the noun class agreement system in Nepali from a descriptive point of view. On the basis of the analysis, Nepali nouns are grouped into eleven agreement classes. Gender assignment in Nepali is 'strictly semantic'. The use of classifiers for gender distinction is a unique feature not commonly found in languages. Nepali bases nominal agreement on the human vs. non-human distinction.

Pokharel (2010b) has presented various strategies for deriving the verb root in Nepali. Derivation from the citation form, the imperative singular form and the probabilitative singular form are compared. Since none of these strategies can derive all verb roots, he has proposed a mathematical strategy from generative phonology for this problem: if all the verb forms of a root are taken together and their highest common factor is calculated, the result is the root form, and this is the general formula for verb root derivation in Nepali.
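The "highest common factor" strategy reported for Pokharel (2010b) can be sketched as follows, under the assumption that the common factor of a set of verb forms is their longest shared initial substring. That reading, the function name `common_root` and the choice of forms are all illustrative: खायो appears in Table 1.4, while खानु and खाला are formed here with the suffixes -नु and -ला of Table 1.3.

```python
def common_root(forms):
    """Return the longest prefix shared by every form in the list."""
    if not forms:
        return ""
    shortest = min(forms, key=len)
    for i, ch in enumerate(shortest):
        # Stop at the first position where any form disagrees.
        if any(form[i] != ch for form in forms):
            return shortest[:i]
    return shortest


# Illustrative forms of खा 'eat'; only खायो is quoted in the text itself.
FORMS = ["खायो", "खानु", "खाला"]
ROOT = common_root(FORMS)
```

The comparison runs over Unicode code points, so a dependent vowel sign such as ा counts as a separate character from the consonant it follows.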



b. Nepali computational morphology

Keshari et al. (2005) have discussed the development of a rule-based system that guesses the part of speech of Nepali words in a raw corpus without the use of a lexicon. The system uses linguistic information at the morphological level and guesses the POS by looking at the affixes. The system has three modules, namely a lexicon maintainer module, a rule maintainer module and a POS guesser module. The modules interact with the lexicon database, the guessing rules and the corpus.

Upadhyaya et al. (2005) have developed a morphological analyzer for Nepali. Finite state technology is used in designing the analyzer, but it handles only surface forms and not the lexical level. It also lacks a detailed description of the various aspects dealt with in the process.

Aryal et al. (2006) have developed a system that produces parsed text with the maximum possible POS tag from a computational perspective. The process consists of three phases: tokenizing into syllables, morphological analysis and disambiguation.

Paudel et al. (2006) have developed a morphological analyzer along with a spell checker. Nepali words are categorized into two major types, namely declinable and indeclinable words, each of which is further grouped into subclasses. The morphological analyzer consists of a root word dictionary and a rule dictionary. In the main engine, the root dictionary and the rule dictionary interact with one another at various levels, and spell checking is done in a coordinated manner.

Bal (2007) is an attempt to analyze the structure of Nepali grammar from a computational perspective. Although the main focus of the paper is on the morphological and syntactic aspects of Nepali, it also gives space to the Nepali writing system. Despite being a novel start, it lacks detailed and deeper observation of the morphological structure of Nepali words.

Bal and Shrestha (2007c) discuss the design and implementation issues as well as the linguistic aspects of a morphological analyzer and a stemmer for Nepali. The stemming algorithms and their limitations are also discussed.

Bal and Shrestha (2007b) have presented a stemmer for Nepali. The stemmer is specially designed to assist morphological analysis and part-of-speech tagging. Based on the paradigm approach, it is capable of splitting words into meaningful units.

Aryal et al. (2007) have presented the techniques used for syntactic and semantic disambiguation in Nepali. Parsing is done with two components, namely a tokenizer and a morphological analyzer. The work covers both the syntactic and the semantic levels.

Prasain (2008) is an attempt to analyze Nepali basic verbs from a computational perspective. The basic verbs are classified into two broad categories in terms of their ending, viz. consonant-ending and vowel-ending. The former group has two types of ending, voiced and voiceless, and the latter has five: i-ending, u-ending, a-ending, ʌ-ending and vowel-sequence ending. The analysis uses finite state technology; since it is preliminary in nature, it does not cover all aspects of the verbs.

Shrestha (2008) has developed a system that disambiguates Nepali word senses from a natural language processing perspective. He has used a modified Lesk algorithm and WordNet. Nepali word sense disambiguation is completed in four stages, namely (i) tokenization, (ii) context selection, (iii) finding the senses of the target word and (iv) sense identification.

Hardie (2008) has analyzed Nepali postpositions from a corpus linguistic perspective, applying a collocation-based technique to the categorization of postpositions using the Nepali National Corpus. He has examined the most significant collocations of several postpositions for patterns that characterize postposition as a category or categories. Collocation with semantically coherent nouns, and collocation with words for which the postposition functions as a subcategorizer, are identified.

Hardie et al. (2009) have described the linguistic rationale underlying the part-of-speech tagset used for tagging the Nepali National Corpus. The implementation of the tagset in an automated tagging system is also outlined. This work further supports the classification of words into various groups for designing the finite state transducer for each group.

Prasain (2010) is an attempt to analyze Nepali basic nouns and implement them on the computer using a finite state approach. Various noun characteristics are analyzed: number, gender, form, honorificity, augmentative/diminutive and significant stem finals. On the basis of these features, Nepali basic nouns are grouped into fourteen classes. Each group of nouns is implemented following the xfst format (Beesley and Karttunen 2003): a lexc grammar is used to create the lexicon finite state transducer and the xfst interface to create the rule finite state transducer. Finally, these finite state transducers are composed into a single finite state transducer which can be used directly to analyze and generate the basic nouns.
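The composition of a lexicon transducer with a rule transducer, as described for Prasain (2010), can be pictured with a deliberately small Python sketch. It is not lexc or xfst code and does not reproduce the author's grammar: the two lexicon entries, the intermediate boundary symbol `^` and the names `rule_fst` and `compose` are all hypothetical.

```python
# Hypothetical miniature of a lexicon-plus-rules pipeline. The boundary
# symbol "^" and the entries are assumptions, not the author's grammar.
LEXICON = {
    ("घर+NOUN", "घर"),
    ("घर+NOUN+ABL", "घर^बाट"),  # lexical string -> intermediate string
}


def rule_fst(intermediate):
    """Stand-in for a rule transducer: delete the morpheme boundary symbol."""
    return intermediate.replace("^", "")


def compose(lexicon, rule):
    """Compose the two mappings into one set of lexical/surface pairs."""
    return {(lexical, rule(mid)) for lexical, mid in lexicon}


FST = compose(LEXICON, rule_fst)


def generate(lexical):
    """Look up surface forms for a lexical string in the composed relation."""
    return sorted(s for l, s in FST if l == lexical)


def analyze(surface):
    """Look up lexical strings for a surface form in the composed relation."""
    return sorted(l for l, s in FST if s == surface)
```

After composition the intermediate level disappears, so a single relation answers both analysis and generation queries. In a real system the rule component would also handle phonological and orthographic alternations rather than just boundary deletion.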






c. Nepali language and related NLP works

Bandhu (1971) is a computer concordance of a Nepali spoken corpus. The corpus has been morphologically analyzed and the forms are segmented according to their functions. Most of the data were collected from Palpa district (January, 1971), Syangja and Pokhara (January, 1970) and Gorkha district (April-May 1971). This corpus is now available at [http://cqpweb.lancs.ac.uk/bandhu/index.php], together with information such as title, speakers list, text type and POS tags based on the Nelralec tagset.

Gurung and Khatiwada (2007) is an attempt to analyze the Devanagari script used in the Nepali writing system with respect to collation sequence. The study also discusses the development process of a lexicon for Nepali. To make the lexicon computer readable, the XML format is used. Each entry is provided with pronunciation, syllable break, part of speech, meaning and synonyms. The Hunspell framework is used so that the lexicon can be used in the spellchecker in OpenOffice.

Bista et al. (2007) have presented the Nepali lexicon development process for use in a spell checking system for Nepali. The paper also reports on the collection of words, the tools used, and the problems and issues faced during lexicon development. The architecture consists of various modules such as Lexicon database, Lexicon maintainer, Rules database, Rule maintainer, Corpus and Rule Interpreter. These modules interact with one another as required during the process of developing the lexicon.

Bal and Shrestha (2007a) have developed a simple spellchecker to be used in Nepali OpenOffice.org. The head words are stored in one file and the affix rules in another.

Bal et al. (2007) give a general overview of the technical and linguistic research and development being carried out for the development of Nepali spellchecker 1.1. OO.org using HunSpell framework with Unicode support. The dictionary is populated with stems of nouns, verbs, pronouns, adjectives, adverbs, conjunctions, interjections, particles, postpositions and compound words. The possible word forms are generated by applying the affix rules.

Hyoju and Shrestha (2007) present an overview of the contemporary Nepali dictionary based on the Nepali National Corpus. The components incorporated in this dictionary are the headword, part of speech, phrase category, guide word, pragmatics, definition, example, usage note, various form, suppletive form, extra information, phrase, idiom, compound, proverb and cross-reference.

Gurung and Thapa (2007) have described the process of building a text-to-speech system for Nepali from a speech processing perspective. The multilingual speech synthesis system known as Festival has been used. A component of Festival called Festvox provides a framework for building synthetic voices. Corpus-based rule generation and statistical modeling methodologies are used. The sentences for building the speech database are taken from the Nepali National Corpus. The normalized text is fed to a module where the waveform is generated using letter-to-sound rules, concatenation of diphones and pitch extraction.

Yadava et al. (2008) describe the construction of the 14-million-word Nepali National Corpus (NNC) (http://cqpweb.lancs.ac.uk/nncv2/index.php), which includes a spoken corpus, a written corpus, a parallel corpus and a speech corpus. The NNC is encoded as Unicode text, marked up in CES-compatible XML, and follows the FLOB and Frown frameworks.



d. NLP works in some selected languages

Megerdoomian (2003) provides a detailed description and analysis of Persian inflectional morphology from a computational perspective. The morphological analyzer designed for Persian uses a unification-based grammar with typed feature structures. The main tasks are the linguistic analysis and its implementation in the Samba Grammar for developing the morphological analyzer. The surface form is formally represented as a regular expression. The morphological features are specified as a feature structure that contains the lexical and inflectional information provided by the rule. These features describe how the stem and the morphological features of the affixes are combined. Hussain (2004) has developed a finite-state morphological analyzer for Urdu. She has described general morphological concepts such as morpheme, root, base, affix, inflection, derivation and causation. The analysis follows the two-level morphology formalism using finite-state transducers. Makedonski (2005) presents a finite state approach to the inflectional morphology of Turkish nouns. A finite state transducer is used for analyzing the nouns, and the implementation is done in the Xerox Finite State Toolbox in two levels, namely the lexicon and the rule component. Ziai (2006) has developed a finite state morphological analyzer for Persian simple verbs. The system covers the full inflectional paradigm of modern Persian for both regular verbs and a large number of irregular verbs.

Islam (2007) describes the inflectional morphology of Bangla verbs and nouns and also presents the rules, lexicons and grammar for Bangla morphological analysis. The analysis is based on PC-KIMMO, a two-level morphological analyzer. Dasgupta et al. (2007) have discussed the inflectional behavior of compound words in Bangla from a computational perspective. Bangla compound words may retain the inflectional suffixes on both constituents, and the resultant compound may be further inflected as a single word. Khan and Fatima (2007) investigate the inflectional properties of Pashto nouns from a finite state perspective. The main focus is on the classification of Pashto nouns, which are analyzed with a finite state transducer.

Bharati and Kulkarni (2007) have discussed the importance of Paninian grammar from the perspective of information coding. The study applies finite state technology. It highlights the theoretical and practical aspects of computational linguistics concerning Hindi and Sanskrit, and the application of the Paninian approach to English. The complexity of word formation in Sanskrit is captured by finite state automata, and an analyzer for Sanskrit has been developed which provides morphological analyses as output. Bögel et al. (2007) discuss a number of issues, in particular potential ambiguity and non-concatenative morphology. Their approach treats both Urdu and Hindi via a cascade of FSTs that transliterates the two very different scripts into a common ASCII transcription system; the analysis is implemented with the Xerox finite state toolkit. Shrivastava et al. (n.d.) have developed a rule-based part-of-speech tagger for Hindi with a stemmer and a morphological analyzer. The stemmer and morphological analyzer are integrated with the Hindi WordNet, Hindi Generation and Question Answering projects.



1.5 Significance of the study

The literature review (1.3) makes it clear that the Nepali language has primarily been described from two approaches: notional and descriptive. Very few and only sporadic works have been done from a formal and computational perspective. The number of natural language processing works is growing in many languages of the world, and in South Asian languages in particular. In this context, the Nepali language is lagging behind. Therefore, there is an urgent need to develop computational models and implement them so that computer applications such as a spell checker, grammar checker, part-of-speech tagger and syntactic parser can be developed. When applications for the Nepali language are developed, end users who seek information stored in Nepali texts will especially benefit. In this regard, this work can be foundational and very useful. The computational analysis of Nepali morphology would be a central and essential component for the development of other Nepali language processing applications.

Further analysis of linguistic levels such as syntax, semantics and pragmatics can also be done taking this work as the reference point. Therefore, this study can be of great importance in itself and useful for both academicians and practitioners of natural language processing.



1.6 Research methodology

The methodology consists of data collection, classification, analysis and implementation.

Data collection: The study is primarily based upon secondary data for morphological analysis; however, the example sentences used for illustration are elicited. The secondary data, especially word-forms, have been taken from the Nepali National Corpus developed by Bhashanchar Project (Nelralec) and cross-checked against 'Brihad Nepali Sabdakosh' (Pokharel et al., 2040 VS). Being a native speaker of Nepali, I have also used my intuition for cross-checking and analyzing the data.

Classification: The unique word-forms are classified into different categories such as nouns, verbs, adjectives, etc., and further subdivisions have been made according to their morpho-syntactic behavior.

Analysis: The classified data are analyzed into stems and affixes for each category, and the inflectional and derivational processes have been treated separately. Then phonological rules have been identified and formalized.

Implementation: Finite state transducers for each group of words have been created following the concept of ‘two-level morphology’. Then, a computational model for Nepali morphology has been implemented using the Xerox Finite State Tool (XFST) described in Beesley and Karttunen (2003). See Chapter 2 for a detailed description of the theoretical framework.



1.7 Limitations of the study

This study deals mainly with the written form of words in Nepali. Only the inflectional and derivational aspects of words are covered. Compounding and reduplication are also morphologically important, but they are not dealt with in this study. Although a number of models and approaches for computational analysis exist in the literature, only the finite state approach is employed here. Moreover, only representative words have been considered in the implementation of the analysis.



1.8 Organization of the study

This study is structured into eight chapters. Chapter 1 introduces the concept of computational analysis of word-level categories following two-level morphology. It also deals with the statement of the problem, objectives, review of literature, research methodology and justification of the study. Chapter 2 presents the theoretical framework for the study. Chapter 3 looks into the general characteristic features of nominals in Nepali, analyzes them computationally and presents a finite state transducer for each of them. That chapter also deals with the phonological rules in the form of regular expressions, which are implemented later on. Chapter 4 discusses the various features possessed by verbs and presents the finite state transducers and phonological rules involved in verb morphology. Chapter 5 analyzes adverbs, conjunctions, case markers, postpositions, particles and interjections, with a separate finite state transducer for each category. Chapter 6 presents the analyses of the various derivational systems in Nepali, together with their finite state transducers and phonological rules. Chapter 7 implements all the analyses done in the preceding chapters. Chapter 8 summarizes the study. Finally, a number of annexes are provided.






CHAPTER 2 THEORETICAL FRAMEWORK



2.0 Outline

This chapter presents the theoretical framework employed in this study. It consists of eight sections. Section 2.1 deals with the general idea of computational concepts. Section 2.2 deals with the concept of regular expression. Section 2.3 briefly discusses finite state technology. Section 2.4 briefly introduces regular languages. Section 2.5 introduces finite state machines, including finite state automata, finite state transducers and some relevant and important operations that can be performed on finite state transducers. Section 2.6 deals with the use of finite state transducers in the computational morphology of natural language. It also briefly shows the relation among regular expression, regular language and finite state network. Section 2.7 discusses the basic concepts and application of the Xerox finite state tool (xfst) used in this study to develop a computational model of Nepali morphology that can serve as a morphological analyzer. Section 2.8 presents the summary.



2.1 Computational concepts

Finite state morphology has been an important and active field of research and development for a number of decades. A Natural Language Processing (NLP) system remains incomplete without morphological analysis. Words are the units of syntax, and the meaning of words is the basis of semantics. In fact, the input to syntactic and semantic analysis comes from morphology. Therefore, the morphological analysis of a natural language is important and fundamental. Relating word forms and detecting the structure of word forms are what morphological analysis is all about. The task of relating a given form to a canonical form is called lemmatization. Lemmatization and decomposition into parts each have their uses, but they share some common processes. The task of morphological analysis, then, is to take forms and relate them to other word forms, at the same time deriving featural information about the form (Roark and Sproat 2006).

It is customary in discussions of morphology to distinguish inflectional from derivational morphology in terms of the kinds of features each of them encodes. This distinction is not relevant for the discussion here. Rather, we concentrate purely on the computational mechanisms for performing morphological analysis and the way these mechanisms represent the two kinds of linguistic information. The formal properties of morphological operations, viz. the syntagmatic combination of morphological elements and the paradigmatic relation between forms, are the crucial aspects (Roark and Sproat 2006). To realize this objective, one needs to understand some mathematical and computational notions and operations, which are introduced in the subsequent sections.

2.2 Regular expression

Regular expressions are the standard notation for characterizing text sequences, and they are used for specifying text strings when searching text (Jurafsky and Martin 2000:48-59). They are widely applied in natural language processing activities such as information retrieval, word processing, computation of frequencies from corpora and other such tasks. A regular expression is a formula in a special language that is used for specifying classes of strings. For the purpose of most text-based search techniques, a string is any sequence of alphanumeric characters or symbols. The set of strings denoted by a regular expression follows a pattern, which is in effect the value of the algebraic formula (Siddiqui and Tiwari 2008:54-9). Regular expressions are written between slashes to distinguish them from ordinary sequences of characters. Table 2.1 lists some of the simplest regular expressions and their matches in text.



Table 2.1: The sample regular expressions

Regular expression    Example pattern matched
/a/                   There is a dog.
/book/                I have read many books.
/घर/ 'house'          तिमी घरमा बस। 'You stay at home.'
/म/ '1SG'             मलाई भोक लाग्यो। 'I am hungry.'






The simple regular expressions in Table 2.1 are used for searching text. Each regular expression in the left column finds the matching text in the example in the right column. Formally, regular expressions are an algebraic notation for characterizing a set of strings. Thus, they can be used to specify search strings as well as to define a language in a formal way. Characters are grouped by putting them between square brackets. For example, the pattern /[कखगघङ]/ matches any one of the characters क, ख, ग, घ and ङ. The square brackets specify a disjunction of the characters enclosed in them. A dash '-', which specifies a range, can be used when the set of characters within the brackets is very large. For example, [a-z] specifies any lowercase letter of the Latin alphabet and [0-9] specifies any digit from 0 to 9. Some important operators used in regular expressions, with patterns and their meanings, are listed in Table 2.2.



Table 2.2: Some operators used in regular expressions

Operator    Pattern        Meaning
[]          [abc]          a or b or c
[-]         [A-Z]          any one of the capital letters
^           [^a-z]         not a lowercase letter
*           ab*c           zero or more bs
.           a.c            any character between a and c
?           ab?c           zero or one b between a and c
+           ab+            one or more bs
|           a|b            either a or b
()          appl(y|ies)    apply or applies
{n}                        n occurrences
{n,m}                      from n to m occurrences
\n                         a new line
\t                         a tab

Source: Jurafsky and Martin 2000



The operators illustrated in Table 2.2 are used for creating the complex regular expressions.
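The operators in Table 2.2 can be tried out directly in most programming languages. The following is a minimal sketch in Python using its standard `re` module; the small corpus and the helper name `search` are assumptions made only for this illustration and are not part of the study.

```python
import re

# A tiny illustrative corpus (hypothetical sentences, not taken from the NNC).
corpus = [
    "I have read many books.",
    "He applies every year; I apply too.",
    "abc ac abbbc",
]

def search(pattern, lines):
    """Return every substring of `lines` that is matched by `pattern`."""
    compiled = re.compile(pattern)
    return [m.group(0) for line in lines for m in compiled.finditer(line)]

print(search(r"book", corpus))           # a plain sequence of characters
print(search(r"appl(?:y|ies)", corpus))  # grouping with disjunction
print(search(r"ab*c", corpus))           # zero or more b
print(search(r"ab?c", corpus))           # zero or one b
print(search(r"ab+c", corpus))           # one or more b

# Square brackets also work for Devanagari characters.
print(search(r"[कखगघङ]", ["घर", "कलम"]))
```

Note that Python writes patterns as ordinary strings rather than between slashes, and that `(?:...)` is Python's notation for a group whose content is not captured separately.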






Simple regular expressions, i.e., alphanumeric characters, and the various operators can be combined to construct a very complex regular expression as required.¹ It is now clear that a regular expression search requires a pattern to search for and a corpus of texts to search through. A regular expression search function then searches through the corpus and returns all the texts that match the pattern provided. Thus, a regular expression specifies a language according to its pattern, and the complexity of that language depends on the complexity of the pattern used to specify it. However, a regular expression is more than just a convenient metalanguage for text searching. Firstly, a regular expression is one way of describing a finite state automaton; that is, a regular expression can be compiled into a finite state automaton. Secondly, it is a way of characterizing a particular kind of formal language called a regular language (see 2.6).



¹ A detailed description of how a complex regular expression is created is beyond the scope of this study.

2.3 Finite state technology

In order to understand how to build a linguistic application, we first need to be acquainted with the basics of how a finite-state machine works. A finite-state machine is a network of states with one start state and one or more final states. Transitions between states are possible only if the required input is recognized. A path is a sequence of transitions over arcs to a particular state. In computational morphology, a path is a sequence of symbols equivalent to a word in natural language. The technology that utilizes finite-state networks in the process of creating an application is called finite state technology. The basic concept behind it is a set of states with different properties and a set of arcs that connect these states. The arcs have a direction and an input symbol; that is, each state has a set of outgoing arcs with their respective input symbols. The states and arcs together form a network.

As Chomsky (1957) stated, finite state devices are limited in generative capacity, i.e., in the power to accurately describe all natural language phenomena. Therefore, finite state technology was considered inadequate by linguists at the earlier stages of its development. One reason was that it is an abstract mathematical device, so it was believed not to have the descriptive power needed for natural language analysis. The second reason was that, in its developing stages, it was not powerful enough to account for the linguistic phenomena. Later on, however, it proved to be quite useful in modeling the parts of languages that can be considered finite and regular. Natural language shows this quality of being regular and finite at least in its parts, if not as a whole. Various tasks such as POS disambiguation, tokenization and shallow parsing are successfully accomplished using finite state technology. Morphology is a core component of natural language, and it can be considered more or less finite and regular. Thus, the most significant application of finite state technology has been computational morphology, in which both analysis and generation of the morphology of a natural language are performed. Computational morphological analysis has been the basis for further kinds of natural language processing.



2.4 Regular language

As discussed in section 2.2, a regular expression denotes (or specifies, or describes) a regular language using a specified pattern. A formal language is a set of strings, each of which is composed of symbols from a finite symbol set called an alphabet (Jurafsky and Martin 2000:48). A regular language is a formal language, possibly an infinite set of finite sequences of symbols from a finite alphabet, that satisfies particular mathematical properties: the class of languages definable by regular expressions is exactly the same as the class of languages characterized by finite-state automata, and a language in this class is said to be a regular language (Jurafsky and Martin 2000:75). Table 2.3 illustrates a regular language from Nepali defined by a regular expression.



Table 2.3: Regular expressions and regular language

Regular expression: /खा.*छ.*/ 'kʰa.*tsʰ.*'

Regular language: खान्छ kʰantsʰʌ, खान्छन् kʰantsʰʌn, खाइन्छ kʰaintsʰʌ, खाएछ kʰaetsʰʌ, खान्छौँ kʰantsʰʌũ, खान्छु kʰantsʰu, खाइरहन्छ kʰairʌɦʌntsʰʌ, खानुहुन्छ kʰanuɦutsʰʌ, खाएछन् kʰaetsʰʌn, खाएँछु kʰaẽtsʰu, खान्छौ kʰantsʰʌu, खाँदैछस् kʰãdʌitsʰʌs, खान्छेस् kʰantsʰes, …

The language in the second row of Table 2.3 is a regular language denoted by the regular expression in the first row. All the strings (words) matched by the regular expression /खा.*छ.*/ 'kʰa.*tsʰ.*' have the same pattern.

2.5 Finite state machine

2.5.1 Finite state automata (FSA)

Finite state automata are abstract mathematical devices. As discussed in (2.3), they consist of states and arcs called transitions. Each FSA has exactly one initial state and one or more final states. Between the initial and final states, there can be any finite number of states, called intermediate states. The transitions are the connections between these states and are thus responsible for moving from one state to another. Conventionally, the states are represented as circles and the transitions between them as labeled arcs; an arrow indicates the initial state and double circles indicate the final states. Finite state automata are best understood as recognizers because they accept exactly the input strings that belong to their language and reject all others. For illustration, an automaton that accepts the Nepali strings घर ‘house’ and घरहरू ‘houses’ is visualized in Figure 2.1.



[State diagram with states q0, q1, q2, q3, q4 and q5]

Figure 2.1: A finite state automaton that accepts घर ‘house’ and घरहरू ‘houses’



The finite state automaton shown in Figure 2.1 recognizes words by reading the input string symbol by symbol and matching the symbols to the labels on the arcs. This FSA accepts घर ‘house’ and घरहरू ‘houses’ because these inputs lead to final states. No other strings are accepted by this FSA. Formally, a finite state automaton can be defined by the following five parameters:

Q: a finite set of N states q0, q1, …, qN−1

Σ: a finite input alphabet of symbols

q0: the start state

F: the set of final states, F ⊆ Q



δ(q,i): the transition function or transition matrix between states. Given a state q ∈ Q and an input symbol i ∈ Σ, δ(q,i) returns a new state q′ ∈ Q. δ is thus a relation from Q × Σ to Q (Jurafsky and Martin 2000:62).

For the automaton in Figure 2.1, Q = {q0, q1, q2, q3, q4, q5}, Σ = {घ, र, ह, र, ◌ू}, F = {q2, q5} and δ(q,i) is defined by the transitions in Table 2.4. A colon after a state number marks a final state, and ø marks a missing transition.

Table 2.4: The transition table for घर and घरहरू

         Input
State    घ    र    ह    र    ◌ू
0        1    ø    ø    ø    ø
1        ø    2    ø    ø    ø
2:       ø    ø    3    ø    ø
3        ø    ø    ø    4    ø
4        ø    ø    ø    ø    5
5:       ø    ø    ø    ø    ø
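The transition table can be turned into a recognizer in a few lines. The following Python sketch is only an illustration of the idea, not the implementation used in this study; the names `TRANSITIONS`, `START`, `FINAL` and `accepts` are assumptions. Because the dictionary is keyed on (state, symbol), the two columns for र in Table 2.4 become two separate entries.

```python
# Transition table of Table 2.4 as a dictionary: (state, symbol) -> next state.
# A missing key plays the role of an empty cell (ø) in the table.
TRANSITIONS = {
    (0, "घ"): 1,
    (1, "र"): 2,
    (2, "ह"): 3,
    (3, "र"): 4,
    (4, "\u0942"): 5,   # the vowel sign ◌ू
}
START = 0
FINAL = {2, 5}          # F = {q2, q5}

def accepts(word):
    """Read `word` symbol by symbol; accept only if a final state is reached."""
    state = START
    for symbol in word:
        state = TRANSITIONS.get((state, symbol))
        if state is None:       # no arc for this symbol: reject
            return False
    return state in FINAL

print(accepts("घर"))      # 'house'
print(accepts("घरहरू"))   # 'houses'
print(accepts("घरह"))     # stops in q3, which is not a final state
```

The sketch treats each Unicode code point as one input symbol, which matches the way the vowel sign ◌ू is given its own arc in Figure 2.1.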



2.5.2 Finite state transducer (FST)

The finite state automaton discussed in (2.5.1) accomplishes the task of recognizing strings in a regular language by providing a way to systematically explore all the possible paths through a machine (Jurafsky and Martin 2000:97-108). However, this exploration can only answer whether or not a string is present in the language. An automaton of this capacity cannot be used to show the relation between two or more languages. This problem can be solved by another version of the FSA called the ‘finite state transducer’. An FST is similar to an FSA; it consists of states and transitions with labeled arcs. However, in an FST a label can be a pair of symbols, which expresses the relation between two languages, instead of a single symbol. Whenever the input symbol matches one side of such a label, the arc is traversed and the symbol on the other side is produced as output (Makedonski 2005; Ziai 2006). Consider the example transducer in Figure 2.2, whose upper side is labeled घर+NOUN+SG and whose lower side is labeled घर. This FST transduces घर+NOUN+SG to घर and vice versa. That is, when the input घर is matched on the lower side, the output is घर+NOUN+SG.



[State diagram with states q0, q1, q2, q3 and q4; the arc labels that survive are +NOUN:0 and +SG:0]

Figure 2.2: A finite state transducer that transduces between घर ‘house’ and घर+NOUN+SG



Therefore, the important property of FSTs is that they are in principle bi-directional, meaning that they can also be applied backwards. Thus, the bi-directionality feature of the FST can be applied to the morphological analysis and generation.
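The bi-directionality can be made concrete with a small sketch. The Python code below models the transducer of Figure 2.2 as arcs carrying an upper and a lower label, with the empty string standing for the epsilon written as 0 in the figure. It is a simplified illustration under these assumptions (the names `ARCS`, `UPPER`, `LOWER` and `transduce` are hypothetical), not the XFST implementation.

```python
# state -> list of (upper label, lower label, next state); "" is epsilon.
ARCS = {
    0: [("घ", "घ", 1)],
    1: [("र", "र", 2)],
    2: [("+NOUN", "", 3)],
    3: [("+SG", "", 4)],
}
FINAL = {4}
UPPER, LOWER = 0, 1

def transduce(text, side):
    """Match `text` against the labels on `side` and collect the labels of
    the opposite side. side=UPPER generates, side=LOWER analyzes."""
    results = []

    def walk(state, rest, output):
        if not rest and state in FINAL:
            results.append(output)
        for arc in ARCS.get(state, []):
            read, write, nxt = arc[side], arc[1 - side], arc[2]
            if rest.startswith(read):   # always true for an epsilon label
                walk(nxt, rest[len(read):], output + write)

    walk(0, text, "")
    return results

print(transduce("घर", LOWER))           # analysis
print(transduce("घर+NOUN+SG", UPPER))   # generation
```

The same network serves both directions; only the side on which the input is matched changes. The recursive walk terminates here because the network has no cycles.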



2.5.3 Some important operations on FSTs

a. Union

The union of two or more networks is a network that contains all the elements of the constituent networks. No ordering is imposed on the arcs in the network, and the operation is denoted as [A|B], where A and B are the networks (Beesley and Karttunen 2003). To illustrate this operation, three FSTs for nouns, adjectives and adverbs are shown in the upper part of Figure 2.3. When the union operation is performed on these three FSTs, the result is a single FST, as illustrated in the lower part of Figure 2.3.

[Diagram: upper part, an FST for nouns, an FST for adjectives and an FST for adverbs; lower part, the single FST resulting from their union]

Figure 2.3: FST unioned from three FSTs for nouns, adjectives and adverbs

Figure 2.3 shows the process of unioning two or more finite state transducers into a single network which contains the elements of its constituent finite state transducers. The union operation is a useful and powerful way to create a large and complex FST from smaller FSTs for the morphological word classes. It therefore allows the work to be organized in a modular way.



b. Concatenation

Concatenation is an operation which places networks in sequence. One can concatenate two existing finite state networks with one another to build up new words productively or dynamically (Beesley and Karttunen 2003). This is usually denoted as [A B], where A and B are the networks. The operation is illustrated in Figure 2.4.

[Diagram: upper part, an FST for the verb बस् 'sit' and an FST for the purposive -न -nʌ and infinitive -नु -nu inflections; lower part, the FST resulting from their concatenation]

Figure 2.4: A finite state transducer concatenated from two FSTs above

The upper part of Figure 2.4 shows two FSTs, one for the Nepali verb बस् bʌs 'sit' and another for the purposive -न -nʌ 'PURP' and infinitive -नु -nu 'INF' suffixes. The lower part shows the FST resulting from concatenating the two, which can analyze and generate the purposive and infinitive forms of the verb बस् bʌs 'sit'. The concatenation operation is useful in handling verb stems together with inflectional and derivational suffixes.



c. Composition

Composition is an operation on two or more languages or relations. It is usually denoted as [A .o. B]. In effect, this operation matches the output of the first network against the input of the second and removes the intermediate level that they share (Beesley and Karttunen 2003). To exemplify the operation, the rule FST at the top left of Figure 2.5 changes ◌ो o into ◌ा a for the plural feature. An arbitrary symbol +MP is used to create the environment so that the rule can be applied to a specific group of nouns. Figure 2.5 illustrates the process of composition.

[Diagram: top left, the rule FST for ◌ो+MP -> ◌ा; top right, the FST for the plural form of केटो 'boy'; lower part, the FST resulting from their composition]

Figure 2.5: A finite state transducer from composing two FSTs above

At the top right of Figure 2.5, there is an FST for केटो 'boy' with the +MP symbol. When these FSTs are composed, the result is the single FST in the lower part of Figure 2.5, which changes ◌ो o into ◌ा a for the plural feature and also removes the arbitrary symbol +MP without any intermediate FSTs. In fact, the composition operation forms a sequence of transducers. It builds a cascade of FSTs into a single one by eliminating the common intermediate outputs, so it allows a modular structure. Because of this feature, composition has been very useful for composing rules with a lexicon to obtain the correct surface forms.



d. Intersection

The intersection of two networks is the set containing all the members that are common to both. It is usually denoted as [A & B]. This operation is not used as a major operation in this work, but it can be used to find the words common to two sets of words.

e. Subtraction

The subtraction of two networks is the set containing the elements that are in A but not in B. It is denoted as [A-B]. This operation is normally performed to find the words in one network which are not in another network.

f. Complementation or negation

The complement language of a network A is the set of all strings that are not in the language A. It is usually denoted as ~A. This operation is very useful for filtering words from a network.

Among the operations discussed above, union, concatenation and composition are used when implementing the analyzed morphological categories and rules to create a single lexical transducer, while the others are used elsewhere.
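The effect of these operations can be sketched with ordinary Python sets, treating a language as a finite set of strings and a relation as a set of (upper, lower) pairs. This is a simplified illustration under stated assumptions: real finite state networks may encode infinite sets, complementation is left out for that reason, and the example words other than घर, बस् and केटो, the upper-side tags of the lexicon entry, and all function and variable names are chosen only for this sketch.

```python
def concatenate(a, b):
    """[A B]: every string of A followed by every string of B."""
    return {x + y for x in a for y in b}

def compose(r, s):
    """[R .o. S]: pair the upper side of R with the lower side of S wherever
    the lower side of R equals the upper side of S; the shared level is dropped."""
    return {(u, w) for (u, v) in r for (v2, w) in s if v == v2}

# Union [A|B]: one word list built from three smaller ones.
nouns, adjectives, adverbs = {"घर"}, {"राम्रो"}, {"आज"}
all_words = nouns | adjectives | adverbs

# Concatenation: the stem बस् 'sit' followed by the purposive and infinitive suffixes.
forms = concatenate({"बस्"}, {"न", "नु"})

# Composition: a lexicon entry carrying the marker +MP, composed with the
# rule that rewrites the vowel sign o plus +MP as the vowel sign a.
KETO = "\u0915\u0947\u091f\u094b"                    # केटो 'boy'
lexicon = {(KETO + "+NOUN+PL", KETO + "+MP")}        # upper-side tags assumed

def plural_rule(s):
    return s.replace("\u094b+MP", "\u093e")

rule = {(lower, plural_rule(lower)) for (_, lower) in lexicon}
lexical_transducer = compose(lexicon, rule)

# Intersection [A & B] and subtraction [A-B].
a, b = {"घर", "कलम"}, {"घर"}
common, only_in_a = a & b, a - b

print(all_words, forms, lexical_transducer, common, only_in_a)
```

In the composed relation the intermediate string with +MP no longer appears: the lexical form is paired directly with the surface form केटा.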



2.6 FST in computational morphology

Johnson (1972) was the first to prove that finite state technology is appropriate and applicable to certain areas of computational linguistics. His most important discovery was that the languages and relations used in the traditional rewrite rules of generative phonology are essentially as powerful as the mathematical devices used in the finite state calculus. Kaplan and Kay (1994) showed with a detailed mathematical proof that every rewrite rule corresponds to a regular relation and can thus be modeled by means of an FST. The two-level morphology (lexical level and surface level) developed by Koskenniemi (1983) also uses a similar concept of an FST. One of the fundamental results of formal language theory (Kleene 1956) is the demonstration that finite-state languages are precisely the set of languages that can be described by a regular expression. Figure 2.6 shows the relation among FST, regular language and regular expression.



[Diagram: a regular expression denotes a language or relation and compiles into a finite-state network; the finite-state network encodes the language or relation]

Figure 2.6: The interrelation among language, regular expression and finite state network



As Figure 2.6 indicates, a regular expression denotes a set of strings (i.e., a language) or a set of ordered pairs (i.e., a relation). It can be compiled into a finite-state network that compactly encodes the corresponding language or relation, which may well be infinite (Beesley and Karttunen 2003:44). The language of regular expressions includes the common set of operators of Boolean logic and operators such as concatenation that are specific to strings. For each of the regular expression operators for finite-state languages, there is a corresponding operation that applies to finite-state networks and produces a network for the resulting language. A finite-state network for a complex language can be built by first constructing a regular expression that describes the language in terms of set operations and then compiling that regular expression into a network. This is, in general, easier than constructing a complex network directly, and in fact it is the only practical way for all but the most trivial infinite languages and relations.



2.7 Xerox finite state tool syntax (XFST)

XFST, used here for the computational analysis of Nepali morphology, was developed at the Xerox Research Center Europe and is described in Beesley and Karttunen (2003). It implements the standard finite state operations such as composition, concatenation, complement and union, as well as several innovative operations like replacement rules and local sequentialization. XFST includes:

lexc: a compiler for lexicons in the lexc language, which is specifically designed for handling morphotactics (the syntax of the morphemes) in natural languages (see 2.7.1), and

xfst: the core tool, providing an interface to the finite state calculus for building, accessing and manipulating finite state networks, and a compiler for regular expressions and replacement rules, which are essential for this work (see 2.7.1).

There are other run-time tools within it, but they are not relevant for this discussion. XFST defines transducers as relations between two languages. When an input is applied to a transducer downwards, the upper language can be thought of as the input and the lower language as the output. When the input is applied upwards, the roles switch: the input is applied on the lower side and the output comes from the upper side. Although this may seem a bit confusing, the terms upper and lower remain constant (Beesley and Karttunen 2003:85-202). In the definition of a lexical transducer, the upper-side language describes the lexical (underlying) forms of the language to be analyzed and the lower-side language contains the actual surface forms as written. XFST has many operators for analyzing natural languages; some important and essential ones are discussed as follows:

(i) “ : ” The crossproduct operator relates every symbol in the language on its left side to every symbol on its right side. For example, in [घर+NOUN+PL:घरहरू] the square brackets indicate grouping, the language घर+NOUN+PL is on the upper side of the transducer and घरहरू is on the lower side of the transducer.

(ii) “ -> ” The left-to-right replacement operator is an extended regular expression operator that provides for convenient formulation of rewrite rules in XFST. For example, a rule that replaces every x by a y might be written as x -> y. Every symbol that is not an x is left unchanged. In generative phonology, rewrite rules have a context part which causes the rule to apply only if the context is satisfied. XFST provides for that also. For example, if all xs are to be replaced by ys but only in the environment where x is followed by z, one could formulate the rule as x -> y || _ z. The double bars indicate the beginning of the context.

(iii) “ $ ” The ‘contains’ operator denotes the language of all the strings that contain a given string or variable. For example, $a means the language of all strings containing a. This operator can be used to operate on a specific set of strings in the network.

(iv) “ | ” The union operator creates the union of two languages or relations. For example, if A and B are two languages, A|B means the language formed from the union of A and B (see 2.5.3).

(v) “ .o. ” The composition operator is very powerful for combining two or more transducers into one. Thus, a very complex network can be built by use of this operator. The main use of this operator is to compose the rule FSTs with the lexical FST (see 2.5.3).



(vi) “ # ” This symbol is used for two purposes: in lexc files to indicate the final state (see 2.7.1), and in replacement rules, where it is surrounded by dots, to indicate the word boundary (see 2.7.2).

2.7.1 LEXC grammar

A lexc grammar consists of at least one lexicon (called Root). A lexicon contains a list of entries, where each entry has a continuation class. Each lexicon corresponds to a state. Each entry corresponds to a labeled arc that can be traversed only if the entry is successfully matched against the input string. The entry can be a regular expression. If an entry is matched, the arc is traversed and the continuation class, which is another state, is reached. The procedure is repeated until a final state is reached, denoted by the special continuation class # (Beesley and Karttunen 2003:203-278). The structure of the lexc grammar and its components are presented in Figure 2.7 and discussed below.
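The way entries and continuation classes chain together can be sketched outside lexc as well. The Python code below is a rough model under stated assumptions, not lexc syntax: each lexicon is a list of (upper, lower, continuation) entries, the sub-lexicon name `Number` and the function name `expand` are hypothetical, and only the single stem घर is included.

```python
# Each lexicon maps to a list of (upper, lower, continuation class) entries.
# "#" is the reserved continuation class that marks the end of a word.
LEXICONS = {
    "Root":   [("घर", "घर", "Number")],
    "Number": [("+NOUN+SG", "", "#"),
               ("+NOUN+PL", "\u0939\u0930\u0942", "#")],   # plural suffix हरू
}

def expand(lexicon="Root", upper="", lower=""):
    """Follow continuation classes from `lexicon` until "#" is reached and
    yield the (upper, lower) pair collected along each path."""
    if lexicon == "#":
        yield upper, lower
        return
    for up, low, continuation in LEXICONS[lexicon]:
        yield from expand(continuation, upper + up, lower + low)

pairs = list(expand())
for lexical, surface in pairs:
    print(lexical, surface)
```

Each path from Root to # corresponds to one word: the collected upper labels give the lexical form with its tags, and the collected lower labels give the surface form.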



a. Multichar_Symbols
First of all, a set of multicharacter symbols is declared, such as +NOUN +MASC +HUM +SG +PL +DIM +ERG +DAT +LOC +ABL +NOM +GEN +POP +OBL +FEM, where each sequence of characters in the set is treated as a single atomic symbol. These symbols are primarily used as tags to indicate various grammatical categories and features. They are attached on the upper side, which is visible only when morphological analysis is performed; on the lower side, each multicharacter symbol corresponds to the respective suffix or to epsilon. Sometimes additional tags are also used to create an environment for the replace rules, and they are removed at the end. Figure 2.7 shows the structure of the lexc grammar.
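To make the notion of an atomic multicharacter symbol concrete, the Python sketch below splits a lexical string into symbols, preferring a declared multicharacter symbol wherever one matches. It is an illustration only: the set MULTICHAR and the function tokenize are hypothetical names, and the longest-match strategy is an assumption made for this sketch rather than a statement about how lexc is implemented.

```python
# Declared multicharacter symbols (an illustrative subset).
MULTICHAR = {"+NOUN", "+MASC", "+FEM", "+SG", "+PL", "+OBL"}

def tokenize(text, multichar=MULTICHAR):
    """Split `text` into symbols, treating declared tags as single symbols."""
    # Try longer symbols first so that a longer tag wins over a shorter one.
    symbols = sorted(multichar, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for sym in symbols:
            if text.startswith(sym, i):
                out.append(sym)
                i += len(sym)
                break
        else:
            # No declared symbol starts here: emit a single character.
            out.append(text[i])
            i += 1
    return out

tokens = tokenize("ram+NOUN+MASC")
print(tokens)
```

Without the declaration, +NOUN would be read as five separate characters; with it, the tag behaves as one arc label in the network.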






[Figure 2.7, top to bottom: optional declarations (Multichar_Symbols); the body, consisting of LEXICON ROOT followed by further lexicons (LEXICON X, ...); END.]

Figure 2.7: The structure of lexc grammar (Beesley and Karttunen 2003:205)



In Figure 2.7, the lexc grammar begins with the optional components, i.e., the multicharacter symbols and declarations, followed by the body, which consists of a list of lexicons.



b. Definitions
After the Multichar_Symbols section, an optional definitions section can also appear in the lexc file. It consists of the keyword Definitions followed by one or more variable assignments.



c. Lexicon
As discussed in (2.7.1), the lexicon root corresponds to the start state of the network to be compiled. There can be any number of other lexicons, but they must follow the lexicon root. Each entry consists of two parts: a form and a continuation class. The form can be a formative (i.e., a stem) or a regular expression, and the continuation class refers to the next sub-lexicon to be followed. The end of the word, or final state, is indicated by the reserved symbol #. The lexicon is designed according to the format shown in Figure 2.7 above. A sample lexicon of nouns in Nepali is given below; it accounts for both o-ending and non-o-ending nouns, including number, gender, honorificity and augmentative/diminutive features. For illustration, only one stem of each noun type and the markers for the stated features are included.2 The multicharacter symbols are declared first.3

Multichar_Symbols +NOUN +MASC +FEM +OBL +PL +SG +DIM +VOC +HON +MP +FE +PLACE +PROPER ^b



LEXICON ROOT
Nouns;

LEXICON Nouns
!! Type 1a Nouns:
केटो      inflection_1a;     ! keto 'boy'
!! Type 1b Nouns:
मुसो      inflection_1b;     ! muso 'mouse'
!! Type 1c Nouns:
डालो      inflection_1c;     ! d̺alo 'basket'
!! Type 1d Nouns:
फोटो      inflection_1d;     ! pʰoto 'photo'
!! Type 21a Nouns:
काका      inflection_21a;    ! kaka 'uncle'
!! Type 21b Nouns:
नाति      inflection_21b;    ! nati 'grandson'
!! Type 21c Nouns:
बाघ       inflection_21c;    ! bagʰ 'tiger'
!! Type 21d Nouns:
बिष्ट     inflection_21d;    ! bist̺ 'a surname'
!! Type 22a Nouns:
दाइ       inflection_22a;    ! dai 'elder brother'
!! Type 22b Nouns:
दिदी      inflection_22b;    ! didi 'elder sister'
!! Type 22c Nouns:
राम       inflection_22c;    ! ram 'Ram'
!! Type 22d Nouns:
सीता      inflection_22d;    ! siːta 'Sita'
!! Type 22e Nouns:
खेत       inflection_22e;    ! kʰet 'farm land'
!! Type 22f Nouns:
पोखरा     inflection_22f;    ! pokʰʌra 'Pokhara'

LEXICON inflection_1a
+NOUN+MASC+SG:0       #;
+NOUN+MASC+PL:+MP     #;
+NOUN+MASC+OBL:+MP    #;
+NOUN+MASC+HON:+MP    #;
+NOUN+MASC+VOC:+MP    #;
+NOUN+FEM:+FE         #;

LEXICON inflection_1b
+NOUN+MASC+SG:0       #;
+NOUN+MASC+PL:+MP     #;
+NOUN+MASC+OBL:+MP    #;
+NOUN+FEM:+FE         #;

LEXICON inflection_1c
+NOUN+SG:0            #;
+NOUN+PL:+MP          #;
+NOUN+OBL:+MP         #;
+NOUN+DIM:+FE         #;

LEXICON inflection_1d
+NOUN+SG:0            #;
+NOUN+PL:+MP          #;
+NOUN+OBL:+MP         #;

LEXICON inflection_21a
+NOUN+MASC:0          #;
+NOUN+FEM:◌ी          #;

LEXICON inflection_21b
+NOUN+MASC:0          #;
+NOUN+FEM:नी          #;

LEXICON inflection_21c
+NOUN+MASC:0          #;
+NOUN+FEM:^bि◌नी      #;

LEXICON inflection_21d
+NOUN+MASC:0          #;
+NOUN+FEM:◌ेनी        #;
+NOUN+FEM:ि◌नी        #;

LEXICON inflection_22a
+NOUN+MASC:0          #;

LEXICON inflection_22b
+NOUN+FEM:0           #;

LEXICON inflection_22c
+NOUN+PROPER+MASC:0   #;

LEXICON inflection_22d
+NOUN+PROPER+FEM:0    #;

LEXICON inflection_22e
+NOUN:0               #;

LEXICON inflection_22f
+NOUN+PLACE:0         #;

END

2 Classification of nouns is based on their characteristic features. However, the names of the noun classes and the sub-lexicons used in the lexc file are purely arbitrary, as they are removed during the compilation process.

3 The multicharacter symbols +MP, +FE and ^b are used in the lower language to create an environment for the rules, and they are also removed later.



2.7.2 XFST interface
The xfst part of this system is mainly concerned with realization, i.e., surface forms, and with phonological alternation rules. This component takes as input the output of the lexc transducer (lexc grammar), which has stems with grammatical features labeled with tags, and passes it through additional rules to obtain acceptable surface forms. The xfst component compiles the lexc grammar into an FST and the rules into rule FSTs, using the lexc files and rule files respectively. Various other operations are also performed through xfst. As demonstrated in Figure 2.8, the separate lexicons and rules are first compiled and then composed into a single FST. The lexical level (i.e., upper language) consists of the citation form of a word and a sequence of tags indicating various features. The surface level (i.e., lower language) consists of the actual spelling of the word. The process is not entirely straightforward, however: when a word is formed by concatenating formatives through the sub-lexicons in the lexc file, the concatenated spelling may differ from the actual spelling. Therefore, replace rules are applied to the lower language so that the final output is well-formed. The orthographic rules for each FST are formulated and applied using an xfst script. Sometimes similar rules are applied to the upper language as well, to change the sequence of tags. The entire architecture for creating a finite state transducer that can be used as a morphological analyzer for Nepali is illustrated in Figure 2.8.
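The division of labour between the lexicon and the replace rules can be imitated in a few lines of Python. This is a sketch under stated assumptions, not the xfst implementation: regular-expression substitution stands in for the replace rules, a simple loop stands in for composition, and the names RULES, lexicon_lower and apply_rules are hypothetical. The auxiliary markers +MP and +FE are the ones used in the sample lexicon above; vowel signs are written as Unicode escapes so that the code points are unambiguous.

```python
import re

# Devanagari vowel signs for o, a and iː.
O, AA, II = "\u094b", "\u093e", "\u0940"

# Ordered replace rules on the lower (surface) side: (pattern, replacement).
RULES = [
    (re.escape(O + "+MP") + "$", AA),  # o-sign plus +MP becomes a-sign at word end
    (re.escape(O + "+FE") + "$", II),  # o-sign plus +FE becomes iː-sign at word end
]

def lexicon_lower(stem, marker):
    """Lower-side string produced by the lexicon before any rule applies."""
    return stem + marker

def apply_rules(lower):
    """Pass the string through every rule in order, like a rule cascade."""
    for pattern, replacement in RULES:
        lower = re.sub(pattern, replacement, lower)
    return lower

KETO = "\u0915\u0947\u091f" + O                      # keto 'boy'
plural = apply_rules(lexicon_lower(KETO, "+MP"))     # keta
feminine = apply_rules(lexicon_lower(KETO, "+FE"))   # keti
singular = apply_rules(lexicon_lower(KETO, ""))      # keto, left unchanged
print(singular, plural, feminine)
```

The point of the design is that the lexicon only concatenates, while every spelling adjustment is stated once, as a rule, and applies to all stems that meet its context.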






[Figure 2.8: the lexicon component is compiled into a lexicon FST and the rule component is compiled into a rule FST; the two FSTs are then combined by composition into a single lexicon FST.]

Figure 2.8: xfst interface can compile lexicon and rule and compose them into single FST (Karttunen 2000)

In Figure 2.8, the lexicon is compiled into a lexicon FST and the rules are compiled into a rule FST. These two FSTs are composed into a single FST with the help of the xfst interface. All these functions and operations are carried out systematically through a single script file, which defines the variables for the rules and compiles them into an FST. The script also compiles the lexicon into an FST and ultimately composes both into a single FST. A sample of the entire process with nouns in Nepali is illustrated below.



!! xfst script file
clear
define cons क|ग|घ|ङ|च|छ|ज|झ|ञ|ट|ठ|ड|ढ|ण|त|थ|द|ध|न|प|फ|ब|भ|म|य|र|ल|व|स|ष|श|ह;
define liquids र|ल;
define change [
     [◌ो %+MP -> ◌ा || _ .#.]
.o. [◌ो %+FE -> ◌ी || _ .#.]
.o. [◌ा -> [ ] || _ ?* %^b ि◌ न ◌ी .#.]
.o. [◌ा -> [ ] || _ ◌े न ◌ी .#.]
.o. [◌ा -> [ ] || _ ि◌ न ◌ी .#.]
.o. [◌ा -> [ ] || _ ◌ी .#.]
.o. [◌ी -> ◌् || liquids _ न ◌ी .#.]
.o. [◌ी -> ि◌ || _ न ◌ी .#.]
.o. [[. .] -> ◌् || cons _ न ◌ी .#.]
.o. [य ◌ा -> [ ] || _ न ◌ी .#.]
.o. [◌ी -> [ ] || _ ि◌ न ◌ी .#.]
.o. [◌ू -> ◌ु || _ ?* %^b ि◌ न ◌ी .#.]
.o. [◌ी -> [ ] || _ %^b ि◌ न ◌ी .#.]
.o. [%^b -> [ ] ]
.o. [ि◌ -> [ ] || ◌ु _ न ◌ी .#.]
];



read lexc

सुब्बेनी/सुब्बिनी subb-eniː/subb-iniː 'Female Subba'

The majority of non-o-ending human nouns that have corresponding feminine forms are kinship terms, family names (surnames), and terms for caste, social status and profession.

(4) a. बेहुलो सुन्दर छ।
       beɦulo sundʌr tsʰʌ
       bridegroom handsome be.NPST.3SG.MASC
       'The bridegroom is handsome.'

    b. बेहुली कुरूप छे।
       beɦuliː kurup tsʰe
       bridegroom.FEM ugly be.NPST.3SG.FEM
       'The bride is ugly.'






    c. काकाले घर बनाए।
       kaka-le gʰʌr bʌn-a-e
       uncle-ERG house make-CAUS-PST.3SG.HON
       'The uncle made a house.'

    d. काकी स्कुलमा काम गर्नुहुन्छ।
       kaki iskul-ma kam gʌr-nu hun-tsʰʌ
       uncle.FEM school-LOC work do-INF be-NPST.3SG.MASC
       'The aunt works in the school.'

    e. नाति भात खान्छ।
       nati bʰat kʰa-ntsʰʌ
       grandson.MASC rice eat-NPST.3SG.MASC
       'The grandson eats rice.'

    f. नातिनी स्कुल जान्छे।
       nati-niː iskul dza-ntsʰe
       grandson-FEM school go-NPST.3SG.FEM
       'The granddaughter goes to school.'

Table 3.4 illustrates the morphological gender system in Nepali.

Table 3.4: Morphological gender
Masculine          Gloss           Feminine                                  Gloss
छोरो tsʰoro         son             छोरी tsʰoriː                               daughter
बेहुलो beɦulo       bridegroom      बेहुली beɦuliː                             bride
काका kaka          uncle           काकी kakiː                                aunt
कुमार kumar        lad             कुमारी kumari                              lass
नाति nati          grandson        नातिनी natiniː                             granddaughter
बाघ bagʰ           tiger           बघिनी bʌgʰiniː                             tigress
सुब्बा subba        Subba (male)    सुब्बेनी/सुब्बिनी subb-eniː/subb-iniː        Subba (female)

Table 3.4 shows o-ending and non-o-ending nouns in Nepali that inflect for feminine gender in various ways.



d. Form
Nouns in Nepali show two forms morphologically: direct and oblique. In traditional grammars, the nouns that appear as citation forms are direct and those that appear with postpositions are oblique. O-ending nouns such as बिरालो biralo 'cat' in (5a) change to a-ending, as बिराला birala in (5b), to take the oblique form. This happens only in o-ending nouns when they are followed by postpositions. Non-o-ending nouns, such as रूख ruːkʰ 'tree' in (5c), do not show such changes whether they are followed by postpositions or not.4

(5) a. बिरालो दुध खान्छ।
       biralo dudʰ kʰa-n-tsʰʌ
       cat.SG milk eat-φ-NPST.3SG.MASC
       'The cat drinks milk.'

    b. बिरालाले मुसा मार्छ।
       birala-le musa mar-tsʰʌ
       cat.OBL-ERG mouse.PL kill-NPST.3SG.MASC
       'The cat kills the rats.'

    c. रूखमा एउटा चरो बसेको छ।
       ruːkʰ-ma eut̺a tsʌro bʌs-eko tsʰʌ
       tree-LOC one.CLF bird sit-PERF be.NPST.3SG.MASC
       'A bird is sitting in the tree.'

Table 3.5 shows the alternation between direct and oblique forms of some nouns in Nepali.

Table 3.5: Direct and oblique forms
            horse            cat                 mouse          tree
Direct      घोडो gʰod̺o       बिरालो biralo        मुसो muso       रूख ruːkʰ
Oblique     घोडा gʰod̺a       बिराला birala        मुसा musa       रूख ruːkʰ

Table 3.5 lists Nepali nouns that alternate between direct and oblique forms, together with one that does not alternate.
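The direct/oblique alternation just described amounts to one conditional change at the end of the stem. The Python sketch below illustrates it under stated assumptions: the function names oblique and with_postposition are hypothetical, the postposition is simply concatenated to the noun, and words are written with Unicode escapes (transliterations are given in the comments).

```python
O, AA = "\u094b", "\u093e"   # Devanagari vowel signs for o and a

def oblique(noun):
    """Return the oblique form: final o-sign becomes a-sign, else unchanged."""
    return noun[:-1] + AA if noun.endswith(O) else noun

def with_postposition(noun, postposition):
    """Attach a postposition to the oblique form of the noun."""
    return oblique(noun) + postposition

BIRALO = "\u092c\u093f\u0930\u093e\u0932\u094b"   # biralo 'cat' (o-ending)
RUKH = "\u0930\u0942\u0916"                        # ruːkʰ 'tree' (non-o-ending)
LE = "\u0932\u0947"                                # -le, ergative
MA = "\u092e\u093e"                                # -ma, locative

print(with_postposition(BIRALO, LE))   # birala-le, as in (5b)
print(with_postposition(RUKH, MA))     # ruːkʰ-ma, as in (5c)
```

Only the o-ending noun changes its final vowel; the non-o-ending noun passes through untouched, which is why the latter is not treated as having a distinct oblique form here.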



4



For the present purpose, non-o-ending nouns when followed by postpositions are not considered as oblique forms.






e. Honorificity
Nouns in Nepali show two levels of honorificity morphologically: non-honorific and honorific. The honorificity distinction is found only in o-ending human nouns. O-ending nouns such as बेहुलो beɦulo 'bridegroom' in (6a) change to a-ending, as बेहुला beɦula 'bridegroom' in (6b), indicating non-honorificity and honorificity respectively. Non-o-ending nouns do not show this distinction; therefore, they are not listed here.



(6) a. बेहुलो हात्तीमा थियो।
       beɦulo hatti-ma tʰi-jo
       bridegroom.NHON elephant-LOC be-PST.3SG.NHON
       'The bridegroom was on the elephant.'

    b. बेहुला हात्तीमा थिए।
       beɦula hatti-ma tʰi-e
       bridegroom.HON elephant-LOC be-PST.3SG.HON
       'The bridegroom was on the elephant.'

Some examples of honorificity and non-honorificity in Nepali o-ending human nouns are illustrated in Table 3.6.

Table 3.6: Honorificity: non-honorific and honorific
                  boy             son               bridegroom          child
Non-honorific     केटो ket̺o       छोरो tsʰoro        बेहुलो beɦulo        बच्चो bʌtstso
Honorific         केटा ket̺a       छोरा tsʰora        बेहुला beɦula        बच्चा bʌtstsa

The examples presented in Table 3.6 show the alternation between honorificity and non-honorificity in o-ending human nouns.



f. Augmentative and diminutive
From an evaluative point of view, nouns in Nepali show two distinctions: augmentative and diminutive. This distinction is found only in a small set of o-ending inanimate nouns, where it indicates whether the object is bigger or smaller in size. The o-ending form, as डालो d̺alo 'basket' in (7a), changes to the iː-ending form, as डाली d̺aliː in (7b), indicating augmentative and diminutive forms respectively. This distinction of size is limited to morphology because there is no augmentative or diminutive agreement in the verbs. Non-o-ending inanimate nouns do not show this distinction; therefore, they are not considered here.



(7) a. राम डालो बनाउन जान्दछ।
       ram d̺alo bʌnau-nʌ dzan-dʌtsʰʌ
       Ram basket.AUG make-INF know-NPST.3SG.MASC
       'Ram knows how to make a basket.'

    b. यो चिज डालीमा राख्!
       jo tsidz d̺aliː-ma rakʰ
       this thing basket.DIM-LOC keep.IMP
       'Keep this thing in the small basket.'

Table 3.7 illustrates some examples of augmentative and diminutive forms of Nepali o-ending nouns.

Table 3.7: Augmentative and diminutive
                 basket          small hill          bag               bowl
Augmentative     डालो d̺alo       थुम्को tʰumko        झोलो dzʰolo        बटुको bʌt̺uko
Diminutive       डाली d̺aliː      थुम्की tʰumkiː       झोली dzʰoliː       बटुकी bʌt̺ukiː

In Table 3.7, the o-ending forms of Nepali inanimate nouns are augmentative and the corresponding iː-ending forms are diminutive.



g. Cases and case markers
In Nepali, cases are marked by postpositions. Even though the case markers are affixed to the stems of nouns or pronouns, they are treated as a separate group of linguistic units. Thus, they are tokenized into separate tokens (Hardie et al. 2005, 2009). However, a short traditional description of cases and case markers is given here with examples. For computational purposes, case markers are dealt with as postpositions (see 5.3).



I. Ergative
The ergative case in Nepali is marked by the postposition -ले -le, as रामले ram-le in (8). The ergative case marker mostly occurs with the agent subject in perfective transitive constructions.

(8) रामले भात खायो।
    ram-le bʰat kʰa-jo
    Ram-ERG rice eat-PST.3SG.MASC
    'Ram ate rice.'



II. Dative
The dative case in Nepali is marked by the postposition -लाई -laiː, as रामलाई ram-laiː in (9). The dative marker normally appears with an indirect or direct object noun phrase. The accusative case is normally unmarked, but under some conditions it is marked with the same dative marker -लाई -laiː.

(9) मैले रामलाई पिटेँ।
    mʌi-le ram-laiː pit̺-ẽ
    1SG.OBL-ERG Ram-DAT beat-PST.1SG
    'I beat Ram.'

III. Instrumental
The instrumental case in Nepali is marked by the postposition -ले -le, as चम्चाले tsʌmtsa-le in (10). The ergative and instrumental case markers are identical in form, but the ergative case marker appears with the agent, as in (8), whereas the instrumental case marker appears with the instrument or object with which the action is performed.

(10) उसले चम्चाले खाना खायो।
     us-le tsʌmtsa-le kʰana kʰa-jo
     3SG.OBL-ERG spoon-INST meal eat-PST.3SG.MASC
     'He ate the food with a spoon.'






IV. Ablative
The ablative case in Nepali is marked by the postposition -बाट -bat̺ʌ, as बजारबाट bʌdzar-bat̺ in (11). There is an alternative postposition, -देखि -dekʰi, which has almost the same meaning.

(11) ऊ बजारबाट/देखि आयो।
     uː bʌdzar-bat̺/dekʰi a-jo
     3SG market-ABL come-PST.3SG.MASC
     'He came from the market.'

V. Locative
The locative case in Nepali is marked by the postposition -मा -ma, as घरमा gʰʌr-ma in (12a). There is another marker, -कहाँ -kʌhã in (12b), which normally occurs with pronouns and occasionally with other nominals to indicate location, but it is not frequent.

(12) a. म घरमा बस्छु।
        mʌ gʰʌr-ma bʌs-tsʰu
        1SG house-LOC sit-NPST.1SG
        'I stay at home.'

     b. हरि मकहाँ आयो।
        hʌri mʌ-kʌhã a-jo
        Hari 1SG-LOC come-PST.3SG.MASC
        'Hari came to me.'



VI. Allative
The allative case in Nepali is marked by the postposition -तिर -tirʌ, as बजारतिर bʌdzar-tirʌ in (13).5

(13) तिमी बजारतिर जाऊ!
     timi bʌdzar-tirʌ dza-uː
     2SG.HON market-ALL go-IMP.2SG.HON
     'You go towards the market.'

5 Most traditional Nepali grammars do not treat -तिर -tirʌ as a case marker.






VII. Comitative/Associative
The comitative/associative case in Nepali is marked by the postpositions -सँग -sʌ̃gʌ and -सित -sitʌ, as मसँग/सित mʌ-sʌ̃gʌ/sitʌ in (14).

(14) तिमी मसँग/सित बस!
     timi mʌ-sʌ̃gʌ/sitʌ bʌsʌ!
     2SG.HON 1SG-COM stay-IMP.2SG.HON
     'You stay with me.'

VIII. Genitive
The genitive case in Nepali is marked by the postposition -को -ko, as रामको ram-ko in (15). This postposition has three alternate forms, -को -ko, -का -ka and -की -kiː, for singular masculine, plural and feminine respectively, which occur in most cases. However, the forms -रो -ro, -रा -ra and -री -riː occur with the first person pronouns म mʌ and हामी ɦami and the second person pronouns तँ tʌ̃ and तिमी timiː, and the forms -नो -no, -ना -na and -नी -niː occur with the reflexive pronoun आफू apʰu 'self'.

(15) रामको कलम राम्रो छ।
     ram-ko kʌlʌm ramro tsʰʌ
     Ram-GEN.MASC.SG pen good.MASC.SG be.NPST.3SG.MASC
     'Ram's pen is good.'



IX. Vocative
The vocative case in Nepali is marked by changing ओ o into आ a in o-ending human nouns, as केटा keta in (16). Non-o-ending nouns do not inflect for this case.

(16) ए केटा! यो काम गर्!
     e keta! jo kam gʌr!
     hey boy.VOC this work do.IMP.NHON
     'Hey boy! Do this work.'

X. Nominative
The nominative case in Nepali is unmarked, as राम ram-ø in (17). The subject of an intransitive verb and the subject of a transitive verb in a non-perfective construction are in the nominative case.

(17) राम सधैँ लेख्छ।
     ram-ø sʌdʰʌĩ lekʰ-tsʰʌ
     Ram-NOM always write-NPST.3SG.MASC
     'Ram always writes.'

3.2 Classification of nouns in Nepali
On the basis of the characteristic features discussed in (3.1), nouns in Nepali have been grouped into fourteen classes, and finite state machines or networks have been constructed for them. Features such as stem-final segment, number, gender, form, honorificity, augmentative/diminutive and vocative case are considered in grouping the nouns. These features are not consistently present in all nouns. The basic criteria for grouping the nouns are the presence or absence of these features and, in some cases, the semantics of the nouns (Prasain 2010). The sequence of tags is arbitrary. Tags for default features are not included, and the names of the noun classes are arbitrary.



3.2.1 O-ending nouns
a. NounType 1a
In this class, the o-ending human nouns which inflect for number, gender, form, honorificity and vocative case are grouped. Some examples with their corresponding morphological tags are given in Table 3.8.

Table 3.8: NounType 1a
Morphological Tags     boy             son               bridegroom          child
NOUN+MASC+SG           केटो ket̺o       छोरो tsʰoro        बेहुलो beɦulo        बच्चो bʌtstso
NOUN+MASC+PL           केटा ket̺a       छोरा tsʰora        बेहुला beɦula        बच्चा bʌtstsa
NOUN+MASC+OBL          केटा ket̺a       छोरा tsʰora        बेहुला beɦula        बच्चा bʌtstsa
NOUN+MASC+HON          केटा ket̺a       छोरा tsʰora        बेहुला beɦula        बच्चा bʌtstsa
NOUN+MASC+VOC          केटा ket̺a       छोरा tsʰora        बेहुला beɦula        बच्चा bʌtstsa
NOUN+FEM               केटी ket̺iː      छोरी tsʰoriː       बेहुली beɦuliː       बच्ची bʌtstsiː



The finite state transducer illustrated in Figure 3.1 is capable of analyzing and generating the word-forms illustrated in Table 3.8.

Figure 3.1: A finite state transducer for NounType 1a

The phonological rules involved in this group of nouns are given in PR 3.1.

Phonological rules PR 3.1
i. Stem-final vowel ◌ो o of the o-ending human nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for the plural form, oblique form, honorificity and vocative case.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem-final vowel ◌ो o of the o-ending human nouns of the lower language (i.e., surface level) is changed to vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.

b. NounType 1b
In this class, the o-ending animate nouns which inflect for number, gender and form are grouped. Some examples are listed in Table 3.9 with their corresponding morphological tags.






Table 3.9: NounType 1b
Morphological Tags     horse            goat                 cat                  rat
NOUN+MASC+SG           घोडो gʰod̺o       बाख्रो bakʰro         बिरालो biralo         मुसो muso
NOUN+MASC+PL           घोडा gʰod̺a       बाख्रा bakʰra         बिराला birala         मुसा musa
NOUN+MASC+OBL          घोडा gʰod̺a       बाख्रा bakʰra         बिराला birala         मुसा musa
NOUN+FEM               घोडी gʰod̺iː      बाख्री bakʰriː        बिराली biraliː        मुसी musiː



The finite state transducer illustrated in Figure 3.2 is capable of analyzing and generating the word-forms illustrated in Table 3.9.



Figure 3.2: A finite state transducer for NounType 1b

The phonological rules that are applied to the finite state transducer illustrated in Figure 3.2 are given in PR 3.2.

Phonological rules PR 3.2
i. Stem-final vowel ◌ो o of the o-ending animate nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for the plural form and oblique form.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem-final vowel ◌ो o of the o-ending animate nouns of the lower language (i.e., surface level) is changed to vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.

c. NounType 1c
In this class, the o-ending inanimate nouns which inflect for number, form and augmentative/diminutive features are grouped. Some examples are listed in Table 3.10 with their corresponding morphological tags.



Table 3.10: NounType 1c
Morphological Tags     basket          small hill          bag               bowl
NOUN+SG                डालो d̺alo       थुम्को tʰumko        झोलो dzʰolo        बटुको bʌt̺uko
NOUN+PL                डाला d̺ala       थुम्का tʰumka        झोला dzʰola        बटुका bʌt̺uka
NOUN+OBL               डाला d̺ala       थुम्का tʰumka        झोला dzʰola        बटुका bʌt̺uka
NOUN+DIM               डाली d̺aliː      थुम्की tʰumkiː       झोली dzʰoliː       बटुकी bʌt̺ukiː



The finite state transducer illustrated in Figure 3.3 is capable of analyzing and generating the word-forms illustrated in Table 3.10.



Figure 3.3: A finite state transducer for NounType 1c

The phonological rules listed in PR 3.3 are applied to the finite state transducer illustrated in Figure 3.3.

Phonological rules PR 3.3
i. Stem-final vowel ◌ो o of the o-ending inanimate nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for the plural and oblique forms.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem-final vowel ◌ो o of the o-ending inanimate nouns of the lower language (i.e., surface level) is changed to vowel ◌ी iː for the diminutive feature.
Regular expression: ◌ो -> ◌ी || _ .#.



d. NounType 1d
In this class, the o-ending inanimate nouns which inflect only for number and oblique form are grouped. Some examples are listed in Table 3.11 with their corresponding morphological tags.

Table 3.11: NounType 1d
Morphological Tags     pine             photo            ladder             flesh (dead)
NOUN+SG                सल्लो sʌllo       फोटो pʰot̺o       लिस्नो lisno        सिनो sino
NOUN+PL                सल्ला sʌlla       फोटा pʰot̺a       लिस्ना lisna        सिना sina
NOUN+OBL               सल्ला sʌlla       फोटा pʰot̺a       लिस्ना lisna        सिना sina



The finite state transducer illustrated in Figure 3.4 is capable of analyzing and generating the word-forms illustrated in Table 3.11.






Figure 3.4: A finite state transducer for NounType 1d

The finite state transducer illustrated in Figure 3.4 is composed with the finite state transducer of the rules listed in PR 3.4.

Phonological rule PR 3.4
i. Stem-final vowel ◌ो o of the o-ending inanimate nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for the plural and oblique forms.
Regular expression: ◌ो -> ◌ा || _ .#.
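The four o-ending classes differ only in which tags they allow; the stem-final change itself is the same in all of them. The Python sketch below generates the paradigms of Tables 3.8 to 3.11 from that observation. It is an illustrative emulation, not the transducers of Figures 3.1 to 3.4: the dictionary CLASSES and the function paradigm are hypothetical names, the tag inventories are copied from the sample lexicon in 2.7.1, and MP and FE stand for the final vowel sign that the markers +MP and +FE produce.

```python
O, AA, II = "\u094b", "\u093e", "\u0940"   # vowel signs for o, a, iː
MP, FE = AA, II                             # outcome of the +MP and +FE markers

# Tag inventory per class; None means the stem surfaces unchanged.
CLASSES = {
    "1a": {"+NOUN+MASC+SG": None, "+NOUN+MASC+PL": MP, "+NOUN+MASC+OBL": MP,
           "+NOUN+MASC+HON": MP, "+NOUN+MASC+VOC": MP, "+NOUN+FEM": FE},
    "1b": {"+NOUN+MASC+SG": None, "+NOUN+MASC+PL": MP, "+NOUN+MASC+OBL": MP,
           "+NOUN+FEM": FE},
    "1c": {"+NOUN+SG": None, "+NOUN+PL": MP, "+NOUN+OBL": MP, "+NOUN+DIM": FE},
    "1d": {"+NOUN+SG": None, "+NOUN+PL": MP, "+NOUN+OBL": MP},
}

def paradigm(stem, noun_class):
    """Map each tag string of the class to the surface form of `stem`."""
    if not stem.endswith(O):
        raise ValueError("classes 1a-1d contain only o-ending stems")
    return {
        tag: stem if final is None else stem[:-1] + final
        for tag, final in CLASSES[noun_class].items()
    }

KETO = "\u0915\u0947\u091f\u094b"   # keto 'boy', class 1a
DALO = "\u0921\u093e\u0932\u094b"   # d̺alo 'basket', class 1c

for tag, form in paradigm(DALO, "1c").items():
    print(form + tag)
```

Keeping the class membership in data and the vowel change in one function mirrors the split between the lexc sub-lexicons and the replace rules.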



3.2.2 Non-o-ending nouns
I. Marked
a. NounType 21a
In this class, the non-o-ending human and animate nouns which inflect only for the gender feature with the marker -◌ी -iː are grouped. Some examples are listed in Table 3.12 with their corresponding morphological tags.

Table 3.12: NounType 21a
Morphological Tags     uncle           lad                  pigeon               parrot
NOUN+MASC              काका kaka       कुमार kumar          परेवा pʌrewaː         सुगा suga
NOUN+FEM               काकी kakiː      कुमारी kumariː       परेवी pʌrewiː         सुगी sugiː






The finite state transducer illustrated in Figure 3.5 is capable of analyzing and generating the word-forms illustrated in Table 3.12.



Figure 3.5: A finite state transducer for NounType 21a

The phonological rules listed in PR 3.5 are combined at the lower side of the network illustrated in Figure 3.5.

Phonological rule PR 3.5
i. Stem-final vowel ◌ा a of the non-o-ending animate nouns of the lower language (i.e., surface level) is deleted when followed by the gender marker ◌ी iː.
Regular expression: ◌ा -> [ ] || _ ◌ी .#.
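Rule PR 3.5 can be tried out on the stems of Table 3.12 with an ordinary regular expression. The Python sketch below is an emulation under stated assumptions: a lookahead stands in for the right context of the xfst rule, and the function name feminine_21a is hypothetical.

```python
import re

AA, II = "\u093e", "\u0940"   # Devanagari vowel signs for a and iː

def feminine_21a(stem):
    """Attach the marker iː, then delete a stem-final a-sign before it."""
    lower = stem + II
    # Delete the a-sign only when the word-final iː-sign follows directly.
    return re.sub(AA + "(?=" + II + "$)", "", lower)

KAKA = "\u0915\u093e\u0915\u093e"          # kaka 'uncle'
KUMAR = "\u0915\u0941\u092e\u093e\u0930"   # kumar 'lad'

print(feminine_21a(KAKA))    # kakiː 'aunt'
print(feminine_21a(KUMAR))   # kumariː 'lass'
```

The stem-internal a-sign in kumar is untouched because it is not adjacent to the marker; only the context stated in the rule triggers deletion.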



b. NounType 21b
In this class, the non-o-ending human nouns which inflect for masculine and feminine features with the marker -नी -niː are collected. Some examples are listed in Table 3.13 with their corresponding morphological tags.

Table 3.13: NounType 21b
Morphological Tags     grandson             beggar                 priest                     chief
NOUN+MASC              नाति nati            जोगी dzogiː            पण्डित pʌnd̺it              मुखिया mukʰiya
NOUN+FEM               नातिनी nati-niː      जोगिनी dzogi-niː       पण्डित्नी pʌnd̺it-niː       मुखिनी mukʰi-niː



The finite state transducer illustrated in Figure 3.6 is capable of analyzing and generating the word-forms illustrated in Table 3.13.



Figure 3.6: A finite state transducer for NounType 21b

The finite state transducer in Figure 3.6 is composed with the phonological rules listed in PR 3.6.

Phonological rules PR 3.6
i. Stem-final vowel ◌ी iː of the non-o-ending human nouns of the lower language (i.e., surface level) is changed to vowel ि◌ i before the feminine gender marker नी niː.
Regular expression: ◌ी -> ि◌ || _ न ◌ी .#.
ii. Halanta ◌् is inserted between a consonant symbol and the feminine gender marker नी -niː at the surface level.6
Regular expression: [. .] -> ◌् || cons _ न ◌ी .#.
iii. या ja is deleted before the feminine gender marker नी niː at the surface level.
Regular expression: य ◌ा -> [ ] || _ न ◌ी .#.
iv. Stem-final vowel ◌ी iː of the non-o-ending nouns of the lower language (i.e., surface level) is replaced by a halanta ◌् after liquid sounds and before the feminine gender marker नी niː.
Regular expression: ◌ी -> ◌् || liquids _ न ◌ी .#.

6 Halanta ◌् is the generic term for the diacritic in Devanagari that is used to suppress the inherent vowel that otherwise occurs with every consonant letter.

c. NounType 21c
In this class, the non-o-ending human and animate nouns which inflect only for the gender feature with the marker -इनी -iniː are grouped. This group differs from NounType 21b because the आ a sound within the stem changes to the अ ʌ sound when the noun inflects for feminine gender. Some examples are listed in Table 3.14 with their corresponding morphological tags.



Table 3.14: NounType 21c
Morphological Tags     tiger                 Surname1               Surname2              Surname3
NOUN+MASC              बाघ bagʰ              कार्की karkiː           थापा tʰapa            थारू tʰaruː
NOUN+FEM               बघिनी bʌgʰi-niː       कर्किनी kʌrkiniː        थपिनी tʰʌp-iniː       थरुनी tʰʌru-niː



The finite state transducer illustrated in Figure 3.6 is capable of analyzing and generating the word-forms illustrated in Table 3.14 when the rules listed in PR 3.7 are applied.

Phonological rules PR 3.7
i. Vowel ◌ा a in the non-o-ending human and animate noun stems of the lower language (i.e., surface level) is changed to vowel अ ʌ when the feminine gender marker -इनी -iniː appears at the end of the word. Deleting the vowel sign ◌ा leaves the inherent vowel अ ʌ of the preceding consonant, so the change is stated as a deletion.
Regular expression: ◌ा -> [ ] || _ ि◌ न ◌ी .#.
ii. Vowel ◌ी iː is deleted before the feminine gender marker -इनी -iniː.
Regular expression: ◌ी -> [ ] || _ ि◌ न ◌ी .#.
iii. Vowel ◌ू uː is changed to ◌ु u before the feminine gender marker -इनी -iniː.
Regular expression: ◌ू -> ◌ु || _ ि◌ न ◌ी .#.

d. NounType 21d
In this class, the non-o-ending human nouns which inflect only for the gender feature with the marker -इनी -iniː or, alternatively, -एनी -eniː are grouped. Some examples are listed in Table 3.15 with their corresponding morphological tags.



Table 3.15: NounType 21d
Morphological Tags     Ethnic name1                        Surname                                Ethnic2
NOUN+MASC              खस kʰʌs                             बिष्ट bist̺ʌ                             सुब्बा subba
NOUN+FEM               खसेनी/खसिनी kʰʌs-eni/kʰʌs-ini       बिष्टेनी/बिष्टिनी bist-eni/bist-ini      सुब्बेनी/सुब्बिनी subb-eni/subb-ini



The finite state transducer illustrated in Figure 3.7 is capable of analyzing and generating the word-forms illustrated in Table 3.15.



Figure 3.7: A finite state transducer for NounType 21d

The phonological rules involved in this process are listed in PR 3.8; they are compiled and composed with the finite state transducer illustrated in Figure 3.7.

Phonological rule PR 3.8
i. Vowel ◌ा a at the end of a non-o-ending noun stem is deleted before the feminine gender marker -इनी -iniː or -एनी -eniː.
Regular expression: ◌ा -> [ ] || _ [ि◌ न ◌ी | ◌े न ◌ी] .#.
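Since NounType 21d allows two feminine markers, generation yields two surface forms per stem. The Python sketch below illustrates rule PR 3.8 on the stems of Table 3.15 under stated assumptions: regular-expression substitution stands in for the xfst rule, the two markers are grouped before the end-of-word anchor so that the word boundary applies to both alternatives, and the function name feminine_21d is hypothetical.

```python
import re

AA = "\u093e"                  # vowel sign a
INI = "\u093f\u0928\u0940"     # marker -iniː as it attaches to a consonant
ENI = "\u0947\u0928\u0940"     # marker -eniː as it attaches to a consonant

# Delete a stem-final a-sign when either marker follows at the end of the word.
A_DELETION = re.compile(AA + "(?=(?:" + INI + "|" + ENI + ")$)")

def feminine_21d(stem):
    """Return both feminine variants of a NounType 21d stem."""
    return [A_DELETION.sub("", stem + marker) for marker in (ENI, INI)]

SUBBA = "\u0938\u0941\u092c\u094d\u092c\u093e"   # subba
KHAS = "\u0916\u0938"                            # kʰʌs

print("/".join(feminine_21d(SUBBA)))   # subb-eniː/subb-iniː
print("/".join(feminine_21d(KHAS)))    # kʰʌs-eniː/kʰʌs-iniː
```

For a stem without a final a-sign, such as kʰʌs, the rule has nothing to delete and the forms are plain concatenations.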



II. Unmarked
a. NounType 22a
In this class, the non-o-ending human nouns that do not inflect for any features but are inherently masculine are grouped. Some examples are listed in Table 3.16 with their corresponding morphological tags.

Table 3.16: NounType 22a
Morphological Tags     elder brother     younger brother     father          husband
NOUN+MASC              दाइ dai           भाइ bʰai            बाबु babu        लोग्ने logne



The finite state transducer illustrated in Figure 3.8 is capable of analyzing and generating the word-forms illustrated in Table 3.16.



Figure 3.8: A finite state transducer for NounType 22a



b. NounType 22b
In this group, the non-o-ending human nouns that do not inflect for any features but are inherently feminine are collected. Some examples are listed in Table 3.17 with their corresponding morphological tags.






Table 3.17: NounType 22b
Morphological Tags     elder sister     younger sister       mother        wife
NOUN+FEM               दिदी didiː       बहिनी bʌɦiniː        आमा ama       स्वास्नी swasniː



The finite state transducer illustrated in Figure 3.9 is capable of analyzing and generating the word-forms illustrated in Table 3.17.



Figure 3.9: A finite state transducer for NounType 22b

c. NounType 22c
In this group, the proper names of males are collected. They never inflect for anything, irrespective of their final sound segments, but they grammatically agree with the verb for masculine gender if they are in subject NP position. Some examples are listed in Table 3.18 with their corresponding morphological tags.



Table 3.18: NounType 22c
Morphological Tags     Pname1                    Pname2         Pname3            Pname4
NOUN+PROPER+MASC       कर्णाखर kʌrɳakʰʌr         हरि ɦʌri       श्याम sjam        बलराम bʌlʌram



The finite state transducer illustrated in Figure 3.10 is capable of analyzing and generating the word-forms illustrated in Table 3.18.



Figure 3.10: A finite state transducer for NounType 22c






d. NounType 22d In this group, the proper names of females, which never inflect for anything irrespective of their final sound segments, but grammatically agree with verb for feminine gender if they are in subject NP position, are collected. Some examples are listed in Table 3.19 with their corresponding morphological tags.



Table 3.19: NounType 22d
Morphological Tags     Pname1          Pname2          Pname3               Pname4
NOUN+PROPER+FEM        सीता sita       गीता gita       जानकी dzanʌki        निर्मला nirmʌla



Figure 3.11: A finite state transducer for NounType 22d The finite state transducer illustrated in Figure 3.11 is capable of analyzing and generating the word-forms illustrated in Table 3.19.



e. NounType 22e In this group, all the common nouns which are non-o-ending are collected. These nouns never inflect for anything irrespective of their final sound segments, but grammatically agree with verb for default feature, i.e., third person masculine singular if they are in subject NP position. Some examples with their corresponding morphological tags are given in Table 3.20



Table 3.20: NounType 22e
Morphological Tags     promise           shoulder        farm-land       book
+NOUN                  कसम kʌsʌm         काँध kãdʰ       खेत kʰet        किताब kitab



67



The finite state transducer illustrated in Figure 3.12 is capable of analyzing and generating the word-forms illustrated in Table 3.20.



Figure 3.12: A finite state transducer for NounType 22e



f. NounType 22f In this class, all the place names are grouped. Some examples are listed in Table 3.21 with their corresponding morphological tags.



Table 3.21: NounType 22f
Morphological Tags: NOUN+PLACE
PlaceName1    झापा    dzʰapa
PlaceName2    भोजपुर    bʰodzpur
PlaceName3    नेपाल    nepal
PlaceName4    जापान    dzapan



The finite state transducer illustrated in Figure 3.13 is capable of analyzing and generating the word-forms illustrated in Table 3.21.



Figure 3.13: A finite state transducer for NounType 22f
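Because the nouns of NounType 22c to 22f never inflect, the transducers for these classes reduce to a lexicon lookup that pairs each surface form with a stem-plus-tag string. The following is only a minimal sketch of that idea in Python, not the finite state implementation used in this study: the dictionary INVARIANT_NOUNS and the functions analyze and generate are hypothetical names, and the entries are a handful of the examples from Tables 3.18 to 3.21.

```python
# Hypothetical mini-lexicon of non-inflecting nouns, taken from the
# examples in Tables 3.18-3.21 (the real lexicon is far larger).
INVARIANT_NOUNS = {
    "बलराम": "NOUN+PROPER+MASC",
    "सीता": "NOUN+PROPER+FEM",
    "गीता": "NOUN+PROPER+FEM",
    "कसम": "NOUN",
    "खेत": "NOUN",
    "झापा": "NOUN+PLACE",
    "भोजपुर": "NOUN+PLACE",
    "नेपाल": "NOUN+PLACE",
    "जापान": "NOUN+PLACE",
}


def analyze(surface):
    """Surface form -> 'stem+TAGS', or None if the form is unknown."""
    tags = INVARIANT_NOUNS.get(surface)
    return None if tags is None else f"{surface}+{tags}"


def generate(lexical):
    """'stem+TAGS' -> surface form, or None if the pair is not licensed."""
    stem, _, tags = lexical.partition("+")
    return stem if INVARIANT_NOUNS.get(stem) == tags else None


if __name__ == "__main__":
    print(analyze("नेपाल"))
    print(generate("सीता+NOUN+PROPER+FEM"))
```

Since the surface form and the stem are identical for these classes, analysis and generation are simple inverses of each other; the tags carry all the grammatical information needed for agreement.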



3.3 Pronouns

3.3.1 Characteristics of pronouns in Nepali

a. Person

Pronouns in Nepali have three persons: first, second and third. They are listed in Table 3.22.

Table 3.22: Pronouns with respect to persons
First:    म mʌ 'I', हामी ɦami 'we'
Second:    तँ tʌ̃ 'you', तिमी timiː 'you', तपाईं tʌpaĩː 'you', यहाँ jʌɦã 'you', हजुर ɦʌdzur 'you', मौसुफ mʌusupʰ 'royal you'
Third:    यो jo 's/he', यिनी jiniː 'she', यी jiː 'they', त्यो tjo 'that', तिनी tiniː 's/he', ती tiː 'they', ऊ uː 'he', उनी uniː 'she', उहाँ uɦã 'he'



b. Number

Personal pronouns in Nepali show two dimensions of number: singular and plural. Number in pronouns is also indicated by the plural/collective postposition -हरू -ɦʌruː, but some pronouns, such as म mʌ 'I' and तँ tʌ̃ 'you', do not take any number marker. They have corresponding suppletive forms for the plural, e.g., हामी ɦami 'we' and तिमी timiː 'you'. Table 3.23 lists the personal pronouns in Nepali with number distinctions.



Table 3.23: Personal pronouns in number distinctions

First person
Singular:    म mʌ 'I'
Plural:    हामी ɦamiː 'we', हामीहरू ɦamiːɦʌruː 'we-Pl'

Second person
Singular:    तँ tʌ̃ 'you', तिमी timiː 'you', तपाईं tʌpaĩː 'you', यहाँ jʌɦã 'you', हजुर ɦʌdzur 'you', मौसुफ mʌusupʰ 'you'
Plural:    तिमीहरू timiː-ɦʌruː, तपाईंहरू tʌpaĩː-ɦʌruː, यहाँहरू jʌɦã-ɦʌruː, हजुरहरू ɦʌdzur-ɦʌruː, मौसुफहरू mʌusupʰ-ɦʌruː

Third person
Singular:    यो jo 'this', यी jiː 'this', यिनी jiniː 's/he', त्यो tjo 'that', ती tiː 'those', तिनी tiniː 'those', ऊ uː 's/he', उनी uniː 's/he', उहाँ uhã 's/he'
Plural:    यिनीहरू jiniːɦʌruː, तिनीहरू tiniːɦʌruː, उनीहरू uniːɦʌruː, उहाँहरू uhãɦʌruː



c. Form

Pronouns in Nepali show two morphological forms: direct and oblique. When a pronoun is followed by a postposition, it takes its oblique form. The oblique forms are found in personal, demonstrative, relative and reflexive pronouns, and sporadically in interrogative, definite and indefinite pronouns. Table 3.24 lists the direct and oblique forms of some pronouns.

Table 3.24: Forms of pronouns: direct and oblique
Direct form    Oblique form
म mʌ 'I'    मै mʌi 'I.OBL'
हामी ɦami 'we'    हाम् ɦam 'we.OBL'
तँ tʌ̃ 'you'    तैँ tʌĩ 'you.OBL'
तिमी timiː 'you'    तिम् tim 'you.OBL'
यो jo 'this'    यस् jʌs 'this.OBL'
ऊ uː 's/he'    उन् un 's/he.OBL'
जो dzo 'who.REL'    जस् dzʌs 'who.REL.OBL'
त्यो tjo 'that'    त्यस् tjʌs 'that.OBL'
को ko 'who.INTERO'    कस् kʌs 'who.INTERO.OBL'
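The direct/oblique alternation in Table 3.24 is lexically specified rather than produced by one general rule, so a computational treatment can simply store the pairs. The sketch below illustrates this with the IPA forms of Table 3.24; the names OBLIQUE_FORM and select_form are hypothetical and serve illustration only.

```python
# Direct -> oblique pairs in IPA, copied from Table 3.24.
OBLIQUE_FORM = {
    "mʌ": "mʌi",      # 'I'
    "ɦami": "ɦam",    # 'we'
    "timiː": "tim",   # 'you'
    "jo": "jʌs",      # 'this'
    "uː": "un",       # 's/he'
    "dzo": "dzʌs",    # 'who' (relative)
    "tjo": "tjʌs",    # 'that'
    "ko": "kʌs",      # 'who' (interrogative)
}


def select_form(pronoun, followed_by_postposition):
    """Return the oblique form before a postposition, else the direct form.

    Pronouns without a listed oblique form are returned unchanged.
    """
    if followed_by_postposition:
        return OBLIQUE_FORM.get(pronoun, pronoun)
    return pronoun


if __name__ == "__main__":
    print(select_form("jo", True))
    print(select_form("jo", False))
```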






d. Honorificity

The second and third person pronouns in Nepali show five levels of honorificity. There is no particular honorific marker; the hierarchy is maintained at the lexical level. Honorificity is marginally marked in the third person pronouns, whereas in the second person pronouns it is not morphologically significant. Table 3.25 lists the pronouns in terms of honorific levels. Honorific agreement with the verb at the morphological level occurs only for non-honorific (level 0) and mid-honorific (level 1) pronouns; the higher honorific levels (levels 2, 3 and 4) encode honorificity by syntactic means.¹

Table 3.25: Honorific levels in Nepali pronouns

Non-honorific (level 0)
Second person:    तँ tʌ̃
Third person:    यो jo, त्यो tjo, ऊ uː

Mid-honorific (level 1)
Second person:    तिमी timiː
Third person:    यी jiː, ती tiː, यिनी jiniː, तिनी tiniː, उनी uniː

High-honorific (level 2)
Second person:    तपाईं tʌpaĩː
Third person:    उहाँ uɦã

HHigh-honorific (level 3)
Second person:    यहाँ jʌɦã, आफू apʰuː, हजुर ɦʌdzur
Third person:    उहाँ uɦã, आफू apʰuː, हजुर ɦʌdzur

Royal-honorific (level 4)
Second person:    मौसुफ mʌusupʰ
Third person:    मौसुफ mʌusupʰ
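The five-level scale of Table 3.25 can be represented as an ordered enumeration, which makes it easy to state that only levels 0 and 1 trigger morphological agreement on the verb. The following sketch assumes hypothetical names (Honorific, SECOND_PERSON_LEVEL, agrees_morphologically) and uses a few second person pronouns in IPA from the table.

```python
from enum import IntEnum


class Honorific(IntEnum):
    """Honorific levels as numbered in Table 3.25."""
    NON = 0
    MID = 1
    HIGH = 2
    HHIGH = 3
    ROYAL = 4


# A few second person pronouns (IPA) with their level from Table 3.25.
SECOND_PERSON_LEVEL = {
    "timiː": Honorific.MID,
    "jʌɦã": Honorific.HHIGH,
    "ɦʌdzur": Honorific.HHIGH,
    "mʌusupʰ": Honorific.ROYAL,
}


def agrees_morphologically(level):
    """True for levels 0 and 1, which agree with the verb morphologically;
    levels 2-4 encode honorificity by syntactic means."""
    return level <= Honorific.MID


if __name__ == "__main__":
    for pronoun, level in SECOND_PERSON_LEVEL.items():
        print(pronoun, level.name, agrees_morphologically(level))
```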



3.3.2 Grouping of pronouns

The pronouns cannot be grouped like nouns. Each pronoun in Nepali is unique in form and meaning. Therefore, they are treated and illustrated individually. However, for convenience, we have grouped them in terms of their forms to demonstrate the finite-state network.

a. Personal pronouns

First person: First person pronouns have two forms: singular म mʌ and plural हामी ɦami. Both the singular and the plural have oblique forms. The first person singular pronoun has direct, oblique and emphatic forms, and genitive forms: masculine, feminine, plural and emphatic. The first person plural pronoun has direct and oblique forms, and genitive forms: masculine, feminine, plural and emphatic. Table 3.26 lists first person singular forms with their corresponding morphological tags.

¹ Though the pronouns in Nepali are not morphologically significant in terms of honorificity, they have been tagged into five levels for computational purposes in this study.



Table 3.26: First person singular pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+1SG    म    mʌ    I
PRON+1SG+OBL    मै    mʌi    I
PRON+1SG+EMPH    मै    mʌi    I
PRON+1SG+OBL+GEN+MASC    मेरो    mero    my
PRON+1SG+OBL+GEN+FEM    मेरी    meriː    my
PRON+1SG+OBL+GEN+PL    मेरा    mera    my
PRON+1SG+OBL+GEN+HON    मेरा    mera    my
PRON+1SG+OBL+GEN+OBL    मेरा    mera    my
PRON+1SG+OBL+GEN+EMPH    मेरै    merʌi    my



The finite state transducer in Figure 3.14 encodes the first person singular pronouns presented in Table 3.26 and is capable of analyzing and generating them.
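The two directions of the transducer can be pictured as a relation between tag strings and surface forms: generation maps one tag string to one form, while analysis may return several tag strings for one form. The following is a minimal sketch using the IPA forms of Table 3.26; it is a plain lookup table rather than the compiled network of Figure 3.14, and the names PAIRS, analyze and generate are hypothetical.

```python
from collections import defaultdict

# (lexical tags, surface form in IPA) pairs from Table 3.26.
PAIRS = [
    ("PRON+1SG", "mʌ"),
    ("PRON+1SG+OBL", "mʌi"),
    ("PRON+1SG+EMPH", "mʌi"),
    ("PRON+1SG+OBL+GEN+MASC", "mero"),
    ("PRON+1SG+OBL+GEN+FEM", "meriː"),
    ("PRON+1SG+OBL+GEN+PL", "mera"),
    ("PRON+1SG+OBL+GEN+HON", "mera"),
    ("PRON+1SG+OBL+GEN+OBL", "mera"),
    ("PRON+1SG+OBL+GEN+EMPH", "merʌi"),
]

GENERATE = dict(PAIRS)           # tags -> surface (one-to-one)
ANALYZE = defaultdict(list)      # surface -> all matching tag strings
for tags, surface in PAIRS:
    ANALYZE[surface].append(tags)


def generate(tags):
    """Lexical side to surface side; None if the tag string is unknown."""
    return GENERATE.get(tags)


def analyze(surface):
    """Surface side to lexical side; may return several analyses."""
    return list(ANALYZE.get(surface, []))


if __name__ == "__main__":
    print(generate("PRON+1SG+OBL+GEN+MASC"))
    print(analyze("mera"))
```

The form mera illustrates why analysis is one-to-many: the plural, honorific and oblique genitives are homophonous, so the analyzer returns all three readings and leaves disambiguation to later processing.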



Figure 3.14: A finite state transducer for first person singular pronouns

The first person plural pronouns in Nepali are presented in Table 3.27 with their corresponding morphological tags.



Table 3.27: First person plural pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+1PL    हामी    ɦamiː    we
PRON+1PL+OBL+GEN+MASC    हाम्रो    ɦamro    our
PRON+1PL+OBL+GEN+FEM    हाम्री    ɦamriː    our
PRON+1PL+OBL+GEN+PL    हाम्रा    ɦamra    our
PRON+1PL+OBL+GEN+HON    हाम्रा    ɦamra    our
PRON+1PL+OBL+GEN+OBL    हाम्रा    ɦamra    our
PRON+1PL+OBL+GEN+EMPH    हाम्रै    ɦamrʌi    our



The finite state transducer illustrated in Figure 3.15 is capable of analyzing and generating the plural pronouns illustrated in Table 3.27.



Figure 3.15: A finite state transducer for first person plural pronouns



Second person: The second person pronouns can be grouped into two classes. One consists of तँ tʌ̃ 'you' and तिमी timiː 'you', which have various forms for direct, oblique, emphatic and genitive: masculine, feminine, plural and emphatic. The other group consists of तपाईं tʌpaĩː, उहाँ uɦã, यहाँ jʌɦãː, आफू apʰuː, हजुर ɦʌdzur and मौसुफ mʌusupʰ, which do not have any other forms. Table 3.28 lists second person non-honorific singular forms with their corresponding morphological tags.



Table 3.28: Second person singular non-honorific pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+2SG    तँ    tʌ̃    you
PRON+2SG+OBL    तैँ    tʌĩ    you
PRON+2SG+EMPH    तैँ    tʌĩ    you
PRON+2SG+OBL+GEN+MASC    तेरो    tero    your
PRON+2SG+OBL+GEN+FEM    तेरी    teriː    your
PRON+2SG+OBL+GEN+PL    तेरा    tera    your
PRON+2SG+OBL+GEN+HON    तेरा    tera    your
PRON+2SG+OBL+GEN+OBL    तेरा    tera    your
PRON+2SG+OBL+GEN+EMPH    तेरै    terʌi    your



The second person singular non-honorific pronouns in Nepali are encoded into a finite state transducer as demonstrated in Figure 3.16 which is capable of analyzing and generating the pronouns listed in Table 3.28.



Figure 3.16: A finite state transducer for second person singular non-honorific pronouns

Table 3.29 lists second person singular honorific forms with their corresponding morphological tags.



Table 3.29: Second person honorific pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+2SG+HON    तिमी    timiː    you
PRON+2SG+OBL+HON+GEN+MASC    तिम्रो    timro    your
PRON+2SG+OBL+HON+GEN+FEM    तिम्री    timriː    your
PRON+2SG+OBL+HON+GEN+PL    तिम्रा    timra    your
PRON+2SG+OBL+HON+GEN+HON    तिम्रा    timra    your
PRON+2SG+OBL+HON+GEN+OBL    तिम्रा    timra    your
PRON+2SG+OBL+HON+GEN+EMPH    तिम्रै    timrʌi    your



The finite state transducer illustrated in Figure 3.17 encodes the second person honorific pronouns in Nepali and it is capable of analyzing and generating the pronouns listed in Table 3.29.



Figure 3.17: A finite state transducer for second person honorific pronouns

Table 3.30 lists second person high honorific singular forms with their corresponding morphological tags.

Table 3.30: Second person high honorific pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+2SG+HHON    तपाईं    tʌpaĩː    you
PRON+2SG+HHON    यहाँ    jʌɦãː    you
PRON+2SG+HHON    उहाँ    uɦã    you
PRON+2SG+HHON    वहाँ    wʌɦã    you
PRON+2SG+HHON    हजुर    ɦʌdzur    you



The finite state transducer demonstrated in Figure 3.18 encodes the second person high honorific pronouns in Nepali and is capable of analyzing and generating the pronouns listed in Table 3.30.

Figure 3.18: A finite state transducer for second person higher honorific pronouns

The second person royal honorific pronoun in Nepali is given in Table 3.31 with its corresponding morphological tags.

Table 3.31: Second person royal honorific pronoun
Morphological Tags    Devanagari    IPA    Gloss
PRON+2SG+RHON    मौसुफ    mʌusupʰʌ    you.royal



The finite state transducer in Figure 3.19 encodes the royal honorific pronoun and it is capable of analyzing and generating it.



Figure 3.19: A finite state transducer for second person highest honorific pronoun



Third person: The third person pronouns can be grouped into three distinct sets. The first one is ऊ u: and its various forms. ऊ u: inflects for form: direct and oblique, honorificity: non-honorific and honorific; and emphatic. Table 3.32 lists the pronoun ऊ u: and its various forms with their corresponding morphological tags.






Table 3.32: Third person pronoun ऊ u:
Morphological Tags    Devanagari    IPA    Gloss
PRON+3SG    ऊ    u:    he
PRON+3SG+EMPH    उही    uɦi:    he
PRON+3SG+OBL    उस    usʌ    he
PRON+3SG+OBL+EMPH    उसै    usʌi    he
PRON+3SG+HON    उनी    uni:    she
PRON+3SG+HON+OBL    उन    unʌ    she
PRON+3SG+HON+OBL+EMPH    उनै    unʌi    she
PRON+3SG+HON    उहाँ    uɦã    s/he
PRON+3SG+HON    वहाँ    wʌɦã    s/he



The finite state transducer illustrated in Figure 3.20 is capable of analyzing and generating the third person pronoun ऊ u: and its various forms illustrated in Table 3.32.

Figure 3.20: A finite state transducer for third person uː

The second one is त्यो tjo, ती ti: and their various forms. त्यो tjo and ती ti: inflect for form (direct and oblique), honorificity (non-honorific and honorific) and emphatic. Table 3.33 lists the pronouns त्यो tjo, ती ti: and their various forms with their corresponding morphological tags.



Table 3.33: Third person pronouns त्यो tjo and ती ti:
Morphological Tags    Devanagari    IPA    Gloss
PRON+3SG+DIST    त्यो    tjo    he
PRON+3SG+DIST+EMPH    त्यही    tjʌɦiː    he
PRON+3SG+OBL    त्यस    tjʌsʌ    s/he
PRON+3SG+OBL+EMPH    त्यसै    tjʌsʌi    s/he
PRON+3SG+HON+DIST    ती    ti:    s/he
PRON+3PL+DIST    ती    ti:    s/he
PRON+3SG+HON+DIST    तिनी    tini:    s/he
PRON+3SG+OBL+HON+DIST    तिन    tinʌ    s/he
PRON+3SG+OBL+HON+DIST+EMPH    तिनै    tinʌi    s/he



The finite state transducer illustrated in Figure 3.21 is capable of analyzing and generating the third person pronouns त्यो tjo, ती ti: and their various forms illustrated in Table 3.33.

Figure 3.21: A finite state transducer for third person pronouns त्यो tjo and ती ti:

The third one is यो jo and यी ji: and their various forms. यो jo and यी ji: inflect for form (direct and oblique), honorificity (non-honorific and honorific) and emphatic. Table 3.34 lists the pronouns यो jo and यी ji: and their various forms with their corresponding morphological tags.



Table 3.34: Third person pronouns यो jo and यी ji:
Morphological Tags    Devanagari    IPA    Gloss
PRON+3SG+PROX    यो    jo    s/he
PRON+3SG+PROX+EMPH    यही    jʌɦi    s/he
PRON+3SG+OBL+PROX    यस    jʌsʌ    s/he
PRON+3SG+OBL+PROX+EMPH    यसै    jʌsʌi    s/he
PRON+3SG+PROX+HON    यी    ji:    s/he
PRON+3PL+PROX    यी    ji:    s/he
PRON+3SG+PROX+HON    यिनी    jini:    s/he
PRON+3SG+PROX+OBL+HON    यिन    jinʌ    s/he
PRON+3SG+PROX+OBL+HON+EMPH    यिनै    jinʌi    s/he



The finite state transducer illustrated in Figure 3.22 encodes the third person pronouns यो jo, यी ji: and their various forms listed in Table 3.34, and it is capable of analyzing and generating them.

Figure 3.22: A finite state transducer for third person pronouns यो jo and यी ji:

b. Reflexive pronoun

There is a single reflexive pronoun in Nepali, आफू apʰu: 'self', but it has various forms. It inflects for form (direct and oblique), genitive case (singular, plural, honorific, oblique and feminine) and emphatic. Table 3.35 lists आफू apʰu: 'self' and its various forms with their corresponding morphological tags.



Table 3.35: The reflexive pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+REFL    आफू    apʰu:    self
PRON+REFL+OBL+EMPH    आफै    apʰʌi    self
PRON+REFL+OBL+EMPH    आफैँ    apʰʌĩ    self
PRON+REFL+OBL+GEN+SG    आफ्नो    apʰno    own
PRON+REFL+OBL+GEN+PL    आफ्ना    apʰna    own
PRON+REFL+OBL+GEN+HON    आफ्ना    apʰna    own
PRON+REFL+OBL+GEN+OBL    आफ्ना    apʰna    own
PRON+REFL+OBL+GEN+FEM    आफ्नी    apʰni:    own
PRON+REFL+OBL+GEN+EMPH    आफ्नै    apʰnʌi    own



The finite state transducer illustrated in Figure 3.23 is capable of analyzing and generating the reflexive pronoun आफू apʰu: and its various forms illustrated in Table 3.35.



Figure 3.23: A finite state transducer for reflexive pronouns

c. Demonstrative pronouns

The demonstrative pronouns can be grouped into four distinct sets. The first one is यो jo and यी ji: and their various forms. यो jo and यी ji: inflect for form (direct and oblique) and emphatic. Table 3.36 lists the demonstrative pronouns यो jo and यी ji: and their various forms with their corresponding morphological tags.



Table 3.36: The demonstrative pronouns यो jo and यी ji:
Morphological Tags    Devanagari    IPA    Gloss
PRON+DEM+PROX    यो    jo    this
PRON+DEM+PROX+EMPH    यही    jʌɦi:    this one
PRON+DEM+PROX    यी    ji:    these
PRON+DEM+PROX+HON    यिनी    jini:    these
PRON+DEM+PROX+OBL    यिन    jinʌ    these
PRON+DEM+PROX+OBL+EMPH    यिनै    jinʌi    these ones
PRON+DEM+PROX+HON    यहाँ    jʌɦã    you



The finite state transducer illustrated in Figure 3.24 is capable of analyzing and generating the demonstrative pronouns यो jo, यी ji: and their various forms illustrated in Table 3.36.



Figure 3.24: A finite state transducer for demonstrative pronouns यो jo and यी ji:

The second one is त्यो tjo and ती ti: and their various forms. त्यो tjo and ती ti: inflect for form (direct and oblique) and emphatic. Table 3.37 lists the demonstrative pronouns त्यो tjo and ती ti: and their various forms with their corresponding morphological tags.



Table 3.37: The demonstrative pronouns त्यो tjo and ती ti:
Morphological Tags    Devanagari    IPA    Gloss
PRON+DEM+DIST    त्यो    tjo    that
PRON+DEM+DIST+EMPH    त्यही    tjʌɦi:    that one
PRON+DEM+DIST    ती    ti:    those
PRON+DEM+DIST+OBL+HON    तिनी    tini:    those
PRON+DEM+DIST+OBL    तिन    tinʌ    those
PRON+DEM+DIST+OBL+EMPH    तिनै    tinʌi    those



The finite state transducer illustrated in Figure 3.25 is capable of analyzing and generating the demonstrative pronouns त्यो tjo and ती ti: and their various forms illustrated in Table 3.37.

Figure 3.25: A finite state transducer for demonstrative pronouns त्यो tjo and ती ti:

The third one is ऊ u: and its various forms. ऊ u: inflects for form (direct and oblique) and emphatic. Table 3.38 lists the pronoun ऊ u: and its various forms with their corresponding morphological tags.



Table 3.38: The demonstrative pronouns ऊ u:
Morphological Tags    Devanagari    IPA    Gloss
PRON+DEM+DIST    ऊ    u    that
PRON+DEM+DIST+EMPH    उही    uɦi:    that same
PRON+DEM+DIST+HON    उनी    uni:    that
PRON+DEM+DIST+OBL    उन    unʌ    that
PRON+DEM+DIST+OBL+EMPH    उनै    unʌi    that
PRON+DEM+DIST+HON    उहाँ    uɦã    there
PRON+DEM+DIST+HON    वहाँ    wʌɦã    there



The finite state transducer illustrated in Figure 3.26 is capable of analyzing and generating the demonstrative pronoun ऊ u: and its various forms illustrated in Table 3.38.

Figure 3.26: A finite state transducer for demonstrative pronouns ऊ u:

The fourth one consists of the remaining demonstratives and their various forms, which inflect only for emphatic. Table 3.39 lists the remaining demonstrative pronouns and their various forms with their corresponding morphological tags.



Table 3.39: The remaining demonstrative pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+DEM+DIST    सो    so    that
PRON+DEM+DIST+EMPH    सोही    soɦi    that
PRON+DEM+PROX    निज    nidzʌ    him/her
PRON+DEM+PROX+EMPH    निजै    nidzʌi    him/her
PRON+DEM+PROX    उक्त    uktʌ    that



The finite state transducer illustrated in Figure 3.27 is capable of analyzing and generating the remaining demonstrative pronouns and their various forms illustrated in Table 3.39.



Figure 3.27: A finite state transducer for remaining demonstrative pronouns



d. Relative pronouns

There are three relative pronouns in Nepali: जो dzo, जे dze and जुन dzunʌ. These relative pronouns inflect only for oblique and emphatic forms. Table 3.40 lists the relative pronouns and their various forms with their corresponding morphological tags.



Table 3.40: The relative pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+REL+HUM    जो    dzo    who
PRON+REL+OBL+HUM    जस    dzʌsʌ    who
PRON+REL+OBL+HUM+EMPH    जसै    dzʌsʌi    who
PRON+REL+NHUM    जे    dze    which
PRON+REL    जुन    dzunʌ    which
PRON+REL+EMPH    जुनै    dzunʌi    which



The finite state transducer illustrated in Figure 3.28 is capable of analyzing and generating the relative pronouns and their various forms illustrated in Table 3.40.



Figure 3.28: A finite state transducer for relative pronouns



e. Interrogative pronouns

There are three interrogative pronouns in Nepali: को ko, के ke and कुन kunʌ. Two adverbs that act as interrogative forms, किन kinʌ and कसरी kʌsʌriː, are also included here. These interrogative pronouns inflect only for oblique and emphatic forms. Table 3.41a lists the interrogative pronouns and their various forms with their corresponding morphological tags.

Table 3.41a: The interrogative pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+INTERRO+HUM    को    ko    who
PRON+INTERRO+HUM+OBL    कस्    kʌs    who
PRON+INTERRO+HUM+OBL    कसै    kʌsʌi    who
PRON+INTERRO+NHUM    के    ke    what
PRON+INTERRO    कुन    kun    which
PRON+INTERRO    किन    kinʌ    why
PRON+INTERRO    कसरी    kʌsʌri    how



The finite state transducer illustrated in Figure 3.29 is capable of analyzing and generating the interrogative pronouns and their various forms illustrated in Table 3.41a.



Figure 3.29: A finite state transducer for interrogative pronouns



f. Indefinite pronouns

The indefinite pronouns are derived from interrogative and relative pronouns. Those derived from interrogative pronouns take ही ɦiː and ◌ै ʌi as an emphatic marker, and those derived from relative pronouns take सुकै sukʌi as an emphatic marker. Table 3.41b lists the indefinite pronouns derived from interrogative pronouns with their corresponding morphological tags.

Table 3.41b: The indefinite pronouns derived from interrogative pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+INDEF+HUM    कोही    koɦi    someone
PRON+INDEF+NHUM    केही    keɦi    something
PRON+INDEF+NEU    कुनै    kunʌi    anything



The finite state transducer illustrated in Figure 3.30 is capable of analyzing and generating the indefinite pronouns listed in Table 3.41b.



Figure 3.30: A finite state transducer for indefinite pronouns derived from interrogative pronouns

Table 3.42 lists the indefinite pronouns derived from relative pronouns with their corresponding morphological tags.

Table 3.42: The indefinite pronouns derived from relative pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+INDEF+HUM    जोसुकै    dzosukʌi    whoever
PRON+INDEF+NHUM    जेसुकै    dzesukʌi    whatever
PRON+INDEF+NEU    जुनसुकै    dzunsukʌi    whichever



The finite state transducer illustrated in Figure 3.31 is capable of analyzing and generating the indefinite pronouns and their various forms illustrated in Table 3.42.

Figure 3.31: A finite state transducer for indefinite pronouns derived from relative pronouns
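The derivation of indefinite pronouns from relative pronouns is purely concatenative: the marker सुकै sukʌi is attached to the relative stem. A minimal sketch of this derivation follows; the names RELATIVE_STEMS and indefinite_from_relative are hypothetical, and the feature labels follow the tags of Table 3.42.

```python
# Relative pronoun stems and the feature each contributes to the
# indefinite tag, following Table 3.42.
RELATIVE_STEMS = {
    "जो": "HUM",    # dzo  -> 'whoever'
    "जे": "NHUM",   # dze  -> 'whatever'
    "जुन": "NEU",   # dzun -> 'whichever'
}

EMPHATIC_MARKER = "सुकै"  # sukʌi


def indefinite_from_relative(stem):
    """Return (surface form, tag string) for the derived indefinite."""
    feature = RELATIVE_STEMS[stem]  # KeyError for non-relative stems
    return stem + EMPHATIC_MARKER, f"PRON+INDEF+{feature}"


if __name__ == "__main__":
    for stem in RELATIVE_STEMS:
        print(indefinite_from_relative(stem))
```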



g. Definite pronouns

There is a small set of definite pronouns, which do not show any kind of inflection, except अर्को ʌrko. अर्को ʌrko inflects for number, honorificity and form (oblique). Table 3.43a lists the definite pronouns with their corresponding morphological tags.

Table 3.43a: The definite pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+DEF    प्रत्येक    prʌtekʌ    everyone
PRON+DEF    हरेक    ɦʌrekʌ    each one
PRON+DEF    सबै    sʌbʌi    all
PRON+DEF    अरू    ʌruː    other



The finite state transducer in Figure 3.32a encodes the definite pronouns listed in Table 3.43a and it is capable of analyzing and generating those pronouns.



Figure 3.32a: A finite state transducer for definite pronouns

The definite pronoun अर्को ʌrko, along with its various forms and their corresponding morphological tags, is listed in Table 3.43b.

Table 3.43b: The definite pronoun अर्को ʌrko
Morphological Tags    Devanagari    IPA    Gloss
PRON+DEF+SG    अर्को    ʌrko    another
PRON+DEF+PL    अर्का    ʌrka    another
PRON+DEF+HON    अर्का    ʌrka    another
PRON+DEF+OBL    अर्का    ʌrka    another
PRON+DEF+FEM    अर्की    ʌrkiː    another
PRON+DEF+EMPH    अर्कै    ʌrkʌi    another



The definite pronoun अर्को ʌrko and its various forms listed in Table 3.43b have been compiled into a finite state transducer, as demonstrated in Figure 3.32b, which is capable of analyzing and generating them.

Figure 3.32b: A finite state transducer for definite pronouns

h. Reciprocal pronouns

The reciprocal pronouns in Nepali are compound forms except one, i.e., आपस apʌsʌ. The reciprocal pronoun एकअर्को ekʌrko 'each other' inflects for form (oblique), honorificity, number (plural) and gender (feminine). Table 3.44a lists the reciprocal pronoun एकअर्को ekʌrko and its various forms with their corresponding morphological tags.



Table 3.44a: The reciprocal pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+RECIP    एकअर्को    ekʌrko    each other
PRON+RECIP+OBL    एकअर्का    ekʌrka    each other
PRON+RECIP+HON    एकअर्का    ekʌrka    each other
PRON+RECIP+PL    एकअर्का    ekʌrka    each other
PRON+RECIP+FEM    एकअर्की    ekʌrkiː    each other
PRON+RECIP+EMPH    एकअर्कै    ekʌrkʌi    each other



The finite state transducer demonstrated in Figure 3.33a encodes the reciprocal pronouns listed in Table 3.44a and is capable of analyzing and generating them.



Figure 3.33a: A finite state transducer for reciprocal pronouns

Some other reciprocal pronouns are listed in Table 3.44b with their corresponding morphological tags.

Table 3.44b: The reciprocal pronouns
Morphological Tags    Devanagari    IPA    Gloss
PRON+RECIP    एकआपस    ekapʌs    each other
PRON+RECIP    आपस    apʌs    each other
PRON+RECIP    आआफू    aphu:    each other



The finite state transducer illustrated in Figure 3.33b is capable of analyzing and generating the reciprocal pronouns and their various forms illustrated in Table 3.44b.



Figure 3.33b: A finite state transducer for reciprocal pronouns



3.4 Adjectives

Adjectives in Nepali are words indicating quality, quantity and frequency, generally modifying nouns. The adjectives show various kinds of morphological features, which are discussed in the following sections.

3.4.1 Characteristics of adjectives in Nepali

a. Significant stem finals

The adjectives in Nepali, like the nouns, show a binary division between o-ending adjectives and non-o-ending adjectives. The o-ending adjectives inflect for number, gender, form and honorificity. These adjectives agree with the features carried by the head nouns that they modify. The non-o-ending adjectives are not consistent in their formal behavior: one sub-group of non-o-ending adjectives takes the feminine gender marker, and another sub-group, especially Sanskrit loan adjectives, inflects for comparative and superlative forms. Table 3.45 lists some o-ending and some non-o-ending adjectives.

Table 3.45: O-ending and non-o-ending adjectives
O-ending adjectives
Stem    Gloss
राम्रो ramro    good
कालो kalo    black
खस्रो kʰʌsro    coarse
मिठो mit̺ʰo    sweet

Non-o-ending adjectives
Stem    Gloss
असल ʌsʌl    good
चतुर tsʌtur    clever
लघु lʌgʰu    small
पुर्विया purwija    related to east



b. Number

Adjectives in Nepali show two dimensions of number: singular and plural. The number distinction is found only in o-ending adjectives. The citation form of an o-ending adjective, such as राम्रो ramro in (18a), changes to the a-ending form, as in राम्रा ramra in (18b), for the plural.

(18) a. एउटा राम्रो केटो आयो।
        eut̺a ramro ket̺o a-jo
        one.CL good.SG boy.SG come-P.3SG.MASC
        'A handsome boy came.'

     b. दुइटा राम्रा केटा आए।
        duit̺a ramra ket̺a a-je
        two.CL good.PL boy.PL come-P.3PL
        'Two handsome boys came.'



Table 3.46 lists some adjectives that show singular and plural forms. The number feature of the adjective agrees with the number feature of the head noun in the noun phrase.

Table 3.46: Number: singular and plural
Gloss    Singular    Plural
good    राम्रो ramro    राम्रा ramra
black    कालो kalo    काला kala
coarse    खस्रो kʰʌsro    खस्रा kʰʌsra
old    बुढो bud̺ʰo    बुढा bud̺ʰa



c. Gender

O-ending adjectives in Nepali show masculine and feminine gender. The o-ending adjective राम्रो ramro in (19a) changes to the iː-ending राम्री ramriː in (19b), showing the masculine and feminine alternation. Some of the non-o-ending adjectives change into feminine adjectives with the suffix -नी -niː (alternatively -इनी -iniː and -एनी -eniː).

(19) a. एउटा राम्रो केटो आयो।
        eut̺a ramro ket̺o a-jo
        one.CL good.MASC.SG boy.MASC.SG come-P.3SG.MASC
        'A handsome boy came.'

     b. एउटी राम्री केटी आई।
        eut̺i ramri ket̺i a-iː
        one.CL.FEM good.FEM.SG boy.FEM.SG come-P.3SG.FEM
        'A beautiful girl came.'

Table 3.47 lists some examples of adjectives showing the gender change. The gender distinction depends on the head noun: gender is functional only if the head noun refers to a human.



Table 3.47: Gender: masculine and feminine
Gloss    Masculine    Feminine
good    राम्रो ramro    राम्री ramriː
black    कालो kalo    काली kaliː
clever    चतुर tsʌturʌ    चतुर्नी tsʌturniː
rural    पाखे pakʰe    पखिनी pʌkʰiniː



d. Form

Adjectives in Nepali show two forms: direct and oblique. The o-ending adjective राम्रो ramro in (20a) shows the direct form, and it changes to the a-ending राम्रा ramra in (20b), showing the oblique form.

(20) a. एउटा राम्रो केटो आउँदै छ।
        eut̺a ramro ket̺o a-ũdʌi tsʰʌ
        one.CL good.SG boy come-IMPERF be.NP.3SG.MASC
        'A handsome boy is coming.'

     b. एउटा राम्रा केटाले प्रस्ताव राखेको छ।
        eut̺a ramra ket̺a-le prʌstaw rakʰ-eko tsʰʌ
        one.CL good.OBL boy.OBL-ERG proposal keep-PERF be.NP.3SG.MASC
        'A handsome boy has proposed.'



Table 3.48 lists some examples of adjectives showing the direct and oblique forms.

Table 3.48: Form: direct and oblique
Gloss    Direct    Oblique
good    राम्रो ramro    राम्रा ramra
black    कालो kalo    काला kala
coarse    खस्रो kʰʌsro    खस्रा kʰʌsra
old    बुढो bud̺ʰo    बुढा bud̺ʰa



e. Honorificity

Adjectives in Nepali show two levels of honorificity: non-honorific and honorific. The o-ending adjective राम्रो ramro in (21a) changes into the a-ending राम्रा ramra in (21b), showing non-honorific and honorific, respectively.

(21) a. तँ राम्रो छस्।
        tʌ̃ ramro tsʰʌs
        you.NHON good.NHON be.NP.2SG.NHON
        'You are good.'

     b. तिमी राम्रा छौ।
        timi ramra tsʰʌu
        you.HON good.HON be.NP.2SG.HON
        'You are good.'

Table 3.49 lists some examples of adjectives showing honorificity.



Table 3.49: Honorificity: non-honorific and honorific
Gloss    Non-honorific    Honorific
good    राम्रो ramro    राम्रा ramra
black    कालो kalo    काला kala
coarse    खस्रो kʰʌsro    खस्रा kʰʌsra
old    बुढो bud̺ʰo    बुढा bud̺ʰa



f. Degree

Native adjectives in Nepali do not inflect for degree; degree in these adjectives is handled syntactically. The Sanskrit loan adjectives, however, show three levels of degree morphologically: positive, comparative and superlative. The positive adjective is unmarked, as न्यून njuːnʌ in (22a). The comparative degree is indicated by the suffix -तर -tʌr, as न्यूनतर njuːnʌ-tʌr in (22b), and the superlative by the suffix -तम -tʌm, as न्यूनतम njuːnʌ-tʌm in (22c).

(22) a. हाम्रो आम्दानी न्यून छ।
        ɦamro amdani njuːnʌ tsʰʌ
        our income less be.NP.3SG.MASC
        'Our income is less.'

     b. हाम्रो आम्दानी न्यूनतर छ।
        ɦamro amdani njuːnʌ-tʌr tsʰʌ
        our income less-COMP be.NP.3SG.MASC
        'Our income is lesser.'



     c. हाम्रो आम्दानी न्यूनतम छ।
        ɦamro amdani njuːnʌ-tʌm tsʰʌ
        our income less-SUPER be.NP.3SG.MASC
        'Our income is the least.'

Table 3.50 lists some examples of Sanskrit loan adjectives that show three degrees.

Table 3.50: Degree: positive, comparative and superlative
Gloss    Positive    Comparative    Superlative
less    न्यून njuːnʌ    न्यूनतर njuːnʌ-tʌrʌ    न्यूनतम njuːnʌ-tʌmʌ
low    निम्न nimnʌ    निम्नतर nimnʌ-tʌrʌ    निम्नतम nimnʌ-tʌmʌ
rigorous    गहन gʌɦʌnʌ    गहनतर gʌɦʌnʌ-tʌrʌ    गहनतम gʌɦʌnʌ-tʌmʌ
small    लघु lʌgʰu    लघुतर lʌgʰu-tʌrʌ    लघुतम lʌgʰu-tʌmʌ

3.4.2 Classification of adjectives



On the basis of characteristic features of adjectives in Nepali as discussed in (3.4.1), the adjectives are classified into two major groups. The first one is o-ending adjectives whereas the second one is non-o-ending adjectives.
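The top-level split between the two groups depends only on the final symbol of the citation form, so it can be tested directly on the Devanagari string. The sketch below is an illustration under that assumption; classify_adjective is a hypothetical name, and the check looks only for the dependent vowel sign ◌ो (U+094B) at the end of the stem.

```python
O_VOWEL_SIGN = "\u094B"  # Devanagari dependent vowel sign o


def classify_adjective(stem):
    """Assign an adjective stem to one of the two major groups."""
    return "O_ENDING" if stem.endswith(O_VOWEL_SIGN) else "NON_O_ENDING"


if __name__ == "__main__":
    # kalo 'black' and asal 'good' from Table 3.45
    for stem in ("कालो", "असल"):
        print(stem, classify_adjective(stem))
```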



a. O-ending adjectives

All the o-ending adjectives are grouped in one class. The adjectives in this group inflect for number, gender, form and honorificity. The inflection of an adjective is directly related to the head noun it modifies, because there is feature agreement between the head noun and the modifying adjective. Table 3.51 lists some examples of o-ending adjectives.



Table 3.51: O-ending adjectives
Morphological Tags    good    black    coarse    old
+ADJ+SG    राम्रो ramro    कालो kalo    खस्रो kʰʌsro    बुढो bud̺ʰo
+ADJ+PL    राम्रा ramra    काला kala    खस्रा kʰʌsra    बुढा bud̺ʰa
+ADJ+OBL    राम्रा ramra    काला kala    खस्रा kʰʌsra    बुढा bud̺ʰa
+ADJ+HON    राम्रा ramra    काला kala    खस्रा kʰʌsra    बुढा bud̺ʰa
+ADJ+FEM    राम्री ramriː    काली kaliː    खस्री kʰʌsriː    बुढी bud̺ʰiː



The finite state transducer illustrated in Figure 3.33 is capable of analyzing and generating the o-ending adjectives and their forms illustrated in Table 3.44.



Figure 3.34: A finite state transducer for o-ending adjectives

The phonological rules given in PR 3.9 are compiled into a finite state transducer and composed with the finite state transducer illustrated in Figure 3.34.

Phonological rule PR 3.9
i. Stem final vowel ◌ो o of the o-ending adjectives of the lower language (i.e., surface level) is changed to vowel ◌ा a for plural, oblique and honorificity.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem final vowel ◌ो o of the o-ending adjectives of the lower language (i.e., surface level) is changed to vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.
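The effect of PR 3.9 can be sketched outside the finite state toolkit as a plain string replacement. The following Python fragment is only an illustration under stated assumptions: the function name and the tag-to-vowel dictionary are hypothetical, the tags are those of Table 3.51, and the code is not the transducer of Figure 3.34.

```python
# Dependent vowel signs named in PR 3.9.
O_SIGN = "\u094b"   # the sign for o
A_SIGN = "\u093e"   # the sign for a
II_SIGN = "\u0940"  # the sign for long i

# Hypothetical mapping from the tags of Table 3.51 to the final vowel sign.
FINAL_VOWEL = {
    "+ADJ+SG": O_SIGN,
    "+ADJ+PL": A_SIGN,
    "+ADJ+OBL": A_SIGN,
    "+ADJ+HON": A_SIGN,
    "+ADJ+FEM": II_SIGN,
}


def inflect_o_adjective(citation: str, tag: str) -> str:
    """Replace the stem-final o sign of an o-ending adjective according to the tag."""
    if not citation.endswith(O_SIGN):
        raise ValueError("not an o-ending adjective")
    return citation[:-1] + FINAL_VOWEL[tag]
```

For instance, calling the function with कालो and +ADJ+FEM yields the feminine form काली, and with +ADJ+PL the form काला.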



b. Non-o-ending adjectives

Non-o-ending adjectives in Nepali form a group which includes both marked and unmarked adjectives. Marked adjectives are those which take some sort of marking, such as the feminine marker, the comparative marker or the superlative marker.



i. Marked adjectives

Type 1: Those non-o-ending adjectives in Nepali that inflect for gender (masculine and feminine) have been grouped in this class. The citation form is masculine in gender, and it changes to feminine gender when the marker -नी/-इनी -niː/-iniː is suffixed to it. Table 3.52 lists some adjectives of this group.



Table 3.52: Type 1 marked adjectives

Morphological Tags    clever    cunning    of east    rural
+ADJ+MASC    चतुर tsʌturʌ    धुर्त dʰurtʌ    पुर्विया purwija    पाखे pakʰe
+ADJ+FEM    चतुर्नी tsʌturniː    धुर्तिनी dʰurtiniː    पुर्विनी purwiniː    पखिनी pʌkʰeniː



The finite state transducer illustrated in Figure 3.35 is capable of analyzing and generating the non-o-ending type 1 adjectives and their forms illustrated in Table 3.52.






Figure 3.35: A finite state transducer for Type 1 marked adjectives

The phonological rules involved in this process are given in PR 3.10, which are compiled and composed with the finite state transducer illustrated in Figure 3.35.

Phonological rule PR 3.10
i. Halant ◌् is inserted between the consonant symbol and the feminine gender marker नी niː at the surface level.
Regular expression: [. .] -> ◌् || liquids _ न ◌ी .#.
ii. या ja is deleted before the feminine gender marker नी niː at the surface level.
Regular expression: य ◌ा -> [ ] || _ न ◌ी .#.

Type 2: Those non-o-ending adjectives in Nepali that inflect for comparative and superlative forms are grouped in this class. The adjectives in this group are, in fact, Sanskrit loan adjectives. They take the comparative marker -तर -tʌrʌ and the superlative marker -तम -tʌmʌ, forming the comparative and superlative forms respectively. Table 3.53 lists some examples of Sanskrit loan adjectives.
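The two marked-adjective patterns described above can be illustrated with a small Python sketch. It is an approximation under assumptions, not the compiled transducer: the function names are hypothetical, the set called liquids in PR 3.10 is assumed here to consist of र and ल, and the -इनी allomorph is not covered because its conditioning is not stated in the rule.

```python
HALANT = "\u094d"          # the halant inserted by PR 3.10 (i)
FEM_MARKER = "नी"           # feminine gender marker
JA = "या"                   # the sequence deleted by PR 3.10 (ii)
LIQUIDS = {"र", "ल"}        # assumption: the members of the set named liquids


def feminine(masculine: str) -> str:
    """Type 1: attach the feminine marker, applying PR 3.10 (i) and (ii)."""
    if masculine.endswith(JA):
        # (ii) the final ja is deleted before the marker
        return masculine[: -len(JA)] + FEM_MARKER
    if masculine[-1] in LIQUIDS:
        # (i) a halant is inserted between the liquid and the marker
        return masculine + HALANT + FEM_MARKER
    return masculine + FEM_MARKER


def comparative(positive: str) -> str:
    """Type 2: plain suffixation of the comparative marker, no rule involved."""
    return positive + "तर"


def superlative(positive: str) -> str:
    """Type 2: plain suffixation of the superlative marker, no rule involved."""
    return positive + "तम"
```

With these definitions, चतुर gives the feminine चतुर्नी, पुर्विया gives पुर्विनी, and गहन gives गहनतर and गहनतम.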






Table 3.53: Type 2 marked adjectives

Morphological Tags    less    low    rigorous    small
+ADJ+POSIT    न्यून njuːnʌ    निम्न nimnʌ    गहन gʌɦʌnʌ    लघु lʌgʰu
+ADJ+COMP    न्यूनतर njuːnʌ-tʌrʌ    निम्नतर nimnʌ-tʌrʌ    गहनतर gʌɦʌnʌ-tʌrʌ    लघुतर lʌgʰu-tʌrʌ
+ADJ+SUPER    न्यूनतम njuːnʌ-tʌmʌ    निम्नतम nimnʌ-tʌmʌ    गहनतम gʌɦʌnʌ-tʌmʌ    लघुतम lʌgʰu-tʌmʌ



The finite state transducer illustrated in Figure 3.36 is capable of analyzing and generating the non-o-ending type 2 adjectives and their forms illustrated in Table 3.53. No phonological rules are involved in this class of adjectives.



Figure 3.36: A finite state transducer for Sanskrit loan adjectives

ii. Unmarked adjectives

All those non-o-ending adjectives in Nepali which never take any marker are grouped in this class. The adjectives in this class remain unaltered. Table 3.54 lists some examples of unmarked adjectives.

Table 3.54: Unmarked adjectives

Morphological Tags    gentle    bad    new    rich
+ADJ    असल ʌsʌl    खराब kʰʌrab    नयाँ nʌjã    धनी dʰʌni



The finite state transducer illustrated in Figure 3.37 is capable of analyzing and generating the non-o-ending unmarked adjective forms illustrated in Table 3.54.






Figure 3.37: A finite state transducer for unmarked adjectives

3.5 Numerals

The numerals in Nepali are of two types: cardinal numbers and ordinal numbers.

3.5.1 Cardinal numbers

The cardinal numbers in Nepali from one to hundred, and some others such as हजार ɦʌdzar 'thousand', लाख lakʰ 'hundred thousand', करोड kʌrod̺ 'ten million', अरब ʌrʌb 'ten billion' and खरब kʰʌrʌb 'ten trillion', are written as a single word. The cardinal numbers appear with numeral classifiers and modify the head nouns. Table 3.55 lists some examples of cardinal numbers.



Table 3.55: Some cardinal numbers

Morphological Tags    Devanagari    IPA    Gloss
+NUM    शून्य    ʃuːnjʌ    zero
+NUM+CARD    एक    ek    one
+NUM+CARD    दुई    duiː    two
+NUM+CARD    तीन    tiːn    three
+NUM+CARD    चार    tsar    four
+NUM+CARD    पाँच    pãts    five
+NUM+CARD    छ    tsʰʌ    six
+NUM+CARD    सात    sat    seven
+NUM+CARD    आठ    at̺ʰ    eight
+NUM+CARD    नौ    nʌu    nine
+NUM+CARD    दस    dʌs    ten
+NUM+CARD    एघार    egʰarʌ    eleven
+NUM+CARD    बाह्र    bahrʌ    twelve
+NUM+CARD    तेह्र    tehrʌ    thirteen
+NUM+CARD    चौध    tsʌudʰʌ    fourteen
+NUM+CARD    पन्ध्र    pʌndʰrʌ    fifteen
+NUM+CARD    सोह्र    sohrʌ    sixteen
+NUM+CARD    सत्र    sʌtrʌ    seventeen
+NUM+CARD    अठार    ʌtʰarʌ    eighteen
+NUM+CARD    उन्नाइस    unnais    nineteen
+NUM+CARD    बीस    biːs    twenty
+NUM+CARD    एक्काइस    ekkais    twenty one
+NUM+CARD    पच्चीस    pʌtstsiːs    twenty five
+NUM+CARD    तीस    tiːs    thirty
+NUM+CARD    चालीस    tsaliːs    forty
+NUM+CARD    पचास    pʌtsas    fifty
+NUM+CARD    साठी    sat̺ʰiː    sixty
+NUM+CARD    सत्तरी    sʌttʌriː    seventy
+NUM+CARD    असी    ʌsiː    eighty
+NUM+CARD    नब्बे    nʌbbe    ninety
+NUM+CARD    सय    sʌjʌ    hundred
+NUM+CARD    हजार    ɦʌdzar    thousand
+NUM+CARD    लाख    lakʰ    hundred thousand
+NUM+CARD    करोड    kʌrod̺    ten million
+NUM+CARD    अरब    ʌrʌb    ten billion
+NUM+CARD    खरब    kʰʌrʌb    ten trillion



3.5.2 Ordinal number

The ordinal numbers in Nepali are of two types: regular and irregular.

a. Regular ordinal numbers: The numbers one, two, three, four and six constitute an exceptional set in the formation of ordinal numbers from cardinal numerals. Apart from this exceptional set, all the numerals take -औ -ʌũ as a suffix to form the ordinal numbers. Some examples are illustrated in Table 3.56.






Table 3.56: Some regular ordinal numbers

Morphological Tags    Devanagari    IPA    Gloss
+NUM+ORD    पाँच    pãtsʌũ    fifth
+NUM+ORD    सात    satʌũ    seventh
+NUM+ORD    आठ    atʰʌũ    eighth
+NUM+ORD    दस    dʌsʌũ    tenth
+NUM+ORD    बीस    biːsʌũ    twentieth
+NUM+ORD    सय    sʌjʌũ    hundredth
+NUM+ORD    हजार    hʌdzarʌũ    thousandth
+NUM+ORD    लाख    lakʰʌũ    hundred thousandth
+NUM+ORD    करोड    kʌrodʌũ    ten millionth



The finite state transducer for the cardinal numbers listed in Table 3.55 and the ordinal numbers listed in Table 3.56, excluding the exceptional set, is illustrated in Figure 3.38. It is capable of analyzing and generating these numeral forms.



Figure 3.38: A finite state transducer for cardinal numbers and regular ordinal numbers

The phonological rules involved in the regular numerals are given in PR 3.11, which are compiled and composed with the finite state transducer illustrated in Figure 3.38.

Phonological rule PR 3.11
i. Vowel sequence औ ʌũ is changed to its corresponding dependent vowel symbol ◌ौ ʌũ if the numeral ends with a consonant.
Regular expression: औ -> ◌ौ || cons _ .#.
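As an illustration of PR 3.11, the following Python sketch attaches the ordinal suffix to a cardinal numeral and chooses between the independent and the dependent vowel symbol. The function names are hypothetical, the class cons is assumed to be the Devanagari consonant letters in the Unicode range U+0915 to U+0939, and the suffix is written exactly as it appears in the rule; the exceptional set (one, two, three, four and six) is not handled.

```python
SUFFIX_INDEPENDENT = "\u0914"  # the suffix as an independent vowel letter
SUFFIX_DEPENDENT = "\u094c"    # its corresponding dependent vowel symbol


def is_consonant(ch: str) -> bool:
    """Assumption: 'cons' covers the Devanagari consonant letters U+0915..U+0939."""
    return "\u0915" <= ch <= "\u0939"


def regular_ordinal(cardinal: str) -> str:
    """Attach the ordinal suffix, using the dependent symbol after a consonant (PR 3.11)."""
    if is_consonant(cardinal[-1]):
        return cardinal + SUFFIX_DEPENDENT
    return cardinal + SUFFIX_INDEPENDENT
```

Since दस ends in the consonant letter स, regular_ordinal attaches the dependent symbol to it; a numeral ending in a vowel symbol would keep the independent letter.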






b. Irregular ordinal numbers: The ordinal numerals corresponding to the numbers one, two, three and four are different from the regular ordinal numerals. They inflect for number, gender, form and honorificity. Table 3.57, Table 3.58, Table 3.59 and Table 3.60 list the ordinal numerals of the numbers one, two, three and four, respectively, with their corresponding morphological tags.

Table 3.57: Irregular ordinal numbers of one

Morphological Tags    Devanagari    IPA    Gloss
+NUM+ORD+MASC    पहिलो    pʌhilo    first
+NUM+ORD+PL    पहिला    pʌhila    first
+NUM+ORD+OBL    पहिला    pʌhila    first
+NUM+ORD+HON    पहिला    pʌhila    first
+NUM+ORD+FEM    पहिली    pʌhiliː    first

Table 3.58: Irregular ordinal numbers of two

Morphological Tags    Devanagari    IPA    Gloss
+NUM+ORD+MASC    दोस्रो    dosro    second
+NUM+ORD+PL    दोस्रा    dosra    second
+NUM+ORD+OBL    दोस्रा    dosra    second
+NUM+ORD+HON    दोस्रा    dosra    second
+NUM+ORD+FEM    दोस्री    dosriː    second

Table 3.59: Irregular ordinal numbers of three

Morphological Tags    Devanagari    IPA    Gloss
+NUM+ORD+MASC    तेस्रो    tesro    third
+NUM+ORD+PL    तेस्रा    tesra    third
+NUM+ORD+OBL    तेस्रा    tesra    third
+NUM+ORD+HON    तेस्रा    tesra    third
+NUM+ORD+FEM    तेस्री    tesriː    third

Table 3.60: Irregular ordinal numbers of four

Morphological Tags    Devanagari    IPA    Gloss
+NUM+ORD+MASC    चौथो    tsʌutʰo    fourth
+NUM+ORD+PL    चौथा    tsʌutʰa    fourth
+NUM+ORD+OBL    चौथा    tsʌutʰa    fourth
+NUM+ORD+HON    चौथा    tsʌutʰa    fourth
+NUM+ORD+FEM    चौथी    tsʌutʰiː    fourth






The finite state transducer illustrated in Figure 3.39 is capable of analyzing and generating the ordinal numerals from numbers one, two, three and four and their corresponding forms illustrated in Table 3.57, Table 3.58, Table 3.59, Table 3.60.



Figure 3.39: A finite state transducer for irregular ordinal numerals

The phonological rules involved in irregular ordinal numerals are given in PR 3.12, which are compiled and composed with the finite state transducer illustrated in Figure 3.39.

Phonological rule PR 3.12
i. Stem final vowel ◌ो o of the o-ending irregular numeral of the lower language (i.e., surface level) is changed to vowel ◌ा a for plural, oblique and honorificity.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem final vowel ◌ो o of the o-ending irregular numeral of the lower language (i.e., surface level) is changed to vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.

c. Ordinal numbers loaned from Sanskrit

Some ordinal numbers in Nepali are loan words from Sanskrit. They are listed in Table 3.61.






Table 3.61: Some ordinal numbers from Sanskrit loan

Morphological Tags    Devanagari    IPA    Gloss
+NUM+ORD    प्रथम    prʌtʰʌm    first
+NUM+ORD    द्वितीय    dwitiːjʌ    second
+NUM+ORD    तृतीय    tritiːjʌ    third
+NUM+ORD    चतुर्थ    tsʌturtʰʌ    fourth
+NUM+ORD    पञ्चम    pʌntsʌm    fifth



The ordinal numbers borrowed from Sanskrit are encoded in the finite state transducer as demonstrated in Figure 3.40 and it is capable of analyzing and generating them.



Figure 3.40: A finite state transducer for ordinal numerals from Sanskrit loan

3.5.3 Other numerals

Some numerals in Nepali indicate frequency and also modify the head nouns. Such numerals are grouped into four classes, and they are listed in Table 3.62, Table 3.63, Table 3.64 and Table 3.65.



Table 3.62: Frequency numerals (I)

Morphological Tags    Devanagari    IPA    Gloss
+NUM+FREQ    एकोहोरो    ekoɦoro    one
+NUM+FREQ    दोहोरो    doɦoro    two
+NUM+FREQ    तेहोरो    teɦoro    three

Table 3.63: Frequency numerals (II)

Morphological Tags    Devanagari    IPA    Gloss
+NUM+FREQ    एकसरो    eksʌro    one layer
+NUM+FREQ    दुईसरो    duiːsʌro    two layer
+NUM+FREQ    तीनसरो    tiːnsʌro    three layer

Table 3.64: Frequency numerals (III)

Morphological Tags    Devanagari    IPA    Gloss
+NUM+FREQ    दोबर    dobʌr    twice/double
+NUM+FREQ    तेबर    tebʌr    thrice
+NUM+FREQ    चौबर    tsʌubʌr    four times

Table 3.65: Frequency numerals (IV)

Morphological Tags    Devanagari    IPA    Gloss
+NUM+FREQ    दुईगुना    duiːguna    two times
+NUM+FREQ    तीनगुना    tiːnguna    three times
+NUM+FREQ    चौगुना    tsʌuguna    four times



The finite state transducer illustrated in Figure 3.41 is capable of analyzing and generating the frequency numerals illustrated in Table 3.62, Table 3.63, Table 3.64 and Table 3.65.



Figure 3.41: A finite state transducer for frequency numerals

There are a few numerals which indicate a portion in the measurement of things, time and space. Some of these portion numerals are listed in Table 3.66.

Table 3.66: Some portion numerals

Morphological Tags    Devanagari    IPA    Gloss
+NUM+PORT    आधा    adha    half
+NUM+PORT    पौने    pʌune    (a number less) a quarter
+NUM+PORT    सवा    sʌwa    one and quarter
+NUM+PORT    डेढ    d̺ed̺ʰʌ    one and half
+NUM+PORT    साढे    sad̺ʰe    (a number and) half
+NUM+PORT    अढाइ    ʌd̺ʰai    two and half
+NUM+PORT    चौथाइ    tsʌutʰai    one fourth






The finite state transducer illustrated in Figure 3.42 is capable of analyzing and generating the portion numerals illustrated in Table 3.66.



Figure 3.42: A finite state transducer for portion numerals

3.6 Classifiers in Nepali

3.6.1 Numeral classifiers

There are two numeral classifiers in Nepali. -जना -dzʌna is a human masculine classifier and it does not inflect for anything. -वटा -wʌt̺a is a non-human classifier, but it inflects for human feminine. The numeral classifiers appear only with countable nouns. Table 3.67 lists these two numeral classifiers and their various forms.

Table 3.67: Numeral classifiers

Morphological Tags    Devanagari    IPA
+CLF+HUM    जना    dzʌna
+CLF+NHUM    वटा/ओटा    wʌt̺a/ot̺a
+CLF+FEM    वटी/ओटी    wʌt̺iː/ot̺iː

Figure 3.43: A finite state transducer for numeral classifiers






The finite state transducer illustrated in Figure 3.43 is capable of analyzing and generating the numeral classifiers illustrated in Table 3.67.



3.6.2 Quasi-classifiers

Quasi-classifiers in Nepali have lexical content of their own as well as the properties of a classifier. Each item in the list classifies a small set of nouns and also follows the numerals. Quasi-classifiers are related to mensurality or sortality. Like nouns and adjectives in Nepali, such classifiers are either o-ending or non-o-ending. The o-ending quasi-classifiers inflect for number and oblique features. Some examples of o-ending classifiers are given in Table 3.68.

Table 3.68: o-ending quasi-classifiers

Morphological Tags    Classifier 1    Classifier 2    Classifier 3
+CL+SG    कोसो koso    दानो dano    थोपो tʰopo
+CL+PL    कोसा kosa    दाना dana    थोपा tʰopa
+CL+OBL    कोसा kosa    दाना dana    थोपा tʰopa



The o-ending quasi-classifiers in Nepali are compiled into a finite state transducer as demonstrated in Figure 3.44, and it is capable of analyzing and generating the quasi-classifiers illustrated in Table 3.68.



Figure 3.44: A finite state transducer for general classifier type 1






The phonological rules involved in this set of quasi-classifiers are given in PR 3.13, which are compiled and composed with finite state transducer illustrated in Figure 3.44.



Phonological rule PR 3.13
i. Stem final vowel ◌ो o of the o-ending quasi-classifiers of the lower language (i.e., surface level) is changed to vowel ◌ा a for plural and oblique.
Regular expression: ◌ो -> ◌ा || _ .#.

The finite state transducer in Figure 3.44 is capable of analyzing and generating the quasi-classifiers illustrated in Table 3.68.

Non-o-ending quasi-classifiers do not inflect for anything. Table 3.69 presents some examples of non-o-ending quasi-classifiers in Nepali.

Table 3.69: General non-o-ending classifiers

Morphological Tags    Devanagari    IPA
+CL    पोटी    pot̺i
+CL    थुन    tʰun
+CL    जुवा    dzuwa
+CL    गाँस    gãs
+CL    चोइली    tsoili
+CL    खिल्ली    kʰilli
+CL    घरी    gʰʌri



The finite state transducer in Figure 3.45 is capable of analyzing and generating the quasi-classifiers illustrated in Table 3.69.



Figure 3.45: A finite state transducer for general classifier type 2



3.7 Summary

This chapter analyzed the nouns in Nepali. They can be grouped into two classes: o-ending and non-o-ending nouns. The o-ending nouns are further sub-grouped into four classes, and the non-o-ending nouns into two classes, viz. marked and unmarked. Marked non-o-ending nouns are of four types and unmarked nouns are of six types. The classification, which is done to match and implement the word categories in finite state technology, is based on the formal characteristic features possessed by the nouns in Nepali. Some of the phonological rules for one group of nouns are repeated for another group; they are minimized, delimiters are used where required, and the rules are implemented as regular expressions and finally composed with the main noun lexicon.

Personal pronouns in Nepali possess person, number, form and honorific features. Demonstrative, reflexive, reciprocal, definite and indefinite pronouns inconsistently possess number, form and honorific features. The formal grouping of the pronouns is significant for the illustration and demonstration of their finite state transducers. Since the number of pronouns is limited and their behavior is more or less idiosyncratic, they are directly encoded for creating the finite state network.

Adjectives in Nepali are of two major types: o-ending and non-o-ending. Non-o-ending adjectives are of two types: marked and unmarked. One group of marked adjectives shows the distinction between masculine and feminine gender, whereas another group, containing Sanskrit loans, shows three levels of degree: positive, comparative and superlative. Unmarked adjectives remain unaltered.

The numerals in Nepali are grouped into three main classes: cardinal, ordinal and other numerals. Except for some, all ordinal numerals are derived from the cardinal numerals. Some irregular ordinal numbers show distinctions for features like number, gender, honorificity and form.

The classifiers in Nepali are grouped into two classes: true classifiers and quasi-classifiers. The true classifiers inflect for gender, whereas some of the quasi-classifiers inflect for number and form.






CHAPTER 4

VERBAL MORPHOLOGY

4.0 Outline

This chapter presents the analysis of verb stems in Nepali. It consists of six sections. Section 4.1 discusses the characteristic features of verbs, namely, significant verb stem finals, transitivity, syllabicity and the sound a. In section 4.2, we discuss morphological processes like causativization, passivization and negativization. The stem formation concept is presented in section 4.3. Section 4.4 groups the verbs into various groups based on the features discussed above and presents them with their morphological tags. The finite state transducer of each group is illustrated. Section 4.5 deals with verbal inflections, which include tense, aspect, mood and participial forms. For every group of inflections the morphological tags and finite state transducers are illustrated. Section 4.6 summarizes the findings of the chapter.



4.1 Characteristics of verb in Nepali

4.1.1 Significant verb stem finals

The basic verb stems end with different sound segments. Some of the final segments are noteworthy from the morphophonological point of view. The morphological processes under consideration, such as passivization, causativization, negativization and other affixation processes, need information about the final segment of the verb stem to produce acceptable surface forms. The stem of a basic verb is identified by removing the past tense third person singular marker -यो -jo from the verb form, and the remaining segment is then analyzed with reference to various phenomena. The final segments which are significant from our point of view are discussed as follows:1

1 Pokharel (2010a) has mentioned various strategies for deriving the verb stems. Among them, the imperative singular form has been adopted here as the basic stem for simplicity, although it leaves some exceptions.
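The stem identification step described in 4.1.1, in which the past tense third person singular marker is removed from the verb form, can be sketched as follows. This is a minimal illustration in Python, and the function name is hypothetical; it simply strips the marker from the end of a written form and does not address the exceptions mentioned in the footnote.

```python
PAST_3SG = "यो"  # past tense third person singular marker


def basic_stem(past_form: str) -> str:
    """Return the verb stem left after removing the past 3SG marker."""
    if not past_form.endswith(PAST_3SG):
        raise ValueError("form does not end in the past 3SG marker")
    return past_form[: -len(PAST_3SG)]
```

Applied to आयो 'came', the function returns the stem आ 'come'.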






a. Vowel final stems

i. i-ending verb stems: A set of verb stems which end in the vowel इ i are listed in Table 4.1. The majority of the verb stems in this class are intransitive verbs, but some of them are transitive. Some examples of the latter are listed in Table 4.2. The verbs उफ्रि upʰri 'jump' and पक्रि pʌkri 'arrest' in (1a) and (1b), respectively, end with the vowel इ i.

(1) a. केटो उफ्रियो।
ket̺o upʰri-jo
boy jump-PST.3SG.MASC
'The boy jumped.'

b. प्रहरीले चोरलाई पक्रियो।
prʌɦʌri-le tsor-lai pʌkri-jo
police-ERG thief-DAT arrest-PST.3SG.MASC
'The police arrested the thief.'

Table 4.1: i-ending intransitive verb stems

Verb stem    IPA    Gloss
उफ्रि    upʰri-    'jump'
खुम्चि    kʰumtsi-    'shrink'
चोइटि    tsoit̺i-    'be pieces'
भत्कि    bʰʌtki-    'be broken'



The i-ending intransitive verb stems listed in Table 4.1 and the i-ending transitive stems listed in Table 4.2 look similar in their form, but they differ in their further morphology.

Table 4.2: i-ending transitive verb stems

Base form    IPA    Gloss of stem
पक्रि    pʌkri-    'arrest'
पर्खि    pʌrkʰi-    'wait'
बिर्सि    birsi-    'forget'
मन्सि    mʌnsi-    'throw away'
सम्झि    sʌmdzʰi-    'remember'
कुल्चि    kultsi-    'tread'
उइँटि    uĩt̺i-    'spindle'
दि    di-    'give'
लि    li-    'take'






The i-ending verb stems listed in Table 4.2a behave differently. The vowel उ u is obligatorily inserted between the stem and the suffix if the suffix that follows the stem begins with न् n, and उँ ũ is inserted if the suffix begins with छ् tsʰ or थ् tʰ.

Table 4.2a: i-ending transitive verb stems

Verb stem    IPA    Gloss
पि    pi    'drink'
सि    si    'sew'
जि    dzi    'live'



The vowel इ i at the end of the verb stem optionally drops without any change in meaning. The verb stem पग्लि pʌgli 'melt' in (2a) has retained the vowel इ i, whereas in the verb stem पग्ल् pʌgl 'melt' in (2b) the vowel इ i is dropped.

(2) a. हिउँ पग्लियो।
hiũ pʌgli-jo
ice melt-PST.3SG.MASC
'The ice melted.'

b. हिउँ पग्ल्यो।
hiũ pʌgl-jo
ice melt-PST.3SG.MASC
'The ice melted.'



The vowel इ i at the end of these verb stems is also optionally changed to अ ʌ, especially when the suffix begins with न् n, द् d or ए e. For example, when -नु -nu '-INF' is attached to the verb stem, इ i optionally changes to अ ʌ. Table 4.3 lists the alternative forms that result from the change of इ i to अ ʌ in i-ending verb stems.






Table 4.3: Alternative forms of i-ending verb stems

-i forms    IPA    -ʌ forms    IPA
उफ्रि    upʰri-    उफ्र    upʰrʌ-
खुम्चि    kʰumtsi-    खुम्च    kʰumtsʌ-
चोइटि    tsoit̺i-    चोइट    tsoit̺ʌ-
भत्कि    bʰʌtki-    भत्क    bʰʌtkʌ-
सिउरि    siuri-    सिउर    siurʌ-
बिग्रि    bigri-    बिग्र    bigrʌ-
सप्रि    sʌpri-    सप्र    sʌprʌ-
उघ्रि    ugʰri-    उघ्र    ugʰrʌ-
पग्लि    pʌgli-    पग्ल    pʌglʌ-
उग्लि    ugli-    उग्ल    uglʌ-
उक्लि    ukli-    उक्ल    uklʌ-
पक्रि    pʌkri-    पक्र    pʌkrʌ-
पर्खि    pʌrkʰi-    पर्ख    pʌrkʰʌ-
बिर्सि    birsi-    बिर्स    birsʌ-



ii. a-ending verb stems: Some of the verb stems ending with the vowel आ a are listed in Table 4.4 and Table 4.5. Verb stems in this group are of both intransitive and transitive types. The verb stems कमा kʌma- 'earn' in (3a) and आ a- 'come' in (3b) end with the vowel आ a.

(3) a. उसले धेरै पैसा कमाएको छ।
us-le dʰerʌi pʌisa kʌma-eko tsʰʌ
3SG.OBL-ERG more money earn-PERF be-PST.3SG.MASC
'He has earned a lot of money.'

b. राम स्कुलबाट घर आयो।
ram skul-bat̺ʌ gʰʌr a-jo
Ram school-ABL house come-PST.3SG.MASC
'Ram came home from school.'






Table 4.4: a-ending verb stems (group 1)

Verb stem    IPA    Gloss
अघा    ʌgʰa-    'satisfy'
कमा    kʌma-    'earn'
टकरा    t̺ʌkʌra-    'be broken'
मुस्कुरा    muskura-    'insert'
पा    pa-    'get'
आ    a-    'come'
छा    tsʰa-    'cover the roof'
बा    ba-    'open (mouth)'
ला    la-    'put on'
भ्या    bʰja-    'manage'
ब्या    bja-    'give birth'

Table 4.5: a-ending verb stems (group 2)

Verb stem    IPA    Gloss
खा    kʰa-    'eat'
जा    dza-    'go'



The a-ending verb stems are also of two kinds. In the set of verbs listed in Table 4.4, the vowel उ u is inserted between the stem and the suffix if the following suffix begins with न् n, and उँ ũ is inserted if it begins with छ् tsʰ or थ् tʰ. The verb stems listed in Table 4.5 do not take उ u under the conditions stated above. In this group, न् n is inserted in the non-past tense and past habitual aspect.

iii. o-ending verb stems: There are a few verb stems which end with ओ o. The stem final ओ o obligatorily changes to उ u if the following suffix begins with the sound segments छ् tsʰ, द् d, थ् tʰ or न् n, and न् n is obligatorily inserted in the non-past tense.

Table 4.6 lists some of the o-ending verb stems and Table 4.7a shows the change of ओ o to उ u under the condition mentioned above.






Table 4.6: o-ending verb stems

Verb stem    IPA    Gloss
रो    ro-    'weep'
धो    dʰo-    'wash'
छो    tsʰo-    'touch'

Table 4.7a: Change of o to u in o-ending verb stems

Verb stem    IPA    Gloss
रुनु    ru-nu    'to weep'
धुनु    dʰu-nu    'to wash'
छुनु    tsʰu-nu    'to touch'
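The change of the stem final ओ o to उ u illustrated in Table 4.7a can be sketched in Python as below. The function name is hypothetical and the sketch works on the written Devanagari forms; it only covers the vowel change before suffixes beginning with छ, द, थ or न, and the insertion of न् n in the non-past tense is not modeled.

```python
O_SIGN = "\u094b"  # dependent sign of o at the end of the stem
U_SIGN = "\u0941"  # dependent sign of u
TRIGGERS = ("छ", "द", "थ", "न")  # suffix-initial consonants that trigger the change


def attach_to_o_stem(stem: str, suffix: str) -> str:
    """Attach a suffix to an o-ending stem, changing o to u before a trigger."""
    if stem.endswith(O_SIGN) and suffix.startswith(TRIGGERS):
        stem = stem[:-1] + U_SIGN
    return stem + suffix
```

Attaching the infinitive suffix नु to रो gives रुनु 'to weep', as in the first row of Table 4.7a.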



iv. ʌ-ending verb stems: There is a small set of verbs which end with the vowel अ ʌ. The vowel अ ʌ at the end of the verb stem drops if a suffix beginning with ए e, इ i, उ u or ओ o is attached. Table 4.7b lists some ʌ-ending verb stems and Table 4.7c shows the dropping of the vowel ʌ.

Table 4.7b: ʌ-ending verb stems

Verb stem    IPA    Gloss
सह    sʌɦʌ-    'tolerate'
रह    rʌɦʌ-    'remain'

Table 4.7c: ʌ-ending verb stems (ʌ-dropped)

Verb stem    IPA    Gloss
सहेर    sʌɦ-erʌ    'tolerate-CONJUNCT'
रहेर    rʌɦ-erʌ    'remain-CONJUNCT'



In the vowel-final verb stems, except for the verbs in Table 4.2a and Table 4.4, a semantically null element न् n is inserted between the stem and the suffix if the suffix begins with छ् tsʰ or थ् tʰ. In the case of the verb stems in Table 4.2a and Table 4.4, however, only ◌ँद ̃dʌ is inserted, after उ u has been inserted for some other purpose.






b. Consonant final stems

i. Voiceless consonant ending stems: The verb stems that end with voiceless consonants are of both intransitive and transitive types. Some examples of verb stems ending with voiceless consonants are listed in Table 4.7d.

Table 4.7d: Verb stems ending with a voiceless consonant

Verb stem    IPA    Gloss
कस्    kʌs-    'tighten'
काँप्    kãp-    'tremble'
घसेट्    gʰʌset̺-    'drag'
जाक्    dzak-    'insert'
फ्याँक्    pʰjãk-    'throw'
नाच्    nats-    'dance'



In this group of verb stems, the semantically null elements त tʌ or द dʌ are optionally inserted between the stem and the suffix if the suffix begins with छ् tsʰ or थ् tʰ. These forms are used only in the non-past tense and past habitual aspect. The alternative forms of the stems are listed in Table 4.8.



Table 4.8: Alternative forms from stems ending with voiceless consonant

Base stem    Form 1    Form 2
कस् kʌs    कस्त kʌstʌ-    कस्द kʌsdʌ-
काँप् kãp    काँप्त kãptʌ-    काँप्द kãpdʌ-
घसेट् gʰʌset̺    घसेट्त gʰʌset̺tʌ-    घसेट्द gʰʌset̺dʌ-
जाक् dzak    जाक्त dzaktʌ-    जाक्द dzakdʌ-
फ्याँक् pʰjãk    फ्याँक्त pʰjãktʌ-    फ्याँक्द pʰjãkdʌ-
नाच् nats    नाच्त natstʌ-    नाच्द natsdʌ-



ii. Voiced consonant ending stems: The verb stems that end with voiced consonants are of both intransitive and transitive types. Some examples of verb stems ending with voiced consonants are listed in Table 4.9.

Table 4.9: Verb stems ending with voiced consonant

Verb stem    IPA    Gloss
बोल्    bol-    'speak'
पिँद्    pĩd-    'grind'
थुन्    tʰun-    'close'
पछार्    pʌtsʰar-    'throw down'
डुब्    d̺ub-    'sink'
छाम्    tsʰam-    'feel'
खोज्    kʰodz-    'search'



In this group of stems also, a semantically null element द dʌ is optionally inserted between the stem and the suffix if the suffix begins with छ् tsʰ or थ् tʰ. These forms are used only in the non-past tense and past habitual aspect. The alternative forms of the stems are listed in Table 4.10.



Table 4.10: Alternative forms from stems ending with voiced consonant

Base stem    Alternative form
बोल् bol-    बोल्द boldʌ-
पिँद् pĩd-    पिँद्द pĩddʌ-
थुन् tʰun-    थुन्द tʰundʌ-
पछार् pʌtsʰar-    पछार्द pʌtsʰardʌ-
डुब् d̺ub-    डुब्द d̺ubdʌ-
छाम् tsʰam-    छाम्द tsʰamdʌ-
खोज् kʰodz-    खोज्द kʰodzdʌ-
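The alternative forms of Table 4.8 and Table 4.10 can be generated from the IPA stems with a short Python sketch. The function names are hypothetical, and the test for voicelessness is an assumption made for illustration: a stem is treated as ending in a voiceless consonant if, after diacritics and the aspiration mark are ignored, its last symbol is p, t, k or s. The sketch is meant only for consonant final stems.

```python
import unicodedata

VOICELESS_FINALS = set("ptks")  # assumption: last base symbol of a voiceless final


def ends_voiceless(ipa_stem: str) -> bool:
    """Check the last base symbol, ignoring combining diacritics and aspiration."""
    base = [
        ch
        for ch in unicodedata.normalize("NFD", ipa_stem)
        if not unicodedata.combining(ch) and ch != "ʰ"
    ]
    return base[-1] in VOICELESS_FINALS


def alternative_forms(ipa_stem: str) -> list:
    """Return the optional stem forms used before suffixes in tsʰ or tʰ."""
    if ends_voiceless(ipa_stem):
        return [ipa_stem + "tʌ-", ipa_stem + "dʌ-"]
    return [ipa_stem + "dʌ-"]
```

For the stem kʌs the sketch returns kʌstʌ- and kʌsdʌ-, and for bol it returns only boldʌ-, in line with the two tables.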



4.1.2 Transitivity

Transitivity is the number of arguments that a verb takes (Katamba 1993:256-62; Payne 1997:171). Transitivity is significant in verbs: the morphology of the verbs can be further analyzed in terms of this feature.



a. Intransitive verbs

Those verbs which take only one argument, as the subject noun phrase, are intransitive verbs. In example (4) the verb उठ् ut̺ʰ 'get up' has taken only one argument, उ u 'he', as its subject, and in example (5) the verb बस् bʌs 'sit' has taken only one argument, उ u 'he', as its subject; therefore, they are intransitive verbs.

(4) उ बिहानै उठ्यो।
u bihan-ʌi ut̺ʰ-jo
he morning-EMP rise-PST.3SG.MASC
'He got up early in the morning.'

(5) उ सँधै घरमा बस्छ।
u sʌ̃dhʌi gʰʌr-ma bʌs-tsʰʌ
3SG always home-LOC sit-NPST.3SG.MASC
'He always stays at home.'



The other verbs listed in Table 4.11, such as कुद् kud 'run', बस् bʌs 'sit', सुत् sut 'sleep', etc., also take only one argument as the subject.

Table 4.11: Intransitive verbs

Intransitive verb    IPA    Gloss
उठ्    ut̺ʰ-    'wake up'
कुद्    kud-    'run'
बस्    bʌs-    'sit'
लड्    lʌd̺-    'fall down'
सुत्    sut-    'sleep'
अघा    ʌgʰa-    'satisfied'



b. Transitive/ditransitive verbs

Those verbs which take two arguments are said to be transitive, and those verbs which take three arguments are said to be ditransitive. Both types of verbs are kept here under the same group, as they behave in the same way at the morphological level. Tables 4.12 and 4.13 list the transitive verbs and ditransitive verbs, respectively. The verb काट् kat̺ 'cut' in (6) has taken two arguments, श्याम sjam 'Shyam' and रुख rukʰ 'tree', as subject and object of the sentence, respectively. And the verb दि di 'give' in (7) has taken three arguments, मै mʌi '1SG', उस् us 'he.OBL' and किताब kitab 'book', as subject, indirect object and direct object of the sentence, respectively.

(6) श्यामले रुख काट्यो।
sjam-le rukʰ kat̺-jo
Shyam-ERG tree cut-PST.3SG.MASC
'Shyam cut the tree.'

(7) मैले उसलाई किताब दिएँ।
mʌi-le us-lai kitab di-ẽ
1SG.OBL-ERG 3SG.OBL-DAT book give-PST.1SG.MASC
'I gave him a book.'

The transitive verbs listed in Table 4.12 take only two arguments, as subject and object, and the ditransitive verbs listed in Table 4.13 take three arguments, as subject, indirect object and direct object.

Table 4.12: Some transitive verbs

Transitive verb    IPA    Gloss
काट्    kat̺-    'cut'
खा    kʰa-    'eat'
चुस्    tsus-    'suck'
पढ्    pʌd̺ʰ-    'read'
टोक्    t̺ok-    'bite'

Table 4.13: Some ditransitive verbs

Ditransitive verb    IPA    Gloss
तिर्    tir-    'pay'
बेच्    bets-    'sell'
दि    di-    'give'
लेख्    lekʰ-    'write'
सोध्    sodʰ-    'ask'



4.1.3 Syllabicity

Nepali verb stems can be grouped into two classes based on the number of syllables in a stem. This feature is significant especially in causative stem formation.

a. Monosyllabic verb stems

Those verb stems which have only one syllable are said to be monosyllabic verb stems. Some examples are listed in Table 4.14.



Table 4.14: Monosyllabic verb stems

Verb stem    IPA    Gloss
बोल्    bol-    'speak'
खा    kʰa-    'eat'
पिँद्    pĩd-    'grind'
थुन्    tʰun-    'close'
कस्    kʌs-    'tighten'
डुब्    d̺ub-    'sink'
छाम्    tsʰam-    'feel'
खोज्    kʰodz-    'search'
खोल्    kʰol-    'open'
सुक्    suk-    'be dried'
जा    dza-    'go'
दि    di-    'give'
धो    dʰo-    'wash'
रो    ro-    'weep'
सि    si-    'sew'
पि    pi-    'drink'



b. Polysyllabic verb stems

Those verb stems which are formed from two or more syllables are said to be polysyllabic verb stems. Some examples are illustrated in Table 4.15.

Table 4.15: Polysyllabic verb stems

Verb stem    IPA    Gloss
उफ्रि    upʰri-    'jump'
खुम्चि    kʰumtsi-    'shrink'
भत्कि    bʰʌtki-    'be broken'
पछार्    pʌtsʰar-    'throw down'
घसेट्    gʰʌset̺-    'drag'
मुस्कुरा    muskura-    'insert'
निचोर्    nitsor-    'squeeze'
निमोठ्    nimot̺ʰ-    'twist'
चिथोर्    tsitʰor-    'scratch'
छिमल्    tsʰimʌl-    'prune'
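The grouping by syllabicity in Tables 4.14 and 4.15 can be approximated from the IPA stems by counting vowel nuclei. The Python sketch below is a rough illustration only: the function names are hypothetical, the vowel inventory is limited to the symbols used in this chapter, and a run of adjacent vowels is counted as a single nucleus, which is an assumption rather than a claim about Nepali syllable structure.

```python
import re
import unicodedata

VOWELS = "aeiouʌ"  # vowel symbols used in the transcriptions of this chapter


def count_syllables(ipa_stem: str) -> int:
    """Count vowel nuclei; adjacent vowels are treated as one nucleus (assumption)."""
    plain = "".join(
        ch
        for ch in unicodedata.normalize("NFD", ipa_stem)
        if not unicodedata.combining(ch) and ch != "ː"
    )
    return len(re.findall("[" + VOWELS + "]+", plain))


def is_monosyllabic(ipa_stem: str) -> bool:
    """True for stems of the monosyllabic class, False for the polysyllabic class."""
    return count_syllables(ipa_stem) == 1
```

Under these assumptions bol and kʰa count as monosyllabic, while upʰri, pʌtsʰar and muskura count as polysyllabic.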






4.1.4 Sound आ a

The sound आ a appears in Nepali verb stems in two manifestations: as an ordinary vowel phoneme आ a, and as the causative marker -आ -a in causative verb stems. The presence or absence of the आ a sound in the base verb stem is significant for forming causative stems. Basic verb stems can therefore be grouped into two classes: stems with the आ a sound and stems without it. Some examples of the former group are listed in Table 4.16 and of the latter group in Table 4.17.

Table 4.16: Verb stems with a sound

Verb stem    IPA         Gloss
खाँद्         kʰãd-       'press down'
गाल्         gal-        'melt'
छान्         tsʰan-      'choose'
पछार्        pʌtsʰar-    'throw down'
कोचार्       kotsar-     'insert into'
डकार्        d̺ʌkar-      'belch'



Table 4.17: Verb stems without a sound

Verb stem    IPA         Gloss
तिर्          tir-        'pay'
बल्          bʌl-        'burn'
खोप्         kʰop-       'cut deep'
घट्          gʰʌt̺-       'be less'
चिमट्        tsimʌt̺-     'pinch'
छिमल्       tsʰimʌl-    'prune'



4.2 Morphological processes

4.2.1 Causativization/transitivization

In transitivization, an argument is added irrespective of its role, whereas in causativization the added argument is necessarily the causer. The morphological change in the verb stem and the syntactic make-up are the same in both processes; however, the interpretation may differ semantically (Katamba 1993:274-5; Pokharel 2054VS:6-16). In this study, both are treated as a single process. In sentence (8a), the verb सुत्यो sut-jo 'sleep-PST.3SG' is non-causative and takes बच्चो bʌtstso 'child' as the subject of the sentence. When it is causativized as सुताइन् sut-a-in 'sleep-CAUS-PST.3SG.FEM.HON' in (8b), it takes a new subject, आमा ama 'mother', as the causer, and the subject of the non-causative construction is demoted to the object of the causativized verb. So, in the process of causativization, a morphological causative marker is suffixed to the verb stem and is followed by the agreement markers. Table 4.18 lists some examples of such causative verb stems.



(8) a. बच्चो सुत्यो।
       bʌtstso sut-jo
       child.SG.MASC sleep-PST.3SG.MASC
       'The child slept.'
    b. आमाले बच्चालाई सुताइन्।
       ama-le bʌtstsa-lai sut-a-in
       mother-ERG child-DAT sleep-CAUS-PST.3SG.FEM.HON
       'The mother made the child sleep.'

Table 4.18: Causative verb stems

Causative verb    IPA       Gloss
उठा               ut̺ʰ-a     'cause to wake up'
सुता               sut-a     'cause to sleep'
तिरा               tir-a     'cause to pay'
लेखा               lekʰ-a    'cause to write'
भना               bʰʌn-a    'cause to say'



Some ways of causative formation

a. by -आ -a suffix

Causativization by the causative marker -आ -a is the most regular process, and the bulk of non-causative stems become causative stems in this way. The verb stems listed in Table 4.18 are formed by this method.2

b. by both -आ -a and -आल् -al suffixes

A small set of verb stems take the marker -आल् -al in addition to the marker -आ -a to form causative stems. For example, the verb stem खस् kʰʌs 'drop' in (9a) is causativized by the marker -आ -a in (9b) and by -आल् -al in (9c). Table 4.19 lists some examples of this type of causative stem formation.

(9) a. ढुङ्गा खस्यो।
       d̺ʰuŋga kʰʌs-jo
       stone drop-PST.3SG.MASC
       'The stone dropped.'
    b. केटाले ढुङ्गा खसायो।
       keta-le d̺ʰuŋga kʰʌs-a-jo
       boy-ERG stone drop-CAUSE-PST.3SG.MASC
       'The boy dropped a stone.'
    c. केटाले ढुङ्गा खसाल्यो।
       keta-le d̺ʰuŋga kʰʌs-al-jo
       boy-ERG stone drop-CAUSE-PST.3SG.MASC
       'The boy dropped a stone.'

Table 4.19: Verb stems forming causatives with -आ -a and -आल् -al

Base           Gloss         Causative                      Gloss
बस् bʌs-       'sit'         बसा/बसाल् bʌsa-/bʌsal-         'cause to sit'
खस् kʰʌs-      'drop'        खसा/खसाल् kʰʌsa-/kʰʌsal-       'cause to drop'
चुँड् tsũd̺-     'snatch'      चुँडा/चुँडाल् tsũd̺a-/tsũd̺al-      'cause to snatch'
छिन् tsʰin-    'chop off'    छिना/छिनाल् tsʰina-/tsʰinal-    'cause to chop off'

2 Most Nepali grammarians believe that the basic causative marker is -आउ -au. In this study, however, -आ -a is assumed to be the basic causative marker simply for computational purposes.



c. by अ ʌ → आ a

A small set of monosyllabic verb stems with the vowel अ ʌ between consonants (i.e. CʌC structure) form the causative stem by changing the vowel अ ʌ to आ a. The verb stem मर् mʌr 'die' in (10a) is causativized as मार् mar 'kill' in (10b). Some of the verb stems that form causative stems in this way are listed in Table 4.20.

(10) a. मृग मर्यो।
        mrigʌ mʌr-jo
        deer die-PST.3SG.MASC
        'The deer died.'
     b. बाघले मृग मार्यो।
        bagʰ-le mrigʌ mar-jo
        tiger-ERG deer die-CAUSE-PST.3SG.MASC
        'The tiger killed the deer.'

Table 4.20: Verb stems forming causatives by changing अ ʌ to आ a

Base verb     Gloss          Causative     Gloss
मर् mʌr-      'die'          मार् mar-     'kill'
सर् sʌr-      'shift'        सार् sar-     'cause to shift'
चल् tsʌl-     'move'         चाल् tsal-    'cause to move'
टर् t̺ʌr-      'pass over'    टार् t̺ar-     'cause to pass over'
पर् pʌr-      'fall'         पार् par-     'cause to fall'
गल् gʌl-      'melt'         गाल् gal-     'cause to melt'
बल् bʌl-      'burn'         बाल् bal-     'cause to burn'



d. by उ u → ओ o

Another set of monosyllabic verb stems with the vowel उ u between consonants (i.e. CuC structure) form the causative stem by changing the vowel u to o. The verb stem खुल् kʰul 'open' in (11a) is causativized as खोल् kʰol 'open.CAUSE' in (11b). Some of the verb stems that form causative stems in this way are listed in Table 4.21a.

(11) a. ढोका खुल्यो।
        d̺ʰoka kʰul-jo
        door open-PST.3SG.MASC
        'The door opened.'
     b. पालेले ढोका खोल्यो।
        pale-le d̺ʰoka kʰol-jo
        gate-keeper-ERG door open.CAUSE-PST.3SG.MASC
        'The gate keeper opened the door.'

Table 4.21a: Verb stems forming causatives by changing उ u to ओ o

Base verb        Gloss               Causative        Gloss
छुट् tsʰut̺-      'be left behind'    छोड् tsʰod̺-      'cause to be left behind'
खुल् kʰul-       'open'              खोल् kʰol-       'cause to open'
फुट् pʰut̺-       'break'             फोड् pʰod̺-       'cause to break'
घुल् gʰul-       'dissolve'          घोल् gʰol-       'cause to dissolve'



Interestingly, both members of each pair listed in Table 4.21a can also be causativized with the causative marker -आ -a, like the verb stems listed in Table 4.18. The causative verb stems of this set are listed in Table 4.21b.3

Table 4.21b: Verb stems forming causatives by suffixing -आ -a

Base verb        Gloss               Causative          Gloss
छुट् tsʰut̺-      'be left behind'    छुटा tsʰut̺-a       'cause to be left behind'
खुल् kʰul-       'open'              खुला kʰul-a        'cause to open'
फुट् pʰut̺-       'break'             फुटा pʰut̺-a        'cause to break'
घुल् gʰul-       'dissolve'          घुला gʰul-a        'cause to dissolve'
छोड् tsʰod̺-      'be left behind'    छोडा tsʰod̺-a       'cause to be left behind'
खोल् kʰol-       'open'              खोला kʰol-a        'cause to open'
फोड् pʰod̺-       'break'             फोडा pʰod̺-a        'cause to break'
घोल् gʰol-       'dissolve'          घोला gʰol-a        'cause to dissolve'

3 In Table 4.21a, the change of ट् t̺ to ड् d̺ has not been discussed here (see Pokharel 2054VS). The causativizations shown in Table 4.21a and Table 4.21b have slightly different semantics.



e. by आ a insertion

A subset of polysyllabic i-ending verb stems containing a consonant cluster form the causative stem by inserting the vowel आ a between the consonants of the cluster. The verb stem पग्लि pʌgli 'melt' in (12a) is causativized as पगाल् pʌgal 'melt.CAUSE' in (12b). Some examples of verb stems that undergo this process are listed in Table 4.22.

(12) a. हिउँ पग्लियो।
        ɦiũ pʌgli-jo
        ice melt-PST.3SG.MASC
        'The ice melted.'
     b. घामले हिउँ पगाल्यो।
        gʰam-le ɦiũ pʌgal-jo
        sun-ERG ice melt.CAUSE-PST.3SG.MASC
        'The sun melted the ice.'

Table 4.22: Verb stems forming causatives by inserting आ a

Base verb         Gloss         Causative        Gloss
उफ्रि upʰri-       'jump'        उफार् upʰar-     'cause to jump'
बिग्रि bigri-       'spoil'       बिगार् bigar-    'cause to spoil'
सप्रि sʌpri-       'flourish'    सपार् sʌpar-     'cause to flourish'
उघ्रि ugʰri-       'open'        उघार् ugʰar-     'cause to open'
पग्लि pʌgli-       'melt'        पगाल् pʌgal-     'cause to melt'
उक्लि ukli-        'climb up'    उकाल् ukal-      'cause to climb up'
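The five formation strategies (a) to (e) can be sketched as string operations on IPA stems. The following is only an illustrative sketch, not the finite state implementation used in this study: the function name `causativize` and the strategy labels are hypothetical, and the consonant change seen in pairs such as tsʰut̺- / tsʰod̺- is deliberately left out.

```python
def causativize(stem: str, strategy: str) -> str:
    """Return a causative stem in IPA (hyphens omitted).

    The strategy labels are hypothetical names for processes (a)-(e).
    """
    if strategy == "suffix_a":       # (a) lekʰ -> lekʰa
        return stem + "a"
    if strategy == "suffix_al":      # (b) kʰʌs -> kʰʌsal
        return stem + "al"
    if strategy == "vowel_to_a":     # (c) CʌC stems: mʌr -> mar
        return stem.replace("ʌ", "a", 1)
    if strategy == "u_to_o":         # (d) CuC stems: kʰul -> kʰol
        return stem.replace("u", "o", 1)
    if strategy == "insert_a":       # (e) i-ending cluster stems: pʌgli -> pʌgal
        if not stem.endswith("i"):
            raise ValueError("insert_a expects an i-ending stem")
        body = stem[:-1]             # drop the stem-final i
        # insert a before the last consonant of the cluster
        return body[:-1] + "a" + body[-1]
    raise ValueError(f"unknown strategy: {strategy}")


for stem, strategy in [("lekʰ", "suffix_a"), ("kʰʌs", "suffix_al"),
                       ("mʌr", "vowel_to_a"), ("kʰul", "u_to_o"),
                       ("pʌgli", "insert_a")]:
    print(stem, "->", causativize(stem, strategy))
```

Which strategy a given stem takes is a lexical property in this sketch; it has to be supplied by the caller rather than predicted from the stem shape.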



It is now clear that causative stem formation from base verb stems depends on various features of the verb stem, such as syllabicity, the presence or absence of the आ a sound in the stem, transitivity and the stem-final segment.4

4 See Adhikari (2062VS) and Pokharel (2054VS) for detailed information.

4.2.2 Passivization

Passivization is the opposite of causativization in terms of syntax. When passivization takes place, the subject noun phrase is either demoted to a postpositional phrase or dropped (Katamba 1993:268-9; Pokharel 2054VS:1-5). In Nepali, passivization of intransitive verbs is also possible, but it is restricted to default agreement (i.e., third person singular) and is limited in some other aspects of morphology and interpretation as well (Pokharel 2054VS; Adhikari 2062VS). Passivization of transitive and causative verbs, by contrast, is available across the full morphological paradigm and range of interpretations. In both cases the passive marker is the same, namely -इ -i, which follows the non-passive stem. The verb सुत् sut 'sleep' in (13a) is intransitive, and सुति sut-i 'sleep-PASS' in (13b) is its passive form. The verb लेख् lekʰ 'write' in (13c) is transitive, and लेखि lekʰ-i 'write-PASS' in (13d) is its passive form; लेखा lekʰ-a 'write-CAUSE' in (13e) is the causative stem, and लेखाइ lekʰ-a-i 'write-CAUSE-PASS' in (13f) is the causative-passive stem. Therefore, a passive stem can, at least theoretically, be derived from intransitive, transitive and causative verb stems. Table 4.23 lists some passive forms of verbs.

(13) a. म आज राम्ररी सुतेँ।
        mʌ adzʌ ramrʌri sut-ẽ
        1SG today nice sleep-PST.1SG
        'I slept nicely today.'
     b. आज राम्ररी सुतियो।
        adzʌ ramrʌri sut-i-jo
        today nice sleep-PASS-PST.3SG.MASC
        '(Myself) slept nicely today.'
     c. उसले एउटा चिठ्ठी लेख्यो।
        us-le eut̺a tsitʰtʰiː lekʰ-jo
        3SG-ERG one.CLF letter write-PST.3SG.MASC
        'He wrote a letter.'
     d. उसबाट एउटा चिठ्ठी लेखियो।
        us-bat̺ʌ eut̺a tsitʰtʰiː lekʰ-i-jo
        3SG-ABL one.CLF letter write-PASS-PST.3SG.MASC
        'A letter was written by him.'
     e. उसले एउटा चिठ्ठी लेखायो।
        us-le eut̺a tsitʰtʰiː lekʰ-a-jo
        3SG-ERG one.CLF letter write-CAUS-PST.3SG.MASC
        'He caused (someone) to write a letter.'
     f. उसबाट एउटा चिठ्ठी लेखाइयो।
        us-bat̺ʌ eut̺a tsitʰtʰiː lekʰ-a-i-jo
        3SG-ABL one.CLF letter write-CAUS-PASS-PST.3SG.MASC
        'He was made to write a letter.'



Table 4.23: Some passive verb stems

Passive verb    IPA          Gloss
उठि             ut̺ʰ-i-       'be woken up'
सुति             sut-i-       'be slept'
तिराइ           tir-a-i-     'cause to be paid'
लेखाइ           lekʰ-a-i-    'cause to be written'
अघाइ           ʌgʰa-i-      'be satisfied'



4.2.3 Negativization

Negativization in Nepali is primarily an affixation process that includes both prefixation and suffixation. The negative marker न nʌ 'NEG' is used in both cases; its form is constant in prefixation, whereas it is slightly modified in suffixation owing to morphophonemic changes (Pokharel 2054VS:40-6).

a. Prefixation

Negativization by prefixation takes place in the moods (potential, optative and imperative), the aspects (perfect and imperfect) and the participial forms (absolutive, conjunctive, infinitive, purposive, perfective, prospective and conditional). Table 4.24 shows negation by prefixation for the verb खा kʰa- 'eat'.

Table 4.24: Negation by the prefixation of negative marker न- nʌ-

Grammatical category      Positive           Negative
Potential                 खाला kʰala         नखाला nʌ-kʰala
Optative                  खाएस् kʰaes        नखाएस् nʌ-kʰaes
Imperative                खा kʰa             नखा nʌ-kʰa
Perfect aspect            खाएको kʰa-eko      नखाएको nʌ-kʰa-eko
Imperfect aspect          खाँदै kʰã-dʌi        नखाँदै nʌ-kʰã-dʌi
Absolutive                खाई kʰa-iː          नखाई nʌ-kʰa-iː
Conjunctive participle    खाएर kʰa-erʌ       नखाएर nʌ-kʰa-erʌ
Infinitive                खानु kʰa-nu         नखानु nʌ-kʰa-nu
Purposive                 खान kʰa-nʌ         नखान nʌ-kʰa-nʌ
Conditional               खाए kʰa-e          नखाए nʌ-kʰa-e
Perfective                खाए kʰa-e          नखाए nʌ-kʰa-e
Prospective               खाने kʰa-ne         नखाने nʌ-kʰa-ne

b. Suffixation



Negativization by suffixation takes place in the tenses (past and non-past) and in the past habitual and inferential aspects, as shown in Table 4.25 for the verb खा kʰa- 'eat'. The negative marker -न -nʌ 'NEG' always follows the tense marker and precedes the agreement markers.5

Table 4.25: Negation by the suffixation of negative marker -न -nʌ

Grammatical category      Positive              Negative
Non-past tense            खान्छ kʰa-ntsʰʌ       खाँदैन kʰã-dʌinʌ
Past tense                खायो kʰa-jo           खाएन kʰa-enʌ
Past habitual aspect      खान्थ्यो kʰa-ntʰjo      खाँदैनथ्यो kʰã-dʌinʌ-tʰjo
Inferential aspect        खाएछ kʰa-etsʰʌ        खाएनछ kʰa-e-nʌ-tsʰʌ
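The split between prefixal and suffixal negation can be illustrated with a small lookup. This is a minimal sketch under stated assumptions, not the transducer built in this study: the names `PREFIX_CATEGORIES`, `SUFFIX_NEGATIVES` and `negate` are hypothetical, and the suffixal negatives are simply copied from Table 4.25 rather than derived by rule.

```python
# Categories that take the prefix nʌ- (Table 4.24).
PREFIX_CATEGORIES = {
    "potential", "optative", "imperative", "perfect", "imperfect",
    "absolutive", "conjunctive", "infinitive", "purposive",
    "conditional", "perfective", "prospective",
}

# Suffixal negatives of kʰa- 'eat', copied from Table 4.25.
SUFFIX_NEGATIVES = {
    "kʰa-ntsʰʌ": "kʰã-dʌinʌ",        # non-past tense
    "kʰa-jo": "kʰa-enʌ",             # past tense
    "kʰa-ntʰjo": "kʰã-dʌinʌ-tʰjo",   # past habitual aspect
    "kʰa-etsʰʌ": "kʰa-e-nʌ-tsʰʌ",    # inferential aspect
}


def negate(form: str, category: str) -> str:
    """Negate a positive form of kʰa- given its grammatical category."""
    if category in PREFIX_CATEGORIES:
        return "nʌ-" + form          # the prefix is invariant
    return SUFFIX_NEGATIVES[form]    # suffixal forms are listed, not computed


print(negate("kʰa-nu", "infinitive"))
print(negate("kʰa-jo", "past"))
```

Listing the suffixal forms avoids committing to an analysis of the element दै dʌi, whose status the text leaves open.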



4.3 Stem formation

As discussed in (4.2.1), causativization is very productive in Nepali verbs at the morphological level. Causative stems are formed from both intransitive and transitive verb stems. Thus, with respect to causativization, stems can be divided into two types: base verb stems and causative stems. However, there are some verb stems from which causative stems cannot be formed, owing to either phonological or semantic constraints. Passivization, as discussed in (4.2.2), is also a very productive phenomenon in Nepali morphology: almost all verb stems, whether intransitive or transitive, can be passivized. Moreover, the causative stems formed from non-causative stems can themselves be passivized, so causative-passive stems are also possible. Therefore, it can be generalized that a verb can have at least four different forms, as shown in Table 4.26.

5 In the non-past tense and the past habitual aspect, the negative marker is preceded by दै dʌi; its status is yet to be determined.



Table 4.26: Pattern of the stem formation

Category                       Form      Example 'write'
Basic verb stem                V         लेख् lekʰ
Passive verb stem              V-i       लेखि lekʰ-i
Causative verb stem            V-a       लेखा lekʰ-a
Causative passive verb stem    V-a-i     लेखाइ lekʰ-a-i



4.4 Grouping of verb stems in Nepali

The characteristic features of Nepali verbs discussed in (4.1) and (4.2) are taken as the basis for grouping Nepali verbs into various classes, so that the syntax of morphemes can be described and implemented to create the finite state network. At the same time, the classification of verb stems helps in branching the sub-lexicons to their respective inflectional paradigms. The phonological rules that are identified can also be implemented systematically.

4.4.1 Intransitive verb stems

a. Verb stem Type1a

a-ending polysyllabic verbs that have only two forms, a base stem and a passive stem, are grouped in this class. Some such verb stems are listed with both forms in Table 4.27, together with their corresponding morphological tags and the gloss of the base stem.

Table 4.27: Type1a verb stems

Base form           Tags     Passive form           Tags          Gloss of base
अघा ʌgʰa           +VERB    अघाइ ʌgʰa-i           +VERB+PASS    'to be satisfied'
करा kʌra           +VERB    कराइ kʌra-i           +VERB+PASS    'to shout'
निदा nida           +VERB    निदाइ nida-i           +VERB+PASS    'to sleep'
बहुला bʌhula        +VERB    बहुलाइ bʌhula-i        +VERB+PASS    'to be mad'
मुस्कुरा muskura      +VERB    मुस्कुराइ muskura-i      +VERB+PASS    'to smile'
लजा lʌdza          +VERB    लजाइ lʌdza-i          +VERB+PASS    'to be shy'
टुसा t̺usa           +VERB    टुसाइ t̺usa-i           +VERB+PASS    'to sprout'



The finite state transducer illustrated in Figure 4.1 encodes the verb stems listed in Table 4.27 and it is capable of analyzing and generating verb stems listed in Table 4.27.
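What "analyzing and generating" means for this class can be sketched as a pair of lookup tables relating a lexical string (stem plus tags) to a surface string. This is only an illustration in IPA under assumed names (`TYPE1A_STEMS`, `GENERATE`, `ANALYZE`); a real finite state transducer encodes the same relation as a network rather than as dictionaries.

```python
# A few Type1a stems from Table 4.27, in IPA.
TYPE1A_STEMS = ["ʌgʰa", "kʌra", "nida", "bʌhula", "muskura", "lʌdza"]


def build_pairs(stems):
    """Map each lexical string (stem + tags) to its surface form."""
    pairs = {}
    for stem in stems:
        pairs[stem + "+VERB"] = stem              # base stem
        pairs[stem + "+VERB+PASS"] = stem + "i"   # passive stem in -i
    return pairs


GENERATE = build_pairs(TYPE1A_STEMS)                  # lexical -> surface
ANALYZE = {surface: lexical for lexical, surface in GENERATE.items()}

print(GENERATE["nida+VERB+PASS"])   # generation
print(ANALYZE["muskurai"])          # analysis
```

Because the relation is stored once and read in both directions, analysis and generation stay consistent by construction, which is the property the transducer provides.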



Figure 4.1: A finite state transducer for Type1a verb stems

b. Verb stem Type1b

i-ending polysyllabic verbs that have four forms (base stem, passive stem, causative stem and causative-passive stem) are grouped in this class. Some such verbs are listed with all their forms in Table 4.28, together with their corresponding morphological tags and the gloss of the base stem.

Table 4.28: Type1b verb stems

+VERB            +VERB+PASS         +VERB+CAUSE         +VERB+CAUSE+PASS       Gloss of base
चोखि tsokʰi      चोखिइ tsokʰi-i     चोख्या tsokʰj-a      चोख्याइ tsokʰj-a-i      'to be pure'
गुम्सि gumsi      गुम्सिइ gumsi-i     गुम्स्या gumsj-a      गुम्स्याइ gumsj-a-i      'to be suffocated'
घोप्टि gʰopti     घोप्टिइ gʰopti-i    घोप्ट्या gʰoptj-a     घोप्ट्याइ gʰoptj-a-i     'to be overturned'
टुक्रि t̺ukri      टुक्रिइ t̺ukri-i     टुक्र्या t̺ukrj-a      टुक्र्याइ t̺ukrj-a-i      'to be broken into pieces'



The finite state transducer illustrated in Figure 4.2 encodes the verb stems listed in Table 4.28 and it is capable of analyzing and generating them.



Figure 4.2: A finite state transducer for Type1b verb stems



The rules listed in PR 4.1 are compiled into a finite state transducer and composed with the finite state transducer illustrated in Figure 4.2.



Phonological rule PR 4.1

i. The stem-final vowel ◌ि i of the i-ending intransitive verbs is changed at the surface level to य j before the causative marker आ a.
   Regular expression: ◌ि -> ◌् य || __ आ ;

ii. The independent vowel आ a changes to its corresponding dependent vowel ◌ा a after य j.
   Regular expression: आ -> ◌ा || य __ ;
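The effect of PR 4.1 on a Devanagari string can be checked with ordinary regular expressions. The sketch below is an approximation in Python's `re` module, not the xfst rule compiler used in this study; the function name `apply_pr_4_1` is hypothetical, and code points are written as escapes so that the character order is unambiguous.

```python
import re

I_MATRA = "\u093f"    # dependent vowel sign i
AA_MATRA = "\u093e"   # dependent vowel sign aa
AA = "\u0906"         # independent vowel aa (causative marker)
HALANTA = "\u094d"    # virama
YA = "\u092f"         # consonant ya


def apply_pr_4_1(lexical: str) -> str:
    """Apply rules i and ii of PR 4.1 in order."""
    # Rule i: stem-final i-matra -> halanta + ya before the causative marker.
    s = re.sub(I_MATRA + "(?=" + AA + ")", HALANTA + YA, lexical)
    # Rule ii: independent aa -> dependent aa after ya.
    s = re.sub("(?<=" + YA + ")" + AA, AA_MATRA, s)
    return s


stem = "\u091a\u094b\u0916\u093f"    # tsokʰi 'to be pure'
print(apply_pr_4_1(stem + AA))       # causative stem tsokʰj-a
```

The two substitutions are applied in sequence, which mirrors composing the two rule transducers in the order given.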



c. Verb stem Type1c



i-ending polysyllabic verbs that have four forms (base stem, passive stem, causative stem and causative-passive stem) are grouped in this class. In this group of verbs, the causative marker -a is inserted between the consonants of the consonant cluster when the causative stem is formed, and the final vowel इ i is dropped. Some examples are listed in Table 4.29 with their corresponding morphological tags.

Table 4.29: Type1c verb stems

+VERB            +VERB+PASS          +VERB+CAUSE       +VERB+CAUSE+PASS     Gloss of base
उक्लि ukli        उक्लिइ ukli-i        उकाल् ukal        उकालि ukal-i         'step up'
उघ्रि ugʰri       उघ्रिइ ugʰri-i       उघार् ugʰar       उघारि ugʰar-i        'be opened'
उफ्रि upʰri       उफ्रिइ upʰri-i       उफार् upʰar       उफारि upʰar-i        'jump'
घस्रि gʰʌsri      घस्रिइ gʰʌsri-i      घसार् gʰʌsar      घसारि gʰʌsar-i       'crawl'
थुप्रि tʰupri      थुप्रिइ tʰupri-i      थुपार् tʰupar      थुपारि tʰupar-i       'be piled up'
निख्रि nikʰri     निख्रिइ nikʰri-i     निखार् nikʰar     निखारि nikʰar-i      'be empty'
पग्लि pʌgli       पग्लिइ pʌgli-i       पगाल् pʌgal       पगालि pʌgal-i        'melt'
सप्रि sʌpri       सप्रिइ sʌpri-i       सपार् sʌpar       सपारि sʌpar-i        'grow well'
सुध्रि sudʰri      सुध्रिइ sudʰri-i      सुधार् sudʰar      सुधारि sudʰar-i       'improve'



The verb stems listed in Table 4.29 are compiled into a finite state transducer as illustrated in Figure 4.3 and it is capable of analyzing and generating them.






Figure 4.3: A finite state transducer for Type1c verb stems



The phonological rules in PR 4.2 are compiled into a finite state transducer and composed with the finite state transducer demonstrated in Figure 4.3.



Phonological rule PR 4.2

i. The causative marker आ a is inserted between the consonants of the cluster of the i-ending intransitive verbs at the surface level.
   Regular expression: [. .] -> ◌ा || cons __ cons ;

ii. The stem-final vowel ◌ि i of the i-ending intransitive verbs is deleted in the causative stem.
   Regular expression: ◌ि -> [ ] || __ ;

iii. The independent vowel आ a is changed to its corresponding dependent vowel ◌ा a.
   Regular expression: आ -> ◌ा || __ .#. ;



d. Verb stem Type1d

Monosyllabic verb stems with CaC structure that have four forms (base stem, passive stem, causative stem and causative-passive stem) are grouped in this class. The vowel ◌ा a of the verb stem changes to the vowel अ ʌ when the causative marker आ a follows the stem. Some examples are listed in Table 4.30 with their corresponding morphological tags.

Table 4.30: Type1d verb stems

+VERB          +VERB+PASS       +VERB+CAUSE      +VERB+CAUSE+PASS     Gloss of base
काँप् kãp       काँपि kãp-i       कँपा kʌ̃p-a       कँपाइ kʌ̃p-a-i         'shiver'
हाँस् ɦãs       हाँसि ɦãs-i       हँसा ɦʌ̃s-a       हँसाइ ɦʌ̃s-a-i         'laugh'



The verb stems listed in Table 4.30 are compiled into a finite state transducer as illustrated in Figure 4.4 and it is capable of analyzing and generating them.



Figure 4.4: A finite state transducer for Type1d verb stems

The phonological rules involved in this process are listed in PR 4.3; they are compiled and composed with the transducer illustrated in Figure 4.4.

Phonological rule PR 4.3

i. The vowel ◌ा a of verb stems with CaC structure is changed to the vowel अ ʌ at the surface level if the stem is followed by the causative marker आ a.
   Regular expression: आ -> [ ] || cons __ cons ;

ii. The independent vowel आ a changes to its corresponding dependent vowel ◌ा a.
   Regular expression: आ -> ◌ा || __ .#. ;






e. Verb stem Type1e

Monosyllabic consonant-ending intransitive verb stems that have four forms (base stem, passive stem, causative stem and causative-passive stem) are grouped in this class. Some examples are listed in Table 4.31 and Table 4.32 with their corresponding morphological tags.

Table 4.31: Type1e verb stems (i)

+VERB           +VERB+PASS       +VERB+CAUSE       +VERB+CAUSE+PASS      Gloss of base
बस् bʌs         बसि bʌs-i        बसा bʌs-a         बसाइ bʌs-a-i          'sit'
खस् kʰʌs        खसि kʰʌs-i       खसा kʰʌs-a        खसाइ kʰʌs-a-i         'drop'
छिन् tsʰin      छिनि tsʰin-i     छिना tsʰin-a      छिनाइ tsʰin-a-i       'chop off'

Table 4.32: Type1e verb stems (ii)

+VERB           +VERB+PASS        +VERB+CAUSE        +VERB+CAUSE+PASS      Gloss of base
मर् mʌr         मरि mʌr-i         मरा mʌr-a          मराइ mʌr-a-i          'to die'
गल् gʌl         गलि gʌl-i         गला gʌl-a          गलाइ gʌl-a-i          'to melt'
चल् tsʌl        चलि tsʌl-i        चला tsʌl-a         चलाइ tsʌl-a-i         'to move'
जँच् dzʌ̃ts      जँचि dzʌ̃ts-i      जँचा dzʌ̃ts-a       जँचाइ dzʌ̃ts-a-i       'to examine'
झर् dzʰʌr       झरि dzʰʌr-i       झरा dzʰʌr-a        झराइ dzʰʌr-a-i        'to drop'
टर् t̺ʌr         टरि t̺ʌr-i         टरा t̺ʌr-a          टराइ t̺ʌr-a-i          'to escape artfully'
सर् sʌr         सरि sʌr-i         सरा sʌr-a          सराइ sʌr-a-i          'to move aside'



The verb stems listed in Table 4.31 and Table 4.32 are compiled into a finite state transducer as demonstrated in Figure 4.5 and it is capable of analyzing and generating them.






Figure 4.5: A finite state transducer for Type1e verb stems



The phonological rules in PR 4.4 are compiled and composed with the transducer illustrated in Figure 4.5.

Phonological rule PR 4.4

i. The halanta ◌् at the end of the consonant-ending verb stem is deleted at the surface level before the causative marker आ a and the passive marker इ i.
   Regular expression: ◌् -> [ ] || __ इ|आ ;

ii. The independent vowels आ a and इ i change to their corresponding dependent vowels ◌ा a and ◌ि i, respectively.
   Regular expressions: आ -> ◌ा || __ .#. ; इ -> ◌ि || __ .#. ;
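The two rules of PR 4.4 can be tried out on a stem followed by a single marker. The sketch below is an approximation using Python's `re` module rather than the xfst rule compiler, and the name `apply_pr_4_4` is hypothetical. It covers only the passive and causative stems; the causative-passive stem, where the passive marker follows a vowel, is not handled by these two rules as written and is left out.

```python
import re

HALANTA = "\u094d"    # virama
AA = "\u0906"         # independent vowel aa (causative marker)
I = "\u0907"          # independent vowel i (passive marker)
AA_MATRA = "\u093e"   # dependent vowel sign aa
I_MATRA = "\u093f"    # dependent vowel sign i


def apply_pr_4_4(lexical: str) -> str:
    """Apply rules i and ii of PR 4.4 to stem + one marker."""
    # Rule i: delete the stem-final halanta before aa or i.
    s = re.sub(HALANTA + "(?=[" + I + AA + "])", "", lexical)
    # Rule ii: word-final independent vowels become dependent vowel signs.
    s = re.sub(AA + "$", AA_MATRA, s)
    s = re.sub(I + "$", I_MATRA, s)
    return s


stem = "\u092c\u0938\u094d"          # bʌs 'sit'
print(apply_pr_4_4(stem + AA))       # causative stem bʌs-a
print(apply_pr_4_4(stem + I))        # passive stem bʌs-i
```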



4.4.2 Transitive verb stem

a. Verb stem Type2a

Polysyllabic verbs that contain the vowel आ a within the stem and have only two forms, a base stem and a passive stem, are grouped in this class. Some examples of such verbs are listed in Table 4.33 and Table 4.34 with their corresponding morphological tags and the gloss of the base stem.

Table 4.33: Type2a verb stems (i)

Base form           Tags     Passive form           Tags          Gloss of stem
उचाल् utsal         +VERB    उचालि utsal-i          +VERB+PASS    'to lift'
अर्जाप् ʌrdzap       +VERB    अर्जापि ʌrdzap-i        +VERB+PASS    'to sharpen'
झपार् dzʰʌpar       +VERB    झपारि dzʰʌpar-i        +VERB+PASS    'to scold'
पछार् pʌtsʰar       +VERB    पछारि pʌtsʰar-i        +VERB+PASS    'to make upside down'
सराप् sʌrap         +VERB    सरापि sʌrap-i          +VERB+PASS    'to curse'

Table 4.34: Type2a verb stems (ii)

Base form        Tags     Passive form        Tags          Gloss of stem
मार् mar         +VERB    मारि mar-i          +VERB+PASS    'to kill'
गाल् gal         +VERB    गालि gal-i          +VERB+PASS    'to melt'
चाल् tsal        +VERB    चालि tsal-i         +VERB+PASS    'to make move'
जाँच् dzãts       +VERB    जाँचि dzãts-i        +VERB+PASS    'to examine'
झार् dzʰar       +VERB    झारि dzʰar-i        +VERB+PASS    'to drop'
टार् t̺ar         +VERB    टारि t̺ar-i          +VERB+PASS    'to cause to pass over'
सार् sar         +VERB    सारि sar-i          +VERB+PASS    'to shift'



The verb stems listed in Table 4.33 and Table 4.34 are compiled into a finite state transducer as demonstrated in Figure 4.6 and it is capable of analyzing and generating them.



Figure 4.6: A finite state transducer for Type2a verb stems

The phonological rules involved in this process are listed in PR 4.5; they are compiled and composed with the finite state transducer illustrated in Figure 4.6.

Phonological rule PR 4.5

i. The halanta ◌् at the end of the consonant-ending verb stem is deleted at the surface level before the causative marker आ a and the passive marker इ i.
   Regular expression: ◌् -> [ ] || __ इ|आ ;

ii. The independent vowels आ a and इ i are changed to their corresponding dependent vowels ◌ा and ◌ि, respectively.
   Regular expressions: आ -> ◌ा || __ .#. ; इ -> ◌ि || __ .#. ;



b. Verb stem Type2b



i-ending polysyllabic basic verb stems that have four forms (base stem, passive stem, causative stem and causative-passive stem) are grouped in this class. Some examples are listed in Table 4.35 with their corresponding morphological tags.

Table 4.35: Type2b verb stems

+VERB              +VERB+PASS            +VERB+CAUSE         +VERB+CAUSE+PASS       Gloss of base
पक्रि pʌkri         पक्रिइ pʌkri-i         पक्रा pʌkr-a         पक्राइ pʌkr-a-i         'arrest'
पर्खि pʌrkʰi        पर्खिइ pʌrkʰi-i        पर्खा pʌrkʰ-a        पर्खाइ pʌrkʰ-a-i        'wait'
बिर्सि birsi         बिर्सिइ birsi-i         बिर्सा birs-a         बिर्साइ birs-a-i         'forget'
मन्सि mʌnsi        मन्सिइ mʌnsi-i        मन्सा mʌns-a        मन्साइ mʌns-a-i        'throw away'
सम्झि sʌmdzʰi      सम्झिइ sʌmdzʰi-i      सम्झा sʌmdzʰ-a      सम्झाइ sʌmdzʰ-a-i      'remember'
कुल्चि kultsi       कुल्चिइ kultsi-i       कुल्चा kults-a       कुल्चाइ kults-a-i       'tread'
उइँटि uĩt̺i          उइँटिइ uĩt̺i-i          उइँटा uĩt̺-a          उइँटाइ uĩt̺-a-i          'spindle'



The verb stems listed in Table 4.35 are compiled into a finite state transducer as demonstrated in Figure 4.7 and it is capable of analyzing and generating them.



Figure 4.7: A finite state transducer for Type2b verb stems



The finite state transducer in Figure 4.7 is composed with the network of phonological rules listed in PR 4.6.



Phonological rules PR 4.6

i. The vowel ◌ि i of the i-ending transitive verbs is deleted at the surface level before the causative marker आ a.
   Regular expression: ◌ि -> [ ] || __ आ ;

ii. The independent vowel आ a is changed to its corresponding dependent vowel ◌ा.
   Regular expression: आ -> ◌ा || __ .#. ;



c. Verb stem Type2c

Monosyllabic verb stems with CaC structure that have four forms (base stem, passive stem, causative stem and causative-passive stem) are grouped in this class. The vowel आ a of the basic verb stem changes to अ ʌ when the causative marker आ a follows the base stem. Some examples are listed in Table 4.36 with their corresponding morphological tags.

Table 4.36: Type2c verb stems

+VERB           +VERB+PASS        +VERB+CAUSE        +VERB+CAUSE+PASS      Gloss of base
खाप् kʰap       खापि kʰap-i       खपा kʰʌp-a         खपाइ kʰʌp-a-i         'pile up'
गाड् gad̺        गाडि gad̺-i        गडा gʌd̺-a          गडाइ gʌd̺-a-i          'bury'
छाप् tsʰap      छापि tsʰap-i      छपा tsʰʌp-a        छपाइ tsʰʌp-a-i        'print'
टाँस् t̺ãs        टाँसि t̺ãs-i        टँसा t̺ʌ̃s-a          टँसाइ t̺ʌ̃s-a-i          'stick up'
तान् tan        तानि tan-i        तना tʌn-a          तनाइ tʌn-a-i          'pull'
नाच् nats       नाचि nats-i       नचा nʌts-a         नचाइ nʌts-a-i         'dance'
बाँच् bãts       बाँचि bãts-i       बँचा bʌ̃ts-a         बँचाइ bʌ̃ts-a-i         'survive'
हाँक् ɦãk        हाँकि ɦãk-i        हँका ɦʌ̃k-a          हँकाइ ɦʌ̃k-a-i          'drive'



The verb stems listed in Table 4.36 are compiled into a finite state transducer as illustrated in Figure 4.8 and it is capable of analyzing and generating them.



Figure 4.8: A finite state transducer for Type2c verb stems






The phonological rules involved in this process are listed in PR 4.7 which are compiled and composed with the transducer illustrated in Figure 4.8.



Phonological rule PR 4.7

i. The vowel ◌ा a of verb stems with CaC structure is changed to the vowel अ ʌ at the surface level if the stem is followed by the causative marker आ a.
   Regular expression: आ -> [ ] || cons __ cons ;

ii. The independent vowel आ a is changed to its corresponding dependent vowel ◌ा a.
   Regular expression: आ -> ◌ा || __ .#. ;

d. Verb stem Type2d

Monosyllabic consonant-ending transitive basic verb stems that have four forms (base stem, passive stem, causative stem and causative-passive stem) are grouped in this class. Some examples are listed in Table 4.37 with their corresponding morphological tags.

Table 4.37: Type2d verb stems

+VERB           +VERB+PASS        +VERB+CAUSE        +VERB+CAUSE+PASS      Gloss of base
पढ् pʌd̺ʰ        पढि pʌd̺ʰ-i        पढा pʌd̺ʰ-a         पढाइ pʌd̺ʰ-a-i         'read'
किन् kin        किनि kin-i        किना kin-a         किनाइ kin-a-i         'buy'
जोत् dzot       जोति dzot-i       जोता dzot-a        जोताइ dzot-a-i        'plough'
घस् gʰʌs        घसि gʰʌs-i        घसा gʰʌs-a         घसाइ gʰʌs-a-i         'massage'



The finite state transducer illustrated in Figure 4.9 encodes the verb stems listed in Table 4.37 and it is capable of analyzing and generating them.






Figure 4.9: A finite state transducer for Type2d verb stems

The phonological rules listed in PR 4.8 are compiled and composed with the transducer illustrated in Figure 4.9.

Phonological rule PR 4.8

i. The halanta ◌् at the end of the consonant-ending verb stem is deleted at the surface level before the causative marker आ a and the passive marker इ i.
   Regular expression: ◌् -> [ ] || __ इ|आ ;

ii. The independent vowels आ a and इ i are changed to their corresponding dependent vowels ◌ा a and ◌ि i, respectively.
   Regular expressions: आ -> ◌ा || __ .#. ; इ -> ◌ि || __ .#. ;



4.4.3 Irregular verb (intransitive and transitive) stems

Some intransitive and transitive verb stems are not regular in their stem formation process. Some verbs of this type are listed in Table 4.38; a hyphen marks a form that is not listed.

Table 4.38: Irregular verb stems

+VERB            +VERB+PASS       +VERB+CAUSE          +VERB+CAUSE+PASS       Gloss of base
आ a              आइ a-i           -                    -                      'come'
जा dza           जाइ dza-i        -                    -                      'go'
रो ro            रोइ ro-i         रुवा ru-wa            रुवाइ ru-wa-i           'cry'
हो ho            होइ ho-i         -                    -                      'be'
खा kʰa           खाइ kʰa-i        ख्वा kʰw-a            ख्वाइ kʰw-a-i           'eat'
पा pa            पाइ pa-i         -                    -                      'get'
दि di            दिइ di-i         दिला di-la            दिलाइ di-la-i           'give'
लि li            लिइ li-i         ल्या l-ja             ल्याइ l-ja-i            'take'
धो dʰo           धोइ dʰo-i        धुला dʰu-la           धुलाइ dʰu-la-i          'wash'
बस् bʌs-         -                बसाल् bʌs-al-         बसालि bʌs-al-i          'sit'
खस् kʰʌs-        -                खसाल् kʰʌsal-         खसालि kʰʌs-al-i         'drop'
चुँडि tsũd̺i-      -                चुँडाल् tsũd̺al-        चुँडालि tsũd̺-al-i        'snatch'
छिन् tsʰin-      -                छिनाल् tsʰinal-       छिनालि tsʰin-al-i       'chop off'



The finite state transducer for irregular verb stems cannot be generalized as in the earlier cases. Therefore, their network is not illustrated; instead, these stems are encoded and implemented directly.



4.4.4 Suppletive verb stems

There are two pairs of suppletive verb stems: हु- ɦu- 'become' vs. भ- bʰʌ- 'became', and जा- dza- 'go' vs. ग- gʌ- 'went'. The first and second members of each suppletive pair follow different tracks in the inflectional paradigm, as illustrated in Table 4.39.



Table 4.39: Suppletive verb stems

Stem                                   Grammatical categories
हु ɦu- 'become', जा dza- 'go'          Non-past tense, Imperfect aspect, Habitual aspect, Imperative, Optative, Potential, Infinitive, Purposive, Prospective, Durative
भ bʰʌ- 'became', ग gʌ- 'went'         Past tense, Perfect aspect, Inferential aspect, Optative, Absolutive, Conditional, Perfective, Conjunctive



The phonological rules involved in altering the suppletive forms are listed in PR 4.9.



Phonological rule PR 4.9

i. The verb stem हु ɦu changes to भ bʰʌ if the following suffix begins with ए e, इ i, ई iː or य jʌ.
   Regular expression: हु -> भ || __ ए|इ|ई|य ;

ii. The verb stem जा dza changes to ग gʌ if the following suffix begins with ए e, इ i, ई iː or य jʌ.
   Regular expression: जा -> ग || __ ए|इ|ई|य ;
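The stem alternation in PR 4.9 can be sketched with lookahead patterns. This is an approximation in Python's `re` module, not the xfst rules themselves; the name `apply_pr_4_9` is hypothetical, and anchoring the stem at the start of the word is an added assumption that the regular expressions above do not state.

```python
import re

TRIGGERS = "\u090f\u0907\u0908\u092f"   # e, i, ii (independent vowels) and ya
HU = "\u0939\u0941"                     # stem ɦu 'become'
BHA = "\u092d"                          # stem bʰʌ 'became'
JA = "\u091c\u093e"                     # stem dza 'go'
GA = "\u0917"                           # stem gʌ 'went'


def apply_pr_4_9(lexical: str) -> str:
    """Replace a suppletive stem when the suffix starts with a trigger."""
    s = re.sub("^" + HU + "(?=[" + TRIGGERS + "])", BHA, lexical)
    s = re.sub("^" + JA + "(?=[" + TRIGGERS + "])", GA, s)
    return s


print(apply_pr_4_9(JA + "\u090f"))          # dza + e  -> gʌ-e
print(apply_pr_4_9(JA + "\u0928\u0941"))    # dza + nu -> unchanged
```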






4.5 Verbal inflections

In Nepali, verbal inflections are suffixes attached to verb stems. A verb stem can be a base form, a causative form or a passive form. The inflectional suffixes in general encode the inherent verbal features such as tense, aspect and mood. Besides these inherent features, the inflectional suffixes also encode agreement features such as person, number, gender and honorificity with reference to the subject of the sentence. Inherent features and agreement features cannot be clearly separated into distinct symbols; rather, they are represented jointly by a set of symbols. These inflectional suffixes therefore form a paradigm with respect to the features mentioned above. The suffixal negative marker is intermixed with these inflectional suffixes in some forms, which leads to the formation of both positive and negative paradigms of verbal inflection. The second type of negative marker is a prefix that appears in front of the verb stem (see 4.2.3). The auxiliary verbs in Nepali are more or less equivalent to the inflections in encoding verbal features; they are therefore discussed in this section.



4.5.1 Auxiliary verbs in Nepali
Nepali has two kinds of auxiliaries, namely, the non-past existential auxiliary छ tsʰʌ and the non-past identificational auxiliary हो ɦo. The past auxiliary थि- tʰi-, however, is a different stem which inflects like main verbs (Dahal 1974 and Adhikari 2062VS). The existential and identificational auxiliary verbs for both non-past and past tenses (i and ii) are discussed below with their inflections.1

                       Non-past form   Past form
i. Existential         छ               थि-
ii. Identificational   हो              --



1 See Sharma (1980) for a detailed description of auxiliary verbs in Nepali.

a. The non-past existential auxiliary verb छ tsʰʌ 'be' inflects for person, number, gender and honorific agreement, as in छिन् tsʰin 'be.NPST.3SG.FEM.HON' in (14). In the default case, the auxiliary verb form छ tsʰʌ 'be' itself represents the third person singular masculine and carries non-past tense; in the other cases, agreement inflections follow it. Altogether, this existential verb छ tsʰʌ 'be' has twelve forms, and the inflections are listed in Table 4.39 with their corresponding morphological tags.

(14) दिदी घरमा छिन्।
didi gʰʌr-ma tsʰin
elder sister home-LOC be.NPST.3SG.FEM.HON
'The elder sister is at home.'

Table 4.39: Inflections for non-past existential verb छ tsʰʌ 'be' (affirmative)
Grammatical category                   Inflections   IPA    Tags
First person singular                                u      NPST.1SG
First person plural                                  ʌũ     NPST.1PL
Second person masculine singular       स्            s      NPST.2SG.MASC
Second person feminine singular        एस्           es     NPST.2SG.FEM
Second person masculine singular hon                 ʌu     NPST.2SG.MASC.HON
Second person feminine singular hon    यौ            jʌu    NPST.2SG.FEM.HON
Second person plural                                 ʌu     NPST.2PL
Third person singular masculine        φ             φ      NPST.3SG.MASC
Third person feminine singular                       e      NPST.3SG.FEM
Third person masculine singular hon    न्            ʌn     NPST.3SG.MASC.HON
Third person feminine singular hon     इन्           in     NPST.3SG.FEM.HON
Third person plural                    न्            ʌn     NPST.3PL



The finite state transducer in Figure 4.10 encodes the auxiliary verb छ tsʰʌ 'be' and its various forms, and it can analyze and generate them. The rules involved in this case are directly encoded into the finite state transducer.

Figure 4.10: A finite state transducer for inflections of non-past existential verb छ tsʰʌ 'be' (affirmative)



b. In the negative formation of the existential verb छ tsʰʌ, the negative suffix -न -nʌ 'NEG' is inserted between the auxiliary stem छ tsʰʌ 'be' and the agreement inflections. During negation, some morphophonemic changes occur, as in छैनन् tsʰʌinʌn 'be-NEG' in (15). There are eight negative forms, with no distinction in gender. Table 4.40 lists the inflections with their corresponding morphological tags.

(15) दिदी घरमा छैनन्।
didi gʰʌr-ma tsʰʌinʌn
elder sister home-LOC be.NPST.NEG.3SG.FEM.HON
'The elder sister is not at home.'

Table 4.40: Inflections for non-past existential verb छ tsʰʌ 'be' (negative)
Grammatical category         Inflections   IPA     Tags
First person singular        इनँ           inʌ̃     NPST.NEG.1SG
First person plural          इनौँ          inʌũ    NPST.NEG.1PL
Second person singular       इनस्          inʌs    NPST.NEG.2SG
Second person singular hon   इनौ           inʌu    NPST.NEG.2SG.HON
Second person plural         इनौ           inʌu    NPST.NEG.2PL
Third person singular        इन            inʌ     NPST.NEG.3SG
Third person singular hon    इनन्          inʌn    NPST.NEG.3SG.HON
Third person plural          इनन्          inʌn    NPST.NEG.3PL



The finite state transducer in Figure 4.11 encodes both inflections and rules involved. It can analyze and generate the negative forms of the auxiliary verb छ tsʰʌ 'be'.



Figure 4.11: A finite state transducer for inflections of non-past existential verb छ tsʰʌ 'be' (negative)



c. The non-past identificational auxiliary verb हो ɦo 'be' takes similar agreement inflections, but this verb has no gender distinction; that is, there are no feminine forms. The verb itself represents the third person singular form. Some morphophonemic changes occur when inflections combine with हो ɦo 'be', as in हौ ɦʌu 'be.NPST.2SG.HON' in (16). There are altogether eight forms of this verb, and the inflections are listed in Table 4.41 with their corresponding morphological tags.

(16) तिमी शिक्षक हौ।
timi siktsʰʌk ɦʌu
2SG.HON teacher be.NPST.2SG.HON
'You are a teacher.'

Table 4.41: Inflections for non-past identificational verb हो ɦo 'be' (affirmative)
Grammatical category         Inflections   IPA    Tags
First person singular        उँ            ũ      NPST.1SG
First person plural          औँ            ʌũ     NPST.1PL
Second person singular       स्            s      NPST.2SG
Second person singular hon   औ            ʌu     NPST.2SG.HON
Second person plural         औ            ʌu     NPST.2PL
Third person singular        φ             φ      NPST.3SG
Third person singular hon    न्            n      NPST.3SG.HON
Third person plural          न्            n      NPST.3PL



The auxiliary verb हो ɦo ‘be’ and its various forms are compiled into a finite state transducer as demonstrated in Figure 4.12 and it is capable of analyzing and generating them.



Figure 4.12: A finite state transducer for inflections of non-past identificational verb हो ɦo 'be' (affirmative)

The phonological rules involved in this process are listed in PR 4.10, and they have been directly encoded into the finite state transducer illustrated in Figure 4.12.

Phonological rules PR 4.10
i. Independent vowels उ u and औ ʌu are changed to their corresponding dependent vowels ◌ु u and ◌ौ ʌu, respectively, after हो.
Regular expressions:
उ -> ◌ु || हो __;
औ -> ◌ौ || हो __;
◌ो -> ◌ु || ह __ न् .#.;
◌ो -> [ ] || ह __ ◌ु|◌ौ;
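To make the interaction of the four regular expressions in PR 4.10 concrete, the sketch below applies them in the listed order to an underlying string made of हो plus an agreement suffix. It is an illustration under stated assumptions: ordered application stands in for rule composition, `apply_pr_4_10` is a hypothetical helper name, and the end-of-word boundary `.#.` is modelled as the end of the string.

```python
import re

HA, O_SIGN, U_SIGN, AU_SIGN = "\u0939", "\u094b", "\u0941", "\u094c"  # ह, ो, ु, ौ
U_IND, AU_IND = "\u0909", "\u0914"                                      # उ, औ
NA_HALANT = "\u0928\u094d"                                              # न्
HO = HA + O_SIGN                                                        # हो

# The four rewrites of PR 4.10, applied one after another.
ORDERED_RULES = [
    (re.compile(rf"(?<={HO}){U_IND}"), U_SIGN),                    # उ -> ु after हो
    (re.compile(rf"(?<={HO}){AU_IND}"), AU_SIGN),                  # औ -> ौ after हो
    (re.compile(rf"(?<={HA}){O_SIGN}(?={NA_HALANT}\Z)"), U_SIGN),  # ो -> ु before final न्
    (re.compile(rf"(?<={HA}){O_SIGN}(?=[{U_SIGN}{AU_SIGN}])"), ""),  # ो deleted before ु or ौ
]


def apply_pr_4_10(underlying: str) -> str:
    """Rewrite an underlying stem+suffix string into its surface form."""
    form = underlying
    for pattern, replacement in ORDERED_RULES:
        form = pattern.sub(replacement, form)
    return form


if __name__ == "__main__":
    for suffix in (U_IND + "\u0901", AU_IND, NA_HALANT, "\u0938\u094d"):
        print(apply_pr_4_10(HO + suffix))
```

Under these assumptions, हो plus औ surfaces as हौ, which is the form cited in example (16), and हो plus स् is left unchanged.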



d. In the negative formation of the identificational auxiliary verb हो ɦo, the negative suffix न -nʌ (इन inʌ) is inserted between the auxiliary stem हो ɦo and the agreement inflections. In this process of negation, no changes occur in the stem itself, as in होइनौ ɦoinʌu in (17). There are eight negative forms parallel to the positive ones. They are listed in Table 4.42 with their corresponding morphological tags.

(17) तिमी शिक्षक होइनौ।
timi siktsʰʌk ɦoinʌu
2SG.HON teacher be.NPST.NEG.2SG.HON
'You are not a teacher.'

Table 4.42: Inflections for non-past identificational verb हो ɦo 'be' (negative)
Grammatical category         Inflections   IPA     Tags
First person singular        इनँ           inʌ̃     NPST.NEG.1SG
First person plural          इनौँ          inʌũ    NPST.NEG.1PL
Second person singular       इनस्          inʌs    NPST.NEG.2SG
Second person singular hon   इनौ           inʌu    NPST.NEG.2SG.HON
Second person plural         इनौ           inʌu    NPST.NEG.2PL
Third person singular        इन            inʌ     NPST.NEG.3SG
Third person singular hon    इनन्          inʌn    NPST.NEG.3SG.HON
Third person plural          इनन्          inʌn    NPST.NEG.3PL



The finite state transducer in Figure 4.13 is capable of analyzing and generating the negative forms of auxiliary verb हो ɦo ‘be’.
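Since Tables 4.40 and 4.42 list the same eight negative inflections, the negative forms of both non-past auxiliaries can be generated from one shared suffix table. The sketch below works on the IPA transcriptions only; the dictionary and function names are illustrative assumptions, and plain string concatenation is a simplification that ignores the Devanagari spelling changes.

```python
# Negative agreement inflections (IPA) shared by Tables 4.40 and 4.42.
NEG_INFLECTIONS = {
    "NPST.NEG.1SG": "inʌ̃",
    "NPST.NEG.1PL": "inʌũ",
    "NPST.NEG.2SG": "inʌs",
    "NPST.NEG.2SG.HON": "inʌu",
    "NPST.NEG.2PL": "inʌu",
    "NPST.NEG.3SG": "inʌ",
    "NPST.NEG.3SG.HON": "inʌn",
    "NPST.NEG.3PL": "inʌn",
}

# IPA stems of the two non-past auxiliaries.
AUX_STEMS = {"existential": "tsʰʌ", "identificational": "ɦo"}


def negative_form(aux: str, tag: str) -> str:
    """Return the IPA negative form of the given auxiliary for the given tag."""
    return AUX_STEMS[aux] + NEG_INFLECTIONS[tag]


if __name__ == "__main__":
    print(negative_form("existential", "NPST.NEG.3SG.HON"))
    print(negative_form("identificational", "NPST.NEG.2SG.HON"))
```

The two calls reproduce the IPA forms cited in examples (15) and (17), tsʰʌinʌn and ɦoinʌu.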






Figure 4.13: A finite state transducer for inflections of non-past identificational verb हो ɦo 'be' (negative)

a. The past existential auxiliary verb थि tʰi 'be.PST' inflects for person, number, gender and honorific agreement, as in थियौँ tʰijʌũ in (18). The auxiliary verb form थि tʰi itself carries past tense and the agreement inflections follow it. In this case also, the auxiliary verb stem does not change when suffixes are attached. Altogether, this existential verb has ten forms, and the inflections are listed in Table 4.43 with their corresponding morphological tags.

(18) हामी तनाबमा थियौँ।
ɦami tʌnab-ma tʰi-jʌũ
1PL tension-LOC be.PST.1PL
'We were in tension.'

Table 4.43: Inflections for past existential verb थि tʰi 'be' (affirmative)
Grammatical category                  Inflections   IPA    Tags
First person singular                 एँ            ẽ      PST.1SG
First person plural                   यौँ           jʌũ    PST.1PL
Second person singular                इस्           is     PST.2SG
Second person singular hon            यौ            jʌu    PST.2SG.HON
Second person plural                  यौ            jʌu    PST.2PL
Third person masculine singular       यो            jo     PST.3SG.MASC
Third person feminine singular        ई             iː     PST.3SG.FEM
Third person masculine singular hon   ए             e      PST.3SG.MASC.HON
Third person feminine singular hon    इन्           in     PST.3SG.FEM.HON
Third person plural                   ए             e      PST.3PL






The finite state transducer illustrated in Figure 4.14 can analyze and generate the positive forms of auxiliary verb थ tʰi 'be'.



Figure 4.14: A finite state transducer for inflections of past existential verb थि tʰi 'be' (affirmative)

b. In the negative formation of the existential verb थि tʰi-, the negative suffix -न -nʌ is inserted between the auxiliary stem थि tʰi- and the agreement inflections. During negation, no morphophonemic changes occur in the auxiliary verb stem, as in थिएनौँ tʰi-enʌũ in (19). There are eleven negative forms, one more than the positive ones, and Table 4.44 lists the inflections with their corresponding morphological tags.

(19) हामी तनाबमा थिएनौँ।
ɦami tʌnab-ma tʰi-enʌũ
1PL tension-LOC be.PST-NEG.1PL
'We were not in tension.'

Table 4.44: Inflections for past existential verb थि tʰi 'be' (negative)
Grammatical category                   Inflections   IPA     Tags
First person singular                  इनँ           inʌ̃     PST.NEG.1SG
First person plural                    एनौँ          enʌũ    PST.NEG.1PL
Second person singular                 इनस्          inʌs    PST.NEG.2SG
Second person singular masculine hon   एनौ           enʌu    PST.NEG.2SG.MASC.HON
Second person singular feminine hon    इनौ           inʌu    PST.NEG.2SG.FEM.HON
Second person plural                   एनौ           enʌu    PST.NEG.2PL
Third person masculine singular        एन            enʌ     PST.NEG.3SG.MASC
Third person feminine singular         इन            inʌ     PST.NEG.3SG.FEM
Third person masculine singular hon    एनन्          enʌn    PST.NEG.3SG.MASC.HON
Third person feminine singular hon     इनन्          inʌn    PST.NEG.3SG.FEM.HON
Third person plural                    एनन्          enʌn    PST.NEG.3PL



The finite state transducer in Figure 4.15 can analyze and generate the negative forms of auxiliary verb थ tʰi 'be'.



Figure 4.15: A finite state transducer for inflections of past existential verb थि tʰi 'be' (negative)

4.5.2 Tense
Nepali morphologically exhibits two tenses: past and non-past. Past tense refers to an action that is completed prior to the speech event, and non-past tense refers to an action that happens at the time of the speech event or later (Schmidt 1993; Pokharel 2054VS and Adhikari 2055VS).



a. Non-past Tense
Non-past tense in Nepali covers both present and future. There is no dedicated future tense marker; the same present (non-past) markers, sometimes combined with the prospective marker, refer to future time. In fact, there are no definite non-past tense markers in Nepali. The non-past existential auxiliary verb छ tsʰʌ discussed in 4.5.1 behaves as the non-past tense marker. The auxiliary meaning 'be' seems to be completely absorbed, while the agreement features are retained. This can be seen in (22): the verb खा kʰa 'eat' combines with the auxiliary verb छ tsʰʌ and its agreement inflection, but the form छन् tsʰʌn indicates only non-past tense and agreement features. The non-past tense inflections for person, number, gender and honorificity have ten forms altogether, and they are listed in Table 4.45 with their corresponding morphological tags.2

(22) भाइहरू भात खान्छन्।
bʰai-ɦʌruː bʰat kʰa-n-tsʰʌn
brother-PL rice eat-φ-NPST.3PL
'The brothers eat rice.'

2 Most traditional grammarians treat छ tsʰʌ separately as an auxiliary verb. So most of the verb stems, except the past forms, are considered compound stems.






Table 4.45: Inflections for non-past tense (affirmative)
Grammatical category                   Inflections   IPA       Tags
First person singular                  छु            tsʰu      NPST.1SG
First person plural                    छौँ           tsʰʌũ     NPST.1PL
Second person masculine singular       छस्           tsʰʌs     NPST.2SG.MASC
Second person feminine singular        छेस्          tsʰes     NPST.2SG.FEM
Second person masculine singular hon   छौ            tsʰʌu     NPST.2SG.MASC.HON
Second person feminine singular hon    छ्यौ          tsʰjʌu    NPST.2SG.FEM.HON
Second person plural                   छौ            tsʰʌu     NPST.2PL
Third person masculine singular        छ             tsʰʌ      NPST.3SG.MASC
Third person feminine singular         छे            tsʰe      NPST.3SG.FEM
Third person masculine singular hon    छन्           tsʰʌn     NPST.3SG.MASC.HON
Third person feminine singular hon     छिन्          tsʰin     NPST.3SG.FEM.HON
Third person plural                    छन्           tsʰʌn     NPST.3PL



The finite state transducer illustrated in Figure 4.16 can analyze and generate the nonpast positive forms of the inflections. This transducer is concatenated with finite state transducer of verb stems.
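The two directions of such a transducer, analysis and generation, can be imitated for illustration with a pair of lookup tables built from the IPA forms of Table 4.45. This is a hedged sketch rather than the transducer itself: the names `generate` and `analyze` are assumptions, and a real finite state network would also be concatenated with the verb stems.

```python
from collections import defaultdict

# (IPA form, tag) pairs copied from Table 4.45.
NON_PAST = [
    ("tsʰu", "NPST.1SG"),
    ("tsʰʌũ", "NPST.1PL"),
    ("tsʰʌs", "NPST.2SG.MASC"),
    ("tsʰes", "NPST.2SG.FEM"),
    ("tsʰʌu", "NPST.2SG.MASC.HON"),
    ("tsʰjʌu", "NPST.2SG.FEM.HON"),
    ("tsʰʌu", "NPST.2PL"),
    ("tsʰʌ", "NPST.3SG.MASC"),
    ("tsʰe", "NPST.3SG.FEM"),
    ("tsʰʌn", "NPST.3SG.MASC.HON"),
    ("tsʰin", "NPST.3SG.FEM.HON"),
    ("tsʰʌn", "NPST.3PL"),
]

# Generation direction: one surface form per tag.
TAG_TO_FORM = {tag: form for form, tag in NON_PAST}

# Analysis direction: a surface form may map to several tags.
FORM_TO_TAGS = defaultdict(list)
for form, tag in NON_PAST:
    FORM_TO_TAGS[form].append(tag)


def generate(tag: str) -> str:
    """Return the IPA inflection for a morphological tag."""
    return TAG_TO_FORM[tag]


def analyze(form: str) -> list:
    """Return every tag compatible with an IPA inflection (possibly none)."""
    return list(FORM_TO_TAGS.get(form, []))


if __name__ == "__main__":
    print(generate("NPST.3SG.FEM.HON"))
    print(analyze("tsʰʌn"))
```

The twelve tags collapse into ten distinct surface forms, so analysis of tsʰʌn or tsʰʌu returns two readings each; a morphological analyzer has to pass such ambiguity on to later processing.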



Figure 4.16: A finite state transducer for inflections of non-past tense

Non-past negative forms are formed by suffixing the negative marker -न -nʌ to verb stems. The non-past tense marker छ tsʰʌ 'be' is completely absorbed, and the semantically null element दि di or दै dʌi is inserted before the negative marker न nʌ; the agreement markers follow it. Even though the non-past tense marker is not overtly present, the tense is indicated by the sequence. There are altogether twelve non-past negative forms, which are listed in Table 4.46 with their corresponding morphological tags.



Table 4.46: Inflections for non-past tense negative 1
Grammatical category                   Inflections   IPA       Tags
First person singular                  दिनँ          dinʌ̃      NPST.NEG.1SG
First person plural                    दैनौँ         dʌinʌũ    NPST.NEG.1PL
Second person masculine singular       दैनस्         dʌinʌs    NPST.NEG.2SG.MASC
Second person feminine singular        दिनस्         dinʌs     NPST.NEG.2SG.FEM
Second person masculine singular hon   दैनौ          dʌinʌu    NPST.NEG.2SG.MASC.HON
Second person feminine singular hon    दिनौ          dinʌu     NPST.NEG.2SG.FEM.HON
Second person plural                   दैनौ          dʌinʌu    NPST.NEG.2PL
Third person masculine singular        दैन           dʌinʌ     NPST.NEG.3SG.MASC
Third person feminine singular         दिन           dinʌ      NPST.NEG.3SG.FEM
Third person masculine singular hon    दैनन्         dʌinʌn    NPST.NEG.3SG.MASC.HON
Third person feminine singular hon     दिनन्         dinʌn     NPST.NEG.3SG.FEM.HON
Third person plural                    दैनन्         dʌinʌn    NPST.NEG.3PL



The finite state transducer in Figure 4.17 can analyze and generate the non-past negative forms of the inflections set 1. This transducer is concatenated with transducer of verb stems.






Figure 4.17: A finite state transducer for inflections of non-past tense negative 1

There is another set consisting of the negative marker -न -nʌ and agreement inflections, which appears only with vowel-ending verb stems. As discussed above, the non-past tense marker छ tsʰʌ 'be' is completely absorbed; however, the tense feature is indicated by the sequence. In this case, the inflections indicate person, number and honorificity but not gender. Therefore, there are only eight non-past negative forms in this set, as listed in Table 4.47 with their corresponding morphological tags.

Table 4.47: Inflections for non-past tense negative 2
Grammatical category         Inflections   IPA    Tags
First person singular        नँ            nʌ̃     NPST.NEG.1SG
First person plural          नौँ           nʌũ    NPST.NEG.1PL
Second person singular       नस्           nʌs    NPST.NEG.2SG
Second person singular hon   नौ            nʌu    NPST.NEG.2SG.HON
Second person plural         नौ            nʌu    NPST.NEG.2PL
Third person singular        न             nʌ     NPST.NEG.3SG
Third person singular hon    नन्           nʌn    NPST.NEG.3SG.HON
Third person plural          नन्           nʌn    NPST.NEG.3PL



The finite state transducer in Figure 4.17a can analyze and generate the non-past negative forms of the inflections set 2. This transducer is concatenated with transducer of verb stems.






Figure 4.17a: A finite state transducer for inflections of non-past tense negative 2



b. Past Tense
The past tense in Nepali is indicated by the same inflectional paradigm as that of the past tense auxiliary verb stem थि- tʰi- 'be'. In example (23), the verb stem रो ro 'cry' is followed by the past tense and agreement inflection -यो -jo 'PST.3SG.MASC'. Although the past tense marker is not clearly separable, ए e, य् j and इ i in the paradigm indicate the past tense, and the inflections indicating the agreement features follow them. For convenience, the paradigm listed in Table 4.43 is reproduced in Table 4.48 with the corresponding morphological tags.

(23) बच्चो सारै रोयो।
bʌtstso sarʌi ro-jo
child more cry-PST.3SG.MASC
'The child cried a lot.'

Table 4.48: Inflections for past tense (affirmative)
Grammatical category                  Inflections   IPA    Tags
First person singular                 एँ            ẽ      PST.1SG
First person plural                   यौँ           jʌũ    PST.1PL
Second person singular                इस्           is     PST.2SG
Second person singular hon            यौ            jʌu    PST.2SG.HON
Second person plural                  यौ            jʌu    PST.2PL
Third person masculine singular       यो            jo     PST.3SG.MASC
Third person feminine singular        ई             iː     PST.3SG.FEM
Third person masculine singular hon   ए             e      PST.3SG.MASC.HON
Third person feminine singular hon    इन्           in     PST.3SG.FEM.HON
Third person plural                   ए             e      PST.3PL



The finite state transducer in Figure 4.18 can analyze and generate the positive inflections of past tense. This transducer is also concatenated with transducer of verb stems.



Figure 4.18: A finite state transducer for inflections of past tense (affirmative)

The phonological rules listed in PR 4.11 are directly encoded into the finite state transducer demonstrated in Figure 4.18.

Phonological rule PR 4.11
i. Independent vowels ए e, इ i and ई iː change to their corresponding dependent vowels ◌े e, ◌ि i and ◌ी iː, respectively, if the verb stems end with consonants.
Regular expressions:
ए -> ◌े || cons __;
इ -> ◌ि || cons __;
ई -> ◌ी || cons __;
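The effect of PR 4.11 at the boundary between stem and suffix can be sketched as follows. Two assumptions are made here that the rules above do not state: a consonant-final stem is stored with a final virama (as in गर्), and the rewrite is applied only at the stem boundary rather than anywhere in the word. The helper name `attach_suffix` is likewise hypothetical.

```python
VIRAMA = "\u094d"  # ् marks a stem-final consonant in this sketch (assumption)

# Independent vowel -> dependent vowel sign: ए -> े, इ -> ि, ई -> ी
DEPENDENT = {"\u090f": "\u0947", "\u0907": "\u093f", "\u0908": "\u0940"}


def attach_suffix(stem: str, suffix: str) -> str:
    """Join stem and suffix, using the dependent vowel sign after a consonant."""
    if stem.endswith(VIRAMA) and suffix[:1] in DEPENDENT:
        # Drop the virama and write the suffix-initial vowel as a vowel sign.
        return stem[:-1] + DEPENDENT[suffix[0]] + suffix[1:]
    # Vowel-final stems keep the suffix unchanged.
    return stem + suffix


if __name__ == "__main__":
    print(attach_suffix("\u0917\u0930\u094d", "\u090f\u0915\u094b"))  # गर् + एको
    print(attach_suffix("\u0930\u094b", "\u092f\u094b"))              # रो + यो
```

The same boundary rewrite recurs in the later rule sets for the negative past, perfect and inferential forms, so one helper of this kind would cover all of them under the stated assumptions.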



In the past tense negative forms, the negative marker न nʌ '-NEG' is inserted between the past markers ए e, य् j or इ i and the agreement inflections. In example (24), the negative marker न nʌ '-NEG' stands between the past tense marker ए e and the agreement inflection न् n. Altogether, there are eleven past tense negative forms, as listed in Table 4.49 with their corresponding morphological tags.

(24) बच्चाहरू रोएनन्।
bʌtstsa-ɦʌruː ro-enʌn
child-PL cry-PST.NEG.3PL
'The children did not cry.'

Table 4.49: Inflections for past tense (negative)
Grammatical category                  Inflections   IPA     Tags
First person singular                 इनँ           inʌ̃     PST.NEG.1SG
First person plural                   एनौँ          enʌũ    PST.NEG.1PL
Second person singular                इनस्          inʌs    PST.NEG.2SG
Second person singular hon            एनौ           enʌu    PST.NEG.2SG.HON
Second person singular feminine hon   इनौ           inʌu    PST.NEG.2SG.FEM.HON
Second person plural                  एनौ           enʌu    PST.NEG.2PL
Third person masculine singular       एन            enʌ     PST.NEG.3SG.MASC
Third person feminine singular        इन            inʌ     PST.NEG.3SG.FEM
Third person masculine singular hon   एनन्          enʌn    PST.NEG.3SG.MASC.HON
Third person feminine singular hon    इनन्          inʌn    PST.NEG.3SG.FEM.HON
Third person plural                   एनन्          enʌn    PST.NEG.3PL






The finite state transducer in Figure 4.19 can analyze and generate the negative inflections of past tense. This transducer is also concatenated with a transducer of verb stems.



Figure 4.19: A finite state transducer for inflections of past tense (negative)

The phonological rules involved are listed in PR 4.12 and have been directly encoded in the finite state transducer illustrated in Figure 4.19.

Phonological rules PR 4.12
i. Independent vowels ए e, इ i and ई iː change to their corresponding dependent vowels ◌े e, ◌ि i and ◌ी iː if the verb stems end with consonants.
Regular expressions:
ए -> ◌े || cons __;
इ -> ◌ि || cons __;
ई -> ◌ी || cons __;



4.5.3 Aspects
The internal temporal orientation of an event is said to be aspect, and this phenomenon is expressed in Nepali morphologically through inflections. The four traditionally recognized aspects, namely the perfect, imperfect, habitual and inferential (unknown) aspects (Pokharel 2054VS and Adhikari 2055VS), are illustrated and discussed in the subsequent sections.



a. Perfect Aspect
In Nepali, the perfect aspect is indicated by the suffix -एको -eko 'PERF.SG.MASC'. The aspect marker inflects for number and gender. The verb form गरेको gʌr-eko 'do-PERF.SG.MASC' in (25a) is singular masculine, the verb form गरेका gʌr-eka 'do-PERF.PL' in (25b) is plural and the verb form गरेकी gʌr-eki 'do-PERF.SG.FEM' in (25c) is singular feminine. The inflections of the perfect aspect are listed in Table 4.50.

(25) a. मैले यो काम गरेको छु।
mʌi-le jo kam gʌr-eko tsʰu
1SG.OBL-ERG this work do-PERF.SG.MASC be.NPST.1SG
'I have done this work.'

b. हामीले यो काम गरेका छौँ।
ɦami-le jo kam gʌr-eka tsʰʌũ
1PL-ERG this work do-PERF.PL be.NPST.1PL
'We have done this work.'

c. सीताले यो काम गरेकी छे।
siːta-le jo kam gʌr-eki tsʰe
Sita.FEM-ERG this work do-PERF.SG.FEM be.NPST.3SG.FEM
'Sita has done this work.'

Table 4.50: Inflections for perfect aspect
Grammatical category                 Inflections   IPA     Tags
Perfect singular masculine           एको           eko     PERF.SG.MASC
Perfect plural                       एका           eka     PERF.PL
Perfect singular feminine            एकी           ekiː    PERF.SG.FEM
Perfect singular feminine Emphatic   एकै           ekʌi    PERF.SG.FEM.EMPH



The finite state transducer in Figure 4.20 can analyze and generate the inflections of perfect aspect. This transducer is also concatenated with the finite state transducer of verb stems.






Figure 4.20: A finite state transducer for inflections of perfect aspect

The phonological rule listed in PR 4.13 has been compiled and composed with the transducer of Figure 4.20.

Phonological rules PR 4.13
i. Independent vowel ए e changes to its corresponding dependent vowel ◌े e if the verb stem ends with a consonant.
Regular expression:
ए -> ◌े || cons __;



The perfect aspect negative forms are formed by prefixing the negative marker न nʌ to the perfect aspect form of the verb stem, as in नगरेका nʌ-gʌr-eka 'NEG-do-PERF.HON' in (26).

(26) तिमीले यो काम नगरेका भए पैसा पाउँदैनौ।
timiː-le jo kam nʌ-gʌr-eka bʰʌ-e pʌisa pa-ũ-dʌinʌu
2SG.HON-ERG this work NEG-do-PERF.HON be-COND money get-NPST.NEG.2.HON
'If you have not done this work, (you) won't get money.'

b. Imperfect Aspect
The imperfect aspect in Nepali is indicated by the marker -दै -dʌi '-IMPERF'; in (27), the verb form गर्दै gʌr-dʌi 'do-IMPERF' is in the imperfect aspect. The marker -दै -dʌi is neutral with respect to number and gender. However, it has three other forms which distinguish number and gender. All four imperfect aspect markers are listed in Table 4.51.3

(27) केटो यो काम गर्दै छ।
ket̺o jo kam gʌr-dʌi tsʰʌ
boy.SG.MASC this work do-IMPERF be.NPST.3SG.MASC
'The boy is doing this work.'

Table 4.51: Inflections for imperfect aspect
Grammatical category           Inflections   IPA    Tags
Imperfect singular masculine   दो            do     IMPERF.SG.MASC
Imperfect singular feminine    दी            diː    IMPERF.SG.FEM
Imperfect plural               दा            da     IMPERF.PL
Imperfect                      दै            dʌi    IMPERF



The finite state transducer in Figure 4.21 can analyze and generate the inflections of imperfect aspect.



Figure 4.21: A finite state transducer for inflections of imperfect aspect

The finite state transducer demonstrated in Figure 4.21 is composed with the finite state transducer of the phonological rules listed in PR 4.14.

3 The suffixes -दो -do, -दा -da and -दी -diː may have some other syntactic and semantic status; however, for computational purposes, they are treated as imperfect suffixes.



Phonological rules PR 4.14
i. The nasal ◌ँ is inserted if the verb stem ends with a vowel and the following affix begins with द d.
Regular expression:
[. .] -> ◌ँ || vowel __ द;
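The nasal insertion of PR 4.14 is an epenthesis rule, and its effect at the stem boundary can be sketched in a few lines. The sketch rests on assumptions not stated in the rule: `vowel` is taken to mean a Devanagari independent vowel letter or a dependent vowel sign at the end of the stem, a bare consonant letter with its inherent vowel is not treated as vowel-final, and the names used are hypothetical.

```python
CHANDRABINDU = "\u0901"  # ँ
DA = "\u0926"            # द


def is_vowel_final(stem: str) -> bool:
    """True if the stem ends in an independent vowel or a dependent vowel sign."""
    last = stem[-1]
    return "\u0904" <= last <= "\u0914" or "\u093e" <= last <= "\u094c"


def attach_with_nasal(stem: str, suffix: str) -> str:
    """Join stem and suffix, inserting ँ between a vowel and a following द."""
    if stem and suffix.startswith(DA) and is_vowel_final(stem):
        return stem + CHANDRABINDU + suffix
    return stem + suffix


if __name__ == "__main__":
    print(attach_with_nasal("\u0916\u093e", "\u0926\u0948"))        # खा + दै
    print(attach_with_nasal("\u0917\u0930\u094d", "\u0926\u0948"))  # गर् + दै
```

Under these assumptions, खा plus दै yields खाँदै, while the consonant-final stem गर् is joined to दै without any nasal.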



The imperfect aspect negative forms are formed by prefixing the negative marker न nʌ 'NEG-' to the imperfect aspect form of the verb stem, as in नखाँदै nʌ-kʰã-dʌi 'NEG-eat-IMPERF' in (28).

(28) चिया नखाँदै उहाँले फोन उठाउनुभयो।
tsija nʌ-kʰã-dʌi uɦã-le pʰon ut̺ʰ-au-nu bʰ-jo
tea NEG-eat-IMPERF 3SG.HON-ERG phone lift-CAUSE-INF be.PST-3SG
'He lifted the phone while not drinking tea.'



c. Habitual Aspect
There are two habitual aspects in Nepali: present habitual and past habitual. The present habitual aspect is encoded by the non-past tense marker छ tsʰʌ and its inflections, as in पिउँछु pi-ũ-tsʰu 'drink-φ-NPST.1SG' in (29a), whereas the past habitual aspect is indicated by the marker थ् tʰ plus inflections for agreement in person, number, gender and honorificity. In example (29b), the verb form पिउँथेँ pi-ũ-tʰẽ 'drink-φ-HAB.PST.1SG' is in the past habitual form, in which थ् tʰ is the past habitual marker, followed by the agreement inflection. There are altogether ten past habitual inflections. They are listed in Table 4.52.

(29) a. म धेरै रक्सी पिउँछु।
mʌ dʰerʌi rʌksi pi-ũ-tsʰu
1SG more alcohol drink-φ-NPST.1SG
'I drink a lot of alcohol.'

b. म धेरै रक्सी पिउँथेँ।
mʌ dʰerʌi rʌksi pi-ũ-tʰẽ
1SG more alcohol drink-φ-HAB.PST.1SG
'I used to drink a lot of alcohol.'



Table 4.52: Inflections for past habitual aspect (Affirmative) Grammatical category First person singular



Inflections थ



IPA tʰẽ



Tags PST.HAB.1SG



First person plural







tʰjʌũ



PST.HAB.1PL



Second person singular



थस्



tʰis



PST.HAB.2SG



Second person singular hon



यौ



tʰjʌu



PST.HAB.2SG.HON



Second person plural



यौ



tʰjʌu



PST.HAB.2PL



Third person masculine singular



यो



tʰjo



PST.HAB.3SG.MASC



Third person feminine singular







tʰi



PST.HAB.3SG.FEM



Third person masculine singular hon



थे



tʰe



PST.HAB.3SG.MASC.HON



Third person feminine singular hon



थन्



tʰin



PST.HAB.3SG.FEM.HON



Third person plural



थे



tʰe



PST.HAB.3PL



The finite state transducer in Figure 4.22 can analyze and generate the inflections of past habitual aspect. This finite state transducer is concatenated with the finite state transducer of the verb stems.



Figure 4.22: A finite state transducer for inflections of habitual aspect (affirmative)

The past habitual negative forms are formed by inserting the negative marker न nʌ between the verb stem and the past habitual marker थ् tʰ plus agreement inflections. In this case, the semantically null elements दै dʌi and दि di are inserted between the verb stem and the negative marker न nʌ '-NEG', as in पिउँदैनथेँ pi-ũ-dʌinʌ-tʰẽ 'drink-φ-HAB.NEG.PST.1SG' in (30). There are altogether eleven past habitual negative forms, and they are listed in Table 4.53.

(30) म रक्सी पिउँदैनथेँ।
mʌ rʌksi pi-ũ-dʌinʌ-tʰẽ
1SG alcohol drink-φ-HAB.NEG.PST.1SG
'I did not use to drink alcohol.'



Table 4.53: Inflections for habitual aspect (negative) Grammatical category First person singular



Inflections दै नथ



dʌinʌtʰẽ



Tags PST. NEG.HAB.1SG



First person plural



दै न य



dʌinʌtʰjʌũ



PST.NEG.HAB.1PL



Second person singular



दै न थस्



dʌinʌtʰis



PST.NEG.HAB.2SG



Second person singular hon



दै न यौ



dʌinʌtʰjʌu



PST.NEG.HAB.2SG.HON



Second person plural



दै न यौ



dʌinʌtʰjʌu



PST.NEG.HAB.2PL



dinʌtʰis



PST.NEG.HAB.2SG.FEM



dinʌtʰjʌu



PST.NEG.HAB.2SG.FEN.HO N



dʌitʰjo



PST.NEG.HAB.3SG.MASC



dinʌtʰis



PST.NEG.HAB.3SG.FEM



दनथे



dinʌtʰe



PST.NEG.HAB.3SG.MASC. HON



दै नथे



dʌinʌtʰe



PST.NEG.HAB.3PL



Second person feminine दन थस् singular Second person feminine दन यौ singular hon Third person masculine दै न यो singular Third person feminine singular दन थस् Third person singular hon Third person plural



masculine



The finite state transducer in Figure 4.23 can analyze and generate the negative inflections of habitual aspect and it is also concatenated with the finite state transducer of the verb stems.






Figure 4.23: A finite state transducer for inflections of habitual aspect (negative)



d. Inferential (unknown) aspect
The inferential (unknown) aspect indicates an event that took place in the past but is known at present on the basis of some evidence or clues. The inferential form of a verb is formed by inserting the inferential aspect marker ए e or इ i between the verb stem and the non-past tense plus agreement inflections. The verb form सुतेछौ sut-etsʰʌu 'sleep-INFER.2.PL' in (31) is in the inferential aspect, and Table 4.54 lists the twelve inferential aspect inflections.

(31) तिमीहरू त हिजो बजारमा सुतेछौ।
timiː-ɦʌruː tʌ ɦidzo bʌdzar-ma sut-etsʰʌu
2-PL PART yesterday market-LOC sleep-INFER.2.PL
'(I came to know that) you slept in the market yesterday.'



Table 4.54: Inflections for inferential aspect (affirmative)
Grammatical category                   Inflections   IPA       Tags
First person singular                  एछु           etsʰu     PST.INFER.1SG
First person plural                    एछौँ          etsʰʌũ    PST.INFER.1PL
Second person masculine singular       एछस्          etsʰʌs    PST.INFER.2SG.MASC
Second person feminine singular        इछस्          itsʰʌs    PST.INFER.2SG.FEM
Second person masculine singular hon   एछौ           etsʰʌu    PST.INFER.2SG.MASC.HON
Second person feminine singular hon    इछौ           itsʰʌu    PST.INFER.2SG.FEM.HON
Second person plural                   एछौ           etsʰʌu    PST.INFER.2PL
Third person masculine singular        एछ            etsʰʌ     PST.INFER.3SG.MASC
Third person feminine singular         इछ            itsʰʌ     PST.INFER.3SG.FEM
Third person masculine singular hon    एछन्          etsʰʌn    PST.INFER.3SG.MASC.HON
Third person feminine singular hon     इछन्          itsʰʌn    PST.INFER.3SG.FEM.HON
Third person plural                    एछन्          etsʰʌn    PST.INFER.3PL



The finite state transducer in Figure 4.24 can analyze and generate the positive inflections of inferential aspect and it is concatenated with the finite state transducer of verb stems.



Figure 4.24: A finite state transducer for inflections of inferential aspect (affirmative)

The phonological rules listed in PR 4.15 are compiled and composed with the transducer demonstrated in Figure 4.24.

Phonological rules PR 4.15
i. Independent vowels ए e and इ i change to their corresponding dependent vowels ◌े e and ◌ि i, respectively, if the verb stems end with consonants.
Regular expressions:
ए -> ◌े || cons __;
इ -> ◌ि || cons __;



The inferential aspect negative forms are formed by inserting the negative marker न nʌ between the inferential aspect marker ए e or इ i and the agreement inflections, as in सुतेनछौ sut-enʌtsʰʌu 'sleep-INFER.NEG.2.PL' in (32). There are twelve inferential aspect negative forms, which are listed in Table 4.55.

(32) तिमीहरू त हिजो बजारमा सुतेनछौ।
timiː-ɦʌruː tʌ ɦidzo bʌdzar-ma sut-enʌtsʰʌu
2-PL PART yesterday market-LOC sleep-INFER.NEG.2PL
'(I came to know that) you did not sleep in the market yesterday.'

Table 4.55: Inflections for inferential aspect (negative)
Grammatical category                   Inflections   IPA         Tags
First person singular                  एनछु          enʌtsʰu     PST.INFER.NEG.1SG
First person plural                    एनछौँ         enʌtsʰʌũ    PST.INFER.NEG.1PL
Second person masculine singular       एनछस्         enʌtsʰʌs    PST.INFER.NEG.2SG.MASC
Second person feminine singular        इनछेस्        inʌtsʰes    PST.INFER.NEG.2SG.FEM
Second person masculine singular hon   एनछौ          enʌtsʰʌu    PST.INFER.NEG.2SG.MASC.HON
Second person feminine singular hon    इनछौ          inʌtsʰʌu    PST.INFER.NEG.2SG.FEM.HON
Second person plural                   एनछौ          enʌtsʰʌu    PST.INFER.NEG.2PL
Third person masculine singular        एनछ           enʌtsʰʌ     PST.INFER.NEG.3SG.MASC
Third person feminine singular         इनछ           inʌtsʰʌ     PST.INFER.NEG.3SG.FEM
Third person masculine singular hon    एनछन्         enʌtsʰʌn    PST.INFER.NEG.3SG.MASC.HON
Third person feminine singular hon     इनछन्         inʌtsʰʌn    PST.INFER.NEG.3SG.FEM.HON
Third person plural                    एनछन्         enʌtsʰʌn    PST.INFER.NEG.3PL



The finite state transducer in Figure 4.25 can analyze and generate the negative inflections of inferential aspect. This transducer is also concatenated with transducer of verb stems.



Figure 4.25: A finite state transducer for inflections of inferential aspect (negative) The rules involved in this process are listed in PR 4.16 which are compiled and composed with finite state transducer illustrated in Figure 4.25.



Phonological rules PR 4.16 i. Independent vowels ए e, इ i and ई i: change to their corresponding dependent vowels ◌े e, ि◌ i and ◌ी i:, respectively if the verb stems end with consonants. Regular expressions:



ए -> ◌े || cons __; इ -> ि◌ || cons __; ई -> ◌ी || cons __;
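The effect of these rewrite rules can be sketched outside a finite-state toolkit. The Python fragment below is only an illustration, not the compiled network described in this chapter: the function name `attach` is hypothetical, and it assumes that consonant-final stems are stored either with a final halant (as in सुत्) or as a bare consonant letter.

```python
# Independent vowel letters mapped to their dependent signs (matras),
# mirroring the three rules of PR 4.16.
MATRA = {
    "\u090F": "\u0947",  # ए -> े
    "\u0907": "\u093F",  # इ -> ि
    "\u0908": "\u0940",  # ई -> ी
}
VIRAMA = "\u094D"  # halant


def is_consonant(ch: str) -> bool:
    """True for the Devanagari consonant letters क (U+0915) to ह (U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def attach(stem: str, suffix: str) -> str:
    """Concatenate stem and suffix, applying the 'cons __' context.

    Assumption (not from the source): a stem-final halant is dropped
    before the dependent vowel sign is written.
    """
    base = stem[:-1] if stem.endswith(VIRAMA) else stem
    if base and is_consonant(base[-1]) and suffix[:1] in MATRA:
        return base + MATRA[suffix[0]] + suffix[1:]
    return stem + suffix


if __name__ == "__main__":
    # सुत् 'sleep' + एनछौ gives the form cited in example (32).
    print(attach("\u0938\u0941\u0924\u094D", "\u090F\u0928\u091B\u094C"))
```

A vowel-final stem such as जा is left untouched by `attach`, because the left context of the rule is not met.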



4.5.4 Moods

Morphologically, Nepali has two types of moods, namely, declarative and non-declarative. The former does not have a distinct marker; rather, it is indicated by default. The latter is further subdivided into imperative, optative and potential moods (Pokharel 2054VS). Each of them is indicated by its respective marker.



a. Imperative Mood
The imperative form of a verb has number and honorific distinctions. The base stem of the verb serves as the singular non-honorific imperative form. The other two forms are singular honorific and plural. The imperative markers differ depending upon the final segment of the verb stem. Consonant-ending verb stems take -φ for singular non-honorific and -अ -ʌ for singular honorific and plural. Among vowel-ending verb stems, i-ending stems take -ई -iː for singular non-honorific and -अ -ʌ for singular honorific and plural; other vowel-ending stems take -φ for singular non-honorific, -ऊ -uː for singular honorific and -ओ -o for plural. The imperative inflections are listed in Table 4.56. In example (33), for instance, the imperative verb form जाओ dza-o 'go-IMP.2PL' is a plural form.

(33) तिमीहरू बजारतिर जाओ!
timiː-ɦʌruː bʌdzar-tirʌ dza-o!
2-PL market-DIR go-IMP.2PL
'(You) go towards the market.'

Table 4.56: Inflections for imperative mood

Grammatical category | Inflections | IPA | Tag
Second person singular | -φ/-ई | -φ/-iː | IMP.2SG
Second person singular hon | -अ/-ऊ | -ʌ/-uː | IMP.2SG.HON
Second person plural | -अ/-ओ | -ʌ/-o | IMP.2PL



The finite state transducer in Figure 4.26 can analyze and generate the inflections of imperative mood. This transducer is also concatenated with transducer of verb stems.
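The stem-conditioned choice of imperative marker can be sketched as a small lookup. The fragment below is an illustration under stated assumptions rather than the transducer of Figure 4.26: it works on IPA transliterations, the function names are hypothetical, the plural marker -o for non-i vowel stems is read off Table 4.56 and example (33), and stems ending in a nasalized vowel are not handled.

```python
VOWELS = set("aeiouʌ")  # plain vowel symbols used in the IPA transcription


def imperative_suffix(stem: str, tag: str) -> str:
    """Return the imperative marker for an IPA stem and a tag.

    tag is one of "IMP.2SG", "IMP.2SG.HON", "IMP.2PL".
    """
    last = stem.rstrip("ː")[-1]  # ignore a final length mark
    if last not in VOWELS:       # consonant-ending stems
        table = {"IMP.2SG": "", "IMP.2SG.HON": "ʌ", "IMP.2PL": "ʌ"}
    elif last == "i":            # i-ending stems
        table = {"IMP.2SG": "iː", "IMP.2SG.HON": "ʌ", "IMP.2PL": "ʌ"}
    else:                        # other vowel-ending stems
        table = {"IMP.2SG": "", "IMP.2SG.HON": "uː", "IMP.2PL": "o"}
    return table[tag]


def imperative(stem: str, tag: str, negative: bool = False) -> str:
    """Generate an imperative form; the negative form prefixes nʌ."""
    form = stem + imperative_suffix(stem, tag)
    return "nʌ" + form if negative else form


if __name__ == "__main__":
    print(imperative("dza", "IMP.2PL"))                 # cf. dza-o in (33)
    print(imperative("dza", "IMP.2PL", negative=True))  # cf. nʌ-dza-o in (34)
```

The sketch only concatenates morphemes; any further adjustment at the stem boundary would have to be stated as separate rules.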



Figure 4.26: A finite state transducer for inflections of imperative mood

The phonological rules listed in PR 4.17 are compiled into a network and composed with the finite state transducer demonstrated in Figure 4.26.

Phonological rules PR 4.17
i. ^IMPsg is removed for the non-honorific imperative. Regular expression: ^IMPsg -> [ ],
ii. ^IMPhon changes to ऊ for the honorific imperative after आ a ending verb stems. Regular expression: ^IMPhon -> ऊ || आ _ ,
iii. ^IMPpl changes to ओ for the plural imperative after आ a ending verb stems. Regular expression: ^IMPpl -> ओ || आ _ ,4

Negative imperative forms are obtained by prefixing the negative marker न nʌ 'NEG-' to the imperative form of the verb stems. The example sentence in (34) is the negative counterpart of example (33), in which the form नजाओ nʌ-dza-o 'NEG-go-IMP.2PL' is the negative imperative.

(34) तिमीहरू बजारतिर नजाओ!
timiː-ɦʌruː bʌdzar-tirʌ nʌ-dza-o!
2-PL market-DIR NEG-go-IMP.2PL
'(You) do not go towards the market.'

4 Arbitrary tags such as ^IMPsg and ^IMPpl are used for creating the environment and are finally eliminated from the network.



b. Optative Mood
The optative forms are obtained from the combination of verb stems and the optative inflections. The verb stems in optative mood inflect for person, number, gender and honorificity. In example (35), the verb form गरेस् gʌr-es 'do-OPT.2SG' is the second person singular optative form. There are altogether eight optative inflections, which are listed in Table 4.57.

(35) ल परदेशमा राम्रोसँग काम गरेस्।
lʌ pʌrʌdes-ma ramro-sʌ̃gʌ kam gʌr-es
PART foreign-LOC good-COM work do-OPT.2SG
'I wish, (you) work nicely in foreign country.'



Table 4.57: Inflections for optative mood (affirmative)

Grammatical category | Inflections | IPA | Tags
First person singular | ऊँ | ũ | OPT.1SG
First person plural | औँ | ʌũ | OPT.1PL
Second person singular | एस् | es | OPT.2SG
Second person singular hon | ए | e | OPT.2SG.HON
Second person plural | ए | e | OPT.2PL
Third person singular | ओस् | os | OPT.3SG
Third person singular hon | ऊन् | uːn | OPT.3SG.HON
Third person plural | ऊन् | uːn | OPT.3PL



The finite state transducer in Figure 4.27 can analyze and generate the inflections of optative mood. This transducer is also concatenated with transducer of verb stems.
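What "analyze and generate" means for such a paradigm can be illustrated with a plain lookup table. This is a sketch, not the transducer of Figure 4.27: it uses the IPA column of Table 4.57, a hypothetical one-entry stem lexicon, and invented function names.

```python
# Suffix/tag pairs taken from the IPA and Tags columns of Table 4.57.
OPTATIVE = [
    ("ũ", "OPT.1SG"),
    ("ʌũ", "OPT.1PL"),
    ("es", "OPT.2SG"),
    ("e", "OPT.2SG.HON"),
    ("e", "OPT.2PL"),
    ("os", "OPT.3SG"),
    ("uːn", "OPT.3SG.HON"),
    ("uːn", "OPT.3PL"),
]

# Hypothetical stem lexicon with a single entry, for illustration only.
STEMS = {"gʌr": "do"}


def generate(stem: str, tag: str) -> str:
    """Lexical side to surface side: stem plus tag gives a word form."""
    for suffix, candidate in OPTATIVE:
        if candidate == tag:
            return stem + suffix
    raise KeyError(tag)


def analyze(form: str) -> list:
    """Surface side to lexical side: all (stem, tag) readings of a form."""
    return [
        (stem, tag)
        for stem in STEMS
        for suffix, tag in OPTATIVE
        if form == stem + suffix
    ]


if __name__ == "__main__":
    print(generate("gʌr", "OPT.2SG"))
    print(analyze("gʌre"))  # two readings, since 2SG.HON and 2PL share -e
```

Because several cells of the paradigm share a marker, analysis is one-to-many while generation is deterministic; a transducer represents both directions in a single network.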



Figure 4.27: A finite state transducer for inflections of optative mood






The phonological rules listed in PR 4.18 are compiled and composed with the transducer illustrated in Figure 4.27.



Phonological rules PR 4.18 i. Independent vowels ए e, औ ʌu, ओ o and ऊ u: change to their corresponding dependent vowels ◌े e, ◌ौ ʌu, ◌ो o and ◌ू u:, respectively if the verb stems end with consonants. Regular expression:



ए -> ◌े || cons __; औ -> ◌ौ || cons __; ओ -> ◌ो || cons __; ऊ -> ◌ू || cons __;



The negative optative forms are obtained by prefixing the negative marker न nʌ 'NEG' to the optative forms of the verb. In the sentence (36) the form नगरोस् nʌ-gʌr-os 'NEG-do-OPT.3SG' is the third person singular negative optative form.



(36) उसले त्यस्तो काम नगरोस्।
us-le tjʌsto kam nʌ-gʌr-os
3SG.OBL-ERG like.that work NEG-do-OPT.3SG
'I wish him not to do work like that.'



c. Potential Mood
Potential forms of the verbs are obtained from the combination of verb stems and potential inflections. The potential forms make distinctions of person, number, gender and honorificity. In example (37), the verb stem गर् gʌr 'do' and the potential mood marker -ला -la '-POT' form the third person singular potential verb form. Altogether, there are twelve potential mood inflections, which are listed in Table 4.58.

(37) यो केटाले परीक्षा पास गर्ला।
jo ket̺a-le pʌriksja pas gʌr-la
DEM.PROX boy-ERG examination pass do-POT.3SG
'This boy may pass the examination.'

Table 4.58: Inflections for potential mood (affirmative)



Grammatical category | Inflections | IPA | Tag
First person singular | उँला | ũla | POT.1SG
First person plural | औँला | ʌũla | POT.1PL
Second person masculine singular | लास् | las | POT.2SG.MASC
Second person feminine singular | लिस् | lis | POT.2SG.FEM
Second person masculine singular hon | औला | ʌula | POT.2SG.MASC.HON
Second person feminine singular hon | औली | ʌuli | POT.2SG.FEM.HON
Second person plural | औला | ʌula | POT.2PL
Third person masculine singular | ला | la | POT.3SG.MASC
Third person feminine singular | ली | li | POT.3SG.FEM
Third person masculine singular hon | लान् | lan | POT.3SG.MASC.HON
Third person feminine singular hon | लिन् | lin | POT.3SG.FEM.HON
Third person plural | लान् | lan | POT.3PL



The finite state transducer in Figure 4.28 can analyze and generate the inflections of potential mood.



Figure 4.28: A finite state transducer for inflections of potential mood

The transducer presented in Figure 4.28 is composed with the network of the phonological rules in PR 4.19.

Phonological rules PR 4.19
i. Independent vowels उ u and औ ʌu change to their corresponding dependent vowels ◌ु u and ◌ौ ʌu if the verb stems end with consonants. Regular expression:

उ -> ◌ु || cons __; औ -> ◌ौ || cons __;

The negative potential mood forms are obtained by prefixing the negative marker न nʌ 'NEG-' to the verb stems. The verb form नगर्ला nʌ-gʌr-la 'NEG-do-POT.3SG' in (38) is the third person negative potential form.

(38) यो केटाले परीक्षा पास नगर्ला।
jo ket̺a-le pʌriksja pas nʌ-gʌr-la
DEM.PROX boy-ERG examination pass NEG-do-POT.3SG
'This boy may not pass the examination.'



4.5.5 Participial forms

a. Absolutive
The absolutive form is formed from the combination of verb stems and the absolutive marker ई iː '-ABS'. The absolutive form normally occurs with other forms of the verbs, forming compound verbs. In example (39), the verb form पठाई pʌt̺ʰa-iː 'send-ABS' is the absolutive form. The inflection for the absolutive participle is listed in Table 4.59 with its morphological tag, and its finite state transducer is demonstrated in Figure 4.29.

(39) उसले चिठ्ठी पठाई दियो।
us-le tsit̺ʰt̺ʰiː pʌt̺ʰa-iː di-jo
3SG.OBL-ERG letter send-ABS give-PST.3SG
'He sent a letter.'

Table 4.59: Inflection for absolutive participle

Grammatical category | Inflections | IPA | Tags
Absolutive | ई | iː | ABS



The finite state transducer presented in Figure 4.29 can analyze and generate the absolutive form when it is concatenated with the finite state transducer of the verb stems.



Figure 4.29: A finite state transducer for inflection of absolutive form

The phonological rules involved are compiled and composed with the finite state transducer demonstrated in Figure 4.29.

Phonological rules PR 4.20
i. Independent vowel ई iː changes to its corresponding dependent vowel ◌ी iː if the verb stems end with consonants. Regular expression: ई -> ◌ी || cons __;

The negative absolutive form is obtained by prefixing the negative marker न nʌ 'NEG-' to the absolutive verb form. The verb form नपठाई nʌ-pʌt̺ʰa-iː 'NEG-send-ABS' in (40) is the negative absolutive form.

(40) उसले चिठ्ठी नपठाई राख्यो।
us-le tsit̺ʰt̺ʰiː nʌ-pʌt̺ʰa-iː rakʰ-jo
3SG.OBL-ERG letter NEG-send-ABS keep-PST.3SG
'He kept the letter without sending it.'

b. Infinitive
The infinitive form of a verb is, in fact, the dictionary entry in Nepali. It is obtained by suffixing the infinitive marker -नु -nu '-INF' to the verb stem. In example (41), the verb form हिँड्नु ɦĩd̺-nu 'walk-INF' is the infinitive form. The infinitive has three forms: infinitive, oblique and emphatic. The infinitive is the default one, with the marker -नु -nu '-INF'; the oblique form, with the marker -ना -na, occurs with case markers; and the emphatic form, with the marker -नै -nʌi, occurs at the pragmatic level. The infinitive markers are listed in Table 4.60 with their morphological tags.

(41) बिहान हिँड्नु राम्रो कुरो हो।
biɦan ɦĩd̺-nu ramro kuro ɦo
morning walk-INF good.SG.MASC thing be.ID.NPST.3SG
'To walk in the morning is a good thing.'

Table 4.60: Inflections for infinitive participle

Grammatical category | Inflections | IPA | Tags
Infinitive | नु | nu | INF
Infinitive oblique | ना | na | INF.OBL
Infinitive emphatic | नै | nʌi | INF.EMPH



The finite state transducer illustrated in Figure 4.30 can analyze and generate the infinitive forms when it is concatenated with the finite state transducer of the verb stems.

Figure 4.30: A finite state transducer for inflections of infinitive participial form

The negative infinitive form is obtained by prefixing the negative marker न- nʌ- 'NEG-' to the verb stems. The verb form नहिँड्नु nʌ-ɦĩd̺-nu 'NEG-walk-INF' in (42) is the negative infinitive form.

(42) बिहान नहिँड्नु राम्रो कुरो होइन।
biɦan nʌ-ɦĩd̺-nu ramro kuro ɦoinʌ
morning NEG-walk-INF good.SG.MASC thing be.ID.NPST.NEG.3SG
'Not to walk in the morning is not a good thing.'



c. Purposive
The purposive form of the verb is obtained from the combination of the verb stem and the purposive marker -न -nʌ '-PURP'. The verb form किन्न kin-nʌ 'buy-PURP' in (43) is the purposive form. The purposive marker also inflects for emphasis. The inflections for the purposive participle are listed in Table 4.61 with their morphological tags.

(43) म समान किन्न बजार जान्छु।
mʌ sʌman kin-nʌ bʌdzar dza-n-tsʰu
1SG thing buy-PURP market go-φ-NPST.1SG
'I will go to the market (in order) to buy things.'

Table 4.61: Inflections for purposive participle

Grammatical category | Inflections | IPA | Tags
Purposive | न | nʌ | PURP
Purposive emphasis | नै | nʌi | PURP.EMPH



The finite state transducer illustrated in Figure 4.31 encodes purposive participial inflections listed in Table 4.61 and it is capable of analyzing and generating the purposive forms when it is concatenated with the finite state transducer of the verb stems.



Figure 4.31: A finite state transducer for inflections of purposive participial form

d. Prospective
The prospective form of the verb is formed by suffixing the prospective marker -ने -ne to the verb stems. The verb form गुजार्ने gudzar-ne in (44) is the prospective form. The prospective inflection is listed in Table 4.62 with its morphological tag.

(44) उनीहरूले यो वर्ष पनि त्रिपालमुनि नै गुजार्ने छन्।
uniː-ɦʌruː-le jo wʌrsʌ pʌni tripalmuni nʌi gudzar-ne tsʰʌn
3.OBL-PL-ERG this year also tent PART spend-PROSP be.NPST.3PL
'This year also, they will spend under the tent.'

Table 4.62: Inflection for prospective participle

Grammatical category | Inflections | IPA | Tags
Prospective | ने | ne | PROSP



The finite state transducer demonstrated in Figure 4.32 encodes the prospective participial forms and it can analyze and generate the prospective forms when it is concatenated with the finite state transducer of the verb stems.



Figure 4.32: A finite state transducer for inflection of prospective participial form



e. Durative
The durative form is formed from the combination of verb stems and the durative marker -दा -da. In example (45), the verb form आउँदा a-ũ-da is the durative form, which indicates the duration of the action. The durative form also inflects for emphasis. The inflections for durative participles are listed in Table 4.63 with their morphological tags.

(45) उनीहरू आउँदा म सुतिरहेको थिएँ।
uniː-ɦʌruː a-ũ-da mʌ sut-i-rʌɦ-eko tʰiẽ
3.OBL-PL come-φ-DUR 1SG sleep-ABS-remain-PERF.SG.M be.P.1SG
'I was sleeping when they arrived.'

Table 4.63: Inflections for durative participle

Grammatical category | Inflections | IPA | Tags
Durative | दा | da | DUR
Durative emphatic | दै | dʌi | DUR.EMPH



The durative inflections listed in Table 4.63 are compiled into a network as illustrated in Figure 4.33 and it becomes capable of analyzing and generating the durative forms of the verbs when it is concatenated with the finite state transducer of verb stems.



Figure 4.33: A finite state transducer for inflections of durative participial forms

f. Conjunctive
The conjunctive participle form of the verb is obtained from the combination of the verb stems and the conjunctive marker -एर -erʌ. In example (46), पढेर pʌdʰ-erʌ is the conjunctive participle form. This conjunctive form inflects for emphasis at the pragmatic level with the marker -ऐ -ʌi. There is another conjunctive marker, -इकन -ikʌnʌ, but it is used with low frequency. The inflections for the conjunctive participle are listed in Table 4.64 with their morphological tags.

(46) राम स्कुलमा पढेर आयो।
ram iskul-ma pʌdʰ-erʌ a-jo
Ram school-LOC read-CONJ come-PST.3SG.M
'Ram studied in the school and came.'

Table 4.64: Inflections for conjunctive participle

Grammatical category | Inflections | IPA | Tags
Conjunctive | एर | erʌ | CONJ
Conjunctive emphasis | एरै | erʌi | CONJ.EMPH
Conjunctive | इकन | ikʌnʌ | CONJ
Conjunctive emphasis | इकनै | ikʌnʌi | CONJ.EMPH



The finite state transducer illustrated in Figure 4.34 encodes the conjunctive participial forms and it can analyze and generate them when it is concatenated with the finite state transducer of the verb stems.



Figure 4.34: A finite state transducer for inflections of conjunctive participial form

The phonological rules involved in this process are listed in PR 4.21. They are compiled and composed with the finite state transducer illustrated in Figure 4.34.

Phonological rules PR 4.21
i. Independent vowels ए e and इ i change to their corresponding dependent vowels ◌े e and ◌ि i if the verb stems end with consonants. Regular expression: ए -> ◌े || cons __; इ -> ◌ि || cons __;



g. Conditional
The conditional form of the verb is obtained from the combination of the verb stems and the conditional marker ए -e. In example (47), गरे gʌr-e is the conditional form of the verb. The inflection for the conditional participle is listed in Table 4.65 with its morphological tag.

(47) तिमीले सहयोग गरे म पास हुन्छु।
timiː-le sʌɦʌjog gʌr-e mʌ pas ɦu-n-tsʰu
2SG-ERG help do-COND 1SG pass be-φ-NPST.1SG
'If you help I will pass.'

Table 4.65: Inflection for conditional participle

Grammatical category | Inflections | IPA | Tag
Conditional | ए | e | COND



The finite state transducer illustrated in Figure 4.35 encodes the conditional inflection and in association with transducer of verb stems, it can analyze and generate the conditional forms.



Figure 4.35: A finite state transducer for conditional participial form

The phonological rules in PR 4.22 are compiled and composed with the finite state transducer illustrated in Figure 4.35.

Phonological rules PR 4.22
i. Independent vowel ए e changes to its corresponding dependent vowel ◌े e if the verb stems end with consonants. Regular expression: ए -> ◌े || cons __;

h. Perfective
The perfective form of the verb is obtained from the combination of the verb stems and the perfective marker ए -e. In example (48), भने bʰʌn-e is the perfective form of the verb. This form generally occurs with some postpositions. The inflection for the perfective participle is listed in Table 4.65 with its morphological tag.5

(48) तिमीले भनेअनुसार मैले काम गरेँ।
timiː-le bʰʌn-e-ʌnusar mʌi-le kam gʌr-ẽ
2SG-ERG say-PERFT-POSTP 1SG.OBL-ERG work do-PST.1SG
'I did the work as you said.'

Table 4.65: Inflection for perfective participle

Grammatical category | Inflection | IPA | Tag
Perfective | ए | e | PERFT

5 Pokharel (2054VS) has treated this form as oblique of past tense forms.



The finite state transducer illustrated in Figure 4.36 encodes the perfective inflection. It can analyze and generate the perfective forms of the verbs when concatenated with the finite state transducer of the verb stems.



Figure 4.36: A finite state transducer for inflection of perfective participial form

The phonological rules listed in PR 4.23 are compiled and composed with the finite state transducer illustrated in Figure 4.36.

Phonological rules PR 4.23
i. Independent vowel ए e changes to its corresponding dependent vowel ◌े e if the verb stems end with consonants. Regular expression:



ए -> ◌े || cons __;



4.6 Summary

In this chapter, we presented the verb stems, which are classified into two major groups: intransitive-based stems and transitive-based stems. Intransitive-based stems are further grouped into five classes and transitive-based stems into four classes. Each class is distinct in at least one feature discussed in the process of stem formation. A small set of irregular verb stems has been discussed as a separate class. The existential and identificational auxiliary verbs were discussed and illustrated under inflections, as they carry features more or less similar to the inflections. Inflectional paradigms for both affirmative and negative forms of tense, aspects, moods and participles were analyzed and illustrated. The finite state transducers of each type have been demonstrated. The phonological rules were identified, stated and expressed in regular expression format.






CHAPTER 5 ADVERBS, CONJUNCTIONS, POSTPOSITIONS AND PARTICLES



5.0 Outline

This chapter presents the analysis of closed-class words. It consists of five sections. Section 5.1 groups adverbs and presents them with morphological tags and corresponding finite state transducers. In section 5.2, we present conjunctions with their morphological tags and finite state transducers. Section 5.3 deals with postpositions, namely, the plural marker, case markers and adverbial postpositions, with their morphological tags. Section 5.4 discusses the particles and interjections and also presents morphological tags and finite state transducers. Finally, section 5.5 summarizes the findings.



5.1 Adverbs in Nepali

Adverbs in Nepali indicate manner, place, time and intensity. They do not inflect for anything but appear with postpositions in writing (Adhikari 2062VS). They are not obligatory elements in the sentence. For our purpose, however, they are classified into various groups based on their semantics.1

5.1.1 Temporal adverbs

Temporal adverbs are those adverbs which indicate time with respect to the action performed, as आज adzʌ 'today' in (1). Table 5.1 lists some temporal adverbs.

(1) राम आज स्कुल गयो।
ram adzʌ skul gʌ-jo
Ram today school go.PST-3SG.MASC
'Today, Ram went to school.'

1 Though the classification of adverbs in Nepali into various classes is not computationally significant, it has been done simply for identification.



Table 5.1: Temporal adverbs

Morphological tags | Devanagari | IPA | Gloss
+ADV+TEMP | अहिले | ʌhile | now
+ADV+TEMP | आज | adzʌ | today
+ADV+TEMP | अबेर | ʌberʌ | late
+ADV+TEMP | सोमबार | somʌbarʌ | Monday
+ADV+TEMP | हिउँद | hiũdʌ | winter



Since temporal adverbs do not inflect, the finite-state transducer demonstrated in Figure 5.1 is capable of analyzing and generating them.



Figure 5.1: A finite state transducer for temporal adverbs



5.1.2 Spatial adverbs

Spatial adverbs are those adverbs which indicate the place or location where the action has taken place, as नजिकै nʌdzik-ʌi 'near-EMPH' in (2). Table 5.2 lists some of the spatial adverbs.

(2) मेरो घर नजिकै मन्दिर छ।
mero gʰʌr nʌdzik-ʌi mʌndir tsʰʌ
1SG.GEN house near-EMPH temple be.NP.3SG.MASC
'There is a temple near my house.'

Table 5.2: Spatial adverbs

Morphological tags | Devanagari | IPA | Gloss
+ADV+SPAC | तल | tʌlʌ | below
+ADV+SPAC | यहाँ | jʌhã | here
+ADV+SPAC | पछाडि | pʌtsad̺i | behind
+ADV+SPAC | भित्र | bʰitrʌ | inside
+ADV+SPAC | नजिक | nʌdzik | near



The finite state transducer illustrated in Figure 5.2 encodes the adverbs listed in Table 5.2 and it can analyze and generate them.



Figure 5.2: A finite state transducer for spatial adverbs



5.1.3 Amount adverbs

Amount adverbs are those words which indicate the amount of the head nouns they modify, as धेरै dʰerʌi 'more' in (3). Table 5.3 lists some of the amount adverbs.

(3) रामसँग धेरै पैसा छ।
ram-sʌ̃gʌ dʰerʌi pʌisa tsʰʌ
Ram-COM more money be.NP.3SG.MASC
'Ram has a lot of money.'

Table 5.3: Amount adverbs

Morphological tags | Devanagari | IPA | Gloss
+ADV+AMOUNT | धेरै | dʰerʌi | more
+ADV+AMOUNT | थोरै | tʰorʌi | less
+ADV+AMOUNT | अलिक | ʌlik | little
+ADV+AMOUNT | त्यति | tjʌti | that much
+ADV+AMOUNT | कति | kʌti | how much



Since amount adverbs do not inflect, the finite-state transducer is simple and it is demonstrated in Figure 5.3 and it is capable of analyzing and generating them.



Figure 5.3: A finite state transducer for amount adverbs






5.1.4 Manner adverbs

Manner adverbs are those adverbs which indicate the way or mode in which the action has taken place, as बिस्तारो bistaro 'slowly' in (4). Table 5.4 lists some of the manner adverbs.

(4) सीता बिस्तारो पढ्छे।
siːta bistaro pʌd̺ʰ-tsʰe
Sita.FEM slowly read-NP.3SG.FEM
'Sita reads slowly.'

Table 5.4: Manner adverbs

Morphological tags | Devanagari | IPA | Gloss
+ADV+MANNER | सुस्तरी | sustʌri | slowly
+ADV+MANNER | कसरी | kʌsʌri | how
+ADV+MANNER | सुटुक्क | sut̺ukkʌ | quietly
+ADV+MANNER | छिटो | tsit̺o | quickly
+ADV+MANNER | यसरी | jʌsʌri | this way



The finite state transducer demonstrated in Figure 5.4 encodes the manner adverbs listed in Table 5.4 and it can analyze and generate them.



Figure 5.4: A finite state transducer for manner adverbs

5.1.5 Frequency adverbs

Frequency adverbs are those words which indicate the frequency of the action that is performed or takes place, as कहिलेकाहीँ kʌhilekahĩ 'sometimes' in (5). Table 5.5 lists some of the frequency adverbs.

(5) हामी कहिलेकाहीँ बजार जान्छौँ।
ɦamiː kʌhilekahĩ bʌdzar dza-n-tsʰʌũ
1PL sometimes market go-φ-NP.1PL
'We sometimes go to market.'

Table 5.5: Frequency adverbs

Morphological tags | Devanagari | IPA | Gloss
+ADV+FREQ | कहिलेकाहीँ | kʌhilekahĩ | sometimes
+ADV+FREQ | सधैँ | sʌdʰʌĩ | always
+ADV+FREQ | बारम्बार | barʌmbar | frequently
+ADV+FREQ | प्रायः | prajʌ | often
+ADV+FREQ | निरन्तर | nirʌntʌr | continuously



The finite-state transducer is simple since frequency adverbs do not inflect. It is demonstrated in Figure 5.5 and it is capable of analyzing and generating them.



Figure 5.5: A finite state transducer for frequency adverbs



5.1.6 Reason adverbs

Reason adverbs are those which provide reasons, as त्यसकारण tjʌskarʌɳ 'therefore' in (6). Table 5.6 lists some of the reason adverbs.

(6) त्यसकारण म माथि दुवै ठाउँको प्रभाव छ।
tjʌskarʌɳ mʌ matʰi duwʌi tʰaũ-ko prʌbʰawʌ tsʰʌ
therefore 1SG above both place-GEN influence be-NP.3SG.MASC
'Therefore, I have the influences of both places.'

Table 5.6: Reason adverbs

Morphological tags | Devanagari | IPA | Gloss
+ADV+REASON | त्यसकारण | tjʌsʌkarʌɳ | therefore
+ADV+REASON | फलस्वरूप | pʰʌlʌswʌrup | as a result
+ADV+REASON | तसर्थ | tʌsʌrtʰʌ | thus



The reason adverbs listed in Table 5.6 are compiled into a finite state transducer as demonstrated in Figure 5.6 and it can analyze and generate them.






Figure 5.6: A finite state transducer for reason adverbs



5.1.7 Sentential adverbs

Sentential adverbs are those which modify the entire sentence, as सायद sajʌdʌ 'probably' in (7). Table 5.7 lists some of the sentential adverbs.

(7) सायद यति धेरै खुशीलाई मनमै अटाउन असमर्थ भएँ।
sajʌdʌ jʌti dʰerʌi kʰusi-laiː mʌn-mʌi ʌtaunʌ ʌsʌmʌrtʰ bʰʌẽ
probably this.much more happy-DAT heart-LOC.EMPH keep-INF unable be-P.1SG
'Probably, (I) could not keep this much happiness within the heart.'

Table 5.7: Sentential adverbs

Morphological tags | Devanagari | IPA | Gloss
+ADV+SENT | सायद | sajʌd | probably
+ADV+SENT | अवश्य | ʌwʌʃjʌ | surely
+ADV+SENT | सामान्यतः | samanjʌtʌ | generally
+ADV+SENT | यद्यपि | jʌdjʌpi | however
+ADV+SENT | साँच्चै | sãtstsʌi | truly



The finite state transducer illustrated in Figure 5.7 encodes the sentential adverbs listed in Table 5.7 and it is capable of analyzing and generating them.



Figure 5.7: A finite state transducer for sentential adverbs






The individual finite-state transducers of the adverbs can be unioned into a single finite state transducer that handles all the adverbs; that is, it can analyze and generate any of them.
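The union of the per-class adverb networks can be pictured with ordinary dictionaries. The following Python sketch is only an analogy to the finite-state union, using a few IPA forms from Tables 5.1 to 5.4; the names `ADVERB_CLASSES`, `union`, `analyze` and `generate` are hypothetical.

```python
# A few entries per class, copied from the IPA columns of Tables 5.1-5.4.
ADVERB_CLASSES = {
    "+ADV+TEMP": ["ʌhile", "adzʌ", "ʌberʌ"],
    "+ADV+SPAC": ["tʌlʌ", "bʰitrʌ", "nʌdzik"],
    "+ADV+AMOUNT": ["dʰerʌi", "tʰorʌi", "kʌti"],
    "+ADV+MANNER": ["kʌsʌri", "jʌsʌri"],
}


def union(classes: dict) -> dict:
    """Merge the per-class lexicons into one surface-to-analyses map."""
    lexicon = {}
    for tag, forms in classes.items():
        for form in forms:
            lexicon.setdefault(form, []).append(form + tag)
    return lexicon


def analyze(lexicon: dict, surface: str) -> list:
    """Surface form to tagged analyses (empty list if unknown)."""
    return lexicon.get(surface, [])


def generate(lexicon: dict, analysis: str):
    """Tagged analysis back to the surface form, or None if not licensed."""
    form = analysis.split("+", 1)[0]
    return form if analysis in lexicon.get(form, []) else None


LEXICON = union(ADVERB_CLASSES)

if __name__ == "__main__":
    print(analyze(LEXICON, "adzʌ"))
    print(generate(LEXICON, "kʌti+ADV+AMOUNT"))
```

Since adverbs do not inflect, each path of the unioned network is just a listed form paired with its tag string, which is why a flat lookup is enough for this illustration.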



5.2 Conjunctions in Nepali

Conjunctions in Nepali are of two kinds: coordinate and subordinate. The coordinate conjunctions are simple in their formation, whereas the subordinate conjunctions are of simple as well as compound types (Pokharel 2054VS; Adhikari 2062VS).

5.2.1 Coordinate conjunctions

Coordinate conjunctions join constituents of the sentence that have equal status, as र rʌ 'and' in (8), where it joins कलम kʌlʌm 'pen' and किताब kitab 'book'. Some coordinate conjunctions in Nepali are listed in Table 5.8.

(8) रामले कलम र किताब किन्यो।
ram-le kʌlʌm rʌ kitab kin-jo
Ram-ERG pen and book buy-P.3SG.MASC
'Ram bought a pen and a book.'

Table 5.8: Coordinate conjunctions

Morphological tags | Devanagari | IPA | Gloss
+CCONJ | र | rʌ | and
+CCONJ | वा | wa | or
+CCONJ | तर | tʌrʌ | but
+CCONJ+EMPH | तरै | tʌrʌi | but
+CCONJ | अनि | ʌni | and
+CCONJ | तथा | tʌtʰa | and



The coordinators are compiled into a finite state transducer as demonstrated in Figure 5.8 and it can analyze and generate them.






Figure 5.8: A finite state transducer for coordinate conjunctions

5.2.2 Subordinate conjunctions

Subordinate conjunctions join constituents of the sentence that have unequal status, mainly a subordinate clause with a matrix clause, as कि ki 'that' in (9). Some of the subordinate conjunctions in Nepali are listed in Table 5.9, and the finite-state transducer that accounts for them is demonstrated in Figure 5.9.

(9) भण्डारीले भन्नुभयो कि उहाँले संग्राम किन छोड्नुभयो।
bʰʌndari-le bʰʌn-nu bʰʌ-jo ki uhã-le sʌŋgram kinʌ tsʰod-nu bʰʌ-jo
Bhandari-ERG say-INF be-P.3SG that 3SG.HON war why leave-INF be-P.3SG
'Bhandari said why he gave up the war.'

Table 5.9: Subordinate conjunctions

Morphological tags | Devanagari | IPA | Gloss
+SCONJ | किनकि | kinʌki | so that
+SCONJ | भने | bʰʌne | if
+SCONJ | कि | ki | that
+SCONJ | किनभने | kinʌbʰʌne | that's why
+SCONJ | यसकारण | jʌsʌkarʌɳ | therefore



The subordinate conjunctions are compiled into a finite state transducer as illustrated in Figure 5.9 and it is capable of analyzing and generating them.



Figure 5.9: A finite state transducer for subordinate conjunctions

5.3 Postpositions in Nepali

Postpositions in Nepali always follow the nominal. According to the functions they perform in the sentence and their semantics, the postpositions can be grouped into three classes, namely, the plural/collective marker, case markers and adverbial postpositions (Hardie et al 2005).

5.3.1 Plural/collective marker

-हरू -ɦʌruː is the plural or collective marker in Nepali. It occurs optionally with o-ending nouns but systematically with other, non-o-ending nouns and pronouns. However, it is not an obligatory element for indicating the plural in nominals, as there are other mechanisms to express plural number in Nepali (see 3.1.1). The finite-state transducer is demonstrated in Figure 5.10.

Table 5.10: Collective/plural marker

Morphological tags | Devanagari | IPA | Gloss
+PL | -हरू | ɦʌruː | Plural



The plural marker has been compiled into a finite state transducer as illustrated in Figure 5.10 and it can analyze and generate it.



Figure 5.10: A finite state transducer for the plural/collective marker



5.3.2 Case markers in Nepali

Cases in Nepali morphology are marked with postpositions, except the nominative and absolutive cases. In traditional Nepali grammars, the following cases are identified: ergative, instrumental, dative, ablative, locative, comitative, genitive and allative. The ergative and instrumental cases have the same marker -ले -le; the ablative case is marked by two markers, -बाट -bat̺ʌ and -देखि -dekʰi; the comitative/associative case by two markers, -सँग -sʌ̃gʌ and -सित -sitʌ; and the genitive case is marked with -को -ko, -का -ka and -की -ki for singular, plural and feminine features. The allative case is marked with -तिर -tirʌ (see 3.1.1). The case markers in Nepali are listed in Table 5.11a and Table 5.11b. The case markers in Table 5.11a do not take an emphatic marker, whereas the case markers in Table 5.11b do.

Table 5.11a: Case marker postpositions (i)

Morphological tags | Devanagari | IPA | Gloss
+ERG | ले | -le | Ergative
+INST | ले | -le | Instrumental
+DAT | लाई | -laiː | Dative
+ABL | देखि | -dekʰi | Ablative



They are separately compiled into a finite state transducer which can analyze and generate them.

Table 5.11b: Case marker postpositions (ii)

Morphological tags | Devanagari | IPA | Gloss
+ABL | बाट | -bat̺ʌ | Ablative
+ABL+EMPH | बाटै | -bat̺ʌi | Ablative
+LOC | मा | -ma | Locative
+LOC+EMPH | मै | -mʌi | Locative
+COM | सँग | -sʌ̃gʌ | Comitative
+COM+EMPH | सँगै | -sʌ̃gʌi | Comitative
+COM | सित | -sitʌ | Comitative
+COM+EMPH | सितै | -sitʌi | Comitative
+GEN+SG | को | -ko | Genitive
+GEN+PL | का | -ka | Genitive
+GEN+FEM | की | -ki | Genitive
+GEN+EMPH | कै | -kʌi | Genitive
+DIR | तिर | -tirʌ | Directional
+DIR+EMPH | तिरै | -tirʌi | Directional

The case markers listed in Table 5.11b also take the emphatic marker. Therefore, they are compiled into a finite state transducer along with the emphatic marker. It can analyze and generate them.
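The pairs in Table 5.11b suggest a regular orthographic pattern for the emphatic forms, which the sketch below tries to capture. The pattern is an inference from the listed pairs, not a rule stated in the text, and the function name `emphatic` is hypothetical: a final ा or ो is replaced by ै, and otherwise ै is added to the final consonant. The feminine genitive की is deliberately left out, since the table lists only one emphatic genitive form.

```python
EMPH = "\u0948"                            # dependent vowel sign ै
FINAL_VOWEL_SIGNS = {"\u093E", "\u094B"}   # dependent vowel signs ा and ो


def emphatic(marker: str) -> str:
    """Build the emphatic form of a case marker written in Devanagari.

    Assumed pattern, inferred from Table 5.11b:
      मा -> मै, को -> कै   (final vowel sign replaced by ै)
      बाट -> बाटै, तिर -> तिरै  (ै added to the final consonant)
    """
    if marker and marker[-1] in FINAL_VOWEL_SIGNS:
        return marker[:-1] + EMPH
    return marker + EMPH


if __name__ == "__main__":
    for base in ["बाट", "मा", "सँग", "सित", "को", "तिर"]:
        print(base, emphatic(base))
```

In a finite-state setting the same effect would more likely be obtained by listing the emphatic forms or by composing a rewrite rule with the marker lexicon; the function above is only a compact way to check the pattern against the table.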



5.3.3 Adverbial postpositions

There are a number of forms which behave like postpositions but also have content meaning like that of adverbs. These forms usually occur with nominals, providing information about time, space, amount, frequency and manner. Some adverbial postpositions which do not take the emphatic marker are listed in Table 5.12a.

Table 5.12a: Adverbial postpositions (a)

Morphological tags | Devanagari | IPA | Gloss
+POSTP | माथि | -matʰi | above
+POSTP | कहाँ | -kʌhã | where
+POSTP | मुनि | -muni | below, under
+POSTP | वारि | -wari | this side
+POSTP | पारि | -pari | other side
+POSTP | वरि | -wʌri | this side
+POSTP | परि | -pʌri | other side
+POSTP | पट्टि | -pʌt̺t̺i | towards
+POSTP | प्रति | -prʌti | towards
+POSTP | पालि | -pali | time
+POSTP | खेरि | -kʰeri | while doing
+POSTP | छेउ | -tsʰeu | at the side
+POSTP | पछि | -pʌtsʰi | later
+POSTP | पछाडि | -pʌtsʰad̺i | back side
+POSTP | अघि | -ʌgʰi | before
+POSTP | अगाडि | -ʌgad̺i | before
+POSTP | भरि | -bʰʌri | full of
+POSTP | पहिले | -pʌhile | before
+POSTP | निम्ति | -nimti | for
+POSTP | लागि | -lagi | for
+POSTP | बारे | -bare | about
+POSTP | सामु | -samu | before
+POSTP | सरि | -sʌri | like
+POSTP | मध्ये | -mʌdʰje | among
+POSTP | जति | -dzʌti | whatever
+POSTP | पिच्छे | -pitstsʰe | each



198



The finite-state transducer demonstrated in Figure 5.12a encodes the adverbial postpositions listed in Table 5.12a and is capable of analyzing and generating them.



Figure 5.12a: A finite state transducer for adverbial postpositions that do not take emphatic marker

Some postpositions take the emphatic marker; they are listed in Table 5.12b.

Table 5.12b: Adverbial postpositions (b)

| Morphological tags | Devanagari | IPA | Gloss |
|---|---|---|---|
| +POSTP | सहित | -sʌhitʌ | along with |
| +POSTP | साथ | -satʰʌ | with |
| +POSTP | सम्म | -sʌmmʌ | until, upto |
| +POSTP | बाहिर | -baɦirʌ | outside |
| +POSTP | वार | -warʌ | |
| +POSTP | पार | -parʌ | |
| +POSTP | वर | -wʌrʌ | little near |
| +POSTP | पर | -pʌrʌ | little further |
| +POSTP | उँभो | -ũbʰo | up |
| +POSTP | उँधो | -ũdʰo | down |
| +POSTP | तर्फ | -tʌrpʰʌ | towards |
| +POSTP | नेर | -nerʌ | near |
| +POSTP | निर | -nirʌ | near |
| +POSTP | समक्ष | -sʌmʌk | before |
| +POSTP | पर्यन्त | -pʌrjʌntʌ | till |
| +POSTP | खेर | -kʰerʌ | moment |
| +POSTP | उप्रान्त | -uprantʌ | then after |
| +POSTP | साथ | -satʰʌ | with |
| +POSTP | पख | -pʌkʰʌ | time |
| +POSTP | ताक | -takʌ | time |
| +POSTP | ताका | -taka | time |
| +POSTP | पाला | -pala | time |
| +POSTP | पटक | -pʌt̺ʌkʌ | times |
| +POSTP | पल्ट | -pʌlt̺ʌ | times |
| +POSTP | पश्चात् | -pʌʃtsat | after |
| +POSTP | छेक | -tsʰekʌ | at the time |
| +POSTP | भित्र | -bʰitrʌ | inside |
| +POSTP | नजिक | -nʌdzikʌ | near |
| +POSTP | भर | -bʰʌrʌ | full of |
| +POSTP | तक | -tʌkʌ | till |
| +POSTP | यता | -jʌta | here |
| +POSTP | उता | -uta | there |
| +POSTP | बीच | -biːtsʌ | between |
| +POSTP | निमित्त | -nimittʌ | for the sake of |
| +POSTP | खातिर | -kʰatirʌ | for |
| +POSTP | अन्तर्गत | -ʌntʌrgʌtʌ | within |
| +POSTP | बमोजिम | -bʌmodzimʌ | according to |
| +POSTP | माफिक | -mapʰikʌ | according to |
| +POSTP | मुताबिक | -mutabikʌ | according to |
| +POSTP | अनुसार | -ʌnusarʌ | according to |
| +POSTP | उपर | -upʌrʌ | on |
| +POSTP | मार्फत | -marpʰʌtʌ | via |
| +POSTP | अलावा | -ʌlawa | beside |
| +POSTP | अतिरिक्त | -ʌtiriktʌ | in addition |
| +POSTP | बाहेक | -bahekʌ | except |
| +POSTP | जस्तो | -dzʌsto | like |
| +POSTP | सरह | -sʌrʌhʌ | same as |
| +POSTP | बाबजुद | -babʌdzudʌ | |
| +POSTP | विरुद्ध | -wiruddʰʌ | against |
| +POSTP | बापत | -bapʌtʌ | for |
| +POSTP | सट्टा | -sʌt̺t̺a | instead of |
| +POSTP | बदला | -bʌdʌla | instead of |
| +POSTP | लेखा | -lekʰa | |
| +POSTP | सु | -suddʌ | |
| +POSTP | समेत | -sʌmetʌ | along with |



The adverbial postpositions which take emphatic marker listed in Table 5.12b are compiled into a network as demonstrated in Figure 5.12b which can analyze and generate them.



Figure 5.12b: A finite state transducer for adverbial postpositions that take emphatic marker

Phonological Rule: PR 5.1
1. Vowels ◌ा a, ◌ो o and halant ◌् at the end of the adverbial postpositions of this group are removed when the emphatic marker ◌ै ʌi is attached.

Regular expression:
◌ा|◌ो|◌् -> [ ] || cons __ ◌ै # ;
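The effect of PR 5.1 on written forms can be sketched in Python as follows. This is a minimal illustration, not the compiled rule itself: the function name `add_emphatic` is hypothetical, and the class `cons` is approximated by the Unicode range of Devanagari consonants क to ह, which is an assumption of this sketch.

```python
import re

# Devanagari consonants क..ह (U+0915-U+0939) stand in for the rule's "cons".
CONS = "[\u0915-\u0939]"
# Word-final ा (U+093E), ो (U+094B) or halant ् (U+094D) after a consonant.
FINAL = re.compile(f"(?<={CONS})[\u093e\u094b\u094d]$")
EMPH = "\u0948"  # dependent vowel sign ै (the emphatic marker)


def add_emphatic(postposition):
    """Delete a final ा/ो/् after a consonant, then attach ै."""
    return FINAL.sub("", postposition) + EMPH
```

Under these assumptions, पाला gives पालै and जस्तो gives जस्तै, while a form such as साथ, which has nothing to delete, simply receives the marker.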



5.4 Particles and interjections in Nepali

5.4.1 Particles

In Nepali, particles are the residual class comprising those stems which do not enter into inflectional constructions and stand as free forms (Dahal 1974). They appear before or after any lexical word and add an abstract meaning to the word that they are associated with. The extra meaning added can only be predicted from the context where they are used. The Nepali particles are monosyllabic or disyllabic words and their behaviors are different from other indeclinable words such as adverbs, postpositions and conjunctions. In written Nepali, the particles are written separately. Some particles are listed in Table 5.13 with morphological tags.

Table 5.13: Particles in Nepali

| Morphological tags | Devanagari | IPA |
|---|---|---|
| +PARTICLE | नै | nʌi |
| +PARTICLE | मात्र | matrʌ |
| +PARTICLE | चाहिँ | tsahĩ |
| +PARTICLE | पनि | pʌni |
| +PARTICLE | | |
| +PARTICLE | है | hʌi |
| +PARTICLE | | |
| +PARTICLE | नि | ni |
| +PARTICLE | | |
| +PARTICLE | पो | po |
| +PARTICLE | | |



The finite state transducer as illustrated in Figure 5.13 encodes the particles listed in Table 5.13 and it can analyze and generate them.



Figure 5.13: A finite state transducer for particles



5.4.2 Emphatic markers

a. ऐ ʌi emphasizes, draws attention to, or focuses a sentence or a part of a sentence. In addition, techniques such as (a) sentence stress, (b) use of particles, (c) dislocation of the sentence constituents and (d) intonation are used for emphasizing the sentence or part of the sentence. Techniques (a) and (d) are phonological, and technique (b) is syntactic, where the particles follow the word that gets focused. Technique (c) involves topicalization and some sorts of movement.

The use of the emphatic marker ऐ ʌi is not restricted to a particular class of words. Apart from some phonological constraints, it attaches to any word irrespective of its part of speech. So, this marker can be called a global marker (Pokharel 2053VS). The following are the conditions under which this emphatic marker can or cannot appear.

i. It does not appear with words ending in इ i, ए e and उ u (except आफू apʰuː).

ii. When it attaches to words ending in ʌ, a and o, these vowels are deleted, as illustrated below.

म mʌ 'I': mʌ + ʌi = mʌi मै
केटो keto 'boy': keto + ʌi = ketʌi केटै
राजा radza 'king': radza + ʌi = radzʌi राजै

iii. But with words ending in consonants, this marker attaches without any change, as given below.

रूख ruːkʰ 'tree': ruːkʰ + ʌi = ruːkʰʌi रूखै
किताब kitab 'book': kitab + ʌi = kitabʌi किताबै
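Conditions i to iii can be restated as a small procedure over the romanized (IPA) forms. The sketch below is illustrative only: the function and constant names are hypothetical, the vowel length mark is ignored when testing the final vowel, and the exception आफू apʰuː is deliberately left unmodelled because the text does not give its emphatic form.

```python
BLOCKING = ("i", "e", "u")   # condition i: the marker does not appear
DELETING = ("ʌ", "a", "o")   # condition ii: the final vowel is deleted
UNMODELLED = {"apʰuː"}       # exception named in the text, form not given


def add_emphatic(word):
    """Return word + ʌi according to conditions i-iii, or None if blocked."""
    if word in UNMODELLED:
        raise NotImplementedError("exception noted in the text")
    base = word.rstrip("ː")          # ignore vowel length for the test
    if base.endswith(BLOCKING):
        return None                  # condition i
    if base.endswith(DELETING):
        return base[:-1] + "ʌi"      # condition ii
    return word + "ʌi"               # condition iii: consonant-final
```

Applied to the examples above, this gives mʌi, ketʌi and radzʌi for the vowel-final words and ruːkʰʌi and kitabʌi for the consonant-final ones.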



In contrast, when this marker is attached to an adjective, it de-emphasizes the attribute expressed by the adjective.

(10) रामको बानी राम्रै छ।
ram-ko bani ramrʌi tsʰʌ
Ram-GEN habit good.DEMPH be.NP.3SG.MASC
'Ram's habit is okay (lesser than good).'

b. ही ɦi emphasizes only some of the demonstratives. With interrogative pronouns, however, the same marker changes them into indefinite pronouns. The examples given below demonstrate this phenomenon.

यो 'this' + ही = यही 'this-EMPH'
त्यो 'that' + ही = त्यही 'that-EMPH'
ऊ 'this' + ही = उही 'this-EMPH'
सो 'the same' + ही = सोही 'the same-EMPH'
को 'who' + ही = कोही 'who-INDEF'
के 'what' + ही = केही 'what-INDEF'



5.4.3 Interjections in Nepali

Interjections are a subset of particles which generally appear at the sentence initial position and express speaker's emotions such as surprise, pain, disgust, joy, excitement and enthusiasm (Pokharel 2054VS:125-31). Interjections are also indeclinable words. Some examples of interjections are listed in Table 5.14 with morphological tags.

Table 5.14: Interjections in Nepali

| Morphological tags | Devanagari | IPA |
|---|---|---|
| +INTERJ | ओहो | oho |
| +INTERJ | आत्था | attʰa |
| +INTERJ | छि | tsʰi |
| +INTERJ | थुइक्क | tʰuikkʌ |
| +INTERJ | हरे | hʌre |
| +INTERJ | बिचरा | bitsʌra |
| +INTERJ | ज्या | dzja |
| +INTERJ | ए | e |
| +INTERJ | ज्यू | dzju |
| +INTERJ | नाइँ | naĩ |
| +INTERJ | कुन्नि | kunni |



The interjections listed in Table 5.14 are compiled into a finite state transducer as demonstrated in Figure 5.14 and it is capable of analyzing and generating them.



Figure 5.14: A finite state transducer for interjections

5.5 Summary

In this chapter, we discussed and presented the grouping of adverbs, conjunctions, case markers, particles and interjections. Adverbs in Nepali are grouped into seven semantic classes: temporal, spatial, amount, frequency, manner, reason and sentential. Conjunctions are of two types: coordinate and subordinate. Postpositions are of three types: plural marker, case markers and adverbial postpositions. Particles in general form a single class, but the two emphatic markers, which can be applied globally except in phonologically constrained cases, are equivalent to some particles. The interjections form a single class.



CHAPTER 6
DERIVATIONAL MORPHOLOGY

6.0 Outline

This chapter presents an analysis of derivational processes in Nepali morphology. It consists of three sections. Section 6.1 presents prefixation, which includes noun to noun derivation, noun to adjective derivation, noun to adverb derivation and adjective to adjective derivation. For each type, a model of the finite state transducer and a tag for prefixation are provided. Section 6.2 presents suffixation, which includes noun to noun derivation, noun to adjective derivation, noun to noun/adjective derivation, adjective to noun derivation, adjective/noun to noun derivation, verb to noun derivation, verb to adjective derivation, verb to adverb derivation, adverb to adjective derivation, verb to noun conversion, verb to adjective/noun conversion and verb to noun derivation by vowel insertion. For each type of derivation, the morphological tags and finite state transducers are illustrated. Finally, section 6.3 summarizes the chapter.



6.1 Prefixation

Derivation is a morphological process of word formation. It involves the addition of bound affixes to an existing lexeme or stem, whereby the addition of the affix derives a new word or lexeme (Katamba 1993; Payne 1997). The word class of the newly formed word is generally different from that of the word from which it is derived. This is not always true, however: sometimes the word class remains the same, but the semantics of the word definitely changes. The derived word may show a clear change of meaning, or added speciality, technicality or stylistic value. Derivation is possible both within the same word class and across word classes. In most prefixing derivation the word class remains the same, with a few exceptions, whereas in suffixing derivation the word class changes, with a few exceptions (Adhikari 2062VS).

In prefixation, an affix is prefixed to a base stem and a new word is derived. In Nepali, a number of prefixes, as listed in Tables 6.1-6.4, combine with a base stem to derive a word. The various types of derivation using prefixes are discussed in the subsequent sections.



6.1.2 Noun to noun derivation

In this type of derivation, 24 prefixes are involved and they derive a noun from a noun stem. The semantics of the prefixes is not predictable, so they are simply marked as prefix with a tag PFX. Table 6.1 lists those prefixes with base stem and derived word.

Table 6.1: Noun to noun derivation

| Prefix | Base noun stem | Gloss | Derived noun | Gloss |
|---|---|---|---|---|
| प्र prʌ- | चलन tsʌlʌn | tradition | प्रचलन prʌtsʌlʌn | tradition |
| परा pʌra- | जय dzʌjʌ | victory | पराजय pʌradzʌjʌ | defeat |
| अप ʌpʌ- | शब्द ʃʌbdʌ | word | अपशब्द ʌpʌʃʌbdʌ | abusing word |
| सम् sʌm- | मान man | honor | सम्मान sʌmman | respect |
| अनु ʌnu- | शासन ʃasʌn | governance | अनुशासन ʌnuʃasʌn | discipline |
| अव ʌwa- | गुण guɳ | quality | अवगुण ʌwaguɳ | demerit |
| दुस् dus- | परिणाम pʌriɳam | result | दुस्परिणाम duspʌriɳam | bad result |
| दुर् dur- | घटना gʰʌt̺ʌna | event | दुर्घटना durgʰʌt̺ʌna | accident |
| वि wi- | नाश naʃ | loss | विनाश winaʃ | damage |
| अधि ʌdʰi- | राज्य radzjʌ | state | अधिराज्य ʌdʰiradzjʌ | kingdom |
| अति ʌti- | वृष्टि wristi | rain | अतिवृष्टि ʌtiwristi | over rain |
| अभि ʌbʰi- | रुचि rutsi | interest | अभिरुचि ʌbʰirutsi | interest |
| प्रति prʌti- | ध्वनि dʰwʌni | sound | प्रतिध्वनि prʌtidʰwʌni | echo |
| परि pʌri- | योजना jodzʌna | plan | परियोजना pʌrijodzʌna | project |
| उप upʌ- | ग्रह grʌɦʌ | planet | उपग्रह upʌgrʌɦʌ | satellite |
| सह sʌɦʌ- | कार्य karjʌ | work | सहकार्य sʌɦʌkarjʌ | collaboration |
| स sʌ- | परिवार pʌriwar | family | सपरिवार sʌpʌriwar | whole family |
| कु ku- | पुत्र putrʌ | son | कुपुत्र kuputrʌ | bad son |
| अ ʌ- | ज्ञान gjan | knowledge | अज्ञान ʌgjan | ignorance |
| अन् ʌn- | आस्था astʰa | belief | अनास्था ʌnastʰa | disbelief |
| बे be- | इज्जत idzdzʌt | prestige | बेइज्जत beidzdzʌt | insult |
| बद bʌdʌ- | नाम nam | name | बदनाम bʌdʌnam | bad name |
| ला la- | वारिस waris | heir | लावारिस lawaris | ??? |
| सु su- | समाचार sʌmatsar | news | सुसमाचार susʌmatsar | good news |



The finite-state transducer in Figure 6.1 is common to all the prefixes listed in Table 6.1. It can analyze and generate both base and derived word.



[Figure 6.1 diagram; arc labels: PFX+:Prefix, NounType, +NOUN:0]

Figure 6.1: A finite state transducer for noun to noun derivation
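The behaviour of the network in Figure 6.1 can be approximated in Python as below. This is a sketch under stated assumptions: the analysis string `PFX+stem+NOUN` is inferred from the arc labels and is not confirmed as the exact upper-side format, the names `PREFIX_STEMS`, `NOUN_STEMS`, `analyze` and `generate` are hypothetical, and only three rows of Table 6.1 are encoded (in IPA). Each prefix is paired with its own set of stems, since derivation is not fully productive.

```python
# Each prefix licenses its own set of noun stems (subset of Table 6.1, IPA).
PREFIX_STEMS = {
    "prʌ": {"tsʌlʌn"},
    "pʌra": {"dzʌjʌ"},
    "ku": {"putrʌ"},
}
NOUN_STEMS = {"tsʌlʌn", "dzʌjʌ", "putrʌ"}


def analyze(surface):
    """Return analyses for an underived noun or a prefixed noun."""
    analyses = []
    if surface in NOUN_STEMS:
        analyses.append(surface + "+NOUN")
    for prefix, stems in PREFIX_STEMS.items():
        rest = surface[len(prefix):]
        if surface.startswith(prefix) and rest in stems:
            analyses.append("PFX+" + rest + "+NOUN")
    return analyses


def generate(prefix, stem):
    """Return the derived noun if this prefix is licensed for this stem."""
    if stem in PREFIX_STEMS.get(prefix, ()):
        return prefix + stem
    return None
```

With these entries, both the base noun tsʌlʌn and the derived noun prʌtsʌlʌn receive an analysis, while an unlicensed combination such as ku- with dzʌjʌ is rejected in both directions.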



6.1.3 Noun to adjective derivation

In this type of derivation, 9 prefixes are involved and they derive an adjective from a noun stem. The semantics of the prefixes is not predictable, so they are simply marked as prefix with a tag PFX. Table 6.2 lists those prefixes with base stem and derived word.

Table 6.2: Noun to adjective derivation

| Prefix | Base noun stem | Gloss | Derived adjective | Gloss |
|---|---|---|---|---|
| निर् nir- | दोष dos | fault | निर्दोष nirdos | innocent |
| निः ni:- | स्वार्थ swartʰʌ | self-interest | निःस्वार्थ ni:swartʰʌ | selfless |
| नि ni- | डर d̺ʌr | fear | निडर nid̺ʌr | bold |
| वि wi- | मुख mukʰ | mouth | विमुख wimukʰ | deviated |
| निस् nis- | फल pʰʌl | fruit | निस्फल nispʰʌl | fruitless |
| स sʌ- | बल bʌl | force | सबल sʌbʌl | capable |
| बे be- | घर gʰʌr | house | बेघर begʰʌr | homeless |
| अ ʌ- | मूल्य muːljʌ | value | अमूल्य ʌmuːljʌ | valueless |
| अन ʌnʌ- | मोल mol | price | अनमोल ʌnʌmol | priceless |



The finite-state transducer in Figure 6.2 is common to all the prefixes listed in Table 6.2 and it is capable of analyzing and generating them.



[Figure 6.2 diagram; arc labels: PFX+:Prefix, NounType, +NOUN:0, +ADJ:0]

Figure 6.2: A finite state transducer for noun to adjective derivation

6.1.4 Noun to adverb derivation

In this type of derivation, 4 prefixes are involved and they derive an adverb from a noun stem. The semantics of the prefixes is not predictable, so they are simply marked as prefix with a tag PFX. Table 6.3 lists those prefixes with example base stem and derived word.

Table 6.3: Noun to adverb derivation

| Prefix | Base noun stem | Gloss | Derived adverb | Gloss |
|---|---|---|---|---|
| आ a- | मरण mʌrʌɳ | death | आमरण amʌrʌɳ | till death |
| स sʌ- | हर्ष ɦʌrʂʌ | happy | सहर्ष sʌɦʌrʂʌ | with happiness |
| निर् nir- | घात gʰat | stroke | निर्घात nirgʰat | severely |
| प्रति prʌti- | हप्ता ɦʌpta | week | प्रतिहप्ता prʌtiɦʌpta | per week |



The finite-state transducer in Figure 6.3 is common to all the prefixes listed in Table 6.3 and it can analyze and generate them.



[Figure 6.3 diagram; arc labels: PFX+:Prefix, NounType, +NOUN:0, +ADV:0]

Figure 6.3: A finite state transducer for noun to adverb derivation



6.1.5 Adjective to adjective derivation

In this derivation, 6 prefixes are involved and they derive an adjective from an adjective stem. The semantics of the prefixes is not consistent, so they are simply marked as prefix with a tag PFX. Table 6.4 lists those prefixes with base stem and derived word.

Table 6.4: Adjective to adjective derivation

| Prefix | Base adjective stem | Gloss | Derived adjective | Gloss |
|---|---|---|---|---|
| सम् sʌm- | पूर्ण puːrɳʌ | complete | सम्पूर्ण sʌmpuːrɳʌ | total |
| वि wi- | शुद्ध ʃuddʰʌ | pure | विशुद्ध wiʃuddʰʌ | pure |
| दुर् dur- | भेद्य bʰedjʌ | vulnerable | दुर्भेद्य durbʰedjʌ | invulnerable |
| उन् un- | मुक्त muktʌ | free | उन्मुक्त unmuktʌ | free |
| सु su- | शिक्षित ʃiktsʰit | educated | सुशिक्षित suʃiktsʰit | educated |
| परि pʌri- | पूर्ण puːrɳʌ | complete | परिपूर्ण pʌripuːrɳʌ | sufficient |



The finite-state transducer in Figure 6.4 is common to all the prefixes listed in Table 6.4 and it is capable of analyzing and generating them.



[Figure 6.4 diagram; arc labels: PFX+:Prefix, AdjType, +ADJ:0]

Figure 6.4: A finite state transducer for adjective to adjective derivation

6.2 Suffixation

6.2.1 Noun to noun derivation

In this derivation, 2 suffixes are involved and they derive a noun from a noun stem. The semantics of the suffixes is not considered, so they are simply marked as suffix with a tag SFX. Table 6.5 lists those suffixes with base stem and derived word.

Table 6.5: Noun to noun derivation

| Base noun stem | Gloss | Suffix | Derived noun | Gloss |
|---|---|---|---|---|
| सुन sun | gold | आर -ar | सुनार sunar | goldsmith |
| घाँस gʰãs | grass | ई -iː | घाँसी gʰãsiː | grass cutter |



The finite-state transducer in Figure 6.5 is common to all the suffixes listed in Table 6.5 and it is capable of analyzing and generating them.



[Figure 6.5 diagram; arc labels: NounType, +SFX:Suffix, +NOUN:0]

Figure 6.5: A finite state transducer for noun to noun derivation

The phonological rules involved in the derivation are listed in PR 6.1. They are compiled and composed with the finite state transducer illustrated in Figure 6.5.

Phonological rule PR 6.1
i. Independent vowels आ a and ई i: change to the corresponding dependent vowels ◌ा a and ◌ी i:, respectively, after consonants.

Regular expressions:
आ -> ◌ा || cons __ ;
ई -> ◌ी || cons __ ;
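The orthographic effect of PR 6.1 can be illustrated with a short Python sketch. The function name `attach_suffix` is hypothetical, and `cons` is approximated by the Unicode range of Devanagari consonants क to ह; both are assumptions of this illustration, not part of the compiled rule.

```python
import re

CONS = "[\u0915-\u0939]"  # Devanagari consonants क..ह, standing in for "cons"

# independent vowel -> corresponding dependent vowel sign
TO_DEPENDENT = {
    "\u0906": "\u093e",  # आ -> ा
    "\u0908": "\u0940",  # ई -> ी
}


def attach_suffix(stem, suffix):
    """Concatenate stem and suffix, then rewrite आ/ई after a consonant."""
    word = stem + suffix
    for independent, dependent in TO_DEPENDENT.items():
        word = re.sub(f"(?<={CONS}){independent}", dependent, word)
    return word
```

For the rows of Table 6.5, `attach_suffix("सुन", "आर")` yields सुनार and `attach_suffix("घाँस", "ई")` yields घाँसी.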



6.2.2 Noun to adjective derivation

In this derivation, 11 suffixes are involved and they derive an adjective from a noun stem. The semantics of the suffixes is not considered, so they are simply marked as suffix with a tag SFX. Table 6.6 lists those suffixes with an example of base stem and derived word.

Table 6.6: Noun to adjective derivation

| Base noun stem | Gloss | Suffix | Derived adjective | Gloss |
|---|---|---|---|---|
| दया dʌja | love | अनीय -ʌnijʌ | दयनीय dʌjʌnijʌ | lovable |
| लाभ labʰ | profit | अक -ʌkʌ | लाभक labʰʌkʌ | profitable |
| सेवा sewa | service | इका -ika | सेविका sewika | service girl |
| मुगल mugʌl | Mugal | आन -an | मुगलान mugʌlan | Indian |
| लिम्बु limbu | Limbu | वान -wan | लिम्बुवान limbuwan | Limbu area |
| दान dan | donation | ई -iː | दानी daniː | donor |
| खर्च kʰʌrtsʌ | expense | आलु -alu | खर्चालु kʰʌrtsalu | expensive |
| भिर bʰir | cliff | आलो -alo | भिरालो bʰiralo | steep |
| रिस ris | anger | आहा -aha | रिसाहा risaha | angry |
| शहर ʃʌɦʌr | town | इया -ija | शहरिया ʃʌɦʌrija | urban |
| होस ɦos | sense | इयार -ijar | होसियार ɦosijar | careful |



The finite-state transducer illustrated in Figure 6.6 is common to all the suffixes listed in Table 6.6.



[Figure 6.6 diagram; arc labels: NounType, +SFX:Suffix, +NOUN:0, +ADJ:0]

Figure 6.6: A finite state transducer for noun to adjective derivation

The phonological rules involved in this derivation are listed in PR 6.2. They are compiled and composed with the finite state transducer demonstrated in Figure 6.6, which is capable of analyzing and generating them.

Phonological rule PR 6.2
i. Independent vowels आ a, ई i: and इ i change to their corresponding dependent vowels ◌ा a, ◌ी i: and ि◌ i after consonants.

Regular expressions:
आ -> ◌ा || cons __ ;
ई -> ◌ी || cons __ ;
इ -> ि◌ || cons __ ;

ii. The vowel sequence of dependent ◌ा a and अ ʌ changes to अ ʌ.

Regular expression:
◌ा अ -> [ ]

iii. The vowel sequence of dependent ◌ा a and independent इ i changes to ि◌ i.

Regular expression:
◌ा इ -> ि◌ [ ]
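Rules ii and iii of PR 6.2 merge the final vowel of the stem with the initial vowel of the suffix. A minimal Python sketch of these two rules is given below; the function name `merge_vowels` is hypothetical, and, like the regular expressions above, the replacements are applied without a left or right context.

```python
def merge_vowels(stem, suffix):
    """Apply PR 6.2 ii and iii to the concatenation of stem and suffix."""
    word = stem + suffix
    # Rule ii: ा (U+093E) + अ (U+0905) -> inherent ʌ, which is not written.
    word = word.replace("\u093e\u0905", "")
    # Rule iii: ा (U+093E) + इ (U+0907) -> ि (U+093F).
    word = word.replace("\u093e\u0907", "\u093f")
    return word
```

With the rows of Table 6.6, दया with अनीय gives दयनीय (rule ii) and सेवा with इका gives सेविका (rule iii).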



6.2.3 Noun to noun/adjective derivation

In this derivation, 3 suffixes are involved and they derive a noun or adjective from a noun stem. The semantics of the suffixes is not considered, so they are simply marked as suffix with a tag SFX. Table 6.7 lists those suffixes with an example of base stem and derived word.

Table 6.7: Noun to noun/adjective derivation

| Base noun stem | Gloss | Suffix | Derived noun/adjective | Gloss |
|---|---|---|---|---|
| झापा dzʰapa | Jhapa | ली -li | झापाली dzʰapali | of Jhapa |
| गुल्मी gulmi | Gulmi | एली -eli | गुल्मेली gulmeli | of Gulmi |
| इलाम ilam | Illam | ए -e | इलामे ilame | of Illam |
| गाउँ gaũ | village | ले -le | गाउँले gaũle | villager |
| नेपाल nepal | Nepal | ई -iː | नेपाली nepaliː | of Nepal |



The finite-state transducer in Figure 6.7 is common to all the suffixes listed in Table 6.7.

[Figure 6.7 diagram; arc labels: NounType, +SFX:Suffix, +NOUN:0, +ADJ:0]

Figure 6.7: A finite state transducer for noun to noun/adjective derivation



The phonological rules involved in this derivation are listed in PR 6.3; they are compiled and composed with the finite state transducer illustrated in Figure 6.7, and the resulting network can analyze and generate both stems and derived words.

Phonological rule PR 6.3
i. Independent vowels ई i: and ए e change to their corresponding dependent vowels ◌ी i: and ◌े e after consonants.

Regular expressions:
ई -> ◌ी || cons __ ;
ए -> ◌े || cons __ ;

ii. The vowel sequence of dependent vowel ◌ी iː and independent vowel ए e changes to ◌े e.

Regular expression:
◌ी ए -> ◌े ;

iii. The vowel sequence of dependent vowel ◌ा a and independent इ i changes to ि◌ i.

Regular expression:
◌ा इ -> ि◌ [ ]



6.2.4 Adjective to noun derivation

In this derivation, 1 suffix is involved and it derives a noun from an adjective stem. The semantics of the suffix is not considered, but it is marked as suffix with a tag SFX. Table 6.8 lists the suffix with an example of a base stem and its derived word.

Table 6.8: Adjective to noun derivation

| Base adjective stem | Gloss | Suffix | Derived noun | Gloss |
|---|---|---|---|---|
| लामो lamo | long | आइ -ai | लमाइ lʌmai | length |



The finite-state transducer in Figure 6.8 is common to all the suffixes listed in Table 6.8.



[Figure 6.8 diagram; arc labels: AdjType, +SFX:Suffix, +ADJ:0, +NOUN:0]

Figure 6.8: A finite state transducer for adjective to noun derivation

The phonological rules involved in this derivation are listed in PR 6.4; they are compiled and composed with the finite state transducer, and the resulting network can analyze and generate both underived and derived words.

Phonological rule PR 6.4
i. Dependent vowel ◌ा a between consonants changes to ʌ (the inherent vowel, which is not written).

Regular expression:
◌ा -> [ ] || cons __ cons ;

ii. The vowel sequence of dependent vowel ◌ो o and independent vowel आ a changes to dependent vowel ◌ा a.

Regular expression:
◌ो आ -> ◌ा ;
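The two rules of PR 6.4 interact in the derivation of लमाइ from लामो, which makes this a convenient case for illustrating how rules are applied in a cascade. The Python sketch below is illustrative only: the function name `derive_noun` is hypothetical, `cons` is approximated by the Devanagari consonant range क to ह, and the order of application (rule ii before rule i) is an assumption of this sketch.

```python
import re

CONS = "[\u0915-\u0939]"  # Devanagari consonants क..ह, standing in for "cons"


def derive_noun(stem, suffix):
    """Apply PR 6.4 to the concatenation of an adjective stem and a suffix."""
    word = stem + suffix
    # Rule ii: ो (U+094B) + आ (U+0906) -> ा (U+093E).
    word = word.replace("\u094b\u0906", "\u093e")
    # Rule i: ा between two consonants is deleted (inherent ʌ remains).
    word = re.sub(f"(?<={CONS})\u093e(?={CONS})", "", word)
    return word
```

For लामो with आइ, rule ii first produces लामाइ; rule i then deletes only the first ा, because the second one is followed by the vowel इ rather than a consonant, giving लमाइ.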



6.2.5 Adjective/noun to noun derivation

In this derivation, 1 suffix is involved and it derives a noun from a noun/adjective stem. The semantics of the suffix is not considered, so it is simply marked as suffix with a tag SFX. Table 6.9 lists the suffix with base stem and derived word.

Table 6.9: Adjective/Noun to Noun Derivation

| Base noun/adjective stem | Gloss | Suffix | Derived noun | Gloss |
|---|---|---|---|---|
| गरिब gʌrib | poor | ई -iː | गरिबी gʌribi | poverty |



The finite-state transducer in Figure 6.9 is common to all the suffixes listed in Table 6.9. It is capable of analyzing and generating the derived words.



[Figure 6.9 diagram; arc labels: NAType, SFX:Suffix, +ADJ:0, +NOUN:0]

Figure 6.9: A finite state transducer for noun/adjective to noun derivation

The phonological rules involved in this process are listed in PR 6.5; they are compiled into a network and composed with the network illustrated in Figure 6.9.

Phonological rule PR 6.5
i. Independent vowel ई i: changes to its corresponding dependent vowel ◌ी i: after the consonant.

Regular expression:
ई -> ◌ी || cons __ ;



6.2.6 Verb to noun derivation

In this type of derivation, 32 suffixes are involved and they derive a noun from a verb stem. The semantics of the suffixes is not considered but they are marked as suffix with a tag SFX. Table 6.10 lists those suffixes with an example of the base stem and derived word of each.

Table 6.10: Verb to Noun Derivation

| Base verb stem | Gloss | Suffix | Derived noun | Gloss |
|---|---|---|---|---|
| चुन् tsun | elect | आउ -au | चुनाउ tsunau | election |
| चुन् tsun | elect | आब -ab | चुनाब tsunab | election |
| किट् kit̺ | decide | आन -an | किटान kit̺an | decision |
| किट् kit̺ | decide | आनी -ani | किटानी kit̺ani | decision |
| ढाक् d̺ʰak | cover | अनी -ʌni | ढकनी d̺ʰʌkʌni | lid |
| जल् dzʌl | burn | अन -ʌn | जलन dzʌlʌn | burning |
| चोर् tsor | steal | ई -iː | चोरी tsoriː | theft |
| हाँस् ɦãs | laugh | ओ -o | हाँसो ɦãso | laughter |
| पढ् pʌd̺ʰ | read | आइ -ai | पढाइ pʌd̺ʰai | reading |
| थाक् tʰak | tire | आवट -awʌt̺ | थकावट tʰakawʌt̺ | tiredness |
| छाप् tsʰap | print | आ -a | छापा tsʰapa | printing |
| छान् tsʰan | choose | ओट -ot̺ | छनोट tsʰanot̺ | selection |
| चिच्या tsitsja | shout | हट -ɦʌt̺ | चिच्याहट tsitsjaɦʌt̺ | shouting |
| झर् dzʌr | drop | अना -ʌna | झरना dzʌrʌna | water fall |
| ढोग् dʰog | bow | आउनी -auni | ढोगाउनी dʰogauni | bowing |
| राख् rakʰ | keep | आलो -alo | रखालो rakʰalo | servant |
| दाब् dab | press | आब -ab | दबाब dabab | pressure |
| बच् bʌts | save | अत -ʌt | बचत bʌtsʌt | saving |
| सड् sʌd̺ | decay | अल -ʌl | सडल sʌd̺ʌl | decay |
| रोप् rop | plant | आइँ -aĩ | रोपाइँ ropaĩ | plantation |
| छेक् tsʰek | block | आरो -aro | छेकारो tsʰekaro | blockade |
| चिर् tsir | split | औटो -ʌuto | चिरौटो tsirʌuto | split |
| बढ् bʌd̺ʰ | grow | औती -ʌuti | बढौती bʌd̺ʰʌuti | growth |
| सर् sʌr | move | उवा -uwa | सरुवा sʌruwa | shift |
| उठ् utʰ | rise | ती -ti | उठ्ती utʰti | credit |
| चाल् tsal | sieve | नी -ni | चाल्नी tsalni | sieve |
| बेर् ber | roll | नो -no | बेर्नो berno | piles |
| गा ga | sing | ना -na | गाना gana | song |
| भिड् bid̺ | fight | अन्त -ʌnt | भिडन्त bid̺ʌnt | fighting |
| जित् dzit | win | औरी -ʌuri | जितौरी dzitʌuri | winning |
| कोर् kor | scratch | एसो -eso | कोरेसो koreso | scratcher |
| खुल् kʰul | open | अस्त -ʌstʌ | खुलस्त kʰulʌstʌ | open |



The finite-state transducer in Figure 6.10 is common to all the suffixes listed in Table 6.10 and it can analyze and generate the derived words.



[Figure 6.10 diagram; arc labels: Vstem, +SFX:Suffix, +NOUN:0]

Figure 6.10: A finite state transducer for verb to noun derivation

The phonological rules involved in this derivation process are listed in PR 6.6; they are compiled and composed with the finite state transducer illustrated in Figure 6.10.

Phonological rule PR 6.6
i. Independent vowels आ a, ई i:, ओ o, इ i, उ u, ए e, ऐ ʌi, औ ʌu and अ ʌ change to their corresponding dependent vowels ◌ा a, ◌ी i:, ◌ो o, ि◌ i, ◌ु u, ◌े e, ◌ै ʌi, ◌ौ ʌu and [ ], respectively, after consonants.

Regular expressions:
आ -> ◌ा || cons __ ;
ई -> ◌ी || cons __ ;
ओ -> ◌ो || cons __ ;
इ -> ि◌ || cons __ ;
उ -> ◌ु || cons __ ;
ए -> ◌े || cons __ ;
ऐ -> ◌ै || cons __ ;
औ -> ◌ौ || cons __ ;
अ -> [ ] || cons __ ;



6.2.7 Verb to adjective derivation

In this derivation, 14 suffixes are involved and they derive an adjective from a verb stem. The semantics of the suffixes is not considered but they are marked as suffix with a tag SFX. Table 6.11 lists those suffixes with an example of base stem and derived word.

Table 6.11: Verb to adjective derivation

| Base verb stem | Gloss | Suffix | Derived adjective | Gloss |
|---|---|---|---|---|
| मिच् mits | squeeze | आहा -aha | मिचाहा mitsaha | suppressor |
| भुल् bʰul | forget | अक्कड -ʌkkʌd̺ | भुलक्कड bʰulʌkkʌd̺ | forgetful |
| पोस् pos | feed | इलो -ilo | पोसिलो posilo | nutritious |
| घुम् gʰum | roam | अन्ते -ʌnte | घुमन्ते gʰumʌnte | vagabond |
| घुम् gʰum | roam | अन्ता -ʌnta | घुमन्ता gʰumʌnta | vagabond |
| खप् kʰʌp | last | आलु -alu | खपालु kʰʌpalu | long lasting |
| पढ् pʌd̺ʰ | read | ऐया -ʌija | पढैया pʌd̺ʰʌija | studious |
| छाड् tsʌd̺ | leave | आ -a | छाडा tsʌd̺a | wanton |
| रोप् rop | plant | आर -ar | रोपार ropar | planter |
| सिक् sik | learn | आरु -aru | सिकारु sikaru | learner |
| बिक् bik | sell | आउ -au | बिकाउ bikau | salable |
| भाग् bʰag | flee | औटो -ʌut̺o | भगौटो bʰagʌut̺o | runner |
| छेर् tsʰer | pass stool | औटी -ʌut̺i | छेरौटी tsʰerʌut̺i | sufferer |
| लाग् lag | attach | उ -u | लागु lagu | addicted |



The finite-state transducer in Figure 6.11 is common to all the suffixes listed in Table 6.11. It can analyze and generate the derived words of this type.



[Figure 6.11 diagram; arc labels: Vstem, +SFX:Suffix, +ADJ:0]

Figure 6.11: A finite state transducer for verb to adjective derivation

The phonological rules involved in this type of derivation are listed in PR 6.7; they are compiled and composed with the finite state transducer illustrated in Figure 6.11.

Phonological rule PR 6.7
i. Independent vowels आ a, ई i:, ओ o, इ i, उ u, ए e, ऐ ʌi, औ ʌu and अ ʌ change to their corresponding dependent vowels ◌ा a, ◌ी i:, ◌ो o, ि◌ i, ◌ु u, ◌े e, ◌ै ʌi, ◌ौ ʌu and [ ], respectively, after consonants.

Regular expressions:
आ -> ◌ा || cons __ ;
ई -> ◌ी || cons __ ;
ओ -> ◌ो || cons __ ;
इ -> ि◌ || cons __ ;
उ -> ◌ु || cons __ ;
ए -> ◌े || cons __ ;
ऐ -> ◌ै || cons __ ;
औ -> ◌ौ || cons __ ;
अ -> [ ] || cons __ ;



6.2.8 Verb to adverb derivation

In this derivation, 2 suffixes are involved and they derive an adverb from a verb stem. The semantics of the suffixes is not considered, so they are simply marked as suffix with a tag SFX. Table 6.12 lists those suffixes with an example of base stem and derived word.

Table 6.12: Verb to adverb derivation

| Base verb stem | Gloss | Suffix | Derived adverb | Gloss |
|---|---|---|---|---|
| गर् gʌr | do | उन्जेल -undzel | गरुन्जेल gʌrundzel | till doing |
| गर् gʌr | do | इन्जेल -indzel | गरिन्जेल gʌrindzel | till doing |



The finite-state transducer in Figure 6.12 is common to all the suffixes listed in Table 6.12. It can analyze and generate the derived words.



[Figure 6.12 diagram; arc labels: Vstem, +SFX:Suffix, +ADV:0]

Figure 6.12: A finite state transducer for verb to adverb derivation

The phonological rules involved in this derivation are listed in PR 6.8; they are compiled and composed with the finite state transducer illustrated in Figure 6.12.

Phonological rule PR 6.8
i. Independent vowels उ u and इ i change to their corresponding dependent vowels ◌ु u and ि◌ i after consonants.

Regular expressions:
उ -> ◌ु || cons __ ;
इ -> ि◌ || cons __ ;



6.2.9 Adverb to adjective derivation

In this derivation, 1 suffix is involved and it derives an adjective from an adverb stem. The semantics of the suffix is not considered but it is marked as suffix with a tag SFX. Table 6.13 lists the suffix with examples of base stem and derived word.

Table 6.13: Adverb to adjective derivation

| Base adverb stem | Gloss | Suffix | Derived adjective | Gloss |
|---|---|---|---|---|
| भित्र bʰitrʌ | inside | ई -iː | भित्री bʰitriː | inner |
| बाहिर bahirʌ | outside | ई -iː | बाहिरी bahiri | outer |



The finite-state transducer in Figure 6.13 is common to all the suffixes listed in Table 6.13. It is capable of analyzing and generating the base stems and derived words.



[Figure 6.13 diagram; arc labels: AdvType, +SFX:Suffix, +ADV:0, +ADJ:0]

Figure 6.13: A finite state transducer for adverb to adjective derivation

The phonological rule involved in this process is listed in PR 6.9; it is compiled and composed with the finite state transducer illustrated in Figure 6.13.

Phonological rule PR 6.9
i. Independent vowel ई i: changes to the corresponding dependent vowel ◌ी i: after consonants.

Regular expression:
ई -> ◌ी || cons __ ;



6.2.10 Verb to noun conversion

Some verb stems alternate between verb and noun. The two forms are the same phonologically but differ in written form: in the noun form, the diacritic halanta is dropped. Some examples of such stems are listed in Table 6.14.

Table 6.14: Verb to noun conversion

| Base verb stem | Gloss | Derived noun | Gloss |
|---|---|---|---|
| खेल् kʰel | 'play' | खेल kʰel | 'game' |
| खोज् kʰodz | 'search' | खोज kʰodz | 'research' |



The finite-state transducer in Figure 6.14 encodes stems listed in Table 6.14 and it is capable of analyzing and generating the base stem and derived words.



[Figure 6.14 diagram; arc labels: Vstem, +NOUN:0]

Figure 6.14: A finite state transducer for verb to noun conversion

The phonological rule involved in this conversion is listed in PR 6.10; it is compiled and composed with the finite state transducer illustrated in Figure 6.14.

Phonological rule PR 6.10
i. Halanta ◌् at the end of the verb stem is removed.

Regular expression:
◌् -> [ ] || __ .#. ;



6.2.11 Verb to adjective/noun conversion

Some verb stems alternate between verb and noun or adjective forms. Some examples of such stems are listed in Table 6.15. Phonologically they are the same, but orthographically they differ by the halanta.

Table 6.15: Verb to Adjective/Noun Conversion

| Base verb stem | Gloss | Derived adjective/noun | Gloss |
|---|---|---|---|
| ठग् t̺ʰʌg | cheat | ठग t̺ʰʌg | cheat |
| चोर् tsor | steal | चोर tsor | thief |
| थप् tʰʌp | add | थप tʰʌp | additional |



The finite-state transducer illustrated in Figure 6.15 encodes the stems listed in Table 6.15 and is capable of analyzing and generating the base stems and derived words.

[Figure 6.15 arc labels: Vstem; +NOUN:0; +ADJ:0]

Figure 6.15: A finite state transducer for verb to adjective/noun conversion

The phonological rule involved in this conversion is listed in PR 6.11. It is compiled and composed with the finite state transducer illustrated in Figure 6.15.

Phonological rule PR 6.11
i. The halanta ◌् at the end of the verb stem is removed.

Regular expression:

◌् -> [ ] || __ .#.;



6.2.12 Verb to noun derivation (vowel insertion)

Some verb stems change from verb to noun form by inserting the vowel अ ʌ between consonants in the stem. Some examples of such stems are listed in Table 6.16.

Table 6.16: Verb to noun (vowel insertion)

Base verb stem     Gloss         Derived noun     Gloss
चम्क tsʌmkʌ        'shine'       चमक tsʌmʌk       'shining'
सम्झ sʌmdzʌ        'remember'    समझ sʌmʌdz       'understanding'
टल्क t̺ʌlkʌ         'shine'       टलक t̺ʌlʌk        'shining'



The finite-state transducer illustrated in Figure 6.16 encodes stems listed in Table 6.16 and it is capable of analyzing and generating the derived words.



[Figure 6.16 arc labels: Vstem; +NOUN:0]

Figure 6.16: A finite state transducer for verb to noun derivation (vowel insertion)

The phonological rule involved in this derivation is listed in PR 6.12; it is compiled and composed with the finite state transducer illustrated in Figure 6.16.

Phonological rule PR 6.12
i. The halanta ◌् between the consonants of the verb stem is removed.

Regular expression:

◌् -> [ ] || cons __ cons;
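The orthographic side of PR 6.12 can be sketched in Python as follows. This is only an illustration under stated assumptions: `cons` is approximated by the Unicode range U+0915 to U+0939, and `drop_medial_halanta` is a hypothetical name, not part of the analyzer.

```python
import re

# Assumption: `cons` is approximated by the Devanagari consonant block U+0915..U+0939.
CONS = "\u0915-\u0939"
HALANTA = "\u094D"

# Halanta with a consonant on both sides, i.e. the context `cons __ cons`.
MEDIAL_HALANTA = re.compile(f"(?<=[{CONS}]){HALANTA}(?=[{CONS}])")


def drop_medial_halanta(stem: str) -> str:
    """Delete every halanta that stands between two consonants."""
    return MEDIAL_HALANTA.sub("", stem)


if __name__ == "__main__":
    # Verb stem with an m-k cluster, as in the first row of Table 6.16.
    print(drop_medial_halanta("\u091A\u092E\u094D\u0915"))
```

Removing the halanta lets the inherent vowel of the first consonant surface in writing, which corresponds to the vowel insertion described in the text. A word-final halanta is not affected because no consonant follows it.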



6.3 Summary

In this chapter, we presented the derivation process in Nepali. Two major types of derivation, prefixation and suffixation, are discussed and implemented. Prefixation covers noun to noun, noun to adjective, noun to adverb and adjective to adjective derivation. Suffixation covers noun to noun, noun to adjective, noun to noun/adjective, adjective to noun, adjective/noun to noun, verb to noun, verb to adjective and verb to adverb derivation. In addition, there are two kinds of conversion, verb to noun and verb to adjective/noun, and verb to noun derivation by vowel insertion is also included. Each prefix and suffix has its own set of words from which derivation takes place. Derivation in Nepali is not as productive and regular as inflection; nevertheless, there is a considerable number of derived words.






CHAPTER 7 IMPLEMENTATION



7.0 Outline

This chapter presents the implementation of the morphological categories and phonological rules analyzed in the earlier chapters, in order to design a computational model using the Xerox finite state toolkit. It consists of four sections. Section 7.1 presents the morphotactics, i.e. the syntax of morphemes. The morphological categories and grammatical categories have been separated based on the earlier analysis. Section 7.2 presents the lexc grammar for nouns, pronouns, adjectives, verbs, numerals, classifiers, adverbs, postpositions, conjunctions, particles, interjections and derivation. Section 7.3 deals with the realization, i.e. the rules for alternation, using the xfst interface for each category. Finally, Section 7.4 summarizes the chapter.



7.1 Morphotactics: syntax of morphemes

7.1.1 Morphological categories

As discussed and analyzed in Chapters 3 to 6, two major categories are identified: open word classes and closed word classes. Table 7.1 shows the four open word classes and their corresponding morphological tags used in the morphological analyzer. Table 7.2 shows the seven closed word classes with their corresponding morphological tags used in the morphological analyzer.

Table 7.1: The open word classes

S.N.   Morphological categories   Tags
1.     Nouns                      +NOUN
2.     Adjectives                 +ADJ
3.     Verbs                      +VERB
4.     Adverbs                    +ADV

Table 7.2: The closed word classes

S.N.   Morphological categories   Tags
1.     Pronouns                   +PRON
2.     Numeral                    +NUM
3.     Classifier                 +CLF
4.     Postpositions              +POSTP
5.     Conjunctions               +CCONJ, +SCONJ
6.     Particles                  +PARTICLE
7.     Interjections              +INTERJ



7.1.2 Grammatical categories

Altogether 45 grammatical categories are identified across all open and closed word classes. The adverb features numbered 37 to 42 in Table 7.3 and the two noun features numbered 44 and 45 are semantic, whereas the features in the other categories are formal. Table 7.3 lists all the grammatical categories and their corresponding morphological tags used in the morphological analyzer. Redundant features such as augmentative, non-causative, active and direct form have not been incorporated into the analyzer.

Table 7.3: The grammatical categories and features

S.N.   Grammatical categories   Tags
1.     Number                   +SG, +PL
2.     Gender                   +MASC, +FEM
3.     Form                     +DIRT, +OBL
4.     Honorificity             +NHON, +HON, +HHON, +RHON
5.     Evaluation               +AUG, +DIM
6.     Persons                  1, 2, 3
7.     Cases                    +ERG, +INST, +DAT, +ABL, +LOC, +COM, +GEN, +VOC, +ALL
8.     Distal                   +DIST
9.     Proximate                +PROX
10.    Reflexive                +REFL
11.    Demonstrative            +DEM
12.    Relative                 +REL
13.    Interrogative            +INTERRO
14.    Indefinite               +INDEF
15.    Definite                 +DEF
16.    Reciprocal               +RECIP
17.    Degree                   +POSIT, +COMP, +SUPER
18.    Cardinal                 +CARD
19.    Ordinal                  +ORD
20.    Frequency                +FREQ
21.    Portion                  +PORT
22.    Voice                    +PASS
23.    Causative                +CAUSE
24.    Existential              +EXIST
25.    Identificational         +ID
26.    Tenses                   +NPST, +PST
27.    Aspects                  +PERF, +IMPERF, +INFER, +HAB
28.    Moods                    +IMP, +OPT, +POT
29.    Absolutive               +ABS
30.    Infinitive               +INF
31.    Purposive                +PURP
32.    Prospective              +PROS
33.    Durative                 +DUR
34.    Conjunctive              +CONJUCT
35.    Conditional              +COND
36.    Perfective               +PERFT
37.    Temporal                 +TEMP
38.    Spatial                  +SPAC
39.    Amount                   +AMOUNT
40.    Manner                   +MANNER
41.    Reason                   +REASON
42.    Sentential               +SENT
43.    Emphasis                 +EMPH
44.    Proper Name              +PROPER
45.    Place Name               +PLACE



A number of arbitrary tags are used in the lexc files to restrict the scope of the replace rules. These tags are removed from the transducer after the replace rules have been applied. Table 7.3 lists a sample of such tags.

Table 7.3: The arbitrary tags

S.N.   Purpose of arbitrary tag                                      Tags
1.     O-ending nouns for plural, honorific, oblique and vocative    ^MP
2.     O-ending nouns for feminine and diminutive                    ^FE
3.     Non-honorific imperative                                      ^IMPsg
4.     Honorific imperative                                          ^IMPhon
5.     Plural imperative                                             ^IMPpl
6.     Noun to adjective derivation                                  ^NA
7.     Noun to adverb derivation                                     ^NADV
8.     Adjective to verb derivation                                  ^ADJV
9.     Verb to noun conversion                                       ^R
10.    Insertion in verb to noun derivation                          ^a
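The role of an arbitrary tag can be illustrated with a small Python sketch: the tag marks where a replacement may apply and is deleted afterwards. This is an assumption-laden illustration, not the analyzer's code. The alternations ो to ा before ^MP and ो to ी before ^FE are assumed here from pairs such as केटो, केटा and केटी; the actual replace rules belong to the realization component, and the function name `realize` is hypothetical.

```python
# Arbitrary tags that only scope replace rules and must not survive in surface forms.
ARBITRARY_TAGS = ("^MP", "^FE")

O_SIGN = "\u094B"   # dependent vowel sign ो
AA_SIGN = "\u093E"  # dependent vowel sign ा
II_SIGN = "\u0940"  # dependent vowel sign ी


def realize(lower_form: str) -> str:
    """Apply tag-scoped replacements, then strip the arbitrary tags."""
    # Assumed illustrative alternations; each applies only next to its tag.
    form = lower_form.replace(O_SIGN + "^MP", AA_SIGN + "^MP")
    form = form.replace(O_SIGN + "^FE", II_SIGN + "^FE")
    # Final clean-up step: the tags are removed from the result.
    for tag in ARBITRARY_TAGS:
        form = form.replace(tag, "")
    return form


if __name__ == "__main__":
    for intermediate in ("केटो", "केटो^MP", "केटो^FE"):
        print(intermediate, "->", realize(intermediate))
```

Because the replacement is conditioned on the tag, an untagged form ending in ो is left as it is, which is the point of using such tags to restrict the scope of a rule.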






7.2 Lexc grammar

Nouns, pronouns, adjectives, numerals, classifiers, verbs, adverbs, postpositions, particles and interjections are encoded in lexc files. The main morphological forms are in Devanagari script and the morphological tags are in Roman script, using UTF-8 character encoding. Each lexc file begins with Multichar_Symbols, followed by the lexicons.

7.2.1 Nouns

The nouns discussed and analyzed in (3.1) are implemented in a lexc file named nouns.txt, which includes 14 classes of nouns. The upper language contains stems and sequences of morphological tags, and the lower language contains surface forms. Besides the regular morphological tags, some arbitrary tags are used to restrict the application of replace rules. The encoding of the nouns with their morphological tags is as follows. This lexc file accumulates the transducers from Figure 3.1 to Figure 3.13.

Multichar_Symbols +NOUN +MASC +FEM +OBL +PL +SG +DIM +VOC +HON ^MP ^FE +PLACE +PROPER

LEXICON ROOT
Nouns;

LEXICON Nouns
!! Type 1a Nouns:
केटो inflection_1a;
!!Type 1b Nouns:
मुसो inflection_1b;
!!Type 1c Nouns:
डालो inflection_1c;
!!Type 1d Nouns:
फोटो inflection_1d;
!!Type 21a Nouns:
काका inflection_21a;
!!Type 21b Nouns:
नाति inflection_21b;
!!Type 21c Nouns:
बाघ inflection_21c;
!!Type 21d Nouns:
बःट inflection_21d;
!!Type 22a Nouns:
दाइ inflection_22a;
!!Type 22b Nouns:
दिदी inflection_22b;
!!Type 22c Nouns:
राम inflection_22c;
!!Type 22d Nouns:
सीता inflection_22d;
!!Type 22e Nouns:
खेत inflection_22e;
!!Type 22f Nouns:
पोखरा inflection_22f;

LEXICON inflection_1a
+NOUN+MASC+SG:0 #;
+NOUN+MASC+PL:^MP #;
+NOUN+MASC+OBL:^MP #;
+NOUN+MASC+HON:^MP #;
+NOUN+MASC+VOC:^MP #;
+NOUN+FEM:+FE #;

LEXICON inflection_1b
+NOUN+MASC+SG:0 #;
+NOUN+MASC+PL:^MP #;
+NOUN+MASC+OBL:^MP #;
+NOUN+FEM:+FE #;

LEXICON inflection_1c
+NOUN+SG:0 #;
+NOUN+PL:^MP #;
+NOUN+OBL:^MP #;
+NOUN+DIM:^FE #;

LEXICON inflection_1d
+NOUN+SG:0 #;
+NOUN+PL:^MP #;
+NOUN+OBL:^MP #;

LEXICON inflection_21a
+NOUN+MASC:0 #;
+NOUN+FEM:◌ी #;

LEXICON inflection_21b
+NOUN+MASC:0 #;
+NOUN+FEM:नी #;

LEXICON inflection_21c
+NOUN+MASC:0 #;
+NOUN+FEM:◌िनी #;

LEXICON inflection_21d
+NOUN+MASC:0 #;
+NOUN+FEM:◌ेनी #;
+NOUN+FEM:◌िनी #;

LEXICON inflection_22a
+NOUN+MASC:0 #;

LEXICON inflection_22b
+NOUN+FEM:0 #;

LEXICON inflection_22c
+NOUN+PROPER+MASC:0 #;

LEXICON inflection_22d
+NOUN+PROPER+FEM:0 #;

LEXICON inflection_22e
+NOUN:0 #;

LEXICON inflection_22f
+NOUN+PLACE:0 #;

END
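To make the continuation-class mechanism concrete, the following Python sketch expands a tiny fragment of the noun lexicon into upper:lower pairs. It only imitates how entries chain from one lexicon to the next; it is not the lexc compiler. The dictionary layout and the function name `expand` are hypothetical, and the lexc symbol 0 (empty string) is represented by an empty Python string.

```python
from typing import Dict, Iterator, List, Tuple

# Each entry is (upper side, lower side, continuation class); "#" ends a word.
Entry = Tuple[str, str, str]

LEXICONS: Dict[str, List[Entry]] = {
    "Root": [("", "", "Nouns")],
    "Nouns": [("केटो", "केटो", "inflection_1a")],
    "inflection_1a": [
        ("+NOUN+MASC+SG", "", "#"),     # lower side 0, i.e. nothing is added
        ("+NOUN+MASC+PL", "^MP", "#"),  # arbitrary tag left for a later replace rule
    ],
}


def expand(name: str = "Root", upper: str = "", lower: str = "") -> Iterator[Tuple[str, str]]:
    """Follow continuation classes depth-first and yield upper:lower string pairs."""
    if name == "#":
        yield upper, lower
        return
    for up, lo, continuation in LEXICONS[name]:
        yield from expand(continuation, upper + up, lower + lo)


if __name__ == "__main__":
    for upper_side, lower_side in expand():
        print(f"{upper_side}:{lower_side}")
```

Each path from the root lexicon to # corresponds to one string pair of the transducer, with the tag sequence on the upper side and the intermediate surface form on the lower side.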



7.2.2 Pronouns

Nepali pronouns are limited in number and more or less idiosyncratic in their forms and functions. Hence, instead of deriving their surface forms by rule, they are encoded directly, uniting all the finite state transducers from Figure 3.14 to Figure 3.33 along with their morphological tags. This section therefore does not contain any replace rules.

Multichar_Symbols +PRON +1SG +OBL +EMPH +GEN +MASC +FEM +HON +1PL +2SG +HHON +RHON +3SG +PROX +DIST +REFL +DEM +HUM +NHUM +DEF +INTERRO +RECIP +PL +SG +REL +INDEF +3PL

LEXICON ROOT
pronouns;

LEXICON pronouns
!!First person singular pronoun
म+PRON+1SG:म #;
म+PRON+1SG+OBL:मै #;
म+PRON+1SG+EMPH:मै #;
म+PRON+1SG+OBL+GEN+MASC:मेरो #;
म+PRON+1SG+OBL+GEN+FEM:मेरी #;
म+PRON+1SG+OBL+GEN+PL:मेरा #;
म+PRON+1SG+OBL+GEN+HON:मेरा #;
म+PRON+1SG+OBL+GEN+OBL:मेरा #;
म+PRON+1SG+OBL+GEN+EMPH:मेरै #;



!!First Person Plural Pronouns
हामी+PRON+1PL:हामी #;
हामी+PRON+1PL+OBL+GEN+MASC:हाम्रो #;
हामी+PRON+1PL+OBL+GEN+FEM:हाम्री #;
हामी+PRON+1PL+OBL+GEN+PL:हाम्रा #;
हामी+PRON+1PL+OBL+GEN+HON:हाम्रा #;
हामी+PRON+1PL+OBL+GEN+OBL:हाम्रा #;
हामी+PRON+1PL+OBL+GEN+EMPH:हाम्रै #;
!! Second Person Singular
तँ+PRON+2SG:तँ #;
तँ+PRON+2SG+OBL:त #;
तँ+PRON+2SG+EMPH:त #;
तँ+PRON+2SG+OBL+GEN+MASC:तेरो #;
तँ+PRON+2SG+OBL+GEN+FEM:तेरी #;
तँ+PRON+2SG+OBL+GEN+PL:तेरा #;
तँ+PRON+2SG+OBL+GEN+HON:तेरा #;
तँ+PRON+2SG+OBL+GEN+OBL:तेरा #;
तँ+PRON+2SG+OBL+GEN+EMPH:तेरै #;
!!Second Person honorific Pronoun
तिमी+PRON+2SG+HON:तिमी #;
तिमी+PRON+2SG+OBL+HON+GEN+MASC:तिम्रो #;
तिमी+PRON+2SG+OBL+HON+GEN+FEM:तिम्री #;
तिमी+PRON+2SG+OBL+HON+GEN+PL:तिम्रा #;
तिमी+PRON+2SG+OBL+HON+GEN+HON:तिम्रा #;
तिमी+PRON+2SG+OBL+HON+GEN+OBL:तिम्रा #;
तिमी+PRON+2SG+OBL+HON+GEN+EMPH:तिम्रै #;
!! Second person high honorific pronouns
तपा + PRON+2SG+HHON:तपा #;
यहाँ+PRON+2SG+HHON:यहाँ #;
उहाँ+PRON+2SG+HHON:उहाँ #;
वहाँ+PRON+2SG+HHON:वहाँ #;
हजुर+PRON+2SG+HHON:हजुर #;
!! Second person royal honorific pronoun
मौसुफ+PRON+2SG+RHON:मौसुफ #;
!!Third Person Singular Pronoun ऊ
ऊ+PRON+3SG:ऊ #;



ऊ+PRON+3SG+EMPH:उही #;
ऊ+PRON+3SG+OBL:उस #;
ऊ+PRON+3SG+OBL+EMPH:उसै #;
ऊ+PRON+3SG+HON:उनी #;
ऊ+PRON+3SG+HON+OBL:उन #;
ऊ+PRON+3SG+HON+OBL+EMPH:उनै #;
ऊ+PRON+3SG+HON:उहाँ #;
ऊ+PRON+3SG+HON:वहाँ #;
!! Third person singular pronoun यो
यो+PRON+3SG+PROX:यो #;
यो+PRON+3SG+PROX+EMPH:यही #;
यो+PRON+3SG+OBL+PROX:यस #;
यो+PRON+3SG+OBL+PROX+EMPH:यसै #;
यो+PRON+3SG+PROX+HON:यी #;
यो+PRON+3PL+PROX:यी #;
यो+PRON+3SG+PROX+HON:यिनी #;
यो+PRON+3SG+PROX+OBL+HON:यिन #;
यो+PRON+3SG+PROX+OBL+HON+EMPH:यिनै #;
!!!Third person singular pronoun त्यो and ती
त्यो+PRON+3SG+DIST:त्यो #;
त्यो+PRON+3SG+DIST+EMPH:त्यही #;
त्यो+PRON+3SG+OBL:त्यस #;
त्यो+PRON+3SG+OBL+EMPH:त्यसै #;
ती+PRON+3SG+HON+DIST:ती #;
ती+PRON+3PL+DIST:ती #;



ती+PRON+3SG+HON+DIST:तिनी #;
ती+PRON+3SG+OBL+HON+DIST:तिन #;
ती+PRON+3SG+OBL+HON+DIST+EMPH:तिनै #;
!!Reflexive pronoun
आफू+PRON+REFL:आफू #;
आफू+PRON+REFL+OBL+EMPH:आफै #;
आफू+PRON+REFL+OBL+EMPH:आफ #;
आफू+PRON+REFL+OBL+GEN+SG:आफ्नो #;
आफू+PRON+REFL+OBL+GEN+HON:आफ्ना #;
आफू+PRON+REFL+OBL+GEN+PL:आफ्ना #;
आफू+PRON+REFL+OBL+GEN+OBL:आफ्ना #;
आफू+PRON+REFL+OBL+GEN+FEM:आफ्नी #;
आफू+PRON+REFL+OBL+GEN+EMPH:आफ्नै #;
!! Demonstrative pronouns यो
यो+PRON+DEM+PROX:यो #;
यो+PRON+DEM+PROX+EMPH:यही #;
यो+PRON+DEM+PROX:यी #;
यो+PRON+DEM+PROX+HON:यिनी #;
यो+PRON+DEM+PROX+OBL:यिन #;
यो+PRON+DEM+PROX+OBL+EMPH:यिनै #;
यो+PRON+DEM+PROX:यहाँ #;
!!Demonstrative pronoun त्यो and ती
त्यो+PRON+DEM+DIST:त्यो #;
त्यो+PRON+DEM+DIST+EMPH:त्यही #;
ती+PRON+DEM+DIST:ती #;
ती+PRON+DEM+DIST+OBL:तिन #;
ती+PRON+DEM+DIST+OBL+HON:तिनी #;
ती+PRON+DEM+DIST+OBL+EMPH:तिनै #;
!!Demonstrative pronoun ऊ
ऊ+PRON+DEM+DIST:ऊ #;



ऊ+PRON+DEM+DIST+EMPH:उही #;
ऊ+PRON+DEM+DIST+HON:उनी #;
ऊ+PRON+DEM+DIST+OBL:उन #;
ऊ+PRON+DEM+DIST+OBL+EMPH:उनै #;
ऊ+PRON+DEM+DIST+HON:उहाँ #;
ऊ+PRON+DEM+DIST+HON:वहाँ #;
!!Other Demonstrative pronouns
सो+PRON+DEM+DIST:सो #;
सो+PRON+DEM+DIST+EMPH:सोही #;
निज+PRON+DEM+PROX:निज #;
निज+PRON+DEM+PROX+EMPH:निजै #;
उ +PRON+DEM+PROX:उ #;
!! Relative pronouns
जो+PRON+REL+HUM:जो #;
जो+PRON+REL+OBL+HUM:जस #;
जो+PRON+REL+OBL+HUM+EMPH:जसै #;
जे+PRON+REL+NHUM:जे #;
जुन+PRON+REL:जुन #;
जुन+PRON+REL+EMPH:जुनै #;
!! Interrogative pronouns
को+PRON+INTERRO+HUM:को #;
को+PRON+INTERRO+HUM+OBL:कस #;
को+PRON+INTERRO+HUM+OBL+EMPH:कसै #;
के+PRON+INTERRO+NHUM:के #;
कुन+PRON+INTERRO:कुन #;
किन+PRON+INTERRO:किन #;
कसरी+PRON+INTERRO:कसरी #;
!! Indefinite pronouns
को+PRON+INDEF+HUM:कोही #;
के+PRON+INDEF+NHUM:केही #;
कुन+PRON+INDEF:कुनै #;
जो+PRON+INDEF+HUM:जोसुकै #;
जे+PRON+INDEF+NHUM:जेसुकै #;
जुन+PRON+INDEF:जुनसुकै #;



!! Definite pronouns
प्रत्येक+PRON+DEF:प्रत्येक #;
हरेक+PRON+DEF:हरेक #;
सबै+PRON+DEF:सबै #;
अर्को+PRON+DEF+SG:अर्को #;
अर्को+PRON+DEF+PL:अर्का #;
अर्को+PRON+DEF+HON:अर्का #;
अर्को+PRON+DEF+OBL:अर्का #;
अर्को+PRON+DEF+FEM:अर्की #;
अर्को+PRON+DEF+EMPH:अर्कै #;
अ +PRON+DEF:अ #;
!! Reciprocal pronouns
एकअर्को+PRON+RECIP:एकअर्को #;
एकअर्को+PRON+RECIP+OBL:एकअर्का #;
एकअर्को+PRON+RECIP+HON:एकअर्का #;
एकअर्को+PRON+RECIP+PL:एकअर्का #;
एकअर्को+PRON+RECIP+FEM:एकअर्की #;
एकआपस+PRON+RECIP:एकआपस #;
आपस+PRON+RECIP:आपस #;
आफू+PRON+RECIP:आआफू #;



END

7.2.3 Verbs

The verb stems analyzed and classified in (4.4), and the auxiliary verbs and inflections analyzed in (4.5), are implemented in a single lexc file. The stems and inflections are concatenated in the same lexc file with the help of continuation lexicons. Two flag diacritics, @U.NEG.PRESENT@ and @U.NEG.ABSENT@, are defined and implemented to restrict the negative prefix. This lexc file includes the transducers from Figure 4.1 to Figure 4.36.

Multichar_Symbols ^IMP +1PL +1SG +2PL +2SG +3PL +3SG +ABS +COND +CONJ +DUR +EMPH +FEM +HAB +HON +IMP +IMPERF +INF +MASC +NEG +NPST +OBL +OPT +PST +PERF +PERFT +PL +POT +PROSP +PURP +SG +UNA +EXIST +IDEN @U.NEG.ABSENT@ +VERB +PASS +CAUSE NEG+ @U.NEG.PRESENT@



LEXICON ROOT
!!Auxiliary verbs
छ+EXIST:छ auxcha;
हो+IDEN:हो auxho;
थ+EXIST: थ past;
!!=============Main Verb Stems ==========!!
NEG+@U.NEG.PRESENT@:न@U.NEG.PRESENT@ Verbs;

LEXICON Verbs
!! Verb Type 1a
अघा Type1a;
!! Verb Type 1b
चोखि Type1b;
!!Verb Type1c
उि ल Type1c;
!!Type verb1d
हाँस् Type1d;
!!Verb Type1e
बस् Type1e;
!! Verb Type2a
उचाल् Type2a;
!! Verb Type 2b
प ब Type2b;
!!Verb Type2c
नाच् Type2c;
!!Type verb2d
कन् Type2d;
!! Irregular verbs
आउनु+VERB:आ Group;
आउनु+VERB+PASS:आइ



खानु+VERB:खा



intGroup;



Group;



खानु+VERB+PASS:खाइ



Group;



खानु+VERB+CAUSE: वा



Group;



खानु+VERB+CAUSE+PASS: वाइ Group; बःनु+VERB+CAUSE:बसाल् Group;






Verbs;



बःनु+VERB+CAUSE+PASS:बसा ल Group;



बःनु+VERB+CAUSE:बसाल् Group;



बःनु+VERB+CAUSE+PASS:बसा ल Group; LEXICON Type1a उनु+VERB:0 Group; उनु+VERB+PASS:इ intGroup; LEXICON Type1b नु+VERB:0 Group; नु+VERB+PASS:इ



intGroup;



नु+VERB+CAUSE:आ



Group;



नु+VERB+CAUSE+PASS:आइ LEXICON Type1c नु+VERB:0 Group; नु+VERB+PASS:इ



intGroup;



नु+VERB+CAUSE:आ



Group;



नु+VERB+CAUSE+PASS:आइ LEXICON Type1d नु+VERB:0 Group; नु+VERB+PASS:इ



Group;



नु+VERB+CAUSE+PASS:आइ LEXICON Type1e नु+VERB:0 Group; Group;



नु+VERB+CAUSE+PASS:आइ LEXICON Type2a नु+VERB:0 Group;



नु+VERB+PASS:इ



Group;



Group; Group; Group;



नु+VERB+CAUSE:आ



Group;



नु+VERB+CAUSE+PASS:आइ LEXICON Type2c नु+VERB:0 Group; नु+VERB+PASS:इ



Group;



intGroup;



नु+VERB+CAUSE:आ



नु+VERB+PASS:इ LEXICON Type2b नु+VERB:0



Group;



intGroup;



नु+VERB+CAUSE:आ



नु+VERB+PASS:इ



Group;



Group;



Group;






नु+VERB+CAUSE:आ



Group;



नु+VERB+CAUSE+PASS:आइ LEXICON Type2d नु+VERB:0 Group; नु+VERB+PASS:इ



Group;



Group;



नु+VERB+CAUSE:आ



Group;



नु+VERB+CAUSE+PASS:आइ !!!=============Verbs end



LEXICON intGroup @U.NEG.ABSENT@ intGroup2; LEXICON intGroup1 +NPST+3SG+MASC:छ



Group;



intGroup1;



#;



+NPST+NEG+3SG+MASC:दै न



#;



+NPST+NEG+3SG:न #; +PST+3SG+MASC:यो



#;



+PST+3SG+MASC:एन



#;



+PST+HAB+3SG+MASC: यो



#;



+PST+NEG+HAB+3SG+MASC:दै न यो +PST+INFER+3SG+MASC:एछ



#;



+PST+INFER+NEG+3SG+MASC:एनछ LEXICON intGroup2 +PERF+SG+MASC:एको +IMPERF:दै



#;



#;



#;



+OPT+3SG:ओस्



#;



+POT+3SG+MASC:ला +INF:नु



#;



#;



#;



+INF+OBL:ना #; +PURP:न



#;



+PROSP:ने



#;



+DUR+EMPH:दै



#;



+CONJ:एर



#;



+CONJ:इकन #; +COND:ए



#;






+PERFT:ए



#;



LEXICON Group
@U.NEG.ABSENT@ Group1;
Group2;

LEXICON Group1
NonpastAffirmative;
NonpastNegative1;
NonpastNegative2;
PastAffirmative;
PastNegative;
HabitualAspectAffirmative;
HabitualAspectNegative;
InferentialAffiramtive;
InferentialNegative;

LEXICON Group2
PerfectAspect;
ImperfectAspect;
Imperative;
Optative;
Potential;
Participles;

LEXICON past
PastAffirmative;
PastNegative;

!Inflection for non-past existential verb छ chʑ 'be' (Affirmative)
LEXICON auxcha
+NPST+1SG:◌ु #;
+NPST+1PL:◌ौँ #;



+NPST+2SG+MASC:स्



#;



+NPST+2SG+FEM:◌ेस्



#;



+NPST+2SG+MASC+HON:◌ौ



#;



+NPST+2SG+FEM+HON:◌्यौ



#;



+NPST+2PL:◌ौ #; +NPST+3SG+MASC:0 +NPST+3SG+FEM:◌े #;



#;



+NPST+3SG+MASC+HON:न्



#;



+NPST+3SG+FEM+HON:ि◌न्



#;



+NPST+3PL:न्



#;






!Inflection for non-past existential verb छ chʑ 'be' (Negative) +NPST+NEG+1SG:◌ैनँ



#;



+NPST+NEG+1PL:◌ैन



#;



+NPST+NEG+2SG:◌ैनस्



#;



+NPST+NEG+2SG+HON:◌ैनौ +NPST+NEG+2PL:◌ैनौ



#;



+NPST+NEG+3SG:◌ैन



#;



+NPST+NEG+3SG+HON:◌ैनन्



+NPST+NEG+3PL:◌ैनन्



#;



#;



#;



!! end of छ



!Inflections for non-past identificational verb हो ɦo ‘be’ (affirmative) LEXICON auxho +NPST+1SG:उँ #; +NPST+1PL:औ



#;



+NPST+2SG:स्



#;



+NPST+2SG+HON:औ



#;



+NPST+2PL:औ #; +NPST+3SG:0 #; +NPST+3SG+HON:न्#; +NPST+3PL:न्



#;



!Inflection for non-past identificational verb हो ɫo ‘be’ (Negative)



+NPST+NEG+1SG:इनँ



#;



+NPST+NEG+1PL:इन



#;



+NPST+NEG+2SG:इनस्



#;



+NPST+NEG+2SG+HON:इनौ +NPST+NEG+2PL:इनौ



#;



+NPST+NEG+3SG:इन



#;



+NPST+NEG+3SG+HON:इनन् +NPST+NEG+3PL:इनन्



#;



#;



#;



!!!===Tense, aspect and mood ==== !Inflections for non-past tense (affirmative) LEXICON NonpastAffirmative +NPST+1SG:छु #; +NPST+1PL:छ



#;






+NPST+2SG+MASC:छस्



#;



+NPST+2SG+FEM:छे स ्



#;



+NPST+2SG+MASC+HON:छौ



#;







#;



+NPST+2SG+FEM+HON: +NPST+2PL:छौ



#;



+NPST+3SG+MASC:छ



#;



+NPST+3SG+FEM:छे



#;



+NPST+3SG+MASC+HON:छन् +NPST+3SG+FEM+HON: छन् +NPST+3PL:छन्



#; #;



#;



!Inflections for non-past tense negative 1 LEXICON NonpastNegative1 +NPST+NEG+1SG: दनँ #; +NPST+NEG+1PL:दै न



#;



+NPST+NEG+2SG+MASC:दै नस् +NPST+NEG+2SG+FEM: दनस्



#; #;



+NPST+NEG+2SG+MASC+HON:दै नौ



#;



+NPST+NEG+2SG+FEM+HON: दनौ



#;



+NPST+NEG+2PL:दै नौ



#;



+NPST+NEG+3SG+MASC:दै न +NPST+NEG+3SG+FEM: दन



#; #;



+NPST+NEG+3SG+MASC+HON:दै नन् +NPST+NEG+3SG+FEM+HON: दनन् +NPST+NEG+3PL:दै नन्



#; #;



#;



!Inflections for non-past tense negative 2 LEXICON NonpastNegative2 +NPST+NEG+1SG:नँ #; +NPST+NEG+1PL:न #; +NPST+NEG+2SG:नस्



#;



+NPST+NEG+2SG+HON:नौ #; +NPST+NEG+2PL:नौ#; +NPST+NEG+3SG:न #; +NPST+NEG+3SG+HON:नन् +NPST+NEG+3PL:नन्



#;



#;






!Inflections for past tense (affirmative) LEXICON PastAffirmative +PST+1SG:ए ँ #; +PST+1PL:य #; +PST+2SG:इस्



#;



+PST+2SG+HON:यौ #; +PST+2PL:यौ #; +PST+3SG+MASC:यो



#;



+PST+3SG+FEM:ई #; +PST+3SG+MASC+HON:ए #; +PST+3SG+FEM+HON:इन् #; +PST+3PL:ए #; !Inflections for past tense (negative) LEXICON PastNegative +PST+NEG+1SG:इनँ #; +PST+NEG+1PL:एन #; +PST+NEG+2SG:इनस्



#;



+PST+NEG+2SG+MASC+HON:एनौ



#;



+PST+NEG+2SG+FEM+HON:इनौ #; +PST+NEG+2PL:एनौ #; +PST+NEG+3SG+MASC:एन



#;



+PST+NEG+3SG+FEM:इन #; +PST+NEG+3SG+MASC+HON:एनन्



#;



+PST+NEG+3SG+FEM+HON:इनन् #; +PST+NEG+3PL:एनन्



#;



!Inflections for perfect aspect LEXICON PerfectAspect +PERF+SG+MASC:एको #; +PERF+PL:एका



#;



+PERF+SG+FEM:एक +PERF+EMPH:एकै



#;



#;



!Inflections for imperfect aspect LEXICON ImperfectAspect +IMPERF+SG+MASC:दो #;






+IMPERF+SG+FEM:द +IMPERF+PL:दा +IMPERF:दै



#;



#;



#;



! Inflections for habitual aspect (affirmative) LEXICON HabitualAspectAffirmative +PST+HAB+1SG:थ #; +PST+HAB+1PL: य #; +PST+HAB+2SG: थस्



#;



+PST+HAB+2SG+HON: यौ #;



+PST+HAB+2PL: यौ #;



+PST+HAB+3SG+MASC: यो



#;



+PST+HAB+3SG+FEM: थ #; +PST+HAB+3SG+MASC+HON:थे #; +PST+HAB+3SG+FEM+HON: थन् #; +PST+HAB+3PL:थे #; !Inflections for habitual aspect (negative) LEXICON HabitualAspectNegative +PST+NEG+HAB+1SG:दै नथ #; +PST+NEG+HAB+1PL:दै न य



#;



+PST+NEG+HAB+2SG:दै न थस्



#;



+PST+NEG+HAB+2SG+HON:दै न यौ +PST+NEG+HAB+2PL:दै न यौ



#;



#;



+PST+NEG+HAB+2SG+FEM: दन थस्



#;



+PST+NEG+HAB+2SG+FEM+HON: दन यौ



#;



+PST+NEG+HAB+3SG+MASC:दै न यो



#;



+PST+NEG+HAB+3SG+FEM: दन थस्



#;



+PST+NEG+HAB+3SG+MASC+HON: दनथे +PST+NEG+HAB+3PL:दै नथे #;



!Inflections for Inferential aspect (affirmative) LEXICON InferentialAffiramtive +PST+INFER+1SG:एछु #; +PST+INFER+1PL:एछ



#;



+PST+INFER+2SG+MASC:एछस्



#;



+PST+INFER+2SG+FEM:इछस्



#;






#;



+PST+INFER+2SG+MASC+HON:एछौ



#;



+PST+INFER+2SG+FEM+HON:इछौ



#;



+PST+INFER+2PL:एछौ



#;



+PST+INFER+3SG+MASC:एछ



#;



+PST+INFER+3SG+FEM:इछ



#;



+PST+INFER+3SG+MASC+HON:एछन्



#;



+PST+INFER+3SG+FEM+HON:इछन्



#;



+PST+INFER+3PL:एछन्



#;



!Inflections for Inferential aspect (negative) LEXICON InferentialNegative +PST+INFER+NEG+1SG:एनछु #; +PST+INFER+NEG+1PL:एनछ



#;



+PST+INFER+NEG+2SG+MASC:एनछस् #; +PST+INFER+NEG+2SG+FEM:इनछे स्



#;



+PST+INFER+NEG+2SG+MASC+HON:एनछौ



#;



+PST+INFER+NEG+2SG+FEM+HON:इनछौ



#;



+PST+INFER+NEG+2PL:एनछौ



#;



+PST+INFER+NEG+3SG+MASC:एनछ



#;



+PST+INFER+NEG+3SG+FEM:इनछ



#;



+PST+INFER+NEG+3SG+HON:एनछन्



#;



+PST+INFER+NEG+3SG+FEM+HON:इनछन्



#;



+PST+INFER+NEG+3PL:एनछन्



#;



!Inflection for imperative mood LEXICON Imperative +IMP+2SG:^IMPsg #; !-/-ई +IMP+2SG+HON:^IMPhon #; !-अ/-ऊ +IMP+2PL:^IMPpl



#; !-अ/-ओ



! Inflections for optative mood (affirmative) LEXICON Optative +OPT+1SG:ऊँ #; +OPT+1PL:औ#; +OPT+2SG:एस्



#;



+OPT+2SG+HON:ए #; +OPT+2PL:ए #;






+OPT+3SG:ओस्



#;



+OPT+3SG+HON:ऊन् +OPT+3PL:ऊन्



#;



#;



! Inflections for potential mood (affirmative) LEXICON Potential +POT+1SG:उँला #; +POT+1PL:औला



#;



+POT+2SG+MASC:लास्



#;



+POT+2SG+FEM: लस्



#;



+POT+2SG+MASC+HON:औला



#;



+POT+2SG+FEM+HON:औल



#;



+POT+2PL:औला



#;



+POT+3SG+MASC:ला



#;



+POT+3SG+FEM:ल #; +POT+3SG+MASC+HON:लान्



#;



+POT+3SG+FEM+HON: लन्



#;



+POT+3PL:लान्



#;



!Inflection for participles LEXICON Participles +ABS:ई #; +INF:नु



#;



+INF+OBL:ना #; +INF+EMPH:नै +PURP:न



#;



+PURP+EMPH:नै +PROSP:ने



#;



+DUR:दा



#;



+DUR+EMPH:दै +CONJ:एर



#; #;



#;



#;



+CONJ+EMPH:एरै



#;



+CONJ:इकन #; +CONJ+EMPH:इकनै #; +COND:ए



#;



+PERFT:ए



#;






END

7.2.4 Adjectives

The adjectives described, analyzed and classified in (3.4) are implemented in a lexc file. The adjectives are classified into four groups. This lexc file contains the transducers from Figure 3.34 to Figure 3.37.

Multichar_Symbols +ADJ +SG +PL +OBL +HON +FEM +POSIT +COMP +SUPER

LEXICON ROOT
Adjectives;

LEXICON Adjectives
!!O-ending Adjectives
राम्रो inflection_o_ending;
!!Non-o-ending Adjective Type 1
चतुर inflection_non_o_ending1;
!!Non-o-ending Adjective Type 2
न्यून inflection_non_o_ending2;
!!Unmarked adjectives
असल inflection_unmarked;

LEXICON inflection_o_ending
+ADJ+SG:0 #;
+ADJ+PL:+MP #;
+ADJ+OBL:+MP #;
+ADJ+HON:+MP #;
+ADJ+FEM:+FE #;

LEXICON inflection_non_o_ending1
+ADJ+SG:0 #;
+ADJ+FEM:नी+FE #;

LEXICON inflection_non_o_ending2
+ADJ+POSIT:0 #;
+ADJ+COMP:तर #;
+ADJ+SUPER:तम #;

LEXICON inflection_unmarked
+ADJ:0 #;

END



7.2.5 Numerals and classifiers

The numerals analyzed and described in (3.5) and the classifiers described and analyzed in (3.6) are implemented in a lexc file. Irregular ordinal numerals are encoded directly. This lexc file contains the transducers from Figure 3.38 to Figure 3.45.

Multichar_Symbols +CARD +ORD +NUM +CLF +PORT +FREQ +FEM +PL +HON +MASC +HUM +NHUM +OBL +CL +MP +SG

LEXICON ROOT
Numbers;

LEXICON Numbers
!!Cardinal Numbers
पाँच CardOrd;
सात CardOrd;
सय CardOrd;



हजार CardOrd; लाख



CardOrd;



अरब



CardOrd;



खरब



CardOrd;



करोड CardOrd;



नील



CardOrd;







CardOrd;



दानो



ctag1;



प CardOrd; !!Clasifier like items !!O-ending classifiers कोसो ctag1;



!!Non-o-ending classifiers पोट ctag2; थुन



ctag2;



!!Exceptional numbers एक+NUM+CARD:एक



#;



दुई+NUM+CARD:दुई #; तीन+NUM+CARD:तीन



#;






चार+NUM+CARD:चार



#;



छ+NUM+CARD:छ #; नौ+NUM+CARD:नौ #; !!Exceptional ordinal numerals एक+NUM+ORD+MASC:प हलो



#;



एक+NUM+ORD+PL:प हला #;



एक+NUM+ORD+OBL:प हला



#;



एक+NUM+ORD+HON:प हला



#;



एक+NUM+ORD+FEM:प हल



#;



दुई+NUM+ORD+MASC:दोॐो



#;



दुई+NUM+ORD+PL:दोॐा



#;



दुई+NUM+ORD+OBL:दोॐा #; दुई+NUM+ORD+HON:दोॐा #; दुई+NUM+ORD+FEM:दोॐी #;



तीन+NUM+ORD+MASC:तेॐो तीन+NUM+ORD+PL:तेॐा



#;



#;



तीन+NUM+ORD+OBL:तेॐा #;



तीन+NUM+ORD+HON:तेॐा #; तीन+NUM+ORD+FEM:तेॐी #; चार+NUM+ORD+MASC:चौथो चार+NUM+ORD+PL:चौथा



#;



#;



चार+NUM+ORD+OBL:चौथा #; चार+NUM+ORD+HON:चौथा #;



चार+NUM+ORD+FEM:चौथी #; एक+NUM+ORD:ूथम



#;



दुई+NUM+ORD: तीय



#;



तीन+NUM+ORD:तृतीय



#;



चार+NUM+ORD:चतुथ



#;



पाँच+NUM+ORD:प म



#;



छै ट +NUM+ORD:छै ट



#;



नव +NUM+ORD:नव #; !! Frequency numerals एक+NUM+FREQ:एकोहोरो



#;






दुई+NUM+FREQ:दोहोरो



#;



तीन+NUM+FREQ:तेहोरो



#;



एक+NUM+FREQ:एकसरो



#;



दुई+NUM+FREQ:दुईसरो



#;



तीन+NUM+FREQ:तीनसरो



#;



दुई+NUM+FREQ:दोबर



#;



तीन+NUM+FREQ:तेबर



#;



चार+NUM+FREQ:चौबर



#;



दुई+NUM+FREQ:दुईगुना



#;



तीन+NUM+FREQ:तीनगुना



#;



चार+NUM+FREQ:चौगुना



#;



!! Portion Numerals आधा+NUM+PORT:आधा



#;



पौने+NUM+PORT:पौने



#;



सवा+NUM+PORT:सवा



#;



डेढ+NUM+PORT:डेढ #; साढे +NUM+PORT:साढे



अढाइ+NUM+PORT:अढाइ



#; #;



चौथाइ+NUM+PORT:चौथाइ #; !! Classifiers जना+CLF+HUM:जना #; वटा+CLF+NHUM:वटा



#;



ओटा+CLF+NHUM:ओटा



#;



वटी+CLF+FEM:वटी #;
ओटी+CLF+FEM:ओटी #;

LEXICON CardOrd
+NUM+CARD:0 #;
+NUM+ORD:औ #;

LEXICON ctag1
+CL+SG:0 #;
+CL+PL:+MP #;

LEXICON ctag2
+CL:0 #;

END






7.2.6 Adverbs

The adverbs described, analyzed and classified in (5.1) are implemented in a lexc file. Since adverbs do not inflect, they are classified into semantic classes. This lexc file contains the transducers from Figure 5.1 to Figure 5.7.

Multichar_Symbols +ADV +TEMP +SPAC +AMOUNT +MANNER +FREQ +REASON +SENT

LEXICON Root
!!! Temporal adverbs
अहिले AdvT;
हिजो AdvT;
!!! Spatial Adverbs
तल AdvS;
यहाँ AdvS;
!!! Amount adverbs
धेरै AdvAm;
अझ AdvAm;
!!! Manner adverbs
सुस्तरी AdvMa;
फटाफट AdvMa;
!!! Frequency adverbs
बार बार AdvFr;
निरन्तर AdvFr;
!!! Reason adverbs
यसकारण AdvRe;
तसर्थ AdvRe;
!!! Sentential adverbs
साँच्चै AdvSe;
स्वाभावतः AdvSe;

LEXICON AdvT
+ADV+TEMP:0 #;

LEXICON AdvS
+ADV+SPAC:0 #;

LEXICON AdvAm
+ADV+AMOUNT:0 #;

LEXICON AdvMa
+ADV+MANNER:0 #;

LEXICON AdvFr
+ADV+FREQ:0 #;

LEXICON AdvRe
+ADV+REASON:0 #;

LEXICON AdvSe
+ADV+SENT:0 #;

END

7.2.7 Postpositions

The postpositions discussed and analyzed in (5.3) are implemented in a lexc file. The plural marker and the case markers are encoded directly, whereas the adverbial postpositions are implemented through continuation lexicons. This lexc file contains the transducers from Figure 5.10 to Figure 5.12.



Multichar_Symbols +POSTP +EMPH +ERG +INST +DAT +ABL +LOC +COM +GEN +DIR +SG +PL +FEM

LEXICON ROOT
!!Case Markers which do not take emphatic marker
+ERG:ले #;
+INST:ले #;
+DAT:लाई #;
+ABL:देखि #;
!!Case marker which take emphatic marker also
+ABL:बाट #;
+ABL+EMPH:बाटै #;
+LOC:मा #;
+LOC+EMPH:मै #;
+COM:सँग #;
+COM+EMPH:सँगै #;
+COM:सित #;
+COM+EMPH:सितै #;
+GEN+SG:को #;
+GEN+PL:का #;
+GEN+FEM:की #;
+GEN+EMPH:कै #;
+ALL:तिर #;
+ALL+EMPH:तिरै #;
!!Plural/collective marker
+PL:हरू #;
!!Adverbial postpositions which do not take emphatic marker
माथि tag1;
कहाँ tag1;
!!Adverbial Postpositions which take emphatic marker
सहित tag2;
साथ tag2;
अनुसार tag2;
बाहेक tag2;

LEXICON tag1
+POSTP:0 #;

LEXICON tag2
+POSTP:0 #;
+POSTP+EMPH:◌ै #;

END



7.2.8 Conjunctions, particles and interjections

The conjunctions analyzed in (5.2) and the particles and interjections analyzed in (5.4) are implemented in a lexc file. This lexc file contains the transducers from Figure 5.8 to Figure 5.9 and from Figure 5.13 to Figure 5.14.

Multichar_Symbols +PART +INTERJ +CCONJ +SCONJ +PARTICLE



LEXICON Root !!!समप दक सं योजकह र



वा



Coordinate; Coordinate;



अथवा Coordinate; या



Coordinate;







Coordinate;



नक



Coordinate;



अन



Coordinate;






पन



Coordinate;



तथा



Coordinate;



एवं



Coordinate;



तर



Coordinate;



क तु Coordinate;



पर तु Coordinate; !!! वषमप दक सं योजकह भ े



Subordinate;



भनेर



Subordinate;



भने



Subordinate;







Subordinate;



कनभने Subordinate; कन क Subordinate;



यसकारण



Subordinate;



!!!Particles नपातह नै



Particle;



माऽ



Particle;



केवल Particle; चा हँ



Particle;



पन



Particle;







Particle;



है



Particle;







Particle;







Particle;







Particle;



पो



Particle;



या



के



Particle; Particle;







Particle;



रे



Particle;



हँ



यारे



हग



Particle; Particle; Particle;






खै



Particle;



लौ



Particle;



हौ



Particle;



यार



Particle;



यारे



Particle;



या



Particle;



है



Particle;







Particle;



!!!Interjections वःमाया दबोधकह अहा



Interjection;



अहो



Interjection;



ओहो



Interjection;



उहु



Interjection;



उफ



Interjection;



आ था Interjection; आ थु Interjection; आ छु Interjection; छ



Interjection;



धत्



Interjection;



थु



Interjection;



थुइ



Interjection;



ध े र Interjection;



बःड



Interjection;



हाय



Interjection;



कठै



Interjection;



हरे



Interjection;



िशव



Interjection;







Interjection;



बरा



Interjection;



उस्



Interjection;



हाहा



Interjection;



हह



Interjection;



बचरा Interjection;






या



Interjection;







Interjection;







Interjection;







Interjection;



हौ



Interjection;



ऐ या



Interjection;







Interjection;



हवस्



Interjection;



अँ



Interjection;



यू



Interjection;



हजुर



Interjection;



हँ



अहँ



नाइँ



Interjection; Interjection; Interjection;



कुि



Interjection;



स े



Interjection;



साँ ची Interjection; धरोधम Interjection; भो



Interjection;







Interjection;







Interjection;



वाह



Interjection;



अबुइ



Interjection;



आ पै



Interjection;



ःयाबास Interjection;



उफ्



Interjection;







Interjection;



चै



Interjection;



ओइ



Interjection;



एइ



Interjection;



याहै



Interjection;







Interjection;



हा



ब:ड



Interjection; Interjection;






LEXICON Coordinate
+CCONJ:0 #;

LEXICON Subordinate
+SCONJ:0 #;

LEXICON Particle
+PARTICLE:0 #;

LEXICON Interjection
+INTERJ:0 #;

END

7.2.9 Derivations

The derivational processes of prefixation, described and analyzed in (6.1), and suffixation, described and analyzed in (6.2), are implemented in a lexc file. This lexc file contains the transducers from Figure 6.1 to Figure 6.16.
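As a rough illustration of how the continuation classes in this lexc file pair an upper (analysis) string with a lower (surface) string, the sketch below walks a three-entry fragment of the PNtoN sub-lexicons in Python. The dictionary `LEXICONS`, the function `expand`, and the list `PAIRS` are hypothetical names introduced only for this illustration; the entries themselves are copied from the listing. The sketch enumerates paths directly and does not compile a finite-state transducer, as lexc itself would.

```python
# Hypothetical miniature of the PNtoN sub-lexicons (names and layout are
# assumptions). Each entry is (upper, lower, continuation class); "#" ends a word.
LEXICONS = {
    "ROOT": [("", "", "PNtoN")],
    "PNtoN": [
        ("PFX+", "परा", "PNtoN2"),
        ("PFX+", "अनु", "PNtoN5"),
        ("PFX+", "अव", "PNtoN6"),
    ],
    "PNtoN2": [("जय", "जय", "PNtoNtag")],
    "PNtoN5": [("शासन", "शासन", "PNtoNtag")],
    "PNtoN6": [("गुण", "गुण", "PNtoNtag")],
    "PNtoNtag": [("+NOUN", "", "#")],  # +NOUN:0 -> the tag has no surface form
}


def expand(lexicon="ROOT", upper="", lower=""):
    """Yield every (upper, lower) pair reachable from the given lexicon."""
    if lexicon == "#":
        yield upper, lower
        return
    for up, low, continuation in LEXICONS[lexicon]:
        yield from expand(continuation, upper + up, lower + low)


PAIRS = list(expand())
for upper, lower in PAIRS:
    print(upper, "->", lower)
```

Because each prefix leads to its own sub-lexicon (PNtoN2, PNtoN5, PNtoN6), a prefix combines only with the stems listed for it, which is why the fragment yields three pairs rather than nine.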



Multichar_Symbols +NOUN PFX+ +ADJ +SFX ^R ^a +ADV

LEXICON ROOT
!Lexicons for prefixation
PNtoN;
PNtoAdj;
PNtoAdv;
PAdjtoAdj;
!Lexicons for suffixation
SNtoN;
SNtoAdj;
SNtoNAdj;
SAdjtoN;
SAdjNtoN;
SVtoN;
SVtoAdj;
SVtoAdv;
SAdvtoAdj;
ConVtoN;
ConVtoNAdj;
InsVtoN;

LEXICON PNtoN
PFX+:प्र PNtoN1;
PFX+:परा PNtoN2;
PFX+:अप PNtoN3;
PFX+:सम् PNtoN4;
PFX+:अनु PNtoN5;
PFX+:अव PNtoN6;
PFX+:दुस् PNtoN7;
PFX+:दुर् PNtoN8;
PFX+:वि PNtoN9;
PFX+:अधि PNtoN10;
PFX+:अति PNtoN11;
PFX+:अभि PNtoN12;
PFX+:प्रति PNtoN13;
PFX+:परि PNtoN14;
PFX+:उप PNtoN15;
PFX+:सह PNtoN16;
PFX+:स PNtoN17;
PFX+:कु PNtoN18;
PFX+:अ PNtoN19;
PFX+:अन् PNtoN20;
PFX+:बे PNtoN21;
PFX+:बद PNtoN22;
PFX+:ला PNtoN23;
PFX+:सु PNtoN24;



!!Lexicon of underived nouns -----!!
LEXICON PNtoN1
चलन PNtoNtag;
LEXICON PNtoN2
जय PNtoNtag;
LEXICON PNtoN3
शब्द PNtoNtag;
LEXICON PNtoN4
मान PNtoNtag;
LEXICON PNtoN5
शासन PNtoNtag;
LEXICON PNtoN6
गुण PNtoNtag;
LEXICON PNtoN7
परिणाम PNtoNtag;
LEXICON PNtoN8
घटना PNtoNtag;
LEXICON PNtoN9
नाश PNtoNtag;



LEXICON PNtoN10 रा य PNtoNtag; LEXICON PNtoN11 वृ PNtoNtag; LEXICON PNtoN12 िच PNtoNtag; LEXICON PNtoN13 व न PNtoNtag; LEXICON PNtoN14 योजना PNtoNtag; LEXICON PNtoN15 मह PNtoNtag; LEXICON PNtoN16 काय PNtoNtag; LEXICON PNtoN17 प रवार PNtoNtag; LEXICON PNtoN18 पुऽ PNtoNtag; LEXICON PNtoN19 ान PNtoNtag; LEXICON PNtoN20 आःथा PNtoNtag; LEXICON PNtoN21 इ जत PNtoNtag; LEXICON PNtoN22 नाम PNtoNtag; LEXICON PNtoN23 वा रस PNtoNtag; LEXICON PNtoN24 समाचारPNtoNtag; !Lexicon for common tag LEXICON PNtoNtag +NOUN:0 #; !!---Noun to adjective derivation -----!! LEXICON PNtoAdj PFX+: नर ् PNtoAdj1; PFX+: नः



PNtoAdj2;



PFX+: न



PNtoAdj3;



PFX+: व



PNtoAdj4;






PFX+: नस्



PNtoAdj5;



PFX+:स



PNtoAdj6;



PFX+:बे



PNtoAdj7;



PFX+:अ



PNtoAdj8;



PFX+:अन



PNtoAdj9;



!!Lexicon of underived nouns LEXICON PNtoAdj1 दोष PNtoAdjtag; LEXICON PNtoAdj2 ःवाथ PNtoAdjtag; LEXICON PNtoAdj3 डर PNtoAdjtag; LEXICON PNtoAdj4 मुख PNtoAdjtag; LEXICON PNtoAdj5 फल PNtoAdjtag; LEXICON PNtoAdj6 बल PNtoAdjtag; LEXICON PNtoAdj7 घर PNtoAdjtag; LEXICON PNtoAdj8 मू य PNtoAdjtag; LEXICON PNtoAdj9 मोल PNtoAdjtag; !!Lexicon for common tag LEXICON PNtoAdjtag +ADJ:0 #; !!-----Noun to adverb derivation -----!! LEXICON PNtoAdv PFX+:आ PNtoAdv1; PFX+:स



PNtoAdv2;



PFX+: नर ्



PNtoAdv3;



PFX+:ू त



PNtoAdv4;



!!Lexicon of underived nouns LEXICON PNtoAdv1 मरण PNtoAdvtag; LEXICON PNtoAdv2






हष PNtoAdvtag; LEXICON PNtoAdv3 घात PNtoAdvtag; LEXICON PNtoAdv4 ह ा PNtoAdvtag; !!Lexicon for common tag LEXICON PNtoAdvtag +ADV:0 #; !!-----Adjective to adjective derivation -----!! LEXICON PAdjtoAdj PFX+:सम् PAdjtoAdj1; PFX+: व



PAdjtoAdj2;



PFX+:दुर ्



PAdjtoAdj3;



PFX+:उन्



PAdjtoAdj4;



PFX+:सु



PAdjtoAdj5;



PFX+:प र



PAdjtoAdj6;



!!Lexicon of underived nouns
LEXICON PAdjtoAdj1
पूण PAdjtoAdjtag;
LEXICON PAdjtoAdj2
शु PAdjtoAdjtag;
LEXICON PAdjtoAdj3
भे PAdjtoAdjtag;
LEXICON PAdjtoAdj4
मु PAdjtoAdjtag;
LEXICON PAdjtoAdj5
िशि त PAdjtoAdjtag;
LEXICON PAdjtoAdj6
पूण PAdjtoAdjtag;
!!Lexicon for common tag
LEXICON PAdjtoAdjtag
+ADJ:0 #;

!! Suffixation
!!Noun to Noun Derivation
LEXICON SNtoN
!!Nountype1
सुन SNtoN1;
!!Nountype2



घाँस SNtoN2; LEXICON SNtoN1 +SFX:आर SNtoNtag; LEXICON SNtoN2 +SFX:ई SNtoNtag; LEXICON SNtoNtag +NOUN:0 #; !! Noun to adjective derivation LEXICON SNtoAdj !!Nountype1 दया SNtoAdj1; !!Nountpe2 लाभ SNtoAdj2; !!Nountpe3 सेवा SNtoAdj3; !!Nountpe4 मुगल SNtoAdj4; !!Nountpe5 ल बु SNtoAdj5; !!Nountpe6 दान SNtoAdj6; !!Nountpe7 खच SNtoAdj7; !!Nountpe8 भर SNtoAdj8; !!Nountpe9 रस SNtoAdj9; !!Nountpe10 शहर SNtoAdj10; !!Nountpe11 होस SNtoAdj11; LEXICON SNtoAdj1 +SFX:अनीय SNtoAdjtag; LEXICON SNtoAdj2 +SFX:अक SNtoAdjtag; LEXICON SNtoAdj3 +SFX:इका SNtoAdjtag; LEXICON SNtoAdj4 +SFX:आन SNtoAdjtag; LEXICON SNtoAdj5






+SFX:वान SNtoAdjtag; LEXICON SNtoAdj6 +SFX:ई SNtoAdjtag; LEXICON SNtoAdj7 +SFX:आलु SNtoAdjtag; LEXICON SNtoAdj8 +SFX:आलो SNtoAdjtag; LEXICON SNtoAdj9 +SFX:आहा SNtoAdjtag; LEXICON SNtoAdj10 +SFX:इया SNtoAdjtag; LEXICON SNtoAdj11 +SFX:इयार SNtoAdjtag; LEXICON SNtoAdjtag +ADJ:0 #; !!-----Noun to noun/adjective derivation -----!! LEXICON SNtoNAdj !!Nountype1 झापा SNtoNAdj1; !!Nountpe2 गु मी SNtoNAdj2; !!Nountpe3 इलाम SNtoNAdj3; !!Nountpe4 गाउँ SNtoNAdj4; !!Nountpe5 नेपाल SNtoNAdj5; LEXICON SNtoNAdj1 +SFX:ल SNtoNAdjtag1; LEXICON SNtoNAdj2 +SFX:एल SNtoNAdjtag1; LEXICON SNtoNAdj3 +SFX:ए SNtoNAdjtag1; LEXICON SNtoNAdj4 +SFX:ले SNtoNAdjtag1; LEXICON SNtoNAdj5 +SFX:ई SNtoNAdjtag1; LEXICON SNtoNAdjtag +NOUN:0 #; LEXICON SNtoNAdjtag1 +NOUN:0 #;






+ADJ:0



#;



!!-----Adjective to noun derivation -----!! LEXICON SAdjtoN !!Nountype1 लामो SAdjtoN1; छोटो SAdjtoN1; !!Nountpe2 !!xy SAdjtoN2; LEXICON SAdjtoN1 +SFX:आइ SAdjtoNtag; LEXICON SAdjtoNtag +NOUN:0 #;



!!----- Adjective/noun to noun derivation -----!! LEXICON SAdjNtoN !!Nountype1 ग रब SAdjNtoN1; LEXICON SAdjNtoN1 +SFX:ई SAdjNtoNtag1; LEXICON SAdjNtoNtag +NOUN:0 #; +ADJ:0 #; LEXICON SAdjNtoNtag1 +NOUN:0 #; !!------ Verb to noun derivation -----!! LEXICON SVtoN !!verbtype1 च ुन् SVtoN1; !verbtype2 च ुन् SVtoN2; !!verbtype3 क SVtoN3; !!verbtype4 क SVtoN4; !!verbtype5 ढाक् SVtoN5; !!verbtype6 जल् SVtoN6; !!verbtype7 चोर ् SVtoN7; !!verbtype8 हाँस् SVtoN8;






!!verbtype9 प SVtoN9; !!verbtype10 थाक् SVtoN10; !!verbtype11 छाप् SVtoN11; !!verbtype12 छान् SVtoN12; !!verbtype13 िच या SVtoN13; !!verbtype14 झर ् SVtoN14; !!verbtype15 ढोग् SVtoN15; !!verbtype16 राख् SVtoN16; !!verbtype17 दाब् SVtoN17; !!verbtype18 बच ् SVtoN18; !!verbtype19 स SVtoN19; !!verbtype20 रोप् SVtoN20; !!verbtype21 छे क् SVtoN21; !!verbtype22 िचर ् SVtoN22; !!verbtype23 ब SVtoN23; !!verbtype24 सर ् SVtoN24; !!verbtype25 उ SVtoN25; !!verbtype26 चाल् SVtoN26; !!verbtype27 बेर ् SVtoN27; !!verbtype28 गा SVtoN28; !!verbtype29






भ SVtoN29; !!verbtype30 िजत् SVtoN30; !!verbtype31 कोर ् SVtoN31; !!verbtype32 खुल ् SVtoN32; LEXICON SVtoN1 +SFX:आउ SVtoNtag; LEXICON SVtoN2 +SFX:आब SVtoNtag; LEXICON SVtoN3 +SFX:आनी SVtoNtag; LEXICON SVtoN4 +SFX:आनी SVtoNtag; LEXICON SVtoN5 +SFX:अनी SVtoNtag; LEXICON SVtoN6 +SFX:अन SVtoNtag; LEXICON SVtoN7 +SFX:ई SVtoNtag; LEXICON SVtoN8 +SFX:ओ SVtoNtag; LEXICON SVtoN9 +SFX:आइ SVtoNtag; LEXICON SVtoN10 +SFX:आवट SVtoNtag; LEXICON SVtoN11 +SFX:आ SVtoNtag; LEXICON SVtoN12 +SFX:ओट SVtoNtag; LEXICON SVtoN13 +SFX:हट SVtoNtag; LEXICON SVtoN14 +SFX:अना SVtoNtag; LEXICON SVtoN15 +SFX:आउनी SVtoNtag; LEXICON SVtoN16 SVtoNtag; +SFX:आलो LEXICON SVtoN17






+SFX:आब SVtoNtag; LEXICON SVtoN18 +SFX:अत SVtoNtag; LEXICON SVtoN19 +SFX:अल SVtoNtag; LEXICON SVtoN20 +SFX:आइँ SVtoNtag; LEXICON SVtoN21 +SFX:आरो SVtoNtag; LEXICON SVtoN22 +SFX:औटो SVtoNtag; LEXICON SVtoN23 +SFX:औती SVtoNtag; LEXICON SVtoN24 +SFX:उवा SVtoNtag; LEXICON SVtoN25 +SFX:ती SVtoNtag; LEXICON SVtoN26 +SFX:नी SVtoNtag; LEXICON SVtoN27 +SFX:नो SVtoNtag; LEXICON SVtoN28 +SFX:ना SVtoNtag; LEXICON SVtoN29 +SFX:अ त SVtoNtag; LEXICON SVtoN30 +SFX:और SVtoNtag; LEXICON SVtoN31 +SFX:एसो SVtoNtag; LEXICON SVtoN32 +SFX:अःत SVtoNtag; LEXICON SVtoNtag +NOUN:0 #; !!----- Verb to adjective derivation -----!! LEXICON SVtoAdj !!verbtype1 मच ् SVtoAdj1; !!verbtype2 भ ुल् SVtoAdj2; !!verbtype3






पोस् SVtoAdj3; !!verbtype4 घुम ् SVtoAdj4; !!verbtype5 घुम ् SVtoAdj5; !!verbtype6 खप् SVtoAdj6; !!verbtype7 प SVtoAdj7; !!verbtype8 छा SVtoAdj8; !!verbtype9 रोप् SVtoAdj9; !!verbtype10 सक् SVtoAdj10; !!verbtype11 बक् SVtoAdj11; !!verbtype12 भाग् SVtoAdj12; !!verbtype13 छे र ् SVtoAdj13; !!verbtype14 लाग् SVtoAdj14; LEXICON SVtoAdj1 +SFX:आहा SVtoAdjtag; LEXICON SVtoAdj2 +SFX:अ ड SVtoAdjtag; LEXICON SVtoAdj3 +SFX:इलो SVtoAdjtag; LEXICON SVtoAdj4 +SFX:अ ते SVtoAdjtag; LEXICON SVtoAdj5 +SFX:अ ता SVtoAdjtag; LEXICON SVtoAdj6 +SFX:आलु SVtoAdjtag; LEXICON SVtoAdj7 SVtoAdjtag; +SFX:ऐया LEXICON SVtoAdj8 +SFX:आ SVtoAdjtag; LEXICON SVtoAdj9






+SFX:आर SVtoAdjtag; LEXICON SVtoAdj10 +SFX:आ SVtoAdjtag; LEXICON SVtoAdj11 +SFX:आउ SVtoAdjtag; LEXICON SVtoAdj12 +SFX:औटो SVtoAdjtag; LEXICON SVtoAdj13 +SFX:औट SVtoAdjtag; LEXICON SVtoAdj14 +SFX:उ SVtoAdjtag; LEXICON SVtoAdjtag +ADJ:0 #; !!----- Verb to adverb derivation -----!! LEXICON SVtoAdv !!verbtype1 गर ् SVtoAdv1; LEXICON SVtoAdv1 +SFX:उ जेल SVtoAdvtag; +SFX:इ जेल SVtoAdvtag; LEXICON SVtoAdvtag +ADV:0 #;



!!------ Adverb to adjective derivation -----!!
LEXICON SAdvtoAdj
भऽ SAdvtoAdj1;
LEXICON SAdvtoAdj1
+SFX:ई SAdvtoAdjtag;
LEXICON SAdvtoAdjtag
+ADJ:0 #;

!!------ Verb to noun conversion -----!!
LEXICON ConVtoN
!!verbtype1
खेल् ConVtoNtag;
खोज् ConVtoNtag;
LEXICON ConVtoNtag
+NOUN:^R #;

!!------ Verb to noun/adjective conversion -----!!
LEXICON ConVtoNAdj
!!verbtype1
ठग् ConVtoNAdjtag;
चोर् ConVtoNAdjtag;
थप् ConVtoNAdjtag;
LEXICON ConVtoNAdjtag
+NOUN:^R #;
+ADJ:^R #;

!!----- Verb to noun derivation (by vowel insertion) -----!!
LEXICON InsVtoN
!!verbtype1
च क InsVtoNtag;
स झ InsVtoNtag;



ट क InsVtoNtag;
LEXICON InsVtoNtag
+NOUN:^a #;

END

7.3 Realization: rules of alternations

When the lexc files are compiled into a finite-state transducer, the upper language contains sequences of stems (citation forms) and morphological tags, and the lower language contains the actual spelling of the stems and affixes. The lower language may also contain arbitrary tags that create the environments in which the rules apply. The rules given after each finite-state transducer figure are collected and arranged in a fixed order. The required variables are defined, and the rules are composed into a single finite-state transducer. Finally, the rules are applied to the lower language of the lexicon finite-state transducer. Each word class is treated separately in the subsequent sections.
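The idea of applying an ordered set of rules to the lower language can be sketched in Python with regular-expression substitutions. This is a simplified stand-in, not the xfst implementation: `RULES`, `realize`, and `SURFACE` are hypothetical names, and applying substitutions one after another to a single string only approximates composing rule transducers with `.o.`. The two rules shown correspond to `[◌् %^R -> [ ] ]` and `[%^a -> [ ] ]` from the derivation rules listed later in this section.

```python
import re

VIRAMA = "\u094d"  # Devanagari sign virama (halant)

# Ordered cascade: each (pattern, replacement) pair stands in for one rewrite
# rule; list order mirrors the order of composition.
RULES = [
    (re.escape(VIRAMA) + r"\^R", ""),  # delete the virama together with the ^R tag
    (r"\^a", ""),                      # delete the ^a tag
]


def realize(lower_form, rules=RULES):
    """Apply every rule in order to one lower-language string."""
    for pattern, replacement in rules:
        lower_form = re.sub(pattern, replacement, lower_form)
    return lower_form


# Lower string produced by the ConVtoN lexicon for the verb stem खोज्
SURFACE = realize("खोज" + VIRAMA + "^R")
print(SURFACE)
```

Here the lexicon supplies the stem with its final virama followed by the arbitrary tag `^R`, and the cascade removes both, leaving the converted noun.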



7.3.1 Phonological rules for nouns

clear
define cons |ग|घ|ङ|च|छ|ज|झ|ञ|ट|ठ|ड|ढ|ण|त|थ|द|ध|न|प|फ|ब|भ|म|य|र|ल|व|स|ष|श|ह;
define liquids र|ल;
define change [
[◌ो %+MP -> ◌ा || _ .#.]
.o. [◌ो %+FE -> ◌ी || _ .#.]
.o. [◌ा -> [ ] || _ ?* %^b ि◌ न ◌ी .#.]
.o. [◌ा -> [ ] || _ ◌े न ◌ी .#.]
.o. [◌ा -> [ ] || _ ि◌ न ◌ी .#.]
.o. [◌ा -> [ ] || _ ◌ी .#.]
.o. [◌ी -> ◌् || liquids _ न ◌ी .#.]
.o. [◌ी -> ि◌ || _ न ◌ी .#.]
.o. [[. .] -> ◌् || cons _ न ◌ी .#.]
.o. [य ◌ा -> [ ] || _ न ◌ी .#.]
.o. [◌ी -> [ ] || _ ि◌ न ◌ी .#.]
.o. [◌ू -> ◌ु || _ ?* %^b ि◌ न ◌ी .#.]
.o. [◌ी -> [ ] || _ %^b ि◌ न ◌ी .#.]
.o. [%^b -> [ ] ]
.o. [ि◌ -> [ ] || ◌ु _ न ◌ी .#.]
];

1 When more nouns are included, new rules, if any, can be added to this set of rules.
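The order of the rules matters because an earlier rule can remove the environment that a later rule needs. The Python sketch below imitates two adjacent noun rules, the liquid-specific `ी -> ् || liquids _ न ी .#.` and the general `ी -> ि || _ न ी .#.`, using regular expressions. The names `SPECIFIC`, `GENERAL`, and `apply_rules` are hypothetical, and the input strings are chosen only to exercise the two contexts; they are not taken from the lexicon.

```python
import re

LIQUIDS = "रल"

# Specific rule: long ii becomes virama after a liquid, before word-final nii.
SPECIFIC = (rf"(?<=[{LIQUIDS}])ी(?=नी$)", "्")
# General rule: long ii becomes short i before word-final nii.
GENERAL = (r"ी(?=नी$)", "ि")


def apply_rules(form, rules):
    """Apply the (pattern, replacement) pairs to the string in the given order."""
    for pattern, replacement in rules:
        form = re.sub(pattern, replacement, form)
    return form


# Specific before general: the liquid context is handled first.
print(apply_rules("मालीनी", [SPECIFIC, GENERAL]))
# General before specific: the general rule consumes the long ii first.
print(apply_rules("मालीनी", [GENERAL, SPECIFIC]))
# No liquid before the long ii: only the general rule applies.
print(apply_rules("धोबीनी", [SPECIFIC, GENERAL]))
```

With the specific rule first, the stem-final vowel after a liquid is replaced by a virama; with the order reversed, the general rule shortens it and the specific rule never fires.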






read lexc ◌ा || cons _ cons ?* %^a ] .o. [ि◌ %^a -> ◌् ] .o. [◌ा -> [ ] || cons _ ?* %^ia ]



.o. [◌ा -> [ ] || cons _ ?* %^ta ] .o. [◌् %^ia -> ◌ा ] .o. [◌् %^ta -> ◌ा ] .o. [%^ia -> ◌ा ] .o. [ि◌ %^ta -> ◌ा] .o. [%^IMPsg -> [ ], %^IMPpl -> ऊ, %^IMPhon -> ओ || ◌ा _ .#.] .o. [◌् -> [ ] || cons _ %^IMPhon ] .o. [◌्



-> [ ] || cons _ %^IMPpl]



.o. [ि◌ -> [ ] || cons _ %^IMPhon] .o. [ि◌ -> [ ] || cons _ %^IMPpl] .o. [%^IMPsg -> [ ] || ◌् _ .#.] .o. [%^IMPsg -> [ ] || _ .#.] .o. [%^IMPpl -> [ ] || _ .#.] .o. [%^IMPhon -> [ ] || _ .#.]






.o. [◌् आ -> ◌ा || cons _ ] .o. [◌् इ -> ि◌ || cons _ ] .o. [◌् ई -> ◌ी || cons _ ] .o. [◌् उ -> ◌ु || cons _ ] .o. [◌् ऊ -> ◌ू || cons _ ] .o. [◌् ए -> ◌े || cons _ ] .o. [◌् ओ -> ◌ो || cons _ ] .o. [◌् औ -> ◌ौ || cons _ ] .o. [इ इ



-> इ ]



.o. [इ ई



-> ई ]



.o. [[. .] -> उ ◌ँ || ◌ा _ npastinfl|npastneginfl1|habinfl|habneginfl|imperfect .#.] .o. [[. .] -> न ◌् || ि◌|इ _ npastinfl|habinfl|imperfect .#.] .o. [i -> इ ] .o.






[◌् इ -> ि◌ || cons _ ] .o. [इ इ



-> इ ]



];2 read lexc ◌ा || _ .#.] .o. [◌ो %+FE -> ◌ी || _ .#.] .o. [[. .] -> ◌् || liquids _ न ◌ी .#.] .o. [य ◌ा -> [ ] || _ न ◌ी .#.] .o. [◌ा -> [ ] || _ ?* न ◌ी .#.] .o. [[. .] -> ि◌ || cons _ न ◌ी .#.] .o. [ि◌ -> [ ] || ध _ न ◌ी .#.] ];3 2



More rules may be needed as more verbs are added.



3 When more adjectives are added to the lexc file, other rules may be required.






read lexc [ ] || _ ◌ै .#.] .o. [◌ो -> [ ] || _ ◌ै .#.] .o. [◌् -> [ ] || _ ◌ै .#.] ]; read lexc ◌ो] .o. [◌् इ -> ि◌] .o.






[◌् ई -> ◌ी] .o. [◌् उ ->◌ु] .o. [◌् ऊ ->◌ू] .o. [◌् ए -> ◌े] .o. [◌् ऐ -> ◌ै] .o. [◌् औ -> ◌ौ] .o. [◌् अ -> [ ] ] .o. [अ -> [ ] || con _ ] .o. [ए -> ◌े, इ -> ि◌ || con _ ] .o. [◌ी ए -> ◌े || _ ल ◌ी] .o. [आ -> ◌ा || con _ ] .o. [◌ो आ -> ◌ा || con _ ] .o. [ई -> ◌ी || con _ ] .o. [◌ा इ -> ि◌ || con _ con] .o. [◌ा अ -> [ ] || con _ ] .o. [◌ा -> [ ] || .#. con _ ?* इ .#.] .o. [◌् -> [ ] || _ ?* %^a] .o. [◌् %^R -> [ ] ] .o. [%^a -> [ ] ] ]; read lexc