A computational analysis of Nepali morphology: A model for natural language processing


Publisher: Faculty of Humanities and Social Sciences of Tribhuvan University in Fulfillment of the Requirements for the


Nepali-English Pages [380]


A COMPUTATIONAL ANALYSIS OF NEPALI MORPHOLOGY: A MODEL FOR NATURAL LANGUAGE PROCESSING

A Dissertation

Submitted to the Faculty of Humanities and Social Sciences of Tribhuvan University in Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY in LINGUISTICS

By

BALARAM PRASAIN
Ph.D. Reg. No.: 19-2058 Magh
TU Reg. No.: 288-83
March 2011

ACKNOWLEDGEMENTS

My profound indebtedness is due to my supervisor, Prof. Dr. Yogendra Prasad Yadava, former head of the Central Department of Linguistics, Tribhuvan University, Nepal, for his insistent encouragement, continuous guidance, valuable suggestions and insightful comments in accomplishing this dissertation. I would like to express my sincere gratitude to Professor Dr. Miriam Butt, Department of Linguistics, Konstanz University, Germany, for her constructive suggestions, proper guidance and insightful comments, which improved this dissertation. I owe a great deal to Dr. Andrew Hardie, Lancaster University, for his valuable suggestions, for providing his articles, which helped me understand the basic concepts, and for his help in using the online NNC corpus. I would like to extend my thanks to Dr. Dan Raj Regmi, head of the Central Department of Linguistics, Tribhuvan University, for his encouragement, useful suggestions and comments on my study. I would like to extend my sincere gratitude to Prof. Madhav Prasad Pokharel, Central Department of Linguistics, Tribhuvan University, for his prompt answers to any queries regarding Nepali morphology and its structure. I would like to express my gratitude to Prof. Dr. Chuda Mani Bandhu and Prof. Dr. Tej R. Kansakar, former heads of the Central Department of Linguistics, for the inspiration and encouragement they gave to this study. I extend my thanks to Krishna Prasad Chalise, Dubi Nanda Dhakal and Krishna Poudel, all of the Central Department of Linguistics, for their valuable comments and active participation in discussion whenever a problem was raised. I am equally thankful to Ram Raj Lohani, Bhim Narayan Regmi, Karnakhar Khatiwada, Bhim Lal Gautam and Krishna Prasad Parajuli, all of the Central Department of Linguistics, for their support and encouragement.

I am extremely thankful to Santa Bahadur Basnet for his tokenizer and a number of computational concepts. I would like to thank Madan Puraskar Library and its staff for their help at different points in time. My sincere thanks go to Dr. Tika Ram Poudel, Tina Bögel and Sebastian Sulger, Konstanz University, for their help. I would like to thank Tribhuvan University for providing me study leave for two years; SIL International for providing travel grants to attend workshops and institutes at the University of California, in Bangkok, Thailand, and at IIT Hyderabad, India; the Bhashasanchar project for supporting me financially while I attended the training on text-to-speech at Gothenburg University, Sweden; and the Department of Linguistics, University of Konstanz, for supporting me financially to attend the school of computational and natural language processing at Konstanz University, Germany. I would like to take this opportunity to express my sincere appreciation to my spouse Mrs. Nirmala Prasain, son Aryan Prasain and daughter Sanskriti Prasain for their tolerance. Finally, I express my thanks to all the non-teaching staff of the Central Department for their help whenever required.

BALARAM PRASAIN


ABSTRACT

The main goal of this study is to present a computational analysis of Nepali morphology in order to develop a model for natural language processing using the finite state approach. The morphological categories have been analyzed according to the principles of Two-level morphology (Koskenniemi 1983), and these categories have been implemented using the Xerox Finite State Tool (Beesley and Karttunen 2003) to create the morphological analyzer. The study uses a version of the finite state automaton called a finite state transducer, which handles the relation between two languages, namely the upper language and the lower language. The upper language is equivalent to the lexical level and the lower language to the surface level. The finite state transducer is bidirectional: moving from the surface level to the lexical level is analysis, and moving from the lexical level to the surface level is generation.

This study is organized into eight chapters. Chapter 1 presents the general morphological concepts, the objectives, the methodology, and the significance and limitations of the study. Chapter 2 presents the theoretical framework adopted for the study. Chapter 3 analyzes nouns, pronouns, adjectives, numerals and classifiers in Nepali. Chapter 4 analyzes Nepali verbs from a computational perspective in its first part and verbal inflections in its second part. Chapter 5 deals with indeclinable words in Nepali. Chapter 6 analyzes the derivational process. Chapter 7 implements the outcome of the analysis in the previous chapters as a finite state transducer using the Xerox Finite State Tool. Chapter 8 summarizes the findings of the study.

This study has identified fourteen groups of nouns, eight groups of pronouns, four groups of adjectives, one group of cardinal numerals, two groups of ordinal numerals, three groups of classifiers, ten groups of verbs, seven groups of adverbs, two groups of conjunctions, three groups of postpositions, one group of particles and fifteen groups of derivations in Nepali. The phonological rules for each group have also been identified. A finite state transducer has been created for each group with the corresponding morphological tags and phonological rules, and all of them have been combined into a single transducer which can be used as a morphological analyzer for Nepali.
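The bidirectional behaviour described above can be illustrated with a toy transducer. The sketch below is written in Python rather than in the Xerox tools used in the study, and everything in it is an assumption made for illustration: the three-arc network, the helper names `analyze` and `generate`, and the treatment of हरू as a plural ending are hypothetical and do not reproduce the dissertation's actual lexicon or rules. Only the form घर+NOUN+SG is taken from the study's own figure captions.

```python
from typing import Dict, List, Tuple

# Each arc pairs an upper (lexical) symbol with a lower (surface) symbol.
# The empty string stands for epsilon, i.e. nothing on that side.
Arc = Tuple[str, str, int]  # (upper symbol, lower symbol, target state)

# Hypothetical network covering only two forms of घर 'house'.
ARCS: Dict[int, List[Arc]] = {
    0: [("घर", "घर", 1)],
    1: [("+NOUN", "", 2)],
    2: [("+SG", "", 3), ("+PL", "हरू", 3)],
}
FINAL = {3}


def _walk(state: int, remaining: str, side: str, out: str) -> List[str]:
    """Match `remaining` against one side of the arcs, collecting the other side."""
    results: List[str] = []
    if not remaining and state in FINAL:
        results.append(out)
    for upper, lower, target in ARCS.get(state, []):
        match, emit = (upper, lower) if side == "upper" else (lower, upper)
        if remaining.startswith(match):
            results += _walk(target, remaining[len(match):], side, out + emit)
    return results


def generate(lexical: str) -> List[str]:
    """Lexical level to surface level (generation)."""
    return _walk(0, lexical, "upper", "")


def analyze(surface: str) -> List[str]:
    """Surface level to lexical level (analysis)."""
    return _walk(0, surface, "lower", "")


if __name__ == "__main__":
    print(analyze("घरहरू"))
    print(generate("घर+NOUN+SG"))
```

The same arc table serves both directions; only the side that is matched against the input changes, which is the property the abstract refers to as bidirectionality.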


TABLE OF CONTENTS

Recommendation Letter
Approval Letter
Declaration
Acknowledgments
Abstract
List of tables
List of figures
List of abbreviations

Chapter 1: Introduction
1.1 Background
1.2 Statement of the problem
1.3 Objectives of the study
1.4 Literature review
1.5 Significance of the study
1.6 Research methodology
1.7 Limitations
1.8 Organization of the study

Chapter 2: Theoretical framework
2.0 Outline
2.1 Computational concept
2.2 Regular expression
2.3 Finite state technology
2.4 Regular language
2.5 Finite state machine
2.5.1 Finite state automata (FSA)
2.5.2 Finite state transducer (FST)
2.5.3 Some important operations on FSTs
2.6 FST in computational morphology
2.7 Xerox finite state tool syntax (XFST)
2.7.1 LEXC grammar
2.7.2 XFST interface
2.8 Summary

Chapter 3: Nominal morphology
3.0 Outline
3.1 Nouns in Nepali
3.1.1 Characteristics of nouns in Nepali
3.2 Classification of nouns in Nepali
3.2.1 O-ending nouns
3.2.2 Non-o-ending nouns
3.3 Pronouns
3.3.1 Characteristics of pronouns in Nepali
3.3.2 Grouping of pronouns
3.4 Adjectives
3.4.1 Characteristics of adjectives in Nepali
3.4.2 Classification of adjectives
3.5 Numerals
3.5.1 Cardinal numbers
3.5.2 Ordinal number
3.5.3 Other numerals
3.6 Classifiers in Nepali
3.6.1 Numeral classifiers
3.6.2 Quasi classifiers
3.7 Summary

Chapter 4: Verbal morphology
4.0 Outline
4.1 Characteristics of verb in Nepali
4.1.1 Significant verb stem finals
4.1.2 Transitivity
4.1.3 Syllabicity
4.1.4 Sound आ a
4.2 Morphological processes
4.2.1 Causativization/transitivization
4.2.2 Passivization
4.2.3 Negativization
4.3 Stem formation
4.4 Grouping of verb stems
4.4.1 Intransitive verb stems
4.4.2 Transitive verb stem
4.4.3 Irregular verb stems
4.4.4 Suppletive verb stems
4.5 Verbal inflections
4.5.1 Auxiliary verbs in Nepali
4.5.2 Tense
4.5.3 Aspects
4.5.4 Moods
4.5.5 Participial forms
4.6 Summary

Chapter 5: Adverbs, conjunctions, postpositions and particles
5.0 Outline
5.1 Adverbs in Nepali
5.1.1 Temporal adverbs
5.1.2 Spatial adverbs
5.1.3 Amount adverbs
5.1.4 Manner adverbs
5.1.5 Frequency adverbs
5.1.6 Reason adverbs
5.1.7 Sentential adverbs
5.2 Conjunctions in Nepali
5.2.1 Coordinate conjunctions
5.2.2 Subordinate conjunctions
5.3 Postpositions in Nepali
5.3.1 Plural/collective marker
5.3.2 Case markers in Nepali
5.3.3 Adverbial postpositions
5.4 Particles and interjections in Nepali
5.4.1 Particles
5.4.2 Emphatic markers
5.4.3 Interjections in Nepali
5.5 Summary

Chapter 6: Derivational morphology
6.0 Outline
6.1 Prefixation
6.1.2 Noun to noun derivation
6.1.3 Noun to adjective derivation
6.1.4 Noun to adverb derivation
6.1.5 Adjective to adjective derivation
6.2 Suffixation
6.2.1 Noun to noun derivation
6.2.2 Noun to adjective derivation
6.2.3 Noun to noun/adjective derivation
6.2.4 Adjective to noun derivation
6.2.5 Adjective/noun to noun derivation
6.2.6 Verb to noun derivation
6.2.7 Verb to adjective derivation
6.2.8 Verb to adverb derivation
6.2.9 Adverb to adjective derivation
6.2.10 Verb to noun conversion
6.2.11 Verb to adjective/noun conversion
6.2.12 Verb to noun derivation
6.3 Summary

Chapter 7: Implementation
7.0 Outline
7.1 Morphotactics: syntax of morphemes
7.1.1 Morphological categories
7.1.2 Grammatical categories
7.2 Lexc grammar
7.2.1 Nouns
7.2.2 Pronouns
7.2.3 Verbs
7.2.4 Adjectives
7.2.5 Numerals and classifiers
7.2.6 Adverbs
7.2.7 Postpositions
7.2.8 Conjunctions, particles and interjections
7.2.9 Derivations
7.3 Realization: rules of alternations
7.3.1 Phonological rules for nouns
7.3.2 Phonological rules for pronouns
7.3.3 Phonological rules for verbs
7.3.4 Phonological rules for adjectives
7.3.5 Phonological rules for adverbs
7.3.6 Phonological rules for postpositions
7.3.7 Phonological rules for particles and interjections
7.3.8 Phonological rules for numerals and classifiers
7.3.9 Phonological rules for derivations
7.4 Summary

Chapter 8: Summary and conclusion

Annexes
Annex-1: Devanagari – IPA
Annex-2: Nepali nouns sample
Annex-3: Nepali pronouns
Annex-4: Adjectives in Nepali
Annex-5: Numerals and classifiers in Nepali
Annex-6: Adverbs in Nepali
Annex-7: Verbs in Nepali
Annex-8: Verbal Inflections in Nepali
Annex-9: Conjunctions and particles in Nepali
Annex-10: Postpositions in Nepali
Annex-11: Words and affixes for derivation in Nepali

References

List of Tables

Table 1.1: Simple, complex, compound and reduplicated words
Table 1.2: Free morphemes in Nepali
Table 1.3: Bound morphemes in Nepali
Table 1.4: Lexical and surface levels representation
Table 2.1: The sample regular expressions
Table 2.2: Some operators used in regular expressions
Table 2.3: Regular expressions and regular language
Table 2.4: The transition table for घर and घरहरू
Table 3.1: The o-ending and non-o-ending nouns
Table 3.2: Number: singular and plural
Table 3.3: Lexical gender
Table 3.4: Morphological gender
Table 3.5: Direct and oblique forms
Table 3.6: Honorificity: non-honorific and honorific
Table 3.7: Augmentative and diminutive
Table 3.8: NounType 1a
Table 3.9: NounType 1b
Table 3.10: NounType 1c
Table 3.11: NounType 1d
Table 3.12: NounType 21a
Table 3.13: NounType 21b
Table 3.14: NounType 21c
Table 3.15: NounType 21d
Table 3.16: NounType 22a
Table 3.17: NounType 22b
Table 3.18: NounType 22c
Table 3.19: NounType 22d
Table 3.20: NounType 22e
Table 3.21: NounType 22f
Table 3.22: Pronouns with respect to persons
Table 3.23: Persons number in number distinctions
Table 3.24: Form of pronouns: direct and oblique
Table 3.25: Honorific levels in Nepali pronouns
Table 3.26: First person singular pronouns
Table 3.27: First person plural pronouns
Table 3.28: Second person singular non-honorific pronouns
Table 3.29: Second person honorific pronouns
Table 3.30: Second person high honorific pronouns
Table 3.31: Second person royal honorific pronoun
Table 3.32: Third person pronoun ऊ u:
Table 3.33: Third person pronouns त्यो tjo and ती ti:
Table 3.34: Third person pronouns यो jo and यी ji:
Table 3.35: The reflexive pronouns
Table 3.36: The demonstrative pronouns यो jo and यी ji:
Table 3.37: The demonstrative pronouns त्यो tjo and ती ti:
Table 3.38: The demonstrative pronouns ऊ u:
Table 3.39: The remaining demonstrative pronouns
Table 3.40: The relative pronouns
Table 3.41a: The interrogative pronouns
Table 3.41b: The indefinite pronouns derived from interrogative pronouns
Table 3.42: The indefinite pronouns derived from relative pronouns
Table 3.43a: The definite pronouns
Table 3.43b: The definite pronoun अक
Table 3.44a: The reciprocal pronouns
Table 3.44b: The reciprocal pronouns
Table 3.45: O-ending and non-o-ending adjectives
Table 3.46: Number: singular and plural
Table 3.47: Gender: masculine and feminine
Table 3.48: Form: direct and oblique
Table 3.49: Honorificity: non-honorific and honorific
Table 3.50: Degree: positive, comparative and superlative
Table 3.51: O-ending adjectives
Table 3.52: Type 1 marked adjectives
Table 3.53: Type 2 marked adjectives
Table 3.54: Unmarked adjectives
Table 3.55: Some cardinal numbers
Table 3.56: Some regular ordinal numbers
Table 3.57: Irregular ordinal numbers of one
Table 3.58: Irregular ordinal numbers of two
Table 3.59: Irregular ordinal numbers of three
Table 3.60: Irregular ordinal numbers of four
Table 3.61: Some ordinal numbers from Sanskrit loan
Table 3.62: Frequency numerals (I)
Table 3.63: Frequency numerals (II)
Table 3.64: Frequency numerals (III)
Table 3.65: Frequency numerals (IV)
Table 3.66: Some portion numerals
Table 3.67: Numeral classifiers
Table 3.68: o-ending classifiers
Table 3.69: General non-o-ending classifiers
Table 4.1: i-ending intransitive verb stems
Table 4.2: i-ending transitive verb stems
Table 4.2a: i-ending transitive verb stems
Table 4.3: Alternative forms of i-ending verb stems
Table 4.4: a-ending verb stems (group 1)
Table 4.5: a-ending verb stems (group 2)
Table 4.6: o-ending verb stems
Table 4.7a: Change of o to u in o-ending verb stems
Table 4.7b: ʌ-ending verb stems
Table 4.7c: ʌ-ending verb stems
Table 4.7d: Verb stems ending with a voiceless consonant
Table 4.8: Alternative forms from stems ending with voiceless consonant
Table 4.9: Verb stems ending with voiced consonant
Table 4.10: Alternative forms from stems ending with voiced consonant
Table 4.11: Intransitive verbs
Table 4.12: Some transitive verbs
Table 4.13: Some ditransitive verbs
Table 4.14: Monosyllabic verb stems
Table 4.15: Polysyllabic verb stems
Table 4.16: Verb stems with a sound
Table 4.17: Verb stems without a sound
Table 4.18: Causative verb stems
Table 4.19: Verb stems forming causatives with -आ -a and आल् -al
Table 4.20: Verb stems forming causatives by changing अ ʌ to आ a
Table 4.21a: Verb stems forming causatives by changing उ u to ओ o
Table 4.21b: Verb stems forming causatives by suffixing -आ -a
Table 4.22: Verb stems form causatives by inserting a
Table 4.23: Some passive verb stems
Table 4.24: Negation by the prefixation of negative marker न- nʌ-
Table 4.25: Negation by the suffixation of negative marker -न -nʌ
Table 4.26: Pattern of the stem formation
Table 4.27: Type1a verb stems
Table 4.28: Type1b verb stems
Table 4.29: Type1c verb stems
Table 4.30: Type1d verb stems
Table 4.31: Type1e verb stems (i)
Table 4.32: Type1e verb stems (ii)
Table 4.33: Type2a verb stems (i)
Table 4.34: Type2a verb stems (ii)
Table 4.35: Type2b verb stems
Table 4.36: Type2c verb stems
Table 4.37: Type2d verb stems
Table 4.38: Irregular verb stems
Table 4.39: Suppletive verb stems
Table 4.39: Inflections for non-past existential verb छ chʌ 'be' (affirmative)
Table 4.40: Inflection for non-past existential verb छ chʌ 'be' (negative)
Table 4.41: Inflections for non-past identificational verb हो ɦo 'be' (affirmative)
Table 4.42: Inflection for non-past identificational verb हो ɦo 'be' (negative)
Table 4.43: Inflections for past existential verb थि tʰi 'be' (affirmative)
Table 4.44: Inflections for past existential verb थि tʰi 'be' (negative)
Table 4.45: Inflections for non-past tense (affirmative)
Table 4.46: Inflections for non-past tense negative 1
Table 4.47: Inflections for non-past tense negative 2
Table 4.48: Inflections for past tense (affirmative)
Table 4.49: Inflections for past tense (negative)
Table 4.50: Inflections for perfect aspect
Table 4.51: Inflections for imperfect aspect
Table 4.52: Inflections for past habitual aspect (affirmative)
Table 4.53: Inflections for habitual aspect (negative)
Table 4.54: Inflections for inferential aspect (affirmative)
Table 4.55: Inflections for inferential aspect (negative)
Table 4.56: Inflection for imperative mood
Table 4.57: Inflections for optative mood (affirmative)
Table 4.58: Inflections for potential mood (affirmative)
Table 4.59: Inflection for absolutive participle
Table 4.60: Inflections for infinitive participle
Table 4.61: Inflections for purposive participle
Table 4.62: Inflection for prospective participle
Table 4.63: Inflections for durative participle
Table 4.64: Inflections for conjunctive participle
Table 4.65: Inflection for conditional participle
Table 4.65: Inflection for perfective participle
Table 5.1: Temporal adverbs
Table 5.2: Spatial adverbs
Table 5.3: Amount adverbs
Table 5.4: Manner adverbs
Table 5.5: Frequency adverbs
Table 5.6: Reason adverbs
Table 5.7: Sentential adverbs
Table 5.8: Coordinate conjunctions
Table 5.9: Subordinate conjunctions
Table 5.10: Collective/plural marker
Table 5.11a: Case marker postpositions (i)
Table 5.11b: Case marker postpositions (ii)
Table 5.12a: Adverbial postpositions (a)
Table 5.12b: Adverbial postpositions (b)
Table 5.13: Particles in Nepali
Table 5.14: Interjections in Nepali
Table 6.1: Noun to noun derivation
Table 6.2: Noun to adjective derivation
Table 6.3: Noun to adverb derivation
Table 6.4: Adjective to adjective derivation
Table 6.5: Noun to noun derivation
Table 6.6: Noun to adjective derivation
Table 6.7: Noun to noun/adjective derivation
Table 6.8: Adjective to noun derivation
Table 6.9: Adjective/noun to noun derivation
Table 6.10: Verb to noun derivation
Table 6.11: Verb to adjective derivation
Table 6.12: Verb to adverb derivation
Table 6.13: Adverb to adjective derivation
Table 6.14: Verb to noun conversion
Table 6.15: Verb to adjective/noun conversion
Table 6.16: Verb to noun (vowel insertion)
Table 7.1: The open word classes
Table 7.2: The closed word classes
Table 7.3: The grammatical categories and features
Table 7.3: The arbitrary tags

List of Figures

Figure 1.1: Lexical and surface levels of Nepali word केटो ket ̺o 'boy'
Figure 2.1: A finite state automaton that accepts घर 'house' and घरहरू 'houses'
Figure 2.2: A finite state transducer that transduces between घर 'house' and घर+NOUN+SG
Figure 2.3: FST unioned from three FSTs for nouns, adjectives and adverbs
Figure 2.4: A finite state transducer concatenated from two FSTs above
Figure 2.5: A finite state transducer from composing two FSTs above
Figure 2.6: The interrelation among language, regular expression and finite state network
Figure 2.7: The structure of Lexc grammar (Beesley and Karttunen 2003:205)
Figure 2.8: xfst interface can compile lexicon and rule and compose them into single FST (Karttunen 2000)
Figure 3.1: A finite state transducer for NounType 1a
Figure 3.2: A finite state transducer for NounType 1b
Figure 3.3: A finite state transducer for NounType 1c
Figure 3.4: A finite state transducer for NounType 1d
Figure 3.5: A finite state transducer for NounType 21a
Figure 3.6: A finite state transducer for NounType 21b
Figure 3.7: A finite state transducer for NounType 21d
Figure 3.8: A finite state transducer for NounType 22a
Figure 3.9: A finite state transducer for NounType 22b
Figure 3.10: A finite state transducer for NounType 22c
Figure 3.11: A finite state transducer for NounType 22d
Figure 3.12: A finite state transducer for NounType 22e
Figure 3.13: A finite state transducer for NounType 22f
Figure 3.14: A finite state transducer for first person singular pronouns
Figure 3.15: A finite state transducer for first person plural pronouns
Figure 3.16: A finite state transducer for second person singular non-honorific pronouns
Figure 3.17: A finite state transducer for second person honorific pronouns
Figure 3.18: A finite state transducer for second person higher honorific pronouns
Figure 3.19: A finite state transducer for second person highest honorific pronoun
Figure 3.20: A finite state transducer for third person uː
Figure 3.21: A finite state transducer for third person pronouns त्यो tjo and ती ti:
Figure 3.22: A finite state transducer for third person pronouns यो jo and यी ji:
Figure 3.23: A finite state transducer for reflexive pronouns
Figure 3.24: A finite state transducer for demonstrative pronouns यो jo and यी ji:
Figure 3.25: A finite state transducer for demonstrative pronouns त्यो tjo and ती ti:
Figure 3.26: A finite state transducer for demonstrative pronouns ऊ u:
Figure 3.27: A finite state transducer for remaining demonstrative pronouns
Figure 3.28: A finite state transducer for relative pronouns
Figure 3.29: A finite state transducer for interrogative pronouns
Figure 3.30: A finite state transducer for indefinite pronouns derived from interrogative pronouns
Figure 3.31: A finite state transducer for indefinite pronouns derived from relative pronouns
Figure 3.32a: A finite state transducer for definite pronouns
Figure 3.32b: A finite state transducer for definite pronouns
Figure 3.33a: A finite state transducer for reciprocal pronouns
Figure 3.33b: A finite state transducer for reciprocal pronouns
Figure 3.34: A finite state transducer for o-ending adjectives
Figure 3.35: A finite state transducer for Type 1 marked adjectives
Figure 3.36: A finite state transducer for Sanskrit loan adjectives
Figure 3.37: A finite state transducer for unmarked adjectives
Figure 3.38: A finite state transducer for cardinal numbers and regular ordinal numbers
Figure 3.39: A finite state transducer for irregular ordinal numerals
Figure 3.40: A finite state transducer for ordinal numerals from Sanskrit loan
Figure 3.41: A finite state transducer for frequency numerals
Figure 3.42: A finite state transducer for portion numerals
Figure 3.43: A finite state transducer for numeral classifiers
Figure 3.44: A finite state transducer for general classifier type 1
Figure 3.45: A finite state transducer for general classifier type 2
Figure 4.1: A finite state transducer for Type1a verb stems
Figure 4.2: A finite state transducer for Type1b verb stems
Figure 4.3: A finite state transducer for Type1c verb stems
Figure 4.4: A finite state transducer for Type1d verb stems
Figure 4.5: A finite state transducer for Type1e verb stems
Figure 4.6: A finite state transducer for Type2a verb stems
Figure 4.7: A finite state transducer for Type2b verb stems
Figure 4.8: A finite state transducer for Type2c verb stems
Figure 4.9: A finite state transducer for Type2d verb stems
Figure 4.10: A finite state transducer for inflections of non-past existential verb छ chʌ 'be' (affirmative)
Figure 4.11: A finite state transducer for inflections of non-past existential verb छ chʌ 'be' (negative)
Figure 4.12: A finite state transducer for inflections of non-past identificational verb हो ɦo 'be' (affirmative)
Figure 4.13: A finite state transducer for inflection of non-past identificational verb हो ɦo 'be' (negative)
Figure 4.14: A finite state transducer for inflections of past existential verb थि tʰi 'be' (affirmative)
Figure 4.15: A finite state transducer for inflections of past existential verb थि tʰi 'be' (negative)
Figure 4.16: A finite state transducer for inflections of non-past tense
Figure 4.17: A finite state transducer for inflections of non-past tense negative 1
Figure 4.17a: A finite state transducer for inflections of non-past tense negative 2
Figure 4.18: A finite state transducer for inflections of past tense (affirmative)
Figure 4.19: A finite state transducer for inflections of past tense (negative)
Figure 4.20: A finite state transducer for inflections of perfect aspect
Figure 4.21: A finite state transducer for inflections of imperfect aspect
Figure 4.22: A finite state transducer for inflections of habitual aspect (affirmative)
Figure 4.23: A finite state transducer for inflections of habitual aspect (negative)
Figure 4.24: A finite state transducer for inflections of inferential aspect (affirmative)
Figure 4.25: A finite state transducer for inflections of inferential aspect (negative)
Figure 4.26: A finite state transducer for inflections of imperative mood
Figure 4.27: A finite state transducer for inflections of optative mood
Figure 4.28: A finite state transducer for inflections of potential mood
Figure 4.29: A finite state transducer for inflection of absolutive form
Figure 4.30: A finite state transducer for inflections of infinitive participial form
Figure 4.31: A finite state transducer for inflections of purposive participial form
Figure 4.32: A finite state transducer for inflection of prospective participial form
Figure 4.33: A finite state transducer for inflections of durative participial forms
Figure 4.34: A finite state transducer for inflections of conjunctive participial form
Figure 4.35: A finite state transducer for conditional participial form
Figure 4.36: A finite state transducer for inflection of conditional participial form
Figure 5.1: A finite state transducer for temporal adverbs
Figure 5.2: A finite state transducer for spatial adverbs
Figure 5.3: A finite state transducer for amount adverbs
Figure 5.4: A finite state transducer for manner adverbs
Figure 5.5: A finite state transducer for frequency adverbs
Figure 5.6: A finite state transducer for reason adverbs
Figure 5.7: A finite state transducer for sentential adverbs
Figure 5.8: A finite state transducer for coordinate conjunctions
Figure 5.9: A finite state transducer for subordinate conjunctions
Figure 5.10: A finite state transducer for plural/collective marker
Figure 5.12a: A finite state transducer for adverbial postpositions that do not take emphatic marker
Figure 5.12b: A finite state transducer for adverbial postpositions that take emphatic marker
Figure 5.13: A finite state transducer for particles
Figure 5.14: A finite state transducer for interjections
Figure 6.1: A finite state transducer for noun to noun derivation
Figure 6.2: A finite state transducer for noun to adjective derivation
Figure 6.3: A finite state transducer for noun to adverb derivation
Figure 6.4: A finite state transducer for adjective to adjective derivation
Figure 6.5: A finite state transducer for noun to noun derivation
Figure 6.6: A finite state transducer for noun to adjective derivation
Figure 6.7: A finite state transducer for noun to noun/adjective derivation
Figure 6.8: A finite state transducer for noun to adjective derivation
Figure 6.9: A finite state transducer for noun/adjective to noun derivation
Figure 6.10: A finite state transducer for verb to noun derivation
Figure 6.11: A finite state transducer for verb to adjective derivation
Figure 6.12: A finite state transducer for verb to adverb derivation
Figure 6.13: A finite state transducer for noun to adjective derivation
Figure 6.14: A finite state transducer for verb to adverb derivation
Figure 6.15: A finite state transducer for verb to adverb derivation
Figure 6.16: A finite state transducer for verb to adverb derivation

List of abbreviations

+ABL = Ablative
+ABS = Absolutive
+ADJ = Adjective
+ADV = Adverb
+AMOUNT = Amount
+AUG = Augmentative
+CARD = Cardinal
+CAUSE = Causative
+CCONJ = Coordinate conjunction
+CLF = Classifier
+COM = Comitative
+COMP = Comparative
+COND = Conditional
+CONJUCT = Conjunctive
+DAT = Dative
+DEF = Definite
+DEM = Demonstrative
+DIM = Diminutive
+ALL = Directional
+DIRT = Direct
+DUR = Durative
+EMPH = Emphatic
+ERG = Ergative
+EXIST = Existential
+FEM = Feminine
+FREQ = Frequency
+GEN = Genitive
+HAB = Habitual
+HHON = High honorific
+HON = Honorific
+ID = Identificational
+IMP = Imperative
+IMPERF = Imperfect
+INDEF = Indefinite
+INF = Infinitive
+INFER = Inferential
+INST = Instrument
+INTERJ = Interjection
+INTERRO = Interrogative
IPA = International Phonetic Alphabet
+LOC = Locative
+MANNER = Manner
+MASC = Masculine
+NHON = Non honorific
+NOUN = Noun
NP = Noun phrase
+NPST = Non past
+NUM = Numeral
+OBL = Oblique
+OPT = Optative
+ORD = Ordinal
+PST = Past
+PARTICLE = Particle
+PASS = Passive
+PERF = Perfect
+PERFT = Perfective
+PL = Plural
+PLACE = Place
+PORT = Portion
POS = Parts of speech
+POSIT = Positive
+POSTP = Postposition
+POT = Potential
+PRON = Pronoun
+PROPER = Proper
+PROS = Prospective
+PROX = Proximate
+PURP = Purposive
+REASON = Reason
+RECIP = Reciprocal
+REFL = Reflexive
+REL = Relative
+RHON = Royal honorific
+SCONJ = Subordinate conjunction
+SENT = Sentential
+SG = Singular
+SPAC = Spatial
+SUPER = Superlative
+TEMP = Temporal
+VERB = Verb
+VOC = Vocative
1 = First person
2 = Second person
3 = Third person

xxvi

CHAPTER 1
INTRODUCTION

1.1 Background

This study is an attempt to analyze morphology in Nepali and to design a computational model for natural language processing within the framework of finite state technology in general and the 'two-level morphology' developed by Koskenniemi (1983) in particular. To implement the analyzed data and create a computational model, the Xerox Finite State Tool developed by Beesley and Karttunen (2003) has been employed.

Language, as a means of human communication, is a tool for expressing the greater part of human ideas and emotions. Language shapes human thought, has a structure and carries meaning. Learning and expressing new concepts and ideas through language is so natural that we hardly realize how natural language is processed in our brain. Thus, it can be claimed that there must be some sort of language representation and a processing module in the human brain (Siddiqui and Tiwari 2008:1). This type of content in the brain also helps to represent the language in the real-time world. Every time a language activity takes place, there is very fast and accurate natural language processing that results in a successful communicative event. To capture this reality, computational linguistics attempts to develop computational models of aspects of human language processing. Developing such automated tools for language processing and gaining a better understanding of human communication are the main reasons that have inspired linguists and computer scientists in this ever-growing field. In fact, language is the outer form of the content it expresses. Therefore, language processing means the processing of the content it possesses. Language is generally manifested in either written or spoken form. Both forms can be processed with the help of a computer.

To achieve this goal, a computational model of a particular language based on a formal approach needs to be designed and implemented on the computer. As computers are not able to understand natural language, computational models and methods are developed to map its content into a formal language. Such formal languages are extended to account for natural language phenomena at various levels of the language. Representing the whole body of knowledge of a language would be an ambitious project. Thus, the language can be divided into various levels such as phonology, morphology, syntax, semantics and pragmatics. Each level is perceived and defined in different ways by people in different disciplines according to the goals they set. It is not possible to create, at one time, a mega computational model that covers the entire language. Therefore, the main goal of this study is to represent and process the morphology of the written form of Nepali text and to design a computational model for it.

Nepali is an Indo-Aryan language characterized by agglutinating morphology in general. The verb is predominantly inflectional whereas the noun is heavily agglutinating. No attempts have yet been made to analyze Nepali morphology for the development of a computational model. The basic linguistic concepts relating to morphology and its representation are briefly discussed in the following subsections in order to clarify the computational aspects of morphology.¹

I. Morphology

Words are considered to be the fundamental building blocks of language (O'Grady, Dobrovolsky and Aronoff 1997:118; Jurafsky and Martin 2000:45). Every human language has words (Matthews 1991:20), and they are taken for granted, at least in descriptive linguistics (Katamba 1993:17). Of all the units of linguistic analysis, the word is the most central, familiar and crucial. The smallest free forms found in a language are said to be words (Bloomfield 1933); however, a free form that can occur in isolation may be not only atomic but also molecular in its structure. A word (i.e. word-form) can, in a real sense, be simple, complex, compound or reduplicated. Table 1.1 presents simple, complex, compound and reduplicated words in Nepali.

¹ The general concepts of the linguistic categories may need to be qualified and specified for computational purposes. Therefore, some basic concepts are briefly discussed in sections I and II.

Table 1.1: Simple, complex, compound and reduplicated words

Type | Word | IPA/gloss | Meaning
Simple | घर | gʰʌr 'house' | house
Complex | घरबाट | gʰʌr-batʌ 'house-ABL' | from house
Compound | घरपरिवार | gʰʌr-pʌriwar 'house-family' | family
Reduplication | घरघरै | gʰʌr-gʰʌr-ʌi 'house-house-EMPH' | each house

Table 1.1 shows that the word घर gʰʌr 'house' is simple; घरबाट gʰʌr-batʌ 'from house' is complex; घरपरिवार gʰʌr-pʌriwar 'family' is compound; and घरघरै gʰʌr-gʰʌr-ʌi 'each house' is reduplicated.

a. Morpheme: free and bound

The smallest (minimal) unit of grammar that carries information about meaning and function is said to be a morpheme (Bloomfield 1933; Katamba 1993; O'Grady, Dobrovolsky and Aronoff 1997; Matthews 1991). The morpheme is an abstract entity that may correspond to various forms at the surface level. A morpheme can be free or bound: it may or may not be a word by itself. Members of the lexical categories such as nouns, verbs, adjectives and adverbs are free morphemes, whereas a morpheme that must be attached to another element (normally a free morpheme) is called a bound morpheme. Free morphemes are lexical and bound morphemes are generally grammatical. Table 1.2 lists some free morphemes in Nepali.

Table 1.2: Free morphemes in Nepali

Morpheme | IPA | Gloss
घर | gʰʌr | house
जा | dza | go
असल | ʌsʌl | good
आज | adzʌ | today

The morphemes घर gʰʌr 'house', जा dza 'go', असल ʌsʌl 'good' and आज adzʌ 'today' in Table 1.2 can stand as words. The bound morphemes in Nepali cannot stand by themselves. Table 1.3 lists some bound morphemes in Nepali.

Table 1.3: Bound morphemes in Nepali

Morpheme | IPA | Gloss
-नु | -nu | -INF
-एको | -eko | -PERF
-ला | -la | -POT
-आइ | -ai | -NML

The morphemes -नु -nu 'INF', -एको -eko 'PERF', -ला -la 'POT' and -आइ -ai 'NML' in Table 1.3 cannot stand alone. They appear with other free morphemes and express some grammatical functions.

b. Root, stem, base, affixes and word

A root is the ultimate and irreducible constituent element common to all word-forms of the same family. It is not an abstract but a concrete form. The root constitutes the core of the word, carries the major component of its meaning and typically belongs to a lexical category such as noun, verb, adjective or adverb. Therefore, a root corresponds to a free morpheme (Katamba 1993:45; Payne 1997:25). For example, man, book and tea are roots. A stem is the part of the word that remains when the last affix (generally an inflectional one) is removed. Therefore, the stem minimally consists of a root and may contain further elements (Katamba 1993:45). For instance, cat is the stem in cats. A base is a word or part of a word to which an affix can be attached. It may be called a stem for inflectional purposes (Katamba 1993:45). In English, the word work can be a root, a stem and a base, whereas the word worker can only be a base for the word workers. The root, stem and base have something in common in their definitions from a linguistic perspective. A base can be a stem as well as a root, and a stem can be a root as well.² The concepts of root, stem and base thus overlap with one another from a theoretical point of view. From a computational perspective, however, the distinction among them makes no difference, especially in computer processing. So, the term 'stem' is used to represent any one of them in this study. That is, a stem is any sequence of characters to which some other sequence of characters can be attached.

² I have not discussed bound roots; see Katamba (1993) for details.

An affix is a bound morpheme that, when added to the radical element (root, stem or base), changes the meaning or function of a word by creating a new word-form. Therefore, affixes are basically involved in the inflectional and derivational phenomena of the language. Affixes can be of various kinds, but only prefixes and suffixes are discussed here for the present purpose. A prefix is an affix that is attached in front of the stem, and a suffix is one attached at the end (Katamba 1993:44). In English, the affixes re-, un- and in- in the words reunion, unhappy and intolerable are prefixes, whereas -ly, -ing, -er and -ed in the words slowly, working, walker and walked are suffixes. Thus, it is clear that, from a computational point of view, a word minimally consists of a stem and optionally one or more affixes.³ In analysis, words are decomposed into their constituents and represented according to a certain formalism, whereas in generation the process is reversed.
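The view of a word as a stem plus optional affixes can be sketched in a few lines of Python. This is only a minimal illustration based on the English prefixes and suffixes listed above. The function names analyze and generate, the affix lists and the minimum stem length of three characters are assumptions made for this sketch and are not part of the model developed in this study.

```python
# Affixes taken from the English examples in the text; everything else
# (names, MIN_STEM) is a hypothetical choice for illustration only.
PREFIXES = ("re", "un", "in")
SUFFIXES = ("ly", "ing", "er", "ed")
MIN_STEM = 3  # assumed lower bound on stem length, to limit over-stripping


def analyze(word):
    """Decompose a word into (prefix, stem, suffix); '' marks an absent affix."""
    prefix = next((p for p in PREFIXES
                   if word.startswith(p) and len(word) - len(p) >= MIN_STEM), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES
                   if rest.endswith(s) and len(rest) - len(s) >= MIN_STEM), "")
    stem = rest[:len(rest) - len(suffix)]
    return prefix, stem, suffix


def generate(prefix, stem, suffix):
    """Reverse of analysis: concatenate the constituents into a word-form."""
    return prefix + stem + suffix


for w in ("reunion", "unhappy", "slowly", "working", "walker", "walked"):
    print(w, analyze(w))
```

Such naive affix stripping over-segments many words (for example, it splits under into un- and der), which is one reason why an analyzer needs a lexicon of stems and explicit rules rather than string matching alone.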

II. Levels of representation: lexical and surface

Words in written or spoken texts represent the outer form, i.e. the surface form. But a word carries various kinds of information, which can be represented at least at two levels. The lexical level of a word consists of its canonical form, or lemma, and a set of tags showing its syntactic category and morphological features. These tags give the possible parts of speech and/or inflectional properties such as gender, number, person, tense, aspect and mood. Thus, the lexical level represents the sequence of morphemes in a certain fashion. The actual arrangement of morphemes is governed by language-specific rules. Table 1.4 presents lexical and surface level representations of words in Nepali.

Table 1.4: Lexical level and surface level representation

Lexical level | Surface level | Gloss of stem
केटो+NOUN+MASC+SG | केटो | boy
खा+VERB+P.3SG | खायो | eat
न्यून+ADJ+SUPER | न्यूनतम | least
र+CC | र | and
र+PART | र | uncertain

³ See Katamba (1993:17-23) for details.

Table 1.4 presents the lexical level representation, which consists of the sequence of morphemes attached to the stem and results in a certain word-form at the surface level. Thus, the representation at the surface level in Nepali corresponds to the actual spelling of the word. In this study, an attempt has been made to represent a word in Nepali at these two levels. The pairing of the lexical level and the surface level can be taken as a relation between two languages and can be used as a morphological analyzer or generator simply by changing the direction of the transitions. Figure 1.1 illustrates the two levels and the process of analysis and generation of words in Nepali.

LEXICAL LEVEL : केटो+NOUN+MASC+SG
SURFACE LEVEL : केटो

Figure 1.1: Lexical and surface levels of the Nepali word केटो ket̺o 'boy'

In Figure 1.1, for instance, moving from the surface level केटो to the lexical level केटो+NOUN+MASC+SG represents morphological analysis, and the reverse represents generation.
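The idea that one relation serves both as analyzer and as generator can be illustrated with a small Python sketch. It assumes that the relation is stored as a plain list of lexical/surface pairs copied from Table 1.4; a real finite state transducer encodes the same relation compactly as states and transitions, so the list and the function names analyze and generate are only illustrative assumptions.

```python
# Lexical/surface pairs copied from Table 1.4. Storing them as a list is an
# assumption of this sketch; a transducer would encode the relation as a network.
PAIRS = [
    ("केटो+NOUN+MASC+SG", "केटो"),
    ("खा+VERB+P.3SG", "खायो"),
    ("र+CC", "र"),
    ("र+PART", "र"),
]


def generate(lexical):
    """Lexical level -> surface level (generation)."""
    return sorted({surface for lex, surface in PAIRS if lex == lexical})


def analyze(surface):
    """Surface level -> lexical level (analysis): the same relation, inverted."""
    return sorted({lex for lex, surf in PAIRS if surf == surface})


print(generate("खा+VERB+P.3SG"))
print(analyze("र"))
```

Both functions return lists because the relation is not one-to-one: the surface form र corresponds to two lexical strings in Table 1.4, so analysis of that form is ambiguous.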

1.2 Statement of the problem

Nepali is a morphologically rich language. A number of morphological studies of Nepali exist, most of them descriptive in nature. There are also a few works from a computational point of view (see the review of literature). However, the morphology of Nepali has not yet been fully analyzed from a computational perspective. The main problem of this study is to analyze the morphology of Nepali in a way that can be implemented computationally. The specific problems of this study are as follows:

a. What are the morphological categories in Nepali?
b. What are the morphological processes in the language?
c. What are the rules involved in the morphological processes?
d. What is the computational model for morphology in Nepali?

1.3 Objectives of the study

The main objective of this study is to analyze the morphology of Nepali from a computational perspective. The specific objectives of the study are as follows:

a. To identify the morphological categories in Nepali;
b. To identify the morphological processes in the language;
c. To formulate the phonological/orthographic rules in Nepali; and
d. To design and develop the computational model for Nepali morphology.

1.4 Review of literature

Only a few works address Nepali morphology from a computational perspective. However, there are a number of works on Nepali morphology from traditional and descriptive perspectives. These works contribute to the understanding of the main problem of the study to some extent. Such works have been thematically reviewed in four groups.⁴

⁴ The literature available is not directly related to this study. However, it provides the knowledge needed to understand the research problems. Therefore, instead of being evaluated critically in the usual review style, the works are presented in terms of their contributions to this study under four themes: (i) Nepali morphology, (ii) Nepali computational morphology, (iii) Nepali language and related NLP works, and (iv) NLP works in selected languages.

a. Nepali morphology

Pandit (2051 VS (1969 VS)) has classified word categories in Nepali from a traditional point of view. The categories include noun, pronoun, adjective, verb and indeclinable. Nouns are grouped into common nouns, proper nouns and abstract nouns. Each noun is discussed with respect to gender (masculine, feminine and neuter), number (singular and plural), and cases and case markers (subjective, objective, instrumental, dative, ablative and locative). He has also presented a detailed inflectional paradigm of nouns and verbs.

Bandhu (1973) has analyzed the clause patterns of Nepali using a tagmemic approach. The basic clause patterns are sub-classified and illustrated with examples. Under the inflected patterns, he has analyzed the inflectional categories, the inflectional system, the mood and finite system, aspects and copulas, modals, negation and post-verbal particles. The paradigms for each inflectional category are presented along with illustrations in sentences.

Dahal (1974) is an extensive description of colloquial and literary Nepali. He has classified the stem formation process into three classes, namely derived stems, composite stems and reduplicated stems, and has discussed the derived stems as suffix-derived noun stems, suffix-derived adjective stems, suffix-derived verb stems, suffix-derived adverbs, modification-derived stems and prefix-derived stems. Inflectional categories and their realizations have been described under two headings, namely nominal inflections and verbal inflections.

Adhikari (1980) has examined a set of Nepali verbs ending in ch [tsʰ] and the relation of ch with the time of speech. There can be other elements between the verb stem and the element ch; ch always refers to the non-past tense and is always followed by a concord marker.

Sharma (1980) has presented the verbal structure, from a descriptive approach, in the formulation V → stem ((+Aspect + BE) + Tense) (+Neg) + concord. Of these elements, only the stem and the concord are obligatory; the others are controlled by other constraints. The verbal stems are divided into simple and complex. The morphophonemic changes that occur when the suffixes for tense, aspect, negation and concord are attached to the simple and complex stems are discussed at length.

Wallace (1985) is an attempt to test Nepali data against Relational Grammar and Government and Binding theory. It is mentioned that Nepali nouns show number and that case relations are indicated by postpositions. Finite verb forms indicate tense, aspect, and affirmation or negation, and they agree with the subject. Adjectives are discussed as noun phrase modifiers which agree with the head noun in number and gender.

Chapagain (2046 VS) has also classified Nepali words into nouns, pronouns, adjectives, verbs, adverbs and conjunctions from a traditional and descriptive point of view. The word formation processes such as prefixation, suffixation, compounding and reduplication are well discussed with illustrations.
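The verbal template attributed to Sharma (1980) above can be unpacked by enumerating the slot sequences it licenses. The following Python sketch assumes one particular reading of the bracketing: the group (Aspect + BE) is optional inside an optional Tense group, Neg is optional, and stem and concord are obligatory. That reading, and the function name verb_templates, are assumptions of this sketch rather than statements from Sharma (1980).

```python
from itertools import product


def verb_templates():
    """Enumerate slot sequences for: stem ((+Aspect + BE) + Tense) (+Neg) + concord."""
    # Optional tense group: absent, Tense alone, or Aspect + BE + Tense.
    tense_group = [[], ["Tense"], ["Aspect", "BE", "Tense"]]
    # Optional negation slot.
    neg_slot = [[], ["Neg"]]
    return [["stem", *tense, *neg, "concord"]
            for tense, neg in product(tense_group, neg_slot)]


for template in verb_templates():
    print(" + ".join(template))
```

Under this reading the template yields six slot sequences, from the minimal stem + concord to the maximal stem + Aspect + BE + Tense + Neg + concord.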

Acharya (1991) is a corpus-based study. He has classified the form classes into inflected forms (noun, adjective, pronoun and verb) and uninflected forms (adverb, conjunction, postposition, interjection and nuance particle). Nouns are discussed with respect to number (singular and plural) and case (nominative, accusative, instrumental, dative, ablative, genitive and locative). Adjectives are discussed with reference to their endings. The verbs are discussed according to the inflectional suffixes for present, past and future tenses with their corresponding number, person, gender and honorificity. The verbs can have simple and compound stems. The adverbs are placed under the uninflected form class and described in terms of comparative and superlative structures with examples from the corpus. The conjunctions, postpositions, interjections and particles are also discussed.

Adhikari (1993) has classified Nepali words into nouns, pronouns, adjectives, verbs, adverbs, postpositions, conjunctions and interjections from a descriptive approach. The agreement system is extensively discussed with respect to gender, number, person and five levels of honorificity. The classification of the verbs and the treatment of stem formation, inflection and derivation are of great help in designing the alternation rules.

Adhikari (2052 VS) is a study of the Nepali case system using the Fillmorean framework. He has classified the Nepali cases into two main categories: core cases and peripheral cases. The former include semantic cases such as agent, affected, resultative, neutral, experiencer, recipient and essive, and the latter include cases such as locative, instrumental, cause, ablative, beneficiary, purposive and comitative.

Pokharel (2054 VS) has presented an analysis of the morphological and syntactic levels of voice, causativization, tense, aspect and mood in simple verbs, compound verbs and negation from a descriptive approach. The classification of the verbs, agreement, honorificity, various kinds of classifiers, gender, number and case in nouns, and the grammar of the pronouns have been described and illustrated.

Lohani (1999) has studied complex predicates in Nepali using the theoretical framework of Lexical Functional Grammar. He has classified complex predicates into nominal, verbal, adjectival and adverbial complex predicates.

Sharma (2056 VS) has discussed nine part-of-speech categories, namely nouns, pronouns, adjectives, adverbs, verbs, postpositions, conjunctions, interjections and particles, extensively from a traditional and descriptive approach. Each class is further sub-classified and discussed with illustrations.

Acharya (2058 VS) has classified words into various classes on different bases, viz. original and loan words; underived and derived; compound and reduplicated; declinable and indeclinable, from a traditional approach. Nouns, pronouns, adjectives, verbs and adverbs are discussed with illustrations.

Dhakal (2058 VS) has analyzed Nepali numerical words from a historical perspective. He has compared the Nepali word-forms with Sanskrit and Prakrit and has shown the changes that occurred during the evolutionary period. He has also analyzed numerical words into their component parts and has observed the sound changes.

Pokharel (2010a) has analyzed the noun class agreement system in Nepali from a descriptive point of view. On the basis of the analysis, Nepali nouns are grouped into eleven agreement classes. Gender assignment in Nepali is 'strictly semantic'. The use of classifiers for gender distinction is a unique feature not commonly found in languages. Nepali has based nominal agreement on the human vs. non-human distinction.

Pokharel (2010b) has presented various strategies to derive the verb root in Nepali. Derivation from the citation form, the imperative singular form and the probabilitative singular form are compared. None of the strategies can derive all the verb roots, so he has proposed applying the mathematical strategy of generative phonology to this problem. According to this strategy, if all the verb forms of a root are taken together and their highest common factor is calculated, the result is the root form, and this is the general formula of verb root derivation in Nepali.
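The 'highest common factor' strategy reported for Pokharel (2010b) can be approximated, for illustration, by taking the longest common prefix of all the forms of a verb. Treating the highest common factor as a longest common prefix is an assumption of this sketch, as are the function name common_root and the romanized forms used in the example, which are illustrative transliterations rather than data from the cited work.

```python
from os.path import commonprefix


def common_root(forms):
    """Return the longest string shared by all forms at their left edge.

    os.path.commonprefix compares strings character by character, so it can
    be used on ordinary word-forms as well as on paths.
    """
    return commonprefix(list(forms))


# Illustrative romanized forms of one verb (assumed for this example).
forms = ["garchu", "garyo", "garnu", "garera"]
print(common_root(forms))
```

A prefix-based reading only works when every form keeps the root unchanged at its left edge; forms with stem alternation or suppletion would need the rule component discussed in this study.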

b. Nepali computational morphology

Keshari et al. (2005) have discussed the development of a rule-based system that guesses the part of speech of Nepali words in a raw corpus without the use of a lexicon. The system uses linguistic information at the morphological level and guesses the POS by looking at the affixes. The system has three modules, namely a lexicon maintainer module, a rule maintainer module and a POS guesser module. The modules interact with the lexicon database, the guessing rules and the corpus.

Upadhyaya et al. (2005) have developed a morphological analyzer for the Nepali language. Finite state technology is used in designing the analyzer, but it handles only the surface forms and not the lexical level. It also lacks a detailed description of the various aspects dealt with in the process.

Aryal et al. (2006) have developed a system that produces parsed text with the maximum possible POS tags from a computational perspective. The process consists of three phases: tokenizing into syllables, morphological analysis and disambiguation.

Paudel et al. (2006) have developed a morphological analyzer along with a spell checker. Nepali words are categorized into two major types, namely declinable words and indeclinable words, which are further grouped into subclasses. The morphological analyzer consists of a root word dictionary and a rule dictionary. In the main engine, the root dictionary and the rule dictionary interact with one another at various levels, and spell checking is done in a coordinated manner.

Bal (2007) is an attempt to analyze the structure of Nepali grammar from a computational perspective. Even though the main focus of the paper is on the morphological and syntactic aspects of the Nepali language, it also gives some space to the writing system of Nepali. Despite being a novel start, it lacks detailed and deeper observation of the morphological structure of words in Nepali.

Bal and Shrestha (2007c) discuss the design and implementation issues as well as the linguistic aspects of a morphological analyzer and a stemmer for the Nepali language. The stemming algorithms and their limitations are also discussed.

Bal and Shrestha (2007b) present a stemmer for the Nepali language. This stemmer is especially designed to assist morphological analysis and part-of-speech tagging. Based on the paradigm approach, the stemmer is capable of splitting words into meaningful units.

Aryal et al. (2007) have presented the techniques used for syntactic and semantic disambiguation in the Nepali language. Parsing is done with two components, namely a tokenizer and a morphological analyzer. The work includes both syntactic and semantic levels.

Prasain (2008) is an attempt to analyze the Nepali basic verbs from a computational perspective. The basic verbs are classified into two broad categories in terms of their ending, viz. consonant ending and vowel ending. The former group has two types of ending, voiced and voiceless, and the latter has five types: i-ending, u-ending, a-ending, ʌ-ending and vowel sequence ending. The analysis uses finite state technology; since it is preliminary in nature, it does not cover all aspects of the verbs.

Shrestha (2008) has developed a system that disambiguates Nepali word senses from a natural language processing perspective. He has used a modified Lesk algorithm and the WordNet. Nepali word sense disambiguation is carried out in four stages, namely (i) tokenization, (ii) context selection, (iii) finding the senses of the target word and (iv) sense identification.

Hardie (2008) has analyzed Nepali postpositions by applying a collocation-based technique to the categorization of postpositions, from a corpus linguistic perspective, using the Nepali National Corpus. He has examined the most significant collocations of several postpositions for patterns that characterize the postposition as a category or categories. Collocation with semantically coherent nouns and collocation with words for which the postposition functions as a subcategorizer are identified.

Hardie et al. (2009) have described the linguistic rationale underlying the part-of-speech tagset used for tagging the Nepali National Corpus. The implementation of the tagset in an automated tagging system is also outlined. This work further supports the classification of words into various groups for designing a finite state transducer for each group.

Prasain (2010) is an attempt to analyze Nepali basic nouns and implement them on the computer using a finite state approach. Various noun characteristics are analyzed: number, gender, form, honorificity, augmentative/diminutive and significant stem finals. On the basis of these features, Nepali basic nouns are grouped into fourteen classes. Each group of nouns is implemented following the xfst format (Beesley and Karttunen 2003): a lexc grammar is used to create the lexicon finite state transducer and the xfst interface to create the rule finite state transducer. Finally, these finite state transducers are composed into a single finite state transducer which can be used directly to analyze and generate the basic nouns.
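The composition of a lexicon transducer with a rule transducer, as described for Prasain (2010), can be pictured with ordinary Python functions standing in for the two transducers. This is not lexc or xfst code. The lexical string घर+NOUN+ABL (modelled on the notation of Table 1.4 and the list of abbreviations), the boundary symbol ^, and the names lexicon_fst, rule_fst and compose are all assumptions made for this sketch; only the word घरबाट 'from house' comes from Table 1.1.

```python
# Hypothetical miniature lexicon: lexical stem entry -> stem spelling.
LEXICON = {"घर+NOUN": "घर"}
# Hypothetical suffix tags -> intermediate morphs; '^' marks a morpheme boundary.
SUFFIXES = {"+ABL": "^बाट"}


def lexicon_fst(lexical):
    """Stand-in for the lexicon transducer: lexical string -> intermediate form."""
    for entry, stem in LEXICON.items():
        if lexical.startswith(entry):
            rest = lexical[len(entry):]
            if rest == "":
                return stem
            if rest in SUFFIXES:
                return stem + SUFFIXES[rest]
    return None  # not licensed by the lexicon


def rule_fst(intermediate):
    """Stand-in for the rule transducer: here it only deletes the boundary symbol."""
    return intermediate.replace("^", "")


def compose(first, second):
    """Chain two mappings, mimicking the composition of two transducers."""
    def composed(x):
        y = first(x)
        return None if y is None else second(y)
    return composed


generate = compose(lexicon_fst, rule_fst)
print(generate("घर+NOUN+ABL"))
```

In an actual finite state toolkit the composition is computed once over the networks themselves, and the result can be applied in both directions; the function chaining above only shows the order in which the lexicon and the rules apply.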

c. Nepali language and related NLP works

Bandhu (1971) is a computer concordance of a Nepali spoken corpus. The corpus has been morphologically analyzed and forms are segmented according to their functions. Most of the data were collected from Palpa district (January, 1971), Syangja and Pokhara (January, 1970) and Gorkha district (April-May 1971). This corpus is now available at [http://cqpweb.lancs.ac.uk/bandhu/index.php], together with information such as title, speakers list, text type and POS tags based on the Nelralec tagset.

Gurung and Khatiwada (2007) is an attempt to analyze the Devanagari script used in the Nepali writing system for the collation sequence. The study also discusses the development process of a lexicon for the Nepali language. To make the lexicon computer readable, the XML format is used. Each entry is provided with pronunciation, syllable break, part of speech, meaning and synonyms. The Hunspell framework is used so that the lexicon can serve the spellchecker in OpenOffice.

Bista et al. (2007) have presented the Nepali lexicon development process for use in a spell checking system for the Nepali language. This paper also reports on the collection of words, the tools used, and the problems and issues faced during the process of lexicon development. The architecture consists of various modules such as the Lexicon database, Lexicon maintainer, Rules database, Rule maintainer, Corpus and Rule Interpreter. These modules interact with the other modules concerned as required during the process of developing the lexicon.

Bal and Shrestha (2007a) have developed a simple spellchecker to be used in Nepali OpenOffice.org. The head words are stored in one file and the affix rules in another.

Bal et al. (2007) give a general overview of the technical and linguistic research and development being carried out for the development of the Nepali spellchecker 1.1 for OO.org using the HunSpell framework with Unicode support. The dictionary is populated with stems of nouns, verbs, pronouns, adjectives, adverbs, conjunctions, interjections, particles, postpositions and compound words. The possible word-forms are generated by applying the affix rules.

Hyoju and Shrestha (2007) present an overview of the contemporary Nepali dictionary based on the Nepali National Corpus. The components incorporated in this dictionary are the headword, part of speech, phrase category, guide word, pragmatics, definition, example, usage note, various form, suppletive form, extra information, phrase, idiom, compound, proverb and cross-reference.

Gurung and Thapa (2007) have described the process of building text-to-speech for the Nepali language from a speech processing perspective. The multilingual speech synthesis system known as Festival has been used. A component of Festival called Festvox provides a framework for building synthetic voices. Corpus-based rule generation and statistical modeling methodologies are used. The sentences for building the speech database are taken from the Nepali National Corpus. The normalized text is fed to a module where the wave form is generated using letter-to-sound rules, concatenation of diphones and pitch extraction.

Yadava et al. (2008) describe the construction of the 14-million-word Nepali National Corpus (NNC) (http://cqpweb.lancs.ac.uk/nncv2/index.php), which includes a spoken corpus, a written corpus, a parallel corpus and a speech corpus. The NNC is encoded as Unicode text, marked up in CES-compatible XML, and follows the FLOB and Frown frameworks.

d. NLP works in some selected languages

Megerdoomian (2003) provides a detailed description and analysis of Persian inflectional morphology from a computational perspective. The morphological analyzer designed for the Persian language uses a unification-based grammar with typed feature structures. The main tasks are the linguistic analysis and its implementation in the Samba Grammar for developing the morphological analyzer. The surface form is formally represented as a regular expression. The morphological features are specified as a feature structure that contains the lexical and inflectional information provided by the rule. These features describe how the stem and the morphological features of the affixes are combined.

Hussain (2004) has developed a finite-state morphological analyzer for Urdu. She has described general morphological concepts such as morpheme, root, base, affix, inflection, derivation and causation. The analysis follows the two-level morphology formalism using finite-state transducers.

Makedonski (2005) is a finite state approach to the inflectional morphology of Turkish nouns. A finite state transducer is used for analyzing the nouns, and the implementation is done in the Xerox Finite State Toolbox at two levels, namely the lexicon and the rule component.

Ziai (2006) has developed a finite state morphological analyzer for Persian simple verbs. The system presented covers the full inflectional paradigm of modern Persian for regular verbs and for a large number of irregular verbs.

Islam (2007) describes the inflectional verb and noun morphology of Bangla and also presents the rules, lexicons and grammar for Bangla morphological analysis. This analysis is based on PC-KIMMO, a two-level morphological analyzer.

Dasgupta et al. (2007) have discussed the inflectional behavior of compound words in Bangla from a computational perspective. Bangla compound words may retain the inflectional suffixes on both constituents, and the resultant compound may further be inflected as a single word.

Khan and Fatima (2007) investigate the inflectional properties of Pashto nouns from a finite state perspective. The main focus is on the classification of the Pashto nouns. A finite state transducer is used for analyzing the Pashto nouns.

Bharati and Kulkarni (2007) have discussed the importance of Paninian grammar from the perspective of information coding. The study has applied finite state technology. The theoretical and practical aspects of computational linguistics concerning Hindi and Sanskrit and the application of the Paninian approach to English are highlighted. The complexity of word formation in Sanskrit is captured by finite state automata, and an analyzer for Sanskrit has been developed which provides output with morphological analysis.

Bögel et al. (2007) discuss a number of issues, in particular potential ambiguity and non-concatenative morphology. Their approach deals with the treatment of both Urdu and Hindi via a cascade of FSTs that transliterates the very different scripts into a common ASCII transcription system; the implementation of the analysis is based on the Xerox finite state toolkit.

Shrivastava et al. (n.d.) have developed a rule-based part-of-speech tagger for Hindi with a stemmer and a morphological analyzer. The stemmer and morphological analyzer are integrated with the Hindi WordNet, Hindi Generation and Question Answering projects.

1.5 Significance of the study

It is clear from the literature review (1.3) that the Nepali language has primarily been described from two approaches: notional and descriptive. Very few, and only sporadic, works have been done from a formal and computational perspective. The number of natural language processing works is growing in many languages of the world, and in South Asian languages in particular. In this context, the Nepali language is lagging behind. Therefore, there is an urgent need to develop computational models and implement them in computing processes so that various kinds of computer applications such as spell checkers, grammar checkers, part-of-speech taggers and syntactic parsers can be developed. When applications related to the Nepali language are developed, end users who are seeking information stored in Nepali texts will especially benefit. In this regard, this work can be foundational and very useful. The computational analysis of Nepali morphology would be a central and essential component for the development of various other Nepali language processing applications.

Further analysis of linguistic levels such as syntax, semantics and pragmatics can also be done taking this work as the reference point. Therefore, this study can be of great importance in itself and useful for both academicians and practitioners of natural language processing.

1.6 Research methodology

The methodology consists of data collection, classification, analysis and implementation.

Data collection: The study is primarily based upon secondary data for morphological analysis; however, the example sentences used for illustration are elicited. The secondary data, especially the word-forms, have been taken from the Nepali National Corpus developed by Bhashanchar Project (Nelralec) and cross-checked against 'Brihad Nepali Sabdakosh' (Pokharel et al., 2040 VS). Being a native speaker of Nepali, I have also used my intuition for cross-checking and analyzing the data.

Classification: The unique word-forms are classified into different categories such as nouns, verbs, adjectives, etc., and further subdivisions have been made according to their morpho-syntactic behavior.

Analysis: The classified data are analyzed into stems and affixes for each category, and the inflectional and derivational processes have been treated separately. Then phonological rules have been identified and formalized.

Implementation: Finite state transducers for each group of words have been created following the concept of ‘two-level morphology’. Then, a computational model for Nepali morphology has been implemented using the Xerox Finite State Tool (XFST) described by Beesley and Karttunen (2003). See Chapter 2 for a detailed description of the theoretical framework.

1.7 Limitations of the study

This study deals mainly with the written form of words in Nepali. Only the inflectional and derivational aspects of words are treated. Compounding and reduplication are also morphologically important, but they are not dealt with in this study. Although there are a number of models and approaches for computational analysis in the literature, only the finite state approach is employed here. Moreover, only representative words have been considered in the implementation of the analysis.

1.8 Organization of the study

This study is structured into eight chapters. Chapter 1 introduces the concept of computational analysis of word-level categories following two-level morphology. It also deals with the statement of the problem, objectives, review of literature, research methodology and justification of the study. Chapter 2 presents the theoretical framework for the study. Chapter 3 looks into the general characteristic features of nominals in Nepali, analyzes them computationally and presents a finite state transducer for each of them. It also deals with the phonological rules in the form of regular expressions, which are implemented later on. Chapter 4 discusses the various features possessed by verbs and presents the finite state transducers and phonological rules involved in verb morphology. Chapter 5 analyzes adverbs, conjunctions, case markers, postpositions, particles and interjections, and presents separate finite state transducers for each category. Chapter 6 presents the analyses of the various derivational systems in Nepali, together with the corresponding finite state transducers and phonological rules. Chapter 7 implements all the analyses done in the preceding chapters. Chapter 8 summarizes the study. Finally, a number of annexes are provided.


CHAPTER 2 THEORETICAL FRAMEWORK

2.0 Outline

This chapter presents the theoretical framework employed in this study. It consists of eight sections. Section 2.1 deals with the general computational concepts. Section 2.2 deals with the concept of regular expressions. Section 2.3 briefly discusses finite state technology. Section 2.4 briefly introduces regular languages. Section 2.5 introduces finite state machines, including finite state automata, finite state transducers and some relevant and important operations that can be performed on finite state transducers. Section 2.6 deals with the use of finite state transducers in the computational morphology of natural language. It also briefly shows the relation among regular expressions, regular languages and finite state networks. Section 2.7 discusses the basic concepts and application of the Xerox finite state tool (xfst) used in this study to develop a computational model of Nepali morphology that can be used as a morphological analyzer. Section 2.8 presents the summary.

2.1 Computational concepts

Finite state morphology has been an important and active field of research and development for a number of decades. A Natural Language Processing (NLP) system remains incomplete without morphological analysis. Words are the units of syntax, and the meaning of words is the basis of semantics. In fact, the input to syntactic and semantic analysis comes from morphology. Therefore, the morphological analysis of a natural language is important and fundamental. Relating word forms and detecting the structure of word forms are what morphological analysis is all about. The task of relating a given form to a canonical form is called lemmatization. Both lemmatization and the decomposition into parts have their uses, and they share some common processes. The task of morphological analysis, then, is to take word forms and relate them to other word forms, at the same time deriving featural information about the form (Roark and Sproat 2006).


It is customary in discussions of morphology to talk about inflectional versus derivational morphology in terms of the kinds of features each of these encodes. This distinction is not relevant for the discussion here. Rather, we concentrate purely on the computational mechanisms for performing morphological analysis and the way these mechanisms represent the two kinds of linguistic information. The crucial aspects are the formal properties of morphological operations, viz. the syntagmatic combination of morphological elements and the paradigmatic relation between forms (Roark and Sproat 2006). To realize this objective, one needs to understand some mathematical and computational notions and operations, which are introduced in the subsequent sections.

2.2 Regular expression

Regular expressions are the standard notation for characterizing text sequences, and they are used for specifying text strings when searching text (Jurafsky and Martin 2000:48-59). They are widely applied in natural language processing activities such as information retrieval, word processing and the computation of frequencies from corpora. A regular expression is a formula in a special language that is used for specifying classes of strings. For the purpose of most text-based search techniques, a string is any sequence of alphanumeric characters or symbols. The set of strings denoted by a regular expression shares a pattern, which is the value of the algebraic formula (Siddiqui and Tiwari 2008:54-9). Regular expressions are written between slashes to distinguish them from ordinary sequences of characters. Table 2.1 lists some of the simplest regular expressions and their matches in the text.

Table 2.1: The sample regular expressions

Regular expression     Example pattern matched
/a/                    There is a dog.
/book/                 I have read many books.
/घर/ 'house'           तिमी घरमा बस। 'You stay at home.'
/म/ '1SG'              मलाई भोक लाग्यो। 'I am hungry.'


The simple regular expressions in Table 2.1 are used for searching text. Each regular expression in the left column matches the corresponding substring of the text in the right column. Formally, regular expressions are an algebraic notation for characterizing a set of strings. Thus, they can be used to specify search strings as well as to define a language in a formal way. Characters are grouped by putting them between square brackets, which specify the disjunction of the characters they enclose. For example, the pattern /[कखगघङ]/ matches any one of these characters. A dash '-', which specifies a range, can be used when the set of characters within the brackets is very large. For example, [a-z] specifies any lowercase letter of the Latin alphabet and [0-9] specifies any digit from 0 to 9. Some important operators used in regular expressions, with patterns and their meanings, are listed in Table 2.2.

Table 2.2: Some operators used in regular expressions

Operator    Pattern        Meaning
[]          [abc]          a or b or c
[-]         [A-Z]          any one of the capital letters
^           [^a-z]         not a lowercase letter
*           ab*c           zero or more bs
.           a.c            any single character between a and c
?           ab?c           zero or one b between a and c
+           ab+            one or more bs
|           a|b            either a or b
()          appl(y|ies)    apply or applies
{n}                        n occurrences
{n,m}                      from n to m occurrences
\n                         a new line
\t                         a tab

Source: Jurafsky and Martin 2000

The operators illustrated in Table 2.2 are used for creating the complex regular expressions.


Simple regular expressions, i.e., alphanumeric characters, and the various operators can be combined to construct a regular expression as complex as required.¹ It is now clear that a regular expression search requires a pattern to search for and a corpus of texts to search through. A regular expression search function then searches through the corpus and returns all the texts that match the pattern. Thus, a regular expression specifies a language according to its pattern, and the complexity of that language depends on the complexity of the pattern. However, the regular expression is more than just a convenient metalanguage for text searching. Firstly, a regular expression is one way of describing a finite state automaton; that is, a regular expression can be compiled into a finite state automaton. Secondly, it is a way of characterizing a particular kind of formal language called a regular language (see 2.6).
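The search procedure described in this section can be sketched in Python, whose standard `re` module supports the operators listed in Table 2.2. This is only an illustrative sketch: the small corpus and the helper name `search_corpus` are assumptions made for the example and are not part of the study.

```python
import re

# A tiny illustrative corpus (hypothetical sentences, not taken from the NNC).
corpus = [
    "There is a dog.",
    "I have read many books.",
    "She applies the rule.",
    "They apply the rules.",
    "तिमी घरमा बस।",
]

def search_corpus(pattern, texts):
    """Return every substring of every text that matches the pattern."""
    regex = re.compile(pattern)
    hits = []
    for text in texts:
        hits.extend(match.group(0) for match in regex.finditer(text))
    return hits

print(search_corpus(r"book", corpus))         # plain string pattern
print(search_corpus(r"appl(y|ies)", corpus))  # grouping and disjunction
print(search_corpus(r"घर", corpus))           # Devanagari pattern
print(search_corpus(r"[0-9]+", corpus))       # range and '+': no digits here
```

The function returns the matched substrings rather than whole sentences, which makes it easy to see exactly which part of the text a pattern denotes.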

¹ A detailed description of how a complex regular expression is created is beyond the scope of this study.

2.3 Finite state technology

In order to understand how to build a linguistic application, we first need to be acquainted with the basics of how a finite-state machine works. A finite-state machine is a network of states, including one start state and one or more final states. Transitions between states are possible only if the required input is recognized. A path is a sequence of transitions over arcs to a particular state. In computational morphology, a path is a sequence of symbols equivalent to a word in natural language. A technology that uses finite-state networks in the process of creating an application is therefore called finite state technology. The basic concept behind finite state technology is a set of states with different properties and a set of arcs that connect these states. The arcs have a direction and an input symbol; that is, each state has a set of outgoing arcs with their respective input symbols. The states and arcs together form a network.

As Chomsky (1957) stated, finite state devices are limited in generative capacity, i.e., in the power to accurately describe all natural language phenomena. Therefore, finite state technology was considered to be inefficient by linguists at the earlier stages of its development. One reason was that it is a formal, abstract mathematical device, so it was believed not to have the descriptive power needed for natural language analysis. The second reason was that, in its developing stages, it was not really powerful enough to account for linguistic phenomena. Later on, however, it proved to be quite useful in modeling the parts of languages that can be considered finite and regular. Natural language shows this quality of being regular and finite at least in parts, if not as a whole. Various tasks such as POS disambiguation, tokenization and shallow parsing are successfully accomplished using finite state technology. Morphology is a core component of natural language, and it can be considered more or less finite and regular. Thus, the most significant application of finite state technology has been computational morphology, in which both analysis and generation of the morphology of a natural language are performed. Computational morphological analysis has been the basis for any further kind of natural language processing.

2.4 Regular language

As discussed in section 2.2, a regular expression denotes (specifies or describes) a regular language using a specified pattern. A formal language is a set of strings, each of which is composed of symbols from a finite symbol set called an alphabet (Jurafsky and Martin 2000:48). A regular language is a formal language, i.e., a possibly infinite set of finite sequences of symbols from a finite alphabet, that satisfies particular mathematical properties: "A class of languages that are definable by regular expression is exactly the same as the class of languages that are characterized by finite-state automaton is said to be a regular language" (Jurafsky and Martin 2000:75). Table 2.3 illustrates a regular language from Nepali defined by a regular expression.


Table 2.3: Regular expressions and regular language

Regular expression: /खा.*छ.*/ 'kʰa.*tsʰ.*'

Regular language: खान्छ kʰantsʰʌ, खान्छन् kʰantsʰʌn, खाइन्छ kʰaintsʰʌ, खाएछ kʰaetsʰʌ, खान्छौँ kʰantsʰʌũ, खान्छु kʰantsʰu, खाइरहन्छ kʰairʌɦʌntsʰʌ, खानुहुन्छ kʰanuɦutsʰʌ, खाएछन् kʰaetsʰʌn, खाएँछु kʰaẽtsʰu, खान्छौ kʰantsʰʌu, खाँदैछस् kʰãdʌitsʰʌs, खान्छेस् kʰantsʰes, …..

The set of word forms listed in Table 2.3 is a regular language denoted by the regular expression /खा.*छ.*/ 'kʰa.*tsʰ.*'. All the strings (words) matched by this regular expression share the same pattern.

2.5 Finite state machine

2.5.1 Finite state automata (FSA)

A finite state automaton is an abstract mathematical device. As discussed in (2.3), it consists of states and arcs called transitions. Each FSA has exactly one initial state and one or more final states. Between the initial and final states, there can be any finite number of states called intermediate states. The transitions are the connections between these states and are thus responsible for moving from one state to another. Conventionally, the states are represented as circles and the transitions between them as labeled arcs; an arrow indicates the initial state and double circles indicate the final states. Finite state automata are best understood as recognizers because they accept a specified set of input strings. For illustration, an automaton that accepts the Nepali strings घर ‘house’ and घरहरू ‘houses’ is visualized in Figure 2.1.

q0 --घ--> q1 --र--> q2 --ह--> q3 --र--> q4 --◌ू--> q5

Figure 2.1: A finite state automaton that accepts घर ‘house’ and घरहरू ‘houses’


The finite state automaton shown in Figure 2.1 recognizes words by reading the input string symbol by symbol and matching the symbols to the labels on the arcs. This FSA accepts घर ‘house’ and घरहरू ‘houses’ because these inputs lead to final states. No other strings are accepted by this FSA. Formally, a finite state automaton can be defined by the following five parameters:

Q: a finite set of N states q0, q1, … qN
Σ: a finite input alphabet of symbols
q0: the start state
F: the set of final states, F ⊆ Q
δ(q,i): the transition function or transition matrix between states. Given a state q ∈ Q and an input symbol i ∈ Σ, δ(q,i) returns a new state q′ ∈ Q. δ is thus a relation from Q × Σ to Q (Jurafsky and Martin 2000:62).

For the automaton in Figure 2.1, Q = {q0, q1, q2, q3, q4, q5}, Σ = {घ, र, ह, ◌ू}, F = {q2, q5} and δ(q,i) is defined by the transition table in Table 2.4.

Table 2.4: The transition table for घर and घरहरू (a colon marks a final state)

State    Input
         घ    र    ह    ◌ू
0        1    ø    ø    ø
1        ø    2    ø    ø
2:       ø    ø    3    ø
3        ø    4    ø    ø
4        ø    ø    ø    5
5:       ø    ø    ø    ø
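The five-parameter definition and the transition table can be turned into a small recognizer. The following Python sketch is only an illustration of the idea, not the implementation used in this study; the names `DELTA`, `START`, `FINAL` and `accepts` are chosen for the example.

```python
# Symbols are given as Unicode escapes so that each one is a single code point.
GHA, RA, HA, UU = "\u0918", "\u0930", "\u0939", "\u0942"  # घ, र, ह, vowel sign ू

# Transition function: (state, input symbol) -> next state.
# A missing entry corresponds to an empty cell (ø) in the transition table.
DELTA = {
    (0, GHA): 1,
    (1, RA): 2,
    (2, HA): 3,
    (3, RA): 4,
    (4, UU): 5,
}
START = 0
FINAL = {2, 5}

def accepts(word):
    """Return True if the automaton is in a final state after reading word."""
    state = START
    for symbol in word:              # Python iterates over code points
        if (state, symbol) not in DELTA:
            return False             # no arc for this symbol: reject
        state = DELTA[(state, symbol)]
    return state in FINAL

HOUSE = GHA + RA                     # घर
HOUSES = GHA + RA + HA + RA + UU     # घरहरू

print(accepts(HOUSE))        # True
print(accepts(HOUSES))       # True
print(accepts(HOUSES[:3]))   # False: the input stops in the non-final state q3
```

Storing δ as a dictionary keyed by (state, symbol) keeps the code close to the formal definition: rejecting a string is simply the absence of an entry.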

2.5.2 Finite state transducer (FST)

The finite state automaton discussed in (2.5.1) accomplishes the task of recognizing strings in a regular language by providing a way to systematically explore all the possible paths through a machine (Jurafsky and Martin 2000:97-108). However, this exploration can only answer the question of whether a string is in the language or not. An automaton of this capacity cannot be used to show the relation between two or more languages. This problem can be solved by another version of the FSA called a ‘Finite State Transducer’. An FST is similar to an FSA; it consists of states and transitions with labeled arcs. However, in an FST the labels can be pairs of symbols, i.e., a relation between two languages, instead of single symbols. Whenever an arc with such a label is traversed and the input symbol matches, the input symbol is transduced to the output symbol (Makedonski 2005; Ziai 2006). Consider the example transducer in Figure 2.2, in which the upper side is labeled घर+NOUN+SG and the lower side is labeled घर. This FST transduces घर+NOUN+SG to घर and vice versa. That is, when the input घर is matched, the transducer outputs घर+NOUN+SG.

q0 --घ--> q1 --र--> q2 --+NOUN:0--> q3 --+SG:0--> q4

Figure 2.2: A finite state transducer that transduces between घर ‘house’ and घर+NOUN+SG

Therefore, the important property of FSTs is that they are in principle bi-directional, meaning that they can also be applied backwards. Thus, the bi-directionality feature of the FST can be applied to the morphological analysis and generation.
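The bi-directionality of a transducer can be sketched in Python as follows. The arc list mirrors the transducer of Figure 2.2; treating q4 as the only final state, the function name `transduce` and the use of the empty string for epsilon (0) are assumptions made for this illustration. The simple recursive search is adequate only for small acyclic networks such as this one.

```python
GHA, RA = "\u0918", "\u0930"   # घ and र

# Arcs of the transducer: (source, upper symbol, lower symbol, target).
# The empty string "" stands for epsilon (written 0 in the arc labels).
ARCS = [
    (0, GHA, GHA, 1),      # घ:घ
    (1, RA, RA, 2),        # र:र
    (2, "+NOUN", "", 3),   # +NOUN:0
    (3, "+SG", "", 4),     # +SG:0
]
START, FINAL = 0, {4}      # q4 as final state is an assumption

def transduce(text, direction):
    """Apply the network 'down' (upper to lower) or 'up' (lower to upper)."""
    results = []

    def walk(state, rest, output):
        if not rest and state in FINAL:
            results.append(output)
        for source, upper, lower, target in ARCS:
            if source != state:
                continue
            if direction == "down":
                consumed, produced = upper, lower
            else:
                consumed, produced = lower, upper
            if rest.startswith(consumed):
                walk(target, rest[len(consumed):], output + produced)

    walk(START, text, "")
    return results

HOUSE = GHA + RA                              # घर
print(transduce(HOUSE, "up"))                 # analysis: घर+NOUN+SG
print(transduce(HOUSE + "+NOUN+SG", "down"))  # generation: घर
```

Applying the same arcs in both directions is what allows a single network to serve as analyzer and generator.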

2.5.3 Some important operations on FSTs

a. Union

The union of two or more networks is a network that contains all the elements of its constituent networks. There is no ordering of the arcs in the network, and the operation is denoted as [A|B], where A and B are the networks (Beesley and Karttunen 2003). To illustrate this operation, three FSTs for nouns, adjectives and adverbs are shown in the upper part of Figure 2.3. When the union operation is performed on these three FSTs, it results in a single FST, as illustrated in the lower part of Figure 2.3.


[Figure 2.3 panels: FST for nouns; FST for adjectives; FST for adverbs; below, the single FST obtained by their union]

Figure 2.3: FST unioned from three FSTs for nouns, adjectives and adverbs

Figure 2.3 shows the process of unioning two or more finite state transducers into a single network which contains the elements of its constituent finite state transducers. The union operation is very useful and powerful for creating a large and complex FST from the smaller FSTs of the morphological word classes. It therefore allows the work to be organized in a modular way.

b. Concatenation

Concatenation is an operation which places networks in sequence. Two existing finite state networks can be concatenated with one another to build up new words productively or dynamically (Beesley and Karttunen 2003). The operation is usually denoted as [A B], where A and B are the networks. It is illustrated in Figure 2.4.


[Figure 2.4 panels: FST for a verb बस् 'sit'; FST for purposive -न -nʌ and infinitive -नु -nu inflections; below, the FST obtained by concatenating them]

Figure 2.4: A finite state transducer concatenated from two FSTs above

In the upper part of Figure 2.4, there are two FSTs, one for the Nepali verb बस् bʌs 'sit' and another for the purposive -न -nʌ 'PURP' and infinitive -नु -nu 'INF' suffixes. In the lower part of Figure 2.4, there is the FST that results from concatenating the two, which can analyze and generate the purposive and infinitive forms of the verb बस् bʌs 'sit'. The concatenation operation is useful in handling verb stems together with inflectional and derivational suffixes.

c. Composition

Composition is an operation on two or more languages or relations. It is usually denoted as [A .o. B]. In effect, the lower side of A is matched against the upper side of B, and this shared intermediate level is eliminated from the resulting network (Beesley and Karttunen 2003). To exemplify this operation, the rule FST at the top left of Figure 2.5 changes ◌ो o into ◌ा a for the plural feature. An arbitrary symbol +MP is used to create the environment so that the rule applies only to a specific group of nouns. Figure 2.5 illustrates the process of composition.


[Figure 2.5 panels: top left, FST for the rule ◌ो+MP -> ◌ा; top right, FST for the plural form of केटो 'boy'; below, the FST obtained by composing them]

Figure 2.5: A finite state transducer from composing two FSTs above

At the top right of Figure 2.5, there is an FST for केटो 'boy' with the +MP symbol. When these FSTs are composed, the result is the single FST in the lower part of Figure 2.5, which changes ◌ो o into ◌ा a for the plural feature and also removes the arbitrary symbol +MP without any intermediate FSTs. In fact, the composition operation forms a sequence of transducers: it builds a cascade of FSTs into a single one by eliminating the common intermediate outputs, and so supports a modular structure. Because of this feature, composition is very useful for composing rules with the lexicon to obtain the correct surface forms.

d. Intersection

The intersection of two networks is the set of all the members that are common to both. It is usually denoted as [A & B]. This operation may not be used as a major operation in this work, but it can be used to find the words common to two sets of words.

e. Subtraction

The subtraction of two networks is the set of elements that are in A but not in B. It is denoted as [A-B]. This operation is normally performed to find the words in one network which are not in another network.

f. Complementation or negation

The complement of the language of a network A is the set of all strings that are not in the language of A. It is usually denoted as ~A. This operation is very useful for filtering words from a network.

Among the operations discussed above, union, concatenation and composition are used when implementing the analyzed morphological categories and rules to create a single lexical transducer, while the others are used elsewhere.
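The effect of union, concatenation and composition can be illustrated on small finite relations. In the Python sketch below, a relation is modeled as a set of (upper, lower) string pairs rather than as a network, so it shows only what the operations compute, not how a finite state library implements them. The tags +PURP and +INF and the restriction of the rule ◌ो+MP -> ◌ा to the strings of the toy lexicon are assumptions made for the example.

```python
def union(a, b):
    """[A | B]: every pair that is in either relation."""
    return a | b

def concatenate(a, b):
    """[A B]: a pair from A followed by a pair from B."""
    return {(u1 + u2, l1 + l2) for (u1, l1) in a for (u2, l2) in b}

def compose(a, b):
    """[A .o. B]: link A's lower side to B's upper side and drop that level."""
    return {(u, l) for (u, mid_a) in a for (mid_b, l) in b if mid_a == mid_b}

KETO = "\u0915\u0947\u091f\u094b"      # केटो 'boy'
O_SIGN, AA_SIGN = "\u094b", "\u093e"   # vowel signs ो and ा
BAS = "\u092c\u0938\u094d"             # बस् 'sit'
NA, NU = "\u0928", "\u0928\u0941"      # suffixes -न and -नु

# Concatenation: a verb stem followed by an inflectional suffix.
stem = {(BAS, BAS)}
suffixes = {("+PURP", NA), ("+INF", NU)}
verb_forms = concatenate(stem, suffixes)

# Composition: lexicon .o. rule. The rule relation is restricted here to the
# intermediate strings that actually occur on the lower side of the lexicon.
lexicon = {
    (KETO + "+NOUN+MASC+SG", KETO),
    (KETO + "+NOUN+MASC+PL", KETO + "+MP"),
}
rule = {(mid, mid.replace(O_SIGN + "+MP", AA_SIGN)) for (_, mid) in lexicon}
noun_forms = compose(lexicon, rule)

# Union: one relation covering both word classes.
analyzer = union(noun_forms, verb_forms)
for upper, lower in sorted(analyzer):
    print(upper, "->", lower)
```

After composition the intermediate string ending in +MP no longer appears anywhere in the relation: the lexical form is related directly to the surface form.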

2.6 FST in computational morphology

Johnson (1972) was the first to prove that finite state technology was appropriate and applicable to certain areas of computational linguistics. His most important discovery was that the languages and relations used in the traditional rewrite rules of generative phonology were essentially as powerful as the mathematical devices used in the finite state calculus. Kaplan and Kay (1994) showed with a detailed mathematical proof that every rewrite rule corresponds to a regular relation and thus can be modeled by means of an FST. The two-level morphology (Lexical Level and Surface Level) developed by Koskenniemi (1983) also used a similar concept of an FST. One of the fundamental results of formal language theory (Kleene 1956) is the demonstration that finite-state languages are precisely the set of languages that can be described by a regular expression. Figure 2.6 shows the relation among FSTs, regular languages and regular expressions.

[Figure 2.6 diagram: a regular expression denotes a language/relation; a regular expression compiles into a finite-state network; the finite-state network encodes the language/relation]

Figure 2.6: The interrelation among language, regular expression and finite state network


As Figure 2.6 indicates, a regular expression denotes a set of strings (i.e., a language) or a set of ordered pairs (i.e., a relation). It can be compiled into a finite-state network that compactly encodes the corresponding language or relation, which may well be infinite (Beesley and Karttunen 2003:44). The language of regular expressions includes the common set operators of Boolean logic and operators, such as concatenation, that are specific to strings. For each of the regular expression operators for finite-state languages, there is a corresponding operation that applies to finite-state networks and produces a network for the resulting language. A finite-state network for a complex language can be built by first constructing a regular expression that describes the language in terms of set operations and then compiling that regular expression into a network. This is, in general, easier than constructing a complex network directly, and in fact it is the only practical way for all but the most trivial infinite languages and relations.

2.7 Xerox finite state tool syntax (XFST)

XFST, used here for the computational analysis of Nepali morphology, was developed at the Xerox Research Center Europe and is described in Beesley and Karttunen (2003). It implements the standard finite state operations such as composition, concatenation, complement and union, as well as several innovative operations like replacement rules and local sequentialization. XFST includes: lexc, a compiler for lexicons in the lexc language, which is specifically designed for handling morphotactics (the syntax of morphemes) in natural languages (see 2.7.1); and xfst, the core tool, which provides an interface to the finite state calculus for building, accessing and manipulating finite state networks, and a compiler for regular expressions and replacement rules, which is essential for this work (see 2.7.1). There are other run-time tools within it, but they are not relevant for this discussion.

XFST defines transducers as relations between two languages. The upper language can be thought of as the input and the lower language as the output when an input is applied to a transducer downwards. If the input is applied to the transducer upwards, the roles switch: the input is applied on the lower side and the output comes from the upper side. Although this may seem a bit confusing, the terms upper and lower remain constant (Beesley and Karttunen 2003:85-202). In the definition of a lexical transducer, the upper-side language describes the lexical (underlying) forms of the language to be analyzed and the lower-side language contains the actual surface forms as written. XFST has many operators that are used in analyzing natural languages; some important and essential ones are discussed below:

(i) “ : ” The crossproduct operator relates every symbol in the language on its left side to every symbol on its right side. For example, in [घर+NOUN+PL:घरहरू], the square brackets indicate grouping. The language घर+NOUN+PL is on the upper side of the transducer and घरहरू is on the lower side of the transducer.

(ii) “ -> ” The left-to-right replacement operator is an extended regular expression operator that provides for convenient formulation of rewrite rules in XFST. For example, a rule that replaces every x by a y can be written as x -> y. Every symbol that is not an x is left unchanged. In generative phonology, rewrite rules have a context part which causes the rule to apply only if the context is satisfied. XFST provides for that also. For example, if all xs are to be replaced by ys, but only in the environment where x is followed by z, the rule can be formulated as x -> y || _ z. The double bars indicate the beginning of the context.

(iii) “ $ ” The ‘contains’ operator denotes the language of all the strings that contain a given string. For example, $a means the language of all strings containing a. This operator can be used to operate on a specific set of strings in the network.

(iv) “ | ” The union operator creates the union of two languages or relations. For example, if A and B are two languages, A|B means the language formed from the union of A and B (see 2.5.3).

(v) “ .o. ” The composition operator is very powerful for combining two or more transducers into one. Thus, a very complex network can be built by use of this operator. The main use of this operator is to compose the rule FSTs with the lexical FST (see 2.5.3).


(vi) “ # ” This symbol is used for two purposes: in lexc files it indicates the final state (see 2.7.1), and in replacement rules, where it is surrounded by dots, it indicates the word boundary (see 2.7.2).

2.7.1 LEXC grammar

A lexc grammar consists of at least one lexicon (called Root). A lexicon corresponds to a state and contains a list of entries, each of which has a continuation class. Each entry corresponds to a labeled arc that can be traversed only if the entry is successfully matched against the input string. The entry can be a regular expression. If an entry is matched, the arc is traversed and the continuation class, which is another state, is reached. The procedure is repeated until a final state, denoted by the special continuation class #, is reached (Beesley and Karttunen 2003:203-278). The structure and components of the lexc grammar are presented in Figure 2.7 and discussed below.
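The way entries and continuation classes chain together can be sketched in Python. The dictionary below imitates a lexc grammar with one stem and two inflectional entries; it is a simplified illustration and not the lexc compiler. The sublexicon names, the stem केटो 'boy' and the function name `expand` are chosen for the example, and the sketch assumes that no continuation class leads back to an earlier sublexicon.

```python
KETO = "\u0915\u0947\u091f\u094b"   # केटो 'boy'

# Each sublexicon maps to a list of (upper, lower, continuation class) entries.
# "#" is the special continuation class that marks the end of the word.
LEXICON = {
    "Root": [("", "", "Nouns")],
    "Nouns": [(KETO, KETO, "inflection_1a")],
    "inflection_1a": [
        ("+NOUN+MASC+SG", "", "#"),      # lower side is epsilon
        ("+NOUN+MASC+PL", "+MP", "#"),   # +MP is a trigger for a later rule
    ],
}

def expand(name="Root", upper="", lower=""):
    """Yield every (upper, lower) pair reachable from the given sublexicon."""
    if name == "#":
        yield upper, lower
        return
    for entry_upper, entry_lower, continuation in LEXICON[name]:
        yield from expand(continuation, upper + entry_upper, lower + entry_lower)

pairs = list(expand())
for upper, lower in pairs:
    print(upper, ":", lower)
```

Each complete path from Root to # yields one pair of strings, which corresponds to one path through the compiled network.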

a. Multichar_Symbols

First of all, there is a declaration of multicharacter symbols such as +NOUN +MASC +HUM +SG +PL +DIM +ERG +DAT +LOC +ABL +NOM +GEN +POP +OBL +FEM, in which each declared sequence of characters is treated as an atomic symbol. These symbols are primarily used as tags to indicate various grammatical categories and features. They appear on the upper side, which is visible only when morphological analysis is performed; on the lower side, each multicharacter symbol corresponds to the respective suffix or to epsilon. Sometimes, additional tags are also used to create an environment for the replace rules, and they are removed at the end. Figure 2.7 shows the structure of a lexc grammar.


Optional:   Multichar_Symbols
            Declarations
Body:       LEXICON ROOT
            LEXICON X
            LEXICON X
            ……..
            END

Figure 2.7: The structure of lexc grammar (Beesley and Karttunen 2003:205)

As Figure 2.7 shows, a lexc grammar begins with the optional components, the multicharacter symbols and the declarations, followed by a body consisting of a list of lexicons.

b. Definitions

After the Multichar_Symbols section, an optional definitions section can also appear in the lexc file. It consists of the keyword Definitions followed by one or more variable assignments.

c. Lexicon

As discussed in (2.7.1), the lexicon Root corresponds to the start state of the network to be compiled. There can be any number of other lexicons, but they must follow the lexicon Root. Each entry consists of two parts: a form and a continuation class. The form can consist of formatives (i.e., a stem) or a regular expression, and the continuation class refers to the next sub-lexicon to be followed. The end of the word, or final state, is indicated by the reserved symbol #. The lexicon is designed according to the format shown in Figure 2.7 above. A sample lexicon of nouns in Nepali is given below; it accounts for both o-ending and non-o-ending nouns, including number, gender, honorificity and augmentative/diminutive features. For illustration, only one stem of each noun type and the markers for the stated features are included.²

² Classification of nouns is based on their characteristic features. However, the names of the noun classes and the sub-lexicons used in the lexc file are purely arbitrary, as they are removed during the compilation process.

³ The Multichar_Symbols +MP, +FE and ^b are used in the lower language to create environments, and they are also removed later.

Multichar_Symbols +NOUN +MASC +FEM +OBL +PL +SG +DIM +VOC +HON +MP +FE +PLACE +PROPER ^b   ! see footnote 3

LEXICON ROOT
Nouns;

LEXICON Nouns
!! Type 1a Nouns:
केटो      inflection_1a;     ! keto 'boy'
!! Type 1b Nouns:
मुसो      inflection_1b;     ! muso 'mouse'
!! Type 1c Nouns:
डालो      inflection_1c;     ! d̺alo 'basket'
!! Type 1d Nouns:
फोटो      inflection_1d;     ! pʰoto 'photo'
!! Type 21a Nouns:
काका      inflection_21a;    ! kaka 'uncle'
!! Type 21b Nouns:
नाति      inflection_21b;    ! nati 'grandson'
!! Type 21c Nouns:
बाघ       inflection_21c;    ! bagʰ 'tiger'
!! Type 21d Nouns:
बिष्ट     inflection_21d;    ! bist̺ 'a surname'
!! Type 22a Nouns:
दाइ       inflection_22a;    ! dai 'elder brother'
!! Type 22b Nouns:
दिदी      inflection_22b;    ! didi 'elder sister'
!! Type 22c Nouns:
राम       inflection_22c;    ! ram 'Ram'
!! Type 22d Nouns:
सीता      inflection_22d;    ! siːta 'Sita'
!! Type 22e Nouns:
खेत       inflection_22e;    ! kʰet 'farm land'
!! Type 22f Nouns:
पोखरा     inflection_22f;    ! pokʰʌra 'Pokhara'

LEXICON inflection_1a
+NOUN+MASC+SG:0       #;
+NOUN+MASC+PL:+MP     #;
+NOUN+MASC+OBL:+MP    #;
+NOUN+MASC+HON:+MP    #;
+NOUN+MASC+VOC:+MP    #;
+NOUN+FEM:+FE         #;

LEXICON inflection_1b

+NOUN+MASC+SG:0

#;

+NOUN+MASC+PL:+MP

#;

+NOUN+MASC+OBL:+MP #;

36

+NOUN+FEM:+FE LEXICON

#;

inflection_1c

+NOUN+SG:0

#;

+NOUN+PL:+MP

#;

+NOUN+OBL:+MP

#;

+NOUN+DIM:+FE

#;

LEXICON

inflection_1d

+NOUN+SG:0

#;

+NOUN+PL:+MP

#;

+NOUN+OBL:+MP #; LEXICON

inflection_21a

+NOUN+MASC:0

#;

+NOUN+FEM:◌ी

#;

LEXICON

inflection_21b

+NOUN+MASC:0

#;

+NOUN+FEM:नी

#;

LEXICON

inflection_21c

+NOUN+MASC:0

#;

+NOUN+FEM:^bि◌नी LEXICON

#;

inflection_21d

+NOUN+MASC:0

#;

+NOUN+FEM:◌ेनी

#;

+NOUN+FEM:ि◌नी #; LEXICON

inflection_22a

+NOUN+MASC:0 LEXICON

inflection_22b

+NOUN+FEM:0 LEXICON

#;

#;

inflection_22c

+NOUN+PROPER+MASC:0 #; LEXICON

inflection_22d

+NOUN+PROPER+FEM:0 #;

37

LEXICON

inflection_22e

+NOUN:0

#;

LEXICON

inflection_22f

+NOUN+PLACE:0

#;

END
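The way continuation classes chain entries into upper:lower string pairs can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the lexc compiler: the dictionary `LEXICONS` and the function `paths` are hypothetical names, and only two stems and a subset of the entries from the sample lexicon are included.

```python
# Hypothetical miniature of the sample lexicon: each lexicon maps to a list of
# (upper, lower, continuation_class) entries; "#" stands for the final state.
LEXICONS = {
    "Root": [("", "", "Nouns")],
    "Nouns": [
        ("केटो", "केटो", "inflection_1a"),
        ("फोटो", "फोटो", "inflection_1d"),
    ],
    "inflection_1a": [
        ("+NOUN+MASC+SG", "", "#"),      # lower side 0 (epsilon) is modelled as ""
        ("+NOUN+MASC+PL", "+MP", "#"),
        ("+NOUN+FEM", "+FE", "#"),
    ],
    "inflection_1d": [
        ("+NOUN+SG", "", "#"),
        ("+NOUN+PL", "+MP", "#"),
    ],
}


def paths(lexicon="Root", upper="", lower=""):
    """Yield every (upper, lower) pair reachable from `lexicon`."""
    if lexicon == "#":                   # final state reached: emit the pair
        yield upper, lower
        return
    for up, low, continuation in LEXICONS[lexicon]:
        yield from paths(continuation, upper + up, lower + low)


pairs = list(paths())
for upper_string, lower_string in pairs:
    print(upper_string, "->", lower_string)
```

The lower strings still contain the auxiliary symbols +MP and +FE at this stage; in the system described here they are resolved by the replace rules of the xfst component.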

2.7.2 XFST interface

The xfst part of this system is mainly concerned with the realization, i.e., the surface forms, and with the phonological alternation rules. This component takes as input the output of the lexc transducer (lexc grammar), which consists of stems with grammatical features labeled with tags, and passes it through additional rules to obtain the acceptable surface forms. The xfst component compiles the lexc grammar into an FST from the lexc files and compiles the rules into rule FSTs from the rule files. Various other operations are also performed through xfst. As demonstrated in Figure 2.8, the separate lexicons and rules are compiled first, and then they are composed into a single FST. The lexical level (i.e., the upper language) consists of the citation form of a word and a sequence of tags indicating various features. The surface level (i.e., the lower language) consists of the actual spelling of the word. The process is not so straightforward, however: when a word is formed by concatenating formatives through the sub-lexicons in the lexc file, the concatenated spelling may differ from the actual spelling. Therefore, some replace rules are applied to the lower language so that the final output is well-formed. The orthographic rules for each FST are formulated and applied using an xfst script. Sometimes similar rules are also applied to the upper language to change the sequence of tags. The entire architecture for creating a finite state transducer that can be used as a morphological analyzer for Nepali is illustrated in Figure 2.8.

[Figure 2.8 layout: the lexicon component compiles into a lexicon FST and the rule component compiles into a rule FST; the two are combined by composition into a single lexicon FST]

Figure 2.8: xfst interface can compile lexicon and rule and compose them into single FST (Karttunen 2000)

In Figure 2.8, the lexicon is compiled into a lexicon FST and the rules are compiled into a rule FST. These two FSTs are composed into a single FST with the help of the xfst interface. All these operations are carried out through a single script file, which defines variables for the rules and compiles them into an FST, compiles the lexicon into an FST, and finally composes both into a single FST. A sample of the entire process with nouns in Nepali is illustrated below.

!! xfst script file
clear
define cons क|ग|घ|ङ|च|छ|ज|झ|ञ|ट|ठ|ड|ढ|ण|त|थ|द|ध|न|प|फ|ब|भ|म|य|र|ल|व|स|ष|श|ह;
define liquids र|ल;
define change [
! o-ending nouns: stem-final o before the auxiliary symbols +MP and +FE
[◌ो %+MP -> ◌ा || _ .#.]
.o. [◌ो %+FE -> ◌ी || _ .#.]
! non-o-ending nouns: adjustments before the feminine markers
.o. [◌ा -> [ ] || _ ?* %^b ◌ि न ◌ी .#.]
.o. [◌ा -> [ ] || _ ◌े न ◌ी .#.]
.o. [◌ा -> [ ] || _ ◌ि न ◌ी .#.]
.o. [◌ा -> [ ] || _ ◌ी .#.]
.o. [◌ी -> ◌् || liquids _ न ◌ी .#.]
.o. [◌ी -> ◌ि || _ न ◌ी .#.]
.o. [[. .] -> ◌् || cons _ न ◌ी .#.]
.o. [य ◌ा -> [ ] || _ न ◌ी .#.]
.o. [◌ी -> [ ] || _ ◌ि न ◌ी .#.]
.o. [◌ू -> ◌ु || _ ?* %^b ◌ि न ◌ी .#.]
.o. [◌ी -> [ ] || _ %^b ◌ि न ◌ी .#.]
! remove the auxiliary symbol ^b
.o. [%^b -> [ ] ]
.o. [◌ि -> [ ] || ◌ु _ न ◌ी .#.]
];

read lexc

सुब्बेनी/सुब्बिनी subb-eniː/subb-iniː 'Female Subba'

The majority of non-o-ending human nouns which have corresponding feminine forms are kinship terms, family names (surnames), and terms for caste, social status and professions.

(4) a. बेहुलो सुन्दर छ।
beɦulo sundʌr tsʰʌ
bridegroom handsome be.NPST.3SG.MASC
'The bridegroom is handsome.'

b. बेहुली कुरूप छे।
beɦuliː kurup tsʰe
bridegroom.FEM ugly be.NPST.3SG.FEM
'The bride is ugly.'

c. काकाले घर बनाए।
kaka-le gʰʌr bʌn-a-e
uncle-ERG house make-CAUS-PST.3SG.HON
'The uncle made a house.'

d. काकी स्कुलमा काम गर्नुहुन्छ।
kaki iskul-ma kam gʌr-nu hun-tsʰʌ
uncle.FEM school-LOC work do-INF be-NPST.3SG.MASC
'The aunt works in the school.'

e. नाति भात खान्छ।
nati bʰat kʰa-ntsʰʌ
grandson.MASC rice eat-NPST.3SG.MASC
'The grandson eats rice.'

f. नातिनी स्कुल जान्छे।
nati-niː iskul dza-ntsʰe
grandson-FEM school go-NPST.3SG.FEM
'The granddaughter goes to school.'

Table 3.4 illustrates the morphological gender system in Nepali.

Table 3.4: Morphological gender
Masculine | Gloss | Feminine | Gloss
छोरो tsʰoro | son | छोरी tsʰoriː | daughter
बेहुलो beɦulo | bridegroom | बेहुली beɦuliː | bride
काका kaka | uncle | काकी kakiː | aunt
कुमार kumar | lad | कुमारी kumari | lass
नाति nati | grandson | नातिनी natiniː | granddaughter
बाघ bagʰ | tiger | बघिनी bʌgʰiniː | tigress
सुब्बा subba | Subba (male) | सुब्बेनी/सुब्बिनी subb-eniː/subb-iniː | Subba (female)

Table 3.4 shows the various ways in which o-ending and non-o-ending nouns in Nepali inflect for feminine gender.

d. Form
The nouns in Nepali show two forms morphologically: direct and oblique. In traditional grammars, the nouns which appear as citation forms are direct and those which appear with postpositions are oblique. The o-ending nouns, such as बिरालो biralo 'cat' in (5a), change to a-ending, as in बिराला birala in (5b), to take the oblique form. This happens only in o-ending nouns when they are followed by postpositions. The non-o-ending nouns, such as रूख ruːkʰ 'tree' in (5c), do not show such changes whether or not they are followed by postpositions.4

(5) a. बिरालो दुध खान्छ।
biralo dudʰ kʰa-n-tsʰʌ
cat.SG milk eat-φ-NPST.3SG.MASC
'The cat drinks milk.'

b. बिरालाले मुसा मार्छ।
birala-le musa mar-tsʰʌ
cat.OBL-ERG mouse.PL kill-NPST.3SG.MASC
'The cat kills the rats.'

c. रूखमा एउटा चरो बसेको छ।
ruːkʰ-ma eut̺a tsʌro bʌs-eko tsʰʌ
tree-LOC one.CLF bird sit-PERF be.NPST.3SG.MASC
'A bird is sitting in the tree.'

Table 3.5 shows the alternation between direct and oblique forms of some nouns in Nepali.

Table 3.5: Direct and oblique forms
Form | horse | cat | mouse | tree
Direct | घोडो gʰod̺o | बिरालो biralo | मुसो muso | रूख ruːkʰ
Oblique | घोडा gʰod̺a | बिराला birala | मुसा musa | रूख ruːkʰ

Table 3.5 lists Nepali nouns that alternate between direct and oblique forms, together with one that does not alternate.

4 For the present purpose, non-o-ending nouns when followed by postpositions are not considered as oblique forms.
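The direct/oblique alternation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that "o-ending" can be detected by the final Devanagari vowel sign; the function name `attach_postposition` is hypothetical and is not part of the lexc/xfst system described in this study.

```python
O_SIGN = "\u094b"    # Devanagari vowel sign o
AA_SIGN = "\u093e"   # Devanagari vowel sign aa


def attach_postposition(noun: str, postposition: str) -> str:
    """Attach a postposition, switching o-ending nouns to the oblique stem."""
    if noun.endswith(O_SIGN):
        # direct form in -o becomes oblique form in -a before a postposition
        noun = noun[:-1] + AA_SIGN
    # non-o-ending nouns are left unchanged (see footnote 4)
    return noun + postposition


print(attach_postposition("बिराल" + O_SIGN, "ले"))   # o-ending noun with the ergative marker
print(attach_postposition("रूख", "मा"))              # non-o-ending noun with the locative marker
```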

e. Honorificity
Nouns in Nepali show two levels of honorificity morphologically: non-honorific and honorific. The honorificity distinction is found only in o-ending human nouns. O-ending nouns such as बेहुलो beɦulo 'bridegroom' in (6a) change to a-ending, as in बेहुला beɦula 'bridegroom' in (6b), indicating non-honorificity and honorificity, respectively. Non-o-ending nouns do not show this distinction; therefore, they are not listed here.

(6) a. बेहुलो हात्तीमा थियो।
beɦulo hatti-ma tʰi-jo
bridegroom.NHON elephant-LOC be-PST.3SG.NHON
'The bridegroom was on the elephant.'

b. बेहुला हात्तीमा थिए।
beɦula hatti-ma tʰi-e
bridegroom.HON elephant-LOC be-PST.3SG.HON
'The bridegroom was on the elephant.'

Some examples of honorificity and non-honorificity in Nepali o-ending human nouns are illustrated in Table 3.6.

Table 3.6: Honorificity: non-honorific and honorific
Form | boy | son | bridegroom | child
Non-honorific | केटो ket̺o | छोरो tsʰoro | बेहुलो beɦulo | बच्चो bʌtstso
Honorific | केटा ket̺a | छोरा tsʰora | बेहुला beɦula | बच्चा bʌtstsa

The examples presented in Table 3.6 show the alternation between non-honorific and honorific forms in o-ending human nouns.

f. Augmentative and diminutive
From an evaluative point of view, nouns in Nepali show two distinctions: augmentative and diminutive. This distinction is found only in a small set of o-ending inanimate nouns, where it indicates whether the object is bigger or smaller in size. The o-ending form, such as डालो d̺alo 'basket' in (7a), changes to the iː-ending form, as in डाली d̺aliː in (7b), indicating the augmentative and the diminutive, respectively. This distinction between bigger and smaller size is limited to morphology only, because there is no augmentative or diminutive agreement in the verbs. The non-o-ending inanimate nouns do not show this distinction; therefore, they are not considered here.

(7) a. राम डालो बनाउन जान्दछ।
ram d̺alo bʌnau-nʌ dzan-dʌtsʰʌ
Ram basket.AUG make-INF know-NPST.3SG.MASC
'Ram knows how to make the basket.'

b. यो चिज डालीमा राख्!
jo tsidz d̺aliː-ma rakʰ
this thing basket.DIM-LOC keep.IMP
'Keep this thing in the small basket.'

Table 3.7 illustrates some examples of augmentative and diminutive forms of Nepali o-ending nouns.

Table 3.7: Augmentative and diminutive
Form | basket | small hill | bag | bowl
Augmentative | डालो d̺alo | थुम्को tʰumko | झोलो dzʰolo | बटुको bʌt̺uko
Diminutive | डाली d̺aliː | थुम्की tʰumkiː | झोली dzʰoliː | बटुकी bʌt̺ukiː

In Table 3.7, the o-ending (augmentative) forms of Nepali inanimate nouns change to iː-ending to give the diminutive forms.

g. Cases and case markers
In Nepali, the cases are marked by postpositions. Even though the case markers are affixed to the stems of nouns or pronouns, they are treated as a separate group of linguistic units. Thus, they are tokenized into separate tokens (Hardie et al. 2005, 2009). However, a short traditional description of cases and case markers is given here with examples. For computational purposes, case markers are dealt with as postpositions in 5.3.

I. Ergative
The ergative case in Nepali is marked by the postposition -ले -le, as in रामले ram-le in (8). The ergative case marker mostly occurs with the agent subject in perfective transitive constructions.

(8) रामले भात खायो।
ram-le bʰat kʰa-jo
Ram-ERG rice eat-PST.3SG.MASC
'Ram ate rice.'

II. Dative
The dative case in Nepali is marked by the postposition -लाई -laiː, as in रामलाई ram-laiː in (9). The dative marker normally appears with an indirect or direct object noun phrase. The accusative case is normally not marked, but under some conditions it is marked with the same dative marker -लाई -laiː.

(9) मैले रामलाई पिटेँ।
mʌi-le ram-laiː pit̺-ẽ
1SG.OBL-ERG Ram-DAT beat-PST.1SG
'I beat Ram.'

III. Instrumental
The instrumental case in Nepali is marked by the postposition -ले -le, as in चम्चाले tsʌmtsa-le in (10). The ergative and instrumental case markers are the same in form, but the ergative case marker appears with the agent, as in (8), whereas the instrumental case marker appears with the instruments or objects with which the action is performed.

(10) उसले चम्चाले खाना खायो।
us-le tsʌmtsa-le kʰana kʰa-jo
3SG.OBL-ERG spoon-INST meal eat-PST.3SG.MASC
'He ate the food with a spoon.'

IV. Ablative
The ablative case in Nepali is marked by the postposition -बाट -bat̺ʌ, as in बजारबाट bʌdzar-bat̺ in (11). There is an alternative postposition, -देखि -dekʰi, which has almost the same meaning.

(11) ऊ बजारबाट/देखि आयो।
uː bʌdzar-bat̺/dekʰi a-jo
3SG market-ABL come-PST.3SG.MASC
'He came from the market.'

V. Locative
The locative case in Nepali is marked by the postposition -मा -ma, as in घरमा gʰʌr-ma in (12a). There is another marker, -कहाँ -kʌhã, as in (12b), which normally occurs with pronouns and occasionally with other nominals to indicate location, but it is not frequent.

(12) a. म घरमा बस्छु।
mʌ gʰʌr-ma bʌs-tsʰu
1SG house-LOC sit-NPST.1SG
'I stay at home.'

b. हरि मकहाँ आयो।
hʌri mʌ-kʌhã a-jo
Hari 1SG-LOC come-PST.3SG.MASC
'Hari came to me.'

VI. Allative
The allative case in Nepali is marked by the postposition -तिर -tirʌ, as in बजारतिर bʌdzar-tirʌ in (13).5

(13) तिमी बजारतिर जाऊ!
timi bʌdzar-tirʌ dza-uː
2SG.HON market-ALL go-IMP.2SG.HON
'You go towards the market.'

5 Most of the traditional Nepali grammars do not assume -तिर -tirʌ as a case marker.

VII. Comitative/Associative
The comitative/associative case in Nepali is marked by the postpositions -सँग -sʌ̃gʌ and -सित -sitʌ, as in मसँग/सित mʌ-sʌ̃gʌ/sitʌ in (14).

(14) तिमी मसँग/सित बस!
timi mʌ-sʌ̃gʌ/sitʌ bʌsʌ!
2SG.HON 1SG-COM stay-IMP.2SG.HON
'You stay with me.'

VIII. Genitive
The genitive case in Nepali is marked by the postposition -को -ko, as in रामको ram-ko in (15). This postposition has three alternate forms, -को -ko, -का -ka and -की -kiː, for singular masculine, plural and feminine, respectively, which occur in most cases. However, the forms -रो -ro, -रा -ra and -री -riː occur with the first person pronouns म mʌ and हामी ɦami and the second person pronouns तँ tʌ̃ and तिमी timiː, and the forms -नो -no, -ना -na and -नी -niː occur with the reflexive pronoun आफू apʰu 'self'.

(15) रामको कलम राम्रो छ।
ram-ko kʌlʌm ramro tsʰʌ
Ram-GEN.MASC.SG pen good.MASC.SG be.NPST.3SG.MASC
'Ram's pen is good.'
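The choice among the three series of genitive markers can be summarized as a lookup. The following Python sketch is an illustration only: the names `GENITIVE_SERIES`, `R_HOSTS`, `N_HOSTS` and `genitive_marker` are hypothetical, and the function returns just the marker, not the full genitive word-form, whose stem may also change.

```python
# Genitive markers by series, in the order (masculine singular, plural, feminine)
GENITIVE_SERIES = {
    "k": ("को", "का", "की"),   # default series, used in most cases
    "r": ("रो", "रा", "री"),   # with first and second person pronouns
    "n": ("नो", "ना", "नी"),   # with the reflexive pronoun
}
R_HOSTS = {"म", "हामी", "तँ", "तिमी"}
N_HOSTS = {"आफू"}
AGREEMENT = {"MASC.SG": 0, "PL": 1, "FEM": 2}


def genitive_marker(host: str, agreement: str) -> str:
    """Return the genitive marker selected by the host and its agreement."""
    if host in R_HOSTS:
        series = "r"
    elif host in N_HOSTS:
        series = "n"
    else:
        series = "k"
    return GENITIVE_SERIES[series][AGREEMENT[agreement]]


print(genitive_marker("राम", "MASC.SG"))   # marker used in example (15)
```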

IX. Vocative
The vocative case in Nepali is marked by changing ओ o into आ a in o-ending human nouns, as in केटा keta in (16). Non-o-ending nouns do not inflect for this case.

(16) ए केटा! यो काम गर्!
e keta! jo kam gʌr!
hey boy.VOC this work do.IMP.NHON
'Hey boy! Do this work.'

X. Nominative
The nominative case in Nepali is unmarked, as in राम ram-ø in (17). The subject of an intransitive verb and the subject of a transitive verb in a non-perfective construction are in the nominative case.

(17) राम सधैँ लेख्छ।
ram-ø sʌdʰʌĩ lekʰ-tsʰʌ
Ram-NOM always write-NPST.3SG.MASC
'Ram always writes.'

3.2 Classification of nouns in Nepali

On the basis of the characteristic features discussed in (3.1), nouns in Nepali have been grouped into fourteen classes, and finite state machines or networks have been constructed for them. Features such as stem-final segment, number, gender, form, honorificity, augmentative/diminutive and vocative case are considered while grouping the nouns. The features discussed in (3.1) are not consistently present in all the nouns. The basic criteria for grouping the nouns include the presence or absence of these features and, in some cases, the semantics of the nouns (Prasain 2010). The sequence of tags is arbitrary. The tags for default features are not included, and the names of the noun classes are arbitrary.
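As an overview of the fourteen classes defined in 3.2.1 and 3.2.2, the grouping criteria can be laid out as a small data structure. The Python sketch below is a hypothetical summary for orientation only: the `NounClass` record and its field names are assumptions, and the feature lists paraphrase the class descriptions that follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NounClass:
    o_ending: bool        # stem-final segment: o-ending or not
    members: str          # kind of nouns grouped in the class
    inflects_for: tuple   # features the class inflects for


NOUN_CLASSES = {
    "1a": NounClass(True, "human", ("number", "gender", "form", "honorificity", "vocative")),
    "1b": NounClass(True, "animate", ("number", "gender", "form")),
    "1c": NounClass(True, "inanimate", ("number", "form", "augmentative/diminutive")),
    "1d": NounClass(True, "inanimate", ("number", "form")),
    "21a": NounClass(False, "human and animate, feminine marker -iː", ("gender",)),
    "21b": NounClass(False, "human, feminine marker -niː", ("gender",)),
    "21c": NounClass(False, "human and animate, feminine marker -iniː", ("gender",)),
    "21d": NounClass(False, "human, feminine marker -iniː or -eniː", ("gender",)),
    "22a": NounClass(False, "human, inherently masculine", ()),
    "22b": NounClass(False, "human, inherently feminine", ()),
    "22c": NounClass(False, "proper names of males", ()),
    "22d": NounClass(False, "proper names of females", ()),
    "22e": NounClass(False, "other common nouns", ()),
    "22f": NounClass(False, "place names", ()),
}


def classes_with(feature: str) -> list:
    """Return the names of the classes that inflect for `feature`."""
    return sorted(name for name, cls in NOUN_CLASSES.items() if feature in cls.inflects_for)


print(classes_with("gender"))
```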

3.2.1 O-ending nouns

a. NounType 1a
In this class, the o-ending human nouns which inflect for number, gender, form, honorificity and vocative case are grouped. Some examples with their corresponding morphological tags are given in Table 3.8.

Table 3.8: NounType 1a
Morphological Tags | boy | son | bridegroom | child
NOUN+MASC+SG | केटो ket̺o | छोरो tsʰoro | बेहुलो beɦulo | बच्चो bʌtstso
NOUN+MASC+PL | केटा ket̺a | छोरा tsʰora | बेहुला beɦula | बच्चा bʌtstsa
NOUN+MASC+OBL | केटा ket̺a | छोरा tsʰora | बेहुला beɦula | बच्चा bʌtstsa
NOUN+MASC+HON | केटा ket̺a | छोरा tsʰora | बेहुला beɦula | बच्चा bʌtstsa
NOUN+MASC+VOC | केटा ket̺a | छोरा tsʰora | बेहुला beɦula | बच्चा bʌtstsa
NOUN+FEM | केटी ket̺iː | छोरी tsʰoriː | बेहुली beɦuliː | बच्ची bʌtstsiː

The finite state transducer illustrated in Figure 3.1 is capable of analyzing and generating the word-forms illustrated in Table 3.8.

Figure 3.1: A finite state transducer for NounType 1a

The phonological rules involved in this group of nouns are given in PR 3.1.

Phonological rules PR 3.1
i. Stem final vowel ◌ो o of the o-ending human nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for plural form, oblique form, honorificity and vocative case.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem final vowel ◌ो o of the o-ending human nouns of the lower language (i.e., surface level) is changed to vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.

b. NounType 1b
In this class, the o-ending animate nouns which inflect for number, gender and form are grouped. Some examples are listed in Table 3.9 with their corresponding morphological tags.

Table 3.9: NounType 1b
Morphological Tags | horse | goat | cat | rat
NOUN+MASC+SG | घोडो gʰod̺o | बाख्रो bakʰro | बिरालो biralo | मुसो muso
NOUN+MASC+PL | घोडा gʰod̺a | बाख्रा bakʰra | बिराला birala | मुसा musa
NOUN+MASC+OBL | घोडा gʰod̺a | बाख्रा bakʰra | बिराला birala | मुसा musa
NOUN+FEM | घोडी gʰod̺iː | बाख्री bakʰriː | बिराली biraliː | मुसी musiː

The finite state transducer illustrated in Figure 3.2 is capable of analyzing and generating the word-forms illustrated in Table 3.9.

Figure 3.2: A finite state transducer for NounType 1b

The phonological rules that are applied to the finite state transducer illustrated in Figure 3.2 are given in PR 3.2.

Phonological rules PR 3.2
i. Stem final vowel ◌ो o of the o-ending animate nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for plural form and oblique form.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem final vowel ◌ो o of the o-ending animate nouns of the lower language (i.e., surface level) is changed to vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.

c. NounType 1c
In this class, the o-ending inanimate nouns which inflect for number, form and augmentative/diminutive features are grouped. Some examples are listed in Table 3.10 with their corresponding morphological tags.

Table 3.10: NounType 1c
Morphological Tags | basket | small hill | bag | bowl
NOUN+SG | डालो d̺alo | थुम्को tʰumko | झोलो dzʰolo | बटुको bʌt̺uko
NOUN+PL | डाला d̺ala | थुम्का tʰumka | झोला dzʰola | बटुका bʌt̺uka
NOUN+OBL | डाला d̺ala | थुम्का tʰumka | झोला dzʰola | बटुका bʌt̺uka
NOUN+DIM | डाली d̺aliː | थुम्की tʰumkiː | झोली dzʰoliː | बटुकी bʌt̺ukiː

The finite state transducer illustrated in Figure 3.3 is capable of analyzing and generating the word-forms illustrated in Table 3.10.

Figure 3.3: A finite state transducer for NounType 1c

The phonological rules listed in PR 3.3 are applied to the finite state transducer illustrated in Figure 3.3.

Phonological rules PR 3.3
i. Stem final vowel ◌ो o of the o-ending inanimate nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for plural and oblique form.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. Stem final vowel ◌ो o of the o-ending inanimate nouns of the lower language (i.e., surface level) is changed to vowel ◌ी iː for diminutive feature.
Regular expression: ◌ो -> ◌ी || _ .#.
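The two rules shared by the o-ending classes can be mimicked with plain string operations. The Python sketch below assumes, as in the sample lexc grammar of 2.7.1, that the lower side carries the auxiliary symbols +MP and +FE after the stem; the function name `realize_o_ending` is hypothetical and the code is a stand-in for, not a copy of, the compiled replace rules.

```python
O_SIGN, AA_SIGN, II_SIGN = "\u094b", "\u093e", "\u0940"   # vowel signs o, aa, ii


def realize_o_ending(lower: str) -> str:
    """Turn a lexc lower string of an o-ending noun into its surface form."""
    if lower.endswith(O_SIGN + "+MP"):
        # plural, oblique (and, for human nouns, honorific and vocative): o -> a
        return lower[: -len(O_SIGN + "+MP")] + AA_SIGN
    if lower.endswith(O_SIGN + "+FE"):
        # feminine or diminutive: o -> ii
        return lower[: -len(O_SIGN + "+FE")] + II_SIGN
    return lower   # singular/direct form: nothing to rewrite


stem = "डाल"   # 'basket' without its final vowel sign
print(realize_o_ending(stem + O_SIGN + "+MP"))   # NOUN+PL / NOUN+OBL
print(realize_o_ending(stem + O_SIGN + "+FE"))   # NOUN+DIM
```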

d. NounType 1d
In this class, the o-ending inanimate nouns which inflect only for number and oblique form are grouped. Some examples are listed in Table 3.11 with their corresponding morphological tags.

Table 3.11: NounType 1d
Morphological Tags | pine | photo | ladder | flesh (dead)
NOUN+SG | सल्लो sʌllo | फोटो pʰot̺o | लिस्नो lisno | सिनो sino
NOUN+PL | सल्ला sʌlla | फोटा pʰot̺a | लिस्ना lisna | सिना sina
NOUN+OBL | सल्ला sʌlla | फोटा pʰot̺a | लिस्ना lisna | सिना sina

The finite state transducer illustrated in Figure 3.4 is capable of analyzing and generating the word-forms illustrated in Table 3.11.

Figure 3.4: A finite state transducer for NounType 1d

The finite state transducer illustrated in Figure 3.4 is composed with the finite state transducer of the rules listed in PR 3.4.

Phonological rule PR 3.4
i. Stem final vowel ◌ो o of the o-ending inanimate nouns of the lower language (i.e., surface level) is changed to vowel ◌ा a for plural and oblique.
Regular expression: ◌ो -> ◌ा || _ .#.

3.2.2 Non-o-ending nouns

I. Marked

a. NounType 21a
In this class, the non-o-ending human and animate nouns which inflect only for the gender feature with the marker -◌ी -iː are grouped. Some examples are listed in Table 3.12 with their corresponding morphological tags.

Table 3.12: NounType 21a
Morphological Tags | uncle | lad | pigeon | parrot
NOUN+MASC | काका kaka | कुमार kumar | परेवा pʌrewa | सुगा suga
NOUN+FEM | काकी kakiː | कुमारी kumariː | परेवी pʌrewiː | सुगी sugiː

The finite state transducer illustrated in Figure 3.5 is capable of analyzing and generating the word-forms illustrated in Table 3.12.

Figure 3.5: A finite state transducer for NounType 21a

The phonological rules listed in PR 3.5 are combined at the lower side of the network illustrated in Figure 3.5.

Phonological rule PR 3.5
i. Stem final vowel ◌ा a of the non-o-ending animate nouns of the lower language (i.e., surface level) is deleted when followed by the gender marker ◌ी iː.
Regular expression: ◌ा -> [ ] || _ ◌ी .#.

b. NounType 21b
In this class, the non-o-ending human nouns which inflect for masculine and feminine features with the marker -नी -niː are collected. Some examples are listed in Table 3.13 with their corresponding morphological tags.

Table 3.13: NounType 21b
Morphological Tags | grandson | beggar | priest | chief
NOUN+MASC | नाति nati | जोगी dzogiː | पण्डित pʌnd̺it | मुखिया mukʰiya
NOUN+FEM | नातिनी nati-niː | जोगिनी dzogi-niː | पण्डित्नी pʌnd̺it-niː | मुखिनी mukʰi-niː

The finite state transducer illustrated in Figure 3.6 is capable of analyzing and generating the word-forms illustrated in Table 3.13.

Figure 3.6: A finite state transducer for NounType 21b

The finite state transducer in Figure 3.6 is composed with the phonological rules listed in PR 3.6.

Phonological rules PR 3.6
i. Stem final vowel ◌ी iː of the non-o-ending human nouns of the lower language (i.e., surface level) is changed to vowel ◌ि i before the feminine gender marker नी niː.
Regular expression: ◌ी -> ◌ि || _ न ◌ी .#.
ii. Halanta ◌् is inserted between a consonant symbol and the feminine gender marker नी -niː at the surface level.6
Regular expression: [. .] -> ◌् || cons _ न ◌ी .#.
iii. या ja is deleted before the feminine gender marker नी niː at the surface level.
Regular expression: य ◌ा -> [ ] || _ न ◌ी .#.
iv. Stem final vowel ◌ी iː of the non-o-ending nouns of the lower language (i.e., surface level) is replaced by a halanta ◌् after liquid sounds and before the feminine gender marker नी niː.
Regular expression: ◌ी -> ◌् || liquids _ न ◌ी .#.

6 Halanta ◌् is the generic term for the diacritic in Devanagari that is used to suppress the inherent vowel that otherwise occurs with every consonant letter.

c. NounType 21c
In this class, the non-o-ending human and animate nouns which inflect only for the gender feature with the marker -इनी -iniː are grouped. This group differs from NounType 21b because the आ a sound within the stem changes to the अ ʌ sound while inflecting for feminine gender. Some examples are listed in Table 3.14 with their corresponding morphological tags.

Table 3.14: NounType 21c
Morphological Tags | tiger | Surname1 | Surname2 | Surname3
NOUN+MASC | बाघ bagʰ | कार्की karkiː | थापा tʰapa | थारू tʰaruː
NOUN+FEM | बघिनी bʌgʰ-iniː | कर्किनी kʌrk-iniː | थपिनी tʰʌp-iniː | थरुनी tʰʌru-niː

The finite state transducer illustrated in Figure 3.6 is capable of analyzing and generating the word-forms illustrated in Table 3.14 when the rules listed in PR 3.7 are applied.

Phonological rules PR 3.7
i. Vowel ◌ा a in the non-o-ending human and animate noun stems of the lower language (i.e., surface level) is changed to vowel अ ʌ when the feminine gender marker -इनी -iniː appears at the end of the word. Deleting the vowel sign ◌ा leaves the inherent vowel अ ʌ of the consonant.
Regular expression: ◌ा -> [ ] || _ ◌ि न ◌ी .#.
ii. Vowel ◌ी iː is deleted before the feminine gender marker -इनी -iniː.
Regular expression: ◌ी -> [ ] || _ ◌ि न ◌ी .#.
iii. Vowel ◌ू uː is changed to ◌ु u before the feminine gender marker -इनी -iniː.
Regular expression: ◌ू -> ◌ु || _ ◌ि न ◌ी .#.

d. NounType 21d
In this class, the non-o-ending human nouns which inflect only for the gender feature with the marker -इनी -iniː or, alternatively, -एनी -eniː are grouped. Some examples are listed in Table 3.15 with their corresponding morphological tags.

Table 3.15: NounType 21d
Morphological Tags | Ethnic name1 | Surname | Ethnic name2
NOUN+MASC | खस kʰʌs | बिष्ट bist̺ʌ | सुब्बा subba
NOUN+FEM | खसेनी/खसिनी kʰʌs-eni/kʰʌs-ini | बिष्टेनी/बिष्टिनी bist-eni/bist-ini | सुब्बेनी/सुब्बिनी subb-eni/subb-ini

The finite state transducer illustrated in Figure 3.7 is capable of analyzing and generating the word-forms illustrated in Table 3.15.

Figure 3.7: A finite state transducer for NounType 21d

The phonological rules involved in this process are listed in PR 3.8; they are compiled and composed with the finite state transducer illustrated in Figure 3.7.

Phonological rule PR 3.8
i. Vowel ◌ा a at the end of a non-o-ending noun stem is deleted before the feminine gender marker -इनी -iniː or -एनी -eniː.
Regular expression: ◌ा -> [ ] || _ [◌ि न ◌ी | ◌े न ◌ी] .#.
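The feminine-marker rules PR 3.5 to PR 3.8 depend on their order of application, which is what composition provides in xfst. The Python sketch below imitates that ordering with a list of regular-expression substitutions applied one after another, following the order of the sample xfst script in 2.7.2. It is an approximation under stated assumptions: `re.sub` only stands in for a composed replace rule, `CONS` is simplified to the Unicode range of Devanagari consonant letters, and the names `RULES` and `surface` are hypothetical.

```python
import re

# Devanagari vowel signs and halanta, written as escapes to keep them unambiguous
AA, I, II, U, UU, E, HAL = "\u093e", "\u093f", "\u0940", "\u0941", "\u0942", "\u0947", "\u094d"
YA, NA = "\u092f", "\u0928"
CONS = "[\u0915-\u0939]"    # consonant letters ka..ha (simplified `cons`)
LIQ = "[\u0930\u0932]"      # ra and la (`liquids`)
NI, INI, ENI = NA + II, I + NA + II, E + NA + II
B = re.escape("^b")         # auxiliary symbol ^b from the lexc sample

# (pattern, replacement) pairs in the order in which the rules are composed
RULES = [
    (AA + "(?=.*" + B + INI + "$)", ""),               # stem aa drops when ^b + ini follows
    (AA + "(?=" + ENI + "$)", ""),                     # final aa drops before eni
    (AA + "(?=" + INI + "$)", ""),                     # final aa drops before ini
    (AA + "(?=" + II + "$)", ""),                      # final aa drops before ii
    ("(?<=" + LIQ + ")" + II + "(?=" + NI + "$)", HAL),  # ii -> halanta after a liquid
    (II + "(?=" + NI + "$)", I),                       # ii -> i before ni
    ("(?<=" + CONS + ")(?=" + NI + "$)", HAL),         # halanta inserted before ni
    (YA + AA + "(?=" + NI + "$)", ""),                 # ya + aa drops before ni
    (II + "(?=" + INI + "$)", ""),                     # ii drops before ini
    (UU + "(?=.*" + B + INI + "$)", U),                # uu -> u when ^b + ini follows
    (II + "(?=" + B + INI + "$)", ""),                 # ii drops before ^b + ini
    (B, ""),                                           # remove the auxiliary symbol
    ("(?<=" + U + ")" + I + "(?=" + NI + "$)", ""),    # i drops between u and ni
]


def surface(lower: str) -> str:
    """Apply the ordered rules to a lexc lower string."""
    for pattern, replacement in RULES:
        lower = re.sub(pattern, replacement, lower)
    return lower


print(surface("काका" + II))         # NounType 21a
print(surface("पण्डित" + NI))       # NounType 21b
print(surface("बाघ" + "^b" + INI))  # NounType 21c
print(surface("सुब्ब" + AA + ENI))  # NounType 21d
```

Because each rule sees the output of the previous one, reordering the list changes the results; for example, ^b must be removed only after the rules that refer to it have applied.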

II. Unmarked

a. NounType 22a
In this class, the non-o-ending human nouns that do not inflect for any features but are inherently masculine are grouped. Some examples are listed in Table 3.16 with their corresponding morphological tags.

Table 3.16: NounType 22a
Morphological Tags | elder brother | younger brother | father | husband
NOUN+MASC | दाइ dai | भाइ bʰai | बाबु babu | लोग्ने logne

The finite state transducer illustrated in Figure 3.8 is capable of analyzing and generating the word-forms illustrated in Table 3.16.

Figure 3.8: A finite state transducer for NounType 22a

b. NounType 22b
In this group, the non-o-ending human nouns that do not inflect for any features but are inherently feminine are collected. Some examples are listed in Table 3.17 with their corresponding morphological tags.

Table 3.17: NounType 22b
Morphological Tags | elder sister | younger sister | mother | wife
NOUN+FEM | दिदी didiː | बहिनी bʌɦiniː | आमा ama | स्वास्नी swasniː

The finite state transducer illustrated in Figure 3.9 is capable of analyzing and generating the word-forms illustrated in Table 3.17.

Figure 3.9: A finite state transducer for NounType 22b

c. NounType 22c
In this group, the proper names of males are collected. They never inflect for anything irrespective of their final sound segments, but they grammatically agree with the verb for masculine gender if they are in subject NP position. Some examples are listed in Table 3.18 with their corresponding morphological tags.

Table 3.18: NounType 22c
Morphological Tags | Pname1 | Pname2 | Pname3 | Pname4
NOUN+PROPER+MASC | कर्णाखर kʌrɳakʰʌr | हरि ɦʌri | श्याम sjam | बलराम bʌlʌram

The finite state transducer illustrated in Figure 3.10 is capable of analyzing and generating the word-forms illustrated in Table 3.18.

Figure 3.10: A finite state transducer for NounType 22c

d. NounType 22d
In this group, the proper names of females are collected. They never inflect for anything irrespective of their final sound segments, but they grammatically agree with the verb for feminine gender if they are in subject NP position. Some examples are listed in Table 3.19 with their corresponding morphological tags.

Table 3.19: NounType 22d
Morphological Tags | Pname1 | Pname2 | Pname3 | Pname4
NOUN+PROPER+FEM | सीता sita | गीता gita | जानकी dzanʌki | निर्मला nirmʌla

The finite state transducer illustrated in Figure 3.11 is capable of analyzing and generating the word-forms illustrated in Table 3.19.

Figure 3.11: A finite state transducer for NounType 22d

e. NounType 22e
In this group, all the remaining non-o-ending common nouns are collected. These nouns never inflect for anything irrespective of their final sound segments, but they grammatically agree with the verb for the default features, i.e., third person masculine singular, if they are in subject NP position. Some examples with their corresponding morphological tags are given in Table 3.20.

Table 3.20: NounType 22e
Morphological Tags | promise | shoulder | farm-land | book
+NOUN | कसम kʌsʌm | काँध kãdʰ | खेत kʰet | किताब kitab

The finite state transducer illustrated in Figure 3.12 is capable of analyzing and generating the word-forms illustrated in Table 3.20.

Figure 3.12: A finite state transducer for NounType 22e

f. NounType 22f
In this class, all the place names are grouped. Some examples are listed in Table 3.21 with their corresponding morphological tags.

Table 3.21: NounType 22f
Morphological Tags | PlaceName1 | PlaceName2 | PlaceName3 | PlaceName4
NOUN+PLACE | झापा dzʰapa | भोजपुर bʰodzpur | नेपाल nepal | जापान dzapan

The finite state transducer illustrated in Figure 3.13 is capable of analyzing and generating the word-forms illustrated in Table 3.21.

Figure 3.13: A finite state transducer for NounType 22f

3.3 Pronouns

3.3.1 Characteristics of pronouns in Nepali

a. Person
Pronouns in Nepali have three persons: first, second and third. They are listed in Table 3.22.

Table 3.22: Pronouns with respect to persons
Person | Pronouns
First | म mʌ 'I', हामी ɦami 'we'
Second | तँ tʌ̃ 'you', तिमी timiː 'you', तपाईं tʌpaĩː 'you', यहाँ jʌɦã 'you', हजुर ɦʌdzur 'you', मौसुफ mʌusupʰ 'royal you'
Third | यो jo 's/he', यिनी jiniː 'she', यी jiː 'they', त्यो tjo 'that', तिनी tiniː 's/he', ती tiː 'they', ऊ uː 'he', उनी uniː 'she', उहाँ uhã 'he'

b. Number
Personal pronouns in Nepali show two dimensions of number: singular and plural. The number feature in pronouns is also indicated by the plural/collective postposition हरू -ɦʌruː, but some pronouns, such as म mʌ 'I' and तँ tʌ̃ 'you', do not take any number marker. They have corresponding suppletive forms for the plural, e.g., हामी ɦami 'we' and तिमी timiː 'you'. Table 3.23 lists the personal pronouns in Nepali with number distinctions.

Table 3.23: Personal pronouns in number distinctions
Person | Singular | Plural
First | म mʌ 'I' | हामी ɦamiː 'we', हामीहरू ɦamiːɦʌruː 'we-Pl'
Second | तँ tʌ̃ 'you', तिमी timiː 'you', तपाईं tʌpaĩː 'you', यहाँ jʌɦã 'you', हजुर ɦʌdzur 'you', मौसुफ mʌusupʰ 'you' | तिमीहरू timiː-ɦʌruː, तपाईंहरू tʌpaĩː-ɦʌruː, यहाँहरू jʌɦã-ɦʌruː, हजुरहरू ɦʌdzur-ɦʌruː, मौसुफहरू mʌusupʰ-ɦʌruː
Third | यो jo 'this', यी jiː 'this', यिनी jiniː 's/he', त्यो tjo 'that', ती tiː 'those', तिनी tiniː 'those', ऊ uː 's/he', उनी uniː 's/he', उहाँ uhã 's/he' | यिनीहरू jiniːɦʌruː, तिनीहरू tiniːɦʌruː, उनीहरू uniːɦʌruː, उहाँहरू uhãɦʌruː

c. Form
Pronouns in Nepali show two morphological forms: direct and oblique. When a pronoun is followed by a postposition, it changes into its oblique form. The oblique forms are found in personal, demonstrative, relative and reflexive pronouns, and sporadically in interrogative, definite and indefinite pronouns. Table 3.24 lists the direct and oblique forms of some pronouns.

Table 3.24: Forms of pronouns: direct and oblique
Direct form | Oblique form
म mʌ 'I' | मै mʌi 'I.OBL'
हामी ɦami 'we' | हाम् ɦam 'we.OBL'
तँ tʌ̃ 'you' | तैँ tʌĩ 'you.OBL'
तिमी timiː 'you' | तिम् tim 'you.OBL'
यो jo 'this' | यस् jʌs 'this.OBL'
ऊ uː 's/he' | उन् un 's/he.OBL'
जो dzo 'who.REL' | जस् dzʌs 'who.REL.OBL'
त्यो tjo 'that' | त्यस् tjʌs 'that.OBL'
को ko 'who.INTERO' | कस् kʌs 'who.INTERO.OBL'


d. Honorificity
The second and third person pronouns in Nepali show five levels of honorificity. There is no particular honorific marker; the hierarchy is maintained at the lexical level. Honorificity is marginally marked in the third person pronouns, whereas in the second person pronouns it is not morphologically significant. Table 3.25 lists the pronouns in terms of honorific levels. Honorific agreement with the verb at the morphological level occurs only for non-honorific (level 0) and mid-honorific (level 1) pronouns; the higher honorific levels (levels 2, 3 and 4) encode honorificity by syntactic means.¹

Table 3.25: Honorific levels in Nepali pronouns
Honorificity | Level | Second Person | Third Person
Non-honorific | 0 | तँ tʌ̃ | यो jo, त्यो tjo, ऊ uː
Mid-honorific | 1 | तिमी timiː | यी jiː, ती tiː, यिनी jiniː, तिनी tiniː, उनी uniː
High-honorific | 2 | तपाईं tʌpaĩː | उहाँ uɦã
HHigh-honorific | 3 | यहाँ jʌɦã, आफू apʰuː, हजुर ɦʌdzur | उहाँ uɦã, आफू apʰuː, हजुर ɦʌdzur
Royal-honorific | 4 | मौसुफ mʌusupʰ | मौसुफ mʌusupʰ

3.3.2 Grouping of pronouns
Pronouns cannot be grouped in the way nouns are: each pronoun in Nepali is unique in form and meaning. They are therefore treated and illustrated individually. For convenience, however, we have grouped them according to their forms in order to demonstrate the finite-state network.

a. Personal pronouns
First person: First person pronouns have two forms: singular म mʌ and plural हामी ɦami. Both have oblique forms. The first person singular pronoun has direct, oblique and emphatic forms, as well as genitive forms: masculine, feminine, plural and emphatic. The first person plural pronoun has direct and oblique forms, as well as genitive forms: masculine, feminine, plural and emphatic. Table 3.26 lists first person singular forms with their corresponding morphological tags.

¹ Though the pronouns in Nepali are not morphologically significant in terms of honorificity, they have been tagged into five levels for computational purposes in this study.

Table 3.26: First person singular pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+1SG | म | mʌ | I
PRON+1SG+OBL | मै | mʌi | I
PRON+1SG+EMPH | मै | mʌi | I
PRON+1SG+OBL+GEN+MASC | मेरो | mero | my
PRON+1SG+OBL+GEN+FEM | मेरी | meriː | my
PRON+1SG+OBL+GEN+PL | मेरा | mera | my
PRON+1SG+OBL+GEN+HON | मेरा | mera | my
PRON+1SG+OBL+GEN+OBL | मेरा | mera | my
PRON+1SG+OBL+GEN+EMPH | मेरै | merʌi | my

The finite state transducer in Figure 3.14 encodes the first person singular pronouns of Nepali presented in Table 3.26 and is capable of both analyzing and generating them.
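The two directions of such a transducer, analysis (surface form to tags) and generation (tags to surface form), can be sketched as a pair of lookups over the rows of Table 3.26. This is only an illustrative sketch in plain Python, not the compiled finite-state network of the study; the names `PAIRS`, `generate` and `analyze` are hypothetical, and the Devanagari spellings follow the IPA column of the table.

```python
from collections import defaultdict

# Tag/surface pairs copied from Table 3.26 (first person singular pronouns).
PAIRS = [
    ("PRON+1SG", "म"),
    ("PRON+1SG+OBL", "मै"),
    ("PRON+1SG+EMPH", "मै"),
    ("PRON+1SG+OBL+GEN+MASC", "मेरो"),
    ("PRON+1SG+OBL+GEN+FEM", "मेरी"),
    ("PRON+1SG+OBL+GEN+PL", "मेरा"),
    ("PRON+1SG+OBL+GEN+HON", "मेरा"),
    ("PRON+1SG+OBL+GEN+OBL", "मेरा"),
    ("PRON+1SG+OBL+GEN+EMPH", "मेरै"),
]

# Generation direction: each tag string maps to exactly one surface form.
GENERATE = dict(PAIRS)

# Analysis direction: one surface form may correspond to several tag strings.
ANALYZE = defaultdict(list)
for tag, form in PAIRS:
    ANALYZE[form].append(tag)


def generate(tag):
    """Return the surface form for a tag string, or None if it is unknown."""
    return GENERATE.get(tag)


def analyze(form):
    """Return every tag string whose surface form equals `form`."""
    return list(ANALYZE.get(form, []))


if __name__ == "__main__":
    print(generate("PRON+1SG+OBL+GEN+FEM"))
    print(analyze("मेरा"))
```

The lookup makes the ambiguity of the table explicit: a form such as मेरा mera is returned with three analyses (plural, honorific and oblique), whereas generation from any single tag string yields one form.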

Figure 3.14: A finite state transducer for first person singular pronouns

The first person plural pronouns in Nepali are presented in Table 3.27 with their corresponding morphological tags.

Table 3.27: First person plural pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+1PL | हामी | ɦamiː | we
PRON+1PL+OBL+GEN+MASC | हाम्रो | ɦamro | our
PRON+1PL+OBL+GEN+FEM | हाम्री | ɦamriː | our
PRON+1PL+OBL+GEN+PL | हाम्रा | ɦamra | our
PRON+1PL+OBL+GEN+HON | हाम्रा | ɦamra | our
PRON+1PL+OBL+GEN+OBL | हाम्रा | ɦamra | our
PRON+1PL+OBL+GEN+EMPH | हाम्रै | ɦamrʌi | our

The finite state transducer illustrated in Figure 3.15 is capable of analyzing and generating the plural pronouns illustrated in Table 3.27.

Figure 3.15: A finite state transducer for first person plural pronouns

Second person: The second person pronouns can be grouped into two classes. One consists of तँ tʌ̃ 'you' and तिमी timiː 'you', which have direct, oblique and emphatic forms, as well as genitive forms: masculine, feminine, plural and emphatic. The other group consists of तपाईं tʌpaĩː, उहाँ uɦã, यहाँ jʌɦãː, आफू apʰuː, हजुर ɦʌdzur and मौसुफ mʌusupʰ, which do not have any other forms. Table 3.28 lists second person non-honorific singular forms with their corresponding morphological tags.


Table 3.28: Second person singular non-honorific pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+2SG | तँ | tʌ̃ | you
PRON+2SG+OBL | तैं | tʌĩ | you
PRON+2SG+EMPH | तैं | tʌĩ | you
PRON+2SG+OBL+GEN+MASC | तेरो | tero | your
PRON+2SG+OBL+GEN+FEM | तेरी | teriː | your
PRON+2SG+OBL+GEN+PL | तेरा | tera | your
PRON+2SG+OBL+GEN+HON | तेरा | tera | your
PRON+2SG+OBL+GEN+OBL | तेरा | tera | your
PRON+2SG+OBL+GEN+EMPH | तेरै | terʌi | your

The second person singular non-honorific pronouns in Nepali are encoded into a finite state transducer as demonstrated in Figure 3.16 which is capable of analyzing and generating the pronouns listed in Table 3.28.

Figure 3.16: A finite state transducer for second person singular non-honorific pronouns

Table 3.29 lists second person singular honorific forms with their corresponding morphological tags.

Table 3.29: Second person honorific pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+2SG+HON | तिमी | timiː | you
PRON+2SG+OBL+HON+GEN+MASC | तिम्रो | timro | your
PRON+2SG+OBL+HON+GEN+FEM | तिम्री | timriː | your
PRON+2SG+OBL+HON+GEN+PL | तिम्रा | timra | your
PRON+2SG+OBL+HON+GEN+HON | तिम्रा | timra | your
PRON+2SG+OBL+HON+GEN+OBL | तिम्रा | timra | your
PRON+2SG+OBL+HON+GEN+EMPH | तिम्रै | timrʌi | your

The finite state transducer illustrated in Figure 3.17 encodes the second person honorific pronouns in Nepali and it is capable of analyzing and generating the pronouns listed in Table 3.29.

Figure 3.17: A finite state transducer for second person honorific pronouns

Table 3.30 lists second person high honorific singular forms with their corresponding morphological tags.

Table 3.30: Second person high honorific pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+2SG+HHON | तपाईं | tʌpaĩː | you
PRON+2SG+HHON | यहाँ | jʌɦãː | you
PRON+2SG+HHON | उहाँ | uɦã | you
PRON+2SG+HHON | वहाँ | wʌɦã | you
PRON+2SG+HHON | हजुर | ɦʌdzur | you


The finite state transducer demonstrated in Figure 3.18 encodes the second person high honorific pronouns in Nepali and is capable of analyzing and generating the pronouns listed in Table 3.30.

Figure 3.18: A finite state transducer for second person higher honorific pronouns

A second person royal honorific pronoun in Nepali is given in Table 3.31 with its corresponding morphological tags.

Table 3.31: Second person royal honorific pronoun
Morphological Tags | Devanagari | IPA | Gloss
PRON+2SG+RHON | मौसुफ | mʌusupʰʌ | you.royal

The finite state transducer in Figure 3.19 encodes the royal honorific pronoun and it is capable of analyzing and generating it.

Figure 3.19: A finite state transducer for second person highest honorific pronoun

Third person: The third person pronouns can be grouped into three distinct sets. The first one is ऊ u: and its various forms. ऊ u: inflects for form: direct and oblique, honorificity: non-honorific and honorific; and emphatic. Table 3.32 lists the pronoun ऊ u: and its various forms with their corresponding morphological tags.


Table 3.32: Third person pronoun ऊ u:
Morphological Tags | Devanagari | IPA | Gloss
PRON+3SG | ऊ | u: | he
PRON+3SG+EMPH | उही | uɦi: | he
PRON+3SG+OBL | उस | usʌ | he
PRON+3SG+OBL+EMPH | उसै | usʌi | he
PRON+3SG+HON | उनी | uni: | she
PRON+3SG+HON+OBL | उन | unʌ | she
PRON+3SG+HON+OBL+EMPH | उनै | unʌi | she
PRON+3SG+HON | उहाँ | uɦã | s/he
PRON+3SG+HON | वहाँ | wʌɦã | s/he

The finite state transducer illustrated in Fig 3.20 is capable of analyzing and generating the third person pronoun ऊ u: and its various forms illustrated in Table 3.32.

Figure 3.20: A finite state transducer for third person uː

The second set is त्यो tjo and ती ti: and their various forms. त्यो tjo and ती ti: inflect for form (direct and oblique), honorificity (non-honorific and honorific) and emphasis. Table 3.33 lists the pronouns त्यो tjo and ती ti: and their various forms with their corresponding morphological tags.

Table 3.33: Third person pronouns त्यो tjo and ती ti:
Morphological Tags | Devanagari | IPA | Gloss
PRON+3SG+DIST | त्यो | tjo | he
PRON+3SG+DIST+EMPH | त्यही | tjʌɦiː | he
PRON+3SG+OBL | त्यस | tjʌsʌ | s/he
PRON+3SG+OBL+EMPH | त्यसै | tjʌsʌi | s/he
PRON+3SG+HON+DIST | ती | ti: | s/he
PRON+3PL+DIST | ती | ti: | s/he
PRON+3SG+HON+DIST | तिनी | tini: | s/he
PRON+3SG+OBL+HON+DIST | तिन | tinʌ | s/he
PRON+3SG+OBL+HON+DIST+EMPH | तिनै | tinʌi | s/he

The finite state transducer illustrated in Figure 3.21 is capable of analyzing and generating the third person pronouns त्यो tjo and ती ti: and their various forms listed in Table 3.33.

Figure 3.21: A finite state transducer for third person pronouns त्यो tjo and ती ti:

The third set is यो jo and यी ji: and their various forms. यो jo and यी ji: inflect for form (direct and oblique), honorificity (non-honorific and honorific) and emphasis. Table 3.34 lists the pronouns यो jo and यी ji: and their various forms with their corresponding morphological tags.

Table 3.34: Third person pronouns यो jo and यी ji:
Morphological Tags | Devanagari | IPA | Gloss
PRON+3SG+PROX | यो | jo | s/he
PRON+3SG+PROX+EMPH | यही | jʌɦi | s/he
PRON+3SG+OBL+PROX | यस | jʌsʌ | s/he
PRON+3SG+OBL+PROX+EMPH | यसै | jʌsʌi | s/he
PRON+3SG+PROX+HON | यी | ji: | s/he
PRON+3PL+PROX | यी | ji: | s/he
PRON+3SG+PROX+HON | यिनी | jini: | s/he
PRON+3SG+PROX+OBL+HON | यिन | jinʌ | s/he
PRON+3SG+PROX+OBL+HON+EMPH | यिनै | jinʌi | s/he

The finite state transducer illustrated in Fig 3.22 encodes the third person pronouns यो jo and यी ji: and their various forms listed in Table 3.34, and it is capable of analyzing and generating them.

Figure 3.22: A finite state transducer for third person pronouns यो jo and यी ji:

b. Reflexive pronoun
There is a single reflexive pronoun in Nepali, आफू apʰu: 'self', but it has various forms. It inflects for form (direct and oblique), genitive case (singular, plural, honorific, oblique and feminine) and emphasis. Table 3.35 lists आफू apʰu: 'self' and its various forms with their corresponding morphological tags.

Table 3.35: The reflexive pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+REFL | आफू | apʰu: | self
PRON+REFL+OBL+EMPH | आफै | apʰʌi | self
PRON+REFL+OBL+EMPH | आफैं | apʰʌĩ | self
PRON+REFL+OBL+GEN+SG | आफ्नो | apʰno | own
PRON+REFL+OBL+GEN+PL | आफ्ना | apʰna | own
PRON+REFL+OBL+GEN+HON | आफ्ना | apʰna | own
PRON+REFL+OBL+GEN+OBL | आफ्ना | apʰna | own
PRON+REFL+OBL+GEN+FEM | आफ्नी | apʰni: | own
PRON+REFL+OBL+GEN+EMPH | आफ्नै | apʰnʌi | own

The finite state transducer illustrated in Figure 3.23 is capable of analyzing and generating the reflexive pronoun आफू apʰu: and its various forms listed in Table 3.35.

Figure 3.23: A finite state transducer for reflexive pronouns

c. Demonstrative pronouns
The demonstrative pronouns can be grouped into four distinct sets. The first set is यो jo and यी ji: and their various forms. यो jo and यी ji: inflect for form (direct and oblique) and emphasis. Table 3.36 lists the demonstrative pronouns यो jo and यी ji: and their various forms with their corresponding morphological tags.

Table 3.36: The demonstrative pronouns यो jo and यी ji:
Morphological Tags | Devanagari | IPA | Gloss
PRON+DEM+PROX | यो | jo | this
PRON+DEM+PROX+EMPH | यही | jʌɦi: | this one
PRON+DEM+PROX | यी | ji: | these
PRON+DEM+PROX+HON | यिनी | jini: | these
PRON+DEM+PROX+OBL | यिन | jinʌ | these
PRON+DEM+PROX+OBL+EMPH | यिनै | jinʌi | these ones
PRON+DEM+PROX+HON | यहाँ | jʌɦã | you

The finite state transducer illustrated in Figure 3.24 is capable of analyzing and generating the demonstrative pronouns यो jo and यी ji: and their various forms listed in Table 3.36.

Figure 3.24: A finite state transducer for demonstrative pronouns यो jo and यी ji:

The second set is त्यो tjo and ती ti: and their various forms. त्यो tjo and ती ti: inflect for form (direct and oblique) and emphasis. Table 3.37 lists the demonstrative pronouns त्यो tjo and ती ti: and their various forms with their corresponding morphological tags.

Table 3.37: The demonstrative pronouns त्यो tjo and ती ti:
Morphological Tags | Devanagari | IPA | Gloss
PRON+DEM+DIST | त्यो | tjo | that
PRON+DEM+DIST+EMPH | त्यही | tjʌɦi: | that one
PRON+DEM+DIST | ती | ti: | those
PRON+DEM+DIST+OBL+HON | तिनी | tini: | those
PRON+DEM+DIST+OBL | तिन | tinʌ | those
PRON+DEM+DIST+OBL+EMPH | तिनै | tinʌi | those

The finite state transducer illustrated in Figure 3.25 is capable of analyzing and generating the demonstrative pronouns त्यो tjo and ती ti: and their various forms listed in Table 3.37.

Figure 3.25: A finite state transducer for demonstrative pronouns त्यो tjo and ती ti:

The third set is ऊ u: and its various forms. ऊ u: inflects for form (direct and oblique) and emphasis. Table 3.38 lists the pronoun ऊ u: and its various forms with their corresponding morphological tags.

Table 3.38: The demonstrative pronouns ऊ u:
Morphological Tags | Devanagari | IPA | Gloss
PRON+DEM+DIST | ऊ | u | that
PRON+DEM+DIST+EMPH | उही | uɦi: | that same
PRON+DEM+DIST+HON | उनी | uni: | that
PRON+DEM+DIST+OBL | उन | unʌ | that
PRON+DEM+DIST+OBL+EMPH | उनै | unʌi | that
PRON+DEM+DIST+HON | उहाँ | uɦã | there
PRON+DEM+DIST+HON | वहाँ | wʌɦã | there

The finite state transducer illustrated in Fig 3.26 is capable of analyzing and generating the demonstrative pronoun ऊ u: and its various forms listed in Table 3.38.

Figure 3.26: A finite state transducer for demonstrative pronouns ऊ u:

The fourth set comprises the remaining demonstratives and their various forms, which inflect only for emphasis. Table 3.39 lists the remaining demonstrative pronouns and their various forms with their corresponding morphological tags.

Table 3.39: The remaining demonstrative pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+DEM+DIST | सो | so | that
PRON+DEM+DIST+EMPH | सोही | soɦi | that
PRON+DEM+PROX | निज | nidzʌ | him/her
PRON+DEM+PROX+EMPH | निजै | nidzʌi | him/her
PRON+DEM+PROX | उक्त | uktʌ | that

The finite state transducer illustrated in Figure 3.27 is capable of analyzing and generating the remaining demonstrative pronouns and their various forms illustrated in Table 3.39.

Figure 3.27: A finite state transducer for remaining demonstrative pronouns

d. Relative pronouns
There are three relative pronouns in Nepali: जो dzo, जे dze and जुन dzunʌ. These relative pronouns inflect only for oblique and emphatic forms. Table 3.40 lists the relative pronouns and their various forms with their corresponding morphological tags.

Table 3.40: The Relative Pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+REL+HUM | जो | dzo | who
PRON+REL+OBL+HUM | जस | dzʌsʌ | who
PRON+REL+OBL+HUM+EMPH | जसै | dzʌsʌi | who
PRON+REL+NHUM | जे | dze | which
PRON+REL | जुन | dzunʌ | which
PRON+REL+EMPH | जुनै | dzunʌi | which


The finite state transducer illustrated in Figure 3.28 is capable of analyzing and generating the relative pronouns and their various forms illustrated in Table 3.40.

Figure 3.28: A finite state transducer for relative pronouns

e. Interrogative pronouns
There are three interrogative pronouns in Nepali: को ko, के ke and कुन kunʌ. Two adverbs that act as interrogative forms, किन kinʌ and कसरी kʌsʌriː, are also included here. These interrogative pronouns inflect only for oblique and emphatic forms. Table 3.41a lists the interrogative pronouns and their various forms with their corresponding morphological tags.

Table 3.41a: The interrogative pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+INTERRO+HUM | को | ko | who
PRON+INTERRO+HUM+OBL | कस् | kʌs | who
PRON+INTERRO+HUM+OBL | कसै | kʌsʌi | who
PRON+INTERRO+NHUM | के | ke | what
PRON+INTERRO | कुन | kun | which
PRON+INTERRO | किन | kinʌ | why
PRON+INTERRO | कसरी | kʌsʌri | how

The finite state transducer illustrated in Figure 3.29 is capable of analyzing and generating the interrogative pronouns and their various forms listed in Table 3.41a.


Figure 3.29: A finite state transducer for interrogative pronouns

f. Indefinite pronouns
The indefinite pronouns are derived from interrogative and relative pronouns. Those derived from interrogative pronouns take ही ɦiː and ◌ै ʌi as emphatic markers, and those derived from relative pronouns take सुकै sukʌi as an emphatic marker. Table 3.41b lists the indefinite pronouns derived from interrogative pronouns with their corresponding morphological tags.

Table 3.41b: The indefinite pronouns derived from interrogative pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+INDEF+HUM | कोही | koɦi | someone
PRON+INDEF+NHUM | केही | keɦi | something
PRON+INDEF+NEU | कुनै | kunʌi | anything

The finite state transducer illustrated in Figure 3.30 is capable of analyzing and generating the indefinite pronouns listed in Table 3.41b.


Figure 3.30: A finite state transducer for indefinite pronouns derived from interrogative pronouns

Table 3.42 lists the indefinite pronouns derived from relative pronouns with their corresponding morphological tags.

Table 3.42: The indefinite pronouns derived from relative pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+INDEF+HUM | जोसुकै | dzosukʌi | whoever
PRON+INDEF+NHUM | जेसुकै | dzesukʌi | whatever
PRON+INDEF+NEU | जुनसुकै | dzunsukʌi | whichever

Figure 3.31: A finite state transducer for indefinite pronouns derived from relative pronouns

The finite state transducer illustrated in Figure 3.31 is capable of analyzing and generating the indefinite pronouns and their various forms listed in Table 3.42.


g. Definite pronouns
There is a small set of definite pronouns, which do not show any kind of inflection except अर्को ʌrko. अर्को ʌrko inflects for number, honorificity and form (oblique). Table 3.43a lists the definite pronouns with their corresponding morphological tags.

Table 3.43a: The definite pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+DEF | प्रत्येक | prʌtekʌ | everyone
PRON+DEF | हरेक | hʌrekʌ | each one
PRON+DEF | सबै | sʌbʌi | all
PRON+DEF | अरू | ʌruː | other

The finite state transducer in Figure 3.32a encodes the definite pronouns listed in Table 3.43a and it is capable of analyzing and generating those pronouns.

Figure 3.32a: A finite state transducer for definite pronouns

The definite pronoun अर्को ʌrko, along with its various forms and their corresponding morphological tags, is listed in Table 3.43b.

Table 3.43b: The definite pronoun अर्को
Morphological Tags | Devanagari | IPA | Gloss
PRON+DEF+SG | अर्को | ʌrko | another
PRON+DEF+PL | अर्का | ʌrka | another
PRON+DEF+HON | अर्का | ʌrka | another
PRON+DEF+OBL | अर्का | ʌrka | another
PRON+DEF+FEM | अर्की | ʌrkiː | another
PRON+DEF+EMPH | अर्कै | ʌrkʌi | another


The definite pronoun अर्को ʌrko and its various forms listed in Table 3.43b have been compiled into a finite state transducer, as demonstrated in Figure 3.32b, which is capable of analyzing and generating them.

Figure 3.32b: A finite state transducer for definite pronouns

h. Reciprocal pronouns
The reciprocal pronouns in Nepali are compound forms, with one exception, आपस apʌsʌ. The reciprocal pronoun एकअर्को ekʌrko 'each other' inflects for form (oblique), honorificity, number (plural) and gender (feminine). Table 3.44a lists the reciprocal pronoun एकअर्को ekʌrko and its various forms with their corresponding morphological tags.

Table 3.44a: The reciprocal pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+RECIP | एकअर्को | ekʌrko | each other
PRON+RECIP+OBL | एकअर्का | ekʌrka | each other
PRON+RECIP+HON | एकअर्का | ekʌrka | each other
PRON+RECIP+PL | एकअर्का | ekʌrka | each other
PRON+RECIP+FEM | एकअर्की | ekʌrkiː | each other
PRON+RECIP+EMPH | एकअर्कै | ekʌrkʌi | each other

The finite state transducer demonstrated in Figure 3.33a encodes the reciprocal pronouns listed in Table 3.44a and is capable of analyzing and generating them.


Figure 3.33a: A finite state transducer for reciprocal pronouns

Some other reciprocal pronouns are listed in Table 3.44b with their corresponding morphological tags.

Table 3.44b: The reciprocal pronouns
Morphological Tags | Devanagari | IPA | Gloss
PRON+RECIP | एकआपस | ekapʌs | each other
PRON+RECIP | आपस | apʌs | each other
PRON+RECIP | आआफू | aphu: | each other

The finite state transducer illustrated in Figure 3.33b is capable of analyzing and generating the reciprocal pronouns and their various forms illustrated in Table 3.44b.

Figure 3.33b: A finite state transducer for reciprocal pronouns


3.4 Adjectives
Adjectives in Nepali are words indicating quality, quantity and frequency, generally modifying nouns. They show various kinds of morphological features, which are discussed in the following sections.

3.4.1 Characteristics of adjectives in Nepali

a. Significant stem finals
Adjectives in Nepali, like nouns, show a binary division between o-ending and non-o-ending adjectives. The o-ending adjectives inflect for number, gender, form and honorificity, agreeing with the features of the head nouns that they modify. The non-o-ending adjectives are not consistent in their formal behavior: one sub-group takes the feminine gender marker, and another sub-group, especially Sanskrit loan adjectives, inflects for comparative and superlative forms. Table 3.45 lists some o-ending and some non-o-ending adjectives.

Table 3.45: O-ending and non-o-ending adjectives
O-ending stem | Gloss | Non-o-ending stem | Gloss
राम्रो ramro | good | असल ʌsʌl | good
कालो kalo | black | चतुर tsʌtur | clever
खस्रो kʰʌsro | coarse | लघु lʌgʰu | small
मिठो mit̺ʰo | sweet | पुर्विया purwija | related to east

b. Number
Adjectives in Nepali distinguish two numbers: singular and plural. The number distinction is found only in o-ending adjectives. The citation form of an o-ending adjective, such as राम्रो ramro in (18a), changes to the a-ending form राम्रा ramra in (18b) for the plural.

(18) a. एउटा राम्रो केटो आयो।
eut̺a ramro ket̺o a-jo
one.CL good.SG boy.SG come-P.3SG.MASC
'A handsome boy came.'

b. दुइटा राम्रा केटा आए।
duit̺a ramra ket̺a a-je
two.CL good.PL boy.PL come-P.3PL
'Two handsome boys came.'

Table 3.46 lists some adjectives in their singular and plural forms. The number feature of the adjective agrees with the number feature of the head noun in the noun phrase.

Table 3.46: Number: singular and plural
Number | good | black | coarse | old
Singular | राम्रो ramro | कालो kalo | खस्रो kʰʌsro | बुढो bud̺ʰo
Plural | राम्रा ramra | काला kala | खस्रा kʰʌsra | बुढा bud̺ʰa

c. Gender
O-ending adjectives in Nepali show masculine and feminine gender. An o-ending adjective such as राम्रो ramro in (19a) changes to the iː-ending form राम्री ramriː in (19b), showing the masculine and feminine alternation. Some non-o-ending adjectives form the feminine with the suffix -नी -niː (alternatively -इनी -iniː and -एनी -eniː).

(19) a. एउटा राम्रो केटो आयो।
eut̺a ramro ket̺o a-jo
one.CL good.MASC.SG boy.MASC.SG come-P.3SG.MASC
'A handsome boy came.'

b. एउटी राम्री केटी आई।
eut̺i ramri ket̺i a-iː
one.CL.FEM good.FEM.SG boy.FEM.SG come-P.3SG.FEM
'A beautiful girl came.'

Table 3.47 lists some examples of adjectives showing the gender change. The gender distinction depends on the head noun: gender is functional only if the head noun refers to a human.

Table 3.47: Gender: masculine and feminine
Gender | good | black | clever | rural
Masculine | राम्रो ramro | कालो kalo | चतुर tsʌturʌ | पाखे pakʰe
Feminine | राम्री ramriː | काली kaliː | चतुर्नी tsʌturniː | पखिनी pʌkʰiniː

d. Form
Adjectives in Nepali show two forms: direct and oblique. An o-ending adjective such as राम्रो ramro in (20a) shows the direct form, and it changes to the a-ending form राम्रा ramra in (20b), showing the oblique form.

(20) a. एउटा राम्रो केटो आउँदै छ।
eut̺a ramro ket̺o a-ũdʌi tsʰʌ
one.CL good.SG boy come-IMPERF be.NP.3SG.MASC
'A handsome boy is coming.'

b. एउटा राम्रा केटाले प्रस्ताव राखेको छ।
eut̺a ramra ket̺a-le prʌstaw rakʰ-eko tsʰʌ
one.CL good.OBL boy.OBL-ERG proposal keep-PERF be.NP.3SG.MASC
'A handsome boy has proposed.'

Table 3.48 lists some examples of adjectives showing the direct and oblique forms.

Table 3.48: Form: direct and oblique
Form | good | black | coarse | old
Direct | राम्रो ramro | कालो kalo | खस्रो kʰʌsro | बुढो bud̺ʰo
Oblique | राम्रा ramra | काला kala | खस्रा kʰʌsra | बुढा bud̺ʰa

e. Honorificity
Adjectives in Nepali show two levels of honorificity: non-honorific and honorific. An o-ending adjective such as राम्रो ramro in (21a) changes to the a-ending form राम्रा ramra in (21b), showing non-honorific and honorific, respectively.

(21) a. तँ राम्रो छस्।
tʌ̃ ramro tsʰʌs
you.NHON good.NHON be.NP.2SG.NHON
'You are good.'

b. तिमी राम्रा छौ।
timi ramra tsʰʌu
you.HON good.HON be.NP.2SG.HON
'You are good.'

Table 3.49 lists some examples of adjectives showing honorificity.

Table 3.49: Honorificity: non-honorific and honorific
Honorificity | good | black | coarse | old
Non-honorific | राम्रो ramro | कालो kalo | खस्रो kʰʌsro | बुढो bud̺ʰo
Honorific | राम्रा ramra | काला kala | खस्रा kʰʌsra | बुढा bud̺ʰa

f. Degree
Native adjectives in Nepali do not inflect for degree; degree is handled syntactically. Sanskrit loan adjectives, however, show three degrees morphologically: positive, comparative and superlative. The positive adjective is unmarked, as न्यून njuːnʌ in (22a). The comparative degree is indicated by the suffix -तर -tʌr, as न्यूनतर njuːnʌ-tʌr in (22b), and the superlative by the suffix -तम -tʌm, as न्यूनतम njuːnʌ-tʌm in (22c).

(22) a. हाम्रो आम्दानी न्यून छ।
ɦamro amdani njuːnʌ tsʰʌ
our income less be.NP.3SG.MASC
'Our income is less.'

b. हाम्रो आम्दानी न्यूनतर छ।
ɦamro amdani njuːnʌ-tʌr tsʰʌ
our income less-COMP be.NP.3SG.MASC
'Our income is lesser.'

c. हाम्रो आम्दानी न्यूनतम छ।
ɦamro amdani njuːnʌ-tʌm tsʰʌ
our income less-SUPER be.NP.3SG.MASC
'Our income is the least.'

Table 3.50 lists some examples of Sanskrit loan adjectives that show three degrees.

Table 3.50: Degree: positive, comparative and superlative
Degree | less | low | rigorous | small
Positive | न्यून njuːnʌ | निम्न nimnʌ | गहन gʌɦʌnʌ | लघु lʌgʰu
Comparative | न्यूनतर njuːnʌ-tʌrʌ | निम्नतर nimnʌ-tʌrʌ | गहनतर gʌɦʌnʌ-tʌrʌ | लघुतर lʌgʰu-tʌrʌ
Superlative | न्यूनतम njuːnʌ-tʌmʌ | निम्नतम nimnʌ-tʌmʌ | गहनतम gʌɦʌnʌ-tʌmʌ | लघुतम lʌgʰu-tʌmʌ

3.4.2 Classification of adjectives

On the basis of characteristic features of adjectives in Nepali as discussed in (3.4.1), the adjectives are classified into two major groups. The first one is o-ending adjectives whereas the second one is non-o-ending adjectives.

a. O-ending adjectives
All the o-ending adjectives are grouped in one class. The adjectives in this group inflect for number, gender, form and honorificity. The inflection of the adjective is directly related to the head noun it modifies, because there is feature agreement between the head noun and the modifying adjective. Table 3.51 lists some examples of o-ending adjectives.

Table 3.51: O-ending adjectives
Morphological Tags | good | black | coarse | old
+ADJ+SG | राम्रो ramro | कालो kalo | खस्रो kʰʌsro | बुढो bud̺ʰo
+ADJ+PL | राम्रा ramra | काला kala | खस्रा kʰʌsra | बुढा bud̺ʰa
+ADJ+OBL | राम्रा ramra | काला kala | खस्रा kʰʌsra | बुढा bud̺ʰa
+ADJ+HON | राम्रा ramra | काला kala | खस्रा kʰʌsra | बुढा bud̺ʰa
+ADJ+FEM | राम्री ramriː | काली kaliː | खस्री kʰʌsriː | बुढी bud̺ʰiː

The finite state transducer illustrated in Figure 3.34 is capable of analyzing and generating the o-ending adjectives and their forms listed in Table 3.51.

Figure 3.34: A finite state transducer for o-ending adjectives

The phonological rules given in PR 3.9 are compiled into a finite state transducer and composed with the finite state transducer illustrated in Figure 3.34.

Phonological rule PR 3.9
i. The stem-final vowel ◌ो o of the o-ending adjectives in the lower language (i.e., the surface level) is changed to the vowel ◌ा a for plural, oblique and honorific forms.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. The stem-final vowel ◌ो o of the o-ending adjectives in the lower language (i.e., the surface level) is changed to the vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.
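The effect of PR 3.9 on a single word can be approximated in plain Python with an end-anchored substitution. This is a minimal sketch, not the compiled and composed transducer of the study: the function name `inflect_o_adjective` and the way a tag selects which of the two rules applies are assumptions made for illustration.

```python
import re

O_MATRA = "\u094b"   # dependent vowel sign o (the stem-final vowel)
A_MATRA = "\u093e"   # dependent vowel sign a
II_MATRA = "\u0940"  # dependent vowel sign long i

# Tags from Table 3.51 whose surface form ends in -a (rule PR 3.9 i).
A_TAGS = {"+ADJ+PL", "+ADJ+OBL", "+ADJ+HON"}


def inflect_o_adjective(citation, tag):
    """Apply the PR 3.9 vowel change to an o-ending citation form."""
    if not citation.endswith(O_MATRA):
        raise ValueError("not an o-ending adjective")
    if tag == "+ADJ+SG":
        return citation  # the citation form is the singular form
    if tag in A_TAGS:
        # Rule i: word-final o becomes a.
        return re.sub(O_MATRA + "$", A_MATRA, citation)
    if tag == "+ADJ+FEM":
        # Rule ii: word-final o becomes long i.
        return re.sub(O_MATRA + "$", II_MATRA, citation)
    raise ValueError("unsupported tag: " + tag)


if __name__ == "__main__":
    kalo = "\u0915\u093e\u0932\u094b"  # kalo 'black'
    for tag in ["+ADJ+SG", "+ADJ+PL", "+ADJ+OBL", "+ADJ+HON", "+ADJ+FEM"]:
        print(tag, inflect_o_adjective(kalo, tag))
```

The `$` anchor plays the role of the word-boundary context `_ .#.` in the regular expressions above, so only a word-final vowel is rewritten.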

b. Non-o-ending adjectives
Non-o-ending adjectives in Nepali form a group which includes both marked and unmarked adjectives. Marked adjectives are those which take some sort of marking, such as the feminine marker, the comparative marker or the superlative marker.

i. Marked adjectives
Type 1: Those non-o-ending adjectives in Nepali that inflect for gender (masculine and feminine) have been grouped in this class. The citation form is masculine, and it changes to feminine when the marker -नी/-इनी -niː/-iniː is suffixed. Table 3.52 lists some adjectives of this group.

Table 3.52: Type 1 marked adjectives
Morphological Tags | clever | cunning | of east | rural
ADJ+MASC | चतुर tsʌturʌ | धुर्त dʰurtʌ | पुर्विया purwija | पाखे pakʰe
ADJ+FEM | चतुर्नी tsʌturniː | धुर्तिनी dʰurtiniː | पुर्विनी purwiniː | पखिनी pʌkʰiniː

The finite state transducer illustrated in Figure 3.35 is capable of analyzing and generating the non-o-ending Type 1 adjectives and their forms listed in Table 3.52.

Figure 3.35: A finite state transducer for Type 1 marked adjectives

The phonological rules involved in this process are given in PR 3.10, which are compiled and composed with the finite state transducer illustrated in Figure 3.35.

Phonological rule PR 3.10
i. Halant ◌् is inserted between the consonant symbol and the feminine gender marker नी niː at the surface level.
Regular expression: [. .] -> ◌् || liquids _ न ◌ी .#.
ii. या ja is deleted before the feminine gender marker नी niː at the surface level.
Regular expression: य ◌ा -> [ ] || _ न ◌ी .#.

Type 2: Those non-o-ending adjectives in Nepali that inflect for comparative and superlative forms are grouped in this class. The adjectives in this group are, in fact, Sanskrit loan adjectives. They take the comparative marker -तर -tʌrʌ and the superlative marker -तम -tʌmʌ, forming the comparative and superlative forms respectively. Table 3.53 lists some examples of Sanskrit loan adjectives.


Table 3.53: Type 2 marked adjectives
Morphological Tags | less | low | rigorous | small
+ADJ+POSIT | न्यून njuːnʌ | निम्न nimnʌ | गहन gʌɦʌnʌ | लघु lʌgʰu
+ADJ+COMP | न्यूनतर njuːnʌ-tʌrʌ | निम्नतर nimnʌ-tʌrʌ | गहनतर gʌɦʌnʌ-tʌrʌ | लघुतर lʌgʰu-tʌrʌ
+ADJ+SUPER | न्यूनतम njuːnʌ-tʌmʌ | निम्नतम nimnʌ-tʌmʌ | गहनतम gʌɦʌnʌ-tʌmʌ | लघुतम lʌgʰu-tʌmʌ
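Because the Type 2 forms are built by plain suffixation, with no stem change, the paradigm for one adjective can be sketched as simple concatenation. The function name `degree_forms` is hypothetical, and the sketch stands in for the lexicon-plus-suffix network rather than reproducing it.

```python
# Degree suffixes for Sanskrit loan adjectives, as given for Type 2 above.
COMPARATIVE = "तर"   # -tʌrʌ
SUPERLATIVE = "तम"   # -tʌmʌ


def degree_forms(stem):
    """Return the three degree forms of a Type 2 adjective, keyed by tag."""
    return {
        "+ADJ+POSIT": stem,                # positive: the bare stem
        "+ADJ+COMP": stem + COMPARATIVE,   # comparative: stem + -tʌrʌ
        "+ADJ+SUPER": stem + SUPERLATIVE,  # superlative: stem + -tʌmʌ
    }


if __name__ == "__main__":
    for tag, form in degree_forms("लघु").items():
        print(tag, form)
```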

The finite state transducer illustrated in Figure 3.36 is capable of analyzing and generating the non-o-ending Type 2 adjectives and their forms listed in Table 3.53. No phonological rules are involved in this class of adjectives.

Figure 3.36: A finite state transducer for Sanskrit loan adjectives

ii. Unmarked adjectives
All those non-o-ending adjectives in Nepali which never take any marker are grouped in this class. The adjectives in this class remain unaltered. Table 3.54 lists some examples of unmarked adjectives.

Table 3.54: Unmarked adjectives
Morphological Tags | gentle | bad | new | rich
+ADJ | असल ʌsʌl | खराब kʰʌrab | नयाँ nʌjã | धनी dʰʌni

The finite state transducer illustrated in Figure 3.37 is capable of analyzing and generating the non-o-ending unmarked adjective forms listed in Table 3.54.


Figure 3.37: A finite state transducer for unmarked adjectives

3.5 Numerals
The numerals in Nepali are of two types: cardinal numbers and ordinal numbers.

3.5.1 Cardinal numbers
Cardinal numbers in Nepali from one to one hundred, and some others such as हजार ɦʌdzar 'thousand', लाख lakʰ 'hundred thousand', करोड kʌrod̺ 'ten million', अरब ʌrʌb 'billion' and खरब kʰʌrʌb 'hundred billion', are written as single words. The cardinal numbers appear with numeral classifiers and modify the head nouns. Table 3.55 lists some examples of cardinal numbers.

Table 3.55: Some cardinal numbers
Morphological Tags | Devanagari | IPA | Gloss
+NUM | शून्य | ʃuːnjʌ | zero
+NUM+CARD | एक | ek | one
+NUM+CARD | दुई | duiː | two
+NUM+CARD | तीन | tiːn | three
+NUM+CARD | चार | tsar | four
+NUM+CARD | पाँच | pãts | five
+NUM+CARD | छ | tsʰʌ | six
+NUM+CARD | सात | sat | seven
+NUM+CARD | आठ | at̺ʰ | eight
+NUM+CARD | नौ | nʌu | nine
+NUM+CARD | दस | dʌs | ten
+NUM+CARD | एघार | egʰarʌ | eleven
+NUM+CARD | बाह्र | bahrʌ | twelve
+NUM+CARD | तेह्र | tehrʌ | thirteen
+NUM+CARD | चौध | tsʌudʰʌ | fourteen
+NUM+CARD | पन्ध्र | pʌndʰrʌ | fifteen
+NUM+CARD | सोह्र | sohrʌ | sixteen
+NUM+CARD | सत्र | sʌtrʌ | seventeen
+NUM+CARD | अठार | ʌtʰarʌ | eighteen
+NUM+CARD | उन्नाइस | unnais | nineteen
+NUM+CARD | बीस | biːs | twenty
+NUM+CARD | एक्काइस | ekkais | twenty one
+NUM+CARD | पच्चीस | pʌtstsiːs | twenty five
+NUM+CARD | तीस | tiːs | thirty
+NUM+CARD | चालीस | tsaliːs | forty
+NUM+CARD | पचास | pʌtsas | fifty
+NUM+CARD | साठी | sat̺ʰiː | sixty
+NUM+CARD | सत्तरी | sʌttʌriː | seventy
+NUM+CARD | असी | ʌsiː | eighty
+NUM+CARD | नब्बे | nʌbbe | ninety
+NUM+CARD | सय | sʌjʌ | hundred
+NUM+CARD | हजार | ɦʌdzar | thousand
+NUM+CARD | लाख | lakʰ | hundred thousand
+NUM+CARD | करोड | kʌrod̺ | ten million
+NUM+CARD | अरब | ʌrʌb | billion
+NUM+CARD | खरब | kʰʌrʌb | hundred billion

3.5.2 Ordinal numbers
The ordinal numbers in Nepali are of two types: regular and irregular.

a. Regular ordinal numbers: The numbers one, two, three, four and six constitute an exceptional set in the formation of ordinal numbers from the cardinal numerals. All numerals outside this exceptional set take -औ -ʌũ as a suffix to form the ordinal numbers. Some examples are listed in Table 3.56.


Table 3.56: Some regular ordinal numbers
Morphological Tags   Devanagari   IPA          Gloss
+NUM+ORD             पाँचौँ         pãtsʌũ       fifth
+NUM+ORD             सातौँ         satʌũ        seventh
+NUM+ORD             आठौँ          atʰʌũ        eighth
+NUM+ORD             दसौँ          dʌsʌũ        tenth
+NUM+ORD             बीसौँ         biːsʌũ       twentieth
+NUM+ORD             सयौँ          sʌjʌũ        hundredth
+NUM+ORD             हजारौँ        ɦʌdzarʌũ     thousandth
+NUM+ORD             लाखौँ         lakʰʌũ       hundred thousandth
+NUM+ORD             करोडौँ        kʌrodʌũ      ten millionth

The finite state transducer for the cardinal numbers listed in Table 3.55 and the ordinal numbers listed in Table 3.56 (excluding the exceptional set) is illustrated in Figure 3.38; it is capable of analyzing and generating these numeral forms.

Figure 3.38: A finite state transducer for cardinal numbers and regular ordinal numbers

The phonological rules involved in the regular numerals are given in PR 3.11; they are compiled and composed with the finite state transducer illustrated in Figure 3.38.

Phonological rule PR 3.11
i. The vowel sequence औ ʌũ is changed to its corresponding dependent vowel symbol ◌ौ ʌũ if the numeral ends with a consonant.

Regular expression: औ -> ◌ौ || cons _ .#.
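The effect of PR 3.11 can be approximated outside a finite state toolkit. The following is only a minimal Python sketch, not the compiled rule used in this study: it assumes that the class `cons` corresponds to the Unicode Devanagari consonant range U+0915 to U+0939, and the names `apply_pr_3_11` and `das_ord` are hypothetical. The rule is rendered literally as written above, so no nasalization sign is handled.

```python
import re

# Assumption: "cons" in PR 3.11 is taken to be the Devanagari consonant
# letters U+0915..U+0939; the study defines its own symbol classes.
CONSONANT = "[\u0915-\u0939]"
AU_INDEPENDENT = "\u0914"  # independent vowel letter AU
AU_DEPENDENT = "\u094C"    # dependent vowel sign AU

# Rough equivalent of:  AU -> AU-sign || cons _ .#.
# (?<=...) is the left context "cons", \Z is the word boundary ".#.".
PR_3_11 = re.compile(f"(?<={CONSONANT}){AU_INDEPENDENT}\\Z")

def apply_pr_3_11(form: str) -> str:
    """Rewrite a word-final independent AU as a vowel sign after a consonant."""
    return PR_3_11.sub(AU_DEPENDENT, form)

# 'ten' (U+0926 U+0938) concatenated with the ordinal suffix of the rule
das_ord = apply_pr_3_11("\u0926\u0938" + AU_INDEPENDENT)
```

A form that does not end in a consonant followed by the independent vowel is returned unchanged, which mirrors the restriction of the rule to consonant-final numerals.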


b. Irregular ordinal numbers: The ordinal numerals corresponding to the numbers one, two, three and four differ from the regular ordinal numerals. They inflect for number, gender, form and honorificity. Table 3.57, Table 3.58, Table 3.59 and Table 3.60 list the ordinal numerals of one, two, three and four, respectively, with their corresponding morphological tags.

Table 3.57: Irregular ordinal numbers of one
Morphological Tags   Devanagari   IPA        Gloss
+NUM+ORD+MASC        पहिलो        pʌhilo     first
+NUM+ORD+PL          पहिला        pʌhila     first
+NUM+ORD+OBL         पहिला        pʌhila     first
+NUM+ORD+HON         पहिला        pʌhila     first
+NUM+ORD+FEM         पहिली        pʌhiliː    first

Table 3.58: Irregular ordinal numbers of two
Morphological Tags   Devanagari   IPA       Gloss
+NUM+ORD+MASC        दोस्रो        dosro     second
+NUM+ORD+PL          दोस्रा        dosra     second
+NUM+ORD+OBL         दोस्रा        dosra     second
+NUM+ORD+HON         दोस्रा        dosra     second
+NUM+ORD+FEM         दोस्री        dosriː    second

Table 3.59: Irregular ordinal numbers of three
Morphological Tags   Devanagari   IPA       Gloss
+NUM+ORD+MASC        तेस्रो        tesro     third
+NUM+ORD+PL          तेस्रा        tesra     third
+NUM+ORD+OBL         तेस्रा        tesra     third
+NUM+ORD+HON         तेस्रा        tesra     third
+NUM+ORD+FEM         तेस्री        tesriː    third

Table 3.60: Irregular ordinal numbers of four
Morphological Tags   Devanagari   IPA         Gloss
+NUM+ORD+MASC        चौथो         tsʌutʰo     fourth
+NUM+ORD+PL          चौथा         tsʌutʰa     fourth
+NUM+ORD+OBL         चौथा         tsʌutʰa     fourth
+NUM+ORD+HON         चौथा         tsʌutʰa     fourth
+NUM+ORD+FEM         चौथी         tsʌutʰiː    fourth


The finite state transducer illustrated in Figure 3.39 is capable of analyzing and generating the ordinal numerals from numbers one, two, three and four and their corresponding forms illustrated in Table 3.57, Table 3.58, Table 3.59, Table 3.60.

Figure 3.39: A finite state transducer for irregular ordinal numerals

The phonological rules involved in the irregular ordinal numerals are given in PR 3.12; they are compiled and composed with the finite state transducer illustrated in Figure 3.39.

Phonological rule PR 3.12
i. The stem-final vowel ◌ो o of the o-ending irregular numeral in the lower language (i.e., the surface level) is changed to the vowel ◌ा a for plural, oblique and honorificity.
Regular expression: ◌ो -> ◌ा || _ .#.
ii. The stem-final vowel ◌ो o of the o-ending irregular numeral in the lower language (i.e., the surface level) is changed to the vowel ◌ी iː for feminine gender.
Regular expression: ◌ो -> ◌ी || _ .#.
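The two alternations of PR 3.12 can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the transducer of Figure 3.39: forms are given in the IPA transcription of Tables 3.57 to 3.60 instead of Devanagari, the morphological tags are passed as a set of strings, and the function name `realize_irregular_ordinal` is hypothetical.

```python
# Assumption: tags that trigger the o -> a alternation (PR 3.12 i).
A_TRIGGERS = {"+PL", "+OBL", "+HON"}

def realize_irregular_ordinal(stem: str, tags: set) -> str:
    """Return the surface form of an o-ending irregular ordinal (IPA)."""
    if not stem.endswith("o"):
        return stem                    # the rules only touch stem-final o
    if "+FEM" in tags:
        return stem[:-1] + "iː"        # PR 3.12 ii: o -> iː for feminine
    if tags & A_TRIGGERS:
        return stem[:-1] + "a"         # PR 3.12 i: o -> a for PL/OBL/HON
    return stem                        # masculine singular keeps o

fem_second = realize_irregular_ordinal("dosro", {"+NUM", "+ORD", "+FEM"})
pl_first = realize_irregular_ordinal("pʌhilo", {"+NUM", "+ORD", "+PL"})
```

Because both regular expressions share the same context, the sketch selects between them by tag; how the tags condition the rules inside the actual network is not shown here.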

c. Ordinal numbers loaned from Sanskrit
Some ordinal numbers in Nepali are loan words from Sanskrit. They are listed in Table 3.61.

Table 3.61: Some ordinal numbers from Sanskrit loans
Morphological Tags   Devanagari   IPA          Gloss
+NUM+ORD             प्रथम         prʌtʰʌm      first
+NUM+ORD             द्वितीय        dwitiːjʌ     second
+NUM+ORD             तृतीय         tritiːjʌ     third
+NUM+ORD             चतुर्थ         tsʌturtʰʌ    fourth
+NUM+ORD             पञ्चम         pʌntsʌm      fifth

The ordinal numbers borrowed from Sanskrit are encoded in the finite state transducer as demonstrated in Figure 3.40 and it is capable of analyzing and generating them.

Figure 3.40: A finite state transducer for ordinal numerals from Sanskrit loans

3.5.3 Other numerals
Some numerals in Nepali indicate frequency and also modify the head nouns. Such numerals are grouped into four classes, which are listed in Table 3.62, Table 3.63, Table 3.64 and Table 3.65.

Table 3.62: Frequency numerals (I)
Morphological Tags   Devanagari   IPA         Gloss
+NUM+FREQ            एकोहोरो      ekoɦoro     one
+NUM+FREQ            दोहोरो       doɦoro      two
+NUM+FREQ            तेहोरो       teɦoro      three

Table 3.63: Frequency numerals (II)
Morphological Tags   Devanagari   IPA         Gloss
+NUM+FREQ            एकसरो        eksʌro      one layer
+NUM+FREQ            दुईसरो        duiːsʌro    two layer
+NUM+FREQ            तीनसरो       tiːnsʌro    three layer

Table 3.64: Frequency numerals (III)
Morphological Tags   Devanagari   IPA         Gloss
+NUM+FREQ            दोबर         dobʌr       twice/double
+NUM+FREQ            तेबर         tebʌr       thrice
+NUM+FREQ            चौबर         tsʌubʌr     four times

Table 3.65: Frequency numerals (IV)
Morphological Tags   Devanagari   IPA         Gloss
+NUM+FREQ            दुईगुना       duiːguna    two times
+NUM+FREQ            तीनगुना      tiːnguna    three times
+NUM+FREQ            चौगुना       tsʌuguna    four times

The finite state transducer illustrated in Figure 3.41 is capable of analyzing and generating the frequency numerals illustrated in Table 3.62, Table 3.63, Table 3.64 and Table 3.65.

Figure 3.41: A finite state transducer for frequency numerals

There are a few numerals which indicate a portion in the measurement of things, time and space. Some of these portion numerals are listed in Table 3.66.

Table 3.66: Some portion numerals
Morphological Tags   Devanagari   IPA        Gloss
+NUM+PORT            आधा          adha       half
+NUM+PORT            पौने          pʌune      (a number less) a quarter
+NUM+PORT            सवा          sʌwa       one and a quarter
+NUM+PORT            डेढ           d̺ed̺ʰʌ      one and a half
+NUM+PORT            साढे          sad̺ʰe      (a number and) a half
+NUM+PORT            अढाइ         ʌd̺ʰai      two and a half
+NUM+PORT            चौथाइ        tsʌutʰai   one fourth


The finite state transducer illustrated in Figure 3.42 is capable of analyzing and generating the portion numerals illustrated in Table 3.66.

Figure 3.42: A finite state transducer for portion numerals

3.6 Classifiers in Nepali

3.6.1 Numeral classifiers
There are two numeral classifiers in Nepali. -जना -dzʌna is the human masculine classifier and it does not inflect for anything. -वटा -wʌt̺a is a non-human classifier, but it inflects for human feminine. The numeral classifiers appear only with countable nouns. Table 3.67 lists these two numeral classifiers and their various forms.

Table 3.67: Numeral classifiers
Morphological Tags   Devanagari   IPA
+CLF+HUM             जना          dzʌna
+CLF+NHUM            वटा/ओटा      wʌt̺a/ot̺a
+CLF+FEM             वटी/ओटी      wʌt̺iː/ot̺iː

Figure 3.43: A finite state transducer for numeral classifiers


The finite state transducer illustrated in Figure 3.43 is capable of analyzing and generating the numeral classifiers illustrated in Table 3.67.

3.6.2 Quasi-classifiers
Quasi-classifiers in Nepali have lexical content as well as the properties of a classifier. Each item in the list classifies a small set of nouns and also follows the numerals. Quasi-classifiers are related to mensurality or sortality. Like nouns and adjectives in Nepali, such classifiers are either o-ending or non-o-ending. The o-ending quasi-classifiers inflect for number and oblique features. Some examples of o-ending classifiers are given in Table 3.68.

Table 3.68: o-ending quasi-classifiers
Morphological Tags   Classifier 1   Classifier 2   Classifier 3
+CL+SG               कोसो koso      दानो dano      थोपो tʰopo
+CL+PL               कोसा kosa      दाना dana      थोपा tʰopa
+CL+OBL              कोसा kosa      दाना dana      थोपा tʰopa

The o-ending quasi-classifiers in Nepali are compiled into a finite state transducer, as demonstrated in Figure 3.44, which is capable of analyzing and generating the quasi-classifiers illustrated in Table 3.68.

Figure 3.44: A finite state transducer for general classifier type 1


The phonological rules involved in this set of quasi-classifiers are given in PR 3.13, which are compiled and composed with finite state transducer illustrated in Figure 3.44.

Phonological rule PR 3.13
i. The stem-final vowel ◌ो o of the o-ending quasi-classifiers in the lower language (i.e., the surface level) is changed to the vowel ◌ा a for plural and oblique.
Regular expression: ◌ो -> ◌ा || _ .#.

Non-o-ending quasi-classifiers do not inflect for anything. Table 3.69 presents some examples of non-o-ending quasi-classifiers in Nepali.

Table 3.69: General non-o-ending classifiers
Morphological Tags   Devanagari   IPA
+CL                  पोटी          pot̺i
+CL                  थुन           tʰun
+CL                  जुवा          dzuwa
+CL                  गाँस          gãs
+CL                  चोइली        tsoili
+CL                  खिल्ली        kʰilli
+CL                  घरी          gʰʌri

The finite state transducer in Figure 3.45 is capable of analyzing and generating the quasi-classifiers illustrated in Table 3.69.

Figure 3.45: A finite state transducer for general classifier type 2

3.7 Summary
This chapter analyzed the nouns in Nepali. They can be grouped into two classes: o-ending and non-o-ending nouns. The o-ending nouns are further sub-grouped into four classes, and the non-o-ending nouns into two classes, viz. marked and unmarked. Marked non-o-ending nouns are of four types and unmarked nouns are of six types. The classification, which is made in order to match and implement the word categories in finite state technology, is based on the formal characteristic features possessed by the nouns in Nepali. Some of the phonological rules for one group of nouns are repeated for another group; they are minimized, delimiters are used where required, and they are implemented as regular expressions and finally composed with the main noun lexicon.

Personal pronouns in Nepali possess person, number, form and honorific features. Demonstrative, reflexive, reciprocal, definite and indefinite pronouns inconsistently possess number, form and honorific features. The formal grouping of the pronouns is significant for the illustration and demonstration of their finite state transducers. Since the number of pronouns is limited and their behavior is more or less idiosyncratic, they are directly encoded in the finite state network.

Adjectives in Nepali are of two major types: o-ending and non-o-ending. Non-o-ending adjectives are of two types: marked and unmarked. One group of marked adjectives shows the distinction between masculine and feminine gender, whereas another group, containing Sanskrit loans, shows three levels of degree: positive, comparative and superlative. Unmarked adjectives remain unaltered.

The numerals in Nepali are mainly grouped into three classes: cardinal, ordinal and other numerals. With some exceptions, all ordinal numerals are derived from the cardinal numerals. Some irregular ordinal numbers show distinctions for features such as number, gender, honorificity and form.

The classifiers in Nepali are grouped into two classes: true classifiers and quasi-classifiers. The true classifiers inflect for gender, whereas some of the quasi-classifiers inflect for number and form.


CHAPTER 4
VERBAL MORPHOLOGY

4.0 Outline
This chapter presents the analysis of verb stems in Nepali. It consists of six sections. Section 4.1 discusses the characteristic features of verbs, namely significant verb stem finals, transitivity, syllabicity and the sound आ a. Section 4.2 discusses morphological processes such as causativization, passivization and negativization. The concept of stem formation is presented in Section 4.3. Section 4.4 groups the verbs into various groups based on the features discussed above and presents them with their morphological tags; the finite state transducer of each group is illustrated. Section 4.5 deals with verbal inflections, which include tense, aspect, mood and participial forms; for every group of inflections the morphological tags and finite state transducers are illustrated. Section 4.6 summarizes the findings of the chapter.

4.1 Characteristics of verbs in Nepali

4.1.1 Significant verb stem finals
The basic verb stems end with different sound segments. Some of the final segments are noteworthy from the morphophonological point of view. The morphological processes under consideration, such as passivization, causativization, negativization and other affixation processes, need information about the final segment of the verb to produce acceptable surface forms. The stem of the basic verb is identified by removing the past tense third person singular marker -यो -jo from the verb form; the remaining segment is then analyzed with reference to various phenomena. The final segments which are significant from our point of view are discussed as follows:1

1 Pokharel (2010a) has mentioned various strategies for deriving the verb stems. Among them, the imperative singular form has been adopted here as the basic stem for simplicity, although it leaves some exceptions.

a. Vowel final stems
i. i-ending verb stems: A set of verb stems which end in the vowel इ i is listed in Table 4.1. The majority of the verb stems in this class are intransitive, but some of them are transitive; some examples of the latter are listed in Table 4.2. The verbs उफ्रि upʰri 'jump' and पक्रि pʌkri 'arrest' in (1a) and (1b), respectively, end with the vowel इ i.

(1) a. केटो उफ्रियो।
       ket̺o upʰri-jo
       boy jump-PST.3SG.MASC
       'The boy jumped.'

    b. प्रहरीले चोरलाई पक्रियो।
       prʌɦʌri-le tsor-lai pʌkri-jo
       police-ERG thief-DAT arrest-PST.3SG.MASC
       'The police arrested the thief.'

Table 4.1: i-ending intransitive verb stems
Verb stem   IPA         Gloss
उफ्रि        upʰri-      'jump'
खुम्चि        kʰumtsi-    'shrink'
चोइटि       tsoit̺i-     'be pieces'
भत्कि        bʰʌtki-     'be broken'

The i-ending intransitive verb stems listed in Table 4.1 and the i-ending transitive stems listed in Table 4.2 look similar in form, but they differ in their further morphology.

Table 4.2: i-ending transitive verb stems
Base form   IPA         Gloss of stem
पक्रि        pʌkri-      'arrest'
पर्खि        pʌrkʰi-     'wait'
बिर्सि        birsi-      'forget'
मन्सि        mʌnsi-      'throw away'
सम्झि        sʌmdzʰi-    'remember'
कुल्चि        kultsi-     'tread'
उइँटि        uĩt̺i-       'spindle'
दि          di-         'give'
लि          li-         'take'

The i-ending verb stems listed in Table 4.2a behave differently. The vowel उ u is obligatorily inserted between the stem and the suffix if the suffix that follows the stem begins with न् n, and उँ ũ is inserted if the suffix begins with छ् tsʰ or थ् tʰ.

Table 4.2a: i-ending transitive verb stems
Verb stem   IPA    Gloss
पि          pi     'drink'
सि          si     'sew'
जि          dzi    'live'

The vowel इ i at the end of the verb stem optionally drops without any change in meaning. The verb stem पग्लि pʌgli 'melt' in (2a) has retained the vowel इ i, whereas in the verb stem पग्ल् pʌgl 'melt' in (2b) the vowel इ i is dropped.

(2) a. हिउँ पग्लियो।
       hiũ pʌgli-jo
       ice melt-PST.3SG.MASC
       'The ice melted.'

    b. हिउँ पग्ल्यो।
       hiũ pʌgl-jo
       ice melt-PST.3SG.MASC
       'The ice melted.'

The vowel इ i at the end of the verb stem is also optionally changed to अ ʌ, especially when the suffix begins with न् n, द् d or ए e. For example, when -नु -nu '-INF' is attached to the verb stem, इ i optionally changes to अ ʌ. Table 4.3 lists the alternative forms that result from the change of इ i to अ ʌ in i-ending verb stems.

Table 4.3: Alternative forms of i-ending verb stems
-i forms   IPA         -ʌ forms   IPA
उफ्रि       upʰri-      उफ्र        upʰrʌ-
खुम्चि       kʰumtsi-    खुम्च       kʰumtsʌ-
चोइटि      tsoit̺i-     चोइट       tsoit̺ʌ-
भत्कि       bʰʌtki-     भत्क        bʰʌtkʌ-
सिउरि      siuri-      सिउर       siurʌ-
बिग्रि       bigri-      बिग्र        bigrʌ-
सप्रि       sʌpri-      सप्र        sʌprʌ-
उघ्रि       ugʰri-      उघ्र        ugʰrʌ-
पग्लि       pʌgli-      पग्ल        pʌglʌ-
उग्लि       ugli-       उग्ल        uglʌ-
उक्लि       ukli-       उक्ल        uklʌ-
पक्रि       pʌkri-      पक्र        pʌkrʌ-
पर्खि       pʌrkʰi-     पर्ख        pʌrkʰʌ-
बिर्सि       birsi-      बिर्स        birsʌ-
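The optional alternation summarized in Table 4.3 can be sketched as a function that returns every admissible stem shape before a given suffix. This is a minimal illustration in Python, assuming IPA input as in the table; the function name `i_stem_alternants` is hypothetical, and the trigger set simply restates the suffix-initial segments named in the text (n, d, e).

```python
def i_stem_alternants(stem: str, suffix: str) -> list:
    """List the stem shapes allowed before `suffix` (IPA strings)."""
    forms = [stem]                                  # the -i form is always possible
    if stem.endswith("i") and suffix.startswith(("n", "d", "e")):
        forms.append(stem[:-1] + "ʌ")               # optional i -> ʌ alternant
    return forms

before_infinitive = i_stem_alternants("upʰri", "nu")
```

Returning a list rather than a single string reflects the optionality of the change: both shapes are generated, and an analyzer would accept either.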

ii. a-ending verb stems: Some of the verb stems ending with the vowel आ a are listed in Table 4.4 and Table 4.5. Verb stems in this group are of both intransitive and transitive types. The verb stems कमा kʌma- 'earn' in (3a) and आ a- 'come' in (3b) end with the vowel आ a.

(3) a. उसले धेरै पैसा कमाएको छ।
       us-le dʰerʌi pʌisa kʌma-eko tsʰʌ
       3SG.OBL-ERG more money earn-PERF be-NPST.3SG.MASC
       'He has earned a lot of money.'

    b. राम स्कुलबाट घर आयो।
       ram skul-bat̺ʌ gʰʌr a-jo
       Ram school-ABL house come-PST.3SG.MASC
       'Ram came home from school.'

Table 4.4: a-ending verb stems (group 1)
Verb stem   IPA         Gloss
अघा         ʌgʰa-       'satisfy'
कमा         kʌma-       'earn'
टकरा        t̺ʌkʌra-     'be broken'
मुस्कुरा      muskura-    'insert'
पा          pa-         'get'
आ           a-          'come'
छा          tsʰa-       'cover the roof'
बा          ba-         'open (mouth)'
पा          pa-         'get'
ला          la-         'put on'
भ्या         bʰja-       'manage'
ब्या         bja-        'give birth'

Table 4.5: a-ending verb stems (group 2)
Verb stem   IPA     Gloss
खा          kʰa-    'eat'
जा          dza-    'go'

The a-ending verb stems are also of two kinds. In one set of verbs, listed in Table 4.4, the vowel उ u is inserted between the stem and the suffix if the following suffix begins with न् n, and उँ ũ is inserted if it begins with छ् tsʰ or थ् tʰ. The verb stems listed in Table 4.5 do not take उ u under the conditions stated above; in this group, न् n is inserted in the non-past tense and the past habitual aspect.

iii. o-ending verb stems: There are a few verb stems which end with ओ o. The stem-final ओ o obligatorily changes to उ u if the following suffix begins with छ् tsʰ, द् d, थ् tʰ or न् n, and न् n is obligatorily inserted in the non-past tense. Table 4.6 lists some of the o-ending verb stems and Table 4.7a shows the change of ओ o to उ u under the conditions mentioned above.

Table 4.6: o-ending verb stems
Verb stem   IPA      Gloss
रो          ro-      'weep'
धो          dʰo-     'wash'
छो          tsʰo-    'touch'

Table 4.7a: Change of o to u in o-ending verb stems
Verb stem   IPA        Gloss
रुनु         ru-nu      'to weep'
धुनु         dʰu-nu     'to wash'
छुनु         tsʰu-nu    'to touch'
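The obligatory change of stem-final o to u shown in Table 4.7a can be sketched as follows. This is a minimal Python illustration over IPA strings, not the rule as compiled in this study; the name `o_stem_before` and the tuple `RAISING_ONSETS` are hypothetical, and the insertion of n in the non-past tense is deliberately left out.

```python
# Suffix-initial segments that trigger o -> u, as named in the text.
RAISING_ONSETS = ("tsʰ", "d", "tʰ", "n")

def o_stem_before(stem: str, suffix: str) -> str:
    """Return the shape an o-ending stem takes before `suffix` (IPA)."""
    if stem.endswith("o") and suffix.startswith(RAISING_ONSETS):
        return stem[:-1] + "u"      # obligatory change of o to u
    return stem                     # elsewhere the stem keeps o

infinitive = o_stem_before("ro", "nu") + "nu"
```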

iv. ʌ-ending verb stems: There is a small set of verbs which end with the vowel अ ʌ. The vowel अ ʌ at the end of the verb stem drops if a suffix beginning with ए e, इ i, उ u or ओ o is attached. Table 4.7b lists some ʌ-ending verb stems and Table 4.7c shows the dropping of the vowel ʌ.

Table 4.7b: ʌ-ending verb stems
Verb stem   IPA      Gloss
सह          sʌɦʌ-    'tolerate'
रह          rʌɦʌ-    'remain'

Table 4.7c: ʌ-ending verb stems (ʌ-dropped)
Verb stem   IPA        Gloss
सहेर        sʌɦ-erʌ    'tolerate-CONJUNCT'
रहेर        rʌɦ-erʌ    'remain-CONJUNCT'
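The dropping of stem-final ʌ before a vowel-initial suffix, as in Table 4.7c, can be sketched in a few lines. The Python fragment below is only an illustration over IPA strings; the function name `attach` is hypothetical and the list of triggering vowels restates the text (e, i, u, o).

```python
VOWEL_ONSETS = ("e", "i", "u", "o")   # suffix-initial vowels named in the text

def attach(stem: str, suffix: str) -> str:
    """Concatenate stem and suffix, dropping stem-final ʌ before a vowel."""
    if stem.endswith("ʌ") and suffix.startswith(VOWEL_ONSETS):
        stem = stem[:-1]              # ʌ drops before e, i, u, o
    return stem + suffix

conjunct = attach("sʌɦʌ", "erʌ")
```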

In the vowel-ending verb stems, except for the verbs in Table 4.2a and Table 4.4, a semantically null element न् n is inserted between the stem and the suffix if the suffix begins with छ् tsʰ or थ् tʰ. In the case of the verb stems in Table 4.2a and Table 4.4, however, only ◌ँद ̃dʌ is inserted, after उ u has been inserted for some other purpose.

b. Consonant final stems
i. Voiceless consonant ending stems: The verb stems that end with voiceless consonants are of both intransitive and transitive types. Some examples of verb stems ending with voiceless consonants are listed in Table 4.7d.

Table 4.7d: Verb stems ending with a voiceless consonant
Verb stem   IPA        Gloss
कस्         kʌs-       'tighten'
काँप्        kãp-       'tremble'
घसेट्       gʰʌset̺-    'drag'
जाक्        dzak-      'insert'
फ्याँक्      pʰjãk-     'throw'
नाच्        nats-      'dance'

In this group of verb stems, the semantically null elements त tʌ or द dʌ are optionally inserted between the stem and the suffix if the suffix begins with छ् tsʰ or थ् tʰ. These forms are used only in the non-past tense and the past habitual aspect. The alternative forms of the stems are listed in Table 4.8.

Table 4.8: Alternative forms from stems ending with voiceless consonant
Base stem        Form 1              Form 2
कस् kʌs          कस्त kʌstʌ-         कस्द kʌsdʌ-
काँप् kãp         काँप्त kãptʌ-        काँप्द kãpdʌ-
घसेट् gʰʌset̺     घसेट्त gʰʌset̺tʌ-    घसेट्द gʰʌset̺dʌ-
जाक् dzak        जाक्त dzaktʌ-       जाक्द dzakdʌ-
फ्याँक् pʰjãk     फ्याँक्त pʰjãktʌ-    फ्याँक्द pʰjãkdʌ-
नाच् nats        नाच्त natstʌ-       नाच्द natsdʌ-

ii. Voiced consonant ending stems: The verb stems that end with voiced consonants are of both intransitive and transitive types. Some examples of verb stems ending with voiced consonants are listed in Table 4.9.

Table 4.9: Verb stems ending with voiced consonant
Verb stem   IPA         Gloss
बोल्        bol-        'speak'
पिँद्        pĩd-        'grind'
थुन्         tʰun-       'close'
पछार्       pʌtsʰar-    'throw down'
डुब्         d̺ub-        'sink'
छाम्        tsʰam-      'feel'
खोज्        kʰodz-      'search'

In this group of stems also, a semantically null element द dʌ is optionally inserted between the stem and the suffix if the suffix begins with छ् tsʰ or थ् tʰ. These forms are used only in the non-past tense and the past habitual aspect. The alternative forms of the stems are listed in Table 4.10.

Table 4.10: Alternative forms from stems ending with voiced consonant
Base stem          Alternative form
बोल् bol-          बोल्द boldʌ-
पिँद् pĩd-          पिँद्द pĩddʌ-
थुन् tʰun-          थुन्द tʰundʌ-
पछार् pʌtsʰar-     पछार्द pʌtsʰardʌ-
डुब् d̺ub-          डुब्द d̺ubdʌ-
छाम् tsʰam-        छाम्द tsʰamdʌ-
खोज् kʰodz-        खोज्द kʰodzdʌ-
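The extended stems of Table 4.8 and Table 4.10 can be generated with a small Python sketch. It assumes IPA input and a consonant-final stem, and it decides voicelessness from a hand-listed set of final segments, which is an assumption made for this illustration; the names `VOICELESS_FINALS` and `extended_stems` are hypothetical.

```python
# Assumption: final segments treated as voiceless in this sketch.
VOICELESS_FINALS = ("p", "t", "k", "s", "pʰ", "tʰ", "kʰ", "tsʰ")

def extended_stems(stem: str) -> list:
    """Optional extended shapes of a consonant-final stem (IPA)."""
    # Ignore the retroflex diacritic (U+033A) when inspecting the final segment.
    base = stem.replace("\u033A", "")
    if base.endswith(VOICELESS_FINALS):
        return [stem + "tʌ", stem + "dʌ"]   # voiceless final: tʌ or dʌ
    return [stem + "dʌ"]                    # voiced final: dʌ only

voiceless_forms = extended_stems("kʌs")
voiced_forms = extended_stems("bol")
```

The function only lists the extended shapes; restricting them to suffixes beginning with tsʰ or tʰ in the non-past tense and past habitual aspect would be the job of the surrounding lexicon.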

4.1.2 Transitivity
Transitivity is the number of arguments that a verb takes (Katamba 1993:256-62; Payne 1997:171). Transitivity is significant in verbs, and the morphology of the verbs can be further analyzed in terms of this feature.

a. Intransitive verbs
Those verbs which take only one argument, as the subject noun phrase, are intransitive verbs. In example (4) the verb उठ् ut̺ʰ 'get up' has taken only one argument, उ u 'he', as subject, and in example (5) the verb बस् bʌs 'sit' has taken only one argument, उ u 'he', as subject; therefore, they are intransitive verbs.

(4) उ बिहानै उठ्यो।
    u bihan-ʌi ut̺ʰ-jo
    he morning-EMP rise-PST.3SG.MASC
    'He got up early in the morning.'

(5) उ सँधै घरमा बस्छ।
    u sʌ̃dʰʌi gʰʌr-ma bʌs-tsʰʌ
    3SG always home-LOC sit-NPST.3SG.MASC
    'He always stays at home.'

The other verbs listed in Table 4.11, such as कुद् kud 'run', बस् bʌs 'sit' and सुत् sut 'sleep', also take only one argument as the subject.

Table 4.11: Intransitive verbs
Intransitive verb   IPA       Gloss
उठ्                 ut̺ʰ-      'wake up'
कुद्                 kud-      'run'
बस्                 bʌs-      'sit'
लड्                 lʌd̺-      'fall down'
सुत्                 sut-      'sleep'
अघा                 ʌgʰa-     'satisfied'

b. Transitive/ditransitive verbs
Those verbs which take two arguments are said to be transitive, and those which take three arguments are said to be ditransitive. Both types of verbs are kept here under the same group, as they behave in the same way at the morphological level. Tables 4.12 and 4.13 list the transitive verbs and the ditransitive verbs, respectively. The verb काट् kat̺ 'cut' in (6) has taken two arguments, श्याम sjam 'Shyam' and रुख rukʰ 'tree', as subject and object of the sentence, respectively. The verb दि di 'give' in (7) has taken three arguments, मै mʌi '1SG', उस् us 'he.OBL' and किताब kitab 'book', as subject, indirect object and direct object of the sentence, respectively.

(6) श्यामले रुख काट्यो।
    sjam-le rukʰ kat̺-jo
    Shyam-ERG tree cut-PST.3SG.MASC
    'Shyam cut the tree.'

(7) मैले उसलाई किताब दिएँ।
    mʌi-le us-lai kitab di-ẽ
    1SG.OBL-ERG 3SG.OBL-DAT book give-PST.1SG.MASC
    'I gave him a book.'

The transitive verbs listed in Table 4.12 take only two arguments, as subject and object, and the ditransitive verbs listed in Table 4.13 take three arguments, as subject, indirect object and direct object.

Table 4.12: Some transitive verbs
Transitive verb   IPA       Gloss
काट्              kat̺-      'cut'
खा                kʰa-      'eat'
चुस्               tsus-     'suck'
पढ्               pʌd̺ʰ-     'read'
टोक्              t̺ok-      'bite'

Table 4.13: Some ditransitive verbs
Ditransitive verb   IPA       Gloss
तिर्                tir-      'pay'
बेच्                bets-     'sell'
दि                  di-       'give'
लेख्                lekʰ-     'write'
सोध्                sodʰ-     'ask'

4.1.3 Syllabicity
Nepali verb stems can be grouped into two classes based on the number of syllables in a stem. This feature is significant especially in causative stem formation.

a. Monosyllabic verb stems
Those verb stems which have only one syllable are said to be monosyllabic verb stems. Some examples are listed in Table 4.14.

Table 4.14: Monosyllabic verb stems
Verb stem   IPA       Gloss
बोल्        bol-      'speak'
खा          kʰa-      'eat'
पिँद्        pĩd-      'grind'
थुन्         tʰun-     'close'
कस्         kʌs-      'tighten'
डुब्         d̺ub-      'sink'
छाम्        tsʰam-    'feel'
खोज्        kʰodz-    'search'
खोल्        kʰol-     'open'
सुक्         suk-      'be dried'
खा          kʰa-      'eat'
जा          dza-      'go'
दि          di-       'give'
धो          dʰo-      'wash'
रो          ro-       'weep'
सि          si-       'sew'
पि          pi-       'drink'

b. Polysyllabic verb stems
Those verb stems which are formed from two or more syllables are said to be polysyllabic verb stems. Some examples are listed in Table 4.15.

Table 4.15: Polysyllabic verb stems
Verb stem   IPA         Gloss
उफ्रि        upʰri-      'jump'
खुम्चि        kʰumtsi-    'shrink'
भत्कि        bʰʌtki-     'be broken'
पछार्       pʌtsʰar-    'throw down'
घसेट्       gʰʌset̺-     'drag'
मुस्कुरा      muskura-    'insert'
निचोर्       nitsor-     'squeeze'
निमोठ्       nimot̺ʰ-     'twist'
चिथोर्       tsitʰor-    'scratch'
छिमल्       tsʰimʌl-    'prune'
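The monosyllabic/polysyllabic split of Table 4.14 and Table 4.15 can be approximated by counting vowel nuclei in the IPA transcription. The Python sketch below is an illustration under stated assumptions, not the classification procedure of this study: the vowel inventory is limited to i, e, a, ʌ, o, u, a run of adjacent vowels is counted as a single nucleus, and the names `syllable_count` and `is_monosyllabic` are hypothetical.

```python
import re
import unicodedata

# A nucleus is a run of vowels, each optionally carrying a nasal tilde (U+0303).
NUCLEUS = re.compile("(?:[ieaʌou]\u0303?)+")

def syllable_count(stem: str) -> int:
    """Count vowel nuclei in an IPA stem."""
    # NFD splits precomposed nasal vowels into base vowel + combining tilde.
    decomposed = unicodedata.normalize("NFD", stem)
    return len(NUCLEUS.findall(decomposed))

def is_monosyllabic(stem: str) -> bool:
    return syllable_count(stem) == 1

count_speak = syllable_count("bol")
count_jump = syllable_count("upʰri")
```

Treating vowel sequences as one nucleus is a simplification; it does not change the classification of any stem in the two tables, but it may need revision for other data.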


4.1.4 Sound आ a
The sound आ a appears in Nepali verb stems in two manifestations: as a normal vowel phoneme आ a, and as the causative marker -आ -a in the formation of causative verb stems. The presence or absence of the sound आ a in the base verb stem is very significant for forming the causative stems. Therefore, the basic verb stems can be grouped into two classes, i.e., stems with the sound आ a and stems without it. Some examples of the former group are listed in Table 4.16 and of the latter group in Table 4.17.

Table 4.16: Verb stems with a sound
Verb stem   IPA         Gloss
खाँद्        kʰãd-       'press down'
गाल्        gal-        'melt'
छान्        tsʰan-      'choose'
पछार्       pʌtsʰar-    'throw down'
कोचार्      kotsar-     'insert into'
डकार्       d̺ʌkar-      'belch'

Table 4.17: Verb stems without a sound
Verb stem   IPA         Gloss
तिर्         tir-        'pay'
बल्         bʌl-        'burn'
खोप्        kʰop-       'cut deep'
घट्         gʰʌt̺-       'be less'
चिमट्       tsimʌt̺-     'pinch'
छिमल्       tsʰimʌl-    'prune'

4.2 Morphological processes

4.2.1 Causativization/transitivization
In transitivization, an argument is added irrespective of its role, but in causativization the added argument is definitely the causer. The morphological change in the verb stem and the syntactic make-up are the same in both processes; however, the interpretation may differ semantically (Katamba 1993:274-5; Pokharel 2054VS:6-16). In this study, both are treated as a single process. In sentence (8a), the verb सुत्यो sut-jo 'sleep-PST.3SG' is non-causative and has taken बच्चो bʌtstso 'child' as the subject of the sentence. When it is causativized as सुताइन् sut-a-in 'sleep-CAUS-PST.3SG.FEM.HON' in (8b), it takes a new subject, आमा ama 'mother', as the causer, and the subject of the non-causative construction is demoted to the object of the causativized verb. So, in the process of causativization, a morphological causative marker is suffixed to the verb stem and is followed by the agreement markers. Table 4.18 lists some examples of such causative verb stems.

(8) a. बच्चो सुत्यो।
       bʌtstso sut-jo
       child.SG.MASC sleep-PST.3SG.MASC
       'The child slept.'

    b. आमाले बच्चालाई सुताइन्।
       ama-le bʌtstsa-lai sut-a-in
       mother-ERG child-DAT sleep-CAUS-PST.3SG.FEM.HON
       'The mother made the child sleep.'

Table 4.18: Causative verb stems
Causative verb   IPA        Gloss
उठा              ut̺ʰ-a      'cause to wake up'
सुता              sut-a      'cause to sleep'
तिरा              tir-a      'cause to pay'
लेखा             lekʰ-a     'cause to write'
भना              bʰʌn-a     'cause to say'

Some ways of causative formation

a. by the suffix -आ -a
Causativization by the causative marker -आ -a is the most regular process, and the bulk of the non-causative stems become causative stems by it. The verb stems listed in Table 4.18 are formed by this method.2

2 Most Nepali grammarians believe that the basic causative marker is -आउ -au. In this study, however, -आ -a is assumed to be the basic causative marker, simply for computing purposes.

b. by both the -आ -a and -आल् -al suffixes
A small set of verb stems takes, in addition to the marker -आ -a, the marker -आल् -al to form causative stems. For example, the verb stem खस् kʰʌs 'drop' in (9a) is causativized by the marker -आ -a in (9b) and by -आल् -al in (9c). Table 4.19 lists some examples of this type of causative stem formation.

(9) a. ढुङ्गा खस्यो।
       d̺ʰuŋga kʰʌs-jo
       stone drop-PST.3SG.MASC
       'The stone dropped.'

    b. केटाले ढुङ्गा खसायो।
       ket̺a-le d̺ʰuŋga kʰʌs-a-jo
       boy-ERG stone drop-CAUSE-PST.3SG.MASC
       'The boy dropped a stone.'

    c. केटाले ढुङ्गा खसाल्यो।
       ket̺a-le d̺ʰuŋga kʰʌs-al-jo
       boy-ERG stone drop-CAUSE-PST.3SG.MASC
       'The boy dropped a stone.'

Table 4.19: Verb stems forming causatives with -आ -a and -आल् -al
Base            Gloss      Causative                        Gloss
बस् bʌs-        sit        बसा/बसाल् bʌsa-/bʌsal-           cause to sit
खस् kʰʌs-       drop       खसा/खसाल् kʰʌsa-/kʰʌsal-         cause to drop
चुँड् tsũd̺-      snatch     चुँडा/चुँडाल् tsũd̺a-/tsũd̺al-       cause to snatch
छिन् tsʰin-     chop off   छिना/छिनाल् tsʰina-/tsʰinal-     cause to chop off

c. by अ ʌ → आ a
A small set of monosyllabic verb stems having the vowel अ ʌ between consonants (i.e., CʌC structure) forms the causative stem by changing the vowel अ ʌ to आ a. The verb stem मर् mʌr 'die' in (10a) is causativized as मार् mar 'kill' in (10b). Some of the verb stems whose causative stems are formed in this way are listed in Table 4.20.

(10) a. मृग मर्‍यो।
        mrigʌ mʌr-jo
        deer die-PST.3SG.MASC
        'The deer died.'

     b. बाघले मृग मार्‍यो।
        bagʰ-le mrigʌ mar-jo
        tiger-ERG deer die-CAUSE-PST.3SG.MASC
        'The tiger killed the deer.'

Table 4.20: Verb stems forming causatives by changing अ ʌ to आ a
Base verb     Gloss       Causative     Gloss
मर् mʌr-      die         मार् mar-     kill
सर् sʌr-      shift       सार् sar-     cause to shift
चल् tsʌl-     move        चाल् tsal-    cause to move
टर् t̺ʌr-      pass over   टार् t̺ar-     cause to pass over
पर् pʌr-      fall        पार् par-     cause to fall
गल् gʌl-      melt        गाल् gal-     cause to melt
बल् bʌl-      burn        बाल् bal-     cause to burn

d. by उ u → ओ o
Another set of monosyllabic verb stems having the vowel उ u between consonants (i.e., CuC structure) forms the causative stem by changing the vowel उ u to ओ o. The verb stem खुल् kʰul 'open' in (11a) is causativized as खोल् kʰol 'open.CAUSE' in (11b). Some of the verb stems whose causative stems are formed in this way are listed in Table 4.21a.

(11) a. ढोका खुल्यो।
        d̺ʰoka kʰul-jo
        door open-PST.3SG.MASC
        'The door opened.'

     b. पालेले ढोका खोल्यो।
        pale-le d̺ʰoka kʰol-jo
        gate-keeper-ERG door open.CAUSE-PST.3SG.MASC
        'The gate keeper opened the door.'

Table 4.21a: Verb stems forming causatives by changing उ u to ओ o
Base verb       Gloss            Causative       Gloss
छुट् tsʰut̺-     be left behind   छोड् tsʰod̺-     cause to be left behind
खुल् kʰul-      open             खोल् kʰol-      cause to open
फुट् pʰut̺-      break            फोड् pʰod̺-      cause to break
घुल् gʰul-      dissolve         घोल् gʰol-      cause to dissolve

Interestingly, both sets of verb stems listed in Table 4.21a can also be causativized with the causative marker -आ -a, like the verb stems listed in Table 4.18. The causative verb stems of this set are listed in Table 4.21b.3

3 The change of ट् t̺ to ड् d̺ in Table 4.21a is not discussed here (see Pokharel 2054VS). The causativizations shown in Table 4.21a and Table 4.21b have slightly different semantics.

Table 4.21b: Verb stems forming causatives by suffixing -आ -a
Base verb       Gloss            Causative        Gloss
छुट् tsʰut̺-     be left behind   छुटा tsʰut̺-a     cause to be left behind
खुल् kʰul-      open             खुला kʰul-a      cause to open
फुट् pʰut̺-      break            फुटा pʰut̺-a      cause to break
घुल् gʰul-      dissolve         घुला gʰul-a      cause to dissolve
छोड् tsʰod̺-     be left behind   छोडा tsʰod̺-a     cause to be left behind
खोल् kʰol-      open             खोला kʰol-a      cause to open
फोड् pʰod̺-      break            फोडा pʰod̺-a      cause to break
घोल् gʰol-      dissolve         घोला gʰol-a      cause to dissolve

e. by आ a insertion
A subset of polysyllabic i-ending verb stems containing a consonant cluster forms the causative stem by inserting the vowel आ a between the consonants of the cluster. The verb stem पग्लि pʌgli 'melt' in (12a) is causativized as पगाल् pʌgal 'melt.CAUSE' in (12b). Some examples of verb stems that undergo this process are listed in Table 4.22.

(12) a. हिउँ पग्लियो।
        ɦiũ pʌgli-jo
        ice melt-PST.3SG.MASC
        'The ice melted.'

     b. घामले हिउँ पगाल्यो।
        gʰam-le ɦiũ pʌgal-jo
        sun-ERG ice melt.CAUSE-PST.3SG.MASC
        'The sun melted the ice.'

Table 4.22: Verb stems forming causatives by inserting आ a
Base verb       Gloss      Causative        Gloss
उफ्रि upʰri-     jump       उफार् upʰar-     cause to jump
बिग्रि bigri-     spoil      बिगार् bigar-     cause to spoil
सप्रि sʌpri-     flourish   सपार् sʌpar-     cause to flourish
उघ्रि ugʰri-     open       उघार् ugʰar-     cause to open
पग्लि pʌgli-     melt       पगाल् pʌgal-     cause to melt
उक्लि ukli-      climb up   उकाल् ukal-      cause to climb up

It is now clear that causative stem formation from base verb stems depends on various features of the verb stem, such as syllabicity, the presence or absence of the आ a sound in the stem, transitivity and the stem-final segment.4

4.2.2 Passivization
Passivization is the opposite of causativization in terms of syntax. When passivization takes place, the subject noun phrase is either demoted to a postpositional phrase or dropped (Katamba 1993:268-9; Pokharel 2054VS:1-5). In Nepali, passivization of intransitive verbs is also possible, but it is restricted to default agreement (i.e., third person singular) and is limited in some other aspects of its morphology and interpretation as well (Pokharel 2054VS; Adhikari 2062VS). Passivization of transitive and causative verbs, by contrast, shows the full morphological paradigm and the full range of interpretations. In both cases, however, the passive marker is the same, namely -इ -i, which follows the non-passive stem. The verb सुत् sut 'sleep' in (13a) is intransitive and सुति sut-i 'sleep-PASS' in (13b) is its passive form. The verb लेख् lekʰ 'write' in (13c) is transitive and लेखि lekʰ-i 'write-PASS' in (13d) is its passive form; लेखा lekʰ-a 'write-CAUSE' in (13e) is the causative stem and लेखाइ lekʰ-a-i 'write-CAUSE-PASS' in (13f) is the causative-passive stem. Therefore, a passive stem can, at least theoretically, be derived from intransitive, transitive and causative verb stems. Table 4.23 lists some passive forms of verbs.

4 See Adhikari (2062VS) and Pokharel (2054VS) for detailed information.

(13) a. म आज राम्ररी सुतेँ।
mʌ adzʌ ramrʌri sut-ẽ
1SG today nice sleep-PST.1SG
'I slept nicely today.'
b. आज राम्ररी सुतियो।
adzʌ ramrʌri sut-i-jo
today nice sleep-PASS-PST.3SG.MASC
'(I) slept nicely today.'
c. उसले एउटा चिठ्ठी लेख्यो।
us-le eut̺a tsitʰtʰiː lekʰ-jo
3SG-ERG one.CLF letter write-PST.3SG.MASC
'He wrote a letter.'
d. उसबाट एउटा चिठ्ठी लेखियो।
us-bat̺ʌ eut̺a tsitʰtʰiː lekʰ-i-jo
3SG-ABL one.CLF letter write-PASS-PST.3SG.MASC
'A letter was written by him.'
e. उसले एउटा चिठ्ठी लेखायो।
us-le eut̺a tsitʰtʰiː lekʰ-a-jo
3SG-ERG one.CLF letter write-CAUSE-PST.3SG.MASC
'He caused to write a letter.'
f. उसबाट एउटा चिठ्ठी लेखाइयो।
us-bat̺ʌ eut̺a tsitʰtʰiː lekʰ-a-i-jo
3SG-ABL one.CLF letter write-CAUSE-PASS-PST.3SG.MASC
'He was made to write a letter.'

Table 4.23: Some passive verb stems

Passive verb    IPA          Gloss
उठि             ut̺ʰ-i-       'be woken up'
सुति            sut-i-       'be slept'
तिराइ           tir-a-i-     'cause to be paid'
लेखाइ           lekʰ-a-i-    'cause to be written'
अघाइ            ʌgʰa-i-      'be satisfied'

4.2.3 Negativization
Negativization in Nepali is primarily an affixation process which includes both prefixation and suffixation. The same negative marker न nʌ 'NEG' is used in both cases; its form is constant in prefixation, whereas it is slightly modified in suffixation owing to morphophonemic changes (Pokharel 2054VS:40-6).

a. Prefixation
Negativization by prefixation takes place in the potential, optative and imperative moods, in the perfect and imperfect aspects, and in the participial forms (absolutive, conjunctive, infinitive, purposive, perfective, prospective and conditional). Table 4.24 shows negation by prefixation for the verb खा kʰa- 'eat'.

Table 4.24: Negation by the prefixation of negative marker न- nʌ-

Grammatical categories    Positive            Negative
Potential                 खाला kʰala          नखाला nʌ-kʰala
Optative                  खाएस् kʰaes         नखाएस् nʌ-kʰaes
Imperative                खा kʰa              नखा nʌ-kʰa
Perfect Aspect            खाएको kʰa-eko       नखाएको nʌ-kʰa-eko
Imperfect Aspect          खाँदै kʰã-dʌi        नखाँदै nʌ-kʰã-dʌi
Absolutive                खाई kʰa-iː          नखाई nʌ-kʰa-iː
Conjunctive Participle    खाएर kʰa-erʌ        नखाएर nʌ-kʰa-erʌ
Infinitive                खानु kʰa-nu         नखानु nʌ-kʰa-nu
Purposive                 खान kʰa-nʌ          नखान nʌ-kʰa-nʌ
Conditional               खाए kʰa-e           नखाए nʌ-kʰa-e
Perfective                खाए kʰa-e           नखाए nʌ-kʰa-e
Prospective               खाने kʰa-ne         नखाने nʌ-kʰa-ne

b. Suffixation

Negativization by suffixation takes place in the past and non-past tenses and in the past habitual and inferential aspects, as shown in Table 4.25 for the verb खा kʰa- 'eat'. The negative marker -न -nʌ 'NEG' always follows the tense marker and precedes the agreement markers.5

Table 4.25: Negation by the suffixation of negative marker -न -nʌ

Grammatical categories    Positive              Negative
Non-Past Tense            खान्छ kʰa-ntsʰʌ       खाँदैन kʰã-dʌinʌ
Past Tense                खायो kʰa-jo           खाएन kʰa-enʌ
Past Habitual Aspect      खान्थ्यो kʰa-ntʰjo     खाँदैनथ्यो kʰã-dʌinʌ-tʰjo
Inferential Aspect        खाएछ kʰa-etsʰʌ        खाएनछ kʰa-e-nʌ-tsʰʌ
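The division of labour between the two negation strategies can be sketched in Python as follows. This is only an illustration under stated assumptions: the category labels and the function name `negate` are hypothetical, forms are written in the IPA transcription of Tables 4.24 and 4.25, and the suffixal negatives are looked up from Table 4.25 rather than derived by rule, since the morphophonemic changes involved are not spelled out here.

```python
# Hypothetical sketch: prefixal vs. suffixal negation for kʰa- 'eat'.

# Categories that take the negative prefix nʌ- (Table 4.24).
PREFIX_CATEGORIES = {
    "potential", "optative", "imperative", "perfect", "imperfect",
    "absolutive", "conjunctive", "infinitive", "purposive",
    "conditional", "perfective", "prospective",
}

# (positive, negative) pairs copied from Table 4.25.
SUFFIX_NEGATIVES = {
    "non-past": ("kʰa-ntsʰʌ", "kʰã-dʌinʌ"),
    "past": ("kʰa-jo", "kʰa-enʌ"),
    "past habitual": ("kʰa-ntʰjo", "kʰã-dʌinʌ-tʰjo"),
    "inferential": ("kʰa-etsʰʌ", "kʰa-e-nʌ-tsʰʌ"),
}


def negate(form, category):
    """Return the negative counterpart of a positive form."""
    if category in PREFIX_CATEGORIES:
        # Prefixation leaves the verb form itself untouched.
        return "nʌ-" + form
    positive, negative = SUFFIX_NEGATIVES[category]
    if form != positive:
        raise ValueError("form not covered by the lookup table")
    return negative
```

The lookup for the suffixal cases is a deliberate simplification; a rule-based treatment would need the morphophonemic rules that modify -nʌ after the tense marker.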

4.3 Stem formation
As discussed in (4.2.1), causativization is very productive in Nepali verbs at the morphological level. Causative stems are formed from both intransitive and transitive verb stems. Thus, with respect to causativization, stems can be divided into two types: base verb stems and causative stems. However, there are some verb stems from which causative stems cannot be formed because of phonological or semantic constraints. Passivization, as discussed in (4.2.2), is also very productive in Nepali morphology: almost all verb stems, whether intransitive or transitive, can be passivized. Moreover, causative stems formed from non-causative stems can themselves be passivized, so that causative-passive stems are also possible. It can therefore be generalized that a verb can have at least four different forms, as shown in Table 4.26.

5 In the non-past tense and the past habitual aspect, the negative marker is preceded by दै dʌi; its status is yet to be determined.

Table 4.26: Pattern of the stem formation

Category                       Form     Example 'write'
Basic verb stem                V        लेख् lekʰ
Passive verb stem              V-i      लेखि lekʰ-i
Causative verb stem            V-a      लेखा lekʰ-a
Causative Passive verb stem    V-a-i    लेखाइ lekʰ-a-i
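The pattern in Table 4.26 amounts to simple concatenation at the lexical level, which can be sketched in Python as below. The function name `stem_forms` is hypothetical, stems are given in IPA with hyphenated morpheme boundaries, and the sketch deliberately ignores both the morphophonemic rules and the verb classes that lack causative stems.

```python
def stem_forms(base):
    """Return the four stem forms of Table 4.26 for a base stem V.

    Keys are the morphological tags used in this chapter; values are
    morpheme-segmented forms (V, V-i, V-a, V-a-i).
    """
    causative = base + "-a"
    return {
        "+VERB": base,
        "+VERB+PASS": base + "-i",
        "+VERB+CAUSE": causative,
        "+VERB+CAUSE+PASS": causative + "-i",
    }


forms = stem_forms("lekʰ")
for tag, form in forms.items():
    print(tag, form)
```

For लेख् lekʰ 'write' this reproduces the four segmented forms given in the table.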

4.4 Grouping of verb stems in Nepali Characteristic features of Nepali verbs discussed in (4.1) and (4.2) are taken as the bases for grouping of Nepali verbs into various classes, so that the syntax of morphemes can be described and implemented to create the finite state network. At the same time, classification of verb stems also helps in branching of sub-lexicons to their respective inflectional paradigms. The phonological rules that are identified can also be systematically implemented.

4.4.1 Intransitive verb stems
a. Verb stem Type1a
This class comprises a-ending polysyllabic verbs that have only two forms: the base stem and the passive stem. Some such verb stems are listed in both forms in Table 4.27, with their corresponding morphological tags and the gloss of the base stem.

Table 4.27: Type1a verb stems

Base form            Tags     Passive form            Tags          Gloss of base
अघा ʌgʰa             +VERB    अघाइ ʌgʰa-i             +VERB+PASS    'to be satisfied'
करा kʌra             +VERB    कराइ kʌra-i             +VERB+PASS    'to shout'
निदा nida            +VERB    निदाइ nida-i            +VERB+PASS    'to sleep'
बहुला bʌhula         +VERB    बहुलाइ bʌhula-i         +VERB+PASS    'to be mad'
मुस्कुरा muskura     +VERB    मुस्कुराइ muskura-i     +VERB+PASS    'to smile'
लजा lʌdza            +VERB    लजाइ lʌdza-i            +VERB+PASS    'to be shy'
टुसा t̺usa            +VERB    टुसाइ t̺usa-i            +VERB+PASS    'to sprout'

The finite state transducer illustrated in Figure 4.1 encodes the verb stems listed in Table 4.27 and is capable of both analyzing and generating them.
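What "analyzing and generating" means for this class can be illustrated with a small Python stand-in. It is not a finite state transducer and not the implementation used in this study: it is a plain bidirectional lookup, the names `build_pairs`, `generate` and `analyze` are hypothetical, and stems are given in IPA without morpheme boundaries on the surface side.

```python
# Type1a stems in IPA (a subset of Table 4.27).
TYPE1A_STEMS = ["ʌgʰa", "kʌra", "nida", "bʌhula", "muskura", "lʌdza"]


def build_pairs(stems):
    """Pair each lexical string (stem + tags) with its surface string."""
    pairs = []
    for stem in stems:
        pairs.append((stem + "+VERB", stem))             # base stem
        pairs.append((stem + "+VERB+PASS", stem + "i"))  # passive stem
    return pairs


PAIRS = build_pairs(TYPE1A_STEMS)
GENERATE = dict(PAIRS)                                   # lexical -> surface
ANALYZE = {surface: lexical for lexical, surface in PAIRS}  # surface -> lexical


def generate(lexical):
    """Map a lexical string to its surface form, or None if unknown."""
    return GENERATE.get(lexical)


def analyze(surface):
    """Map a surface form to its lexical string, or None if unknown."""
    return ANALYZE.get(surface)
```

A real transducer encodes the same relation as states and arcs, so that both directions come from one network rather than from two dictionaries.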

Figure 4.1: A finite state transducer for Type1a verb stems

b. Verb stem Type1b
This class comprises i-ending polysyllabic verbs that have four forms: base stems, passive stems, causative stems and causative-passive stems. Some such verbs are listed in all four forms in Table 4.28, with their corresponding morphological tags and the gloss of the base stem.

Table 4.28: Type1b verb stems

+VERB             +VERB+PASS           +VERB+CAUSE           +VERB+CAUSE+PASS         Gloss of base
चोखि tsokʰi       चोखिइ tsokʰi-i       चोख्या tsokʰj-a       चोख्याइ tsokʰj-a-i       'to be pure'
गुम्सि gumsi      गुम्सिइ gumsi-i      गुम्स्या gumsj-a      गुम्स्याइ gumsj-a-i      'to be suffocated'
घोप्टि gʰopti     घोप्टिइ gʰopti-i     घोप्ट्या gʰoptj-a     घोप्ट्याइ gʰoptj-a-i     'to be overturned'
टुक्रि t̺ukri      टुक्रिइ t̺ukri-i      टुक्र्या t̺ukrj-a      टुक्र्याइ t̺ukrj-a-i      'to be broken into pieces'

The finite state transducer illustrated in Figure 4.2 encodes the verb stems listed in Table 4.28 and it is capable of analyzing and generating them.

Figure 4.2: A finite state transducer for Type1b verb stems

The rules listed in PR 4.1 are compiled into a finite state transducer and composed with the finite state transducer illustrated in Figure 4.2.

Phonological rule PR 4.1
i. The stem-final vowel ◌ि i of the i-ending intransitive verbs is changed to य j before the causative marker आ a at the surface level.
Regular expression: ◌ि -> ◌् य || __ आ ;
ii. The independent vowel आ a changes to its corresponding dependent vowel ◌ा a after य j.
Regular expression: आ -> ◌ा || य __ ;
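The effect of the two rules in PR 4.1 can be traced with a small Python sketch. It uses the standard `re` module as a stand-in for the replace rules; the function name `apply_pr_4_1` is hypothetical, and the input is assumed to be the lexical string formed by the stem followed directly by the independent vowel आ.

```python
import re

I_MATRA = "\u093f"   # dependent vowel sign i
AA_MATRA = "\u093e"  # dependent vowel sign aa
HALANT = "\u094d"    # halanta (virama)
YA = "\u092f"        # consonant ya
AA = "\u0906"        # independent vowel aa (causative marker)


def apply_pr_4_1(lexical):
    """Apply PR 4.1 (i) and then (ii) to a stem + causative marker string."""
    # Rule i: i-matra becomes halanta + ya before the causative marker.
    step1 = re.sub(I_MATRA + "(?=" + AA + ")", HALANT + YA, lexical)
    # Rule ii: independent aa becomes the dependent aa-matra after ya.
    return re.sub("(?<=" + YA + ")" + AA, AA_MATRA, step1)


# tsokʰi + a -> tsokʰj-a (first row of Table 4.28)
stem = "\u091a\u094b\u0916\u093f"
print(apply_pr_4_1(stem + AA))
```

Applying the rules in sequence mirrors rule composition: the output of rule (i) creates the context य __ that rule (ii) needs.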

c. Verb stem Type1c

This class comprises i-ending polysyllabic verbs that have four forms: base stems, passive stems, causative stems and causative-passive stems. In this group of verbs, the causative marker -a is inserted between the consonants of the consonant cluster when the causative stem is formed, and the final vowel इ i is dropped. Some examples are listed in Table 4.29 with their corresponding morphological tags.

Table 4.29: Type1c verb stems

+VERB             +VERB+PASS           +VERB+CAUSE        +VERB+CAUSE+PASS      Gloss of base
उक्लि ukli        उक्लिइ ukli-i        उकाल् ukal         उकालि ukal-i          'step up'
उघ्रि ugʰri       उघ्रिइ ugʰri-i       उघार् ugʰar        उघारि ugʰar-i         'be opened'
उफ्रि upʰri       उफ्रिइ upʰri-i       उफार् upʰar        उफारि upʰar-i         'jump'
घस्रि gʰʌsri      घस्रिइ gʰʌsri-i      घसार् gʰʌsar       घसारि gʰʌsar-i        'scrawl'
थुप्रि tʰupri     थुप्रिइ tʰupri-i     थुपार् tʰupar      थुपारि tʰupar-i       'be piled up'
निख्रि nikʰri     निख्रिइ nikʰri-i     निखार् nikʰar      निखारि nikʰar-i       'be empty'
पग्लि pʌgli       पग्लिइ pʌgli-i       पगाल् pʌgal        पगालि pʌgal-i         'melt'
सप्रि sʌpri       सप्रिइ sʌpri-i       सपार् sʌpar        सपारि sʌpar-i         'grow well'
सुध्रि sudʰri     सुध्रिइ sudʰri-i     सुधार् sudʰar      सुधारि sudʰar-i       'improve'

The verb stems listed in Table 4.29 are compiled into a finite state transducer as illustrated in Figure 4.3 and it is capable of analyzing and generating them.


Figure 4.3: A finite state transducer for Type1c verb stems

The phonological rules in PR 4.2 are compiled into a finite state transducer and composed with the finite state transducer demonstrated in Figure 4.3.

Phonological rule PR 4.2
i. The causative marker आ a is inserted between the consonants of the consonant cluster of the i-ending intransitive verbs at the surface level.
Regular expression: [. .] -> ◌ा || cons __ cons ;
ii. The stem-final vowel ◌ि i of the i-ending intransitive verbs is deleted in the causative stem.
Regular expression: ◌ि -> [ ] || __ ;
iii. The independent vowel आ a is changed to its corresponding dependent vowel ◌ा a.
Regular expression: आ -> ◌ा || __ .#. ;
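The net effect of PR 4.2 on a Type1c stem can be sketched in Python as a single surface-level rewrite. This collapses the three rules into one regular expression purely for illustration; the name `causative_type1c` is hypothetical, and the consonant class is assumed to be the Unicode range क to ह.

```python
import re

CONS = "[\u0915-\u0939]"  # Devanagari consonants ka..ha (assumed class 'cons')
HALANT = "\u094d"         # halanta joining the cluster
I_MATRA = "\u093f"        # stem-final dependent vowel i
AA_MATRA = "\u093e"       # dependent vowel aa

# C1 + halanta + C2 + i at the end of the stem.
CLUSTER_I = re.compile("(" + CONS + ")" + HALANT + "(" + CONS + ")" + I_MATRA + "$")


def causative_type1c(base):
    """Insert aa inside the final cluster and drop the final i."""
    return CLUSTER_I.sub(
        lambda m: m.group(1) + AA_MATRA + m.group(2) + HALANT, base
    )


# ukli -> ukal, pʌgli -> pʌgal (Table 4.29)
print(causative_type1c("\u0909\u0915\u094d\u0932\u093f"))
print(causative_type1c("\u092a\u0917\u094d\u0932\u093f"))
```

In Devanagari the insertion replaces the halanta between the two consonants with the vowel sign, and the dropped final vowel leaves the last consonant with a halanta.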

d. Verb stem Type1d
This class comprises monosyllabic verb stems with CaC structure that have four forms: base stems, passive stems, causative stems and causative-passive stems. The vowel ◌ा a of the verb stem changes to the vowel अ ʌ when the causative marker आ a follows the stem. Some examples are given in Table 4.30 with their corresponding morphological tags.

Table 4.30: Type1d verb stems

+VERB          +VERB+PASS       +VERB+CAUSE     +VERB+CAUSE+PASS    Gloss of base
काँप् kãp      काँपि kãp-i      कँपा kʌ̃p-a      कँपाइ kʌ̃p-a-i       'shiver'
हाँस् ɦãs      हाँसि ɦãs-i      हँसा ɦʌ̃s-a      हँसाइ ɦʌ̃s-a-i       'laugh'

The verb stems listed in Table 4.30 are compiled into a finite state transducer as illustrated in Figure 4.4 and it is capable of analyzing and generating them.

Figure 4.4: A finite state transducer for Type1d verb stems

The phonological rules involved in this process are listed in PR 4.3; they are compiled and composed with the transducer illustrated in Figure 4.4.

Phonological rule PR 4.3
i. The vowel ◌ा a of verb stems with CaC structure is changed to the vowel अ ʌ at the surface level if the stem is followed by the causative marker आ a.
Regular expression: आ -> [ ] || cons __ cons ;
ii. The independent vowel आ a changes to its corresponding dependent vowel ◌ा a.
Regular expression: आ -> ◌ा || __ .#. ;
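Because अ ʌ is the inherent vowel in Devanagari, changing the stem vowel to ʌ amounts to deleting the vowel sign ◌ा. A minimal Python sketch of the resulting causative formation for CaC stems is given below; the function name `causative_cac` is hypothetical and the code operates directly on surface stems rather than through composed rules.

```python
AA_MATRA = "\u093e"  # dependent vowel aa
HALANT = "\u094d"    # halanta closing a consonant-final stem


def causative_cac(base):
    """Form the causative of a CaC stem: a -> ʌ in the stem, then add -a."""
    # Drop the final halanta so that the causative vowel can attach.
    stem = base[:-1] if base.endswith(HALANT) else base
    # Deleting the aa-matra leaves the inherent vowel ʌ on the consonant.
    stem = stem.replace(AA_MATRA, "", 1)
    # The causative marker surfaces as the dependent vowel sign.
    return stem + AA_MATRA


# kãp -> kʌ̃p-a, ɦãs -> ɦʌ̃s-a (Table 4.30)
print(causative_cac("\u0915\u093e\u0901\u092a\u094d"))
print(causative_cac("\u0939\u093e\u0901\u0938\u094d"))
```

Any nasalization mark on the stem vowel is untouched by the deletion, so it ends up on the consonant carrying the inherent vowel, as in the forms of Table 4.30.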


e. Verb stem Type1e
This class groups monosyllabic consonant-ending intransitive verb stems that have four forms: base stems, passive stems, causative stems and causative-passive stems. Some examples are given in Table 4.31 and Table 4.32 with their corresponding morphological tags.

Table 4.31: Type1e verb stems (i)

+VERB           +VERB+PASS        +VERB+CAUSE       +VERB+CAUSE+PASS     Gloss of base
बस् bʌs         बसि bʌs-i         बसा bʌs-a         बसाइ bʌs-a-i         sit
खस् kʰʌs        खसि kʰʌs-i        खसा kʰʌs-a        खसाइ kʰʌs-a-i        drop
छिन् tsʰin      छिनि tsʰin-i      छिना tsʰin-a      छिनाइ tsʰin-a-i      chop off

Table 4.32: Type1e verb stems (ii)

+VERB           +VERB+PASS         +VERB+CAUSE        +VERB+CAUSE+PASS      Gloss of base
मर् mʌr         मरि mʌr-i          मरा mʌr-a          मराइ mʌr-a-i          'to die'
गल् gʌl         गलि gʌl-i          गला gʌl-a          गलाइ gʌl-a-i          'to melt'
चल् tsʌl        चलि tsʌl-i         चला tsʌl-a         चलाइ tsʌl-a-i         'to move'
जँच् dzʌ̃ts      जँचि dzʌ̃ts-i       जँचा dzʌ̃ts-a       जँचाइ dzʌ̃ts-a-i       'to examine'
झर् dzʰʌr       झरि dzʰʌr-i        झरा dzʰʌr-a        झराइ dzʰʌr-a-i        'to drop'
टर् t̺ʌr         टरि t̺ʌr-i          टरा t̺ʌr-a          टराइ t̺ʌr-a-i          'to escape artfully'
सर् sʌr         सरि sʌr-i          सरा sʌr-a          सराइ sʌr-a-i          'to move aside'

The verb stems listed in Table 4.31 and Table 4.32 are compiled into a finite state transducer as demonstrated in Figure 4.5 and it is capable of analyzing and generating them.


Figure 4.5: A finite state transducer for Type1e verb stems

The phonological rules in PR 4.4 are compiled and composed with the transducer illustrated in Figure 4.5.

Phonological rule PR 4.4
i. The halanta ◌् at the end of a consonant-ending verb stem is deleted before the causative marker आ a and the passive marker इ i at the surface level.
Regular expression: ◌् -> [ ] || __ इ|आ ;
ii. The independent vowels आ a and इ i change to their corresponding dependent vowels ◌ा a and ◌ि i, respectively.
Regular expressions:
आ -> ◌ा || __ .#. ;
इ -> ◌ि || __ .#. ;
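The interaction of the two rules in PR 4.4 can be sketched in Python as a single attachment step. The function name `attach` is hypothetical. The second branch, in which a marker stays an independent vowel after a vowel-final stem, is an assumption inferred from causative-passive forms such as बसाइ bʌs-a-i in Table 4.31, not a rule stated above.

```python
HALANT = "\u094d"  # halanta at the end of a consonant-final stem

# Independent vowel -> dependent vowel sign, for the two markers.
DEPENDENT = {
    "\u0906": "\u093e",  # aa (causative marker)
    "\u0907": "\u093f",  # i  (passive marker)
}


def attach(stem, marker):
    """Attach the causative or passive marker to a verb stem."""
    if stem.endswith(HALANT):
        # Rule i deletes the halanta; rule ii turns the marker into
        # the matching dependent vowel sign on the bare consonant.
        return stem[:-1] + DEPENDENT[marker]
    # Assumed: after a vowel-final stem the marker stays independent.
    return stem + marker


bas = "\u092c\u0938\u094d"             # bʌs 'sit'
causative = attach(bas, "\u0906")      # bʌs-a
print(causative)
print(attach(causative, "\u0907"))     # bʌs-a-i
```

Chaining two calls gives the causative-passive stem, which corresponds to running the stem through the causative and passive arcs of the transducer in turn.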

4.4.2 Transitive verb stem
a. Verb stem Type2a
Polysyllabic verbs which contain the vowel आ a within the stem and have only two forms, the base stem and the passive stem, are grouped in this class. Some examples of such verbs are listed in Table 4.33 and Table 4.34 with their corresponding morphological tags and the gloss of the base stem.

Table 4.33: Type2a verb stems (i)

Base form           Tags     Passive form           Tags          Gloss of stem
उचाल् utsal         +VERB    उचालि utsal-i          +VERB+PASS    'to lift'
अर्जाप् ʌrdzap      +VERB    अर्जापि ʌrdzap-i       +VERB+PASS    'to sharpen'
झपार् dzʰʌpar       +VERB    झपारि dzʰʌpar-i        +VERB+PASS    'to scold'
पछार् pʌtsʰar       +VERB    पछारि pʌtsʰar-i        +VERB+PASS    'to make upside down'
सराप् sʌrap         +VERB    सरापि sʌrap-i          +VERB+PASS    'to curse'

Table 4.34: Type2a verb stems (ii)

VI-form          Tags     Passive form       Tags          Gloss of stem
मार् mar         +VERB    मारि mar-i         +VERB+PASS    'to kill'
गाल् gal         +VERB    गालि gal-i         +VERB+PASS    'to melt'
चाल् tsal        +VERB    चालि tsal-i        +VERB+PASS    'to make move'
जाँच् dzãts      +VERB    जाँचि dzãts-i      +VERB+PASS    'to examine'
झार् dzʰar       +VERB    झारि dzʰar-i       +VERB+PASS    'to drop'
टार् t̺ar         +VERB    टारि t̺ar-i         +VERB+PASS    'to '
सार् sar         +VERB    सारि sar-i         +VERB+PASS    'to shift'

The verb stems listed in Table 4.33 and Table 4.34 are compiled into a finite state transducer as demonstrated in Figure 4.6 and it is capable of analyzing and generating them.

Figure 4.6: A finite state transducer for Type2a verb stems

The phonological rules involved in this process are listed in PR 4.5, and they are compiled and composed with the finite state transducer demonstrated in Figure 4.6.

Phonological rule PR 4.5
i. The halanta ◌् at the end of consonant-ending verb stems is deleted before the causative marker आ a and the passive marker इ i at the surface level.
Regular expression: ◌् -> [ ] || __ इ|आ ;
ii. The independent vowels आ a and इ i are changed to their corresponding dependent vowels ◌ा and ◌ि, respectively.
Regular expressions:
आ -> ◌ा || __ .#. ;
इ -> ◌ि || __ .#. ;

b. Verb stem Type2b

The i-ending polysyllabic basic verb stems which have four forms (base stems, passive stems, causative stems and causative-passive stems) are grouped in this class. Some examples are listed in Table 4.35 with their corresponding morphological tags.

Table 4.35: Type2b verb stems

+VERB               +VERB+PASS             +VERB+CAUSE           +VERB+CAUSE+PASS        Gloss of base
पक्रि pʌkri         पक्रिइ pʌkri-i         पक्रा pʌkr-a          पक्राइ pʌkr-a-i         'arrest'
पर्खि pʌrkʰi        पर्खिइ pʌrkʰi-i        पर्खा pʌrkʰ-a         पर्खाइ pʌrkʰ-a-i        'wait'
बिर्सि birsi        बिर्सिइ birsi-i        बिर्सा birs-a         बिर्साइ birs-a-i        'forget'
मन्सि mʌnsi         मन्सिइ mʌnsi-i         मन्सा mʌns-a          मन्साइ mʌns-a-i         'throw away'
सम्झि sʌmdzʰi       सम्झिइ sʌmdzʰi-i       सम्झा sʌmdzʰ-a        सम्झाइ sʌmdzʰ-a-i       'remember'
कुल्चि kultsi       कुल्चिइ kultsi-i       कुल्चा kults-a        कुल्चाइ kults-a-i       'tread'
उइँटि uĩt̺i          उइँटिइ uĩt̺i-i          उइँटा uĩt̺-a           उइँटाइ uĩt̺-a-i          'spindle'

The verb stems listed in Table 4.35 are compiled into a finite state transducer as demonstrated in Figure 4.7 and it is capable of analyzing and generating them.

Figure 4.7: A finite state transducer for Type2b verb stems

The finite state transducer in Figure 4.7 is composed with the network of phonological rules listed in PR 4.6.

Phonological rules PR 4.6
i. The vowel ◌ि i of the i-ending transitive verbs is deleted before the causative marker आ a at the surface level.
Regular expression: ◌ि -> [ ] || __ आ ;
ii. The independent vowel आ a is changed to its corresponding dependent vowel ◌ा.
Regular expression: आ -> ◌ा || __ .#. ;
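For comparison with the intransitive i-ending stems, where the final vowel surfaces as य j, the deletion described in PR 4.6 can be sketched in Python as follows. The function name `apply_pr_4_6` is hypothetical, and the input is assumed to be the stem followed directly by the independent vowel आ.

```python
import re

I_MATRA = "\u093f"   # dependent vowel sign i
AA_MATRA = "\u093e"  # dependent vowel sign aa
AA = "\u0906"        # independent vowel aa (causative marker)


def apply_pr_4_6(lexical):
    """Apply PR 4.6 (i) and then (ii) to a stem + causative marker string."""
    # Rule i: delete the stem-final i-matra before the causative marker.
    step1 = re.sub(I_MATRA + "(?=" + AA + ")", "", lexical)
    # Rule ii: word-final independent aa becomes the dependent aa-matra.
    return re.sub(AA + "$", AA_MATRA, step1)


# pʌkri + a -> pʌkr-a (first row of Table 4.35)
print(apply_pr_4_6("\u092a\u0915\u094d\u0930\u093f" + AA))
```

The only difference from the intransitive pattern is the output of rule (i): deletion here, as against replacement by य j in the Type1b class.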

c. Verb stem Type2c
This class comprises monosyllabic verb stems with CaC structure that have four forms: base stems, passive stems, causative stems and causative-passive stems. The vowel आ a of the basic verb stem changes to अ ʌ when the causative marker आ a follows the base stem. Some examples are given in Table 4.36 with their corresponding morphological tags.

Table 4.36: Type2c verb stems

+VERB            +VERB+PASS         +VERB+CAUSE        +VERB+CAUSE+PASS      Gloss of base
खाप् kʰap        खापि kʰap-i        खपा kʰʌp-a         खपाइ kʰʌp-a-i         'pile up'
गाड् gad̺         गाडि gad̺-i         गडा gʌd̺-a          गडाइ gʌd̺-a-i          'bury'
छाप् tsʰap       छापि tsʰap-i       छपा tsʰʌp-a        छपाइ tsʰʌp-a-i        'print'
टाँस् t̺ãs        टाँसि t̺ãs-i        टँसा t̺ʌ̃s-a         टँसाइ t̺ʌ̃s-a-i         'stick up'
तान् tan         तानि tan-i         तना tʌn-a          तनाइ tʌn-a-i          'pull'
नाच् nats        नाचि nats-i        नचा nʌts-a         नचाइ nʌts-a-i         'dance'
बाँच् bãts       बाँचि bãts-i       बँचा bʌ̃ts-a        बँचाइ bʌ̃ts-a-i        'survive'
हाँक् ɦãk        हाँकि ɦãk-i        हँका ɦʌ̃k-a         हँकाइ ɦʌ̃k-a-i         'drive'

The verb stems listed in Table 4.36 are compiled into a finite state transducer as illustrated in Figure 4.8 and it is capable of analyzing and generating them.

Figure 4.8: A finite state transducer for Type2c verb stems


The phonological rules involved in this process are listed in PR 4.7 which are compiled and composed with the transducer illustrated in Figure 4.8.

Phonological rule PR 4.7
i. The vowel ◌ा a of verb stems with CaC structure is changed to the vowel अ ʌ at the surface level if the stem is followed by the causative marker आ a.
Regular expression: आ -> [ ] || cons __ cons ;
ii. The independent vowel आ a is changed to its corresponding dependent vowel ◌ा a.
Regular expression: आ -> ◌ा || __ .#. ;

d. Verb stem Type2d

Monosyllabic consonant-ending transitive basic verb stems which have four forms (base stems, passive stems, causative stems and causative-passive stems) are grouped in this class. Some examples are given in Table 4.37 with their corresponding morphological tags.

Table 4.37: Type2d verb stems

+VERB           +VERB+PASS         +VERB+CAUSE        +VERB+CAUSE+PASS      Gloss of base
पढ् pʌd̺ʰ        पढि pʌd̺ʰ-i         पढा pʌd̺ʰ-a         पढाइ pʌd̺ʰ-a-i         read
किन् kin        किनि kin-i         किना kin-a         किनाइ kin-a-i         buy
जोत् dzot       जोति dzot-i        जोता dzot-a        जोताइ dzot-a-i        plough
घस् gʰʌs        घसि gʰʌs-i         घसा gʰʌs-a         घसाइ gʰʌs-a-i         massage

The finite state transducer illustrated in Figure 4.9 encodes the verb stems listed in Table 4.37 and it is capable of analyzing and generating them.


Figure 4.9: A finite state transducer for Type2d verb stems

The phonological rules listed in PR 4.8 are compiled and composed with the transducer demonstrated in Figure 4.9.

Phonological rule PR 4.8
i. The halanta ◌् at the end of a consonant-ending verb stem is deleted before the causative marker आ a and the passive marker इ i at the surface level.
Regular expression: ◌् -> [ ] || __ इ|आ ;
ii. The independent vowels आ a and इ i are changed to their corresponding dependent vowels ◌ा a and ◌ि i, respectively.
Regular expressions:
आ -> ◌ा || __ .#. ;
इ -> ◌ि || __ .#. ;

4.4.3 Irregular verb (intransitive and transitive) stems
Some intransitive and transitive verb stems are not regular in their stem formation. Some verbs of this type are listed in Table 4.38.

Table 4.38: Irregular verb stems

+VERB             +VERB+PASS       +VERB+CAUSE          +VERB+CAUSE+PASS        Gloss of base
आ a               आइ a-i           -                    -                       'come'
जा dza            जाइ dza-i        -                    -                       'go'
रो ro             रोइ ro-i         रुवा ru-wa           रुवाइ ru-wa-i           'cry'
हो ho             होइ ho-i         -                    -                       'be'
खा kʰa            खाइ kʰa-i        ख्वा kʰw-a           ख्वाइ kʰw-a-i           'eat'
पा pa             पाइ pa-i         -                    -                       'get'
दि di             दिइ di-i         दिला di-la           दिलाइ di-la-i           'give'
लि li             लिइ li-i         ल्या l-ja            ल्याइ l-ja-i            'take'
धो dʰo            धोइ dʰo-i        धुला dʰu-la          धुलाइ dʰu-la-i          'wash'
बस् bʌs-          -                बसाल् bʌs-al-        बसालि bʌs-al-i          'sit'
खस् kʰʌs-         -                खसाल् kʰʌsal-        खसालि kʰʌs-al-i         'drop'
चुँडि tsũd̺i-      -                चुँडाल् tsũd̺al-      चुँडालि tsũd̺-al-i       'snatch'
छिन् tsʰin-       -                छिनाल् tsʰinal-      छिनालि tsʰin-al-i       'chop off'

The finite state transducer for irregular verb stems cannot be generalized as in the earlier cases. Therefore, their network is not illustrated here; instead, these stems are encoded and implemented directly.

4.4.4 Suppletive verb stems
There are two pairs of suppletive verb stems: हु- ɦu- 'become' vs. भ- bʰʌ- 'became', and जा- dza- 'go' vs. ग- gʌ- 'went'. The first and second members of each suppletive pair follow different tracks in the inflectional paradigm, as illustrated in Table 4.39.

Table 4.39: Suppletive verb stems हु ɦu- 'become'

भ bʰʌ- 'became'

Non-past tense

जा dza- 'go'

Non-past tense Past tense Perfect aspect

Imperfect aspect Habitual aspect

Past tense Perfect aspect Imperfect aspect Habitual aspect

Inferential aspect Imperative Optative Potential

ग gʌ- 'went'

Optative

Inferential aspect Imperative Optative Potential

Absolutive Infinitive Purposive Prospective Durative

Optative Absolutive

Infinitive Purposive Prospective Durative Conditional Perfective Conjunctive

Conditional Perfective Conjunctive

The phonological rules involved in altering the suppletive forms are listed in PR 4.9.

Phonological rule: PR 4.9
i. The verb stem हु ɦu changes to भ bʰʌ if the following suffix begins with ए e, इ i, ई iː or य jʌ.
Regular expression: हु -> भ || __ ए|इ|ई|य ;
ii. The verb stem जा dza changes to ग gʌ if the following suffix begins with ए e, इ i, ई iː or य jʌ.
Regular expression: जा -> ग || __ ए|इ|ई|य ;
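The conditioning in PR 4.9 depends only on the first character of the following suffix, which can be sketched in Python as below. The function name `select_stem` is hypothetical, and the suffix strings passed to it are assumed to start with an independent vowel or a consonant, as in the rule contexts.

```python
# Suppletive pairs of PR 4.9: ɦu -> bʰʌ and dza -> gʌ.
SUPPLETIVE = {
    "\u0939\u0941": "\u092d",  # हु -> भ
    "\u091c\u093e": "\u0917",  # जा -> ग
}

# Suffix-initial segments that trigger the alternation: e, i, iː, jʌ.
TRIGGERS = ("\u090f", "\u0907", "\u0908", "\u092f")


def select_stem(stem, suffix):
    """Return the stem alternant required before the given suffix."""
    if stem in SUPPLETIVE and suffix.startswith(TRIGGERS):
        return SUPPLETIVE[stem]
    return stem


# dza before a suffix beginning with e selects gʌ;
# before a suffix beginning with n it stays dza.
print(select_stem("\u091c\u093e", "\u090f"))
print(select_stem("\u091c\u093e", "\u0928\u0941"))
```

Stems outside the two suppletive pairs pass through unchanged, so the check can sit in front of the regular inflectional paradigm.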


4.5 Verbal inflections
In Nepali, verbal inflections are suffixes attached to verb stems. A verb stem can be a base form, a causative form or a passive form. The inflectional suffixes in general encode the inherent verbal features such as tense, aspect and mood. Besides these inherent features, the inflectional suffixes also encode agreement features such as person, number, gender and honorificity with reference to the subject of the sentence. Inherent features and agreement features are not marked by clearly separable symbols; rather, they are jointly represented by a set of symbols. These inflectional suffixes therefore form a paradigm with respect to the above-mentioned features. The suffixal negative marker is intermixed with these inflectional suffixes in some forms, which leads to the formation of both positive and negative paradigms of verbal inflection. The second type of negative marker is a prefix which appears in front of the verb stem (see 4.2.3). The auxiliary verbs in Nepali are more or less equivalent to the inflections in encoding verbal features; therefore, they are also discussed in this section.

4.5.1 Auxiliary verbs in Nepali
Nepali has two kinds of auxiliaries, namely, the non-past existential auxiliary छ tsʰʌ and the non-past identificational auxiliary हो ɦo. The past auxiliary थि- tʰi-, however, is a different stem which inflects like main verbs (Dahal 1974; Adhikari 2062VS). The existential and identificational auxiliary verbs, summarized in (i) and (ii) for the non-past and past tenses, are discussed below with their inflections.1

                        Non-past form    Past form
i. Existential          छ                थि-
ii. Identificational    हो               --

a. The non-past existential auxiliary verb छ tsʰʌ 'be' inflects for person, number, gender and honorific agreement, as in छिन् tsʰin 'be.NPST.3SG.FEM.HON' in (14). In the default case, the auxiliary verb form छ tsʰʌ 'be' itself represents the third person singular masculine and carries non-past tense; in the other cases, agreement inflections follow it. Altogether, this existential verb has twelve forms; the inflections are listed in Table 4.39 with their corresponding morphological tags.

1 See Sharma (1980) for a detailed description of auxiliary verbs in Nepali.

(14) दिदी घरमा छिन्।
didi gʰʌr-ma tsʰin
elder.sister home-LOC be.NPST.3SG.FEM.HON
'The elder sister is at home.'

Table 4.39: Inflections for non-past existential verb छ tsʰʌ 'be' (affirmative)

Grammatical category                    Inflections    IPA    Tags
First person singular                   उ              u      NPST.1SG
First person plural                     औँ             ʌũ     NPST.1PL
Second person masculine singular        स्             s      NPST.2SG.MASC
Second person feminine singular         एस्            es     NPST.2SG.FEM
Second person masculine singular hon    औ              ʌu     NPST.2SG.MASC.HON
Second person feminine singular hon     यौ             jʌu    NPST.2SG.FEM.HON
Second person plural                    औ              ʌu     NPST.2PL
Third person masculine singular         φ              φ      NPST.3SG.MASC
Third person feminine singular          ए              e      NPST.3SG.FEM
Third person masculine singular hon     न्             ʌn     NPST.3SG.MASC.HON
Third person feminine singular hon      इन्            in     NPST.3SG.FEM.HON
Third person plural                     न्             ʌn     NPST.3PL

The finite state transducer in Figure 4.10 encodes the auxiliary verb छ tsʰʌ 'be' and its various forms, and it can analyze and generate them. The rules involved in this case are encoded directly into the finite state transducer.


Figure 4.10: A finite state transducer for inflections of non-past existential verb छ tsʰʌ 'be' (affirmative)

b. In the negative formation of the existential verb छ tsʰʌ, the negative suffix -न -nʌ 'NEG' is inserted between the auxiliary stem छ tsʰʌ 'be' and the agreement inflections. During negativization some morphophonemic changes occur, as in छैनन् tsʰʌinʌn 'be-NEG' in (15). There are eight negative forms, with no distinction in gender. Table 4.40 lists the inflections with their corresponding morphological tags.

(15) दिदी घरमा छैनन्।
didi gʰʌr-ma tsʰʌinʌn
elder.sister home-LOC be.NPST.NEG.3SG.FEM.HON
'The elder sister is not at home.'

Table 4.40: Inflections for non-past existential verb छ tsʰʌ 'be' (negative)

Grammatical category          Inflections    IPA      Tags
First person singular         इनँ            inʌ̃      NPST.NEG.1SG
First person plural           इनौँ           inʌũ     NPST.NEG.1PL
Second person singular        इनस्           inʌs     NPST.NEG.2SG
Second person singular hon    इनौ            inʌu     NPST.NEG.2SG.HON
Second person plural          इनौ            inʌu     NPST.NEG.2PL
Third person singular         इन             inʌ      NPST.NEG.3SG
Third person singular hon     इनन्           inʌn     NPST.NEG.3SG.HON
Third person plural           इनन्           inʌn     NPST.NEG.3PL

The finite state transducer in Figure 4.11 encodes both inflections and rules involved. It can analyze and generate the negative forms of the auxiliary verb छ tsʰʌ 'be'.

Figure 4.11: A finite state transducer for inflections of non-past existential verb छ tsʰʌ 'be' (negative)

c. The non-past identificational auxiliary verb हो ɦo 'be' takes similar agreement inflections, but it has no gender distinction; that is, there are no feminine forms. The verb itself represents the third person singular form. Some morphophonemic changes occur when inflections combine with हो ɦo 'be', as in हौ ɦʌu 'be.NPST.2SG.HON' in (16). There are altogether eight forms of this verb; the inflections are listed in Table 4.41 with their corresponding morphological tags.

(16) तिमी शिक्षक हौ।
timi siktsʰʌk ɦʌu
2SG.HON teacher be.NPST.2SG.HON
'You are a teacher.'

Table 4.41: Inflections for non-past identificational verb हो ɦo 'be' (affirmative)

Grammatical category          Inflections    IPA    Tags
First person singular         उँ             ũ      NPST.1SG
First person plural           औँ             ʌũ     NPST.1PL
Second person singular        स्             s      NPST.2SG
Second person singular hon    औ              ʌu     NPST.2SG.HON
Second person plural          औ              ʌu     NPST.2PL
Third person singular         φ              φ      NPST.3SG
Third person singular hon     न्             n      NPST.3SG.HON
Third person plural           न्             n      NPST.3PL

The auxiliary verb हो ɦo ‘be’ and its various forms are compiled into a finite state transducer as demonstrated in Figure 4.12 and it is capable of analyzing and generating them.

Figure 4.12: A finite state transducer for inflections of non-past identificational verb हो ɦo 'be' (affirmative)

The phonological rules involved in this process are listed in PR 4.10, and they have been encoded directly into the finite state transducer illustrated in Figure 4.12.

Phonological rules PR 4.10
i. The independent vowels उ u and औ ʌu are changed to their corresponding dependent vowels ◌ु u and ◌ौ ʌu, respectively, after हो.
Regular expressions:
उ -> ◌ु || हो __ ;
औ -> ◌ौ || हो __ ;
◌ो -> ◌ु || ह __ न् .#. ;
◌ो -> [ ] || ह __ ◌ु|◌ौ ;

d. In the negative formation of the identificational auxiliary verb हो ɦo, the negative suffix -न -nʌ (इन inʌ) is inserted between the auxiliary stem हो ɦo and the agreement inflections. In this process of negativization, no changes occur in the stem itself, as in होइनौ ɦoinʌu in (17). There are eight negative forms, parallel to the positive ones. They are listed in Table 4.42 with their corresponding morphological tags.

(17) तिमी शिक्षक होइनौ।
timi siktsʰʌk ɦoinʌu
2SG.HON teacher be.NPST.NEG.2SG.HON
'You are not a teacher.'

Table 4.42: Inflections for non-past identificational verb हो ɦo 'be' (negative)

Grammatical category          Inflections    IPA      Tags
First person singular         इनँ            inʌ̃      NPST.NEG.1SG
First person plural           इनौँ           inʌũ     NPST.NEG.1PL
Second person singular        इनस्           inʌs     NPST.NEG.2SG
Second person singular hon    इनौ            inʌu     NPST.NEG.2SG.HON
Second person plural          इनौ            inʌu     NPST.NEG.2PL
Third person singular         इन             inʌ      NPST.NEG.3SG
Third person singular hon     इनन्           inʌn     NPST.NEG.3SG.HON
Third person plural           इनन्           inʌn     NPST.NEG.3PL

The finite state transducer in Figure 4.13 is capable of analyzing and generating the negative forms of auxiliary verb हो ɦo ‘be’.
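Since the negative inflections in Tables 4.40 and 4.42 are identical and attach to the stem without changing it in IPA (tsʰʌ + inʌn gives tsʰʌinʌn, ɦo + inʌu gives ɦoinʌu), generation and analysis of these forms can be sketched in Python with one suffix table. The function names are hypothetical and the code works on IPA strings only; it is an illustration, not the transducer of Figures 4.11 and 4.13.

```python
# Negative inflections shared by Tables 4.40 and 4.42 (IPA).
NEG_SUFFIXES = {
    "NPST.NEG.1SG": "inʌ̃",
    "NPST.NEG.1PL": "inʌũ",
    "NPST.NEG.2SG": "inʌs",
    "NPST.NEG.2SG.HON": "inʌu",
    "NPST.NEG.2PL": "inʌu",
    "NPST.NEG.3SG": "inʌ",
    "NPST.NEG.3SG.HON": "inʌn",
    "NPST.NEG.3PL": "inʌn",
}


def generate(stem, tag):
    """Build the negative auxiliary form for a stem and a tag."""
    return stem + NEG_SUFFIXES[tag]


def analyze(surface, stem):
    """Return every tag whose suffix matches the ending of the form."""
    if not surface.startswith(stem):
        return []
    ending = surface[len(stem):]
    return sorted(tag for tag, sfx in NEG_SUFFIXES.items() if sfx == ending)


print(generate("ɦo", "NPST.NEG.2SG.HON"))
print(analyze("tsʰʌinʌn", "tsʰʌ"))
```

Analysis returns a list because some forms are syncretic: the second person honorific singular and the second person plural share one suffix, as do the third person honorific singular and the third person plural.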


Figure 4.13: A finite state transducer for inflections of non-past identificational verb हो ɦo 'be' (negative)

a. The past existential auxiliary verb थि tʰi 'be.PST' inflects for person, number, gender and honorific agreement, as in थियौँ tʰi-jʌũ in (18). The auxiliary verb form थि tʰi itself carries past tense and the agreement inflections follow it. Here too, the auxiliary verb stem does not change when suffixes are attached. Altogether, this existential verb has ten forms; the inflections are listed in Table 4.43 with their corresponding morphological tags.

(18) हामी तनाबमा थियौँ।
ɦami tʌnab-ma tʰi-jʌũ
1PL tension-LOC be.PST.1PL
'We were in tension.'

Table 4.43: Inflections for past existential verb थि tʰi 'be' (affirmative)

Grammatical category                   Inflections    IPA    Tags
First person singular                  एँ             ẽ      PST.1SG
First person plural                    यौँ            jʌũ    PST.1PL
Second person singular                 इस्            is     PST.2SG
Second person singular hon             यौ             jʌu    PST.2SG.HON
Second person plural                   यौ             jʌu    PST.2PL
Third person masculine singular        यो             jo     PST.3SG.MASC
Third person feminine singular         ई              iː     PST.3SG.FEM
Third person masculine singular hon    ए              e      PST.3SG.MASC.HON
Third person feminine singular hon     इन्            in     PST.3SG.FEM.HON
Third person plural                    ए              e      PST.3PL

The finite state transducer illustrated in Figure 4.14 can analyze and generate the positive forms of the auxiliary verb थि tʰi 'be'.

Figure 4.14: A finite state transducer for inflections of past existential verb थि tʰi 'be' (affirmative)

b. In the negative forms of the existential verb थि tʰi-, the negative suffix -न -nʌ is inserted between the auxiliary stem थि tʰi- and the agreement inflections. No morphophonemic change occurs in the auxiliary stem during negation, as in थिएनौँ tʰi-enʌũ in (19). There are eleven negative forms, one more than the positive ones, and Table 4.44 lists the inflections with their corresponding morphological tags.

(19) हामी तनाबमा थिएनौँ।
ɦami tʌnab-ma tʰi-enʌũ
1PL tension-LOC be.PST-NEG.1PL
'We were not in tension.'

Table 4.44: Inflections for past existential verb थि tʰi ‘be’ (negative)

Grammatical category | Inflections | IPA | Tags
First person singular | इनँ | inʌ̃ | PST.NEG.1SG
First person plural | एनौ | enʌũ | PST.NEG.1PL
Second person singular | इनस् | inʌs | PST.NEG.2SG
Second person masculine singular hon | एनौ | enʌu | PST.NEG.2SG.MASC.HON
Second person feminine singular hon | इनौ | inʌu | PST.NEG.2SG.FEM.HON
Second person plural | एनौ | enʌu | PST.NEG.2PL
Third person masculine singular | एन | enʌ | PST.NEG.3SG.MASC
Third person feminine singular | इन | inʌ | PST.NEG.3SG.FEM
Third person masculine singular hon | एनन् | enʌn | PST.NEG.3SG.MASC.HON
Third person feminine singular hon | इनन् | inʌn | PST.NEG.3SG.FEM.HON
Third person plural | एनन् | enʌn | PST.NEG.3PL

The finite state transducer in Figure 4.15 can analyze and generate the negative forms of the auxiliary verb थि tʰi 'be'.

Figure 4.15: A finite state transducer for inflections of past existential verb थि tʰi ‘be’ (negative)

4.5.2 Tense

Nepali morphologically exhibits two tenses: past and non-past. Past tense refers to an action that is completed prior to the speech event, and non-past tense refers to an action that happens at the time of the speech event or later (Schmidt 1993; Pokharel 2054VS and Adhikari 2055VS).

a. Non-past Tense

Non-past tense in Nepali covers both present and future. There is no dedicated future tense marker; instead, the same non-past markers are used, sometimes in combination with the prospective marker, to refer to future time. In fact, Nepali has no distinct non-past tense marker of its own. The non-past existential auxiliary verb छ tsʰʌ, discussed in (4.5.1), behaves as the non-past tense marker: its auxiliary meaning 'be' seems to be completely absorbed, while its agreement features are retained. This can be seen in (22), where the verb खा kʰa 'eat' combines with the auxiliary verb छ tsʰʌ and its agreement inflection, but the form छन् tsʰʌn only indicates non-past tense and agreement features. The non-past tense inflections for person, number, gender and honorificity comprise ten forms altogether, and they are listed in Table 4.45 with their corresponding morphological tags.2

(22) भाइहरू भात खान्छन्।
bʰai-ɦʌruː bʰat kʰa-n-tsʰʌn
brother-PL rice eat-φ-NPST.3PL
'The brothers eat rice.'

2 Most traditional grammarians treat छ tsʰʌ as a separate auxiliary verb. Consequently, most verb stems, except the past forms, are considered compound stems.

Table 4.45: Inflections for non-past tense (affirmative)

Grammatical category | Inflections | IPA | Tags
First person singular | छु | tsʰu | NPST.1SG
First person plural | छौँ | tsʰʌũ | NPST.1PL
Second person masculine singular | छस् | tsʰʌs | NPST.2SG.MASC
Second person feminine singular | छेस् | tsʰes | NPST.2SG.FEM
Second person masculine singular hon | छौ | tsʰʌu | NPST.2SG.MASC.HON
Second person feminine singular hon | छ्यौ | tsʰjʌu | NPST.2SG.FEM.HON
Second person plural | छौ | tsʰʌu | NPST.2PL
Third person masculine singular | छ | tsʰʌ | NPST.3SG.MASC
Third person feminine singular | छे | tsʰe | NPST.3SG.FEM
Third person masculine singular hon | छन् | tsʰʌn | NPST.3SG.MASC.HON
Third person feminine singular hon | छिन् | tsʰin | NPST.3SG.FEM.HON
Third person plural | छन् | tsʰʌn | NPST.3PL

The finite state transducer illustrated in Figure 4.16 can analyze and generate the non-past positive forms of the inflections. This transducer is concatenated with the finite state transducer of verb stems.
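The relation between the twelve grammatical categories of Table 4.45 and the ten distinct forms mentioned above can be checked with a small Python sketch. It simply groups the IPA forms of the table by their tags; the names `NPST`, `group_by_form` and `FORMS` are hypothetical and are not part of the transducer described in this chapter.

```python
from collections import defaultdict

# (IPA form, tag) pairs copied from Table 4.45
NPST = [
    ("tsʰu", "NPST.1SG"),
    ("tsʰʌũ", "NPST.1PL"),
    ("tsʰʌs", "NPST.2SG.MASC"),
    ("tsʰes", "NPST.2SG.FEM"),
    ("tsʰʌu", "NPST.2SG.MASC.HON"),
    ("tsʰjʌu", "NPST.2SG.FEM.HON"),
    ("tsʰʌu", "NPST.2PL"),
    ("tsʰʌ", "NPST.3SG.MASC"),
    ("tsʰe", "NPST.3SG.FEM"),
    ("tsʰʌn", "NPST.3SG.MASC.HON"),
    ("tsʰin", "NPST.3SG.FEM.HON"),
    ("tsʰʌn", "NPST.3PL"),
]


def group_by_form(pairs):
    """Collect, for every surface form, the list of tags it can realize."""
    groups = defaultdict(list)
    for form, tag in pairs:
        groups[form].append(tag)
    return dict(groups)


FORMS = group_by_form(NPST)

if __name__ == "__main__":
    # One line per distinct form, followed by the tags sharing that form
    for form, tags in FORMS.items():
        print(form, tags)
```

Under this grouping, the forms tsʰʌu and tsʰʌn each carry two tags, which is why an analyzer has to return more than one analysis for them.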

Figure 4.16: A finite state transducer for inflections of non-past tense

Non-past negative forms are formed by suffixing the negative marker -न -nʌ to verb stems. The non-past tense marker छ tsʰʌ 'be' is completely absorbed, and a semantically null element दि di or दै dʌi is inserted before the negative marker न nʌ; the agreement markers follow it. Even though the non-past tense marker is not overtly present, the tense is indicated by the sequence. There are altogether twelve non-past negative forms, which are listed in Table 4.46 with their corresponding morphological tags.

Table 4.46: Inflections for non-past tense negative 1

Grammatical category | Inflections | IPA | Tags
First person singular | दिनँ | dinʌ̃ | NPST.NEG.1SG
First person plural | दैनौँ | dʌinʌũ | NPST.NEG.1PL
Second person masculine singular | दैनस् | dʌinʌs | NPST.NEG.2SG.MASC
Second person feminine singular | दिनस् | dinʌs | NPST.NEG.2SG.FEM
Second person masculine singular hon | दैनौ | dʌinʌu | NPST.NEG.2SG.MASC.HON
Second person feminine singular hon | दिनौ | dinʌu | NPST.NEG.2SG.FEM.HON
Second person plural | दैनौ | dʌinʌu | NPST.NEG.2PL
Third person masculine singular | दैन | dʌinʌ | NPST.NEG.3SG.MASC
Third person feminine singular | दिन | dinʌ | NPST.NEG.3SG.FEM
Third person masculine singular hon | दैनन् | dʌinʌn | NPST.NEG.3SG.MASC.HON
Third person feminine singular hon | दिनन् | dinʌn | NPST.NEG.3SG.FEM.HON
Third person plural | दैनन् | dʌinʌn | NPST.NEG.3PL

The finite state transducer in Figure 4.17 can analyze and generate the non-past negative forms of the inflections set 1. This transducer is concatenated with transducer of verb stems.


Figure 4.17: A finite state transducer for inflections of non-past tense negative 1

There is another set of the negative marker -न -nʌ and agreement inflections, which appears only with vowel-ending verb stems. As discussed above, the non-past tense marker छ tsʰʌ 'be' is completely absorbed; however, the tense feature is indicated by the sequence. In this case, the inflections indicate person, number and honorificity but not gender. Therefore, there are only eight non-past negative forms in this set, as listed in Table 4.47 with their corresponding morphological tags.

Table 4.47: Inflections for non-past tense negative 2

Grammatical category | Inflections | IPA | Tags
First person singular | नँ | nʌ̃ | NPST.NEG.1SG
First person plural | नौँ | nʌũ | NPST.NEG.1PL
Second person singular | नस् | nʌs | NPST.NEG.2SG
Second person singular hon | नौ | nʌu | NPST.NEG.2SG.HON
Second person plural | नौ | nʌu | NPST.NEG.2PL
Third person singular | न | nʌ | NPST.NEG.3SG
Third person singular hon | नन् | nʌn | NPST.NEG.3SG.HON
Third person plural | नन् | nʌn | NPST.NEG.3PL

The finite state transducer in Figure 4.17a can analyze and generate the non-past negative forms of the inflections set 2. This transducer is concatenated with transducer of verb stems.


Figure 4.17a: A finite state transducer for inflections of non-past tense negative 2

b. Past Tense

The past tense in Nepali is indicated by the same inflectional paradigm as that of the past tense auxiliary verb stem थि- tʰi- 'be'. In example (23), the verb stem रो ro 'cry' is followed by the past tense and agreement inflection -यो -jo 'PST.3SG.MASC'. Although the past tense marker is not clearly separable, ए e, य् j and इ i in the paradigm indicate the past tense, and the inflections indicating the agreement features follow them. For convenience, the paradigm listed in Table 4.43 is reproduced in Table 4.48 with the corresponding morphological tags.

(23) बच्चो सारै रोयो।
bʌtstso sarʌi ro-jo
child more cry-PST.3SG.MASC
'The child cried a lot.'

Table 4.48: Inflections for past tense (affirmative)

Grammatical category | Inflections | IPA | Tags
First person singular | एँ | ẽ | PST.1SG
First person plural | यौँ | jʌũ | PST.1PL
Second person singular | इस् | is | PST.2SG
Second person singular hon | यौ | jʌu | PST.2SG.HON
Second person plural | यौ | jʌu | PST.2PL
Third person masculine singular | यो | jo | PST.3SG.MASC
Third person feminine singular | ई | i: | PST.3SG.FEM
Third person masculine singular hon | ए | e | PST.3SG.MASC.HON
Third person feminine singular hon | इन् | in | PST.3SG.FEM.HON
Third person plural | ए | e | PST.3PL

The finite state transducer in Figure 4.18 can analyze and generate the positive inflections of past tense. This transducer is also concatenated with transducer of verb stems.
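The effect of concatenating a stem transducer with an inflection transducer can be approximated in Python as follows. This is only a sketch over IPA transcriptions, not the finite state network of Figure 4.18: the two-entry stem lexicon `STEMS` and the function names `analyze` and `generate` are assumptions made for illustration, and the suffix list is copied from Table 4.48.

```python
# Hypothetical stem lexicon (IPA stem -> gloss); both stems occur in the
# examples of this chapter.
STEMS = {"ro": "cry", "gʌr": "do"}

# (inflection, tag) pairs copied from Table 4.48
PAST = [
    ("ẽ", "PST.1SG"),
    ("jʌũ", "PST.1PL"),
    ("is", "PST.2SG"),
    ("jʌu", "PST.2SG.HON"),
    ("jʌu", "PST.2PL"),
    ("jo", "PST.3SG.MASC"),
    ("iː", "PST.3SG.FEM"),
    ("e", "PST.3SG.MASC.HON"),
    ("in", "PST.3SG.FEM.HON"),
    ("e", "PST.3PL"),
]


def analyze(word):
    """Surface form -> every 'gloss-TAG' reading licensed by stem + suffix."""
    analyses = []
    for stem, gloss in STEMS.items():
        if word.startswith(stem):
            rest = word[len(stem):]
            analyses += [f"{gloss}-{tag}" for suf, tag in PAST if suf == rest]
    return analyses


def generate(gloss, tag):
    """Gloss and tag -> surface form (first matching stem and suffix)."""
    stem = next(s for s, g in STEMS.items() if g == gloss)
    suffix = next(suf for suf, t in PAST if t == tag)
    return stem + suffix


if __name__ == "__main__":
    print(analyze("rojo"))
    print(analyze("gʌre"))
    print(generate("cry", "PST.3SG.MASC"))
```

Because the sketch works on IPA strings, it needs no orthographic adjustment at the stem boundary; in Devanagari the same concatenation additionally requires the vowel rules of PR 4.11.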

Figure 4.18: A finite state transducer for inflections of past tense (affirmative)

The phonological rules listed in PR 4.11 are directly encoded into the finite state transducer demonstrated in Figure 4.18.

Phonological rule PR 4.11 i. Independent vowels ए e, इ i and ई i: change to their corresponding dependent vowels ◌े e, ि◌ i and ◌ी i:, respectively if the verb stems end with consonants. Regular expressions:

ए -> ◌े || cons __; इ -> ि◌ || cons __; ई -> ◌ी || cons __;
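The rule in PR 4.11 can be illustrated with a short Python sketch that applies it at the stem-suffix boundary. It is not the compiled replace rule itself. It assumes, for illustration only, that a consonant-final stem may be stored with a final virama (as in गर्) and that the virama is dropped when a dependent vowel sign is attached; the names `attach`, `is_consonant` and `DEPENDENT` are hypothetical.

```python
VIRAMA = "\u094D"  # halant marking a bare consonant, as in गर्

# Independent vowel -> dependent vowel sign (the three pairs of PR 4.11)
DEPENDENT = {
    "\u090F": "\u0947",  # ए -> dependent e
    "\u0907": "\u093F",  # इ -> dependent i
    "\u0908": "\u0940",  # ई -> dependent i:
}


def is_consonant(ch):
    """True for Devanagari consonant letters क (U+0915) to ह (U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def attach(stem, suffix):
    """Join stem and suffix, rewriting a suffix-initial independent vowel
    as its dependent sign when the stem ends in a consonant."""
    base = stem[:-1] if stem.endswith(VIRAMA) else stem
    if base and is_consonant(base[-1]) and suffix[:1] in DEPENDENT:
        return base + DEPENDENT[suffix[0]] + suffix[1:]
    # Vowel-final stems (and other suffixes) are concatenated unchanged
    return stem + suffix


if __name__ == "__main__":
    print(attach("गर्", "एको"))   # consonant-final stem
    print(attach("रो", "एनन्"))   # vowel-final stem
```

With a vowel-final stem such as रो the independent vowel is kept, which matches the form रोएनन् in example (24).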

In the past tense negative forms, the negative marker न nʌ '-NEG' is inserted between the past markers ए e, य् j or इ i and the agreement inflections. In example (24), the negative marker न nʌ '-NEG' stands between the past tense marker ए e and the agreement inflection न् n. Altogether, there are eleven past tense negative forms, as listed in Table 4.49 with their corresponding morphological tags.

(24) बच्चाहरू रोएनन्।
bʌtstsa-ɦʌruː ro-enʌn
child-PL cry-PST.NEG.3PL
'The children did not cry.'

Table 4.49: Inflections for past tense (negative)

Grammatical category | Inflections | IPA | Tags
First person singular | इनँ | inʌ̃ | PST.NEG.1SG
First person plural | एनौ | enʌũ | PST.NEG.1PL
Second person singular | इनस् | inʌs | PST.NEG.2SG
Second person singular hon | एनौ | enʌu | PST.NEG.2SG.HON
Second person feminine singular hon | इनौ | inʌu | PST.NEG.2SG.FEM.HON
Second person plural | एनौ | enʌu | PST.NEG.2PL
Third person masculine singular | एन | enʌ | PST.NEG.3SG.MASC
Third person feminine singular | इन | inʌ | PST.NEG.3SG.FEM
Third person masculine singular hon | एनन् | enʌn | PST.NEG.3SG.MASC.HON
Third person feminine singular hon | इनन् | inʌn | PST.NEG.3SG.FEM.HON
Third person plural | एनन् | enʌn | PST.NEG.3PL

The finite state transducer in Figure 4.19 can analyze and generate the negative inflections of past tense. This transducer is also concatenated with a transducer of verb stems.

Figure 4.19: A finite state transducer for inflections of past tense (negative)

The phonological rules involved, listed in PR 4.12, have been directly encoded in the finite state transducer illustrated in Figure 4.19.

Phonological rules PR 4.12 i. Independent vowels ए e, इ i, and ई i: change to their corresponding dependent vowels ◌े e, ि◌ i, and ◌ी i: if the verb stems end with consonants. Regular expressions:

ए -> ◌े || cons __; इ -> ि◌ || cons __; ई -> ◌ी || cons __;

4.5.3 Aspects

The internal temporal orientation of an event is called aspect, and this phenomenon is expressed morphologically in Nepali through inflections. Traditionally, four aspects are recognized, namely perfect, imperfect, habitual and inferential (unknown) (Pokharel 2054VS and Adhikari 2055VS); they are illustrated and discussed in the subsequent sections.

a. Perfect Aspect

In Nepali, the perfect aspect is indicated by the suffix -एको -eko 'PERF.SG.MASC'. The aspect marker inflects for number and gender. The verb form गरेको gʌr-eko 'do-PERF.SG.MASC' in (25a) is singular masculine, the verb form गरेका gʌr-eka 'do-PERF.PL' in (25b) is plural, and the verb form गरेकी gʌr-eki 'do-PERF.SG.FEM' in (25c) is singular feminine. The inflections of the perfect aspect are listed in Table 4.50.

(25) a. मैले यो काम गरेको छु।
mʌi-le jo kam gʌr-eko tsʰu
1SG.OBL-ERG this work do-PERF.SG.MASC be.NPST.1SG
'I have done this work.'

b. हामीले यो काम गरेका छौँ।
ɦami-le jo kam gʌr-eka tsʰʌũ
1PL-ERG this work do-PERF.PL be.NPST.1PL
'We have done this work.'

c. सीताले यो काम गरेकी छे।
siːta-le jo kam gʌr-eki tsʰe
Sita.FEM-ERG this work do-PERF.SG.FEM be.NPST.3SG.FEM
'Sita has done this work.'

Table 4.50: Inflections for perfect aspect

Grammatical category | Inflections | IPA | Tags
Perfect singular masculine | एको | eko | PERF.SG.MASC
Perfect plural | एका | eka | PERF.PL
Perfect singular feminine | एकी | ekiː | PERF.SG.FEM
Perfect singular feminine emphatic | एकै | ekʌi | PERF.SG.FEM.EMPH

The finite state transducer in Figure 4.20 can analyze and generate the inflections of perfect aspect. This transducer is also concatenated with the finite state transducer of verb stems.

Figure 4.20: A finite state transducer for inflections of perfect aspect

The phonological rule listed in PR 4.13 has been compiled and composed with the transducer of Figure 4.20.

Phonological rules PR 4.13 i. Independent vowel ए e changes to its corresponding dependent vowel ◌े e if the verb stem ends with consonant. Regular expression:

ए -> ◌े || cons __;

The perfect aspect negative forms are formed by prefixing the negative marker न nʌ to the perfect aspect form of the verb stem, as in नगरेका nʌ-gʌr-eka 'NEG-do-PERF.HON' in (26).

(26) तिमीले यो काम नगरेका भए पैसा पाउँदैनौ।
timiː-le jo kam nʌ-gʌr-eka bʰʌ-e pʌisa pa-ũ-dʌinʌu
2SG.HON-ERG this work NEG-do-PERF.HON be-COND money get-NPST.NEG.2.HON
'If you have not done this work, (you) won't get money.'

b. Imperfect Aspect

The imperfect aspect in Nepali is indicated by the marker -दै -dʌi '-IMPERF'; in (27), the verb form गर्दै gʌr-dʌi 'do-IMPERF' is in the imperfect aspect. The marker -दै -dʌi is neutral with respect to number and gender. However, there are three other forms which distinguish number and gender. All four imperfect aspect markers are listed in Table 4.51.3

(27) केटो यो काम गर्दै छ।
ket̺o jo kam gʌr-dʌi tsʰʌ
boy.SG.MASC this work do-IMPERF be.NPST.3SG.MASC
'The boy is doing this work.'

Table 4.51: Inflections for imperfect aspect

Grammatical category | Inflections | IPA | Tags
Imperfect singular masculine | दो | do | IMPERF.SG.MASC
Imperfect singular feminine | दी | diː | IMPERF.SG.FEM
Imperfect plural | दा | da | IMPERF.PL
Imperfect | दै | dʌi | IMPERF

The finite state transducer in Figure 4.21 can analyze and generate the inflections of imperfect aspect.

Figure 4.21: A finite state transducer for inflections of imperfect aspect

The finite state transducer demonstrated in Figure 4.21 is composed with the finite state transducer of the phonological rules listed in PR 4.14.

3 The suffixes -दो -do, -दा -da and -दी -diː may have some other syntactic and semantic status; however, for computational purposes they are treated as imperfect suffixes.

Phonological rules PR 4.14 i. Nasal ◌ँ is inserted if the verb stems end with vowels and the following affix begins with द d. Regular expression:

[. .] -> ◌ँ || vowel __ द;
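The insertion rule of PR 4.14 can be sketched in Python as a check at the stem-suffix boundary. This is an illustration under stated assumptions rather than the compiled rule: a stem counts as vowel-final when its last character is an independent vowel or a dependent vowel sign, stems ending in a consonant with an inherent vowel are not handled, and the names `ends_in_vowel` and `attach_imperfect` are hypothetical.

```python
CHANDRABINDU = "\u0901"  # nasal sign inserted by the rule
DA = "\u0926"            # the consonant द


def ends_in_vowel(stem):
    """True if the last character is an independent vowel (U+0905..U+0914)
    or a dependent vowel sign (U+093E..U+094C)."""
    last = stem[-1:]
    return ("\u0905" <= last <= "\u0914") or ("\u093E" <= last <= "\u094C")


def attach_imperfect(stem, suffix):
    """Insert the nasal sign between a vowel-final stem and a द-initial suffix."""
    if ends_in_vowel(stem) and suffix.startswith(DA):
        return stem + CHANDRABINDU + suffix
    return stem + suffix


if __name__ == "__main__":
    print(attach_imperfect("खा", "दै"))    # vowel-final stem: nasal inserted
    print(attach_imperfect("गर्", "दै"))   # consonant-final stem: unchanged
```

Applying the check only at the boundary, instead of across the whole word, keeps a stem-internal vowel followed by द from being nasalized.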

The imperfect aspect negative forms are formed by prefixing the negative marker न nʌ 'NEG-' to the imperfect aspect form of the verb stem, as in नखाँदै nʌ-kʰã-dʌi 'NEG-eat-IMPERF' in (28).

(28) चिया नखाँदै उहाँले फोन उठाउनुभयो।
tsija nʌ-kʰã-dʌi uɦã-le pʰon ut̺ʰ-au-nu bʰ-jo
tea NEG-eat-IMPERF 3SG.HON-ERG phone lift-CAUSE-INF be.PST-3SG
'He lifted the phone while not drinking tea.'

c. Habitual Aspect

There are two habitual aspects in Nepali: present habitual and past habitual. The present habitual aspect is encoded by the non-past tense marker छ tsʰʌ and its inflections, as in पिउँछु pi-ũ-tsʰu 'drink-φ-NPST.1SG' in (29a), whereas the past habitual aspect is indicated by the marker थ् tʰ plus agreement inflections for person, number, gender and honorificity. In example (29b), the verb form पिउँथेँ pi-ũ-tʰẽ 'drink-φ-HAB.PST.1SG' is in the past habitual form, in which थ् tʰ is the past habitual marker, followed by the agreement inflection. There are altogether ten past habitual forms of inflections. They are listed in Table 4.52.

(29) a. म धेरै रक्सी पिउँछु।
mʌ dʰerʌi rʌksi pi-ũ-tsʰu
1SG more alcohol drink-φ-NPST.1SG
'I drink a lot of alcohol.'

b. म धेरै रक्सी पिउँथेँ।
mʌ dʰerʌi rʌksi pi-ũ-tʰẽ
1SG more alcohol drink-φ-HAB.PST.1SG
'I used to drink a lot of alcohol.'

Table 4.52: Inflections for past habitual aspect (Affirmative)

Grammatical category | Inflections | IPA | Tags
First person singular | थेँ | tʰẽ | PST.HAB.1SG
First person plural | थ्यौँ | tʰjʌũ | PST.HAB.1PL
Second person singular | थिस् | tʰis | PST.HAB.2SG
Second person singular hon | थ्यौ | tʰjʌu | PST.HAB.2SG.HON
Second person plural | थ्यौ | tʰjʌu | PST.HAB.2PL
Third person masculine singular | थ्यो | tʰjo | PST.HAB.3SG.MASC
Third person feminine singular | थी | tʰi | PST.HAB.3SG.FEM
Third person masculine singular hon | थे | tʰe | PST.HAB.3SG.MASC.HON
Third person feminine singular hon | थिन् | tʰin | PST.HAB.3SG.FEM.HON
Third person plural | थे | tʰe | PST.HAB.3PL

The finite state transducer in Figure 4.22 can analyze and generate the inflections of past habitual aspect. This finite state transducer is concatenated with the finite state transducer of the verb stems.

Figure 4.22: A finite state transducer for inflections of habitual aspect (affirmative)

The past habitual negative forms are formed by inserting the negative marker न nʌ between the verb stem and the past habitual marker थ् tʰ plus agreement inflections. In this case, the semantically null elements दै dʌi and दि di are inserted between the verb stem and the negative marker न nʌ '-NEG', as in पिउँदैनथेँ pi-ũ-dʌinʌ-tʰẽ 'drink-φ-HAB.NEG.PST.1SG' in (30). There are altogether eleven past habitual negative forms, and they are listed in Table 4.53.

(30) म रक्सी पिउँदैनथेँ।
mʌ rʌksi pi-ũ-dʌinʌ-tʰẽ
1SG alcohol drink-φ-HAB.NEG.PST.1SG
'I did not use to drink alcohol.'

Table 4.53: Inflections for habitual aspect (negative)

Grammatical category | Inflections | IPA | Tags
First person singular | दैनथेँ | dʌinʌtʰẽ | PST.NEG.HAB.1SG
First person plural | दैनथ्यौँ | dʌinʌtʰjʌũ | PST.NEG.HAB.1PL
Second person singular | दैनथिस् | dʌinʌtʰis | PST.NEG.HAB.2SG
Second person singular hon | दैनथ्यौ | dʌinʌtʰjʌu | PST.NEG.HAB.2SG.HON
Second person plural | दैनथ्यौ | dʌinʌtʰjʌu | PST.NEG.HAB.2PL
Second person feminine singular | दिनथिस् | dinʌtʰis | PST.NEG.HAB.2SG.FEM
Second person feminine singular hon | दिनथ्यौ | dinʌtʰjʌu | PST.NEG.HAB.2SG.FEM.HON
Third person masculine singular | दैनथ्यो | dʌinʌtʰjo | PST.NEG.HAB.3SG.MASC
Third person feminine singular | दिनथिस् | dinʌtʰis | PST.NEG.HAB.3SG.FEM
Third person masculine singular hon | दिनथे | dinʌtʰe | PST.NEG.HAB.3SG.MASC.HON
Third person plural | दैनथे | dʌinʌtʰe | PST.NEG.HAB.3PL

The finite state transducer in Figure 4.23 can analyze and generate the negative inflections of habitual aspect and it is also concatenated with the finite state transducer of the verb stems.


Figure 4.23: A finite state transducer for inflections of habitual aspect (negative)

d. Inferential (unknown) aspect

The inferential (unknown) aspect indicates an event that took place in the past but becomes known at present on the basis of some evidence or clues. The inferential form of a verb is formed by inserting the inferential aspect marker ए e or इ i between the verb stem and the non-past tense plus agreement inflections. The verb form सुतेछौ sut-etsʰʌu 'sleep-INFER.2.PL' in (31) is an inferential aspect form, and Table 4.54 lists the twelve inferential aspect inflections.

(31) तिमीहरू त हिजो बजारमा सुतेछौ।
timiː-ɦʌruː tʌ ɦidzo bʌdzar-ma sut-etsʰʌu
2-PL PART yesterday market-LOC sleep-INFER.2.PL
'(I came to know that) you slept in the market yesterday.'

Table 4.54: Inflections for inferential aspect (affirmative)

Grammatical category | Inflections | IPA | Tags
First person singular | एछु | etsʰu | PST.INFER.1SG
First person plural | एछौँ | etsʰʌũ | PST.INFER.1PL
Second person masculine singular | एछस् | etsʰʌs | PST.INFER.2SG.MASC
Second person feminine singular | इछस् | itsʰʌs | PST.INFER.2SG.FEM
Second person masculine singular hon | एछौ | etsʰʌu | PST.INFER.2SG.MASC.HON
Second person feminine singular hon | इछौ | itsʰʌu | PST.INFER.2SG.FEM.HON
Second person plural | एछौ | etsʰʌu | PST.INFER.2PL
Third person masculine singular | एछ | etsʰʌ | PST.INFER.3SG.MASC
Third person feminine singular | इछ | itsʰʌ | PST.INFER.3SG.FEM
Third person masculine singular hon | एछन् | etsʰʌn | PST.INFER.3SG.MASC.HON
Third person feminine singular hon | इछन् | itsʰʌn | PST.INFER.3SG.FEM.HON
Third person plural | एछन् | etsʰʌn | PST.INFER.3PL

The finite state transducer in Figure 4.24 can analyze and generate the positive inflections of inferential aspect and it is concatenated with the finite state transducer of verb stems.

Figure 4.24: A finite state transducer for inflections of inferential aspect (affirmative)

The phonological rules listed in PR 4.15 are compiled and composed with the transducer demonstrated in Figure 4.24.

Phonological rules PR 4.15 i. Independent vowels ए e and इ i change to their corresponding dependent vowels ◌े e and ि◌ i, respectively if the verb stems end with consonants. Regular expression: ए -> ◌े || cons __; इ -> ि◌ || cons __;

The inferential aspect negative forms are formed by inserting the negative marker न nʌ between the inferential aspect marker ए e or इ i and the agreement inflections, as in सुतेनछौ sut-enʌtsʰʌu 'sleep-INFER.NEG.2.PL' in (32). There are twelve inferential aspect negative forms, which are listed in Table 4.55.

(32) तिमीहरू त हिजो बजारमा सुतेनछौ।
timiː-ɦʌruː tʌ ɦidzo bʌdzar-ma sut-enʌtsʰʌu
2-PL PART yesterday market-LOC sleep-INFER.NEG.2PL
'(I came to know that) you did not sleep in the market yesterday.'

Table 4.55: Inflections for inferential aspect (negative)

Grammatical category | Inflections | IPA | Tags
First person singular | एनछु | enʌtsʰu | PST.INFER.NEG.1SG
First person plural | एनछौँ | enʌtsʰʌũ | PST.INFER.NEG.1PL
Second person masculine singular | एनछस् | enʌtsʰʌs | PST.INFER.NEG.2SG.MASC
Second person feminine singular | इनछेस् | inʌtsʰes | PST.INFER.NEG.2SG.FEM
Second person masculine singular hon | एनछौ | enʌtsʰʌu | PST.INFER.NEG.2SG.MASC.HON
Second person feminine singular hon | इनछौ | inʌtsʰʌu | PST.INFER.NEG.2SG.FEM.HON
Second person plural | एनछौ | enʌtsʰʌu | PST.INFER.NEG.2PL
Third person masculine singular | एनछ | enʌtsʰʌ | PST.INFER.NEG.3SG.MASC
Third person feminine singular | इनछ | inʌtsʰʌ | PST.INFER.NEG.3SG.FEM
Third person masculine singular hon | एनछन् | enʌtsʰʌn | PST.INFER.NEG.3SG.MASC.HON
Third person feminine singular hon | इनछन् | inʌtsʰʌn | PST.INFER.NEG.3SG.FEM.HON
Third person plural | एनछन् | enʌtsʰʌn | PST.INFER.NEG.3PL

The finite state transducer in Figure 4.25 can analyze and generate the negative inflections of inferential aspect. This transducer is also concatenated with transducer of verb stems.

Figure 4.25: A finite state transducer for inflections of inferential aspect (negative)

The rules involved in this process are listed in PR 4.16; they are compiled and composed with the finite state transducer illustrated in Figure 4.25.

Phonological rules PR 4.16 i. Independent vowels ए e, इ i and ई i: change to their corresponding dependent vowels ◌े e, ि◌ i and ◌ी i:, respectively if the verb stems end with consonants. Regular expressions:

ए -> ◌े || cons __; इ -> ि◌ || cons __; ई -> ◌ी || cons __;

4.5.4 Moods

Morphologically, Nepali has two types of mood, namely declarative and non-declarative. The former has no distinct marker; it is indicated by default. The latter is further sub-divided into imperative, optative and potential moods (Pokharel 2054VS), each of which is indicated by its own markers.

a. Imperative Mood

The imperative form of a verb has number and honorific distinctions. The base stem of the verb serves as the singular non-honorific imperative form. The other two forms are singular honorific and plural. The imperative markers differ depending on the final segment of the verb stem. Consonant-ending verb stems take -φ for singular non-honorific and -अ -ʌ for singular honorific and plural. Among vowel-ending verb stems, i-ending stems take -ई -iː for singular non-honorific and -अ -ʌ for singular honorific and plural, while other vowel-ending stems take -φ for singular non-honorific, -ऊ -u: for singular honorific and -ओ -o for plural. The imperative inflections are listed in Table 4.56. In example (33), for instance, the imperative verb form जाओ dza-o 'go-IMP.2PL' is the plural form.

(33) तिमीहरू बजारतिर जाओ!
timiː-ɦʌruː bʌdzar-tirʌ dza-o!
2-PL market-DIR go-IMP.2PL
'(You) go towards the market.'

Table 4.56: Inflections for imperative mood

Grammatical category | Inflections | IPA | Tags
Second person singular | -φ/-ई | -φ/-iː | IMP.2SG
Second person singular hon | -अ/-ऊ | -ʌ/-u: | IMP.2SG.HON
Second person plural | -अ/-ओ | -ʌ/-o | IMP.2PL

The finite state transducer in Figure 4.26 can analyze and generate the inflections of imperative mood. This transducer is also concatenated with transducer of verb stems.
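The choice of imperative marker by stem-final segment can be sketched in Python over IPA transcriptions. This is a simplified illustration, not the transducer of Figure 4.26: the stem classes follow the description above together with Table 4.56 and example (33), nasalized stem-final vowels are not handled, and the names `final_vowel`, `SUFFIXES` and `imperative` are hypothetical. The output is morpheme-segmented (as in dza-o) so that no claim is made about further surface adjustments.

```python
VOWELS = set("aeiouʌ")

# Stem class -> {tag: suffix}; the empty string stands for the zero marker
SUFFIXES = {
    "consonant": {"IMP.2SG": "", "IMP.2SG.HON": "ʌ", "IMP.2PL": "ʌ"},
    "i": {"IMP.2SG": "iː", "IMP.2SG.HON": "ʌ", "IMP.2PL": "ʌ"},
    "vowel": {"IMP.2SG": "", "IMP.2SG.HON": "uː", "IMP.2PL": "o"},
}


def final_vowel(stem):
    """Return the stem-final vowel (ignoring a length mark), or None."""
    core = stem.rstrip("ː:")
    return core[-1] if core and core[-1] in VOWELS else None


def imperative(stem, tag):
    """Return the segmented imperative form of an IPA stem for a given tag."""
    vowel = final_vowel(stem)
    if vowel is None:
        stem_class = "consonant"
    elif vowel == "i":
        stem_class = "i"
    else:
        stem_class = "vowel"
    suffix = SUFFIXES[stem_class][tag]
    return f"{stem}-{suffix}" if suffix else stem


if __name__ == "__main__":
    print(imperative("dza", "IMP.2PL"))
    print(imperative("gʌr", "IMP.2SG.HON"))
    print(imperative("gʌr", "IMP.2SG"))
```

Keeping the three stem classes in one table makes the conditioning explicit; in the transducer the same conditioning is expressed through the context of the replace rules.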

Figure 4.26: A finite state transducer for inflections of imperative mood

The phonological rules listed in PR 4.17 are compiled into a network and composed with the finite state transducer demonstrated in Figure 4.26.

Phonological rules PR 4.17
i. ^IMPsg is removed for the non-honorific imperative. Regular expression: ^IMPsg -> [ ],
ii. ^IMPpl changes to ऊ for the honorific imperative after आ a ending verb stems. Regular expression: ^IMPpl -> ऊ || आ _ ,
iii. ^IMPhon changes to ओ for the plural imperative after आ a ending verb stems. Regular expression: ^IMPhon -> ओ || आ _ ,4

Negative imperative forms are obtained by prefixing the negative marker न nʌ 'NEG-' to the imperative form of the verb stem. Example (34) is the negative counterpart of example (33), in which the form नजाओ nʌ-dza-o 'NEG-go-IMP.2PL' is the negative imperative.

(34) तिमीहरू बजारतिर नजाओ!
timiː-ɦʌruː bʌdzar-tirʌ nʌ-dza-o!
2-PL market-DIR NEG-go-IMP.2PL
'(You) do not go towards the market.'

4 Arbitrary tags ^IMPsg and ^IMPpl are used for creating the environment and are finally eliminated from the network.

b. Optative Mood

The optative forms are obtained by combining verb stems with the optative inflections. The verb stems in the optative mood inflect for person, number, gender and honorificity. In example (35), the verb form गरेस् gʌr-es 'do-OPT.2SG' is the second person singular optative form. There are altogether eight optative inflections, and they are listed in Table 4.57.

(35) ल परदेशमा राम्रोसँग काम गरेस्।
lʌ pʌrʌdes-ma ramro-sʌ̃gʌ kam gʌr-es
PART foreign-LOC good-COM work do-OPT.2SG
'I wish (you) work nicely in a foreign country.'

Table 4.57: Inflections for optative mood (affirmative)

Grammatical category | Inflections | IPA | Tags
First person singular | ऊँ | ũ | OPT.1SG
First person plural | औँ | ʌũ | OPT.1PL
Second person singular | एस् | es | OPT.2SG
Second person singular hon | ए | e | OPT.2SG.HON
Second person plural | ए | e | OPT.2PL
Third person singular | ओस् | os | OPT.3SG
Third person singular hon | ऊन् | u:n | OPT.3SG.HON
Third person plural | ऊन् | u:n | OPT.3PL

The finite state transducer in Figure 4.27 can analyze and generate the inflections of optative mood. This transducer is also concatenated with transducer of verb stems.

Figure 4.27: A finite state transducer for inflections of optative mood


The phonological rules listed in PR 4.18 are compiled and composed with the transducer illustrated in Figure 4.27.

Phonological rules PR 4.18 i. Independent vowels ए e, औ ʌu, ओ o and ऊ u: change to their corresponding dependent vowels ◌े e, ◌ौ ʌu, ◌ो o and ◌ू u:, respectively if the verb stems end with consonants. Regular expression:

ए -> ◌े || cons __; औ -> ◌ौ || cons __; ओ -> ◌ो || cons __; ऊ -> ◌ू || cons __;

The negative optative forms are obtained by prefixing the negative marker न nʌ 'NEG-' to the optative forms of the verb. In sentence (36), the form नगरोस् nʌ-gʌr-os 'NEG-do-OPT.3SG' is the third person singular negative optative form.

(36) उसले त्यस्तो काम नगरोस्।
us-le tjʌsto kam nʌ-gʌr-os
3SG.OBL-ERG like.that work NEG-do-OPT.3SG
'I wish him not to do work like that.'

c. Potential Mood

Potential forms of verbs are obtained by combining verb stems with the potential inflections. The potential forms distinguish person, number, gender and honorificity. In example (37), the verb stem गर् gʌr 'do' and the potential mood marker -ला -la '-POT' form the third person singular potential verb form. Altogether, there are twelve potential mood inflections, and they are listed in Table 4.58.

(37) यो केटाले परीक्षा पास गर्ला।
jo ket̺a-le pʌriksja pas gʌr-la
DEM.PROX boy-ERG examination pass do-POT.3SG
'This boy may pass the examination.'

Table 4.58: Inflections for potential mood (affirmative)

Grammatical category | Inflections | IPA | Tags
First person singular | उँला | ũla | POT.1SG
First person plural | औँला | ʌũla | POT.1PL
Second person masculine singular | लास् | las | POT.2SG.MASC
Second person feminine singular | लिस् | lis | POT.2SG.FEM
Second person masculine singular hon | औला | ʌula | POT.2SG.MASC.HON
Second person feminine singular hon | औली | ʌuli | POT.2SG.FEM.HON
Second person plural | औला | ʌula | POT.2PL
Third person masculine singular | ला | la | POT.3SG.MASC
Third person feminine singular | ली | li | POT.3SG.FEM
Third person masculine singular hon | लान् | lan | POT.3SG.MASC.HON
Third person feminine singular hon | लिन् | lin | POT.3SG.FEM.HON
Third person plural | लान् | lan | POT.3PL

The finite state transducer in Figure 4.28 can analyze and generate the inflections of potential mood.

Figure 4.28: A finite state transducer for inflections of potential mood

The transducer presented in Figure 4.28 is composed with the network of the phonological rules in PR 4.19.

Phonological rules PR 4.19 i. Independent vowels उ u and औ ʌu change to their corresponding dependent vowels ◌ु u and ◌ौ ʌu if the verb stems end with consonants. Regular Expression:

उ -> ◌ु || cons __; औ -> ◌ौ || cons __;

The negative potential mood forms are obtained by prefixing the negative marker न nʌ 'NEG-' to the verb stems. The verb form नगर्ला nʌ-gʌr-la 'NEG-do-POT.3SG' in (38) is the third person negative potential form.

(38) यो केटाले परीक्षा पास नगर्ला।
jo ket̺a-le pʌriksja pas nʌ-gʌr-la
DEM.PROX boy-ERG examination pass NEG-do-POT.3SG
'This boy may not pass the examination.'

4.5.5 Participial forms

a. Absolutive

The absolutive form is formed by combining a verb stem with the absolutive marker ई i: '-ABS'. The absolutive form normally occurs with other verb forms to make compound verbs. In example (39), the verb form पठाई pʌt̺ʰa-iː 'send-ABS' is the absolutive form. The inflection for the absolutive participle is listed in Table 4.59 with its morphological tag, and its finite state transducer is demonstrated in Figure 4.29.

(39) उसले चिठ्ठी पठाई दियो।
us-le tsit̺ʰt̺ʰː pʌt̺ʰa-iː di-jo
3SG.OBL-ERG letter send-ABS give-PST.3SG
'He sent a letter.'

Table 4.59: Inflection for absolutive participle

Grammatical category | Inflections | IPA | Tags
Absolutive | ई | i: | ABS

The finite state transducer presented in Figure 4.29 can analyze and generate the absolutive form when it is concatenated with the finite state transducer of the verb stems.

Figure 4.29: A finite state transducer for inflection of absolutive form

The phonological rules involved are compiled and composed with the finite state transducer demonstrated in Figure 4.29.

Phonological rules PR 4.20 i. Independent vowel ई i: changes to its corresponding dependent vowel ◌ी i: if the verb stems end with consonants. Regular expression: ई -> ◌ी || cons __;

The negative absolutive form is obtained by prefixing the negative marker न nʌ 'NEG-' to the absolutive verb form. The verb form नपठाई nʌ-pʌt̺ʰa-iː 'NEG-send-ABS' in (40) is the negative absolutive form.

(40) उसले चिठ्ठी नपठाई राख्यो।
us-le tsit̺ʰt̺ʰː nʌ-pʌt̺ʰa-iː rakʰ-jo
3SG.OBL-ERG letter NEG-send-ABS keep-PST.3SG
'He kept the letter without sending it.'

b. Infinitive

The infinitive form of a verb is, in fact, the dictionary entry in Nepali. It is obtained by suffixing the infinitive marker -नु -nu '-INF' to the verb stem. In example (41), the verb form हिँड्नु ɦĩd̺-nu 'walk-INF' is the infinitive form. The infinitive has three forms: infinitive, oblique and emphatic. The infinitive is the default one, with the marker -नु -nu '-INF'; the oblique form, with the marker -ना -na, occurs with case markers; and the emphatic form, with the marker -नै -nʌi, occurs at the pragmatic level. The infinitive markers are listed in Table 4.60 with their morphological tags.

(41) बिहान हिँड्नु राम्रो कुरो हो।
biɦan ɦĩd̺-nu ramro kuro ɦo
morning walk-INF good.SG.MASC thing be.ID.NPST.3SG
'To walk in the morning is a good thing.'

Table 4.60: Inflections for infinitive participle

Grammatical category | Inflections | IPA | Tags
Infinitive | नु | nu | INF
Infinitive oblique | ना | na | INF.OBL
Infinitive emphatic | नै | nʌi | INF.EMPH

The finite state transducer illustrated in Figure 4.30 can analyze and generate the infinitive forms when it is concatenated with the finite state transducer of the verb stems.

Figure 4.30: A finite state transducer for inflections of infinitive participial form
The negative infinitive form is obtained by prefixing the negative marker न- nʌ 'NEG-' to the verb stem. The verb form न हँ नु nʌ-ɦĩd̺-nu 'NEG-walk-INF' in (42) is the negative infinitive form.
(42) बहान न ह नु राॆो कुरो होइन।
biɦan nʌ-ɦid̺-nu ramro kuro ɦoinʌ
morning NEG-walk-INF good.SG.MASC thing be.ID.NPST.NEG.3SG
'Not to walk in the morning is not a good thing.'

c. Purposive
The purposive form of the verb is obtained by combining the verb stem with the purposive marker -न -nʌ '-PURP'. The verb form क kin-nʌ 'buy-PURP' in (43) is the purposive form. The purposive marker also inflects for emphasis. The inflections for the purposive participle are listed in Table 4.61 with their morphological tags.
(43) म समान क बजार जा छु ।
mʌ sʌman kin-nʌ bʌdzar dza-n-tsʰu
1SG thing buy-PURP market go-φ-NPST.1SG
'I will go to the market (in order) to buy things.'

Table 4.61: Inflections for purposive participle
Grammatical category | Inflections | IPA | Tags
Purposive | न | nʌ | PURP
Purposive emphasis | नै | nʌi | PURP.EMPH

The finite state transducer illustrated in Figure 4.31 encodes purposive participial inflections listed in Table 4.61 and it is capable of analyzing and generating the purposive forms when it is concatenated with the finite state transducer of the verb stems.

Figure 4.31: A finite state transducer for inflections of purposive participial form
d. Prospective
The prospective form of the verb is formed by suffixing the prospective marker -ने -ne to the verb stem. The verb form गुजान gudzar-ne in (44) is the prospective form. The prospective inflection is listed in Table 4.62 with its morphological tag.
(44) उ नह ले यो वष प न ऽपालमु न नै गुजान छन्।
uniː-ɦʌruː-le jo wʌrsʌ pʌni tripalmuni nʌi gudzar-ne tsʰʌn
3.OBL-PL-ERG this year also tent PART spend-PROSP be.NPST.3PL
'This year also, they will spend under the tent.'

Table 4.62: Inflection for prospective participle
Grammatical category | Inflections | IPA | Tags
Prospective | ने | ne | PROSP

The finite state transducer demonstrated in Figure 4.32 encodes the prospective participial forms and it can analyze and generate the prospective forms when it is concatenated with the finite state transducer of the verb stems.

Figure 4.32: A finite state transducer for inflection of prospective participial form

e. Durative
The durative form is formed by combining a verb stem with the durative marker -दा -da. In example (45), the verb form आउँदा a-ũ-da is the durative form, which indicates the duration of the action. The durative form also inflects for emphasis. The inflections for the durative participle are listed in Table 4.63 with their morphological tags.
(45) उ नह आउँदा म सु तरहेको थए।
uniː-ɦʌruː a-ũ-da mʌ sut-i-rʌɦ-eko tʰiẽ
3.OBL-PL come-φ-DUR 1SG sleep-ABS-remain-PERF.SG.M be.P1.SG
'I was sleeping when they arrived.'

Table 4.63: Inflections for durative participle
Grammatical category | Inflections | IPA | Tags
Durative | दा | da | DUR
Durative emphatic | दै | dʌi | DUR.EMPH

The durative inflections listed in Table 4.63 are compiled into a network as illustrated in Figure 4.33 and it becomes capable of analyzing and generating the durative forms of the verbs when it is concatenated with the finite state transducer of verb stems.

Figure 4.33: A finite state transducer for inflections of durative participial forms
f. Conjunctive
The conjunctive participle form of the verb is obtained by combining the verb stem with the conjunctive marker -एर -erʌ. In example (46), पढे र pʌdʰ-erʌ is the conjunctive participle form. This form inflects for emphasis at the pragmatic level with the marker -ऐ ʌi. There is another conjunctive marker, -इकन -ikʌnʌ, but it is used with low frequency. The inflections for the conjunctive participle are listed in Table 4.64 with their morphological tags.
(46) राम ःकुलमा पढे र आयो।
ram iskul-ma pʌdʰ-erʌ a-jo
Ram school-LOC read-CONJ come-PST.3SG.M
'Ram studied in the school and came.'

Table 4.64: Inflections for conjunctive participle
Grammatical category | Inflections | IPA | Tags
Conjunctive | एर | erʌ | CONJ
Conjunctive emphasis | एरै | erʌi | CONJ.EMPH
Conjunctive | इकन | ikʌnʌ | CONJ
Conjunctive emphasis | इकनै | ikʌnʌi | CONJ.EMPH

The finite state transducer illustrated in Figure 4.34 encodes the conjunctive participial forms and it can analyze and generate them when it is concatenated with the finite state transducer of the verb stems.

Figure 4.34: A finite state transducer for inflections of conjunctive participial form
The phonological rules involved in this process are listed in PR 4.21. They are compiled and composed with the finite state transducer illustrated in Figure 4.34.

Phonological rules PR 4.21
i. Independent vowels ए e and इ i change to their corresponding dependent vowels ◌े e and ि◌ i if the verb stem ends with a consonant.
Regular expression:
ए -> ◌े || cons __;
इ -> ि◌ || cons __;

g. Conditional
The conditional form of the verb is obtained by combining the verb stem with the conditional marker ए -e. In example (47), गरे gʌr-e is the conditional form of the verb. The inflection for the conditional participle is listed in Table 4.65 with its morphological tag.
(47) तमीले सहयोग गरे म पास हु छु ।
timiː-le sʌɦʌjog gʌr-e mʌ pas ɦu-n-tsʰu
2SG-ERG help do-COND 1SG pass be-φ-NPST.1SG
'If you help, I will pass.'

Table 4.65: Inflection for conditional participle
Grammatical category | Inflections | IPA | Tag
Conditional | ए | e | COND

The finite state transducer illustrated in Figure 4.35 encodes the conditional inflection and in association with transducer of verb stems, it can analyze and generate the conditional forms.

Figure 4.35: A finite state transducer for conditional participial form
The phonological rules in PR 4.22 are compiled and composed with the finite state transducer illustrated in Figure 4.35.

Phonological rules PR 4.22
i. Independent vowel ए e changes to its corresponding dependent vowel ◌े e if the verb stem ends with a consonant.
Regular expression: ए -> ◌े || cons __;
h. Perfective
The perfective form of the verb is obtained by combining the verb stem with the perfective marker ए -e. In example (48), भने bʰʌn-e 'say-PERFT' is the perfective form of the verb. This form generally occurs with postpositions. The inflection for the perfective participle is listed in Table 4.65 with its morphological tag.5
(48) तमीले भनेअनुसार मैले काम गरे ।
timiː-le bʰʌn-e-ʌnusar mʌi-le kam gʌr-ẽ
2SG-ERG say-PERFT-POSTP 1SG.OBL-ERG work do-PST.1SG
'I did the work as you said.'

Table 4.65: Inflection for perfective participle
Grammatical category | Inflection | IPA | Tag
Perfective | ए | e | PERFT

5 Pokharel (2054VS) has treated this form as oblique of past tense forms.

The finite state transducer illustrated in Figure 4.36 encodes the perfective inflection. It can analyze and generate the perfective forms of the verbs when concatenated with the finite state transducer of the verb stems.

Figure 4.36: A finite state transducer for inflection of perfective participial form
The phonological rules listed in PR 4.23 are compiled and composed with the finite state transducer illustrated in Figure 4.36.

Phonological rules PR 4.23
i. Independent vowel ए e changes to its corresponding dependent vowel ◌े e if the verb stem ends with a consonant.
Regular expression: ए -> ◌े || cons __;
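Rules PR 4.20 to PR 4.23 share one shape: a suffix-initial independent vowel is rewritten as the matching dependent vowel sign when the stem ends in a consonant. The following is a minimal Python sketch of that behaviour, not the dissertation's transducer. It assumes that stems are stored as plain Devanagari strings without a final halant, that "consonant" means a letter in the Unicode range U+0915 to U+0939, and that brute-force lookup stands in for running the transducer in the analysis direction. The function names `generate` and `analyze` are illustrative.

```python
# Independent vowel -> dependent vowel sign, as stated in PR 4.20-4.23.
DEPENDENT = {
    "\u0908": "\u0940",  # independent ii -> dependent ii (PR 4.20)
    "\u090F": "\u0947",  # independent e  -> dependent e  (PR 4.21-4.23)
    "\u0907": "\u093F",  # independent i  -> dependent i  (PR 4.21)
}

# Suffixes as listed in Tables 4.59, 4.64 and 4.65 (tag -> Devanagari marker).
SUFFIXES = {
    "ABS": "\u0908",
    "CONJ": "\u090F\u0930",
    "COND": "\u090F",
    "PERFT": "\u090F",
}


def is_consonant(ch: str) -> bool:
    """Assumption: 'cons' in the rules covers the letters U+0915..U+0939."""
    return "\u0915" <= ch <= "\u0939"


def generate(stem: str, tag: str) -> str:
    """Concatenate stem and suffix, applying the vowel rule at the boundary."""
    suffix = SUFFIXES[tag]
    if stem and is_consonant(stem[-1]) and suffix[0] in DEPENDENT:
        suffix = DEPENDENT[suffix[0]] + suffix[1:]
    return stem + suffix


def analyze(surface: str, stems: list) -> list:
    """Return every (stem, tag) pair whose generated form equals the surface form."""
    return [(s, t) for s in stems for t in SUFFIXES if generate(s, t) == surface]
```

Because the conditional and perfective markers are identical, `analyze` returns both tags for a form such as gʌr-e, so the ambiguity has to be resolved outside the morphology.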

4.6 Summary
In this chapter, we presented the verb stems, which are classified into two major groups: intransitive-based stems and transitive-based stems. Intransitive-based stems are further grouped into five classes and transitive-based stems into four classes. Each class is distinct in at least one feature discussed in the process of stem formation. A small set of irregular verb stems was discussed as a separate class. The existential and identificational auxiliary verbs were discussed and illustrated under inflections, as they carry features broadly similar to the inflections. Inflectional paradigms for both affirmative and negative forms of tenses, aspects, moods and participles were analyzed and illustrated. The finite state transducers of each type were demonstrated. The phonological rules were identified, stated and expressed in regular expression format.

CHAPTER 5 ADVERBS, CONJUNCTIONS, POSTPOSITIONS AND PARTICLES

5.0 Outline
This chapter presents the analysis of closed-class words. It consists of five sections. Section 5.1 groups adverbs and presents them with their morphological tags and corresponding finite state transducers. In section 5.2, we present conjunctions with their morphological tags and finite state transducers. Section 5.3 deals with postpositions, namely the plural marker, case markers and adverbial postpositions, with their morphological tags. Section 5.4 discusses particles and interjections and presents their morphological tags and finite state transducers. Finally, section 5.5 summarizes the findings.

5.1 Adverbs in Nepali
Adverbs in Nepali indicate manner, place, time and intensity. They do not inflect, but they appear with postpositions in writing (Adhikari 2062VS). They are not obligatory elements of the sentence. For our purposes, they are classified into various groups based on their semantics.1

5.1.1 Temporal adverbs
Temporal adverbs indicate time with respect to the action performed, as आज adzʌ 'today' does in (1). Table 5.1 lists some temporal adverbs.
(1) राम आज ःकुल गयो।
ram adzʌ skul gʌ-jo
Ram today school go.PST-3SG.MASC
'Today, Ram went to school.'

1 Though the classification of adverbs in Nepali into various classes is not computationally significant, it has been done simply for identification.

Table 5.1: Temporal adverbs
Morphological tags | Devanagari | IPA | Gloss
+ADV+TEMP | अ हले | ʌhile | now
+ADV+TEMP | आज | adzʌ | today
+ADV+TEMP | अबेर | ʌberʌ | late
+ADV+TEMP | सोमबार | somʌbarʌ | Monday
+ADV+TEMP | हउँद | hiũdʌ | winter

Since temporal adverbs do not inflect, the finite-state transducer demonstrated in Figure 5.1 is capable of analyzing and generating them.

Figure 5.1: A finite state transducer for temporal adverbs

5.1.2 Spatial adverbs
Spatial adverbs indicate the place or location where the action has taken place, as नजकै nʌdzik-ʌi 'near-EMPH' does in (2). Table 5.2 lists some of the spatial adverbs.
(2) मेरो घर नजकै मि दर छ।
mero gʰʌr nʌdzik-ʌi mʌndir tsʰʌ
1SG.GEN house near-EMPH temple be.NP.3SG.MASC
'There is a temple near my house.'

Table 5.2: Spatial adverbs
Morphological tags | Devanagari | IPA | Gloss
+ADV+SPAC | तल | tʌlʌ | below
+ADV+SPAC | यहाँ | jʌhã | here
+ADV+SPAC | पछा ड | pʌtsad̺i | behind
+ADV+SPAC | भऽ | bʰitrʌ | inside
+ADV+SPAC | निजक | nʌdzik | near

The finite state transducer illustrated in Figure 5.2 encodes the adverbs listed in Table 5.2 and it can analyze and generate them.

Figure 5.2: A finite state transducer for spatial adverbs

5.1.3 Amount adverbs
Amount adverbs indicate the amount of the head noun they modify, as धेरै dʰerʌi 'more' does in (3). Table 5.3 lists some of the amount adverbs.
(3) रामसँग धेरै पैसा छ।
ram-sʌ̃gʌ dʰerʌi pʌisa tsʰʌ
Ram-COM more money be.NP.3SG.MASC
'Ram has a lot of money.'

Table 5.3: Amount adverbs
Morphological tags | Devanagari | IPA | Gloss
+ADV+AMOUNT | धेरै | dʰerʌi | more
+ADV+AMOUNT | थोरै | tʰorʌi | less
+ADV+AMOUNT | अ लक | ʌlik | little
+ADV+AMOUNT | यत | tjʌti | that much
+ADV+AMOUNT | कत | kʌti | how much

Since amount adverbs do not inflect, the finite-state transducer is simple. It is shown in Figure 5.3 and is capable of analyzing and generating them.

Figure 5.3: A finite state transducer for amount adverbs

5.1.4 Manner adverbs
Manner adverbs indicate the way or mode in which the action has taken place, as बःतारो bistaro 'slowly' does in (4). Table 5.4 lists some of the manner adverbs.
(4) सीता बःतारो प छे ।
siːta bistaro pʌd̺ʰ-tsʰe
Sita.FEM slowly read-NP.3SG.FEM
'Sita reads slowly.'

Table 5.4: Manner adverbs
Morphological tags | Devanagari | IPA | Gloss
+ADV+MANNER | सुःतर | sustʌri | slowly
+ADV+MANNER | कसर | kʌsʌri | how
+ADV+MANNER | सुटु | sut ̺ukkʌ | quietly
+ADV+MANNER | छटो | tsit ̺o | quickly
+ADV+MANNER | यसर | jʌsʌri | this way

The finite state transducer demonstrated in Figure 5.4 encodes the manner adverbs listed in Table 5.4 and it can analyze and generate them.

Figure 5.4: A finite state transducer for manner adverbs

5.1.5 Frequency adverbs
Frequency adverbs indicate how often the action is performed or takes place, as क हलेकाह kʌhilekahĩ 'sometimes' does in (5). Table 5.5 lists some of the frequency adverbs.
(5) हामी क हलेकाह बजार जा छ ।
ɦamiː kʌhilekahĩ bʌdzar dza-n-tsʰʌũ
1PL sometimes market go-φ-NP.1PL
'We sometimes go to market.'

Table 5.5: Frequency adverbs
Morphological tags | Devanagari | IPA | Gloss
+ADV+FREQ | क हलेकाह | kʌhilekahĩ | sometimes
+ADV+FREQ | सध | sʌdʰʌĩ | always
+ADV+FREQ | बार बार | barʌmbar | frequently
+ADV+FREQ | ूायः | prajʌ | often
+ADV+FREQ | नर तर | nirʌntʌr | continuously

The finite-state transducer is simple since frequency adverbs do not inflect. It is demonstrated in Figure 5.5 and it is capable of analyzing and generating them.

Figure 5.5: A finite state transducer for frequency adverbs

5.1.6 Reason adverbs
Reason adverbs provide reasons, as यसकारण tjʌskarʌɳ 'therefore' does in (6). Table 5.6 lists some of the reason adverbs.
(6) यसकारण म मा थ दुवै ठाउँको ूभाव छ।
tjʌskarʌɳ mʌ matʰi duwʌi tʰaũko prʌbʰawʌ tsʰʌ
therefore 1SG above both place-GEN influence be-NP.3SG.MASC
'Therefore, I have the influences of both places.'

Table 5.6: Reason adverbs
Morphological tags | Devanagari | IPA | Gloss
+ADV+REASON | यसकारण | tjʌsʌkarʌɲ | therefore
+ADV+REASON | फलःव प | pʰʌlʌswʌrup | as a result
+ADV+REASON | तसथ | tʌsʌrtʰʌ | thus

The reason adverbs listed in Table 5.6 are compiled into a finite state transducer as shown in Figure 5.6 and it can analyze and generate them.

Figure 5.6: A finite state transducer for reason adverbs

5.1.7 Sentential adverbs
Sentential adverbs modify the entire sentence, as सायद sajʌdʌ 'probably' does in (7). Table 5.7 lists some of the sentential adverbs.
(7) सायद य त धेरै खुशीलाई मनमै अटाउन असमथ भए।
sajʌdʌ jʌti dʰerʌi kʰusi-laiː mʌn-mʌi ʌtaunʌ ʌsʌmʌrtʰ bʰʌẽ
probably this much more happy-DAT heart-LOC.EMPH keep-INF unable be-P.1SG
'Probably, (I) could not keep this much happiness within the heart.'

Table 5.7: Sentential adverbs
Morphological tags | Devanagari | IPA | Gloss
+ADV+SENT | सायद | sajʌd | probably
+ADV+SENT | अवँय | ʌwʌʃjʌ | surely
+ADV+SENT | सामा यतः | samanjʌtʌ | generally
+ADV+SENT | य प | jʌdjʌpi | however
+ADV+SENT | साँ चै | sãtstsʌi | truly

The finite state transducer illustrated in Figure 5.7 encodes the sentential adverbs listed in Table 5.7 and it is capable of analyzing and generating them.

Figure 5.7: A finite state transducer for sentential adverbs

The individual finite-state transducers of the adverbs can be unioned into a single finite state transducer that handles all the adverbs; that is, it can analyze and generate them.
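The union of the per-class adverb networks can be pictured with plain lookup tables. The sketch below is an illustration in Python rather than the finite state toolkit used in this study. It assumes romanized (IPA) forms taken from Tables 5.1 to 5.3, an analysis string of the shape form plus tags, and the hypothetical helper names `union` and `invert`.

```python
from collections import defaultdict

# A few entries per class, copied from Tables 5.1-5.3 (IPA forms).
TEMPORAL = {"ʌhile": "+ADV+TEMP", "adzʌ": "+ADV+TEMP"}
SPATIAL = {"tʌlʌ": "+ADV+SPAC", "bʰitrʌ": "+ADV+SPAC"}
AMOUNT = {"dʰerʌi": "+ADV+AMOUNT", "kʌti": "+ADV+AMOUNT"}


def union(*lexicons):
    """Merge class lexicons; a form may keep analyses from several classes."""
    merged = defaultdict(set)
    for lexicon in lexicons:
        for form, tags in lexicon.items():
            # Assumed analysis format: surface form followed by its tags.
            merged[form].add(form + tags)
    return dict(merged)


def invert(analyzer):
    """Turn the analysis direction (form -> analyses) into generation."""
    generator = defaultdict(set)
    for form, analyses in analyzer.items():
        for analysis in analyses:
            generator[analysis].add(form)
    return dict(generator)


ADVERBS = union(TEMPORAL, SPATIAL, AMOUNT)  # analysis direction
GENERATOR = invert(ADVERBS)                 # generation direction
```

Storing sets rather than single strings keeps the union well defined if two classes happen to list the same form.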

5.2 Conjunctions in Nepali
Conjunctions in Nepali are of two kinds: coordinate and subordinate. The coordinate conjunctions are simple in their formation, whereas the subordinate conjunctions are of both simple and compound types (Pokharel 2054VS; Adhikari 2062VS).

5.2.1 Coordinate conjunctions
Coordinate conjunctions join constituents of the sentence that have equal status. In (8), र rʌ 'and' joins कलम kʌlʌm 'pen' and कताब kitab 'book'. Some coordinate conjunctions in Nepali are listed in Table 5.8.
(8) रामले कलम र कताब क यो।
ram-le kʌlʌm rʌ kitab kin-jo
Ram-ERG pen and book buy-P.3SG.MASC
'Ram bought a pen and a book.'

Table 5.8: Coordinate conjunctions
Morphological tags | Devanagari | IPA | Gloss
+CCONJ | र | rʌ | and
+CCONJ | वा | wa | or
+CCONJ | तर | tʌrʌ | but
+CCONJ+EMPH | तरै | tʌrʌi | but
+CCONJ | अन | ʌni | and
+CCONJ | तथा | tʌtʰa | and

The coordinators are compiled into a finite state transducer as shown in Figure 5.8 and it can analyze and generate them.

Figure 5.8: A finite state transducer for coordinate conjunctions

5.2.2 Subordinate conjunctions
Subordinate conjunctions join constituents of the sentence that have unequal status, mainly a subordinate clause with a matrix clause, as क ki 'that' does in (9). Some subordinate conjunctions in Nepali are listed in Table 5.9.
(9) भ डार ले भ ु भयो क उहाँले सं माम कन छो नु भयो।
bʰʌndari-le bʰʌn-nu bʰʌ-jo ki uhã-le sʌŋgram kinʌ tsʰod-nu bʰʌ-jo
Bhandari-ERG say-INF be-P.3SG that 3SG.HON war why leave-INF be-P.3SG
'Bhandari said why he gave up the war.'

Table 5.9: Subordinate conjunctions
Morphological tags | Devanagari | IPA | Gloss
+SCONJ | कन क | kinʌki | so that
+SCONJ | भने | bʰʌne | if
+SCONJ | क | ki | that
+SCONJ | कनभने | kinʌbʰʌne | that's why
+SCONJ | यसकारण | jʌsʌkarʌɳ | therefore

The subordinate conjunctions are compiled into a finite state transducer as illustrated in Figure 5.9 and it is capable of analyzing and generating them.

Figure 5.9: A finite state transducer for subordinate conjunctions

5.3 Postpositions in Nepali
Postpositions in Nepali always follow the nominal. According to the functions they perform in the sentence and their semantics, the postpositions can be grouped into three classes: the plural/collective marker, case markers and adverbial postpositions (Hardie et al 2005).

5.3.1 Plural/collective marker
-ह -ɦʌruː is the plural or collective marker in Nepali. It occurs optionally with o-ending nouns but systematically with non-o-ending nouns and pronouns. However, it is not an obligatory element for indicating plurality in nominals, as there are other mechanisms for expressing plural number in Nepali (see 3.1.1).

Table 5.10: Collective/plural marker
Morphological tags | Devanagari | IPA | Gloss
+PL | -ह | ɦʌruː | plural

The plural marker has been compiled into a finite state transducer, as illustrated in Figure 5.10, which can analyze and generate it.

Figure 5.10: A finite state transducer for the plural/collective marker

5.3.2 Case markers in Nepali
Cases in Nepali morphology are marked with postpositions, except for the nominative and absolutive cases. Traditional Nepali grammars identify the following cases: ergative, instrumental, dative, ablative, locative, comitative, genitive and allative. The ergative and instrumental cases share the marker -ले -le; the ablative case is marked by two markers, बाट bat ̺ʌ and दे िख -dekʰi; the comitative/associative case by two markers, सँग -sʌ̃gʌ and सत -sitʌ; and the genitive case by को -ko, का -ka and क -ki for singular, plural and feminine features, respectively. The allative case is marked with तर -tirʌ (see 3.1.1). The case markers in Nepali are listed in Table 5.11a and Table 5.11b. The case markers in Table 5.11a do not take an emphatic marker, whereas those in Table 5.11b do.

Table 5.11a: Case marker postpositions (i)
Morphological tags | Devanagari | IPA | Gloss
+ERG | ले | -le | Ergative
+INST | ले | -le | Instrumental
+DAT | लाई | -laiː | Dative
+ABL | दे िख | -dekʰi | Ablative

They are separately compiled into a finite state transducer which can analyze and generate them.

Table 5.11b: Case marker postpositions (ii)
Morphological tags | Devanagari | IPA | Gloss
+ABL | बाट | -bat ̺ | Ablative
+ABL+EMPH | बाटै | -bat ̺ʌi | Ablative
+LOC | मा | -ma | Locative
+LOC+EMPH | मै | -mʌi | Locative
+COM | सँग | -sʌ̃gʌ | Comitative
+COM+EMPH | सँगै | -sʌ̃gʌi | Comitative
+COM | सत | -sitʌ | Comitative
+COM+EMPH | सतै | -sitʌi | Comitative
+GEN+SG | को | -ko | Genitive
+GEN+PL | का | -ka | Genitive
+GEN+FEM | क | -ki | Genitive
+GEN+EMPH | कै | -kʌi | Genitive
+DIR | तर | -tirʌ | Directional
+DIR+EMPH | तरै | -tirʌi | Directional

The case markers listed in Table 5.11b also take the emphatic marker. Therefore, they are compiled into a finite state transducer along with the emphatic marker. It can analyze and generate them.
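How a nominal lexicon combines with the case-marker network can be sketched as suffix stripping over romanized forms. The Python below is a hedged illustration, not the compiled transducer. The marker list copies a subset of Tables 5.11a and 5.11b, while the two-noun lexicon (ram, gʰʌr, both taken from the examples in this chapter), the function name `analyze` and the output format are assumptions.

```python
# (marker, tags) pairs copied from Tables 5.11a and 5.11b (subset, IPA).
CASE_MARKERS = [
    ("le", "+ERG"), ("le", "+INST"), ("laiː", "+DAT"), ("dekʰi", "+ABL"),
    ("ma", "+LOC"), ("mʌi", "+LOC+EMPH"),
    ("ko", "+GEN+SG"), ("ka", "+GEN+PL"), ("ki", "+GEN+FEM"),
    ("kʌi", "+GEN+EMPH"), ("tirʌ", "+DIR"), ("tirʌi", "+DIR+EMPH"),
]

# Hypothetical miniature noun lexicon.
NOUNS = {"ram", "gʰʌr"}


def analyze(word):
    """Return all noun + case-marker analyses of a romanized word."""
    analyses = []
    for marker, tags in CASE_MARKERS:
        stem = word[: len(word) - len(marker)]
        if word.endswith(marker) and stem in NOUNS:
            analyses.append(stem + tags)
    return analyses
```

Under these assumptions a form in -le receives both an ergative and an instrumental analysis, reflecting the shared marker noted in the text.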

5.3.3 Adverbial postpositions
There are a number of forms which behave like postpositions but also have content meaning like that of adverbs. These forms usually occur with nominals and provide information about time, space, amount, frequency and manner. Some adverbial postpositions which do not take the emphatic marker are listed in Table 5.12a.

Table 5.12a: Adverbial postpositions (a)
Morphological tags | Devanagari | IPA | Gloss
+POSTP | मा थ | -matʰi | above
+POSTP | कहाँ | -kʌhã | where
+POSTP | मु न | -muni | below, under
+POSTP | वा र | -wari | this side
+POSTP | पा र | -pari | other side
+POSTP | वर | -wʌri | this side
+POSTP | पर | -pʌri | other side
+POSTP | प | -pʌt ̺t ̺i | towards
+POSTP | ूत | -prʌti | towards
+POSTP | पा ल | -pali | time
+POSTP | खे र | -kʰeri | while doing
+POSTP | छे उ | -tsʰeu | at the side
+POSTP | पछ | -pʌtsʰi | later
+POSTP | पछा ड | -pʌtsʰad̺i | back side
+POSTP | अिघ | -ʌgʰi | before
+POSTP | अगा ड | -ʌgad̺i | before
+POSTP | भर | -bʰʌri | full of
+POSTP | प हले | -pʌhile | before
+POSTP | नि त | -nimti | for
+POSTP | ला ग | -lagi | for
+POSTP | बारे | -bare | about
+POSTP | सामु | -samu | before
+POSTP | सर | -sʌri | like
+POSTP | म ये | -mʌdʰje | among
+POSTP | जत | -dzʌti | whatever
+POSTP | प छे | -pitstsʰe | each

The finite-state transducer shown in Figure 5.12a encodes the adverbial postpositions listed in Table 5.12a and is capable of analyzing and generating them.

Figure 5.12a: A finite state transducer for adverbial postpositions that do not take emphatic marker
Some postpositions take the emphatic marker; they are listed in Table 5.12b.

Table 5.12b: Adverbial postpositions (b)
Morphological tags | Devanagari | IPA | Gloss
+POSTP | स हत | -sʌhitʌ | along with
+POSTP | साथ | -satʰʌ | with
+POSTP | स म | -sʌmmʌ | until, upto
+POSTP | बा हर | -baɦirʌ | outside
+POSTP | वार | -warʌ |
+POSTP | पार | -parʌ |
+POSTP | वर | -wʌrʌ | little near
+POSTP | पर | -pʌrʌ | little further
+POSTP | उँभो | -ũbʰo | up
+POSTP | उँधो | -ũdʰo | down
+POSTP | तफ | -tʌrpʰʌ | towards
+POSTP | नेर | -nerʌ | near
+POSTP | नर | -nirʌ | near
+POSTP | सम | -sʌmʌk | before
+POSTP | पय त | -pʌrjʌntʌ | till
+POSTP | खेर | -kʰerʌ | moment
+POSTP | उूा त | -uprantʌ | then after
+POSTP | पख | -pʌkʰʌ | time
+POSTP | ताक | -takʌ | time
+POSTP | ताका | -taka | time
+POSTP | पाला | -pala | time
+POSTP | पटक | -pʌt ̺ʌkʌ | times
+POSTP | प ट | -pʌlt ̺ʌ | times
+POSTP | प ात् | -pʌʃtsat | after
+POSTP | छे क | -tsʰekʌ | at the time
+POSTP | भऽ | -bʰitrʌ | inside
+POSTP | निजक | -nʌdzikʌ | near
+POSTP | भर | -bʰʌrʌ | full of
+POSTP | तक | -tʌkʌ | till
+POSTP | यता | -jʌta | here
+POSTP | उता | -uta | there
+POSTP | बीच | -biːtsʌ | between
+POSTP | नम | -nimittʌ | for the sake of
+POSTP | खा तर | -kʰatirʌ | for
+POSTP | अ तगत | -ʌntʌrgʌtʌ | within
+POSTP | बमोिजम | -bʌmodzimʌ | according to
+POSTP | मा फक | -mapʰikʌ | according to
+POSTP | मुता बक | -mutabikʌ | according to
+POSTP | अनुसार | -ʌnusarʌ | according to
+POSTP | उपर | -upʌrʌ | on
+POSTP | माफत | -marpʰʌtʌ | via
+POSTP | अलावा | -ʌlawa | beside
+POSTP | अतर | -ʌtiriktʌ | in addition
+POSTP | बाहे क | -bahekʌ | except
+POSTP | जःतो | -dzʌsto | like
+POSTP | सरह | -sʌrʌhʌ | same as
+POSTP | बाबजुद | -babʌdzudʌ |
+POSTP | व | -wiruddʰʌ | against
+POSTP | बापत | -bapʌtʌ | for
+POSTP | स ा | -sʌt ̺t ̺a | instead of
+POSTP | बदला | -bʌdʌla | instead of
+POSTP | लेखा | -lekʰa |
+POSTP | सु | -suddʌ |
+POSTP | समेत | -sʌmetʌ | along with

The adverbial postpositions that take the emphatic marker, listed in Table 5.12b, are compiled into a network as shown in Figure 5.12b, which can analyze and generate them.

Figure 5.12b: A finite state transducer for adverbial postpositions that take emphatic marker

Phonological rule PR 5.1
1. The vowels ◌ा a and ◌ो o and the halant ◌् at the end of the adverbial postpositions of this group are removed when the emphatic marker ◌ै ʌi is attached.
Regular expression: ◌ा|◌ो|◌् -> [ ] || cons __ ◌ै #;
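PR 5.1 can be approximated with an ordinary regular expression over Devanagari strings. The sketch below is in Python, not the rule compiler used in this study. It assumes that "cons" corresponds to the Unicode range U+0915 to U+0939 and that the emphatic marker is first concatenated and the rule then applied. The function name `attach_emphatic` is hypothetical.

```python
import re

EMPH = "\u0948"  # dependent vowel sign ai, the emphatic marker

# aa-sign, o-sign or halant, directly after a consonant and directly
# before the word-final emphatic marker: the context stated in PR 5.1.
PR_5_1 = re.compile("(?<=[\u0915-\u0939])[\u093E\u094B\u094D](?=\u0948$)")


def attach_emphatic(postposition: str) -> str:
    """Concatenate the emphatic marker, then delete the vowel or halant."""
    return PR_5_1.sub("", postposition + EMPH)
```

A postposition written with a final bare consonant, such as -satʰʌ 'with', matches nothing in the rule and simply receives the marker, while -pala 'time' loses its final vowel sign first.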

5.4 Particles and interjections in Nepali
5.4.1 Particles
In Nepali, particles are the residual class comprising those stems which do not enter into inflectional constructions and stand as free forms (Dahal 1974). They appear before or after any lexical word and add an abstract meaning to the word they are associated with. The added meaning can only be predicted from the context in which they are used. Nepali particles are monosyllabic or disyllabic words, and their behavior differs from that of other indeclinable words such as adverbs, postpositions and conjunctions. In written Nepali, particles are written separately. Some particles are listed in Table 5.13 with their morphological tags.

Table 5.13: Particles in Nepali
Morphological tags | Devanagari | IPA
+PARTICLE | नै | nʌi
+PARTICLE | माऽ | matrʌ
+PARTICLE | चा हँ | tsahĩ
+PARTICLE | पन | pʌni
+PARTICLE | ल | lʌ
+PARTICLE | है | hʌi
+PARTICLE | न | nʌ
+PARTICLE | न | ni
+PARTICLE | त | tʌ
+PARTICLE | पो | po
+PARTICLE | र | rʌ

The finite state transducer as illustrated in Figure 5.13 encodes the particles listed in Table 5.13 and it can analyze and generate them.

Figure 5.13: A finite state transducer for particles

5.4.2 Emphatic markers
a. ऐ ʌi emphasizes, draws attention to, or focuses a sentence or a part of a sentence. In addition, techniques such as (a) sentence stress, (b) use of particles, (c) dislocation of sentence constituents and (d) intonation are used to emphasize a sentence or part of it. Techniques (a) and (d) are phonological, and technique (b) is syntactic, with the particle following the word that is focused. Technique (c) involves topicalization and some kinds of movement. The use of the emphatic marker ऐ ʌi is not restricted to a particular class of words. Apart from some phonological constraints, it attaches to any word irrespective of its part of speech. This marker can therefore be called a global marker (Pokharel 2053VS). The conditions under which this emphatic marker can or cannot appear are as follows.
i. It does not appear with words ending in इ i, ए e and उ u (except आफू apʰuː).
ii. When it attaches to words ending in ʌ, a and o, these vowels are deleted, as illustrated below.
म mʌ 'I': mʌ + ʌi = mʌi मै
केटो keto 'boy': keto + ʌi = ketʌi केटै
राजा radza 'king': radza + ʌi = radzʌi राजै
iii. With words ending in consonants, however, this marker attaches without any change, as shown below.
ख ruːkʰ 'tree': ruːkʰ + ʌi = ruːkʰʌi खै
कताब kitab 'book': kitab + ʌi = kitabʌi कताबै

In contrast, when this marker attaches to an adjective, it de-emphasizes the attribute expressed by the adjective.
(10) रामको बानी राॆै छ।
ram-ko bani ramrʌi tsʰʌ
Ram-GEN habit good.DEMPH be.NP.3SG.MASC
'Ram's habit is okay (lesser than good).'
b. ह ɦi emphasizes only some of the demonstratives. With interrogative pronouns, however, the same marker changes them into indefinite pronouns. The examples below demonstrate this phenomenon.
यो 'this' + ह = यह 'this-EMPH'
यो 'that' + ह = यह 'that-EMPH'
ऊ 'this' + ह = उह 'this-EMPH'
सो 'the same' + ह = सोह 'the same-EMPH'
को 'who' + ह = कोह 'who-INDEF'
के 'what' + ह = केह 'what-INDEF'
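The attachment conditions (i) to (iii) for the emphatic marker ʌi, given under (a) in this section, can be written as a small function over romanized forms. This Python sketch is an illustration under stated assumptions: words are IPA strings, a trailing length mark ː is ignored when the final segment is inspected, and the function name `add_emphatic` is hypothetical. The exception apʰuː is deliberately left unimplemented because its emphatic form is not given in this section.

```python
from typing import Optional

BLOCKING = ("i", "e", "u")   # condition (i): the marker does not attach
DELETING = ("ʌ", "a", "o")   # condition (ii): the final vowel is deleted
LENGTH = "ː"
EMPH = "ʌi"


def add_emphatic(word: str) -> Optional[str]:
    """Return the emphatic form, or None where the marker cannot attach."""
    if word == "apʰuː":
        # Listed as an exception to (i); the resulting form is not stated.
        raise NotImplementedError("emphatic form of apʰuː is not given")
    base = word.rstrip(LENGTH)
    final = base[-1:]
    if final in BLOCKING:
        return None
    if final in DELETING:
        return base[:-1] + EMPH
    # Condition (iii): consonant-final words take the marker unchanged.
    return word + EMPH
```

The five worked forms listed under (ii) and (iii) can serve as a regression set for any fuller implementation.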

5.4.3 Interjections in Nepali
Interjections are a subset of particles which generally appear in sentence-initial position and express the speaker's emotions, such as surprise, pain, disgust, joy, excitement and enthusiasm (Pokharel 2054VS:125-31). Interjections are also indeclinable words. Some examples of interjections are listed in Table 5.14 with their morphological tags.

Table 5.14: Interjections in Nepali
Morphological tags | Devanagari | IPA
+INTERJ | ओहो | oho
+INTERJ | आ था | attʰa
+INTERJ | छ | tsʰi
+INTERJ | थुइ | tʰuikkʌ
+INTERJ | हरे | hʌre
+INTERJ | बचरा | bitsʌra
+INTERJ | या | dzja
+INTERJ | ए | e
+INTERJ | यू | dzju
+INTERJ | नाइँ | naĩ
+INTERJ | कुि | kunni

The interjections listed in Table 5.14 are compiled into a finite state transducer as demonstrated in Figure 5.14 and it is capable of analyzing and generating them.

Figure 5.14: A finite state transducer for interjections

5.5 Summary
In this chapter, we discussed and presented the grouping of adverbs, conjunctions, case markers, particles and interjections. Adverbs in Nepali are grouped into seven semantic classes: temporal, spatial, amount, frequency, manner, reason and sentential. Conjunctions are of two types: coordinate and subordinate. Postpositions are of three types: the plural marker, case markers and adverbial postpositions. Particles in general form a single class, but the two emphatic markers, which can be applied globally except where phonologically constrained, are equivalent to some particles. The interjections form a single class.

CHAPTER 6 DERIVATIONAL MORPHOLOGY

6.0 Outline
This chapter presents an analysis of derivational processes in Nepali morphology. It consists of three sections. Section 6.1 presents prefixation, which includes noun to noun derivation, noun to adjective derivation, noun to adverb derivation and adjective to adjective derivation. For each type, a model of the finite state transducer and a tag for prefixation are provided. In section 6.2, we present the suffixation process, which includes noun to noun derivation, noun to noun adjective derivation, noun to noun/adjective derivation, adjective to noun derivation, adjective/noun to noun derivation, verb to noun derivation, verb to adjective derivation, verb to adverb derivation, adverb to adjective derivation, verb to noun conversion, verb to adjective/noun conversion and verb to noun derivation. For each type of derivation, the morphological tags and finite state transducers for each group are illustrated. Finally, section 6.3 summarizes the chapter.

6.1 Prefixation
Derivation is a morphological process of word formation. It involves the addition of bound affixes to an existing lexeme or stem, whereby the addition of the affix derives a new word or lexeme (Katamba 1993; Payne 1997). The word class of the newly formed word is generally different from that of the word from which it is derived. This is not always true, however: sometimes the word class remains the same, although the semantics of the word definitely changes. A derived word may show a clear change in meaning or add speciality, technicality or stylistic value. Derivation is possible both within the same word class and across word classes. In most prefixing derivation the word class remains the same, with few exceptions, whereas in suffixing derivation the word class changes, with few exceptions (Adhikari 2062VS). In prefixation, an affix is prefixed to a base stem and a new word is derived. In Nepali, a number of prefixes, as listed in Tables 6.1-6.4, combine with a base stem to derive a word. The various types of derivation using prefixes are discussed in subsequent sections.

6.1.2 Noun to noun derivation
In this type of derivation, 24 prefixes are involved and they derive a noun from a noun stem. The semantics of the prefixes is not predictable, so they are simply marked as prefixes with the tag PFX. Table 6.1 lists those prefixes with a base stem and the derived word.

Table 6.1: Noun to noun derivation
Prefix | Base noun stem | Gloss | Derived noun | Gloss
प्र prʌ- | चलन tsʌlʌn | tradition | प्रचलन prʌtsʌlʌn | tradition
परा pʌra- | जय dzʌjʌ | victory | पराजय pʌradzʌjʌ | defeat
अप ʌpʌ- | शब्द ʃʌbdʌ | word | अपशब्द ʌpʌʃʌbdʌ | abusive word
सम् sʌm- | मान man | honor | सम्मान sʌmman | respect
अनु ʌnu- | शासन ʃasʌn | governance | अनुशासन ʌnuʃasʌn | discipline
अव ʌwa- | गुण guɳ | quality | अवगुण ʌwaguɳ | demerit
दुस् dus- | परिणाम pʌriɳam | result | दुस्परिणाम duspʌriɳam | bad result
दुर् dur- | घटना gʰʌt̺ʌna | event | दुर्घटना durgʰʌt̺ʌna | accident
वि wi- | नाश naʃ | loss | विनाश winaʃ | damage
अधि ʌdʰi- | राज्य radzjʌ | state | अधिराज्य ʌdʰiradzjʌ | kingdom
अति ʌti- | वृष्टि wristi | rain | अतिवृष्टि ʌtiwristi | excessive rain
अभि ʌbʰi- | रुचि rutsi | interest | अभिरुचि ʌbʰirutsi | interest
प्रति prʌti- | ध्वनि dʰwʌni | sound | प्रतिध्वनि prʌtidʰwʌni | echo
परि pʌri- | योजना jodzʌna | plan | परियोजना pʌrijodzʌna | project
उप upʌ- | ग्रह grʌɦʌ | planet | उपग्रह upʌgrʌɦʌ | satellite
सह sʌɦʌ- | कार्य karjʌ | work | सहकार्य sʌɦʌkarjʌ | collaboration
स sʌ- | परिवार pʌriwar | family | सपरिवार sʌpʌriwar | whole family
कु ku- | पुत्र putrʌ | son | कुपुत्र kuputrʌ | bad son
अ ʌ- | ज्ञान gjan | knowledge | अज्ञान ʌgjan | ignorance
अन् ʌn- | आस्था astʰa | belief | अनास्था ʌnastʰa | disbelief
बे be- | इज्जत idzdzʌt | prestige | बेइज्जत beidzdzʌt | insult
बद bʌdʌ- | नाम nam | name | बदनाम bʌdʌnam | bad name
ला la- | वारिस waris | heir | लावारिस lawaris | ???
सु su- | समाचार sʌmatsar | news | सुसमाचार susʌmatsar | good news

The finite-state transducer in Figure 6.1 is common to all the prefixes listed in Table 6.1. It can analyze and generate both base and derived words.

Figure 6.1: A finite state transducer for noun to noun derivation (arcs: PFX+:Prefix, NounType, +NOUN:0)
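The two-way behaviour of such a network (analysis from surface form to tags, generation from tags to surface form) can be illustrated with a small Python sketch. It is not the xfst implementation: the relation is simply enumerated as a set of string pairs, the layout of the upper side ("PFX+" + stem + "+NOUN") is an assumption read off the arc labels of Figure 6.1, and the toy lexicon uses three prefixes and three stems from Table 6.1. As in an unrestricted network, every prefix combines with every stem here, which overgenerates.

```python
# Hypothetical toy lexicon: three prefixes and three noun stems from Table 6.1.
PREFIXES = ["उप", "कु", "सु"]
NOUN_STEMS = ["ग्रह", "पुत्र", "समाचार"]


def build_relation(prefixes, stems):
    """Enumerate the (upper, lower) string pairs of the assumed network."""
    pairs = set()
    for stem in stems:
        # Path without the prefix arc: the base noun.
        pairs.add((stem + "+NOUN", stem))
        # Path through the PFX+:Prefix arc: the derived noun.
        for prefix in prefixes:
            pairs.add(("PFX+" + stem + "+NOUN", prefix + stem))
    return pairs


def analyze(relation, surface):
    """Surface form -> sorted list of lexical (upper) strings."""
    return sorted(upper for upper, lower in relation if lower == surface)


def generate(relation, lexical):
    """Lexical (upper) string -> sorted list of surface forms."""
    return sorted(lower for upper, lower in relation if upper == lexical)


RELATION = build_relation(PREFIXES, NOUN_STEMS)
print(analyze(RELATION, PREFIXES[0] + NOUN_STEMS[0]))
print(generate(RELATION, "PFX+" + NOUN_STEMS[1] + "+NOUN"))
```

Because the upper side records only the tag PFX and not the identity of the prefix, generation from a tagged form returns one surface form per prefix in this sketch.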

6.1.3 Noun to adjective derivation
In this type of derivation, 9 prefixes are involved and they derive an adjective from a noun stem. The semantics of the prefixes is not predictable, so they are simply marked as prefixes with the tag PFX. Table 6.2 lists those prefixes with a base stem and the derived word.

Table 6.2: Noun to adjective derivation
Prefix | Base noun stem | Gloss | Derived adjective | Gloss
निर् nir- | दोष dos | fault | निर्दोष nirdos | innocent
निः ni:- | स्वार्थ swartʰʌ | self-interest | निःस्वार्थ ni:swartʰʌ | selfless
नि ni- | डर d̺ʌr | fear | निडर nid̺ʌr | bold
वि wi- | मुख mukʰ | mouth | विमुख wimukʰ | deviated
निस् nis- | फल pʰʌl | fruit | निस्फल nispʰʌl | fruitless
स sʌ- | बल bʌl | force | सबल sʌbʌl | capable
बे be- | घर gʰʌr | house | बेघर begʰʌr | homeless
अ ʌ- | मूल्य muːljʌ | value | अमूल्य ʌmuːljʌ | valueless
अन ʌnʌ- | मोल mol | price | अनमोल ʌnʌmol | priceless

The finite-state transducer in Figure 6.2 is common to all the prefixes listed in Table 6.2 and it is capable of analyzing and generating them.

Figure 6.2: A finite state transducer for noun to adjective derivation (arcs: PFX+:Prefix, NounType, +NOUN:0, +ADJ:0)

6.1.4 Noun to adverb derivation
In this type of derivation, 4 prefixes are involved and they derive an adverb from a noun stem. The semantics of the prefixes is not predictable, so they are simply marked as prefixes with the tag PFX. Table 6.3 lists those prefixes with an example base stem and the derived word.

Table 6.3: Noun to adverb derivation
Prefix | Base noun stem | Gloss | Derived adverb | Gloss
आ a- | मरण mʌrʌɳ | death | आमरण amʌrʌɳ | till death
स sʌ- | हर्ष ɦʌrʂʌ | happy | सहर्ष sʌɦʌrʂʌ | with happiness
निर् nir- | घात gʰat | stroke | निर्घात nirgʰat | severely
प्रति prʌti- | हप्ता ɦʌpta | week | प्रतिहप्ता prʌtiɦʌpta | per week

The finite-state transducer in Figure 6.3 is common to all the prefixes listed in Table 6.3 and it can analyze and generate them.

Figure 6.3: A finite state transducer for noun to adverb derivation (arcs: PFX+:Prefix, NounType, +NOUN:0, +ADV:0)

6.1.5 Adjective to adjective derivation
In this derivation, 6 prefixes are involved and they derive an adjective from an adjective stem. The semantics of the prefixes is not consistent, so they are simply marked as prefixes with the tag PFX. Table 6.4 lists those prefixes with a base stem and the derived word.

Table 6.4: Adjective to adjective derivation
Prefix | Base adjective stem | Gloss | Derived adjective | Gloss
सम् sʌm- | पूर्ण puːrɳʌ | complete | सम्पूर्ण sʌmpuːrɳʌ | total
वि wi- | शुद्ध ʃuddʰʌ | pure | विशुद्ध wiʃuddʰʌ | pure
दुर् dur- | भेद्य bʰedjʌ | vulnerable | दुर्भेद्य durbʰedjʌ | invulnerable
उन् un- | मुक्त muktʌ | free | उन्मुक्त unmuktʌ | free
सु su- | शिक्षित ʃiktsʰit | educated | सुशिक्षित suʃiktsʰit | educated
परि pʌri- | पूर्ण puːrɳʌ | complete | परिपूर्ण pʌripuːrɳʌ | sufficient

The finite-state transducer in Figure 6.4 is common to all the prefixes listed in Table 6.4 and it is capable of analyzing and generating them.

Figure 6.4: A finite state transducer for adjective to adjective derivation (arcs: PFX+:Prefix, AdjType, +ADJ:0)

6.2 Suffixation
6.2.1 Noun to noun derivation
In this derivation, 2 suffixes are involved and they derive a noun from a noun stem. The semantics of the suffixes is not considered, so they are simply marked as suffixes with the tag SFX. Table 6.5 lists those suffixes with a base stem and the derived word.

Table 6.5: Noun to noun derivation
Base noun stem | Gloss | Suffix | Derived noun | Gloss
सुन sun | gold | आर -ar | सुनार sunar | goldsmith
घाँस gʰãs | grass | ई -iː | घाँसी gʰãsiː | grass cutter

The finite-state transducer in Figure 6.5 is common to all the suffixes listed in Table 6.5 and it is capable of analyzing and generating them.

Figure 6.5: A finite state transducer for noun to noun derivation (arcs: NounType, +SFX:Suffix, +NOUN:0)

The phonological rules involved in the derivation are listed in PR 6.1. They are compiled and composed with the finite state transducer illustrated in Figure 6.5.

Phonological rule PR 6.1
i. Independent vowels आ a and ई i: change to the corresponding dependent vowels ◌ा a and ◌ी i:, respectively, after consonants.
Regular expressions:
आ -> ◌ा || cons __
ई -> ◌ी || cons __
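The effect of a context-restricted replace rule such as PR 6.1 can be imitated outside xfst with an ordinary regular expression. The Python sketch below is only an illustration: it approximates the set `cons` by the Unicode range of Devanagari consonant letters (an assumption, since the definition of `cons` is not shown in this section), and the function name is hypothetical.

```python
import re

# Assumption: `cons` is approximated by the Devanagari consonant letters
# U+0915 (क) to U+0939 (ह); the thesis's own definition is not shown here.
CONS = "[\u0915-\u0939]"

# Independent vowel -> dependent vowel sign, as listed in PR 6.1.
TO_DEPENDENT = {
    "\u0906": "\u093E",  # आ -> ा
    "\u0908": "\u0940",  # ई -> ी
}

# Match an independent vowel only when a consonant stands directly before it.
RULE = re.compile("(?<=" + CONS + ")[" + "".join(TO_DEPENDENT) + "]")


def apply_pr_6_1(form):
    """Rewrite आ/ई to their vowel signs after a consonant; leave the rest."""
    return RULE.sub(lambda match: TO_DEPENDENT[match.group(0)], form)


print(apply_pr_6_1("सुन" + "आर"))   # stem + suffix from Table 6.5
print(apply_pr_6_1("घाँस" + "ई"))
print(apply_pr_6_1("आमरण"))         # word-initial आ has no consonant before it
```

The lookbehind plays the role of the left context `cons __`: a word-initial independent vowel is left untouched because nothing precedes it.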

6.2.2 Noun to adjective derivation
In this derivation, 11 suffixes are involved and they derive an adjective from a noun stem. The semantics of the suffixes is not considered, so they are simply marked as suffixes with the tag SFX. Table 6.6 lists those suffixes with an example base stem and the derived word.

Table 6.6: Noun to adjective derivation
Base noun stem | Gloss | Suffix | Derived adjective | Gloss
दया dʌja | love | अनीय -ʌnijʌ | दयनीय dʌjanijʌ | lovable
लाभ labʰ | profit | अक -ʌkʌ | लाभक labʰʌkʌ | profitable
सेवा sewa | service | इका -ika | सेविका sewika | service girl
मुगल mugʌl | Mugal | आन -an | मुगलान mugʌlan | Indian
लिम्बु limbu | Limbu | वान -wan | लिम्बुवान limbuwan | Limbu area
दान dan | donation | ई -iː | दानी daniː | donor
खर्च kʰʌrtsʌ | expense | आलु -alu | खर्चालु kʰʌrtsalu | expensive
भिर bʰir | cliff | आलो -alo | भिरालो bʰiralo | steep
रिस ris | anger | आहा -aha | रिसाहा risaha | angry
शहर ʃʌɦʌr | town | इया -ija | शहरिया ʃʌɦʌrija | urban
होस ɦos | sense | इयार -ijar | होसियार ɦosijar | careful

The finite-state transducer illustrated in Figure 6.6 is common to all the suffixes listed in Table 6.6.

Figure 6.6: A finite state transducer for noun to adjective derivation (arcs: NounType, +SFX:Suffix, +NOUN:0, +ADJ:0)

The phonological rules involved in this derivation are listed in PR 6.2. They are compiled and composed with the finite state transducer shown in Figure 6.6, which is capable of analyzing and generating the derived words.

Phonological rule PR 6.2
i. Independent vowels आ a, ई i: and इ i change to their corresponding dependent vowels ◌ा a, ◌ी i: and ि◌ i after consonants.
Regular expressions:
आ -> ◌ा || cons __
ई -> ◌ी || cons __
इ -> ि◌ || cons __
ii. The vowel sequence of dependent ◌ा a and अ ʌ changes to अ ʌ.
Regular expression:
◌ा अ -> [ ]
iii. The vowel sequence of dependent ◌ा a and independent इ i changes to ि◌ i.
Regular expression:
◌ा इ -> ि◌ [ ]

6.2.3 Noun to noun/adjective derivation
In this derivation, 3 suffixes are involved and they derive a noun or adjective from a noun stem. The semantics of the suffixes is not considered, so they are simply marked as suffixes with the tag SFX. Table 6.7 lists those suffixes with an example base stem and the derived word.

Table 6.7: Noun to noun/adjective derivation
Base noun stem | Gloss | Suffix | Derived noun/adjective | Gloss
झापा dzʰapa | Jhapa | ली -li | झापाली dzʰapali | of Jhapa
गुल्मी gulmi | Gulmi | एली -eli | गुल्मेली gulmeli | of Gulmi
इलाम ilam | Illam | ए -e | इलामे ilame | of Illam
गाउँ gaũ | village | ले -le | गाउँले gaũle | villager
नेपाल nepal | Nepal | ई -iː | नेपाली nepaliː | of Nepal

The finite-state transducer in Figure 6.7 is common to all the suffixes listed in Table 6.7.

Figure 6.7: A finite state transducer for noun to noun/adjective derivation (arcs: NounType, +SFX:Suffix, +NOUN:0, +ADJ:0)

The phonological rules involved in this derivation are listed in PR 6.3. They are compiled and composed with the finite state transducer illustrated in Figure 6.7, which can analyze and generate both stems and derived words.

Phonological rule PR 6.3
i. Independent vowels ई i: and ए e change to their corresponding dependent vowels ◌ी i: and ◌े e after consonants.
Regular expressions:
ई -> ◌ी || cons __ ;
ए -> ◌े || cons __ ;
ii. The vowel sequence of dependent vowel ◌ी iː and independent vowel ए e changes to ◌े e.
Regular expression:
◌ी ए -> ◌े ;
iii. The vowel sequence of dependent vowel ◌ा a and independent vowel इ i changes to ि◌ i.
Regular expression:
◌ा इ -> ि◌ [ ]

6.2.4 Adjective to noun derivation
In this derivation, 1 suffix is involved and it derives a noun from an adjective stem. The semantics of the suffix is not considered; it is simply marked as a suffix with the tag SFX. Table 6.8 lists the suffix with an example base stem and the derived word.

Table 6.8: Adjective to noun derivation
Base adjective stem | Gloss | Suffix | Derived noun | Gloss
लामो lamo | long | आइ -ai | लमाइ lʌmai | length

The finite-state transducer in Figure 6.8 is common to all the suffixes listed in Table 6.8.

Figure 6.8: A finite state transducer for adjective to noun derivation (arcs: AdjType, +SFX:Suffix, +ADJ:0, +NOUN:0)

The phonological rules involved in this derivation are listed in PR 6.4. They are compiled and composed with the finite state transducer, which can analyze and generate both underived and derived words.

Phonological rule PR 6.4
i. Dependent vowel ◌ा a between consonants changes to ʌ, i.e. the vowel sign is deleted and the inherent vowel remains.
Regular expression:
◌ा -> [ ] || cons __ cons;
ii. The vowel sequence of dependent vowel ◌ो o and independent vowel आ a changes to dependent vowel ◌ा a.
Regular expression:
◌ो आ -> ◌ा;
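How the two rules of PR 6.4 interact can be traced on the example लामो + आइ from Table 6.8. The following Python sketch is an illustration under stated assumptions, not the composed xfst network: `cons` is approximated by the Devanagari consonant range, the rules are applied in the order ii then i (the source does not state an order, and for this example either order gives the same result), and the function names are hypothetical.

```python
import re

CONS = "[\u0915-\u0939]"   # assumed stand-in for the thesis's `cons`
AA_SIGN = "\u093E"         # dependent vowel sign ा
O_SIGN = "\u094B"          # dependent vowel sign ो
AA = "\u0906"              # independent vowel आ


def rule_ii(form):
    """ो आ -> ा : stem-final o plus suffix-initial a collapse to the a sign."""
    return form.replace(O_SIGN + AA, AA_SIGN)


def rule_i(form):
    """ा -> [ ] || cons __ cons : drop the a sign between two consonants."""
    return re.sub("(?<=" + CONS + ")" + AA_SIGN + "(?=" + CONS + ")", "", form)


def derive(stem, suffix):
    """Concatenate stem and suffix, then apply rule ii followed by rule i."""
    return rule_i(rule_ii(stem + suffix))


print(derive("लामो", "आइ"))
```

In the derived form the second ा survives because it is followed by the independent vowel इ rather than a consonant. Applied to a whole lexicon, rule i as written here would also shorten unrelated words; this sketch leaves out any mechanism for restricting the rule to the derived forms.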

6.2.5 Adjective/noun to noun derivation
In this derivation, 1 suffix is involved and it derives a noun from a noun/adjective stem. The semantics of the suffix is not considered, so it is simply marked as a suffix with the tag SFX. Table 6.9 lists the suffix with a base stem and the derived word.

Table 6.9: Adjective/noun to noun derivation
Base noun/adjective stem | Gloss | Suffix | Derived noun | Gloss
गरिब gʌrib | poor | ई -iː | गरिबी gʌribi | poverty

The finite-state transducer in Figure 6.9 is common to all the suffixes listed in Table 6.9. It is capable of analyzing and generating the derived words.

Figure 6.9: A finite state transducer for noun/adjective to noun derivation (arcs: NAType, SFX:Suffix, +ADJ:0, +NOUN:0)

The phonological rules involved in this process are listed in PR 6.5. They are compiled into a network and composed with the network illustrated in Figure 6.9.

Phonological rule PR 6.5
i. Independent vowel ई i: changes to its corresponding dependent vowel ◌ी i: after a consonant.
Regular expression:
ई -> ◌ी || cons __;

6.2.6 Verb to noun derivation
In this type of derivation, 32 suffixes are involved and they derive a noun from a verb stem. The semantics of the suffixes is not considered; they are simply marked as suffixes with the tag SFX. Table 6.10 lists those suffixes with an example base stem and the derived word for each.

Table 6.10: Verb to noun derivation
Base verb stem | Gloss | Suffix | Derived noun | Gloss
चुन् tsun | elect | आउ -au | चुनाउ tsunau | election
चुन् tsun | elect | आब -ab | चुनाब tsunab | election
किट् kit̺ | decide | आन -an | किटान kit̺an | decision
किट् kit̺ | decide | आनी -ani | किटानी kit̺ani | decision
ढाक् d̺ʰak | cover | अनी -ʌni | ढकनी d̺ʰʌkʌni | lid
जल् dzʌl | burn | अन -ʌn | जलन dzʌlʌn | burning
चोर् tsor | steal | ई -iː | चोरी tsoriː | theft
हाँस् ɦãs | laugh | ओ -o | हाँसो ɦãso | laughter
पढ् pʌd̺ʰ | read | आइ -ai | पढाइ pʌd̺ʰai | reading
थाक् tʰak | tire | आवट -awʌt̺ | थकावट tʰakawʌt̺ | tiredness
छाप् tsʰap | print | आ -a | छापा tsʰapa | printing
छान् tsʰan | choose | ओट -ot̺ | छनोट tsʰanot̺ | selection
चिच्या tsitsja | shout | हट -ɦʌt̺ | चिच्याहट tsitsjaɦʌt̺ | shouting
झर् dzʌr | drop | अना -ʌna | झरना dzʌrʌna | waterfall
ढोग् dʰog | bow | आउनी -auni | ढोगाउनी dʰogauni | bowing
राख् rakʰ | keep | आलो -alo | रखालो rakʰalo | servant
दाब् dab | press | आब -ab | दबाब dabab | pressure
बच् bʌts | save | अत -ʌt | बचत bʌtsʌt | saving
सड् sʌd̺ | decay | अल -ʌl | सडल sʌd̺ʌl | decay
रोप् rop | plant | आइँ -aĩ | रोपाइँ ropaĩ | plantation
छेक् tsʰek | block | आरो -aro | छेकारो tsʰekaro | blockade
चिर् tsir | split | औटो -ʌuto | चिरौटो tsirʌuto | split
बढ् bʌd̺ʰ | grow | औती -ʌuti | बढौती bʌd̺ʰʌuti | growth
सर् sʌr | move | उवा -uwa | सरुवा sʌruwa | shift
उठ् utʰ | rise | ती -ti | उठ्ती utʰti | credit
चाल् tsal | sieve | नी -ni | चाल्नी tsalni | sieve
बेर् ber | roll | नो -no | बेर्नो berno | piles
गा ga | sing | ना -na | गाना gana | song
भिड् bid̺ | fight | अन्त -ʌnt | भिडन्त bid̺ʌnt | fighting
जित् dzit | win | औरी -ʌuri | जितौरी dzitʌuri | winning
कोर् kor | scratch | एसो -eso | कोरेसो koreso | scratcher
खुल् kʰul | open | अस्त -ʌstʌ | खुलस्त kʰulʌstʌ | open

The finite-state transducer in Figure 6.10 is common to all the suffixes listed in Table 6.10 and it can analyze and generate the derived words.

Figure 6.10: A finite state transducer for verb to noun derivation (arcs: Vstem, +SFX:Suffix, +NOUN:0)

The phonological rules involved in this derivation process are listed in PR 6.6. They are compiled and composed with the finite state transducer illustrated in Figure 6.10.

Phonological rule PR 6.6
i. Independent vowels आ a, ई i:, ओ o, इ i, उ u, ए e, ऐ ʌi, औ ʌu and अ ʌ change to their corresponding dependent vowels ◌ा a, ◌ी i:, ◌ो o, ि◌ i, ◌ु u, ◌े e, ◌ै ʌi, ◌ौ ʌu and [ ], respectively, after consonants.
Regular expressions:
आ -> ◌ा || cons __ ;
ई -> ◌ी || cons __ ;
ओ -> ◌ो || cons __ ;
इ -> ि◌ || cons __ ;
उ -> ◌ु || cons __ ;
ए -> ◌े || cons __ ;
ऐ -> ◌ै || cons __ ;
औ -> ◌ौ || cons __ ;
अ -> [ ] || cons __ ;

6.2.7 Verb to adjective derivation
In this derivation, 14 suffixes are involved and they derive an adjective from a verb stem. The semantics of the suffixes is not considered; they are simply marked as suffixes with the tag SFX. Table 6.11 lists those suffixes with an example base stem and the derived word.

Table 6.11: Verb to adjective derivation
Base verb stem | Gloss | Suffix | Derived adjective | Gloss
मिच् mits | squeeze | आहा -aha | मिचाहा mitsaha | suppressor
भुल् bʰul | forget | अक्कड -ʌkkʌd̺ | भुलक्कड bʰulʌkkʌd̺ | forgetful
पोस् pos | feed | इलो -ilo | पोसिलो posilo | nutritious
घुम् gʰum | roam | अन्ते -ʌnte | घुमन्ते gʰumʌnte | vagabond
घुम् gʰum | roam | अन्ता -ʌnta | घुमन्ता gʰumʌnta | vagabond
खप् kʰʌp | last | आलु -alu | खपालु kʰʌpalu | long lasting
पढ् pʌd̺ʰ | read | ऐया -ʌija | पढैया pʌd̺ʰʌija | studious
छाड् tsʌd̺ | leave | आ -a | छाडा tsʌd̺a | wanton
रोप् rop | plant | आर -ar | रोपार ropar | planter
सिक् sik | learn | आरु -aru | सिकारु sikaru | learner
बिक् bik | sell | आउ -au | बिकाउ bikau | salable
भाग् bʰag | flee | औटो -ʌut̺o | भगौटो bʰagʌut̺o | runner
छेर् tsʰer | pass stool | औटी -ʌut̺i | छेरौटी tsʰerʌut̺i | sufferer
लाग् lag | attach | उ -u | लागु lagu | addicted

The finite-state transducer in Figure 6.11 is common to all the suffixes listed in Table 6.11. It can analyze and generate the derived words of this type.

Figure 6.11: A finite state transducer for verb to adjective derivation (arcs: Vstem, +SFX:Suffix, +ADJ:0)

The phonological rules involved in this type of derivation are listed in PR 6.7. They are compiled and composed with the finite state transducer illustrated in Figure 6.11.

Phonological rule PR 6.7
i. Independent vowels आ a, ई i:, ओ o, इ i, उ u, ए e, ऐ ʌi, औ ʌu and अ ʌ change to their corresponding dependent vowels ◌ा a, ◌ी i:, ◌ो o, ि◌ i, ◌ु u, ◌े e, ◌ै ʌi, ◌ौ ʌu and [ ], respectively, after consonants.
Regular expressions:
आ -> ◌ा || cons __ ;
ई -> ◌ी || cons __ ;
ओ -> ◌ो || cons __ ;
इ -> ि◌ || cons __ ;
उ -> ◌ु || cons __ ;
ए -> ◌े || cons __ ;
ऐ -> ◌ै || cons __ ;
औ -> ◌ौ || cons __ ;
अ -> [ ] || cons __ ;

6.2.8 Verb to adverb derivation
In this derivation, 2 suffixes are involved and they derive an adverb from a verb stem. The semantics of the suffixes is not considered, so they are simply marked as suffixes with the tag SFX. Table 6.12 lists those suffixes with an example base stem and the derived word.

Table 6.12: Verb to adverb derivation
Base verb stem | Gloss | Suffix | Derived adverb | Gloss
गर् gʌr | do | उन्जेल -undzel | गरुन्जेल gʌrundzel | till doing
गर् gʌr | do | इन्जेल -indzel | गरिन्जेल gʌrindzel | till doing

The finite-state transducer in Figure 6.12 is common to all the suffixes listed in Table 6.12. It can analyze and generate the derived words.

Figure 6.12: A finite state transducer for verb to adverb derivation (arcs: Vstem, +SFX:Suffix, +ADV:0)

The phonological rules involved in this derivation are listed in PR 6.8. They are compiled and composed with the finite state transducer illustrated in Figure 6.12.

Phonological rule PR 6.8
i. Independent vowels उ u and इ i change to their corresponding dependent vowels ◌ु u and ि◌ i after consonants.
Regular expressions:
उ -> ◌ु || cons __ ;
इ -> ि◌ || cons __ ;

6.2.9 Adverb to adjective derivation
In this derivation, 1 suffix is involved and it derives an adjective from an adverb stem. The semantics of the suffix is not considered; it is simply marked as a suffix with the tag SFX. Table 6.13 lists the suffix with example base stems and the derived words.

Table 6.13: Adverb to adjective derivation
Base adverb stem | Gloss | Suffix | Derived adjective | Gloss
भित्र bʰitrʌ | inside | ई -iː | भित्री bʰitriː | inner
बाहिर bahirʌ | outside | ई -iː | बाहिरी bahiri | outer

The finite-state transducer in Figure 6.13 is common to all the suffixes listed in Table 6.13. It is capable of analyzing and generating the base stems and derived words.

Figure 6.13: A finite state transducer for adverb to adjective derivation (arcs: AdvType, +SFX:Suffix, +ADV:0, +ADJ:0)

The phonological rule involved in this process is listed in PR 6.9. It is compiled and composed with the finite state transducer illustrated in Figure 6.13.

Phonological rule PR 6.9
i. Independent vowel ई i: changes to the corresponding dependent vowel ◌ी i: after consonants.
Regular expression:
ई -> ◌ी || cons __;

6.2.10 Verb to noun conversion
Some verb stems alternate between verb and noun. The two forms are phonologically identical but differ in writing: in the noun form, the halanta diacritic is dropped. Some examples of such stems are listed in Table 6.14.

Table 6.14: Verb to noun conversion
Base verb stem | Gloss | Derived noun | Gloss
खेल् kʰel | 'play' | खेल kʰel | 'game'
खोज् kʰodz | 'search' | खोज kʰodz | 'research'

The finite-state transducer in Figure 6.14 encodes stems listed in Table 6.14 and it is capable of analyzing and generating the base stem and derived words.

Figure 6.14: A finite state transducer for verb to noun conversion (arcs: Vstem, +NOUN:0)

The phonological rule involved in this conversion is listed in PR 6.10. It is compiled and composed with the finite state transducer illustrated in Figure 6.14.

Phonological rule PR 6.10
i. Halanta ◌् at the end of the verb stem is removed.
Regular expression:
◌् -> [ ] || __ .#.

6.2.11 Verb to adjective/noun conversion
Some verb stems alternate between verb and noun or adjective forms. Phonologically they are the same, but orthographically they differ by the halanta. Some examples of such stems are listed in Table 6.15.

Table 6.15: Verb to adjective/noun conversion
Base verb stem | Gloss | Derived adjective/noun | Gloss
ठग् t̺ʰʌg | cheat | ठग t̺ʰʌg | cheat
चोर् tsor | steal | चोर tsor | thief
थप् tʰʌp | add | थप tʰʌp | additional

The finite-state transducer illustrated in Figure 6.15 encodes the stems listed in Table 6.15 and it is capable of analyzing and generating the base stems and derived words.

Figure 6.15: A finite state transducer for verb to adjective/noun conversion (arcs: Vstem, +NOUN:0, +ADJ:0)

The phonological rule involved in this conversion is listed in PR 6.11. It is compiled and composed with the finite state transducer illustrated in Figure 6.15.

Phonological rule PR 6.11
i. Halanta ◌् at the end of the verb stem is removed.
Regular expression:
◌् -> [ ] || __ .#.

6.2.12 Verb to noun derivation (vowel insertion)
Some verb stems change from verb to noun by inserting the vowel अ ʌ between consonants in the stem. Some examples of such stems are listed in Table 6.16.

Table 6.16: Verb to noun (vowel insertion)
Base verb stem | Gloss | Derived noun | Gloss
चम्क tsʌmkʌ | shine | चमक tsʌmʌk | shining
सम्झ sʌmdzʌ | remember | समझ sʌmʌdz | understanding
टल्क t̺ʌlkʌ | shine | टलक t̺ʌlʌk | shining

The finite-state transducer illustrated in Figure 6.16 encodes stems listed in Table 6.16 and it is capable of analyzing and generating the derived words.

Figure 6.16: A finite state transducer for verb to noun derivation by vowel insertion (arcs: Vstem, +NOUN:0)

The phonological rule involved in this derivation is listed in PR 6.12. It is compiled and composed with the finite state transducer illustrated in Figure 6.16.

Phonological rule PR 6.12
i. Halanta ◌् between the consonants of the verb stem is removed.
Regular expression:
◌् -> [ ] || cons __ cons;
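The halanta rules differ only in their context: PR 6.10 and PR 6.11 delete the halanta at the word boundary (`.#.`), whereas PR 6.12 deletes it between two consonants. A minimal Python sketch of the two contexts follows. It assumes, as an approximation, that `cons` is the Devanagari consonant range; the function names are hypothetical and the code is not the compiled xfst rule.

```python
import re

HALANTA = "\u094D"        # the halanta (virama) sign
CONS = "[\u0915-\u0939]"  # assumed stand-in for the thesis's `cons`


def drop_final_halanta(stem):
    """PR 6.10 / PR 6.11: delete a halanta only at the very end of the stem."""
    return re.sub(HALANTA + r"\Z", "", stem)


def drop_medial_halanta(stem):
    """PR 6.12: delete a halanta only between two consonants."""
    pattern = "(?<=" + CONS + ")" + HALANTA + "(?=" + CONS + ")"
    return re.sub(pattern, "", stem)


print(drop_final_halanta("खेल" + HALANTA))       # verb stem from Table 6.14
print(drop_medial_halanta("चम" + HALANTA + "क"))  # verb stem from Table 6.16
```

Removing a medial halanta dissolves the conjunct, so the inherent vowel ʌ of the first consonant is pronounced; this is the orthographic counterpart of the vowel insertion described in 6.2.12. Applied without restriction, the medial rule would dissolve every conjunct in the lexicon, so a real implementation has to limit it to the stems concerned.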

6.3 Summary
In this chapter, we presented the derivational processes in Nepali. Two major types of derivation, prefixation and suffixation, are discussed and implemented. Prefixation covers noun to noun, noun to adjective, noun to adverb and adjective to adjective derivation. Suffixation covers noun to noun, noun to adjective, noun to noun/adjective, adjective to noun, adjective/noun to noun, verb to noun, verb to adjective, verb to adverb and adverb to adjective derivation. In addition, there are two kinds of conversion: verb to noun and verb to adjective/noun. Verb to noun derivation by vowel insertion is also included. Each prefix and suffix has its own set of words from which derivation takes place. Derivation in Nepali is not as productive and regular as inflection; however, a fairly large number of derived words exist.

CHAPTER 7 IMPLEMENTATION

7.0 Outline This chapter presents the implementation of morphological categories and phonological rules analyzed in the earlier chapters to design a computational model using the Xerox finite state toolkit. It consists of four sections. Section 7.1 presents the morphotactics, i.e. syntax of morphemes. The morphological categories and grammatical categories have been separated based on the earlier analysis. Section 7.2 presents the lexc grammar for nouns, pronouns, adjectives, verbs, numerals, classifiers, adverbs, postpositions, conjunctions, particles, interjections and derivation. Section 7.3 deals with the realization, i.e. rules for alternation using xfst interface for each category. Finally, section 7.4 summarizes the chapter.

7.1 Morphotactics: syntax of morphemes 7.1.1 Morphological categories As discussed and analyzed in chapters (3-6), two major categories are identified, open word class and closed word class. Table 7.1 shows four open word classes and their corresponding morphological tags used in the morphological analyzer. Table 7.2 shows seven closed word classes with their corresponding morphological tags used in the morphological analyzer.

Table 7.1: The open word classes
S.N. | Morphological categories | Tags
1. | Nouns | +NOUN
2. | Adjectives | +ADJ
3. | Verbs | +VERB
4. | Adverbs | +ADV

Table 7.2: The closed word classes
S.N. | Morphological categories | Tags
1. | Pronouns | +PRON
2. | Numeral | +NUM
3. | Classifier | +CLF
4. | Postpositions | +POSTP
5. | Conjunctions | +CCONJ, +SCONJ
6. | Particles | +PARTICLE
7. | Interjections | +INTERJ

7.1.2 Grammatical categories Altogether 45 grammatical categories are identified in all open and closed word classes. Features in adverbs from 37 to 42 in Table 7.3 and two features in nouns in 44 and 45 of Table 7.3 are semantic whereas the rest of the features in other categories are formal. Table 7.3 lists all the grammatical categories and their corresponding morphological tags used in the morphological analyzer. The redundant features such as augmentative, non-causative, active, direct form have not been incorporated into the analyzer.

Table 7.3: The grammatical categories and features
S.N. | Grammatical categories | Tags
1. | Number | +SG, +PL
2. | Gender | +MASC, +FEM
3. | Form | +DIRT, +OBL
4. | Honorificity | +NHON, +HON, +HHON, +RHON
5. | Evaluation | +AUG, +DIM
6. | Persons | 1, 2, 3
7. | Cases | +ERG, +INST, +DAT, +ABL, +LOC, +COM, +GEN, +VOC, +ALL
8. | Distal | +DIST
9. | Proximate | +PROX
10. | Reflexive | +REFL
11. | Demonstrative | +DEM
12. | Relative | +REL
13. | Interrogative | +INTERRO
14. | Indefinite | +INDEF
15. | Definite | +DEF
16. | Reciprocal | +RECIP
17. | Degree | +POSIT, +COMP, +SUPER
18. | Cardinal | +CARD
19. | Ordinal | +ORD
20. | Frequency | +FREQ
21. | Portion | +PORT
22. | Voice | +PASS
23. | Causative | +CAUSE
24. | Existential | +EXIST
25. | Identificational | +ID
26. | Tenses | +NPST, +PST
27. | Aspects | +PERF, +IMPERF, +INFER, +HAB
28. | Moods | +IMP, +OPT, +POT
29. | Absolutive | +ABS
30. | Infinitive | +INF
31. | Purposive | +PURP
32. | Prospective | +PROS
33. | Durative | +DUR
34. | Conjunctive | +CONJUCT
35. | Conditional | +COND
36. | Perfective | +PERFT
37. | Temporal | +TEMP
38. | Spatial | +SPAC
39. | Amount | +AMOUNT
40. | Manner | +MANNER
41. | Reason | +REASON
42. | Sentential | +SENT
43. | Emphasis | +EMPH
44. | Proper Name | +PROPER
45. | Place Name | +PLACE

A number of arbitrary tags are used in the lexc file to restrict the scope of the replace rules. These tags are removed from the transducer after the replace rules have been applied. Table 7.3 lists a sample of such tags.

Table 7.3: The arbitrary tags
S.N. | Purpose of arbitrary tag | Tags
1. | O-ending nouns for plural, honorific, oblique and vocative | ^MP
2. | O-ending nouns for feminine and diminutive | ^FE
3. | Non-honorific imperative | ^IMPsg
4. | Honorific imperative | ^IMPhon
5. | Plural imperative | ^IMPpl
6. | Noun to adjective derivation | ^NA
7. | Noun to adverb derivation | ^NADV
8. | Adjective to verb derivation | ^ADJV
9. | Verb to noun conversion | ^R
10. | Insertion in verb to noun derivation | ^a

7.2 Lexc grammar
Nouns, pronouns, adjectives, numerals, classifiers, verbs, adverbs, postpositions, particles and interjections are encoded in lexc files. The main morphological forms are in Devanagari script and the morphological tags are in Roman script, using UTF-8 character encoding. Each lexc file begins with Multichar_Symbols, followed by the lexicons.

7.2.1 Nouns
The nouns discussed and analyzed in (3.1) are implemented in a lexc file named nouns.txt, which includes 14 classes of nouns. The upper language contains stems and sequences of morphological tags, and the lower language contains surface forms. Besides the regular morphological tags, some arbitrary tags are used to restrict the application of replace rules. The encoding of the nouns with their morphological tags is as follows. This lexc file accumulates the transducers from Figure 3.1 to Figure 3.13.

Multichar_Symbols +NOUN +MASC +FEM +OBL +PL +SG +DIM +VOC +HON ^MP ^FE +PLACE +PROPER

LEXICON ROOT
Nouns;

LEXICON Nouns
!! Type 1a Nouns:
केटो inflection_1a;
!! Type 1b Nouns:
मुसो inflection_1b;
!! Type 1c Nouns:
डालो inflection_1c;
!! Type 1d Nouns:
फोटो inflection_1d;
!! Type 21a Nouns:
काका inflection_21a;
!! Type 21b Nouns:
नाति inflection_21b;
!! Type 21c Nouns:
बाघ inflection_21c;
!! Type 21d Nouns:
बिस्ट inflection_21d;
!! Type 22a Nouns:
दाइ inflection_22a;
!! Type 22b Nouns:
दिदी inflection_22b;
!! Type 22c Nouns:
राम inflection_22c;
!! Type 22d Nouns:
सीता inflection_22d;
!! Type 22e Nouns:
खेत inflection_22e;
!! Type 22f Nouns:
पोखरा inflection_22f;

LEXICON inflection_1a
+NOUN+MASC+SG:0 #;
+NOUN+MASC+PL:^MP #;
+NOUN+MASC+OBL:^MP #;
+NOUN+MASC+HON:^MP #;
+NOUN+MASC+VOC:^MP #;
+NOUN+FEM:^FE #;

LEXICON inflection_1b
+NOUN+MASC+SG:0 #;
+NOUN+MASC+PL:^MP #;
+NOUN+MASC+OBL:^MP #;
+NOUN+FEM:^FE #;

LEXICON inflection_1c
+NOUN+SG:0 #;
+NOUN+PL:^MP #;
+NOUN+OBL:^MP #;
+NOUN+DIM:^FE #;

LEXICON inflection_1d
+NOUN+SG:0 #;
+NOUN+PL:^MP #;
+NOUN+OBL:^MP #;

LEXICON inflection_21a
+NOUN+MASC:0 #;
+NOUN+FEM:◌ी #;

LEXICON inflection_21b
+NOUN+MASC:0 #;
+NOUN+FEM:नी #;

LEXICON inflection_21c
+NOUN+MASC:0 #;
+NOUN+FEM:ि◌नी #;

LEXICON inflection_21d
+NOUN+MASC:0 #;
+NOUN+FEM:◌ेनी #;
+NOUN+FEM:ि◌नी #;

LEXICON inflection_22a
+NOUN+MASC:0 #;

LEXICON inflection_22b
+NOUN+FEM:0 #;

LEXICON inflection_22c
+NOUN+PROPER+MASC:0 #;

LEXICON inflection_22d
+NOUN+PROPER+FEM:0 #;

LEXICON inflection_22e
+NOUN:0 #;

LEXICON inflection_22f
+NOUN+PLACE:0 #;

END
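How continuation lexicons turn such a listing into pairs of upper and lower strings can be sketched in Python. This is only an illustration of the lexc mechanism, not the compiled network: the dictionary holds a hypothetical subset of the noun lexicon (two stems and a few entries of inflection_1a and inflection_1d), each entry is written as (upper, lower, continuation), the lexc symbol 0 is represented by the empty string, and `#` marks the end of a word.

```python
KETO = "केटो"    # a Type 1a stem
PHOTO = "फोटो"   # a Type 1d stem

# Hypothetical subset of the noun lexicon: (upper, lower, continuation class).
LEXICON = {
    "Nouns": [
        (KETO, KETO, "inflection_1a"),
        (PHOTO, PHOTO, "inflection_1d"),
    ],
    "inflection_1a": [
        ("+NOUN+MASC+SG", "", "#"),
        ("+NOUN+MASC+PL", "^MP", "#"),
    ],
    "inflection_1d": [
        ("+NOUN+SG", "", "#"),
        ("+NOUN+PL", "^MP", "#"),
        ("+NOUN+OBL", "^MP", "#"),
    ],
}


def expand(lexicon, name="Nouns"):
    """Follow continuation classes and collect every (upper, lower) pair."""
    if name == "#":
        return [("", "")]
    pairs = []
    for upper, lower, continuation in lexicon[name]:
        for upper_rest, lower_rest in expand(lexicon, continuation):
            pairs.append((upper + upper_rest, lower + lower_rest))
    return pairs


for upper, lower in expand(LEXICON):
    print(upper, "->", lower)
```

The lower strings still carry the arbitrary tag ^MP at this stage. In the analyzer, the replace rules rewrite the stem in the presence of such a tag and then remove it, a step this sketch does not cover.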

7.2.2 Pronouns Nepali pronouns are limited in number and more or less idiosyncratic in their forms and functions. Hence, instead of organizing them where rules can be applied to get their surface forms, they are directly encoded uniting all the finite state transducers from Figure 3.14 to Figure 3.33 along with their morphological tags. Therefore, this section does not contain any replace rules. Multichar_Symbols +PRON +1SG +OBL +EMPH +GEN +MASC +FEM +HON +1PL +2SG +HHON +RHON +3SG +PROX +DIST +REFL +DEM +HUM +NHUM +DEF +INTERRO +RECIP +PL +SG +REL +INDEF +3PL LEXICON ROOT pronouns; LEXICON pronouns !!First person singular pronoun म+PRON+1SG:म

#;

म+PRON+1SG+OBL:मै

#;

म+PRON+1SG+EMPH:मै

#;

म+PRON+1SG+OBL+GEN+MASC:मेरो #; म+PRON+1SG+OBL+GEN+FEM:मेर #;

म+PRON+1SG+OBL+GEN+PL:मेरा #; म+PRON+1SG+OBL+GEN+HON:मेरा #;

म+PRON+1SG+OBL+GEN+OBL:मेरा #;

230

म+PRON+1SG+OBL+GEN+EMPH:मेरै

#;

!!First Person Plural Prouns हामी+PRON+1PL:हामी

#;

हामी+PRON+1PL+OBL+GEN+MASC:हाॆो

#;

हामी+PRON+1PL+OBL+GEN+FEM:हाॆी #; हामी+PRON+1PL+OBL+GEN+PL:हाॆा #;

हामी+PRON+1PL+OBL+GEN+HON:हाॆा

#;

हामी+PRON+1PL+OBL+GEN+OBL:हाॆा #; हामी+PRON+1PL+OBL+GEN+EMPH:हाॆै

!! Second Person Singular तँ+PRON+2SG:तँ तँ+PRON+2SG+OBL:त

तँ+PRON+2SG+EMPH:त

#;

#; #; #;

तँ+PRON+2SG+OBL+GEN+MASC:तेरो #;

तँ+PRON+2SG+OBL+GEN+FEM:तेर #;

तँ+PRON+2SG+OBL+GEN+PL:तेरा

#;

तँ+PRON+2SG+OBL+GEN+HON:तेरा #;

तँ+PRON+2SG+OBL+GEN+OBL:तेरा #; तँ+PRON+2SG+OBL+GEN+EMPH:तेरै #;

!!Second Person honorific Pronoun तमी+PRON+2SG+HON: तमी

#;

तमी+PRON+2SG+OBL+HON+GEN+MASC: तॆो #;

तमी+PRON+2SG+OBL+HON+GEN+FEM: तॆी #; तमी+PRON+2SG+OBL+HON+GEN+PL: तॆा

#;

तमी+PRON+2SG+OBL+HON+GEN+HON: तॆा #;

तमी+PRON+2SG+OBL+HON+GEN+OBL: तॆा #;

तमी+PRON+2SG+OBL+HON+GEN+EMPH: तॆै #; !! Second person high honorific pronouns #; तपा + PRON+2SG+HHON:तपा यहाँ+PRON+2SG+HHON:यहाँ #; उहाँ+PRON+2SG+HHON:उहाँ #;

231

वहाँ+PRON+2SG+HHON:वहाँ #;

हजुर+PRON+2SG+HHON:हजुर #; !! Second person royal honorific pronoun मौसुफ+PRON+2SG+RHON:मौसुफ #; !!Third Person Singular Pronoun ऊ ऊ+PRON+3SG:ऊ

#;

ऊ+PRON+3SG+EMPH:उह

#;

ऊ+PRON+3SG+OBL:उस

#;

ऊ+PRON+3SG+OBL+EMPH:उसै

#;

ऊ+PRON+3SG+HON:उनी

#;

ऊ+PRON+3SG+HON+OBL:उन

#;

ऊ+PRON+3SG+HON+OBL+EMPH:उनै #; ऊ+PRON+3SG+HON:उहाँ ऊ+PRON+3SG+HON:वहाँ

#; #;

!! Third person singular pronoun यो यो+PRON+3SG+PROX:यो

#;

यो+PRON+3SG+PROX+EMPH:यह

#;

यो+PRON+3SG+OBL+PROX:यस

#;

यो+PRON+3SG+OBL+PROX+EMPH:यसै

#;

यो+PRON+3SG+PROX+HON:यी

#;

यो+PRON+3PL+PROX:यी

#;

यो+PRON+3SG+PROX+HON: यनी

#;

यो+PRON+3SG+PROX+OBL+HON: यन

#;

यो+PRON+3SG+PROX+OBL+HON+EMPH: यनै #; !!!Third person singular pronoun यो and ती यो+PRON+3SG+DIST: यो

#;

यो+PRON+3SG+DIST+EMPH: यह

#;

यो+PRON+3SG+OBL: यस

#;

यो+PRON+3SG+OBL+EMPH: यसै

#;

ती+PRON+3SG+HON+DIST:ती

ती+PRON+3PL+DIST:ती

#; #;

232

ती+PRON+3SG+HON+DIST: तनी

#;

ती+PRON+3SG+OBL+HON+DIST: तन

#;

ती+PRON+3SG+OBL+HON+DIST+EMPH: तनै !!Reflexive pronoun आफू+PRON+REFL:आफू

#;

#;

आफू+PRON+REFL+OBL+EMPH:आफै

#;

आफू+PRON+REFL+OBL+EMPH:आफ

#;

आफू+PRON+REFL+OBL+GEN+SG:आ नो

#;

आफू+PRON+REFL+OBL+GEN+HON:आ ना

#;

आफू+PRON+REFL+OBL+GEN+PL:आ ना #;

आफू+PRON+REFL+OBL+GEN+OBL:आ ना

आफू+PRON+REFL+OBL+GEN+FEM:आ नी

#; #;

आफू+PRON+REFL+OBL+GEN+EMPH:आ नै

#;

!! Demonstrative pronouns यो यो+PRON+DEM+PROX:यो

#;

यो+PRON+DEM+PROX+EMPH:यह

#;

यो+PRON+DEM+PROX:यी

#;

यो+PRON+DEM+PROX+HON: यनी

#;

यो+PRON+DEM+PROX+OBL: यन

#;

यो+PRON+DEM+PROX+OBL+EMPH: यनै यो+PRON+DEM+PROX:यहाँ

#;

#;

!!Demonstrative pronoun यो and ती यो+PRON+DEM+DIST: यो

#;

यो+PRON+DEM+DIST+EMPH: यह #;

ती+PRON+DEM+DIST:ती

#;

ती+PRON+DEM+DIST+OBL: तन

#;

ती+PRON+DEM+DIST+OBL+HON: तनी #; ती+PRON+DEM+DIST+OBL+EMPH: तनै #; !!Demonstrative pronoun ऊ ऊ+PRON+DEM+DIST:ऊ

#;

233

ऊ+PRON+DEM+DIST+EMPH:उह ऊ+PRON+DEM+DIST+HON:उनी

#; #;

ऊ+PRON+DEM+DIST+OBL:उन #; ऊ+PRON+DEM+DIST+OBL+EMPH:उनै #; ऊ+PRON+DEM+DIST+HON:उहाँ

#;

ऊ+PRON+DEM+DIST+HON:वहाँ

#;

!!Other Demonstrative pronouns #; सो+PRON+DEM+DIST:सो

सो+PRON+DEM+DIST+EMPH:सोह नज+PRON+DEM+PROX: नज

#;

नज+PRON+DEM+PROX+EMPH: नजै उ

#;

+PRON+DEM+PROX:उ

#;

#;

!! Relative pronouns जो+PRON+REL+HUM:जो

#;

जो+PRON+REL+OBL+HUM:जस #; जो+PRON+REL+OBL+HUM+EMPH:जसै #; जे+PRON+REL+NHUM:जे

#;

जुन+PRON+REL:जुन

#;

जुन+PRON+REL+EMPH:जुनै

#;

!! Interrogative pronouns को+PRON+INTERRO+HUM:को#; को+PRON+INTERRO+HUM+OBL:कस #; को+PRON+INTERRO+HUM+OBL+EMPH:कसै

#;

के+PRON+INTERRO+NHUM:के #;

कुन+PRON+INTERRO:कुन

#;

कन+PRON+INTERRO: कन

#;

कसर +PRON+INTERRO:कसर #; !! Indefinite pronouns को+PRON+INDEF+HUM:कोह #;

के+PRON+INDEF+NHUM:केह #;

कुन+PRON+INDEF:

कुनै

#;

जो+PRON+INDEF+HUM:जोसुकै #;
जे+PRON+INDEF+NHUM:जेसुकै #;
जुन+PRON+INDEF:जुनसुकै #;

!! Definite pronouns ू येक+PRON+DEF:ू येक #; हरे क+PRON+DEF:हरे क

#;

सबै+PRON+DEF:सबै

#;

अक +PRON+DEF+SG:अक #; अक +PRON+DEF+PL:अका #; अक +PRON+DEF+HON:अका

#;

अक +PRON+DEF+OBL:अका

#;

अक +PRON+DEF+FEM:अक

#;

अक +PRON+DEF+EMPH:अक #; अ +PRON+DEF:अ

#;

!! Reciprocal pronouns एकअक +PRON+RECIP:एकअक

#;

एकअक +PRON+RECIP+OBL:एकअका #; एकअक +PRON+RECIP+HON:एकअका #;

एकअक +PRON+RECIP+PL:एकअका

#;

एकअक +PRON+RECIP+FEM:एकअक #;

एकआपस+PRON+RECIP:एकआपस आपस+PRON+RECIP:आपस

आफू+PRON+RECIP:आआफू

#;

#; #;

END

7.2.3 Verbs

The verb stems analyzed and classified in (4.4) and the auxiliary verbs and inflections analyzed in (4.5) are implemented in a single lexc file. Stems and inflections are concatenated within this file by means of continuation lexicons. Two flag diacritics, @U.NEG.PRESENT@ and @U.NEG.ABSENT@, are defined to restrict the distribution of the negative prefix. This lexc file includes the transducers from Figure 4.1 to Figure 4.36.

MULTICHAR_SYMBOLS ^IMP +1PL +1SG +2PL +2SG +3PL +3SG +ABS +COND +CONJ +DUR +EMPH +FEM +HAB +HON +IMP +IMPERF +INF +MASC +NEG +NPST +OBL +OPT +PST +PERF +PERFT +PL +POT +PROSP +PURP +SG +UNA +EXIST +IDEN @U.NEG.ABSENT@ +VERB +PASS +CAUSE NEG+ @U.NEG.PRESENT@

LEXICON ROOT
!!Auxiliary verbs
छ+EXIST:छ auxcha;
हो+IDEN:हो auxho;
थि+EXIST:थि past;
!!=============Main Verb Stems ==========!!
NEG+@U.NEG.PRESENT@:न@U.NEG.PRESENT@ Verbs;
LEXICON Verbs
!! Verb Type 1a
अघा Type1a;
!! Verb Type 1b
चोखि Type1b;
!!Verb Type1c
उि ल Type1c;
!!Type verb1d
हाँस् Type1d;
!!Verb Type1e
बस् Type1e;
!! Verb Type2a
उचाल् Type2a;
!! Verb Type 2b
प ब Type2b;
!!Verb Type2c
नाच् Type2c;
!!Type verb2d
कन् Type2d;
!! Irregular verbs
आउनु+VERB:आ Group;
आउनु+VERB+PASS:आइ intGroup;
खानु+VERB:खा Group;

खानु+VERB+PASS:खाइ

Group;

खानु+VERB+CAUSE: वा

Group;

खानु+VERB+CAUSE+PASS: वाइ Group; बःनु+VERB+CAUSE:बसाल् Group;


Verbs;

बःनु+VERB+CAUSE+PASS:बसा ल Group;

बःनु+VERB+CAUSE:बसाल् Group;

बःनु+VERB+CAUSE+PASS:बसा ल Group; LEXICON Type1a उनु+VERB:0 Group; उनु+VERB+PASS:इ intGroup; LEXICON Type1b नु+VERB:0 Group; नु+VERB+PASS:इ

intGroup;

नु+VERB+CAUSE:आ

Group;

नु+VERB+CAUSE+PASS:आइ LEXICON Type1c नु+VERB:0 Group; नु+VERB+PASS:इ

intGroup;

नु+VERB+CAUSE:आ

Group;

नु+VERB+CAUSE+PASS:आइ LEXICON Type1d नु+VERB:0 Group; नु+VERB+PASS:इ

Group;

नु+VERB+CAUSE+PASS:आइ LEXICON Type1e नु+VERB:0 Group; Group;

नु+VERB+CAUSE+PASS:आइ LEXICON Type2a नु+VERB:0 Group;

नु+VERB+PASS:इ

Group;

Group; Group; Group;

नु+VERB+CAUSE:आ

Group;

नु+VERB+CAUSE+PASS:आइ LEXICON Type2c नु+VERB:0 Group; नु+VERB+PASS:इ

Group;

intGroup;

नु+VERB+CAUSE:आ

नु+VERB+PASS:इ LEXICON Type2b नु+VERB:0

Group;

intGroup;

नु+VERB+CAUSE:आ

नु+VERB+PASS:इ

Group;

Group;

Group;


नु+VERB+CAUSE:आ

Group;

नु+VERB+CAUSE+PASS:आइ LEXICON Type2d नु+VERB:0 Group; नु+VERB+PASS:इ

Group;

Group;

नु+VERB+CAUSE:आ

Group;

नु+VERB+CAUSE+PASS:आइ !!!=============Verbs end

LEXICON intGroup @U.NEG.ABSENT@ intGroup2; LEXICON intGroup1 +NPST+3SG+MASC:छ

Group;

intGroup1;

#;

+NPST+NEG+3SG+MASC:दै न

#;

+NPST+NEG+3SG:न #; +PST+3SG+MASC:यो

#;

+PST+3SG+MASC:एन

#;

+PST+HAB+3SG+MASC: यो

#;

+PST+NEG+HAB+3SG+MASC:दै न यो +PST+INFER+3SG+MASC:एछ

#;

+PST+INFER+NEG+3SG+MASC:एनछ LEXICON intGroup2 +PERF+SG+MASC:एको +IMPERF:दै

#;

#;

#;

+OPT+3SG:ओस्

#;

+POT+3SG+MASC:ला +INF:नु

#;

#;

#;

+INF+OBL:ना #; +PURP:न

#;

+PROSP:ने

#;

+DUR+EMPH:दै

#;

+CONJ:एर

#;

+CONJ:इकन #; +COND:ए

#;


+PERFT:ए

#;

LEXICON Group @U.NEG.ABSENT@ Group1; Group2; LEXICON Group1 NonpastAffirmative; NonpastNegative1; NonpastNegative2; PastAffirmative; PastNegative; HabitualAspectAffirmative; HabitualAspectNegative; InferentialAffiramtive; InferentialNegative; LEXICON Group2 PerfectAspect; ImperfectAspect; Imperative; Optative; Potential; Participles; LEXICON past PastAffirmative; PastNegative; !Inflection for non-past existential verb छ chʑ 'be' (Affirmative) LEXICON auxcha +NPST+1SG:◌ु #; +NPST+1PL:◌ौ◌ँ

#;

+NPST+2SG+MASC:स्

#;

+NPST+2SG+FEM:◌ेस्

#;

+NPST+2SG+MASC+HON:◌ौ

#;

+NPST+2SG+FEM+HON:◌्यौ

#;

+NPST+2PL:◌ौ #; +NPST+3SG+MASC:0 +NPST+3SG+FEM:◌े #;

#;

+NPST+3SG+MASC+HON:न्

#;

+NPST+3SG+FEM+HON:ि◌न्

#;

+NPST+3PL:न्

#;


!Inflection for non-past existential verb छ chʑ 'be' (Negative) +NPST+NEG+1SG:◌ैनँ

#;

+NPST+NEG+1PL:◌ैन

#;

+NPST+NEG+2SG:◌ैनस्

#;

+NPST+NEG+2SG+HON:◌ैनौ +NPST+NEG+2PL:◌ैनौ

#;

+NPST+NEG+3SG:◌ैन

#;

+NPST+NEG+3SG+HON:◌ैनन्

+NPST+NEG+3PL:◌ैनन्

#;

#;

#;

!! end of छ

!Inflections for non-past identificational verb हो ɦo ‘be’ (affirmative) LEXICON auxho +NPST+1SG:उँ #; +NPST+1PL:औ

#;

+NPST+2SG:स्

#;

+NPST+2SG+HON:औ

#;

+NPST+2PL:औ #; +NPST+3SG:0 #; +NPST+3SG+HON:न्#; +NPST+3PL:न्

#;

!Inflection for non-past identificational verb हो ɦo ‘be’ (Negative)

+NPST+NEG+1SG:इनँ

#;

+NPST+NEG+1PL:इन

#;

+NPST+NEG+2SG:इनस्

#;

+NPST+NEG+2SG+HON:इनौ +NPST+NEG+2PL:इनौ

#;

+NPST+NEG+3SG:इन

#;

+NPST+NEG+3SG+HON:इनन् +NPST+NEG+3PL:इनन्

#;

#;

#;

!!!===Tense, aspect and mood ==== !Inflections for non-past tense (affirmative) LEXICON NonpastAffirmative +NPST+1SG:छु #; +NPST+1PL:छ

#;


+NPST+2SG+MASC:छस्

#;

+NPST+2SG+FEM:छे स ्

#;

+NPST+2SG+MASC+HON:छौ

#;

ौ

#;

+NPST+2SG+FEM+HON: +NPST+2PL:छौ

#;

+NPST+3SG+MASC:छ

#;

+NPST+3SG+FEM:छे

#;

+NPST+3SG+MASC+HON:छन् +NPST+3SG+FEM+HON: छन् +NPST+3PL:छन्

#; #;

#;

!Inflections for non-past tense negative 1 LEXICON NonpastNegative1 +NPST+NEG+1SG: दनँ #; +NPST+NEG+1PL:दै न

#;

+NPST+NEG+2SG+MASC:दै नस् +NPST+NEG+2SG+FEM: दनस्

#; #;

+NPST+NEG+2SG+MASC+HON:दै नौ

#;

+NPST+NEG+2SG+FEM+HON: दनौ

#;

+NPST+NEG+2PL:दै नौ

#;

+NPST+NEG+3SG+MASC:दै न +NPST+NEG+3SG+FEM: दन

#; #;

+NPST+NEG+3SG+MASC+HON:दै नन् +NPST+NEG+3SG+FEM+HON: दनन् +NPST+NEG+3PL:दै नन्

#; #;

#;

!Inflections for non-past tense negative 2 LEXICON NonpastNegative2 +NPST+NEG+1SG:नँ #; +NPST+NEG+1PL:न #; +NPST+NEG+2SG:नस्

#;

+NPST+NEG+2SG+HON:नौ #; +NPST+NEG+2PL:नौ#; +NPST+NEG+3SG:न #; +NPST+NEG+3SG+HON:नन् +NPST+NEG+3PL:नन्

#;

#;


!Inflections for past tense (affirmative) LEXICON PastAffirmative +PST+1SG:ए ँ #; +PST+1PL:य #; +PST+2SG:इस्

#;

+PST+2SG+HON:यौ #; +PST+2PL:यौ #; +PST+3SG+MASC:यो

#;

+PST+3SG+FEM:ई #; +PST+3SG+MASC+HON:ए #; +PST+3SG+FEM+HON:इन् #; +PST+3PL:ए #; !Inflections for past tense (negative) LEXICON PastNegative +PST+NEG+1SG:इनँ #; +PST+NEG+1PL:एन #; +PST+NEG+2SG:इनस्

#;

+PST+NEG+2SG+MASC+HON:एनौ

#;

+PST+NEG+2SG+FEM+HON:इनौ #; +PST+NEG+2PL:एनौ #; +PST+NEG+3SG+MASC:एन

#;

+PST+NEG+3SG+FEM:इन #; +PST+NEG+3SG+MASC+HON:एनन्

#;

+PST+NEG+3SG+FEM+HON:इनन् #; +PST+NEG+3PL:एनन्

#;

!Inflections for perfect aspect LEXICON PerfectAspect +PERF+SG+MASC:एको #; +PERF+PL:एका

#;

+PERF+SG+FEM:एक +PERF+EMPH:एकै

#;

#;

!Inflections for imperfect aspect LEXICON ImperfectAspect +IMPERF+SG+MASC:दो #;


+IMPERF+SG+FEM:द +IMPERF+PL:दा +IMPERF:दै

#;

#;

#;

! Inflections for habitual aspect (affirmative) LEXICON HabitualAspectAffirmative +PST+HAB+1SG:थ #; +PST+HAB+1PL: य #; +PST+HAB+2SG: थस्

#;

+PST+HAB+2SG+HON: यौ #;

+PST+HAB+2PL: यौ #;

+PST+HAB+3SG+MASC: यो

#;

+PST+HAB+3SG+FEM: थ #; +PST+HAB+3SG+MASC+HON:थे #; +PST+HAB+3SG+FEM+HON: थन् #; +PST+HAB+3PL:थे #; !Inflections for habitual aspect (negative) LEXICON HabitualAspectNegative +PST+NEG+HAB+1SG:दै नथ #; +PST+NEG+HAB+1PL:दै न य

#;

+PST+NEG+HAB+2SG:दै न थस्

#;

+PST+NEG+HAB+2SG+HON:दै न यौ +PST+NEG+HAB+2PL:दै न यौ

#;

#;

+PST+NEG+HAB+2SG+FEM: दन थस्

#;

+PST+NEG+HAB+2SG+FEM+HON: दन यौ

#;

+PST+NEG+HAB+3SG+MASC:दै न यो

#;

+PST+NEG+HAB+3SG+FEM: दन थस्

#;

+PST+NEG+HAB+3SG+MASC+HON: दनथे +PST+NEG+HAB+3PL:दै नथे #;

!Inflections for Inferential aspect (affirmative) LEXICON InferentialAffiramtive +PST+INFER+1SG:एछु #; +PST+INFER+1PL:एछ

#;

+PST+INFER+2SG+MASC:एछस्

#;

+PST+INFER+2SG+FEM:इछस्

#;


#;

+PST+INFER+2SG+MASC+HON:एछौ

#;

+PST+INFER+2SG+FEM+HON:इछौ

#;

+PST+INFER+2PL:एछौ

#;

+PST+INFER+3SG+MASC:एछ

#;

+PST+INFER+3SG+FEM:इछ

#;

+PST+INFER+3SG+MASC+HON:एछन्

#;

+PST+INFER+3SG+FEM+HON:इछन्

#;

+PST+INFER+3PL:एछन्

#;

!Inflections for Inferential aspect (negative) LEXICON InferentialNegative +PST+INFER+NEG+1SG:एनछु #; +PST+INFER+NEG+1PL:एनछ

#;

+PST+INFER+NEG+2SG+MASC:एनछस् #; +PST+INFER+NEG+2SG+FEM:इनछे स्

#;

+PST+INFER+NEG+2SG+MASC+HON:एनछौ

#;

+PST+INFER+NEG+2SG+FEM+HON:इनछौ

#;

+PST+INFER+NEG+2PL:एनछौ

#;

+PST+INFER+NEG+3SG+MASC:एनछ

#;

+PST+INFER+NEG+3SG+FEM:इनछ

#;

+PST+INFER+NEG+3SG+HON:एनछन्

#;

+PST+INFER+NEG+3SG+FEM+HON:इनछन्

#;

+PST+INFER+NEG+3PL:एनछन्

#;

!Inflection for imperative mood LEXICON Imperative +IMP+2SG:^IMPsg #; !-/-ई +IMP+2SG+HON:^IMPhon #; !-अ/-ऊ +IMP+2PL:^IMPpl

#; !-अ/-ओ

! Inflections for optative mood (affirmative) LEXICON Optative +OPT+1SG:ऊँ #; +OPT+1PL:औ#; +OPT+2SG:एस्

#;

+OPT+2SG+HON:ए #; +OPT+2PL:ए #;


+OPT+3SG:ओस्

#;

+OPT+3SG+HON:ऊन् +OPT+3PL:ऊन्

#;

#;

! Inflections for potential mood (affirmative) LEXICON Potential +POT+1SG:उँला #; +POT+1PL:औला

#;

+POT+2SG+MASC:लास्

#;

+POT+2SG+FEM: लस्

#;

+POT+2SG+MASC+HON:औला

#;

+POT+2SG+FEM+HON:औल

#;

+POT+2PL:औला

#;

+POT+3SG+MASC:ला

#;

+POT+3SG+FEM:ल #; +POT+3SG+MASC+HON:लान्

#;

+POT+3SG+FEM+HON: लन्

#;

+POT+3PL:लान्

#;

!Inflection for participles LEXICON Participles +ABS:ई #; +INF:नु

#;

+INF+OBL:ना #; +INF+EMPH:नै +PURP:न

#;

+PURP+EMPH:नै +PROSP:ने

#;

+DUR:दा

#;

+DUR+EMPH:दै +CONJ:एर

#; #;

#;

#;

+CONJ+EMPH:एरै

#;

+CONJ:इकन #; +CONJ+EMPH:इकनै #; +COND:ए

#;

+PERFT:ए

#;


END

7.2.4 Adjectives

The adjectives described, analyzed and classified in (3.4) are implemented in a lexc file. They fall into four groups: o-ending adjectives, two types of non-o-ending adjectives, and unmarked adjectives. This lexc file contains the transducers from Figure 3.34 to Figure 3.37.

Multichar_Symbols +ADJ +SG +PL +OBL +HON +FEM +POSIT +COMP +SUPER
LEXICON ROOT
Adjectives;
LEXICON Adjectives
!!O-ending Adjectives
राम्रो inflection_o_ending;
!!Non-o-ending Adjective Type 1
चतुर inflection_non_o_ending1;
!!Non-o-ending Adjective Type 2
न्यून inflection_non_o_ending2;
!!Unmarked adjectives
असल inflection_unmarked;
LEXICON inflection_o_ending
+ADJ+SG:0 #;
+ADJ+PL:+MP #;
+ADJ+OBL:+MP #;
+ADJ+HON:+MP #;
+ADJ+FEM:+FE #;
LEXICON inflection_non_o_ending1
+ADJ+SG:0 #;
+ADJ+FEM:नी+FE #;
LEXICON inflection_non_o_ending2
+ADJ+POSIT:0 #;
+ADJ+COMP:तर #;
+ADJ+SUPER:तम #;
LEXICON inflection_unmarked
+ADJ:0 #;
END


7.2.5 Numerals and classifiers

The numerals analyzed and described in (3.5) and the classifiers described and analyzed in (3.6) are implemented in a lexc file. Irregular ordinal numerals are encoded directly as whole entries. This lexc file contains the transducers from Figure 3.38 to Figure 3.45.
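Because the irregular ordinals are listed as whole entries, a single surface form can map to several analyses. The sketch below, written in Python rather than lexc, illustrates this with the entries for चार 'four' from the listing. The `build_analyzer` helper and the dictionary layout are hypothetical; they only mimic the lookup (surface to analysis) direction of a compiled transducer and are not the author's implementation.

```python
from collections import defaultdict

# (analysis, surface) pairs taken from the directly encoded ordinal entries for चार.
ENTRIES = [
    ("चार+NUM+ORD+MASC", "चौथो"),
    ("चार+NUM+ORD+PL", "चौथा"),
    ("चार+NUM+ORD+OBL", "चौथा"),
    ("चार+NUM+ORD+HON", "चौथा"),
    ("चार+NUM+ORD+FEM", "चौथी"),
]


def build_analyzer(entries):
    """Index the pairs by surface form (hypothetical helper, not part of lexc)."""
    table = defaultdict(list)
    for upper, lower in entries:
        table[lower].append(upper)
    return dict(table)


ANALYZER = build_analyzer(ENTRIES)

if __name__ == "__main__":
    # One surface form, several analyses.
    for analysis in ANALYZER["चौथा"]:
        print(analysis)
```

The ambiguity is a property of the lexicon itself: plural, oblique and honorific share one spelling, so any disambiguation would have to happen in a later processing step.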

Multichar_Symbols +CARD +ORD +NUM +CLF +PORT +FREQ +FEM +PL +HON +MASC +HUM +NHUM +OBL +CL +MP +SG LEXICON ROOT Numbers; LEXICON Numbers !!Cardinal Numbers पाँच CardOrd; सात सय

CardOrd; CardOrd;

हजार CardOrd; लाख

CardOrd;

अरब

CardOrd;

खरब

CardOrd;

करोड CardOrd;

नील

CardOrd;

श

CardOrd;

दानो

ctag1;

प CardOrd; !!Clasifier like items !!O-ending classifiers कोसो ctag1;

!!Non-o-ending classifiers पोट ctag2; थुन

ctag2;

!!Exceptional numbers एक+NUM+CARD:एक

#;

दुई+NUM+CARD:दुई #; तीन+NUM+CARD:तीन

#;


चार+NUM+CARD:चार

#;

छ+NUM+CARD:छ #;
नौ+NUM+CARD:नौ #;
!!Exceptional ordinal numerals
एक+NUM+ORD+MASC:पहिलो #;
एक+NUM+ORD+PL:पहिला #;
एक+NUM+ORD+OBL:पहिला #;
एक+NUM+ORD+HON:पहिला #;
एक+NUM+ORD+FEM:पहिली #;
दुई+NUM+ORD+MASC:दोस्रो #;
दुई+NUM+ORD+PL:दोस्रा #;
दुई+NUM+ORD+OBL:दोस्रा #;
दुई+NUM+ORD+HON:दोस्रा #;
दुई+NUM+ORD+FEM:दोस्री #;
तीन+NUM+ORD+MASC:तेस्रो #;
तीन+NUM+ORD+PL:तेस्रा #;
तीन+NUM+ORD+OBL:तेस्रा #;
तीन+NUM+ORD+HON:तेस्रा #;
तीन+NUM+ORD+FEM:तेस्री #;
चार+NUM+ORD+MASC:चौथो #;
चार+NUM+ORD+PL:चौथा #;
चार+NUM+ORD+OBL:चौथा #;
चार+NUM+ORD+HON:चौथा #;
चार+NUM+ORD+FEM:चौथी #;
एक+NUM+ORD:प्रथम #;
दुई+NUM+ORD:द्वितीय #;
तीन+NUM+ORD:तृतीय #;
चार+NUM+ORD:चतुर्थ #;
पाँच+NUM+ORD:पञ्चम #;
छै ट +NUM+ORD:छै ट #;
नव +NUM+ORD:नव #;
!! Frequency numerals
एक+NUM+FREQ:एकोहोरो #;


दुई+NUM+FREQ:दोहोरो

#;

तीन+NUM+FREQ:तेहोरो

#;

एक+NUM+FREQ:एकसरो

#;

दुई+NUM+FREQ:दुईसरो

#;

तीन+NUM+FREQ:तीनसरो

#;

दुई+NUM+FREQ:दोबर

#;

तीन+NUM+FREQ:तेबर

#;

चार+NUM+FREQ:चौबर

#;

दुई+NUM+FREQ:दुईगुना

#;

तीन+NUM+FREQ:तीनगुना

#;

चार+NUM+FREQ:चौगुना

#;

!! Portion Numerals आधा+NUM+PORT:आधा

#;

पौने+NUM+PORT:पौने

#;

सवा+NUM+PORT:सवा

#;

डेढ+NUM+PORT:डेढ #;
साढे+NUM+PORT:साढे #;
अढाइ+NUM+PORT:अढाइ #;
चौथाइ+NUM+PORT:चौथाइ #;
!! Classifiers
जना+CLF+HUM:जना #;
वटा+CLF+NHUM:वटा #;
ओटा+CLF+NHUM:ओटा #;
वटी+CLF+FEM:वटी #;
ओटी+CLF+FEM:ओटी #;
LEXICON CardOrd
+NUM+CARD:0 #;
+NUM+ORD:औ #;
LEXICON ctag1
+CL+SG:0 #;
+CL+PL:+MP #;
LEXICON ctag2
+CL:0 #;
END


7.2.6 Adverbs

The adverbs described, analyzed and classified in (5.1) are implemented in a lexc file. Since adverbs do not inflect, they are grouped into semantic classes. This lexc file contains the transducers from Figure 5.1 to Figure 5.7.

Multichar_Symbols +ADV +TEMP +SPAC +AMOUNT +MANNER +FREQ +REASON +SENT
LEXICON Root
!!! Temporal adverbs
अहिले AdvT;
हिजो AdvT;
!!! Spatial Adverbs
तल AdvS;
यहाँ AdvS;
!!! Amount adverbs
धेरै AdvAm;
अझ AdvAm;
!!! Manner adverbs
सुस्तरी AdvMa;
फटाफट AdvMa;
!!! Frequency adverbs
बारम्बार AdvFr;
निरन्तर AdvFr;
!!! Reason adverbs
यसकारण AdvRe;
तसर्थ AdvRe;
!!! Sentential adverbs
साँच्चै AdvSe;
स्वाभावतः AdvSe;
LEXICON AdvT
+ADV+TEMP:0 #;
LEXICON AdvS
+ADV+SPAC:0 #;
LEXICON AdvAm
+ADV+AMOUNT:0 #;
LEXICON AdvMa
+ADV+MANNER:0 #;
LEXICON AdvFr
+ADV+FREQ:0 #;
LEXICON AdvRe
+ADV+REASON:0 #;
LEXICON AdvSe
+ADV+SENT:0 #;
END

7.2.7 Postpositions

The postpositions discussed and analyzed in (5.3) are implemented in a lexc file. The plural marker and the case markers are encoded directly, whereas adverbial postpositions are implemented through continuation lexicons. This lexc file contains the transducers from Figure 5.10 to Figure 5.12.
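The difference between direct encoding and continuation lexicons can be sketched with a toy model in Python. The lexicon names `Root`, `tag1` and `tag2` and the sample entries follow the listing; the `LEXICONS` dictionary and the `expand` function are illustrative assumptions that imitate how lexc chains an entry to its continuation class, not a reimplementation of the lexc compiler.

```python
# Each entry is (upper side, lower side, continuation); "#" ends the word.
LEXICONS = {
    "Root": [
        ("+ERG", "ले", "#"),      # case marker, encoded directly
        ("+DAT", "लाई", "#"),     # case marker, encoded directly
        ("साथ", "साथ", "tag2"),   # adverbial postposition, may take the emphatic
        ("कहाँ", "कहाँ", "tag1"),   # adverbial postposition, no emphatic
    ],
    "tag1": [("+POSTP", "", "#")],
    "tag2": [
        ("+POSTP", "", "#"),
        ("+POSTP+EMPH", "\u0948", "#"),  # emphatic vowel sign (U+0948)
    ],
}


def expand(name="Root", upper="", lower=""):
    """Yield every (analysis, surface) pair reachable from lexicon `name`."""
    if name == "#":
        yield upper, lower
        return
    for up, low, continuation in LEXICONS[name]:
        yield from expand(continuation, upper + up, lower + low)


PAIRS = list(expand())

if __name__ == "__main__":
    for analysis, surface in PAIRS:
        print(analysis, surface)
```

Adding a new adverbial postposition then only requires one stem line pointing at `tag1` or `tag2`; the tag and the optional emphatic form come from the shared continuation.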

Multichar_Symbols +POSTP +EMPH +ERG +INST +DAT +ABL +LOC +COM +GEN +DIR +SG +PL +FEM
LEXICON ROOT
!!Case Markers which do not take emphatic marker
+ERG:ले #;
+INST:ले #;
+DAT:लाई #;
+ABL:देखि #;
!!Case marker which take emphatic marker also
+ABL:बाट #;
+ABL+EMPH:बाटै #;
+LOC:मा #;
+LOC+EMPH:मै #;
+COM:सँग #;
+COM+EMPH:सँगै #;
+COM:सित #;
+COM+EMPH:सितै #;
+GEN+SG:को #;
+GEN+PL:का #;
+GEN+FEM:की #;
+GEN+EMPH:कै #;
+ALL:तिर #;
+ALL+EMPH:तिरै #;
!!Plural/collective marker
+PL:हरू #;
!!Adverbial postpositions which do not take emphatic marker
माथि tag1;
कहाँ tag1;
!!Adverbial Postpositions which take emphatic marker
सहित tag2;
साथ tag2;
अनुसार tag2;
बाहेक tag2;
LEXICON tag1
+POSTP:0 #;
LEXICON tag2
+POSTP:0 #;
+POSTP+EMPH:◌ै #;
END

7.2.8 Conjunctions, particles and interjections

The conjunctions analyzed in (5.2) and the particles and interjections analyzed in (5.4) are implemented in a lexc file. This lexc file contains the transducers from Figure 5.8 to Figure 5.9 and from Figure 5.13 to Figure 5.14.

Multichar_Symbols +PART +INTERJ +CCONJ +SCONJ +PARTICLE

LEXICON Root !!!समप दक सं योजकह र

वा

Coordinate; Coordinate;

अथवा Coordinate; या

Coordinate;

क

Coordinate;

नक

Coordinate;

अन

Coordinate;


पन

Coordinate;

तथा

Coordinate;

एवं

Coordinate;

तर

Coordinate;

क तु Coordinate;

पर तु Coordinate; !!! वषमप दक सं योजकह भ े

Subordinate;

भनेर

Subordinate;

भने

Subordinate;

क

Subordinate;

कनभने Subordinate; कन क Subordinate;

यसकारण

Subordinate;

!!!Particles नपातह नै

Particle;

माऽ

Particle;

केवल Particle; चा हँ

Particle;

पन

Particle;

ल

Particle;

है

Particle;

न

Particle;

न

Particle;

त

Particle;

पो

Particle;

या

के

Particle; Particle;

क

Particle;

रे

Particle;

हँ

यारे

हग

Particle; Particle; Particle;


खै

Particle;

लौ

Particle;

हौ

Particle;

यार

Particle;

यारे

Particle;

या

Particle;

है

Particle;

झ

Particle;

!!!Interjections वःमाया दबोधकह अहा

Interjection;

अहो

Interjection;

ओहो

Interjection;

उहु

Interjection;

उफ

Interjection;

आ था Interjection; आ थु Interjection; आ छु Interjection; छ

Interjection;

धत्

Interjection;

थु

Interjection;

थुइ

Interjection;

ध े र Interjection;

बःड

Interjection;

हाय

Interjection;

कठै

Interjection;

हरे

Interjection;

िशव

Interjection;

च

Interjection;

बरा

Interjection;

उस्

Interjection;

हाहा

Interjection;

हह

Interjection;

बचरा Interjection;


या

Interjection;

ए

Interjection;

ऐ

Interjection;

औ

Interjection;

हौ

Interjection;

ऐ या

Interjection;

ल

Interjection;

हवस्

Interjection;

अँ

Interjection;

यू

Interjection;

हजुर

Interjection;

हँ

अहँ

नाइँ

Interjection; Interjection; Interjection;

कुि

Interjection;

स े

Interjection;

साँ ची Interjection; धरोधम Interjection; भो

Interjection;

ई

Interjection;

ऊ

Interjection;

वाह

Interjection;

अबुइ

Interjection;

आ पै

Interjection;

ःयाबास Interjection;

उफ्

Interjection;

ओ

Interjection;

चै

Interjection;

ओइ

Interjection;

एइ

Interjection;

याहै

Interjection;

ह

Interjection;

हा

ब:ड

Interjection; Interjection;


LEXICON Coordinate
+CCONJ:0 #;
LEXICON Subordinate
+SCONJ:0 #;
LEXICON Particle
+PARTICLE:0 #;
LEXICON Interjection
+INTERJ:0 #;
END

7.2.9 Derivations

The derivational processes of prefixation, described and analyzed in (6.1), and suffixation, described and analyzed in (6.2), are implemented in a lexc file. This lexc file contains the transducers from Figure 6.1 to Figure 6.16.
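In the prefixation part of the listing, each prefix continues to its own stem lexicon, so a prefix combines only with the stems listed for it. A rough Python sketch of that design follows. The prefix and stem pairs are taken from the PNtoN lexicons; the `PREFIX_STEMS` dictionary and the helper names are hypothetical and stand in for the continuation classes.

```python
# Prefix -> stems it may attach to, mirroring the PNtoN / PNtoN<k> layout.
PREFIX_STEMS = {
    "परा": ["जय"],
    "अनु": ["शासन"],
    "अव": ["गुण"],
    "सु": ["समाचार"],
}


def derive_nouns(prefix_stems):
    """Return (analysis, surface) pairs; the upper side shows PFX+ instead of the prefix."""
    pairs = []
    for prefix, stems in prefix_stems.items():
        for stem in stems:
            pairs.append(("PFX+" + stem + "+NOUN", prefix + stem))
    return pairs


def accepts(word, prefix_stems):
    """True if `word` is one of the derived surface forms."""
    return any(lower == word for _, lower in derive_nouns(prefix_stems))


if __name__ == "__main__":
    print(accepts("पराजय", PREFIX_STEMS))
    print(accepts("पराशासन", PREFIX_STEMS))
```

Keeping one stem lexicon per prefix avoids overgeneration, at the cost of listing a stem again under every prefix it can take.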

Multichar_Symbols +NOUN PFX+ +ADJ +SFX ^R ^a +ADV
LEXICON ROOT
!Lexicons for prefixation
PNtoN;
PNtoAdj;
PNtoAdv;
PAdjtoAdj;
!Lexicons for suffixation
SNtoN;
SNtoAdj;
SNtoNAdj;
SAdjtoN;
SAdjNtoN;
SVtoN;
SVtoAdj;
SVtoAdv;
SAdvtoAdj;
ConVtoN;
ConVtoNAdj;
InsVtoN;
LEXICON PNtoN
PFX+:प्र PNtoN1;
PFX+:परा PNtoN2;

PFX+:अप

PNtoN3;

PFX+:सम्

PNtoN4;

PFX+:अनु

PNtoN5;

PFX+:अव

PNtoN6;


PFX+:दुस ्

PNtoN7;

PFX+:दुर ्

PNtoN8;

PFX+: व

PNtoN9;

PFX+:अ ध

PNtoN10;

PFX+:अ त

PNtoN11;

PFX+:अ भ

PNtoN12;

PFX+:ू त

PNtoN13;

PFX+:प र

PNtoN14;

PFX+:उप

PNtoN15;

PFX+:सह

PNtoN16;

PFX+:स

PNtoN17;

PFX+:कु

PNtoN18;

PFX+:अ

PNtoN19;

PFX+:अन्

PNtoN20;

PFX+:बे

PNtoN21;

PFX+:बद

PNtoN22;

PFX+:ला

PNtoN23;

PFX+:सु

PNtoN24;

!!Lexicon of underived nouns -----!!
LEXICON PNtoN1
चलन PNtoNtag;
LEXICON PNtoN2
जय PNtoNtag;
LEXICON PNtoN3
शब्द PNtoNtag;
LEXICON PNtoN4
मान PNtoNtag;
LEXICON PNtoN5
शासन PNtoNtag;
LEXICON PNtoN6
गुण PNtoNtag;
LEXICON PNtoN7
परिणाम PNtoNtag;
LEXICON PNtoN8
घटना PNtoNtag;
LEXICON PNtoN9
नाश PNtoNtag;

LEXICON PNtoN10 रा य PNtoNtag; LEXICON PNtoN11 वृ PNtoNtag; LEXICON PNtoN12 िच PNtoNtag; LEXICON PNtoN13 व न PNtoNtag; LEXICON PNtoN14 योजना PNtoNtag; LEXICON PNtoN15 मह PNtoNtag; LEXICON PNtoN16 काय PNtoNtag; LEXICON PNtoN17 प रवार PNtoNtag; LEXICON PNtoN18 पुऽ PNtoNtag; LEXICON PNtoN19 ान PNtoNtag; LEXICON PNtoN20 आःथा PNtoNtag; LEXICON PNtoN21 इ जत PNtoNtag; LEXICON PNtoN22 नाम PNtoNtag; LEXICON PNtoN23 वा रस PNtoNtag; LEXICON PNtoN24 समाचारPNtoNtag; !Lexicon for common tag LEXICON PNtoNtag +NOUN:0 #; !!---Noun to adjective derivation -----!! LEXICON PNtoAdj PFX+: नर ् PNtoAdj1; PFX+: नः

PNtoAdj2;

PFX+: न

PNtoAdj3;

PFX+: व

PNtoAdj4;


PFX+: नस्

PNtoAdj5;

PFX+:स

PNtoAdj6;

PFX+:बे

PNtoAdj7;

PFX+:अ

PNtoAdj8;

PFX+:अन

PNtoAdj9;

!!Lexicon of underived nouns LEXICON PNtoAdj1 दोष PNtoAdjtag; LEXICON PNtoAdj2 ःवाथ PNtoAdjtag; LEXICON PNtoAdj3 डर PNtoAdjtag; LEXICON PNtoAdj4 मुख PNtoAdjtag; LEXICON PNtoAdj5 फल PNtoAdjtag; LEXICON PNtoAdj6 बल PNtoAdjtag; LEXICON PNtoAdj7 घर PNtoAdjtag; LEXICON PNtoAdj8 मू य PNtoAdjtag; LEXICON PNtoAdj9 मोल PNtoAdjtag; !!Lexicon for common tag LEXICON PNtoAdjtag +ADJ:0 #; !!-----Noun to adverb derivation -----!! LEXICON PNtoAdv PFX+:आ PNtoAdv1; PFX+:स

PNtoAdv2;

PFX+: नर ्

PNtoAdv3;

PFX+:ू त

PNtoAdv4;

!!Lexicon of underived nouns LEXICON PNtoAdv1 मरण PNtoAdvtag; LEXICON PNtoAdv2


हष PNtoAdvtag; LEXICON PNtoAdv3 घात PNtoAdvtag; LEXICON PNtoAdv4 ह ा PNtoAdvtag; !!Lexicon for common tag LEXICON PNtoAdvtag +ADV:0 #; !!-----Adjective to adjective derivation -----!! LEXICON PAdjtoAdj PFX+:सम् PAdjtoAdj1; PFX+: व

PAdjtoAdj2;

PFX+:दुर ्

PAdjtoAdj3;

PFX+:उन्

PAdjtoAdj4;

PFX+:सु

PAdjtoAdj5;

PFX+:प र

PAdjtoAdj6;

!!Lexicon of underived nouns LEXICON PAdjtoAdj1 पूण PAdjtoAdjtag; LEXICON PAdjtoAdj2 शु PAdjtoAdjtag; LEXICON PAdjtoAdj3 भे PAdjtoAdjtag; LEXICON PAdjtoAdj4 मु PAdjtoAdjtag; LEXICON PAdjtoAdj5 िशि त PAdjtoAdjtag; LEXICON PAdjtoAdj6 पूण PAdjtoAdjtag; !!Lexicon for common tag LEXICON PAdjtoAdjtag +ADJ:0 #; !! Suffixation !!Noun to Noun Derivation LEXICON SNtoN !!Nountype1 सुन SNtoN1; !!Nountpe2 260

घाँस SNtoN2; LEXICON SNtoN1 +SFX:आर SNtoNtag; LEXICON SNtoN2 +SFX:ई SNtoNtag; LEXICON SNtoNtag +NOUN:0 #; !! Noun to adjective derivation LEXICON SNtoAdj !!Nountype1 दया SNtoAdj1; !!Nountpe2 लाभ SNtoAdj2; !!Nountpe3 सेवा SNtoAdj3; !!Nountpe4 मुगल SNtoAdj4; !!Nountpe5 ल बु SNtoAdj5; !!Nountpe6 दान SNtoAdj6; !!Nountpe7 खच SNtoAdj7; !!Nountpe8 भर SNtoAdj8; !!Nountpe9 रस SNtoAdj9; !!Nountpe10 शहर SNtoAdj10; !!Nountpe11 होस SNtoAdj11; LEXICON SNtoAdj1 +SFX:अनीय SNtoAdjtag; LEXICON SNtoAdj2 +SFX:अक SNtoAdjtag; LEXICON SNtoAdj3 +SFX:इका SNtoAdjtag; LEXICON SNtoAdj4 +SFX:आन SNtoAdjtag; LEXICON SNtoAdj5


+SFX:वान SNtoAdjtag; LEXICON SNtoAdj6 +SFX:ई SNtoAdjtag; LEXICON SNtoAdj7 +SFX:आलु SNtoAdjtag; LEXICON SNtoAdj8 +SFX:आलो SNtoAdjtag; LEXICON SNtoAdj9 +SFX:आहा SNtoAdjtag; LEXICON SNtoAdj10 +SFX:इया SNtoAdjtag; LEXICON SNtoAdj11 +SFX:इयार SNtoAdjtag; LEXICON SNtoAdjtag +ADJ:0 #; !!-----Noun to noun/adjective derivation -----!! LEXICON SNtoNAdj !!Nountype1 झापा SNtoNAdj1; !!Nountpe2 गु मी SNtoNAdj2; !!Nountpe3 इलाम SNtoNAdj3; !!Nountpe4 गाउँ SNtoNAdj4; !!Nountpe5 नेपाल SNtoNAdj5; LEXICON SNtoNAdj1 +SFX:ल SNtoNAdjtag1; LEXICON SNtoNAdj2 +SFX:एल SNtoNAdjtag1; LEXICON SNtoNAdj3 +SFX:ए SNtoNAdjtag1; LEXICON SNtoNAdj4 +SFX:ले SNtoNAdjtag1; LEXICON SNtoNAdj5 +SFX:ई SNtoNAdjtag1; LEXICON SNtoNAdjtag +NOUN:0 #; LEXICON SNtoNAdjtag1 +NOUN:0 #;


+ADJ:0

#;

!!-----Adjective to noun derivation -----!! LEXICON SAdjtoN !!Nountype1 लामो SAdjtoN1; छोटो SAdjtoN1; !!Nountpe2 !!xy SAdjtoN2; LEXICON SAdjtoN1 +SFX:आइ SAdjtoNtag; LEXICON SAdjtoNtag +NOUN:0 #;

!!----- Adjective/noun to noun derivation -----!! LEXICON SAdjNtoN !!Nountype1 ग रब SAdjNtoN1; LEXICON SAdjNtoN1 +SFX:ई SAdjNtoNtag1; LEXICON SAdjNtoNtag +NOUN:0 #; +ADJ:0 #; LEXICON SAdjNtoNtag1 +NOUN:0 #; !!------ Verb to noun derivation -----!! LEXICON SVtoN !!verbtype1 च ुन् SVtoN1; !verbtype2 च ुन् SVtoN2; !!verbtype3 क SVtoN3; !!verbtype4 क SVtoN4; !!verbtype5 ढाक् SVtoN5; !!verbtype6 जल् SVtoN6; !!verbtype7 चोर ् SVtoN7; !!verbtype8 हाँस् SVtoN8;


!!verbtype9 प SVtoN9; !!verbtype10 थाक् SVtoN10; !!verbtype11 छाप् SVtoN11; !!verbtype12 छान् SVtoN12; !!verbtype13 िच या SVtoN13; !!verbtype14 झर ् SVtoN14; !!verbtype15 ढोग् SVtoN15; !!verbtype16 राख् SVtoN16; !!verbtype17 दाब् SVtoN17; !!verbtype18 बच ् SVtoN18; !!verbtype19 स SVtoN19; !!verbtype20 रोप् SVtoN20; !!verbtype21 छे क् SVtoN21; !!verbtype22 िचर ् SVtoN22; !!verbtype23 ब SVtoN23; !!verbtype24 सर ् SVtoN24; !!verbtype25 उ SVtoN25; !!verbtype26 चाल् SVtoN26; !!verbtype27 बेर ् SVtoN27; !!verbtype28 गा SVtoN28; !!verbtype29


भ SVtoN29; !!verbtype30 िजत् SVtoN30; !!verbtype31 कोर ् SVtoN31; !!verbtype32 खुल ् SVtoN32; LEXICON SVtoN1 +SFX:आउ SVtoNtag; LEXICON SVtoN2 +SFX:आब SVtoNtag; LEXICON SVtoN3 +SFX:आनी SVtoNtag; LEXICON SVtoN4 +SFX:आनी SVtoNtag; LEXICON SVtoN5 +SFX:अनी SVtoNtag; LEXICON SVtoN6 +SFX:अन SVtoNtag; LEXICON SVtoN7 +SFX:ई SVtoNtag; LEXICON SVtoN8 +SFX:ओ SVtoNtag; LEXICON SVtoN9 +SFX:आइ SVtoNtag; LEXICON SVtoN10 +SFX:आवट SVtoNtag; LEXICON SVtoN11 +SFX:आ SVtoNtag; LEXICON SVtoN12 +SFX:ओट SVtoNtag; LEXICON SVtoN13 +SFX:हट SVtoNtag; LEXICON SVtoN14 +SFX:अना SVtoNtag; LEXICON SVtoN15 +SFX:आउनी SVtoNtag; LEXICON SVtoN16 SVtoNtag; +SFX:आलो LEXICON SVtoN17


+SFX:आब SVtoNtag; LEXICON SVtoN18 +SFX:अत SVtoNtag; LEXICON SVtoN19 +SFX:अल SVtoNtag; LEXICON SVtoN20 +SFX:आइँ SVtoNtag; LEXICON SVtoN21 +SFX:आरो SVtoNtag; LEXICON SVtoN22 +SFX:औटो SVtoNtag; LEXICON SVtoN23 +SFX:औती SVtoNtag; LEXICON SVtoN24 +SFX:उवा SVtoNtag; LEXICON SVtoN25 +SFX:ती SVtoNtag; LEXICON SVtoN26 +SFX:नी SVtoNtag; LEXICON SVtoN27 +SFX:नो SVtoNtag; LEXICON SVtoN28 +SFX:ना SVtoNtag; LEXICON SVtoN29 +SFX:अ त SVtoNtag; LEXICON SVtoN30 +SFX:और SVtoNtag; LEXICON SVtoN31 +SFX:एसो SVtoNtag; LEXICON SVtoN32 +SFX:अःत SVtoNtag; LEXICON SVtoNtag +NOUN:0 #; !!----- Verb to adjective derivation -----!! LEXICON SVtoAdj !!verbtype1 मच ् SVtoAdj1; !!verbtype2 भ ुल् SVtoAdj2; !!verbtype3


पोस् SVtoAdj3; !!verbtype4 घुम ् SVtoAdj4; !!verbtype5 घुम ् SVtoAdj5; !!verbtype6 खप् SVtoAdj6; !!verbtype7 प SVtoAdj7; !!verbtype8 छा SVtoAdj8; !!verbtype9 रोप् SVtoAdj9; !!verbtype10 सक् SVtoAdj10; !!verbtype11 बक् SVtoAdj11; !!verbtype12 भाग् SVtoAdj12; !!verbtype13 छे र ् SVtoAdj13; !!verbtype14 लाग् SVtoAdj14; LEXICON SVtoAdj1 +SFX:आहा SVtoAdjtag; LEXICON SVtoAdj2 +SFX:अ ड SVtoAdjtag; LEXICON SVtoAdj3 +SFX:इलो SVtoAdjtag; LEXICON SVtoAdj4 +SFX:अ ते SVtoAdjtag; LEXICON SVtoAdj5 +SFX:अ ता SVtoAdjtag; LEXICON SVtoAdj6 +SFX:आलु SVtoAdjtag; LEXICON SVtoAdj7 SVtoAdjtag; +SFX:ऐया LEXICON SVtoAdj8 +SFX:आ SVtoAdjtag; LEXICON SVtoAdj9


+SFX:आर SVtoAdjtag; LEXICON SVtoAdj10 +SFX:आ SVtoAdjtag; LEXICON SVtoAdj11 +SFX:आउ SVtoAdjtag; LEXICON SVtoAdj12 +SFX:औटो SVtoAdjtag; LEXICON SVtoAdj13 +SFX:औट SVtoAdjtag; LEXICON SVtoAdj14 +SFX:उ SVtoAdjtag; LEXICON SVtoAdjtag +ADJ:0 #; !!----- Verb to adverb derivation -----!! LEXICON SVtoAdv !!verbtype1 गर ् SVtoAdv1; LEXICON SVtoAdv1 +SFX:उ जेल SVtoAdvtag; +SFX:इ जेल SVtoAdvtag; LEXICON SVtoAdvtag +ADV:0 #;

!!------ Adverb to adjective derivation -----!!
LEXICON SAdvtoAdj
भित्र SAdvtoAdj1;
LEXICON SAdvtoAdj1
+SFX:ई SAdvtoAdjtag;
LEXICON SAdvtoAdjtag
+ADJ:0 #;
!!------ Verb to noun conversion -----!!
LEXICON ConVtoN
!!verbtype1
खेल् ConVtoNtag;
खोज् ConVtoNtag;
LEXICON ConVtoNtag
+NOUN:^R #;
!!------ Verb to noun/adjective conversion -----!!
LEXICON ConVtoNAdj
!!verbtype1
ठग् ConVtoNAdjtag;
चोर् ConVtoNAdjtag;
थप् ConVtoNAdjtag;
LEXICON ConVtoNAdjtag
+NOUN:^R #;
+ADJ:^R #;
!!----- Verb to noun derivation (by vowel insertion) -----!!
LEXICON InsVtoN
!!verbtype1
च क InsVtoNtag;
स झ InsVtoNtag;

ट क InsVtoNtag;
LEXICON InsVtoNtag
+NOUN:^a #;
END

7.3 Realization: rules of alternations

When the lexc files are compiled into a finite state transducer, the upper language contains sequences of stems (citation forms) and morphological tags, and the lower language contains the actual spelling of the stems and affixes. The lower language may also contain arbitrary tags whose only purpose is to create the environments in which rules apply. The rules shown after each finite state transducer figure are collected and placed in a fixed order. The required variables are defined, and the rules are composed into a single finite state transducer. Finally, the rules are applied to the lower language of the lexicon finite state transducer. Each word class is treated separately in the following sections.
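As a rough illustration of applying ordered rules to the lower language, the following Python sketch approximates a few of the replace rules with regular expressions applied in sequence. It is an assumption-laden simplification: sequential substitution stands in for transducer composition (`.o.`), only four rules are modelled, and the `RULES` list and `realize` function are hypothetical names. The inputs are built from stems and suffixes that occur in the lexc files.

```python
import re

CONS = "[\u0915-\u0939]"  # Devanagari consonants क..ह (approximation of `cons`)

# Ordered (pattern, replacement) pairs; list order plays the role of .o.
RULES = [
    (re.compile("\u094b" + r"\+MP$"), "\u093e"),               # ो +MP -> ा at word end
    (re.compile("\u094b" + r"\+FE$"), "\u0940"),               # ो +FE -> ी at word end
    (re.compile("(?<=" + CONS + ")\u094d\u090f"), "\u0947"),  # virama + ए -> े after a consonant
    (re.compile("(?<=" + CONS + ")\u094d\u0906"), "\u093e"),  # virama + आ -> ा after a consonant
]


def realize(lower):
    """Apply every rule in order to a lower-side string from the lexicon."""
    for pattern, replacement in RULES:
        lower = pattern.sub(replacement, lower)
    return lower


if __name__ == "__main__":
    print(realize("छोटो+MP"))
    print(realize("लामो+FE"))
    print(realize("गर\u094d" + "एको"))
```

The placeholder tags `+MP` and `+FE` never reach the surface: they exist only so that the rules can find the right environment, which is the role the paragraph above assigns to the arbitrary tags.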

7.3.1 Phonological rules for nouns

clear
define cons |ग|घ|ङ|च|छ|ज|झ|ञ|ट|ठ|ड|ढ|ण|त|थ|द|ध|न|प|फ|ब|भ|म|य|र|ल|व|स|ष|श|ह;
define liquids र|ल;
define change [
[◌ो %+MP -> ◌ा || _ .#.]
.o.
[◌ो %+FE -> ◌ी || _ .#.]
.o.
[◌ा -> [ ] || _ ?* %^b ि◌ न ◌ी .#.]
.o.
[◌ा -> [ ] || _ ◌े न ◌ी .#.]
.o.
[◌ा -> [ ] || _ ि◌ न ◌ी .#.]
.o.
[◌ा -> [ ] || _ ◌ी .#.]
.o.
[◌ी -> ◌् || liquids _ न ◌ी .#.]
.o.
[◌ी -> ि◌ || _ न ◌ी .#.]
.o.
[[. .] -> ◌् || cons _ न ◌ी .#.]
.o.
[य ◌ा -> [ ] || _ न ◌ी .#.]
.o.
[◌ी -> [ ] || _ ि◌ न ◌ी .#.]
.o.
[◌ू -> ◌ु || _ ?* %^b ि◌ न ◌ी .#.]
.o.
[◌ी -> [ ] || _ %^b ि◌ न ◌ी .#.]
.o.
[%^b -> [ ] ]
.o.
[ि◌ -> [ ] || ◌ु _ न ◌ी .#.]
];

1 When more nouns are included, new rules, if any, can be added to this set of rules.


read lexc ◌ा || cons _ cons ?* %^a ] .o. [ि◌ %^a -> ◌् ] .o. [◌ा -> [ ] || cons _ ?* %^ia ]

.o. [◌ा -> [ ] || cons _ ?* %^ta ] .o. [◌् %^ia -> ◌ा ] .o. [◌् %^ta -> ◌ा ] .o. [%^ia -> ◌ा ] .o. [ि◌ %^ta -> ◌ा] .o. [%^IMPsg -> [ ], %^IMPpl -> ऊ, %^IMPhon -> ओ || ◌ा _ .#.] .o. [◌् -> [ ] || cons _ %^IMPhon ] .o. [◌्

-> [ ] || cons _ %^IMPpl]

.o. [ि◌ -> [ ] || cons _ %^IMPhon] .o. [ि◌ -> [ ] || cons _ %^IMPpl] .o. [%^IMPsg -> [ ] || ◌् _ .#.] .o. [%^IMPsg -> [ ] || _ .#.] .o. [%^IMPpl -> [ ] || _ .#.] .o. [%^IMPhon -> [ ] || _ .#.]


.o. [◌् आ -> ◌ा || cons _ ] .o. [◌् इ -> ि◌ || cons _ ] .o. [◌् ई -> ◌ी || cons _ ] .o. [◌् उ -> ◌ु || cons _ ] .o. [◌् ऊ -> ◌ू || cons _ ] .o. [◌् ए -> ◌े || cons _ ] .o. [◌् ओ -> ◌ो || cons _ ] .o. [◌् औ -> ◌ौ || cons _ ] .o. [इ इ

-> इ ]

.o. [इ ई

-> ई ]

.o. [[. .] -> उ ◌ँ || ◌ा _ npastinfl|npastneginfl1|habinfl|habneginfl|imperfect .#.] .o. [[. .] -> न ◌् || ि◌|इ _ npastinfl|habinfl|imperfect .#.] .o. [i -> इ ] .o.


[◌् इ -> ि◌ || cons _ ] .o. [इ इ

-> इ ]

];2 read lexc ◌ा || _ .#.] .o. [◌ो %+FE -> ◌ी || _ .#.] .o. [[. .] -> ◌् || liquids _ न ◌ी .#.] .o. [य ◌ा -> [ ] || _ न ◌ी .#.] .o. [◌ा -> [ ] || _ ?* न ◌ी .#.] .o. [[. .] -> ि◌ || cons _ न ◌ी .#.] .o. [ि◌ -> [ ] || ध _ न ◌ी .#.] ];3 2

More rules may be needed when more verbs are added.

3

When more adjectives are added to the lexc file, further rules may be required.


read lexc [ ] || _ ◌ै .#.] .o. [◌ो -> [ ] || _ ◌ै .#.] .o. [◌् -> [ ] || _ ◌ै .#.] ]; read lexc ◌ो] .o. [◌् इ -> ि◌] .o.


[◌् ई -> ◌ी] .o. [◌् उ ->◌ु] .o. [◌् ऊ ->◌ू] .o. [◌् ए -> ◌े] .o. [◌् ऐ -> ◌ै] .o. [◌् औ -> ◌ौ] .o. [◌् अ -> [ ] ] .o. [अ -> [ ] || con _ ] .o. [ए -> ◌े, इ -> ि◌ || con _ ] .o. [◌ी ए -> ◌े || _ ल ◌ी] .o. [आ -> ◌ा || con _ ] .o. [◌ो आ -> ◌ा || con _ ] .o. [ई -> ◌ी || con _ ] .o. [◌ा इ -> ि◌ || con _ con] .o. [◌ा अ -> [ ] || con _ ] .o. [◌ा -> [ ] || .#. con _ ?* इ .#.] .o. [◌् -> [ ] || _ ?* %^a] .o. [◌् %^R -> [ ] ] .o. [%^a -> [ ] ] ]; read lexc