16 Categories and Constituents

Dr. Atunu Saha

Learning outcome

In this module, the reader is expected to learn the definition of a category and the various views on how linguists describe categorization. Alongside this, the notion of part of speech and the shortcomings of the traditional classification system are shown. The current view of categories and categorization draws much insight from the morphology of a particular language on the one hand and its syntax on the other.

This idea has also given prominence to the fact that the sentences of any language are not a mere summation of words. Rather, a group of words within a sentence acts like a unit, which is called a constituent. Finally, some diagnostic tests are presented to prove the existence of constituents in natural languages.

Introduction

This module focuses on categories and constituents as part of Paper 13, i.e. Introduction to Linguistics and Phonetics. These two components are an integral part of syntax, which deals with phrase and clause formation in a particular language or in languages in general. The module is organized as follows. At the outset, the definitions pertaining to categories are given in Section 16.1. The idea of category and categorization is discussed in 16.2. Various views on categories are noted in sections 16.2.1-16.2.4. Parts of speech and the glitches of the traditional parts of speech are described in 16.3. Section 16.4 introduces two sets of rules to understand the categories, and section 17 presents the idea of constituents and the tests for their identification. Finally, a summary is presented in section 18.

What is a category?

A category is a minimal unit in syntax that selects a particular word or lexical item to be put under it. For example, N(oun) and V(erb) are called categories because N allows a noun (such as a proper noun or a pronoun) to be inserted under it, while V allows only a verb.

In other words, a category comprises a set of features that are responsible for distinguishing words into the traditional word classes of noun, verb, adjective, etc. (Adger, 2002).

Category and categorization

In this section, I discuss the process of categorization in human languages and various views about categories, which are an integral part of human cognition. According to Rauh (2010), ‘Categorizing is a fundamental aspect of how humans process reality and the formation of categories gives structure to the enormous amount of sensory input.’

Lexical items that share properties are combined into the same category, so a category essentially captures a generalization over items of a particular type. In this way John, him and assassination would all be classified as nouns, and reading, putting and sleeping would be classified as verbs. Such generalizations enable a child to acquire a language effectively and an adult learner to learn a second language.

It is also argued that, because of their generalizing properties, categories are indispensable in any area of scientific study. Rauh (2010) notes that ‘forming categories and describing the relations between them’ yields a structure and gives us ‘insights into the object of study.’

However, there is no uniformity in the way linguists analyse categories. Instead, a variety of terms can be found, including ‘parts of speech’, ‘word classes’, ‘form classes’, ‘lexical categories’, ‘grammatical categories’, and ‘syntactic categories’.

Typological explanations on categories

Haspelmath (2001) classifies the word classes or parts of speech by listing ten traditional categories, namely Noun, Verb, Adjective, Adverb, Pronoun, Preposition/Adposition, Conjunction, Numeral, Article, and Interjection.

The term ‘part of speech’ originates from the Latin ‘pars orationis’ and is widely used by linguists to denote categories. The term ‘word class’, on the other hand, was introduced in the first half of the twentieth century by the structural linguists. Bloomfield relates the two terms in the following way: the maximal word classes of a given language are considered the parts of speech of that language. This indicates that the sets of parts of speech and word classes are not identical.

Haspelmath (2001) argues that the dispute arises from the confusion between these terms. What one needs to understand is that not every language has the same number of word classes or parts of speech. Some languages have more of them than others.

The generalization can be captured as follows:

  • Not every language will have the same ten categories as noted by Haspelmath (2001). This means that in many languages it is difficult to distinguish nouns from adjectives or adjectives from verbs.
  • Many languages will have categories other than those ten traditional categories, and without describing these unique categories the grammatical descriptions of these languages will not be exhaustive.

Under this view, ‘word class’, ‘part of speech’, ‘syntactic category’, and ‘lexical category’ are treated as equivalent, or at least ‘roughly equivalent’, terms.

Syntactic views on categories

In Chomskyan linguistics, categories are considered ‘syntactic categories’, although technically this term refers not only to lexical categories such as nouns and verbs but also to phrasal categories such as noun phrases and verb phrases.

Syntacticians primarily distinguish lexical categories from functional categories. The lexical categories are given as noun, verb, and adjective (e.g. Chomsky 1981: 48; Baker 2003); as noun, verb, adjective, and preposition (postposition) (e.g. Chomsky 1986a: 160; Chomsky 1986b: 2; Haegeman 1994: 146; Wunderlich 1996: 2); or as noun, verb, adjective, adverb, and preposition (e.g. Radford 1997: 37 f; Haegeman and Guéron 1999: 58). The functional categories include complementizers, determiners, pronouns, and auxiliaries (cf. e.g. Radford 1997b: 45), as cited in Rauh (2010).

Lyons (1966) characterizes a category as a ‘form-class’, defined ‘in terms of its syntactic equivalence’. He states:

“Two forms, fi and fj, are members of the same form-class Fx if and only if they are intersubstitutable (i.e. have the same distribution) throughout the sentences of the language.”

Consider the following examples:

  • John is a doctor. (Substituting Bill for John yields the next sentence.)
  • Bill is a doctor.

John and Bill thus belong to the same form class, namely N(oun), because they can be substituted for each other in English sentences.
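The intersubstitutability criterion can be sketched in a few lines of Python. This is only a minimal illustration: the toy corpus and the name form_classes are assumptions made here, and Lyons's definition ranges over all the sentences of a language rather than over five of them.

```python
from collections import defaultdict

# Toy corpus (an assumption for illustration, not data from the module).
SENTENCES = [
    "John is a doctor",
    "Bill is a doctor",
    "John sleeps",
    "Bill sleeps",
    "Mary sleeps",
]

def form_classes(sentences):
    """Group words that occur in exactly the same set of sentence frames."""
    frames = defaultdict(set)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            # A frame is the sentence with the word's own slot left open.
            frame = tuple(words[:i]) + ("_",) + tuple(words[i + 1:])
            frames[word].add(frame)
    classes = defaultdict(list)
    for word, word_frames in frames.items():
        classes[frozenset(word_frames)].append(word)
    return [sorted(members) for members in classes.values()]

for members in form_classes(SENTENCES):
    print(members)
```

In this corpus John and Bill end up in one class because they fill exactly the same frames. Mary is kept apart only because the corpus never shows her in the frame "_ is a doctor", which shows how strict the criterion is when the sample of sentences is small.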

Cognitive approaches to categorization

Cognitive psychologist Rosch (e.g. 1973a,b, 1977a,b, 1978; Rosch and Mervis 1975), as cited in Rauh (2010), postulates ‘a different basis for the human categorization of the real world.’ She claims that cognitive categories do not correspond to a set of entities featuring identical properties. Instead, they tend to exhibit a prototype structure. For example, according to this theory the category Bird is organized around a prototype and covers all the birds named in a given language, even though they resemble the prototype to different degrees.

Rosch and Mervis (1975), cited in Rauh (2010), call this ‘family resemblance’. The idea is based on Wittgenstein’s statements on categorial structures and the relationship between the members of a category (1953 [2001]: 27e f.). ‘Family resemblance’ simply denotes that the entities may share several properties or only one property with each other and/or with the prototype, but that they do not share all the relevant properties.
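The idea of partial overlap with a prototype can be made concrete with a small sketch. The attribute sets below are hypothetical and chosen only for illustration, and the function resemblance is a simplification assumed here, not the measure used by Rosch and Mervis.

```python
# Hypothetical attribute sets, chosen only for illustration.
PROTOTYPE = {"has feathers", "lays eggs", "flies", "sings", "small"}
MEMBERS = {
    "robin": {"has feathers", "lays eggs", "flies", "sings", "small"},
    "ostrich": {"has feathers", "lays eggs"},
    "penguin": {"has feathers", "lays eggs", "swims"},
}

def resemblance(member_attributes, prototype=PROTOTYPE):
    """Share of the prototype's attributes that a member exhibits."""
    return len(member_attributes & prototype) / len(prototype)

for name, attributes in MEMBERS.items():
    print(name, resemblance(attributes))
```

Under these assumptions the robin matches the prototype completely, while the ostrich and the penguin share only some of its attributes and still count as members of the category.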

Combinatorial approach

Taylor (1995) takes a combinatorial path, in the sense that he considers the categorial structure of ‘grammatical categories’, which include ‘word classes – the traditional parts of speech’ and ‘syntactic categories like noun phrase’. He proposes that grammatical categories have ‘a prototype structure, with central members sharing a range of both syntactic and semantic attributes’. Following prototype theory, other category members may then share properties either with the core members or with one another, but they need not exhibit all the properties of the core members.

Consider examples from English:

1) John is playing football.
2) Samir’s uncle is sleeping.
3) The destruction of the city is farfetched.

The items John, Samir’s uncle and destruction all belong to a single prototype category, Noun, because they share at least one property. John and Samir’s uncle both refer to some person in this world. Destruction falls into the same prototype category even though it refers to an action, because according to the morphology of English the suffix -ion turns a verb into a noun. In the following section, the traditional notions of categories are taken up and the problems pertaining to this classification are discussed.

Part of Speech

According to Carnie (2002), parts of speech tell us ‘how a word is going to function in the sentence.’ The most common parts of speech discussed in descriptive grammars are nouns, verbs, adjectives, adverbs and prepositions. This raises a pertinent question: how do we classify them? Traditional definitions often fail to account for certain classifications. Consider the following examples:

4) The assassination of the prime minister
5) Singing is a quality.

The words assassination in 4) and singing in 5) cannot be classified as nouns if we restrict our notion of a noun to the name of a place, object or thing. Carnie (2002) argues that semantic criteria often capture the parts of speech inadequately. Consider the following example from Bangla.

6) Choto-Ta    douro-cch-e
   small-CLF   run-HAB.PRS-3P
   ‘The small one is running.’

Normally, we expect nouns to occupy the subject position. In 6), an adjective instead of a noun occupies the nominal position, which indicates that a word may change its category depending on the position it occupies in a sentence.

Based on such examples, syntacticians propose distributional criteria for identifying parts of speech. The criteria comprise morphological distribution and syntactic distribution. The first refers to morphological marking: the affixes that appear with a particular word help to identify its category. For example, the suffix -ing in English can appear either as a derivational morpheme or as an inflectional morpheme. However, the categorial nature of a word in -ing is ultimately decided by where it occurs in a sentence. If singing appears at the beginning of a sentence, it is highly likely to act as a derived nominal or gerund, as shown in 5) above. If it appears after a subject and an auxiliary, however, it shows up as a verb, as in 7).

7) Rakhi was singing.

Classification of the major parts of speech based on morphology and syntax (Carnie, 2002)

Nouns
  • Morphological distribution: derivational endings such as -ment (basement), -ness (friendliness), -ity (sincerity), -ty (certainty), -(t)ion (devotion), -ation (expectation), -ist (specialist), -ant (attendant); plural endings such as -s (cats), -es (glasses), -en (oxen), -ren (children), -i (cacti), -a (addenda). In Hindi, -e as in larke (boys).
  • Syntactic distribution: nouns often appear after determiners such as the, those, these (e.g. these peanuts / yeh kitabe ‘these books’) and can appear after adjectives (the big peanut / woh lal topi ‘the red cap’). Nouns can also follow prepositions (in school) or precede postpositions (school se ‘from school’).

Verbs
  • Morphological distribution: derivational endings such as -ate (dissipate) and -ize/-ise (regularize); the -ed or -t ending for the past tense; the -s ending in the third person singular (he, she, it) of the present tense. In Hindi, gender and person marking: larka jat-a he ‘boy goes’, larki jat-i he ‘girl goes’.
  • Syntactic distribution: verbs can follow auxiliaries and modals such as will, have, having, had, has, am, be, been, being, is, are, were, was, would, can, could, shall, should, and the special infinitive marker to. In Hindi, verbs can follow negation: me khana nehi khaunga ‘I will not eat’.

Adjectives
  • Morphological distribution: derivational endings such as -ing (the dancing cat), -ive (indicative), -able (readable), -al (traditional), -ate (intimate), -ish (childish), -some (tiresome), -(i)an (reptilian), -ful (wishful), -less (selfless), -ly (friendly); -er and -est as in happier and happiest. In Hindi, wala or wali is added, as in hariwali chatni (green chatney), nimbuwali chay (lemon tea), khanewala admi (food loving person).
  • Syntactic distribution: adjectives can appear between determiners such as the, a, these etc. and nouns (the big peanut; Hindi yeh bara talav ‘this big lake’).

Adverbs
  • Morphological distribution: many adverbs end in -ly (quickly, frequently). In Indian languages adverbs are often reduplicated: dhire dhire (slowly slowly).
  • Syntactic distribution: most easily described by stating where adverbs can’t appear. Adverbs can’t appear between a determiner and a noun (*the quickly fox; *woh jaldi billi ‘that quickly cat’).
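The interplay of the two criteria can be sketched as follows. The suffix lists are abridged from the classification above. The function candidates and its decision order (syntax has the last word, morphology narrows the choice where the two agree) are assumptions made for this illustration, not a procedure given by Carnie (2002).

```python
# Suffix lists abridged from the classification above; the selection is illustrative.
SUFFIXES = {
    "N": ("ment", "ness", "ity", "ty", "ion", "ist", "ant"),
    "V": ("ate", "ize", "ise", "ed"),
    "Adj": ("ing", "ive", "able", "al", "ish", "some", "ful", "less", "ly"),
    "Adv": ("ly",),
}
DETERMINERS = {"the", "a", "those", "these"}
AUXILIARIES = {"will", "have", "had", "has", "am", "be", "been", "is", "are",
               "was", "were", "would", "can", "could", "shall", "should", "to"}

def candidates(word, previous=None):
    """Candidate categories for a word, given the word that precedes it."""
    word = word.lower()
    previous = previous.lower() if previous else None
    # Morphological distribution: which suffix lists match the word?
    morphology = {cat for cat, endings in SUFFIXES.items() if word.endswith(endings)}
    # Syntactic distribution: what may follow the preceding word?
    if previous in AUXILIARIES:
        syntax = {"V"}
    elif previous in DETERMINERS:
        syntax = {"N", "Adj"}
    else:
        syntax = set()
    # Syntax has the last word; morphology narrows it down where the two agree.
    return (morphology & syntax) or syntax or morphology

print(candidates("singing", previous="was"))       # the context of 7)
print(candidates("friendliness", previous="the"))
print(candidates("quickly"))
```

For singing after was, the suffix alone points to an adjective, but the position after an auxiliary decides in favour of a verb. For quickly with no context, the suffix -ly leaves the choice between adjective and adverb open, which is why morphology alone is not sufficient.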

Categories or parts of speech take part in the formation of phrases and sentences. In the next section, I explain two sets of rules proposed by Bhatt (2008) for understanding categories in terms of sentence formation in a language.

Two sets of rules for categories

Phrase structure rules involve categorization: these rules can either generate a category from another category or simply allow a category to be filled in by lexical items. Essentially, this predicts that the grammar of any language is a combination of categorial rules and lexical rules.

Categorial rules
S -> NP (Aux) VP
NP -> (Det) (AdjP) N (PP)
VP -> V (NP) (PP) (AdvP)
AdjP -> (AdjP) Adj
AdvP -> Adv
PP -> P (NP)

Lexical rules
N -> girl, boy, John
V -> see, like, sleep
Adj -> big, red
Adv -> quickly
Aux -> am, can
P -> to, from
Det -> this, the
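The two rule sets can be turned into a small recognizer that checks whether a string of words is derivable from S. This is a minimal sketch under stated assumptions: the encoding of optional symbols as (symbol, True) pairs and the names expansions and generates are choices made here, and the toy lexicon contains only the base forms listed in the lexical rules, so an inflected form such as likes is not recognized.

```python
from functools import lru_cache
from itertools import product

# Categorial rules; True marks a bracketed (optional) symbol.
CATEGORIAL = {
    "S":    [("NP", False), ("Aux", True), ("VP", False)],
    "NP":   [("Det", True), ("AdjP", True), ("N", False), ("PP", True)],
    "VP":   [("V", False), ("NP", True), ("PP", True), ("AdvP", True)],
    "AdjP": [("AdjP", True), ("Adj", False)],
    "AdvP": [("Adv", False)],
    "PP":   [("P", False), ("NP", True)],
}

# Lexical rules (lower-cased for matching).
LEXICAL = {
    "N": {"girl", "boy", "john"},
    "V": {"see", "like", "sleep"},
    "Adj": {"big", "red"},
    "Adv": {"quickly"},
    "Aux": {"am", "can"},
    "P": {"to", "from"},
    "Det": {"this", "the"},
}

def expansions(rule):
    """Every right-hand side obtained by keeping or dropping optional symbols."""
    choices = [((sym,), ()) if optional else ((sym,),) for sym, optional in rule]
    return {sum(combo, ()) for combo in product(*choices)}

RHS = {cat: expansions(rule) for cat, rule in CATEGORIAL.items()}

def generates(sentence, start="S"):
    """True if the rules can derive the sentence from the start category."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def category_covers(cat, i, j):
        if cat in LEXICAL:
            return j == i + 1 and words[i] in LEXICAL[cat]
        return any(sequence_covers(rhs, i, j) for rhs in RHS[cat])

    @lru_cache(maxsize=None)
    def sequence_covers(rhs, i, j):
        if not rhs:
            return i == j
        # Each symbol covers at least one word, so leave room for the rest.
        last_split = j - (len(rhs) - 1)
        return any(category_covers(rhs[0], i, k) and sequence_covers(rhs[1:], k, j)
                   for k in range(i + 1, last_split + 1))

    return bool(words) and category_covers(start, 0, len(words))

print(generates("the girl can like John"))
print(generates("the big red boy sleep quickly"))
print(generates("girl the like"))
```

The recursive rule AdjP -> (AdjP) Adj is what allows stacked adjectives such as big red. The recognizer handles it by requiring every category to cover at least one word, so each recursive step works on a shorter stretch of the sentence.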

Some of the English sentences that rules of this kind generate are given below:

8) The girl likes John.

9) The big boy is

10) John is coming from school.

What is observed in 8)-10) is that these sentences are not a mere summation of lexical items. Words are selected by categories, and these categories are combined hierarchically. This hierarchy is explained through a technical notion in syntax known as the constituent.

Constituent

As Carnie (2002) puts it, a constituent is a group of words that functions together as a unit. In example 8) above, we can see that ‘the’ and ‘girl’ are closely related and are capable of forming one unit. Likewise, likes and John combine as a unit, John being the direct object of the verb. This structure can be shown as a syntactic tree, given here in labelled bracket notation in 11).

11) [S [NP [Det The] [N girl]] [VP [V likes] [NP [N John]]]]

In order to determine the constituents, Bhatt (2008) formulates the following principles:
  • A set of nodes forms a unit called a constituent if a common node exhaustively dominates them.
  • A is a constituent of B if B dominates A.
From 11) it can be said that a sentence is the highest-level constituent, noted here as S. S comprises two constituents, i.e. an NP and a VP. In turn, the subject NP constituent is formed by combining two categories, a determiner and a noun. The VP comprises a category (the verb) and another NP constituent.
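The dominance principle can be checked mechanically once the tree is written down. The sketch below assumes that tree 11) has the shape just described (S over NP and VP, the subject NP over Det and N, the VP over V and an object NP). The nested-tuple encoding and the function names are choices made for this illustration.

```python
# Tree 11) as nested tuples: (label, child, child, ...); leaves are words.
TREE = ("S",
        ("NP", ("Det", "The"), ("N", "girl")),
        ("VP", ("V", "likes"), ("NP", ("N", "John"))))

def words_under(node):
    """The words exhaustively dominated by a node, in order."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in words_under(child)]

def constituents(node):
    """Yield (label, words) for every labelled node of the tree."""
    if isinstance(node, str):
        return
    yield node[0], " ".join(words_under(node))
    for child in node[1:]:
        yield from constituents(child)

def is_constituent(tree, string):
    """True if some single node exhaustively dominates exactly this string."""
    return any(words == string for _, words in constituents(tree))

for label, words in constituents(TREE):
    print(label, "->", words)
```

With this encoding, The girl and likes John are constituents because a single node (NP and VP respectively) dominates each of them exhaustively, whereas girl likes is not, since no node dominates just those two words.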

Constituency tests

In the section above, we noted that a set of nodes dominated by a single node is known as a constituent. Thus NPs, VPs, AdjPs, AdvPs and PPs are all constituents, and these constituents, in various combinations, ultimately form a bigger constituent known as S or sentence.

We now turn to the question of how to decide whether a combination of nodes is a constituent. The diagnostics used for this purpose are known as tests for constituency.

Replacement test

If a group of words can be replaced by a single word or lexical item, then the group of words qualifies as a constituent. This is called the replacement test.

12) The big red balloon is flying high in the sky.
13) It is flying high in the sky.

Since It in 13) can replace the entire unit The big red balloon in 12), the unit stands as a constituent, namely an NP constituent.

Stand alone or fragment test

If a group of words can be produced as an answer to a question, then that group of words forms one constituent.

14) John ate at the rooftop view restaurant.
15) Where did John eat?

The answer to 15) will be at the rooftop view restaurant, but not *at or *at the. Hence, we can clearly say that at the rooftop view restaurant is a constituent.

Movement test

If a group of words can be moved within a sentence, then that confirms the group as a constituent. There are three kinds of movement tests.

i) Clefting involves putting a string of words between It was (or It is) and that at the beginning of the sentence.

16) Mohan bought a new shirt.
17) It is a new shirt that Mohan bought.

ii) Preposing involves moving the string of words to the front of the sentence. (The related pseudoclefting test puts the string of words before is/are what or is/are who at the front of the sentence.)

18) A new shirt, Mohan bought.

iii) Constituenthood can also be attested through passivization. In passives, the subject NP and the object NP interchange positions, which establishes them as constituents.

19) A new shirt was bought by Mohan.
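The clefting test lends itself to a small helper that builds the test sentence for any candidate string. This is a sketch only: the function cleft is an assumption made here, and it merely produces the cleft, while the judgement of whether the result is acceptable still rests with the speaker.

```python
def cleft(sentence, candidate):
    """Build the cleft 'It is X that ...' for a candidate string X."""
    words = sentence.rstrip(".").split()
    target = candidate.split()
    for start in range(len(words) - len(target) + 1):
        if words[start:start + len(target)] == target:
            # Everything outside the candidate string stays in its original order.
            rest = words[:start] + words[start + len(target):]
            return f"It is {candidate} that {' '.join(rest)}."
    raise ValueError("candidate string not found in sentence")

print(cleft("Mohan bought a new shirt.", "a new shirt"))
print(cleft("Mohan bought a new shirt.", "bought a"))
```

The first call yields the acceptable cleft It is a new shirt that Mohan bought. The second yields the unacceptable *It is bought a that Mohan new shirt, which suggests that bought a is not a constituent.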

Ellipsis test

The ellipsis test is widely used to establish the constituenthood of the VP. In this test, the VP is dropped from the second clause and replaced with did so (too). For this reason the test is also called the did so test.

20) Subbarao went to the shop and Mohanty did so too (= went to the shop).
21) Sita-ne Ram ko bulawa aur Gopi-ne vi (Ram ko bulawa)
    Sita-ERG Ram-ACC called and Gopi-ERG too
    ‘Sita called Ram and Gopi did so too.’

Coordination test

Coordinate structures are constituents linked by a conjunction such as and or or. Only constituents of the same syntactic category can be conjoined: an NP can be conjoined with another NP and a VP with another VP, but neither can be conjoined with, say, an AdjP.

22) [The woman] and [the man] went to the shop yesterday.
23) The woman [went to the shop] and [bought ice creams].
24) [The woman] and [*the grey] went to the shop yesterday.
25) [The woman] went to the shop and [*the purple].
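The same-category condition on coordination can be stated as a simple check on the labels of the two conjuncts. In this sketch the phrases are hand-labelled nested tuples, and the function coordinate, the label Conj and the example phrases are assumptions introduced for illustration.

```python
# Hand-labelled phrases: (label, child, child, ...); leaves are words.
WOMAN = ("NP", ("Det", "the"), ("N", "woman"))
MAN = ("NP", ("Det", "the"), ("N", "man"))
PURPLE = ("AdjP", ("Adj", "purple"))

def coordinate(left, right, conjunction="and"):
    """Conjoin two labelled phrases if they belong to the same category."""
    if left[0] != right[0]:
        raise ValueError(f"cannot conjoin {left[0]} with {right[0]}")
    # The coordinate structure carries the category of its conjuncts.
    return (left[0], left, ("Conj", conjunction), right)

print(coordinate(WOMAN, MAN)[0])
```

Conjoining the two NPs yields a larger NP, while an attempt to conjoin the NP with the AdjP raises an error, mirroring the contrast between the acceptable and the starred examples above.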

Summary

In this module, categories and constituents, which are considered the objects of syntax (phrase and clause formation), are discussed.

  • A category is a minimal unit in syntax that selects a particular word or lexical item to be put under it.
  • Lexical items that share properties are combined into the same category.
  • According to the typologists, the parts of speech comprise ten traditional categories, namely Noun, Verb, Adjective, Adverb, Pronoun, Preposition/Adposition, Conjunction, Numeral, Article, and Interjection.
  • Syntacticians primarily distinguish lexical categories, e.g. noun, verb, preposition and adjective, from functional categories.
  • The cognitive view entails that categories do not correspond to a set of entities featuring identical properties; instead, they tend to exhibit a prototype structure.
  • Parts of speech tell us how a word is going to function in the sentence.
  • The current argument shows that categories can be identified in terms of their morphological and syntactic distribution in a language.
  • Two sets of rules in grammar can explain the categories: categorial rules and lexical rules.
  • A constituent is a group of words that functions together as a unit. NPs, VPs, AdjPs, AdvPs and PPs are all constituents.
  • If a group of words can be replaced by a single word or lexical item, then the group of words qualifies as a constituent. This is called the replacement test.
  • If a group of words can be produced as an answer to a question, then that group of words forms one constituent.
  • If a group of words can be moved within a sentence, then that confirms the group as a constituent. Constituenthood can also be attested through passivization.
  • The ellipsis test is widely used to establish the constituenthood of the VP. In this test, the VP is dropped from the second clause and replaced with did so (too).
  • Coordinate structures are constituents linked by a conjunction such as and or or. Only constituents of the same syntactic category can be conjoined.

References

  • Adger, D. (2002). Core Syntax: A Minimalist Approach.
  • Bhatt, R. (2008). Phrase Structure Rules, Tree Rewriting and Recursion. Amherst: UMASS.
  • Carnie, A. (2002). Syntax: A Generative Introduction. Wiley-Blackwell.
  • Haspelmath, M. (2001). Word Classes and Parts of Speech. International Encyclopedia of the Social and Behavioral Sciences, 16538–16545.
  • Lyons, J. (1966). Towards a ‘Notional’ Theory of the ‘Parts of Speech’. Journal of Linguistics, 2, 209–236.
  • Rauh, G. (2010). Syntactic Categories. Oxford University Press.
  • Taylor, J. R. (1995). Linguistic Categorization: Prototypes in Linguistic Theory. Oxford: Clarendon Press.