Chapter 2 Words

2.1 Introduction

This chapter presents the types of words that we aim to distinguish. We do this by introducing the word classes to which words belong. In addition, some words, notably verbs, take a grammatical code that supplements their word class information.

As seen in the queries of section 1.7, we capture the parse consequences for a word by having it as an item in a word list together with its word class and any grammatical code. The work of this chapter is to establish the initial parse outcome of encountering a class(code)/word pairing in a word list.

2.2 Noun words

We start off with the noun rules of (2.1) for establishing the contribution of the common noun words and proper noun words of Table 2.1.

`NS`	plural common noun (e.g., `children`, `revelations`, `times`, `wishes`)
`N`	common noun not subclassified as `NS`, that is, either singular (e.g., `child`, `revelation`, `time`, `wish`) or neutral for number (e.g., `committee`, `fish`, `information`)
`NPRS`	plural proper noun (e.g., `Clintons`, `Koreas`)
`NPR`	proper noun not subclassified as `NPRS`, that is, either singular (e.g., `Clinton`, `Tokyo`) or neutral for number (e.g., `Andes`, `IBM`)

Table 2.1: Tags for common noun words and proper noun words

(2.1): noun([node('NS',[node(Word,[])])|L],L) -->
  [w('NS',Word)].
noun([node('N',[node(Word,[])])|L],L) -->
  [w('N',Word)].
noun([node('NPRS',[node(Word,[])])|L],L) -->
  [w('NPRS',Word)].
noun([node('NPR',[node(Word,[])])|L],L) -->
  [w('NPR',Word)].

Prolog query (2.2) asks whether a word list with w('NS','boys') has content for a successful parse of noun structure.

(2.2): | ?- tphrase_set_string([w('NS','boys')]), parse(noun).

(NS boys)

yes
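
The outcome of (2.2) can be approximated without the parse machinery of section 1.7. What follows is a minimal sketch, assuming a Prolog system with standard DCG translation and phrase/2 (for example, SWI-Prolog). It copies two of the noun rules of (2.1) and calls them directly, so phrase/2 stands in for tphrase_set_string and parse, whose internals are not reproduced here. The helper name demo_noun is hypothetical.

```prolog
% Two of the noun rules of (2.1), copied unchanged.
noun([node('NS',[node(Word,[])])|L],L) -->
  [w('NS',Word)].
noun([node('N',[node(Word,[])])|L],L) -->
  [w('N',Word)].

% demo_noun(+WordList,-Nodes): hypothetical helper, not from the text.
% Passing [] as the second argument of noun closes the open tail of
% the difference list, so Nodes is an ordinary list of tree nodes.
demo_noun(WordList,Nodes) :-
  phrase(noun(Nodes,[]),WordList).

% ?- demo_noun([w('NS','boys')],Nodes).
% succeeds, binding Nodes to [node('NS',[node(boys,[])])]
%
% ?- demo_noun([w('ADJ','large')],Nodes).
% fails, since no noun rule matches the tag 'ADJ'
```

The node term built here corresponds to the (NS boys) result printed for (2.2).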

When we connect the noun rules of (2.1) to rules for parsing the internal content of noun phrases in section 3.2, we will find that they contribute nouns which can have a role within the wider noun phrase that makes them either:

the head for the noun phrase
a noun modifier of the noun phrase head

There are also the words of Table 2.2 that are noun phrase head words that cannot have premodifiers. The noun_head rules of (2.3) distinguish this kind of word at the rule level.

`Q;_nphd_`	indefinite pronoun with quantification, which can be a compound (e.g., `everybody`, `nothing`) or a word which often occurs with the preposition `of` (e.g., `much`, `many`, `a_lot`)
`D;_nphd_`	indefinite pronoun not subclassified as `Q` (e.g., `someone`, `anything`, `another`) and demonstrative pronoun (e.g., `this`, `that`, `these`, `those`)

Table 2.2: Tags for noun phrase head words that cannot have premodifiers

(2.3): noun_head([node('Q;_nphd_',[node(Word,[])])|L],L) -->
  [w('Q;_nphd_',Word)].
noun_head([node('D;_nphd_',[node(Word,[])])|L],L) -->
  [w('D;_nphd_',Word)].

Additionally, there are the noun phrase head words of Table 2.3 that give full content for noun phrases that cannot have modifiers of the noun head. To distinguish these head words, there are the noun_head_full rules of (2.4).

`PNX`	reflexive pronoun (e.g., `myself`, `yourself`, `itself`, `ourselves`) or reciprocal pronoun (`each_other`, `one_another`)
`PRO`	personal pronoun (e.g., `I`, `you`, `them`, `us`)
`PRO;_expletive_`	expletive `it`, e.g., occurring in a weather construction (it's raining)
`PRO;_ppge_`	nominal possessive personal pronoun (e.g., `mine`, `yours`, `ours`)
`WPRO`	wh-pronoun (`what`, `who`, `whom`)
`RPRO`	relative pronoun (`which`, `who`, `whom`, `that`)

Table 2.3: Tags for noun phrase head words that are full content for noun phrases

(2.4): noun_head_full(non_privileged,[node('PNX',[node(Word,[])])|L],L) -->
  [w('PNX',Word)].
noun_head_full(Type,[node('PRO',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,established])
  },
  [w('PRO',Word)].
noun_head_full(Type,[node('PRO;_expletive_',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,established])
  },
  [w('PRO;_expletive_',Word)].
noun_head_full(Type,[node('NP-GENV',[node('PRO;_ppge_',[node(Word,[])])])|L],L) -->
  {
    member(Type,[non_privileged,established])
  },
  [w('PRO;_ppge_',Word)].
noun_head_full(Type,[node('WPRO',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,interrogative])
  },
  [w('WPRO',Word)].
noun_head_full(relative,[node('RPRO',[node(Word,[])])|L],L) -->
  [w('RPRO',Word)].

Note how [rule 4] of (2.4) involves the word class with code PRO;_ppge_ that results in the projection of a genitive case marked noun phrase (NP-GENV) that goes on to be the only element of a containing noun phrase, as seen with NP-SBJ in the parse result of (2.5).

(2.5): | ?- tphrase_set_string([w('PRO;_ppge_','Theirs'), w('BEP',';~La','is'), w('ADJ','large'), w('PUNC','.')]), parse(sentence).

(IP-MAT (NP-SBJ (NP-GENV (PRO;_ppge_ Theirs)))
        (BEP;~La is)
        (ADJP-PRD2 (ADJ large))
        (PUNC .))

yes

Section 2.4 below will introduce other words that lead to genitive case marked noun phrases, with the difference that the projected genitive noun phrases will themselves need to occur inside noun phrases where there is head content to modify.

We should also note how the noun_head_full rules of (2.4) are of four types that depend on the value of the first parameter:

non_privileged
established
interrogative
relative

Question

What do the results of (2.6)–(2.21) tell us about where words with the different word classes of Table 2.3 can occur?

(2.6): | ?- tphrase_set_string([w('PNX','itself')]), parse(noun_head_full(non_privileged)).

(PNX itself)

yes

(2.7): | ?- tphrase_set_string([w('PNX','itself')]), parse(noun_head_full(established)).

no

(2.8): | ?- tphrase_set_string([w('PNX','itself')]), parse(noun_head_full(interrogative)).

no

(2.9): | ?- tphrase_set_string([w('PNX','itself')]), parse(noun_head_full(relative)).

no

(2.10): | ?- tphrase_set_string([w('PRO','it')]), parse(noun_head_full(non_privileged)).

(PRO it)

yes

(2.11): | ?- tphrase_set_string([w('PRO','it')]), parse(noun_head_full(established)).

(PRO it)

yes

(2.12): | ?- tphrase_set_string([w('PRO','it')]), parse(noun_head_full(interrogative)).

no

(2.13): | ?- tphrase_set_string([w('PRO','it')]), parse(noun_head_full(relative)).

no

(2.14): | ?- tphrase_set_string([w('WPRO','who')]), parse(noun_head_full(non_privileged)).

(WPRO who)

yes

(2.15): | ?- tphrase_set_string([w('WPRO','who')]), parse(noun_head_full(established)).

no

(2.16): | ?- tphrase_set_string([w('WPRO','who')]), parse(noun_head_full(interrogative)).

(WPRO who)

yes

(2.17): | ?- tphrase_set_string([w('WPRO','who')]), parse(noun_head_full(relative)).

no

(2.18): | ?- tphrase_set_string([w('RPRO','who')]), parse(noun_head_full(non_privileged)).

no

(2.19): | ?- tphrase_set_string([w('RPRO','who')]), parse(noun_head_full(established)).

no

(2.20): | ?- tphrase_set_string([w('RPRO','who')]), parse(noun_head_full(interrogative)).

no

(2.21): | ?- tphrase_set_string([w('RPRO','who')]), parse(noun_head_full(relative)).

(RPRO who)

yes

2.3 Determiner words

The det rules of (2.22) establish the contribution of the determiner words of Table 2.4. These words can only serve as noun head premodifiers. There can be at most one occurrence of such a word following on from an initial noun phrase layer (see section 3.2.2).

`Q`	quantifier (e.g., `every`, `no`)
`D`	determiner, which includes articles (e.g., `a`, `the`) and demonstratives (e.g., `this`, `that`)
`WD`	wh-determiner (e.g., `which`, `what`, `whichever`)
`RD`	relative determiner of a relative clause (e.g., `what`, `whatever`)

Table 2.4: Tags for determiner words

(2.22): det(Type,[node('Q',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,established])
  },
  [w('Q',Word)].
det(Type,[node('D',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,established])
  },
  [w('D',Word)].
det(Type,[node('WD',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,interrogative])
  },
  [w('WD',Word)].
det(relative,[node('RD',[node(Word,[])])|L],L) -->
  [w('RD',Word)].

Like the words detected by the noun_head_full rules of (2.4), words detected by the det rules of (2.22) are of four types that depend on the value of the first parameter:

non_privileged
established
interrogative
relative
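
The member/2 guards in (2.22) both check a given Type value and enumerate the admissible values when Type is left unbound. The sketch below illustrates this, assuming standard DCG translation and phrase/2 (for example, SWI-Prolog) in place of the parse call of section 1.7. The helper det_types is hypothetical and serves only to collect the Type values that a given tag allows.

```prolog
:- use_module(library(lists)).

% The det rules of (2.22), copied unchanged apart from layout.
det(Type,[node('Q',[node(Word,[])])|L],L) -->
  { member(Type,[non_privileged,established]) },
  [w('Q',Word)].
det(Type,[node('D',[node(Word,[])])|L],L) -->
  { member(Type,[non_privileged,established]) },
  [w('D',Word)].
det(Type,[node('WD',[node(Word,[])])|L],L) -->
  { member(Type,[non_privileged,interrogative]) },
  [w('WD',Word)].
det(relative,[node('RD',[node(Word,[])])|L],L) -->
  [w('RD',Word)].

% det_types(+Tag,+Word,-Types): hypothetical helper, not from the text.
% Collects every Type value under which a one-word list parses as det.
det_types(Tag,Word,Types) :-
  findall(Type,phrase(det(Type,_,[]),[w(Tag,Word)]),Types).

% ?- det_types('D',the,Types).     Types = [non_privileged,established]
% ?- det_types('WD',which,Types).  Types = [non_privileged,interrogative]
% ?- det_types('RD',what,Types).   Types = [relative]
```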

2.4 Genitive markers

A genitive noun phrase which itself acts as the premodifier of an external noun phrase head follows from the presence of either a genitive marker or a genitive pronoun word from Table 2.5. Phrase structure integration occurs with genm of (2.23) and pronoun_genm of (2.24).

`GENM`	genitive marker (`<apos>s` or `<apos>`)
`PRO;_genm_`	possessive pronoun, pre-nominal (`my`, `your`, `our`)
`WPRO;_genm_`	genitive wh-pronoun (`whose`)
`RPRO;_genm_`	genitive relative pronoun (`whose`)

Table 2.5: Tags for genitive markers

(2.23): genm([node('GENM',[node(Word,[])])|L],L) -->
  [w('GENM',Word)].

(2.24): pronoun_genm(Type,[node('PRO;_genm_',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,established])
  },
  [w('PRO;_genm_',Word)].
pronoun_genm(Type,[node('WPRO;_genm_',[node(Word,[])])|L],L) -->
  {
    member(Type,[non_privileged,interrogative])
  },
  [w('WPRO;_genm_',Word)].
pronoun_genm(relative,[node('RPRO;_genm_',[node(Word,[])])|L],L) -->
  [w('RPRO;_genm_',Word)].

Words detected by the pronoun_genm rules of (2.24) are of four types that depend on the value of the first parameter:

non_privileged
established
interrogative
relative

2.5 Subject indicating words

Except when the clause Type setting is imperative_clause, there is always the requirement that a finite clause should include a subject. The rules of (2.25) integrate the subject requirement, provided the clause is not a main clause interrogative whose subject contains an interrogative word.

(2.25): subject(there_sbj,[node('EX',[node(Word,[])])|L],L) -->
  [w('EX',Word)].
subject(cleft_sbj,[node('NP-SBJ',[node('PRO;_cleft_',[node(Word,[])])])|L],L) -->
  [w('PRO;_cleft_',Word)].
subject(provisional_sbj,[node('NP-SBJ',[node('PRO;_provisional_',[node(Word,[])])])|L],L) -->
  [w('PRO;_provisional_',Word)].
subject(filled_sbj,L,L0) -->
  noun_phrase('-SBJ',established,L,L0).
subject(derived_sbj,L,L0) -->
  noun_phrase('-SBJ',established,L,L0).

The first three rules of (2.25) allow for the subject to be a formal word from Table 2.6 that will set the value of the clause SbjType parameter to:

[rule 1] there_sbj
[rule 2] cleft_sbj
[rule 3] provisional_sbj

`EX`	existential `there`, i.e., there of the there is ... or there are ... construction co-occurring with an existential subject (`NP-ESBJ`)
`PRO;_cleft_`	cleft `it` occurring as part of a cleft construction (so it was you that got them together)
`PRO;_provisional_`	provisional `it` occurring with extraposition (it bothered her that she probably would never know)

Table 2.6: Tags for subject indicating words

The subject is more typically a contentful noun phrase, where this noun phrase will correspond to the ‘do-er’, ‘be-er’ or ‘have-er’ of the verb. [rule 4] of (2.25) identifies such a noun phrase while also setting the clause SbjType parameter to the value of:

[rule 4] filled_sbj

Also, in the case of a tough-construction (see section 6.6.2), a contentful noun phrase can lead to the clause SbjType parameter being set to the value of:

[rule 5] derived_sbj
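
The way a formal subject word fixes the SbjType value can be tried in isolation. The sketch below is a partial illustration only: it copies the first three rules of (2.25) and leaves out [rule 4] and [rule 5], because these depend on noun_phrase, which is not defined until chapter 3. It assumes standard DCG translation and phrase/2 (for example, SWI-Prolog) instead of the parse call of section 1.7.

```prolog
% [rule 1] to [rule 3] of (2.25), copied unchanged.
subject(there_sbj,[node('EX',[node(Word,[])])|L],L) -->
  [w('EX',Word)].
subject(cleft_sbj,[node('NP-SBJ',[node('PRO;_cleft_',[node(Word,[])])])|L],L) -->
  [w('PRO;_cleft_',Word)].
subject(provisional_sbj,[node('NP-SBJ',[node('PRO;_provisional_',[node(Word,[])])])|L],L) -->
  [w('PRO;_provisional_',Word)].

% With SbjType unbound, the matching rule head supplies its value.
%
% ?- phrase(subject(SbjType,Nodes,[]),[w('PRO;_cleft_',it)]).
% SbjType = cleft_sbj
% Nodes = [node('NP-SBJ',[node('PRO;_cleft_',[node(it,[])])])]
%
% ?- phrase(subject(SbjType,Nodes,[]),[w('EX',there)]).
% SbjType = there_sbj
% Nodes = [node('EX',[node(there,[])])]
```

As the two results show, the rules of (2.25) wrap cleft and provisional `it` in an NP-SBJ node, while existential `there` contributes a bare EX node.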

2.6 Adverb words

Table 2.7 sets out the range of support for adverb words. Parse integration follows from the adv rules of (2.26).

`ADV`	general adverb (e.g., `often`, `well`, `really`)
`ADVR`	comparative adverb (e.g., `more`, `less`, `farther`)
`ADVS`	superlative adverb (e.g., `most`, `least`, `farthest`)
`WADV`	wh-adverb (e.g., `how`, `when`, `where`, `why`)
`RADV`	relative adverb of a relative clause (e.g., `how`, `when`, `where`, `whereby`)
`RP`	adverbial particle (e.g., `up`, `off`, `out`)

Table 2.7: Tags for adverb words

(2.26): adv(established,[node('ADV',[node(Word,[])])|L],L) -->
  [w('ADV',Word)].
adv(established,[node('ADVR',[node(Word,[])])|L],L) -->
  [w('ADVR',Word)].
adv(established,[node('ADVS',[node(Word,[])])|L],L) -->
  [w('ADVS',Word)].
adv(interrogative,[node('WADV',[node(Word,[])])|L],L) -->
  [w('WADV',Word)].
adv(relative,[node('RADV',[node(Word,[])])|L],L) -->
  [w('RADV',Word)].
adv(particle,[node('RP',[node(Word,[])])|L],L) -->
  [w('RP',Word)].

The adv rules of (2.26) are of four types that depend on the value of the first parameter:

established
interrogative
relative
particle

2.7 Adjective words

Table 2.8 sets out the range of support for adjective words. Parse integration follows from the adj rules of (2.27).

`ADJ`	general adjective (e.g., `old`, `good`, `male`)
`ADJR`	comparative adjective (e.g., `older`, `better`)
`ADJS`	superlative adjective (e.g., `oldest`, `best`)
`ADJ;_cat_`	catenative adjective (`able` in be able to, `willing` in be willing to)

Table 2.8: Tags for adjective words

(2.27): adj(established,[node('ADJ',[node(Word,[])])|L],L) -->
  [w('ADJ',Word)].
adj(established,[node('ADJR',[node(Word,[])])|L],L) -->
  [w('ADJR',Word)].
adj(established,[node('ADJS',[node(Word,[])])|L],L) -->
  [w('ADJS',Word)].
adj(catenative,[node('ADJ;_cat_',[node(Word,[])])|L],L) -->
  [w('ADJ;_cat_',Word)].

The adj rules of (2.27) are of two types that depend on the value of the first parameter:

established
catenative

2.8 Verb words

This section covers how verb words are integrated into a parse. The verb rule of (2.28) matches verb words from the word list, collecting information for three parameters: Tag, Code and Word.

(2.28): verb(Infl,Code,[node(TagCode,[node(Word,[])])|L],L) -->
  [w(Tag,Code,Word)],
  {
    verb_tag(Infl,TagList),
    member(Tag,TagList),
    sub_atom(Tag,0,1,_,C),
    verb_code(C,Infl,Code),
    atom_concat(Tag,Code,TagCode)
  }.

There is a check of collected word information:

Tag has to be a member of TagList, which is a list of candidate tags obtained with a call to verb_tag of (2.29) below restricted by the inflection information of Infl inherited from the clause context
Code has to be a code licensed by a call to verb_code of (2.30) below restricted by the first character of Tag and the local clause inflection information

Once matched and licensed, the verb word information goes into the parse tree information accumulated with L.
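
The matching and licensing steps can be followed with a small self-contained sketch. It copies the verb rule of (2.28) unchanged, but pairs it with deliberately cut-down stand-ins for verb_tag and verb_code that keep only a few entries from (2.29) and (2.30). It also assumes standard DCG translation and phrase/2 (for example, SWI-Prolog) in place of the parse call of section 1.7.

```prolog
:- use_module(library(lists)).

% The verb rule of (2.28), copied unchanged.
verb(Infl,Code,[node(TagCode,[node(Word,[])])|L],L) -->
  [w(Tag,Code,Word)],
  {
    verb_tag(Infl,TagList),
    member(Tag,TagList),
    sub_atom(Tag,0,1,_,C),
    verb_code(C,Infl,Code),
    atom_concat(Tag,Code,TagCode)
  }.

% Cut-down stand-ins (illustration only): a subset of the entries
% of verb_tag of (2.29) and verb_code of (2.30).
verb_tag(finite,['VBP','VBD','DOP','DOD','HVP','HVD','BEP','BED']).
verb_tag(infinitive,['VB','DO','HV','BE']).

verb_code('V',_,Code) :-
  member(Code,[';~I',';~Tn']).
verb_code('B',_,Code) :-
  member(Code,[';~La',';~Ln']).

% ?- phrase(verb(Infl,Code,Nodes,[]),[w('BEP',';~La','is')]).
% Infl = finite, Code = ';~La'
% Nodes = [node('BEP;~La',[node(is,[])])]
%
% ?- phrase(verb(infinitive,_,_,[]),[w('BEP',';~La','is')]).
% fails, since 'BEP' is not among the infinitive tags
```

The node label 'BEP;~La' is the concatenation of Tag and Code, which matches the (BEP;~La is) leaf seen in the parse result of (2.5).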

The remainder of this section recognises verb classes with tags that further distinguish verb form (section 2.8.1), and establishes the grammatical codes that are compatible with combinations of verb classes and inflection information (section 2.8.2). Complement selection for grammatical codes is the topic of chapter 4.

2.8.1 Verb classes

The aim of this section is to provide tags to distinguish the following classes of verb words:

lexical verb words
DO verb words
HAVE verb words
BE verb words

Tags of the same class will have the same initial letter (V, D, H or B) and then vary to distinguish form:

present tense form
past tense form
infinitive form
present participle ({ing}) form
past participle ({ed}/{en}) form

Table 2.9 gives tags for the different forms of lexical verbs.

`VBP`	present tense form of lexical verbs (e.g., `reaches`, `supports`, `writes`, `sinks`, `puts`, `reach`, `support`, `write`, `sink`, `put`)
`VBD`	past tense form of lexical verbs (e.g., `reached`, `supported`, `wrote`, `sank`, `put`)
`VB`	infinitive form of lexical verbs (e.g., `reach`, `support`, `write`, `sink`, `put`)
`VAG`	present participle ({ing}) form of lexical verbs (used in the progressive construction) (e.g., `reaching`, `supporting`, `writing`, `sinking`, `putting`)
`VVN`	past participle ({ed}/{en}) form of lexical verbs (used in the perfect construction and the passive construction) (e.g., `reached`, `supported`, `written`, `sunk`, `put`)

Table 2.9: Tags for lexical verb words

Table 2.10 gives tags for the different forms of DO.

`DOP`	present tense forms of the verb DO: `do`, `does`, `<apos>s`
`DOD`	past tense form of the verb DO: `did`
`DO`	infinitive form of the verb DO: `do`
`DAG`	present participle form of the verb DO: `doing`
`DON`	past participle form of the verb DO: `done`

Table 2.10: Tags for DO

Table 2.11 gives tags for the different forms of HAVE.

`HVP`	present tense forms of the verb HAVE: `have`, `<apos>ve`, `has`, `<apos>s`
`HVD`	past tense forms of the verb HAVE: `had`, `<apos>d`
`HV`	infinitive form of the verb HAVE: `have`
`HAG`	present participle form of the verb HAVE: `having`
`HVN`	past participle form of the verb HAVE: `had`

Table 2.11: Tags for HAVE

Table 2.12 gives tags for the different forms of BE.

`BEP`	present tense forms of the verb BE: `is`, `am`, `are`, `<apos>m`, `<apos>re` and `<apos>s`
`BED`	past tense forms of the verb BE: `was` and `were`
`BE`	infinitive form of the verb BE: `be`
`BAG`	present participle form of the verb BE: `being`
`BEN`	past participle form of the verb BE: `been`

Table 2.12: Tags for BE

We can access the verb tag information of tables 2.9–2.12 on the basis of inflection information with verb_tag of (2.29).

(2.29): verb_tag(finite,['VBP','VBD','DOP','DOD','HVP','HVD','BEP','BED']).
verb_tag(imperative,['VB','DO','HV','BE']).
verb_tag(infinitive,['VB','DO','HV','BE']).
verb_tag(do_supported_infinitive,['VB','DO','HV']).
verb_tag(ing_participle,['VAG','DAG','HAG','BAG']).
verb_tag(en_participle,['VVN','DON','HVN','BEN']).

When seeking to match a verb word together with tag and verb code from the word list, the verb rule of (2.28) above calls the verb_tag rules of (2.29) to ensure that the tag is compatible with the inflection inherited from the Infl parameter.
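
The verb_tag facts can also be queried in the other direction, to see which inflection settings admit a given tag. The sketch below copies (2.29) and adds a hypothetical helper, infl_for_tag, that is not part of the grammar itself. It assumes a Prolog system that provides findall/3 and memberchk/2 (for example, SWI-Prolog).

```prolog
:- use_module(library(lists)).

% verb_tag of (2.29), copied unchanged.
verb_tag(finite,['VBP','VBD','DOP','DOD','HVP','HVD','BEP','BED']).
verb_tag(imperative,['VB','DO','HV','BE']).
verb_tag(infinitive,['VB','DO','HV','BE']).
verb_tag(do_supported_infinitive,['VB','DO','HV']).
verb_tag(ing_participle,['VAG','DAG','HAG','BAG']).
verb_tag(en_participle,['VVN','DON','HVN','BEN']).

% infl_for_tag(+Tag,-Infls): hypothetical helper, not from the text.
% Collects every Infl value whose tag list contains Tag.
infl_for_tag(Tag,Infls) :-
  findall(Infl,(verb_tag(Infl,Tags),memberchk(Tag,Tags)),Infls).

% ?- infl_for_tag('VB',Infls).
% Infls = [imperative,infinitive,do_supported_infinitive]
%
% ?- infl_for_tag('BE',Infls).
% Infls = [imperative,infinitive]
```

The second query reflects that 'BE' is absent from the do_supported_infinitive list of (2.29).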

2.8.2 Verb codes

This section outlines verb codes as tag label extensions. The codes allow for a distinction of verbs to reflect the selection criteria each verb has for its complements, detailed in chapter 4.

To form a handle on the complement information for main verbs, we adopt the verb code system from the fourth edition of the Oxford Advanced Learner's Dictionary (OALD4; Cowie 1989). In this dictionary, verb codes are matched to word sense definitions. The system is a mnemonic-based reworking of the earlier system of Hornby (1975). A code from the system has:

a capital letter (L, I, T, C or D; see Table 2.13) to signal the number and function of clause elements required by the main verb
zero or more lower case letters (a, n, p, pr, n/pr, n/a, t, f, w, g, i and r; see Table 2.14), possibly separated by the dot (‘.’) character, to represent information about the form of the required elements

For example, the La code marks clause structure (L) with a linking verb + a subject predicative constituent that is an adjective phrase (a). As an example with the dot character, the Cn.a code marks a complex-transitive verb in clause structure (C) with the complex-transitive verb + a direct object constituent that is a noun phrase (n) + an object predicative constituent that is an adjective phrase (a).
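
The anatomy of a code can be made concrete with a small decomposition sketch. It is not part of the grammar: the predicate code_parts is hypothetical, it works on the bare code without the ;~ prefix used in word lists, and it relies on the split mode of atomic_list_concat/3, which is specific to SWI-Prolog. It only separates the capital letter from the dot-separated remainder and makes no attempt to segment letter sequences that are written without a dot.

```prolog
% code_parts(+Code,-Capital,-Parts): hypothetical helper.
% Capital is the first character of Code. Parts holds the remainder
% split at each dot, or [] when nothing follows the capital letter.
code_parts(Code,Capital,Parts) :-
  sub_atom(Code,0,1,_,Capital),
  sub_atom(Code,1,_,0,Rest),
  (  Rest == ''
  -> Parts = []
  ;  atomic_list_concat(Parts,'.',Rest)   % SWI-Prolog split mode
  ).

% ?- code_parts('La',Capital,Parts).
% Capital = 'L', Parts = [a]
%
% ?- code_parts('Cn.a',Capital,Parts).
% Capital = 'C', Parts = [n,a]
%
% ?- code_parts('I',Capital,Parts).
% Capital = 'I', Parts = []
```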

`L`	Linking verb		selects a subject predicative (`-PRD2`), an element which provides information about the subject of the clause.
`I`	Intransitive verb		there is no selection of a subject predicative or an object, although there may be selection of an adverbial, an element which tells us about time, place, manner, etc. of the action of the verb.
`T`	Transitive verb	Mono-transitive verb	selects a direct object (`-OB1`), an element which often refers to the person or thing affected by the action of the verb.
`C`		Complex-transitive verb	selects both a direct object (`-OB1`) and an object predicative (`-PRD`), an element which provides more information about the direct object. Note: in the code, a dot divides information about the realisation of the direct object from information about the realisation of the object predicative.
`D`		Ditransitive verb	selects both a direct object (`-OB1`) and an indirect object (`-OB2`), an element which refers to a person who receives something or benefits from an action. Note: in the code, a dot divides information about the realisation of the direct object from information about the realisation of the indirect object.

Table 2.13: Capital letters L, I, T, C and D

`a`	adjective phrase
`n`	noun phrase
`p`	adverb particle
`pr`	preposition phrase
`n/pr`	noun phrase/preposition phrase
`n/a`	as + noun phrase/adjective phrase
`t`	non-finite clause (to-infinitive) (`IP-INF` with to tagged `TO` and verb tagged `VB`)
`f`	that-clause (`CP-THT`)
`w`	finite or non-finite clause with wh element (`CP-QUE`)
`g`	participial clause ({ing} form) (`IP-PPL` with verb tagged `VAG`)
`i`	non-finite clause (bare infinitive) (`IP-INF` with verb tagged `VB` but no `TO` tagged word)
`r`	utterance

Table 2.14: Lower case letters a, n, p, pr, n/pr, n/a, t, f, w, g, i and r

Also, we source three codes directly from Hornby (1975):

VP24A
VP24B
VP24C

In addition to Cowie (1989) and Hornby (1975) codes, further verb codes distinguish:

catenative verbs (prefixed cat_)
existential verbs (prefixed ex_)
equative verbs (prefixed equ_)
cleft verbs (prefixed cleft_)

Different verb classes allow for different verb codes. verb_code of (2.30) determines compatible verb codes, taking a capital letter as the value for its first parameter to identify verb class:

V for lexical verbs
D for DO verbs
H for HAVE verbs
B for BE verbs

A verb_code call will then return through the Code parameter a code picked from the corresponding list for compatible codes.

(2.30): verb_code('V',_,Code) :-
  member(Code,[
     ';~La',';~Ln',
     ';~I',';~Ip',';~Ipr',';~In/pr',';~It',
     ';~Tn',';~Tn.p',';~Tn.pr',
     ';~Tf',';~Tw',';~Tr',
     ';~Tt',';~Tnt',';~Tni',';~Tg',';~Tng',';~Tsg',
     ';~Dn.n',';~Dn.f',';~Dn.w',';~Dn.r',';~Dn.t',';~Dn.*',
     ';~Dn.pr',';~Dpr.f',';~Dpr.r',
     ';~Cn.a',';~Cn.n',';~Cn.n/a',';~Cn.pr',
     ';~Cn.t',';~Cn.i',';~Cn.g',
     ';~V_as_though/as_if/like',
     ';~VP24A',';~VP24C'
    ]).
verb_code('V',Infl,Code) :-
  member(Infl,[finite,infinitive,do_supported_infinitive]),
  member(Code,[';~cat_Vt',';~cat_Vi',';~cat_Vg',';~cat_Vg_passive_',
               ';~cat_Ve_passive_',';~ex_V',';~ex_Vpr',';~ex_cat_Vt']).
verb_code('V',Infl,Code) :-
  member(Infl,[imperative,ing_participle,en_participle]),
  member(Code,[';~cat_Vt',';~cat_Vi',';~cat_Vg',';~cat_Vg_passive_',';~cat_Ve_passive_']).
verb_code('D',_,Code) :-
  member(Code,[';~Tn']).
verb_code('H',_,Code) :-
  member(Code,[';~Tn',';~VP24B',';~VP24C']).
verb_code('H',Infl,Code) :-
  member(Infl,[finite,infinitive]),
  member(Code,[';~cat_Vi',';~cat_Vt',';~cat_Ve']).
verb_code('H',Infl,Code) :-
  member(Infl,[imperative,ing_participle,en_participle]),
  member(Code,[';~cat_Vi',';~cat_Vt']).
verb_code('B',_,Code) :-
  member(Code,[
     ';~La',';~Ln',
     ';~I',';~Ip',';~Ipr',
     ';~cat_Vt',';~cat_Vt_passive_',';~cat_Ve_passive_',
     ';~equ_Vf',';~equ_Vw',';~equ_Vt',';~equ_Vg'
    ]).
verb_code('B',Infl,Code) :-
  member(Infl,[finite,infinitive,ing_participle,en_participle]),
  member(Code,[';~ex_V',';~ex_Vp',';~ex_Vpr']).
verb_code('B',Infl,Code) :-
  member(Infl,[finite,infinitive,en_participle]),
  member(Code,[
     ';~cat_Vg',
     ';~ex_cat_Vt',';~ex_cat_Vt_passive_',';~ex_cat_Vg',';~ex_cat_Ve_passive_',
     ';~cleft_Vn'
    ]).

Note how returned codes sometimes depend on inflection information from the Infl parameter.
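
The dependence on Infl can be seen by collecting the codes returned for one verb class under different inflection settings. The sketch below copies only the HAVE clauses of (2.30) and adds a hypothetical helper, codes_for, that is not part of the grammar. It assumes a Prolog system with findall/3 and member/2 (for example, SWI-Prolog).

```prolog
:- use_module(library(lists)).

% The 'H' clauses of verb_code of (2.30), copied unchanged.
verb_code('H',_,Code) :-
  member(Code,[';~Tn',';~VP24B',';~VP24C']).
verb_code('H',Infl,Code) :-
  member(Infl,[finite,infinitive]),
  member(Code,[';~cat_Vi',';~cat_Vt',';~cat_Ve']).
verb_code('H',Infl,Code) :-
  member(Infl,[imperative,ing_participle,en_participle]),
  member(Code,[';~cat_Vi',';~cat_Vt']).

% codes_for(+Class,+Infl,-Codes): hypothetical helper, not from the text.
% Collects, in clause order, every code licensed for Class under Infl.
codes_for(Class,Infl,Codes) :-
  findall(Code,verb_code(Class,Infl,Code),Codes).

% ?- codes_for('H',finite,Codes).
% Codes = [';~Tn',';~VP24B',';~VP24C',';~cat_Vi',';~cat_Vt',';~cat_Ve']
%
% ?- codes_for('H',ing_participle,Codes).
% Codes = [';~Tn',';~VP24B',';~VP24C',';~cat_Vi',';~cat_Vt']
```

Comparing the two results, the code ;~cat_Ve is available to HAVE only when Infl is finite or infinitive.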

Question

What does the Prolog query of (2.31) achieve?

(2.31): | ?- tphrase_set_string([Word]), parse(verb(Infl,Code)), fail.

2.9 Modal verbs

Modal verbs have the tags and verb codes of Table 2.15.

Present tense	Past tense
`w('MD',';~cat_Vi','shall')`	`w('MD',';~cat_Vi','should')`
`w('MD',';~cat_Vi','will')`	`w('MD',';~cat_Vi','would')`
`w('MD',';~cat_Vi','can')`	`w('MD',';~cat_Vi','could')`
`w('MD',';~cat_Vi','may')`	`w('MD',';~cat_Vi','might')`
`w('MD',';~cat_Vi','must')`
`w('MD',';~cat_Vt','ought')`
`w('MD',';~cat_Vi','need')`
`w('MD',';~cat_Vi','dare')`
	`w('MD',';~cat_Vt','used')`

Table 2.15: Modal verbs with tags and verb codes

The modal rules of (2.32) support the integration of the modal words of Table 2.15.

(2.32): modal(';~cat_Vi',[node('MD;~cat_Vi',[node(Word,[])])|L],L) -->
  [w('MD',';~cat_Vi',Word)].
modal(';~cat_Vt',[node('MD;~cat_Vt',[node(Word,[])])|L],L) -->
  [w('MD',';~cat_Vt',Word)].
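
As a minimal sketch of how the modal rules of (2.32) return the verb code through their first parameter, the rules can be called directly. This assumes standard DCG translation and phrase/2 (for example, SWI-Prolog) in place of the parse call of section 1.7.

```prolog
% The modal rules of (2.32), copied unchanged apart from layout.
modal(';~cat_Vi',[node('MD;~cat_Vi',[node(Word,[])])|L],L) -->
  [w('MD',';~cat_Vi',Word)].
modal(';~cat_Vt',[node('MD;~cat_Vt',[node(Word,[])])|L],L) -->
  [w('MD',';~cat_Vt',Word)].

% With Code unbound, the word list entry decides which rule applies.
%
% ?- phrase(modal(Code,Nodes,[]),[w('MD',';~cat_Vt','ought')]).
% Code = ';~cat_Vt'
% Nodes = [node('MD;~cat_Vt',[node(ought,[])])]
%
% ?- phrase(modal(Code,Nodes,[]),[w('MD',';~cat_Vi','must')]).
% Code = ';~cat_Vi'
% Nodes = [node('MD;~cat_Vi',[node(must,[])])]
```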

2.10 Other clause level words

Besides verbs, other clause level components are words with the tags of Table 2.16.

`NEG`	negative particle `not`
`NEG;_clitic_`	negative clitic particle `n<apos>t`
`TO`	Infinitive marker `to`
`CONJ;_cl_`	discourse coordination (e.g., `And`, `But`)
`INTJ`	interjection (e.g., `aah`, `eh`, `ummmmm`)
`REACT`	reaction signal (e.g., `good_grief`, `really`, `yes`, `wow`)
`FRM`	formulaic expression (e.g., `good_afternoon`, `you_see`, `thank_you`)

Table 2.16: Tags for other clause level words

The optional_clitic_negation, neg and to rules of (2.33) and the initial_adverbial rules of (2.34) support the integration of the words of Table 2.16.

(2.33): optional_clitic_negation([node('NEG;_clitic_',[node(Word,[])])|L],L) -->
  [w('NEG;_clitic_',Word)].
optional_clitic_negation(L,L) -->
  [].
neg([node('NEG',[node(Word,[])])|L],L) -->
  [w('NEG',Word)].
to([node('TO',[node(Word,[])])|L],L) -->
  [w('TO',Word)].
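
The two optional_clitic_negation rules of (2.33) give the usual DCG pattern for optionality: the first consumes a clitic, the second consumes nothing. The sketch below makes both outcomes visible by using phrase/3, which also returns the unconsumed remainder of the word list. This use of phrase/3 is an assumption made for illustration and is not how section 1.7 runs a parse.

```prolog
% The optional_clitic_negation rules of (2.33), copied unchanged.
optional_clitic_negation([node('NEG;_clitic_',[node(Word,[])])|L],L) -->
  [w('NEG;_clitic_',Word)].
optional_clitic_negation(L,L) -->
  [].

% ?- phrase(optional_clitic_negation(Nodes,[]),
%           [w('NEG;_clitic_','n<apos>t')],Rest).
% first solution:  Nodes = [node('NEG;_clitic_',[node('n<apos>t',[])])],
%                  Rest = []
% second solution: Nodes = [],
%                  Rest = [w('NEG;_clitic_','n<apos>t')]
%
% ?- phrase(optional_clitic_negation(Nodes,[]),[w('ADV','often')],Rest).
% only solution:   Nodes = [], Rest = [w('ADV',often)]
```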

(2.34): initial_adverbial([node('CONJ;_cl_',[node(Word,[])])|L],L) -->
  [w('CONJ;_cl_',Word)].
initial_adverbial([node('INTJ',[node(Word,[])])|L],L) -->
  [w('INTJ',Word)].
initial_adverbial([node('REACT',[node(Word,[])])|L],L) -->
  [w('REACT',Word)].
initial_adverbial([node('FRM',[node(Word,[])])|L],L) -->
  [w('FRM',Word)].
initial_adverbial(L,L0) -->
  adverb_phrase('-NIM',established,L,L0).
initial_adverbial(L,L0) -->
  preposition_phrase('-NIM',established,L,L0).
initial_adverbial(L,L0) -->
  scon_clause(L,L0).

As single word content, initial_adverbial calls can pick up:

[rule 1] discourse coordination words (e.g., And, But)
[rule 2] clause level interjection (e.g., aah, eh, ummmmm)
[rule 3] reaction signals (e.g., good_grief, really, yes, wow)
[rule 4] formulaic expressions (e.g., good_afternoon, you_see, thank_you)

initial_adverbial calls can also pick up:

[rule 5] non-interrogative adverb phrases with -NIM (unselected adverbial) function
[rule 6] non-interrogative preposition phrases with -NIM (unselected adverbial) function
[rule 7] the subordinate component of a subordinate conjunction

2.11 Connective words

So far, we have considered words that serve as components of either phrases or clauses. There is a further class of words with the tags of Table 2.17 that serve as the means to connect phrases and clauses.

`CONJ`	Coordinating conjunction (`and`, `or`, `but`)
`C`	The complementizer `that`
`WQ`	Marker of indirect question (`whether` or `if`)
`P-CONN`	Subordinating conjunction (e.g., `although`, `when`, `in_order`)
`P-ROLE`	Role preposition (e.g., `in`, `of`, `under`)

Table 2.17: Tags for connective words

The conj, comp, comp_wq, conn and role rules of (2.35) support the integration of the connective words of Table 2.17.

(2.35): conj(node('CONJ',[node(Word,[])])) -->
  [w('CONJ',Word)].
comp([node('C',[node(Word,[])])|L],L) -->
  [w('C',Word)].
comp_wq([node('WQ',[node(Word,[])])|L],L) -->
  [w('WQ',Word)].
conn([node('P-CONN',[node(Word,[])])|L],L) -->
  [w('P-CONN',Word)].
role([node('P-ROLE',[node(Word,[])])|L],L) -->
  [w('P-ROLE',Word)].

2.12 Punctuation

Punctuation points are treated as words for the purposes of word tagging with the tags of Table 2.18.

`PUNC`	punctuation: general separating mark (`?` `.` `!` `,`)
`PULQ`	punctuation: left quotation mark (`<ldquo>` `<lsquo>`)
`PURQ`	punctuation: right quotation mark (`<rdquo>` `<rsquo>`)

Table 2.18: Tags for punctuation

This makes punctuation part of a sentence in its own right. With the creation of constituent structure, punctuation occurs as high as possible. For example, a full stop that ends a sentence is the last constituent of the highest clause layer.

The punc rules of (2.36) give support for the integration of punctuation, with an initial parameter to distinguish types:

final
final_question
non_final
left_quotation_mark
right_quotation_mark

(2.36): punc(final,[node('PUNC',[node('.',[])])|L],L) -->
  [w('PUNC','.')].
punc(final,[node('PUNC',[node('!',[])])|L],L) -->
  [w('PUNC','!')].
punc(final_question,[node('PUNC',[node('?',[])])|L],L) -->
  [w('PUNC','?')].
punc(non_final,[node('PUNC',[node(',',[])])|L],L) -->
  [w('PUNC',',')].
punc(left_quotation_mark,[node('PULQ',[node('<ldquo>',[])])|L],L) -->
  [w('PULQ','<ldquo>')].
punc(left_quotation_mark,[node('PULQ',[node('<lsquo>',[])])|L],L) -->
  [w('PULQ','<lsquo>')].
punc(right_quotation_mark,[node('PURQ',[node('<rsquo>',[])])|L],L) -->
  [w('PURQ','<rsquo>')].
punc(right_quotation_mark,[node('PURQ',[node('<rdquo>',[])])|L],L) -->
  [w('PURQ','<rdquo>')].

Optional non-final punctuation follows from (2.37).

(2.37): optional_punc_non_final(L,L0) -->
  punc(non_final,L,L0).
optional_punc_non_final(L,L) -->
  [].
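
The interaction of punc of (2.36) with optional_punc_non_final of (2.37) can be sketched as follows. Only three of the punc rules are copied, to keep the example short, and phrase/2 and phrase/3 (as found in, for example, SWI-Prolog) are assumed in place of the parse call of section 1.7.

```prolog
% Three of the punc rules of (2.36), copied unchanged.
punc(final,[node('PUNC',[node('.',[])])|L],L) -->
  [w('PUNC','.')].
punc(final_question,[node('PUNC',[node('?',[])])|L],L) -->
  [w('PUNC','?')].
punc(non_final,[node('PUNC',[node(',',[])])|L],L) -->
  [w('PUNC',',')].

% The optional_punc_non_final rules of (2.37), copied unchanged.
optional_punc_non_final(L,L0) -->
  punc(non_final,L,L0).
optional_punc_non_final(L,L) -->
  [].

% With Type unbound, the punctuation word decides the type.
% ?- phrase(punc(Type,Nodes,[]),[w('PUNC','?')]).
% Type = final_question, Nodes = [node('PUNC',[node(?,[])])]
%
% A comma is picked up as optional non-final punctuation.
% ?- phrase(optional_punc_non_final(Nodes,[]),[w('PUNC',',')]).
% Nodes = [node('PUNC',[node(',',[])])]
%
% A full stop is not consumed: only the empty rule applies,
% leaving the full stop in the remainder.
% ?- phrase(optional_punc_non_final(Nodes,[]),[w('PUNC','.')],Rest).
% Nodes = [], Rest = [w('PUNC','.')]
```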