John Sinclair - D7 Words and phrases - Section D Extension

English grammar - Roger Berry 2012

John Sinclair (1991) reprinted from Chapter 5 of Corpus, Concordance, Collocation, Oxford: Oxford University Press, pp. 67-75.

Sinclair takes the reader through the process of analysing a particular verb to discover (some of) the ways in which it is used. It is an excellent example of how corpora can be used for grammatical/lexical analysis (Sinclair’s final aim being to arrive at grammatically adequate dictionary definitions).

Sinclair appears to be using the term ‘phrasal verb’ in the more general sense identified in B7 (= ‘multi-word verbs’), since he includes ‘set about’ (a prepositional verb in our narrower sense) under this heading. However, the two main multi-word verbs, ‘set off’ and ‘set in’, are both phrasal verbs in the narrower sense.

Some terms may be unfamiliar: lemma = ‘word-family’, and adjunct = ‘adverbial’.

Introduction

The studies of decline and yield in the previous two chapters open up the possibility of associating grammar and vocabulary in ways that can improve the description of both. They are both rich in semantic associations.

In this chapter and the next, we explore the use of concordances in the description of words that are much more frequent than so-called ‘vocabulary words’. They are not normally expected to have a strong lexical environment, but are more associated with the grammatical end of the spectrum. The objectives are to demonstrate:

- how carefully the language is patterned;

- how the description is very sensitive to the number of instances of a form;

- how criteria for meaning (see Chapter 4) are applied in a specific case.

In addition, the relevance of this research to the teaching and learning of English is pointed up by the consideration of a feature of English much dreaded by learners - phrasal verbs.

Phrasal verbs

The choice of an example to illustrate the argument of this chapter was, as usual, partly accidental, and partly deliberate. I was looking for a fairly common, rather dull little word that was comparatively neglected in description and in teaching. I found out by chance that the word set was not well regarded by some experienced teacher-colleagues, and noticed that it got scant treatment in the syllabuses that I was able to examine. The immediate presumption was that it was a difficult word to isolate semantically. ‘What does set mean?’ is hardly a sensible question. It has to be put into context, because in most of its usage it contributes to meaning in combination with other words.

Among the many combinations of set are a number of phrasal verbs, such as set about, set in, set off, and these are picked out in language teaching as offering exquisite problems to the learner. The reason for their causing problems is easily explained. The co-occurrence of two quite common little words can unexpectedly create a fairly subtle new meaning that does not seem to be systematically related to either or both of the original words. The disposition of the words involved, and their syntax, is governed by complex and unpredictable rules.

The prospect sounds formidable even for native speakers, yet they not only manage phrasal verbs with aplomb, but seem to prefer them to single word alternatives. In fact, the whole drift of the historical development of English has been towards the replacement of words by phrases, with word-order acquiring greater significance.

Some recent work (Sinclair, Moon et al. 1989) shows that the semantics of phrasal verbs is not as arbitrary as it is often held to be. We usually cite phrasal verbs based on the verb element (give up, give out, give over, etc.). If, instead, we group them by the particle (give over, get over, tide over), it is possible to make sense groupings.
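The regrouping described here can be illustrated with a minimal Python sketch. The list contains only the phrasal verbs cited in this paragraph; the function name `group_phrasal_verbs` and the simple two-word split are illustrative assumptions, not part of Sinclair’s method.

```python
from collections import defaultdict

# Only the phrasal verbs cited in the paragraph above.
PHRASAL_VERBS = ["give up", "give out", "give over", "get over", "tide over"]


def group_phrasal_verbs(phrasal_verbs, key="verb"):
    """Group two-word phrasal verbs by their verb or by their particle."""
    groups = defaultdict(list)
    for item in phrasal_verbs:
        verb, particle = item.split()  # assumes exactly two words
        groups[verb if key == "verb" else particle].append(item)
    return dict(groups)


by_verb = group_phrasal_verbs(PHRASAL_VERBS, key="verb")
by_particle = group_phrasal_verbs(PHRASAL_VERBS, key="particle")

print(by_verb)      # the usual citation order, headed by the verb
print(by_particle)  # the alternative grouping, headed by the particle
```

Grouping by particle brings give over, get over and tide over together, which is the kind of set in which a shared sense can be looked for.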

It will, however, be a long time before people will routinely look up give over in a dictionary under over rather than give. The presentation in this chapter emphasises the importance of the environment of set in determining the meaning. In the first instance, a particle following raises the prospect of a phrasal verb; then, the other words around help in indicating the precise meaning.

Some numerical facts

In the corpus of approximately 7.3 million words, used in Chapters 3 and 4, there are 2,320 instances of the different forms of the lemma set. We associate together the forms set, sets, and setting as instances of the word set, and the frequency of each is:

Other possible associates such as setter and settee are ignored.

Set is thus one of the commonest words in the language - the uninflected form is ranked number 272. However, if we compare the relative frequency of the inflected forms sets and setting, we see that they are not nearly as common as set, being approximately 9 per cent and 11 per cent of the lemma.

This is a commonly observed pattern, where one of the forms is much more common than any other. Similar, if less dramatic, tendencies are shown for decline and yield in Chapters 3 and 4 respectively. This means that if sets or setting has a use which is not shared by set, we have much less evidence to go on. Whatever criteria we use, there is nearly ten times as much evidence available for set.

It could be argued that, in one respect at least, the inflection of set is untypical, and that the frequency of forms of set will reflect the oddity. Set is one of a handful of verbs in English which do not have a separate past tense form. So whatever frequency is assigned to walk and walked, say and said, etc. is not differentiated in set. To complicate the picture further, all three forms of the lemma set are also readily available as nouns, and the picture is not at all straightforward.

However, compared to the vast majority of words, even the least common form sets is generously represented. But when we look for combinations of even these frequent words, the expectations are not promising.

If a corpus is held to be representative of the language as a whole, the probability of occurrence of a word-form can be expressed in general as a relation between the frequency of the word-form in the corpus and the total number of word-forms in the corpus.

In the case of set this is:

This means that the chance of set being the next word in the text is about 250 per million, or one occurrence in every 3,935 words.
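As a rough check on this arithmetic, the following Python sketch expresses the probability as frequency divided by corpus size. The frequency used is an assumption: it is back-calculated from the stated rate of one occurrence in every 3,935 words, not quoted from the corpus counts, and the function name is illustrative.

```python
CORPUS_SIZE = 7_300_000  # approximate corpus size given in the text


def word_form_probability(frequency, corpus_size):
    """Probability of a word-form: its frequency over the corpus size."""
    return frequency / corpus_size


# Assumption: frequency implied by "one occurrence in every 3,935 words".
implied_set_frequency = round(CORPUS_SIZE / 3935)

p_set = word_form_probability(implied_set_frequency, CORPUS_SIZE)

print(f"implied frequency of set: {implied_set_frequency}")
print(f"probability of set: {p_set:.6f}")
print(f"occurrences per million words: {p_set * 1_000_000:.0f}")
```

The result is consistent with the figure of about 250 per million quoted above.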

Combinations of set + particle

How common are the phrasal verbs with set? Set is particularly rich in making combinations with words like about, in, up, out, on, off, and these words are themselves very common.

As an example, how likely is set off to occur, whether phrasal verb or not? Both are frequent words; off occurs approximately 556 times in a million words. Its probability of being the next word is 0.00055. We must now multiply the probabilities of set and off, because the question we are asking can be roughly rephrased as follows: how likely is off to occur immediately after set? This is approximately 0.00025 × 0.00055, which gives us the minute figure of 0.0000001375.

Two important considerations are left out of this calculation, one linguistic and the other statistical.

a. The phrasal verb set off can have a noun group inside it, for example: It was the hedge which set the garden off.

There are very few of these and so they have little effect on the general numerical argument.

b. The assumption behind this calculation is that the words are distributed at random in a text. It is obvious to a linguist that this is not so, and a rough measure of how much set and off attract each other is to compare the probability with what actually happens.

In a text of 7.3 million words distributed at random, we might expect 0.0000001375 × 7,300,000 occurrences of set off, that is, one only. Since there are several different phrasal verbs with the form set off and no doubt some occurrences of set followed by off which do not provide an instance of a phrasal verb, we might require a fairly large number of occurrences of the combination of forms to show the characteristic patterns. At a frequency of about 1 in 7 million, we would require to collect large amounts of text, running into the hundreds of millions of words.

The gloomy picture thus projected by our arithmetic is, in fact, considerably relieved by what we find in actual texts. This is because our initial assumption, that the words are distributed at random, is false. Set off occurs nearly seventy times in the 7.3 million-word corpus, as against the random prediction of only one occurrence. The 70 instances give us enough evidence of the main patterning.
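The comparison between the random prediction and what actually happens can be sketched in Python as follows. The rounded probabilities and the corpus size are those used in the text; the observed count of 70 is the approximate figure given above (‘nearly seventy’), and the variable names are illustrative.

```python
P_SET = 0.00025          # rounded probability of set, as used in the text
P_OFF = 0.00055          # rounded probability of off, as used in the text
CORPUS_SIZE = 7_300_000  # approximate corpus size
OBSERVED_SET_OFF = 70    # approximate: the text says "nearly seventy"

# Under the (false) assumption that words are distributed at random,
# the probability of "set off" is the product of the two probabilities.
p_set_off_random = P_SET * P_OFF
expected_set_off = p_set_off_random * CORPUS_SIZE

# Rough measure of attraction: observed count over the random expectation.
attraction_ratio = OBSERVED_SET_OFF / expected_set_off

print(f"random probability of set off: {p_set_off_random:.10f}")
print(f"expected occurrences at random: {expected_set_off:.2f}")
print(f"observed / expected: {attraction_ratio:.1f}")
```

On these figures set off occurs roughly seventy times more often than chance would predict, which is the measure of attraction between the two words that point b. describes.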

The combination set in

In this central part of the chapter, I shall consider all the instances of set, sets, and setting followed by in. The different ways in which the occurrence of these words together contribute to meanings will emerge, and the evidence will be found to be mainly in the surrounding language.

We begin by gathering all instances of the sequence set in. There are 90 of them. To this we add: sets in (16); setting in (6); and, for the sake of completeness, settings in (2). The total of instances of a form of set followed immediately by in is thus 114.
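A minimal sketch of this first gathering step in Python is given below. The regular expression and the four-sentence sample are illustrative assumptions (the sample adapts examples quoted in this reading and is not the 7.3 million-word corpus); a real concordancer would also keep the surrounding context of each hit.

```python
import re
from collections import Counter

# A form of the lemma followed immediately by "in", as whole words.
FORM_PLUS_IN = re.compile(r"\b(set|sets|setting|settings)\s+in\b", re.IGNORECASE)


def count_forms_followed_by_in(text):
    """Count each form of set that is immediately followed by in."""
    return Counter(match.group(1).lower() for match in FORM_PLUS_IN.finditer(text))


# Hypothetical miniature sample, adapted from examples in the reading.
sample = (
    "That is where the rot set in. "
    "It must be treated quickly before infection sets in. "
    "Clearly, the film, set in Glasgow and the Highlands, was popular. "
    "They met in a social setting in the town."
)
sample_counts = count_forms_followed_by_in(sample)
print(sample_counts)

# The counts reported in the text for the full corpus.
reported_counts = {"set": 90, "sets": 16, "setting": 6, "settings": 2}
reported_total = sum(reported_counts.values())
print(reported_total)
```

A purely formal search of this kind gathers nouns, independent verbs, idioms and phrasal verbs alike, which is why the instances must then be classified.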

The first analysis combines several steps:

- noting the word class of each example;

- classifying the meaning roughly into word-meaning and phrase-meaning;

- assigning the word-meanings to senses, where possible;

- working out the phrases;

- assigning the phrase-meanings to senses.

The distinction of word-meaning and phrase-meaning is of considerable importance in language study, and is explained in some detail in Chapter 8. Intuitively, we feel that some instances of a word are quite independently chosen, while in other cases we feel that the word combines with others to deliver a single multi-word unit of meaning. We shall call word-meaning independent, and phrase-meaning dependent.

In between these two fixed points is collocation, where we see a tendency for words to occur together though they remain largely independent choices. In what follows, the 114 instances are divided into:

• Nouns

• Verbs

- sense (i)

- sense (ii)

- minor sense

- sundry idioms

• Phrasal verbs

Let us first dispose of a few instances that do not fit into the above classification. There are five of these, one a typographical error (as-sets) and one where even the twenty-word citation does not give enough evidence of its meaning. Two are instances of other idioms with set, but in the passive so that set is immediately followed by in, for example:

He was asking a precedent to be set in a field where . . .

The last is:

. . . the controlled fires he sets in spring devastate shrubs . . .

We now go on to examine the first five categories in the above classification, before giving fuller attention to set in as a phrasal verb.

Nouns

The use of set as a noun includes all four forms of the lemma:

Both instances of settings are nominal, of course. All but one of setting are nominal, which suggests that the verb is not much used in the progressive tense. Collocations include work setting, social setting, a suitable setting. Sets as a noun includes television sets, and chemistry sets. Set as a noun includes: the social set, the Martini set, theatre set, a fishing set, and a TV set.

These are all characteristic nominal uses of set which have been captured because they happen to be immediately followed by in. They would be best treated in a description of the whole nominal pattern of set. Here, we merely note them and clear them out of the way.

Verbs

Among the verbal uses, there are two principal independent senses and two minor ones. Only the form set occurs - not even sets, suggesting a preponderance of past-tense usage.

Sense (i)

There are 25 instances of set followed by in and meaning approximately ‘placed’. Seven are to do with physical position (including one about someone who had his bones set in an awmbry; only the OED was able to tell me that an awmbry was a kind of cupboard, and this was not an instance of bonesetting). Twelve more are to do with the disposition of buildings, streets, etc. Three are abstract placings (for example, ‘high expectations, set in the commercial future for nuclear power’); two are variations of a well-known quotation (including the remarkable ‘no man, or woman, is an island, set in a silver sea’). One is a figurative extension, ‘set in a haze of blue’.

Sense (ii)

There are 18 instances of set followed by in, meaning approximately ‘located’, and characteristically used of plays, films, and stories, such as:

Clearly, the film, set in Glasgow and the Highlands . . .

Minor senses

Of the two minor independent senses of set, one is to do with typesetting and the other is set in my memory, which means in context ‘fixed in place’ and not just ‘placed’.

In all the above verbal instances, the words set and in make an important collocation, and not the casual co-occurrence that was found in the nominal uses.

Sundry idioms

There are 20 instances of uses which I have termed idiomatic, because in addition to set and in there are other restrictions as well.

a. Of these, six have set in followed by a possessive pronoun and the word ways, such as: . . . too old and set in her ways ever to change.

b. Five are of the phrase set in motion.

c. The remainder are one or two instances of set in train, set in hand, set in order, set in a traditional mould, set in front of, set in juxtaposition to, and set in the balance.

This group of idioms comprises items for which a much larger text corpus would be needed to see if it was justifiable to pick them out as I have done here. They all seem common enough, and it is a slight shock to see how rare they are in a large corpus - one has to keep in mind the extreme unlikelihood, on statistical grounds, of any of them occurring at all.

Set in as a phrasal verb

Up to this point, we have been merely clearing the ground for study of the phrasal verb set in. The original 114 instances of set followed by in are reduced to 29. Three of the four forms are involved, as follows:

The phrasal verb meaning is that if something sets in, it begins, and seems likely to continue and develop.

One of the first things to note about the phrasal verb is that it seems to occur typically in a small and/or minor part of a sentence. It is not easy to say exactly what gives this feeling but the following may be factors.

1. The clauses in which set in is chosen are in general rather short - six words or fewer in the main. The longer ones are longer because of an adjunct rather than the subject, which is in most cases a single word or an article and noun pair.

2. A number of the clauses are subordinate. With the samples available, it is not possible to assign status in every case, and there are some clear main clauses; but I think the tendency to lower status should be noted.

3. Set in is final in the clause in 22 of the 29 cases, and sentence-final in nine of them, showing a clear tendency to end structures.

Observations such as those above are difficult to evaluate because we lack comparative stylistic data, but the following is a very typical example:

. . . where the rot set in . . .

Word-forms

As suggested in 1 above, the majority of verbal groups are simple, containing just the form of set. All the occurrences of sets (10) are, of course, in the present tense, and at least nine of these deal with general states of affairs rather than the here-and-now. None of them is unambiguously in a main clause, where the tense choice relates directly to time.

Of the others, the vast majority are in the narrative past - either simple past (9) or pluperfect with had (4). There are single instances of would, has, and was, and one complex verb started to set in which again shows the narrative past.

From this we can conclude that there is a tendency towards reference to things past or things which are not sensitive to the passage of time, which goes reasonably well with the meaning of the phrase; the phrasal verb is not used in speculation about the future, or in statements about the present. For example:

It was no wonder that disillusion had set in . . .

Subjects

The most striking feature of this phrasal verb is the nature of the subjects. In general, they refer to unpleasant states of affairs. Only three refer to the weather; a few are neutral, such as reaction and trend. The main vocabulary is rot, decay, malaise, despair, ill-will, decadence, impoverishment, infection, prejudice, vicious (circle), rigor mortis, numbness, bitterness, mannerism, anticlimax, anarchy, disillusion, disillusionment, slump. Not one of these is conventionally desirable or attractive.

The subjects of set in are also, as can be seen above, largely abstractions: several are nominalisations of another part of speech.

A dictionary entry

These observations characterise the phrase and illustrate its use. In a dictionary, a great deal of information has to be compressed into a couple of lines and it must be reasonably easy to read. The explanation given in the Collins COBUILD English Language Dictionary is:

If something unpleasant sets in, it begins and seems likely to continue or develop.

The three examples cited in the dictionary illustrate many of the points made in this section. A feeling of anticlimax set in; the subject is one of the longer ones, but is abstract and fairly unpleasant. It must be treated quickly before infection sets in illustrates the very short subordinate clause with the present tense verb. The bad weather has set in for the winter is one of a small but distinctive group of concrete subjects that would very likely be recognised as appropriate by native speakers.

[. . .]

Questions, suggestions and issues to consider

1. What problems does Sinclair identify with investigating a verb such as set using a corpus?

2. What reasons are there for treating set off as one unit lexically?

3. Why would this sentence be unlikely?

From that moment happiness set in.

4. Do the same as Sinclair. Take a phrasal verb and investigate its use in a corpus. A number of corpora are available online; a list of them is given on the Website. Remember to go through all the steps that Sinclair does.