Vocabulary and English for Specific Purposes Research - Averil Coxhead 2018
Why research multi-word units and metaphor in ESP?
Multi-word units and metaphor in ESP
There are many types of formulaic language (Schmitt, 2010) and many names for it (Wray, 2000). Multi-word units might include common collocations (such as significant finding and data analysis) as well as bundles of three or four words (for example, as a result of / on the basis of). The relationship between these words is not a matter of chance, and much research has focused on the statistical relationships between words. Granger and Paquot (2008) explain how continuous sequences of words (which do not have free slots between words, for example, at the end of the) are called clusters, bundles or n-grams, and those with slots are called collocations and frames. Discontinuous sequences require statistical analysis that takes co-occurrence into account.
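The distinction between continuous sequences and sequences with slots can be illustrated with a minimal sketch. The function names (ngrams, frames), the '*' slot marker and the toy sentence are assumptions made for illustration; they are not drawn from Granger and Paquot (2008), and the sketch only counts sequences rather than applying any statistical measure of co-occurrence.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every continuous n-word sequence in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frames(tokens, n, slot):
    """Return every n-word sequence with the word at index `slot` replaced by '*'."""
    return [gram[:slot] + ("*",) + gram[slot + 1:] for gram in ngrams(tokens, n)]


# Toy text (hypothetical), tokenised naively on whitespace.
text = "at the end of the day and at the start of the week"
tokens = text.split()

# Continuous four-word sequences (n-grams).
ngram_counts = Counter(ngrams(tokens, 4))

# Four-word frames with a free slot in third position.
frame_counts = Counter(frames(tokens, 4, 2))

print(ngram_counts.most_common(3))
print(frame_counts.most_common(3))
```

In this toy text no continuous four-word sequence occurs more than once, but the frame at the * of occurs twice (at the end of, at the start of), which is the kind of pattern a count of continuous sequences alone would miss.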
Multi-word units can make up a fairly large proportion of texts (Nattinger & DeCarrico, 1992), but estimates of the amount of formulaic language in English vary. Some accounts suggest high proportions of formulaic language; Altenberg (1998) suggests around 80% of the words in the London-Lund Corpus of Spoken English ‘form part of a recurrent word-combination in one way or another’. Erman and Warren (2000) found that between 52% and 58% of English texts were formulaic. Gardner and Davies (2007) found high proportions of phrasal verbs in the BNC. Such estimates can vary depending on how the units are counted. A particularly important point is that multi-word units in spoken and written English can be quite different (Carter & McCarthy, 2006; Scott & Tribble, 2006). That said, in both speaking and writing, high frequency words are part of multi-word units, as can be seen in these examples of lexical bundles from Biber, Johansson, Leech, Conrad and Finegan (1999): on the other hand, in the case of the, and as a result of.
Research suggests that multi-word units benefit language learners in several ways. Wood and Appel (2014, p. 9) state that multi-word units play ‘an essential role in creating meaning and structure in academic discourse’. Biber, Conrad and Cortes (2004, p. 371) refer to lexical bundles as ‘basic building blocks of discourse’. Wray (2002) points out that formulaic sequences allow speakers to process and interact, and to express their identity with a group. Schmitt (2010) reports benefits while reading for both native and non-native speakers. Finally, just like technical single lexical items in disciplines, specialised multi-word units relate closely to subject knowledge. For example, Pinna’s (2007) study of dentistry examines words on their own as well as clusters of two and three words together. Pinna finds key relationships between words such as ‘bone’ and other words such as ‘graft’ and ‘cortical’. While awareness of multi-word units in teaching and learning for specific purposes has been rising, as Gardner and Davies (2007) and Coxhead (2008) state, it has not been very clear which multi-word units to focus on and in what way.
An important element of research into multi-word units is consideration of the difficulties these units might present for language learners. Coxhead (2008), in a study of vocabulary use in academic writing, reported that learner beliefs affect whether learners will actively attempt to learn and use multi-word units in academic writing. For example, one learner in Coxhead’s (2008) study chose to learn verbs in academic English so that she did not have to learn more than one word at a time. This research can help shed light on the choices learners make. Byrd and Coxhead (2010) point out that teachers (and by extension learners) would find it difficult to access information about the context of multi-word unit use in research which is based on privately held corpus data. Websites such as Mark Davies’ Academic Word and Phrase, a searchable interface to the academic section of COCA, mean that teachers and learners can now go online to explore the context and use of academic multi-word units themselves (go to www.wordandphrase.info/academic/).
The gathering of large corpora and the analysis of vocabulary patterns in different academic subjects have begun to shed light on the frequency, roles, and use of such multi-word units in ESP. Frequency analyses are commonplace in multi-word unit research (Biber, 2006), carried out through corpus analysis (Schmitt, 2010). Two basic approaches to identifying multi-word units in texts have been distinguished: the frequency-based approach, which is guided by large-scale corpus studies, and a semantic/grammatical approach to phraseology (Ebeling & Hasselgård, 2015). Quantitative and qualitative research can combine to provide a fuller picture of the frequency and meaning of multi-word units, as in Gardner and Davies (2007), for example. These researchers identified the most frequent phrasal verbs (a lexical verb plus an adverbial particle, for example go out, take up and carry on) in the BNC and then documented the meanings of the 100 most frequent phrasal verbs they found. The phrasal verbs with the most meanings (or word senses) according to the Gardner and Davies (2007, p. 352) study were take in (17 senses), pick up (16 senses) and set up (15 senses). Other research into multi-word units in general English includes Martinez and Schmitt’s (2012) list of 505 phrasal expressions such as work on and think about (also based on the BNC). Martinez and Schmitt (2012) organised these expressions by frequency of occurrence, and a learner-oriented test of items from this list is available on Cobb’s (n.d.) Lex Tutor website (www.lextutor.ca/tests/levels/recognition/phrasal/). Shin and Nation (2010) also used the BNC, but focused on the spoken section of the corpus to develop their list of high frequency multi-word units. Shin and Nation (2010) note that many of the items in their list would fit within the first 2,000 words of English, reinforcing the idea that high frequency words occur in high frequency multi-word units.
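A frequency-based approach of the kind described above can be sketched roughly as follows: count recurrent word sequences across a set of texts and keep those that meet a frequency cut-off and occur in more than one text. The function name find_bundles, the cut-off values (min_freq, min_texts) and the three-sentence toy corpus are all assumptions for illustration; none of the studies cited here is being reproduced, and real studies work with far larger corpora and their own criteria.

```python
from collections import Counter


def find_bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return n-word sequences meeting hypothetical frequency and range cut-offs."""
    freq = Counter()        # total occurrences of each sequence
    text_range = Counter()  # number of texts each sequence appears in
    for text in texts:
        tokens = text.lower().split()  # naive whitespace tokenisation
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        text_range.update(set(grams))  # count each sequence once per text
    return {
        " ".join(gram): count
        for gram, count in freq.items()
        if count >= min_freq and text_range[gram] >= min_texts
    }


# Toy corpus (hypothetical); each string stands in for one text.
toy_corpus = [
    "on the other hand the results suggest a different pattern",
    "the data on the other hand point to a different pattern",
    "as a result of the change the pattern shifted",
]

bundles = find_bundles(toy_corpus)
print(bundles)
```

With these settings only on the other hand survives in the toy corpus, because it is the only four-word sequence that recurs and appears in two texts. Changing n, min_freq or min_texts changes what counts as a bundle, which is one reason why counts of multi-word units differ between studies.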
Corpus analysis allows researchers not only to identify frequent multi-word units, but also to categorise them and investigate their functions. Biber (2006, p. 134) defines lexical bundles as ‘simply the most frequently occurring sequences of words, such as do you want to and I don’t know what’. Biber et al. (2004) identified three functions of lexical bundles in corpora of university classroom teaching and university textbooks. These three functions are stance (for example, it’s important to; well I don’t know), discourse organisers (for example, on the other hand) and referential expressions (for example, at the end of). Many studies have since used this analysis of functions in research on lexical bundles (see below). Biber et al. (2004) list bundles in each of these three categories; see also Biber (2006) for more on these functions. An analysis of play and role by Cheng, Greaves, Sinclair and Warren (2009) illustrates how lexical items that co-occur with these two words (as in play an important role, fundamental role to play/significant role to play) add to the meaning of ‘participate and/or contribute in a weighty/meaningful manner’ (see also Cheng, 2012).
Investigations into two-word collocations in academic texts, lexical bundles (recurrent strings of words, typically four to six words long), slots or frames, and academic formulas are more common in the literature now than ever before. Studies in ESP and EAP of each of these types of multi-word units are examined more closely in the next section.