Specialised vocabulary in undergraduate studies - Sciences - Pre-university, undergraduate and postgraduate vocabulary

Vocabulary and English for Specific Purposes Research - Averil Coxhead 2018

A particular difficulty with research in this field is deciding whether it belongs under academic or occupational purposes. The examples in this section focus on subject-specific research in undergraduate studies. Chapter 7 contains research on English for Professional or Occupational Purposes. In some cases, this division is somewhat arbitrary. Aviation, for example, is in Chapter 7. Medicine appears in both this chapter (research into Medicine as an area of academic study) and the next (Medicine in occupational settings, such as workplace communication).


Studies of specialised vocabulary in the Sciences in university contexts have found lower coverage by high frequency words than in secondary school texts (see Coxhead et al., 2010). In Chapter 2, we looked briefly at Coxhead and Hirsh’s (2007) Science List for EAP. The genesis of this research was the lower coverage of West’s GSL (1953) and Coxhead’s (2000) AWL over the Science sub-corpus in Coxhead’s (2000) study. This lower coverage suggested that Science texts share a large amount of vocabulary with the other disciplines but also contain vocabulary that those disciplines do not. Science has been noted in the literature for its differences in lexis. Biber’s (2006) study of lexical bundles in university textbooks found higher percentages of bundles in Natural Science than in Social Science, Business, Engineering and Humanities. Biber attributes this greater reliance on lexical bundles in Natural Science to the heavy technical content of the textbooks. Taken together, this research indicates that the vocabulary of Science is an important area of focus, particularly for preparing pre-university Science students to read Science textbooks and other texts in their university studies.

Coxhead and Hirsh (2007) started with the Science sub-corpus in Coxhead’s (2000) AWL study, which contained 750,000 running words in seven subject areas, and decided to enlarge it. They included seven more Science subjects and doubled the size of the corpus (see Chapter 2). One purpose of the study was to find out where a general academic word list such as the AWL might stop and a Science word list might begin. This study differs from subject-specific studies such as Valipouri and Nassaji’s (2013) study of the GSL and AWL in Chemistry research articles (see the following), in that Coxhead and Hirsh (2007) were looking across a range of subjects within the Sciences rather than focusing in depth on one particular subject. The second purpose of the study was to develop a Science-specific word list. The 318 word families in Coxhead and Hirsh’s (2007) EAP Science List covered 3.79% of the whole Science corpus. Coverage was 3.06% over the original seven subjects of the AWL corpus (Biology, Chemistry, Computer Science, Geography, Geology, Mathematics and Physics) and 4.52% over the seven new subject areas of Agricultural Science, Ecology, Engineering and Technology, Horticultural Science, Nursing and Midwifery, Sport and Health Science and Vet and Animal Science. This finding suggests that the Science list offers a better return for learning in the seven new subject areas than in the original seven.
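The coverage figures reported here are percentages of the running words (tokens) in a corpus that belong to a word list. The following is a minimal sketch of that calculation in Python, not a reproduction of the procedure used in the studies cited. The tokeniser is deliberately simple, and the word families, the names `tokenize`, `coverage` and `family_members`, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplification)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage(tokens, family_members):
    """Return the percentage of running words that belong to a listed word family.

    family_members maps every listed word form to its family headword.
    """
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in family_members)
    return 100.0 * hits / len(tokens)


# Hypothetical toy word families (headword -> member forms), for illustration only.
families = {
    "cell": ["cell", "cells", "cellular"],
    "protein": ["protein", "proteins"],
}
family_members = {form: head for head, forms in families.items() for form in forms}

# Hypothetical sample text standing in for a corpus.
sample = "The cell membrane controls which proteins enter the cell."
tokens = tokenize(sample)
pct = coverage(tokens, family_members)
print(f"{pct:.2f}% of {len(tokens)} running words are covered")
```

In this toy example, three of the nine running words ("cell", "proteins", "cell") belong to a listed family. A figure such as 3.79% is the same ratio computed over a corpus of well over a million running words.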

The question of where the AWL might stop and a Science-specific list might start produced quite an interesting picture, based on frequency analysis. The most frequent 300 word families of the AWL (which contains 570 word families in total) covered 7.1% of the Science corpus, whereas the Science list (318 word families) covered 3.71%. Breaking down the whole lists into their sublists and looking at the coverage figures, Coxhead and Hirsh (2007) found that Sublist 1 of the AWL, which contains the most frequent 60 word families, covered 2.87% of the Science corpus and Sublist 1 of the Science list covered 2.01%. Together, then, these 120 word families cover almost 5% of the corpus. The AWL sublists continue to have higher coverage over the Science corpus than the EAP Science sublists, as both lists drop in coverage following a basic Zipf (1935) pattern (see Chapter 3). That is, the coverage roughly halves with each successive sublist. Coxhead and Quero (2015) investigated the coverage of the Science list over a corpus of ten million running words of Medical textbooks to see whether the list offered any value for medical students preparing for their studies. The coverage of the EAP Science List was 5.98% over the Medical corpus, which suggests that this list contains lexical items which deserve attention in this specific area of study. The next section moves into more subject-specific research on vocabulary at university in the Sciences.
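The sublist comparison above rests on simple arithmetic, which can be checked with a short sketch. The figures are those quoted in the text. The sketch assumes that the two lists share no word families, so that their coverages can be added together, and that the Science Sublist 1 contains 60 families, as the total of 120 implies. The variable names are illustrative only.

```python
# Coverage (%) of the Science corpus by the first sublist of each list,
# as quoted in the text.
awl_sublist1 = 2.87      # AWL Sublist 1, 60 word families
science_sublist1 = 2.01  # EAP Science List Sublist 1, assumed 60 word families

# Adding the figures assumes the two lists have no word families in common.
combined = awl_sublist1 + science_sublist1
total_families = 60 + 60
print(f"{total_families} word families cover {combined:.2f}% of the corpus")

# Average coverage per word family for the two groups compared in the text.
awl_per_family = 7.1 / 300       # most frequent 300 AWL families
science_per_family = 3.71 / 318  # all 318 Science list families
print(f"AWL: {awl_per_family:.4f}% per family")
print(f"Science list: {science_per_family:.4f}% per family")
```

On these figures, each of the top 300 AWL families contributes on average about twice the coverage of a Science list family, which is consistent with the AWL sublists having higher coverage over the Science corpus.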