Specialised vocabulary research and the professions
Several studies of Finance have drawn on the Hong Kong Financial Services Corpus, a product of the Research Centre for Professional Communication in English in the Department of English at the Hong Kong Polytechnic University. This corpus contains over seven million running words of annual reports and earnings calls. Li and Qian (2010) analysed the corpus using three existing word lists: Coxhead’s AWL, West’s GSL and Nation’s BNC 1,000. Their coverage figures suggest that the annual reports and earnings calls contained a large amount of high frequency vocabulary. The AWL performed differently across the various areas of the corpus. For example, its coverage of procedures was almost 20%, whereas in several other areas it was closer to 8%. The most frequent AWL items in the corpus were finance, invest, fund, issue, secure, period, corporate, income, option and require. Neufeld, Hancioğlu and Eldridge (2011) reanalysed the corpus used in the Li and Qian study. These researchers focused on cleaning the corpus and dealing with the non-ASCII characters in it. The coverage of the AWL/GSL came to just over 91% in the reanalysed corpus. As Neufeld et al. (2011) note, this reanalysis is helpful because it underlines the need for accessible corpora so that findings can be verified.
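At its simplest, a coverage figure of the kind reported in these studies is the percentage of running words in a text that appear on a given word list. The following is a minimal sketch in Python, not the procedure used by Li and Qian (2010) or Neufeld et al. (2011). It assumes a crude regular-expression tokeniser and matches exact word forms only, whereas lists such as the AWL and GSL are organised into word families. The function name, sample sentence and toy list are hypothetical.

```python
import re


def coverage(text, word_list):
    """Return the percentage of running words in `text` found on `word_list`.

    Assumption: tokens are runs of letters (with an optional internal
    apostrophe), lower-cased, and matched as exact forms, not word families.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    listed = {word.lower() for word in word_list}
    hits = sum(1 for token in tokens if token in listed)
    return 100.0 * hits / len(tokens)


# Hypothetical toy data: four of the eight running words are on the list.
sample = "The fund will invest income over the period"
toy_list = ["fund", "invest", "income", "period"]
print(f"{coverage(sample, toy_list):.1f}% coverage")
```

A fuller implementation would first map inflected and derived forms (for example, invests, investing, investment) to their headword, which would typically raise the coverage figure for the same list.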
Moving on from single words to multi-word units in Finance, Cheng (2012) describes the extraction of multi-word units of two to five words from the Hong Kong Financial Services Corpus using a programme called ConcGrams (Greaves, 2009). Examples of such phrases include risk management, management of risk and management of operational risk. As these examples show, the words in a multi-word unit do not need to be contiguous or to appear in the same order. High frequency phrases occur regularly in the corpus, as would be expected. Mid-frequency ConcGrams from the corpus include outflow/resources, risk shares, and trading treasury (p. 98).
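The idea that risk management, management of risk and management of operational risk can be treated as instances of the same unit can be illustrated with a short Python sketch. This is not the algorithm used by the programme Cheng (2012) describes. It is a simplified stand-in that counts unordered pairs of words occurring within a fixed window of each other, and the function name and window size are assumptions made for illustration.

```python
from collections import Counter


def cooccurring_pairs(tokens, window=4):
    """Count unordered word pairs found within `window` tokens of each other.

    Sorting each pair makes the count independent of word order, and the
    window allows other words to intervene between the two members.
    """
    counts = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts


# The three example phrases from the text, treated as separate stretches.
phrases = [
    "risk management",
    "management of risk",
    "management of operational risk",
]
totals = Counter()
for phrase in phrases:
    totals.update(cooccurring_pairs(phrase.split()))

print(totals[("management", "risk")])
```

Each of the three phrases contributes one instance of the pair management/risk, despite the differences in word order and in the number of intervening words. Extending the sketch to units of three to five words would mean counting larger word sets rather than pairs.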
In a wide-ranging study of Financial English involving both quantitative and qualitative methodologies, Ha (2015) investigated a written and spoken corpus. The 6,753,212-word corpus contained written annual reports, scripted spoken earnings calls and spoken impromptu question and answer sessions (which follow on from the scripted earnings call presentations) from four sectors: Banks, Financial Services, Insurance and Real Estate. Ha carried out keyword analyses with the annual reports and a written academic corpus of Computer Science, identifying 1,361 Finance-specific lexical items that covered up to 30% of the corpus. Out of that larger number, Ha (2015) chose 837 words to subject to a technicality analysis (from least technical to most technical on a five-point scale), which involved referencing existing word lists and checking the meaning of the words in both general and financial dictionaries. Out of the 837, 802 were assigned to the least to moderately technical levels. Nine items were found to be in the most technical group, including accretable, accretive and lien (p. 169). Ha (2015) also rated the technicality of 539 multi-word units extracted from the Financial corpus using a meaning-based categorisation. Examples of moderately technical multi-word units include real estate, fair value, capital expenditures, balance sheet and deferred tax. Very technical multi-word units include common stock, carrying value and mutual fund. This categorisation depended on the distance between the literal meanings of the constituent parts of the multi-word unit. The inclusion of written and spoken corpora in Ha’s (2015) research is important because research into spoken vocabulary for professional purposes is less common than research into written vocabulary. A study of specialised single-word and multi-word units in English for Professional Purposes by Salvi (2014) gives us insight into the lexical choices of two European political leaders speaking in a time of financial crisis.
Not surprisingly, Salvi (2014) finds that the speakers share topic-specific vocabulary such as financial, crisis, economic and euro, but also that they differ, in that one speaker favours growth and the other favours budget. A follow-up analysis of multi-word units in the speeches shows that one speaker used more specific technical units, such as percent of GDP, the sovereign debt crisis and the financial stability board, than the other.