D12 Grammar in the construction of online discussion messages - Section D Extension

English grammar - Roger Berry 2012



Ann Hewings and Caroline Coffin

Ann Hewings and Caroline Coffin (2004) reprinted from Applying English Grammar, Caroline Coffin, Ann Hewings and Kieran O’Halloran (eds), London: Arnold, pp. 137-143.

In this research article, Hewings and Coffin investigate computer-mediated communication (CMC). Although it is written language, it has many of the features of speech, and their basic aim is to see whether it is more similar to speech or writing. For this purpose they compiled a corpus of conference messages posted by students regarding an assignment, and a corpus of the essays written for that assignment. For the sake of comparison, they also used data from the conversation and academic prose corpora from the Longman Grammar of Spoken and Written English (Biber et al. 1999).

These were then compared according to two measures thought to be indicative of the spoken/written distinction: the frequency of personal pronouns, and the frequency of coordinating conjunctions. The results are shown below.

The paper has been greatly shortened; two general sections from the beginning, 8.1 (Introduction) and 8.2 (Background to this Study), have been omitted, as well as further findings and the conclusion.

8.3 A Hybrid Medium

The premise on which this research project was based was that communication between tutors and students using CMC would be influenced by the medium itself and that CMC as a new and evolving medium would affect the way learning took place. At its simplest, the medium of communication is characterised in terms of whether the language is spoken or written. Studies such as those by Biber and associates (1999), Carter and McCarthy (e.g. 1995), Brazil (1995) and Halliday (2002: 323-52) describe grammatical differences that result from the contexts of spoken interaction as opposed to written. Carter and McCarthy go furthest in identifying grammatical realisations associated not just with speech but with specific contexts of use, such as the language used in telling stories, recounting experiences, or casual conversation. Writing too displays grammatical differences which are context-dependent. The choices of both grammar and lexis are likely to be very different between, for instance, a letter to a friend and an essay written for assessment purposes. It is not enough then just to focus on writing or speech as the descriptive categories for ‘medium’.

If we are able to identify grammatical choices that are more commonly associated with writing or speech, we should also be able to characterise the grammatical choices made in CMC. However, this is complicated by the fact that CMC can be influenced not just by the medium but by other variables as we mentioned earlier (how well the CMC contributors know each other, their communicative purpose, and the subject matter being discussed). In addition, as Baron notes in relation to e-mail, the speed at which the technology is evolving suggests that people are having to familiarise themselves with a medium that is not yet stable:

Email is more a moving linguistic target than a stable system, thereby complicating the problem of constructing a unified grammar of email. Three major sources of fluidity in email bear note: evolution of the technology, growth in usership, and partial maturation of the genre.

(Baron, 1998: 144)

Despite this lack of stability there are now some studies looking at the grammar of emails and other forms of CMC. Table 8.1 indicates a variety of features differentiated on the grounds of their typicality in speech or writing. The letter C indicates which of these features have also been found commonly in emails/CMC.

Table 8.1 Email/CMC grammar in relation to spoken and written modes

Source: Adapted from Baron (1998: 153)

The features listed in Table 8.1 were identified by Collot and Belmore (1996) and Yates (1996) (both cited in Baron, 1998) and were based on an analysis of ‘one to many dialogues’, that is, where a message is sent out to a number of recipients such as is the case in computer conference discussions. We followed up this work by examining the grammatical features of our CMC data to help us understand how people are utilising the new medium and to what extent their interaction more closely resembles speech as it would in a face-to-face tutorial or whether it is more like written academic prose.

8.4 Methodology 1: Corpus Analysis

To answer the question ‘Does CMC exhibit differences in grammatical choices from conventional speech or writing?’ we made use of an electronic corpus of texts and concordancing software. We aimed to give an empirical basis to our informal impressions that electronic messages in the conference environment show similarities with speech even though they are in fact written text.

To make comparisons between speech, writing, and CMC we collected both CMC messages and conventional written essays from students and made use of a published source of conversational data described below. The conference messages and the essays together formed a ‘corpus’, a collection of texts, that could be examined using the computer software WordSmith Tools (Scott, 1996). Two sub-corpora were prepared: the conference messages (142,078 words) and the essays (110,112 words). Texts in both sub-corpora were prepared for analysis by first converting them into rich text format, which is readable by the software, and then by tidying up the resultant files. For example, some HTML characters had become embedded in the conference text messages and these were deleted. Where people had sent messages as attachments these were opened and also included in the sub-corpus. The corpus as a whole was left as plain text, that is, it was not grammatically tagged, so only words and not parts of speech could be searched for.
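The tidying step described above can be illustrated with a minimal Python sketch. This is not the authors' procedure (they worked with WordSmith Tools and rich text files); the function name `clean_message` and the sample message are hypothetical, and the sketch simply assumes that the stray HTML consists of tags and character entities.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <p> or </div>.
TAG_RE = re.compile(r"<[^>]+>")


def clean_message(raw: str) -> str:
    """Strip HTML tags, decode character entities, and collapse whitespace."""
    no_tags = TAG_RE.sub(" ", raw)        # replace each tag with a space
    decoded = html.unescape(no_tags)      # turn &amp; into &, &#39; into ', etc.
    return " ".join(decoded.split())      # normalise runs of whitespace


# Hypothetical conference message with embedded HTML residue.
sample = "<p>I agree with Jo&#39;s point &amp; would add more.</p>"
print(clean_message(sample))
```

Cleaning before counting matters because leftover markup would otherwise be counted as words and distort the per-million frequencies reported later.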

In addition to this corpus we also made use of the detailed corpus analysis carried out by Biber et al. (1999) published as the Longman Grammar of Spoken and Written English (LGSWE). The LGSWE is premised on the understanding that grammatical features vary in different contexts and under different circumstances. This is illustrated throughout by comparing the distribution of different grammatical features among groups of texts classified as the registers of conversation, fiction, news and academic prose. For the purposes of this study, we use the LGSWE findings for conversation to allow comparisons across spoken, written, and CMC interactions. We also give their figures for grammatical structures in academic prose as representative of writing similar to, or more formal than, our student essay data.

The first task was to ascertain whether or not the written conference compositions represented the spoken-written hybrid reported in emails/CMC by Baron. This would enable us to see whether communication in the conferences showed any evidence of the dialogue and interaction that take place in face-to-face tutorials or whether they were more typical of academic writing. In general, tutorials are seen as more successful if students interact with each other as well as the tutor and reflect on their learning. Two simple measures of interactivity were used relating to pronouns and coordinators. The first was based on the findings of Collot and Belmore (1996) and Yates (1996) (summarised in Table 8.1) on personal pronoun usage. In spoken interaction there is greater use of personal pronouns such as I and we, so if the interaction taking place has spoken-like qualities, pronoun usage is likely to be higher than in traditional written academic prose. A second comparison was based on the findings reported in the LGSWE, which identified marked differences in the distribution of common coordinators (and, but and or) between conversation and academic prose. Both pronouns and coordinators are relatively easy to measure numerically using corpus analysis software, and comparisons were made between the CMC and essay data collected for this research and with the findings reported in LGSWE.

8.5 Results and Discussion 1

All personal pronouns and corresponding possessive and reflexive forms (with the exception of it, its, itself) were counted in the conference messages and essays sub-corpora. The raw figures were converted into frequencies per million words to make them comparable with each other and with the data provided by graphs in LGSWE. Table 8.2 compares first and second person pronoun usage in the conferences and essays sub-corpora.
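The conversion from raw counts to frequencies per million words can be sketched as follows. This is an illustration only, not the authors' WordSmith Tools workflow: the pronoun list, the simple letters-only tokeniser, and the function names are all assumptions, and the sample sentence is invented.

```python
import re

# Illustrative list of first and second person forms
# (personal, possessive and reflexive); an assumption, not the authors' list.
FIRST_SECOND_PERSON = {
    "i", "me", "my", "mine", "myself",
    "we", "us", "our", "ours", "ourselves",
    "you", "your", "yours", "yourself", "yourselves",
}

# Crude tokeniser: runs of letters only, so "I'm" becomes "i" and "m".
WORD_RE = re.compile(r"[a-z]+")


def tokenize(text: str) -> list[str]:
    return WORD_RE.findall(text.lower())


def per_million(count: int, total_words: int) -> float:
    """Normalise a raw count to a frequency per million words."""
    if total_words == 0:
        return 0.0
    return count * 1_000_000 / total_words


def pronoun_rate(text: str, pronouns: set[str] = FIRST_SECOND_PERSON) -> float:
    """Frequency per million words of the listed pronoun forms in the text."""
    tokens = tokenize(text)
    hits = sum(1 for token in tokens if token in pronouns)
    return per_million(hits, len(tokens))


# Invented example: 4 pronoun forms (I, we, you, yours) in 11 words.
sample = "I think we agree, but you said the essay was yours."
print(round(pronoun_rate(sample)))
```

Normalising in this way is what allows sub-corpora of different sizes (142,078 and 110,112 words here) to be compared directly. Because the matching is on word forms alone, a noun such as mine would be counted as a pronoun, which reflects the limitation of an untagged corpus noted in the methodology.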

Table 8.2 Frequencies of first and second person personal pronouns

The findings support the summary of evidence given by Baron in Table 8.1 above: that first and second person pronoun use is much higher in email/conference messages than in academic prose. Despite the writing being by the same people and on the same topics, first and second person pronouns are three times more common in the conference messages, indicating that interpersonal interactivity in the conference is high.

Biber et al. (1999) in LGSWE do not break down their data on pronoun usage in conversation and academic prose to first and second person versus third person. Their statistics indicate only overall usage of personal pronouns. LGSWE data contains all personal and reflexive pronouns, including it, its and itself. To make our data more comparable it was therefore necessary to include third person pronouns. However, we stopped short of including it, its, and itself for two reasons. First, personal pronouns are used as an indicator of the more personal and involved stance of the author. Pronouns such as I or you are therefore of importance, whereas it is likely to be less significant. Second, it has functions other than just as a personal pronoun (see Chapter 6 by Hewings and Hewings, this volume). It would have been necessary to edit concordance lines manually for non-personal pronoun uses of it and then to run the frequency counts. Had the corpus been grammatically tagged, as was that used by Biber et al. (1999), this procedure could have been automated. The personal pronoun comparison with our corpora needs therefore to be treated as a very rough guide only.
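To show why manual editing of concordance lines for it would have been needed, here is a minimal keyword-in-context (KWIC) sketch in Python. It is a hypothetical stand-in for the concordancing software, not a reproduction of it; the function name `kwic`, the `width` parameter and the sample text are assumptions.

```python
import re


def kwic(text: str, keyword: str, width: int = 30) -> list[str]:
    """Return one keyword-in-context line per whole-word match of keyword."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


# Invented example: the first "It" is anticipatory, the second refers to the essay.
sample = "It is clear that the essay was late. I read it yesterday."
for line in kwic(sample, "it", width=20):
    print(line)
```

Both occurrences are retrieved by a plain word search, but only the second is a personal pronoun use. Without part-of-speech tags, someone has to read each line and decide, which is the manual step the authors chose to avoid.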

Table 8.3 Frequencies of personal pronouns

The results in Table 8.3 show the same general trends in the data as we observed in Table 8.2, but the variation between conference messages and essays is not as great as that between conversation and academic prose. This finding supports the description of CMC messages as a spoken-written hybrid. The conference messages show features of speech in that pronoun usage indicates a personal dialogue between participants. However, as they are written forms and the interactants are not in a face-to-face context, the contrasts are not as marked as the LGSWE findings.

The research by Biber et al. also identifies a difference in the use of common coordinators (and, but, and or) in different registers. And is the most frequent coordinator, linking both phrases and clauses throughout their corpus, but it is, surprisingly, less frequent in conversation than in academic prose. Example 1 from the academic prose sub-corpus used by LGSWE (Biber et al., 1999: 83) shows both uses. The first and third uses serve to link phrases (and potassium, and iron) whereas the second and fourth uses serve to link clauses (and those which are, and are transferred).

1. A distinction is needed between elements, which include nitrogen, phosphorus and potassium, which are mobile in the phloem and those which are comparatively immobile, for example, calcium, boron and iron, and are transferred only slowly to the developing organ.

The authors of LGSWE suggest that and is less frequent in conversation than in academic prose because in conversation phrases are simpler and do not need coordination. In contrast, but occurs more frequently in conversation because, although it cannot be used to coordinate noun phrases, it serves to string together clauses in which ideas are contrasted without too much pre-planning. But allows the speaker to modify a statement (example 2) and others to disagree (example 3):

2. I think he will have salad but he doesn’t like tomatoes.

3. A: The golden rule is if you’re reversing you must look behind you!

B: Yeah, but she said she did.

(Biber et al., 1999: 82)

Or is the least frequent of the coordinators but is relatively more common in academic prose. This may be because academic discourse frequently considers alternatives (example 4) and explains terminology (example 5):

4. Such movements may come from local or regional deformation of the land or from a global rise or fall of sea level.

5. According to Chamberlain and Moulton, these broke into small chunks, or planetesimals, which went flying as cold bodies into orbits around the Sun.

(Biber et al., 1999: 82)

Frequencies for and, but, and or were obtained in our two sub-corpora and the results made comparable with those in LGSWE (see Table 8.4).

Table 8.4 Frequencies of simple coordinators (per million words)

Again, we found that trends are similar. And and or are used less frequently in conference messages and conversation than in essays and academic prose. But is used more frequently in conference messages and conversation than in essays and academic prose. However, as would be expected, given the differences between the corpora, the proportions differ.

In interpreting these comparisons we need to be careful regarding the findings concerning and. LGSWE notes that in conversation 80 per cent of occurrences of and are to link phrases (single nouns or noun phrases) such as in ‘individual personalities and learning styles’. In contrast, in academic text 35 per cent of occurrences are as clause-level connectors, as in ‘Does it only happen in a formal framework and is it only something experts deal with’. As neither clauses nor noun phrases were tagged within our corpus, it was not possible to verify statistically whether this was the case in the data examined here. However, qualitative observation of the data suggests that and does function in this contrasting fashion in different types of conference message. This is returned to in the qualitative analysis discussed below.

Use of a corpus methodology enabled us to observe quantitative trends in the use of certain key grammatical indicators. Analysis of pronouns and coordinators confirms a difference in grammatical choices between the sub-corpora. Personal involvement as indicated by pronoun usage is more significant in conference messages. However, it is still higher in student essays than in the LGSWE academic prose sub-corpus, perhaps indicating that essays in applied linguistics have a more personal dimension than in non-disciplinary specific texts composed by professional academics.

With regard to coordination, we can say that the frequency of and, but and or in conference messages lies somewhere between their frequency in conversation and in academic prose, as represented by the LGSWE corpus evidence. This suggests that conference messages are indeed a hybrid form.

Questions, suggestions and issues to consider

1. How do the findings fit in with what was said in A12 about the use of and for coordination in spoken English?

2. The authors do not discuss whether ellipsis was present in their data. How might this have affected their findings?

3. Do you think the evidence from personal pronouns and coordinating conjunctions is enough to prove that CMC is different from other forms of writing? What other linguistic forms could be investigated to provide further evidence?

4. Look at some emails or other forms of CMC that you receive in English. How far do they correspond to informal speech or formal writing? What features are prominent, for example, ellipsis?

5. Is there any variation between or within the different types of CMC that you write and receive? What factors are involved in differences? For example, do you write differently when addressing friends, or parents, or teachers? And are there differences in what they write to you?