Choosing Symbols - Writing Well

Mathematical Writing - Vivaldi Franco 2014

Choosing mathematical notation is difficult. Mathematicians are notoriously reluctant to accept standardisation of notation, to a degree unknown in other disciplines. Indeed, the ability to adjust quickly to new notation is regarded as one of the skills of the trade. The reality is somewhat different: absorbing new notation requires effort, and most people would gladly avoid it. So if we intend to communicate mathematics without confusing or alienating our audience, the notation must be simple, logical, and consistent.

Two golden rules should inform the use of symbols:

·  DO NOT INTRODUCE UNNECESSARY SYMBOLS.

·  DEFINE EVERY SYMBOL BEFORE USING IT.

Once defined, a symbol should be used consistently. Never use the same symbol for two different things or two symbols for the same thing, even in instances appearing far apart in the document. Don’t write ‘$$A_j$$, for $$1\leqslant j<n$$’ in one place and ‘$$A_k$$, for $$1\leqslant k<n$$’ in another, unless there is a good reason to do so. Such small inconsistencies produce ‘notational pollution’. As the pollution piles up, reading becomes tiresome.

In this section we offer guidelines on how to choose symbols. These are not laws, and may be adapted to one’s taste or rejected. What’s important is to develop awareness of notation, and to make conscious decisions about it.

SETS. Represent sets by capital letters, Roman or Greek, such as

$$ S\qquad \mathscr{A}\qquad \varOmega . $$

The large variety of fonts available in modern typesetting systems increases our choice. For generic sets, $$A,B,C$$ or $$X,Y,Z$$ are good symbols. For specific sets, choose a symbol that will remind the reader of the nature of the set. So, for an alphabet $$\{a,b,c,\ldots \}$$, the symbols $$A,\mathscr{A}$$ are obvious choices; likewise, $$F$$ is appropriate for a set of functions, etc.

Lower-case symbols like $$x,y$$ represent the elements of a set. So $$x\in A$$ is a good notation, $$X\in A$$ is bad, and $$X\in a$$ is very bad. If more than one set is involved, consider using matching symbols. Thus

$$ a\in A\qquad b\in B\qquad c\in C $$

is more coherent than

$$ x\in A\qquad y\in B\qquad z\in C. $$

Some thought may be required for sets of sets. Consider the expressions

$$ f(A\cap B) \qquad \quad f(x\cap y). $$

The left expression will be interpreted as the image of the intersection of two sets under a function. In this case $$A\cap B$$ represents a subset of the domain of $$f$$. However, suppose that the domain of $$f$$ is a set of sets (e.g., a power set, see Sect. 2.3). Then the argument of $$f$$ is a set, and this expression becomes dangerously ambiguous. The notation in the right expression removes this ambiguity. The combination of standard symbols $$x,y$$ for variables and a set operator makes it clear that the argument $$x\cap y$$ of the function is an element, rather than a subset, of the domain.
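A small worked instance may help to separate the two readings. The set $$S$$, the function $$f$$ (cardinality), the power-set symbol $$\mathcal{P}(S)$$ and the particular sets below are assumptions chosen for illustration; they do not come from the text.

```latex
% Fragment; assumes \usepackage{amsmath,amssymb} in the preamble.
% Illustrative assumption: f counts the elements of a subset of S.
\[
  S=\{1,2,3\}, \qquad f\colon \mathcal{P}(S)\to\mathbb{N}, \qquad f(x)=\#x .
\]
% Lower-case x, y: elements of the domain (each one is itself a set).
\[
  x=\{1,2\},\quad y=\{2,3\}
  \quad\Longrightarrow\quad
  f(x\cap y)=f(\{2\})=1 .
\]
% Capital A, B: subsets of the domain (sets of sets); f(A \cap B) is an image.
\[
  A=\{\{1\},\{1,2\}\},\quad B=\{\{1,2\},\{3\}\}
  \quad\Longrightarrow\quad
  f(A\cap B)=\{f(\{1,2\})\}=\{2\} .
\]
```

The first result is a number and the second is a set of numbers. Only the case of the letters tells the reader which of the two is meant.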

INTEGERS. When choosing a symbol for an integer, begin from the middle region of the Roman alphabet

$$\begin{aligned} i,j,k,l,m,n \end{aligned}$$

(6.1)

particularly if an integer is used as subscript or superscript. However, use $$p$$ for a prime number, and $$q$$ if there is a prime different from $$p$$. The list of adjacent letters (6.1) cannot be extended; the preceding symbols, $$f,g,h$$, are typical function names (see below), while the symbol that follows, $$o$$, is rarely used, not only for its resemblance to 0 (zero), but also because it has an established meaning in asymptotic analysis (it appears in expressions of the form $$o(\log x)$$). A capital letter in the list (6.1) may be used to represent a large integer, or combined with lower-case letters to denote an integer range: $$n=1,\ldots ,N$$. This combination of symbols is much used in connection with sums and products (Sect. 3.2).

RATIONALS. For rational numbers, use lower-case Roman letters in the ranges $$a$$–$$e$$ or $$p$$–$$z$$. The notation

$$ r={m\over n} $$

is good, because ‘$$r$$’ reminds us of ‘rational’, while numerator and denominator conform to the convention for integers. If there is more than one rational, use adjacent symbols, $$s,t$$ in this case.

REAL NUMBERS. For real numbers, use the same part of the Roman alphabet as for rationals, or the Greek alphabet:

$$ \alpha ,\, \beta ,\,\gamma ,\ldots $$

If there are both rational and real numbers and if the distinction between them is important, then use Roman for the rationals and Greek for the reals.

Some Greek symbols have preferential meaning: small quantities are usually represented by $$\varepsilon ,\delta $$, while for angles one uses $$\phi ,\varphi ,\theta $$. Famous real constants have dedicated symbols:

$$ \uppi \qquad \mathrm{e}\qquad \upgamma $$

(6.2)

These constants are typeset with the ’upright’ typeface, to highlight their distinguished status. Upright fonts are known as roman (even for Greek letters), as opposed to italic, which are slanted fonts: $$\pi ,e,\gamma $$.
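One possible way to obtain the upright forms in LaTeX is sketched below. The choice of the `upgreek` package and the macro name `\eu` are assumptions made for illustration; the text does not say how its own symbols were produced.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{upgreek} % assumed choice: provides upright Greek (\uppi, \upgamma)

% Hypothetical macro name for the upright base of natural logarithms.
\newcommand{\eu}{\mathrm{e}}

\begin{document}
Upright (roman) constants: $\uppi$, $\eu$, $\upgamma$.

Italic variables, for contrast: $\pi$, $e$, $\gamma$.
\end{document}
```

Defining a macro once keeps the typeface of each constant consistent throughout a document.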

COMPLEX NUMBERS. Complex numbers tend to occupy the end of the Roman alphabet, and your first choice should be $$z$$ or $$w$$. On the complex plane, we write $$z=x+\mathrm{i}y$$, where $$x$$ and $$y$$ are the real and imaginary parts of $$z$$, and $$\mathrm{i}$$ is the imaginary unit, again typeset in roman font. (However, number theorists use $$\sqrt{-1}$$, not $$\mathrm{i}$$.) The standard notation for a complex number in polar coordinates is $$z=\rho \mathrm{e}^{\mathrm{i}\theta }$$, which also combines roman and italic fonts.

UNKNOWNS. The quintessential symbol for an equation’s unknown is $$x$$, invariably followed by $$y$$ and $$z$$ if there are more unknowns. For a large number of unknowns it is necessary to use sequence notation $$x_1,\ldots ,x_n$$. This notation is also appropriate for the indeterminates of a polynomial and for the arguments of a function.

COMPOSITE OBJECTS. Objects which have constituent elements (groups, graphs, matrices) are best represented with capital letters, Roman or Greek. So use $$G$$ or $$\varGamma $$ for a group or a graph, and $$M$$ for a matrix. If you have two groups, use adjacent symbols, like $$G$$ and $$H$$. As with sets, for these objects’ components consider using matching symbols, e.g., $$g\in G$$. Graphs are a notable exception: $$v$$ and $$e$$ are invariably used for vertices and edges, respectively.

FUNCTIONS. The default choice for a function’s name is, of course, $$f$$, and if there is more than one function, use the adjacent symbols $$g,h$$. Lower-case symbols work well with any number of variables: $$f(x),f(x,y,z),f(x_1,\ldots ,x_n)$$. But if the co-domain of a function is a cartesian product, then the function is a vector, and capital letters are preferable. So a real function of two variables may be specified as

$$ F{:}\,\mathbb {R}^2\rightarrow \mathbb {R}^2 \qquad (x,y)\mapsto (f_1(x,y),f_2(x,y)). $$

Greek symbols, either capital or lower-case, are also commonly used for functions’ names. The contrast between Roman and Greek symbols may be exploited to separate out the symbols’ roles, as in $$\mu (x)$$ or $$f(\lambda )$$.

Some famous functions are named after, and represented by, a symbol (often a Greek one), thereby creating a strong bond between object and notation. The best known are Euler’s gamma function $$\Gamma $$

$$ \Gamma (s)=\int \limits _0^\infty x^{s-1}\mathrm{e}^{-x}\,\mathrm{d}x $$

(the extension of the factorial function to complex arguments), and Riemann’s zeta-function $$\zeta $$

$$\begin{aligned} \zeta (s)=\sum _{n=1}^\infty \,\frac{1}{n^s}. \end{aligned}$$

(6.3)

There is a peculiar notation for this function: its complex argument $$s$$ is commonly written as $$s=\sigma +\mathrm{i}t$$, with $$\sigma $$ and $$t$$ real numbers, which mixes Greek and Roman letters. Other functions with dedicated notation are Euler’s $$\varphi $$-function (Eq. 3.8), Dedekind’s $$\eta $$-function, Kronecker’s $$\delta $$-function, Weierstrass’ $$\wp $$-function, Lambert’s $$W$$-function, etc.
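The earlier remark that Euler’s gamma function extends the factorial can be checked from the integral that defines it. The derivation below is the standard one, given as a sketch; it assumes that $$s$$ has positive real part, so that the boundary term vanishes.

```latex
% Fragment; assumes \usepackage{amsmath} in the preamble.
% Integration by parts, assuming Re(s) > 0:
\begin{align*}
  \Gamma(s+1)
    &= \int_0^\infty x^{s}\,\mathrm{e}^{-x}\,\mathrm{d}x \\
    &= \Bigl[-x^{s}\,\mathrm{e}^{-x}\Bigr]_0^\infty
       + s\int_0^\infty x^{s-1}\,\mathrm{e}^{-x}\,\mathrm{d}x \\
    &= s\,\Gamma(s).
\end{align*}
% Since \Gamma(1) = \int_0^\infty e^{-x} dx = 1, induction on n gives
\[
  \Gamma(n+1)=n! \qquad n=0,1,2,\ldots
\]
```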

SEQUENCES AND VECTORS. Sequences pose specific notational problems due to the presence of indices. Consider the various possibilities listed in (3.2): which one should we choose? Be guided by the principle of economy: a symbol should be introduced only if it’s strictly necessary. So the notation $$(a_k)$$ is quite adequate for a generic sequence, or if the specific properties of the sequence (the initial value of the index, its finiteness) are not relevant. When more information is needed, the notation $$(a_k)_{k\geqslant 1}$$ is more economical than $$(a_k)_{k=1}^\infty $$, but the latter may be a better choice if it is to be contrasted with $$(a_k)_{k=1}^n$$. In turn, the latter is not as friendly as $$(a_1,\ldots ,a_n)$$, although it is more concise.

If a sequence is referred to often, even the stripped down notation $$(a_k)$$ could become heavy, and it may be advisable to allocate a symbol for the sequence:

$$ a=(a_1,a_2,\ldots ) \qquad \quad \mathbf{v}=(v_1,\ldots ,v_n). $$

As usual, we have employed matching symbols, using, respectively, a minimalist lower-case Roman character and a lower-case boldface character which is common for vectors. When using ellipses, two or three terms of the sequence usually suffice, but there are circumstances where more terms or a different arrangement of terms is needed.

For example, in the expression

$$ (1+x,1+x^2,\ldots ,1+x^{2^k},\ldots ) $$

the insertion of the general term removes any ambiguity, while the ellipsis on the right suggests that the sequence is infinite—cf. (3.1). The notation

$$ (a_1,\ldots ,a_{k-1},a_{k+1},\ldots ,a_n) $$

denotes a sub-sequence of a finite sequence obtained by deleting the $$k$$-th term, for an unspecified value of $$k\not =1,n$$.

Things get complicated with sequences of sequences. This situation is not at all unusual; for instance, we may have a sequence of vectors whose components must be referred to explicitly. We write

$$ V=(V_1,V_2,\ldots ) \qquad \text{ or }\qquad \mathbf{v}=(\mathbf{v}_1,\mathbf{v}_2,\ldots ). $$

Let $$V_k$$ (or $$\mathbf{v}_k$$) be the general term of our sequence. How are we to represent its components? As usual, we choose the matching symbol $$v$$, with a subscript indicating the component. However, the integer $$k$$ has to appear somewhere, and its range must be specified. It is advisable to keep $$k$$ out of the way as much as possible:

$$ V_k=(v_1^{(k)},\ldots ,v_n^{(k)})\qquad k=1,2,\ldots . $$

In this expression we have used ellipses to specify the ranges of the indices; we could have used inequalities as well:

$$ V_k=(v_j^{(k)})\qquad 1\leqslant j\leqslant n,\quad k\geqslant 1. $$

The parentheses are obviously needed for the superscript, for otherwise $$v_j^k$$ would be interpreted as $$v_j$$ raised to the $$k$$-th power. However, it may just happen that we need to raise the vector components to some power. Clearly we can’t use adjacent superscripts $$v_3^{(2)\,4}$$, so parentheses are needed, but the straightforward notation $$(v_3^{(2)})^4$$ is awkward. For a more elegant solution, we represent $$k$$ as an additional subscript, adopting, in effect, matrix notation:

$$ V_k=(v_{1,k},\ldots ,v_{n,k})\qquad k\geqslant 1. $$
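The two conventions can be compared side by side on the same quantities. The particular indices, and the sum of squares, are chosen here only for illustration.

```latex
% Fragment; assumes \usepackage{amsmath} in the preamble.
% Fourth power of the third component of the second vector.
\[
  \bigl(v_3^{(2)}\bigr)^4      % k as a parenthesised superscript
  \qquad\text{versus}\qquad
  v_{3,2}^{\,4}                % k as a second subscript
\]
% Sum of the squares of the components of V_k, in both notations.
\[
  \sum_{j=1}^{n}\bigl(v_j^{(k)}\bigr)^2
  \qquad\text{versus}\qquad
  \sum_{j=1}^{n} v_{j,k}^{\,2}
\]
```

With two subscripts the superscript position stays free for powers, so no extra parentheses are needed.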

As a side note, one should keep in mind that with vectors the multiplication symbols ‘$$\cdot $$’ and ‘$$\times $$’ are reserved for the scalar and vector products, respectively; cf. (2.10). Hence for scalar multiplication we must use juxtaposition:

$$ a(bV\cdot cW)\qquad \quad x\,\mathbf{v}\times y\,\mathbf{u}. $$

DERIVED SYMBOLS. Closely related objects require closely related notation. Proximity in the alphabet, e.g., $$x,y,z$$, may be used for this purpose. For a stronger bond, the meaning of a symbol may be modified using subscripts, superscripts and other decorations:

$$ A^*\qquad \overline{\eta }\qquad n^+\qquad \underline{h}\qquad \tilde{e}\qquad \varOmega _-\qquad Z_r. $$

The derived symbol $$\mathbb {N}_0=\mathbb {N}\cup \{0\}$$ is often found in the literature. Many symbols derived from $$\mathbb {R}$$ are in use:

These sentences illustrate the use of derived symbols:

Let $$f{:} X\rightarrow X$$ be a function, and let $$x^*=f(x^*)$$ be a fixed point of $$f$$.

We consider the endpoints $$x_-$$ and $$x_+$$ of an interval containing $$x$$.

Let $$f$$ be a real function, and let

$$\begin{aligned} f^+{:}\,\mathbb {R}\rightarrow \mathbb {R}\qquad x\mapsto \begin{cases} f(x) & \text {if } f(x)\geqslant 0 \\ 0 & \text {if } f(x)< 0. \end{cases} \end{aligned}$$

It must be noted that there is no general agreement on the meaning of decorations. Thus for a set the over-bar denotes the so-called closure, which adjoins to a set all its boundary points (see Sect. 5.4.1); the transformation from $$\mathbb {R}$$ to $$\overline{\mathbb {R}}$$ is essentially a closure operation. However, for complex numbers the over-bar denotes complex conjugation. If $$f$$ is a function, then $$f\,{}'$$ is the derivative of $$f$$, but for sets a prime indicates taking the complement.

EXAMPLE. We illustrate notational problems raised by the coexistence of variables and parameters. For a fixed value of $$z$$, the bivariate polynomial

$$ f(x,z)=-z^2+xz+1 $$

becomes a polynomial in $$x$$, and we wish to adopt a notation that reflects the different roles played by the symbols $$x$$ and $$z$$. We replace $$z$$ with $$a$$, to keep it far apart from $$x$$ in the alphabet, and we rewrite the expression above as

$$\begin{aligned} f_a(x)=ax+1-a^2\qquad a\in \mathbb {R}. \end{aligned}$$

(6.4)

We now have a one-parameter family of linear polynomials in $$x$$. For fixed $$a$$, the equation $$y=f_a(x)$$ is the cartesian equation of a line, so we also have a one-parameter family of lines in the plane. Plotting some of these lines reveals a hidden structure: their envelope is a parabola (Fig. 6.1). Likewise, if we fix $$x=a$$ we obtain a one-parameter family of quadratic polynomials $$g_a(z)=-z^2+az+1$$.

Fig. 6.1

The one-parameter family of lines $$y=f_a(x)$$, where $$f_a$$ is given by (6.4). These lines are tangent to the parabola $$y=x^2/4+1$$
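The parabola named in the caption can be recovered from (6.4) by the usual envelope computation. What follows is a sketch of that standard calculation, not a derivation given in the source.

```latex
% Fragment; assumes \usepackage{amsmath,amssymb} in the preamble.
% Envelope of the family y = f_a(x): impose the equation of the line
% together with the vanishing of its derivative with respect to a.
\begin{align*}
  y &= ax + 1 - a^2, \\
  0 &= \frac{\partial}{\partial a}\bigl(ax + 1 - a^2\bigr) = x - 2a
  \quad\Longrightarrow\quad a = \frac{x}{2}.
\end{align*}
% Substituting a = x/2 into the first equation:
\[
  y = \frac{x^2}{2} + 1 - \frac{x^2}{4} = \frac{x^2}{4} + 1 .
\]
% Tangency check: the gap between parabola and line is a perfect square.
\[
  \Bigl(\frac{x^2}{4} + 1\Bigr) - \bigl(ax + 1 - a^2\bigr)
  = \Bigl(\frac{x}{2} - a\Bigr)^{2} \geqslant 0,
  \qquad\text{with equality only at } x = 2a .
\]
```

Each line therefore touches the parabola at the single point with $$x=2a$$ and lies below it elsewhere.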