
Multimodal Discourse Analysis

Systemic-Functional Perspectives

Open Linguistics Series


Series Editor
Robin Fawcett, Cardiff University
The series is 'open' in two related ways. First, it is not confined to works associated with
any one school of linguistics. For almost two decades the series has played a significant
role in establishing and maintaining the present climate of 'openness' in linguistics, and
we intend to maintain this tradition. However, we particularly welcome works which
explore the nature and use of language through modelling its potential for use in social
contexts, or through a cognitive model of language - or indeed a combination of the two.
The series is also 'open' in the sense that it welcomes works that open out 'core'
linguistics in various ways: to give a central place to the description of natural texts and the
use of corpora; to encompass discourse 'above the sentence'; to relate language to other
semiotic systems; to apply linguistics in fields such as education, language pathology and
law; and to explore the areas that lie between linguistics and its neighbouring disciplines
such as semiotics, psychology, sociology, philosophy, and cultural and literary studies.
Continuum also publishes a series that offers a forum for primarily functional
descriptions of languages or parts of languages - Functional Descriptions of Language.
Relations between linguistics and computing are covered in the Communication in Artificial
Intelligence series; two series, Advances in Applied Linguistics and Communication in Public Life,
publish books in applied linguistics; and the series Modern Pragmatics in Theory and Practice
publishes both social and cognitive perspectives on the making of meaning in language
use. We also publish a range of introductory textbooks on topics in linguistics, semiotics
and deaf studies.
Recent titles in this series
Classroom Discourse Analysis: A Functional Perspective, Frances Christie
Construing Experience through Meaning: A Language-based Approach to Cognition,
M. A. K. Halliday and Christian M. I. M. Matthiessen
Culturally Speaking: Managing Rapport through Talk across Cultures, Helen Spencer-Oatey (ed.)
Educating Eve: The 'Language Instinct' Debate, Geoffrey Sampson
Empirical Linguistics, Geoffrey Sampson
Genre and Institutions: Social Processes in the Workplace and School, Frances Christie and
J. R. Martin (eds)
The Intonation Systems of English, Paul Tench
Language Policy in Britain and France: The Processes of Policy, Dennis Ager
Language Relations across Bering Strait: Reappraising the Archaeological and Linguistic Evidence,
Michael Fortescue
Learning through Language in Early Childhood, Clare Painter
Pedagogy and the Shaping of Consciousness: Linguistic and Social Processes, Frances Christie (ed.)
Register Analysis: Theory and Practice, Mohsen Ghadessy (ed.)
Relations and Functions within and around Language, Peter H. Fries, Michael Cummings,
David Lockwood and William Spruiell (eds)
Researching Language in Schools and Communities: Functional Linguistic Perspectives,
Len Unsworth (ed.)
Summary Justice: Judges Address Juries, Paul Robertshaw
Syntactic Analysis and Description: A Constructional Approach, David G. Lockwood
Thematic Developments in English Texts, Mohsen Ghadessy (ed.)
Ways of Saying: Ways of Meaning. Selected Papers of Ruqaiya Hasan, Carmen Cloran, David
Butt and Geoffrey Williams (eds)
Words, Meaning and Vocabulary: An Introduction to Modern English Lexicology, Howard Jackson
and Etienne Z Amvela
Working with Discourse: Meaning beyond the Clause, J. R. Martin and David Rose

Multimodal Discourse Analysis


Systemic-Functional Perspectives

Edited by Kay L. O'Halloran

continuum
LONDON, NEW YORK

Continuum
The Tower Building
11 York Road
London SE1 7NX

15 East 26th Street


New York
NY 10010

© Kay L. O'Halloran 2004


All rights reserved. No part of this publication may be reproduced or transmitted
in any form or by any means, electronic or mechanical, including photocopying,
recording, or any information storage or retrieval system, without prior permission
in writing from the publishers.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
ISBN: 0-8264-7256-7
Library of Congress Cataloguing-in-Publication Data
A catalogue record for this book is available from the Library of Congress.
Typeset by RefineCatch Limited, Bungay, Suffolk
Printed and bound in Great Britain by MPG Books Ltd, Bodmin, Cornwall

Contents

Introduction
Kay L. O'Halloran

Part I
Three-dimensional material objects in space

1 Opera Ludentes: the Sydney Opera House at work and play
Michael O'Toole

2 Making history in From Colony to Nation: a multimodal analysis of a museum exhibition in Singapore
Alfred Pang Kah Meng

3 A semiotic study of Singapore's Orchard Road and Marriott Hotel
Safeyaton Alias

Part II
Electronic media and film

4 Phase and transition, type and instance: patterns in media texts as seen through a multimodal concordancer
Anthony P. Baldry

5 Visual semiosis in film
Kay L. O'Halloran

6 Multisemiotic mediation in hypertext
Arthur Kok Kum Chiew

Part III
Print media

7 The construal of Ideational meaning in print advertisements
Cheong Yin Yuen

8 Multimodality in a biology textbook
Libo Guo

9 Developing an integrative multi-semiotic model
Victor Lim Fei

Index

This book is dedicated to my mother, Janet O'Halloran


Introduction
Kay L. O'Halloran

Multimodal Discourse Analysis is a collection of research papers in the field of
multimodality. These papers are concerned with developing the theory and
practice of the analysis of discourse and sites which make use of multiple
semiotic resources; for example, language, visual images, space and architecture. New social semiotic frameworks are presented for the analysis of a
range of discourse genres in print media, dynamic and static electronic
media and three-dimensional objects in space. The theoretical approach
informing these research efforts is Michael Halliday's (1994) systemic-functional theory of language which is extended to other semiotic resources.
These frameworks, many of which are inspired by Michael O'Toole's (1994)
approach in The Language of Displayed Art, are also used to investigate meaning arising from the integrated use of semiotic resources.
The research presented here represents the early stages in a shift of focus
in linguistic enquiry where language use is no longer theorized as an isolated
phenomenon (see, for example, Baldry, 2000; Kress, 2003; Kress and van
Leeuwen, 1996, 2001; Iedema, 2003; Ventola et al., forthcoming). The
analysis and interpretation of language use is contextualized in conjunction
with other semiotic resources which are simultaneously used for the construction of meaning. For example, in addition to linguistic choices and their
typographical instantiation on the printed page,¹ multimodal analysis takes
into account the functions and meaning of the visual images, together with
the meaning arising from the integrated use of the two semiotic resources.
To date, the majority of research endeavours in linguistics have tended to
concentrate solely on language while ignoring, or at least downplaying, the
contributions of other meaning-making resources. This has resulted in
a rather impoverished view of the functions and meaning of discourse.
Language studies are thus undergoing a major shift to account fully for
meaning-making practices as evidenced by recent research in multimodality
(for example, Baldry, 2000; Callaghan and McDonald, 2002; Iedema, 2001;
Jewitt, 2002; Martin, forthcoming; Kress, 2000, 2003; Kress et al., 2001;
Kress and van Leeuwen, 1996, 2001; Lemke, 1998, 2002, 2003; O'Halloran,
1999a, 2000, 2003a, 2003b; Royce, 2002; Thibault, 2000; Unsworth, 2001;
Ventola et al., forthcoming; Zammit and Callow, 1998).
Multimodal Discourse Analysis contains an invited paper by Michael
O'Toole, a founding scholar in the extension of systemic-functional theory
to semiotic resources other than language. The collection also features an
invited contribution from Anthony Baldry, a forerunner in the use of information technology for the development of multimodal theory and practice.
The remaining seven research papers have been completed by Kay
O'Halloran and her postgraduate students in the Semiotics Research Group
(SRG) in the Department of English Language and Literature at the
National University of Singapore. The SRG has been actively involved in
research in systemic-functional approaches to multimodality over the
period 1999-2003.
The papers are organized into sections according to the medium of the
discourse: Part I which is concerned with three-dimensional material objects
in space, Part II which deals with electronic media and film and Part III
which contains investigations into print media. The theoretical advances
presented in this volume are illustrated through the analysis of a range of
multimodal discourses and sites, some of which are Singaporean. These
contributions represent a critical yet sensitive interpretation of everyday
discourses in Singapore. Thus, like all discourse, they are grounded in local
knowledge, but due to the universality of the semiotic model being used,
they are applicable to similar texts in any culture. A brief synopsis of each
paper in this collection is given below.
In Michael O'Toole's opening paper in Part I, 'Opera Ludentes: the
Sydney Opera House at work and play', a systemic-functional analysis of
architecture (O'Toole, 1990, 1994) is used to consider in turn the Experiential, Interpersonal and Textual functions of Jørn Utzon's (1957-73) Sydney
Opera House and its parts, both internally and in relation to its physical and
social context. In this paper, the usual definition of 'functionalism' in architecture is significantly extended. Like language, the building embodies an
Experiential function: its practical purposes, the 'lexical content' of its components (theatre, stage, seats, lights, and so forth) and the relations of who
does what to whom, and when and where. It also embodies a 'stance' vis-à-vis the viewer and user (its facade, height, transparency, resemblance to
other buildings or objects) which also reflects the power relations between
groups of users. That is, it embodies an Interpersonal function like language. The Sydney Opera House also embodies a Textual function: its parts
connect with each other and combine to make a coherent 'text', and it
relates meaningfully to its surrounding context of streets, quays, harbour,
nearby buildings and cityscape. By 'meaningful' here we include deliberate dramatic contrast as well as harmonious blending in. In the analysis,
certain features are discovered to be multifunctional, marking 'hot spots' of
meaning in the total building complex. In terms of all three functions, the
Opera House emerges as a playful building: Opera Ludentes. Utzon's building started its life as a focus of architectural and political controversy and
most discourses about the building are still preoccupied with the politics of
its conception, competition, controversies and completion by different architects. A semiotic rereading of the building can relate its structure and design
to the 'social semiotic' both of Sydney in the 1960s and of the international
community of its users today.
The museum is located as the next site for semiotic study in Alfred Pang's
'Making history in From Colony to Nation: a multimodal analysis of a museum
exhibition in Singapore'. Pang discusses how systemic-functional theory is
productive in fashioning an interpretative framework that facilitates a multimodal analysis of a museum exhibition. The usefulness of this framework
is exemplified in the critical analyses of particular displays in From Colony to
Nation, an exhibition at the Singapore History Museum (SHM) that displays
Singapore's political constitutional history. From this analysis, Pang explains
how the museum as a discursive site powerfully constitutes and maintains
particular social structures through the primary composite medium of an
exhibition. Of interest is the relationship between the museum, nation and
history and how the multimodal representation of history in From Colony to
Nation ideologically positions the visitor to a particular style of imagining a
'nation' (Anderson, 1991).
Safeyaton Alias investigates the semiotic makeup of the city in 'A semiotic
study of Singapore's Orchard Road and Marriott Hotel'. Like a written text,
the city stores information and 'presents particular transformations and
embeddings of a culture's knowledge of itself and of the world' (Preziosi,
1984: 50-51). In this paper, a rank-scale framework for the functions and
systems in the three-dimensional multi-semiotic city is proposed. The focus in
this paper, however, is the analysis of the built forms of Orchard Road and
the Marriott Hotel. Safeyaton discusses how these built forms transmit messages which are articulated through choices in a range of metafunctionally
based systems. This paper discusses the intertextuality and the discourses that
construct Singapore as a city that survives on consumerism and capitalism.
In Part II on electronic media and film, Anthony Baldry's opening paper,
'Phase and transition, type and instance: patterns in media texts as seen
through a multimodal concordancer', explores the use of computer technology for capturing 'the slippery eel-like' (to quote Baldry) dynamics of
semiosis. Baldry demonstrates that the online multimodal concordancer, the
Multimodal Corpus Authoring (MCA) system, provides new possibilities for
the analysis and comparison of film and videotexts. This type of concordancing transcends in vitro approaches by preserving the dynamic text, insofar
as this is ever possible, in its original form. The relational properties of the
multimodal concordancer also allow a researcher to embark on a quest for
patterns and types. Taking the crucial semiotic units of phase and transition
as his starting point, Baldry shows that, when examining the semiotic and
structural units that make up a video, a multimodal concordancer far outstrips multimodal transcription in the quest for typical patterns.
Kay O'Halloran further explores the use of computer technology for
the semiotic analysis of dynamic images in 'Visual semiosis in film'. A systemic-functional model which incorporates the visual imagery and the
soundtrack for the analysis of film is introduced. Inspired by O'Toole's
(1999) representation of systemic choices in paintings in the interactive
CD-ROM Engaging with Art, O'Halloran uses the video-editing software Adobe
Premiere 6.0 to discuss the analysis of the temporal unfolding of semiotic
choices in the visual images for two short extracts from Roman Polanski's
(1974) film Chinatown. While film narrative involves staged and directed
behaviour to achieve particular effects, the analysis of film is at least a first
step to understanding semiosis in everyday life. The analysis demonstrates
the difficulty of capturing and interpreting the complexity of dynamic
semiotic activity.
Attention turns to hypertext in Arthur Kok's 'Multisemiotic mediation in
hypertext'. In this paper, Kok explores how hypertext (re)presents reality and
engages the user, and how instantiations of different semiotic resources are
arranged and co-deployed for this purpose. This paper formulates a working
definition and a theoretical model of hypertext which contains different
orders of abstraction. As with many papers in this collection, the semiotic
analysis is employed through extending previously developed systemic-functional frameworks (Halliday, 1994; Kress and van Leeuwen, 1996;
O'Toole, 1994). Via an examination of the semiotic choices made in
Singapore's Ministry of Education (MOE) homepage, this analysis seeks to
understand how the objectives of an institution become translated, transmitted and received through the hypertext medium. In the process, an
account of the highly elusive process of intersemiosis, the interaction of
meanings across different semiotic instantiations, is given.
In Part III on print media, in the first paper, 'The construal of ideational
meaning in print advertisements', Cheong Yin Yuen proposes a generic
structure potential for print advertisements which incorporates visual and
verbal components. Cheong also investigates lexicogrammatical strategies
for the expansion of ideational meaning which occur through the interaction of the linguistic text and visual images. Through the analysis of five
advertisements, Cheong develops a new vocabulary to discuss the strategies
which account for semantic expansions of ideational meaning in these texts;
namely, the Bi-directional Investment of Meaning, Contextual Propensity,
Interpretative Space, Semantic Effervescence and Visual Metaphor.
Moving to the field of education, Guo Libo investigates the multi-semiotic
nature of introductory biology textbooks in 'Multimodality in a biology
textbook'. These books invariably contain words and visual images: for
example, diagrams, photographs, and mathematical and statistical graphs.
Drawing upon the work of sociological studies of biology texts and following
O'Toole (1994), Lemke (1998) and O'Halloran (1999b), this paper proposes
social semiotic frameworks for the analysis of schematic drawings and mathematical or statistical graphs in biology. The frameworks are used to analyse
how the various semiotic resources interact with each other to make meaning
in selected pages from the biology textbook Essential Cell Biology (Alberts et al.,
1998). The article concludes by reiterating Johns's (1998: 194) claim that in
teaching English for Academic Purposes to science and engineering students, due attention must be given to the visual as well as the linguistic
meaning in what is termed 'Visual/Textual interactivity' (ibid.: 186).

Lastly, in order to further theorize the meaning made in texts containing
language and visual images, Victor Lim proposes a meta-model in 'Developing an integrative multi-semiotic model'. This model allows for an integrative approach to the interpretation of texts where the simultaneously
co-deployed choices from various systems contextualize each other at
each instance of the meaning-making process. It takes into account the
independent meanings made by each semiotic resource and, further to this,
theorizes a space of interaction and integration where inter-semiotic processes for the expansion of meaning (for example, 'homospatiality' and
'semiotic metaphor') take place. The model also accounts for systems of
Typography and Graphics that operate on the Expression plane. Building on
the pioneering work done in this field (for example, Baldry, 2000; Baldry and
Thibault, forthcoming; Lemke, 1998; O'Halloran, 1999a; Thibault, 2000),
as with each paper in this collection, the model is conceived in the tradition
of the systemic-functional theory.
Michael Halliday has always been ready to extend and enrich his linguistic theory when particular types of text demanded it. The contributors
to this volume may be seen to be attempting to extend productively these
categories for multimodal analysis.

Note
Regrettably it has not been possible to reproduce coloured plates in this
publication. However, as will become evident in what follows, the contributors in this volume recognize that colour is a significant resource for meaning (see also Kress and van Leeuwen, 2002). While the papers have been
somewhat compromised by the black and white reproductions, every possible
effort has been made to ensure that the analysis refers to the original colour
of the texts.

References
Alberts, B., Bray, D., Johnson, A., Lewis, J., Raff, M., Roberts, K. and Walter,
P. (1998) Essential Cell Biology: An Introduction to the Molecular Biology of the Cell.
New York: Garland.
Anderson, B. (1991) Imagined Communities: Reflections on the Origin and Spread of Nationalism (revised edn). London: Verso.
Baldry, A. P. (ed.) (2000) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Baldry, A. P. and Thibault, P. (forthcoming) Multimodal Transcription and Text.
London: Equinox.
Callaghan, J. and McDonald, E. (2002) Expression, content and meaning in language and music: an integrated semiotic analysis. In P. McKevitt, S. O'Nuallain
and C. Mulvihill (eds), Language, Vision and Music. Selected papers from the 8th International Workshop on the Cognitive Science of Natural Language Processing, Galway, Ireland,
1999. Advances in Consciousness Research, Volume 35. Amsterdam: Benjamins,
205-220.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Edward Arnold.
Iedema, R. (2001) Analysing film and television: a social semiotic account of Hospital: An Unhealthy Business. In T. van Leeuwen and C. Jewitt (eds), Handbook of
Visual Analysis. London: Sage, 183-204.
Iedema, R. (2003) Multimodality, resemioticization: extending the analysis of discourse as a multi-semiotic practice. Visual Communication 2(1): 29-57.
Jewitt, C. (2002) The move from page to screen: the multimodal reshaping of school
English. Visual Communication 1(2): 171-195.
Johns, A. (1998) The visual and the verbal: a case study in macroeconomics. English
for Specific Purposes 17(2): 183-197.
Kress, G. (2000) Multimodality. In B. Cope and M. Kalantzis (eds), Multiliteracies:
Literacy Learning and the Design of Social Futures. London: Routledge, 182-202.
Kress, G. (2003) Literacy in the New Media Age. London: Routledge.
Kress, G., Jewitt, C., Ogborn, J. and Tsatsarelis, C. (2001) Multimodal Teaching and
Learning: The Rhetorics of the Science Classroom. London: Continuum.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (2001) Multimodal Discourse: The Modes and Media of
Contemporary Communication. London: Arnold.
Kress, G. and van Leeuwen, T. (2002) Colour as a semiotic mode: notes for a
grammar of colour. Visual Communication 1(3): 343-368.
Lemke, J. L. (1998) Multiplying meaning: visual and verbal semiotics in scientific
text. In J. R. Martin and R. Veel (eds), Reading Science: Critical and Functional Perspectives on Discourses of Science. London: Routledge, 87-113.
Lemke, J. L. (2002) Travels in hypermodality. Visual Communication 1(3): 299-325.
Lemke, J. L. (2003) Mathematics in the middle: measure, picture, gesture, sign and
word. In M. Anderson, A. Saenz-Ludlow, S. Zellweger and V. V. Cifarelli (eds),
Educational Perspectives on Mathematics as Semiosis: From Thinking to Interpreting to Knowing. Ottawa: Legas Publishing, 215-234.
Martin, J. R. (forthcoming) Sense and sensibility: texturing evaluation. In J. Foley
(ed.), New Perspectives on Education and Discourse. London: Continuum.
O'Halloran, K. L. (1999a) Interdependence, interaction and metaphor in multisemiotic texts. Social Semiotics 9(3): 317-354.
O'Halloran, K. L. (1999b) Towards a systemic-functional analysis of multi-semiotic
mathematics texts. Semiotica 124(1/2): 1-29.
O'Halloran, K. L. (2000) Classroom discourse in mathematics: a multi-semiotic
analysis. Linguistics and Education 10(3): 359-388.
O'Halloran, K. L. (2003a) Educational implications of mathematics as a multisemiotic discourse. In M. Anderson, A. Saenz-Ludlow, S. Zellweger and V. V.
Cifarelli (eds), Educational Perspectives on Mathematics as Semiosis: From Thinking to
Interpreting to Knowing. Ottawa: Legas Publishing, 185-214.
O'Halloran, K. L. (2003b) Intersemiosis in mathematics and science: grammatical
metaphor and semiotic metaphor. In A.-M. Simon-Vandenbergen, M. Taverniers and L. Ravelli (eds), Grammatical Metaphor: Views from Systemic Functional Linguistics. Amsterdam: John Benjamins, 337-365.
O'Toole, M. (1990) A systemic-functional semiotics of art. Semiotica 82(3/4):
185-209.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
O'Toole, M. (1999) Engaging with Art [CD-ROM]. Perth: Murdoch University.
Preziosi, D. (1984) Relations between environmental and linguistic structure. In
R. P. Fawcett, M. A. K. Halliday, S. M. Lamb and A. Makkai (eds), The Semiotics of
Culture and Language Volume 2. Language and Other Semiotic Systems of Culture. Dover,
NH: Frances Pinter, 47-67.
Royce, T. (2002) Multimodality in the TESOL classroom: exploring visual-verbal
synergy. TESOL Quarterly 36(2): 191-205.
Thibault, P. J. (2000) The multimodal transcription of a television advertisement:
theory and practice. In A. P. Baldry (ed.), Multimodality and Multimediality in the
Distance Learning Age. Campobasso, Italy: Palladino Editore, 311-385.
Unsworth, L. (2001) Teaching Multiliteracies across the Curriculum: Changing Contexts of
Text and Image in Classroom Practice. Buckingham, UK: Open University Press.
Ventola, E., Charles, C. and Kaltenbacher, M. (eds) (forthcoming) Perspectives on
Multimodality. Amsterdam: John Benjamins.
Zammit, K. and Callow, J. (1998) Ideology and technology: visual and textual
analysis of two popular CD-ROM programs. Linguistics and Education 10(1):
89-105.

Acknowledgements
The research presented here is only made possible through the foundational
work of Michael Halliday and Michael O'Toole. I am also indebted to Jay
Lemke for originally pointing me in this direction many years ago, and for
his continued support since that time. I also thank Joe Foley, Eija Ventola,
Frances Christie and Anthony Baldry for their friendship, advice and active
support over the years.
My special thanks also to Michael O'Toole for his invaluable reading of
the first draft of the manuscript. His comments, corrections and suggestions
have contributed to the final form of this volume, although of course any
errors of interpretation are mine. I am also most grateful to Guo Libo for his
careful proof-reading and corrections to the manuscript.
My sincere thanks to my talented group of postgraduate research students for their enthusiasm, dedication and commitment to push the boundaries of multimodal analysis. This volume would not be possible without
their contributions. And special thanks to my past and present colleagues in
the Department of English Language and Literature at the National University of Singapore (NUS), especially Linda Thompson, Chris Stroud, Ed
McDonald and Desmond Allison for their continued friendship and
support.
I would also like to thank Anne Pakir and the Faculty Research Committee (FRC) in the Faculty of Arts and Social Sciences at NUS for providing
the research grant (R-103-000-014-107/112) in 2000 to establish the Laboratory for Research in Semiotics (LRS) in the Department of English Language
and Literature. The research grant has directly supported the research
presented in this publication.


Part I
Three-dimensional material objects in space


Opera Ludentes: the Sydney Opera House at work and play

Michael O'Toole
Murdoch University, Western Australia
Here the trick was to get people up. When you go up the steps you see no
buildings. You see the sky and you get separated from being between houses. I
like procession very much: sky - foyer - windows - sea. It takes you to another
world. That's what you want for an audience: to separate themselves from their
daily life.
(Jørn Utzon, 1998)¹

Clearly, for the architect of Sydney Opera House (Plate 1.1) 'Interpersonal'
meanings are very important: the building's height and orientation to its
visitors; the play of vistas as one approaches the entrance; the stress on
architecture as theatre; constructing an audience; a working building at play.
In a systemic-functional semiotic model of architecture² (O'Toole, 1994;
Table 1.1) these kinds of meaning are analogous to the Interpersonal
semantic functions in language: Mood constructing the roles to be played in
a verbal interaction; Modality constructing a hinge between the real and the
hypothetical; Attitudinal Modifiers and Intensifiers expressing the speaker's
position and influencing the response of the hearer.
If you look out here [at Utzon's home in Helebek, Denmark], you see a field
with flowers and a small bush and small trees and big trees. They all consist of
small elements. And if you take them up and put them on the table it's a
number of elements. Together they make this. In architecture you have a floor,
your walls, you have windows, doors, and you have a lot of materials. And you
select them. You must have in mind that they make a whole or an expression of
some kind.
(Jørn Utzon, 1998)³

Here Utzon's focus is on 'Textual' meanings: the way distinct architectural
components are combined to make a coherent whole; that is to say, an
important dimension of the meaning ('an expression of some kind') is in the
composition.⁴
As in language, the Collocational potential of architectural elements - their
Conjunction in rooms and floors and buildings, their Reference to each other
and to their environment - is what makes them into coherent and usable
'texts'.

Table 1.1 Functions and systems in architecture (reproduced from O'Toole 1994: 86)

Unit: Building
Experiential: Practical function: Public/Private; Industrial/Commercial/Agricultural/Governmental/Educational/Medical/Cultural/Religious/Residential; Domestic/Utility. Orientation to light. Orientation to wind. Orientation to earth. Orientation to service (water/sewage/power).
Interpersonal: Size. Verticality. Chthonicity. Façade. Cladding. Colour. Modernity. Exoticism. Orientation to neighbours. Orientation to road. Orientation to entrant. Intertextuality: reference, mimicry, contrast.
Textual: Relation to city. Relation to road. Relation to adjacent buildings. Proportions. Rhythms: contrasting shapes, angles. Textures: rough/smooth. Roof/wall relation. Reflectivity. Opacity.

Unit: Floor
Experiential: Sub-functions: Access, Working, Selling, Administration, Storing, Waking, Sleeping, Parking.
Interpersonal: Height. Spaciousness. Accessibility. Openness of vista. View. Hard/soft texture. Colour. Sites of power. Separation of groups.
Textual: Relation to other floors. Relation to outer world. Relation to connectors: stairs/lift/escalator (external cohesion). Relation of landing/corridor/foyer/room (internal cohesion). Degree of partition. Permanence of partition.

Unit: Room
Experiential: Specific functions: Access, Entry, Living room, Family room, Kitchen, Bathroom, Bedroom, Study, Toilet, Laundry, Gamesroom, Retreat, Foyer, Restaurant, Kitchen, Bar, Bedroom, Ensuite, Servery.
Interpersonal: Comfort. Modernity. Opulence. Lighting. Sound. Welcome. Style: rustic, pioneer, colonial, suburban, 'Dallas', working class, tenement, slum. Foregrounding of function.
Textual: Scale. Lighting. Sound. Relation to outside. Relation to other rooms. Connectors: doors/windows/hatches/intercom. Focus (e.g. hearth, dais, altar, desk).

Unit: Element
Experiential: Light: window, lamp, curtains, blinds. Air: window, fan, conditioner. Heating: central, fire, stove. Sound: carpet, rugs, partitions, acoustic treatment. Seating: function, comfort. Table: dining, coffee, occasional, desk, computer, drawing.
Interpersonal: Relevance. Functionality: convention/surprise. Texture: rough/smooth. Newness. Decorativeness. 'Stance'. Stylistic coherence. Projection (e.g. TV).
Textual: Texture. Positioning: to light/heat/other elements. Finish.

Plate 1.1 The Sydney Opera House as procession


It's a curious fact that in all the drama of constructing the building, not much
detailed thought had gone into its specific uses. The competition entrants had
been asked to provide large and small halls, the larger to accommodate orchestral
concerts and opera as its chief forms of entertainment. At this point, seven years
after construction began, the Australian Broadcasting Commission decided that a
multi-purpose venue wouldn't be good enough as the permanent home of the
Sydney Symphony Orchestra.
It was a difficult situation. To argue stubbornly in favour of the original multipurpose concept for the major auditorium would mean accepting compromises
on both sides in terms of stage requirements and acoustics: orchestral music
versus opera. There were also practical considerations involved, such as a reduction in the seating capacity for concerts, and the logistics of sharing the hall.
Should it be devoted to performances of opera and ballet alone?

(Sykes, 1993: 45)


A great deal of the political controversy surrounding the design and construction of the Opera House focused on the 'Experiential' use-functions of
the building and the competing claims of its corporate users. The brief for
any commissioned architect or entrant to an architectural competition
necessarily starts from the uses proposed for the building.
Like a clause in language, a building incorporates Types of Process and
their Participants; its specific functions are Modified in terms of material,
size, colour and texture; and its component elements are organized taxonomically like lexical items in the vocabulary of our language.
We clearly need to take account of the Experiential function of architecture. Otherwise, our roof will leak, our rooms will be full of draughts, our
cupboards and desk will face the wall, and we will find ourselves cooking or
worshipping or taking baths in the bedroom. But the obsession with 'functionalism' in architecture by both its modernist proponents and its Postmodernist critics has taken it for granted either that the Experiential function
is the only function and that the design and evaluation of a building stands
or falls by this criterion alone, or that the form of the building primarily
expresses its practical use, which confuses functions, or modes of meaning,
which should be kept distinct. A systemic-functional approach corrects such
blinkered approaches by proposing that there are three functions creating
meaning in all buildings: an Experiential, an Interpersonal and a Textual
function, and that these are all equally valid and equally necessary for a
building to be meaningful and socially usable.
Jørn Utzon was probably naive in the early phase of designing and constructing the Opera House in that his revolutionary designs foregrounded
the public image (Interpersonal) and sculptural coherence (Textual function)
of the building, leaving many features of its use (Experiential) insufficiently
resolved. Given the political partisanship, the conflicting client requirements
and the media hype surrounding his design from the outset, this bias is
understandable, but it meant that his successors had to focus in the first
instance on the Experiential function:



When Utzon resigned in 1966, the construction of the roof and its tile cladding
was well under way. But plans for Stage III were scarcely defined, and they
involved the elements which would turn the building from a magnificent sculpture to a working centre for the performing arts: the walls that would enclose the
roof area, the performing venues within it, the stage equipment and the furnishing of foyer, backstage and administrative areas throughout.
The newly appointed triumvirate of architects (Peter Hall, Lionel Todd and
David Littlemore) declared their intention to complete the building as closely as
possible to Utzon's intentions. But in the drawings that Utzon left behind, there
were no precise dimensions worked out for what would be more than a thousand
rooms within the structure. [. . .]
The key to finalising the internal designs was to establish what their users
wanted. Incredibly, in the Alice in Wonderland development of the construction,
there had been no formal compilation of user organisations' expectations
in terms of performance characteristics and capacities, dressing room and
rehearsal area backup, box office, administration, air conditioning and catering
requirements.
(Sykes, 1993: 61-62)

As our chart of functions and systems in architecture (Table 1.1) shows, a
large number of Experiential functions are involved in a building complex
like the Sydney Opera House. Practical orientations to light (the sun, reflections off the harbour), to wind (prevailing winds, strength of the highest
possible wind gusts), to the earth (the building up of Bennelong Point to
form the massive podium, its projection out into the harbour), and the
provision of services such as water, sewage, power, scenery and food delivery, car-parking, waste disposal, etc. had already been accounted for either
by Utzon and his team of architects or by the consultant engineers, Ove
Arup and Partners. But each functioning part requires separate specifications: the concert hall with its open plan and relatively fixed fittings as
opposed to the opera theatre with its proscenium arch and constantly changing scenery, its stage tower, backstage, stage and auditorium; the drama
theatre (originally designed as a smaller experimental theatre) as opposed to
the playhouse (originally designed as a 'music room' for solo recitals and
chamber music) or the Broadwalk Studio (originally conceived purely as a
recording hall); the Bennelong Restaurant, serving high-quality international cuisine for leisurely eating under its own miniature shell roof, as
opposed to the more informal forecourt restaurant, the Cafe Mozart, the
performers' cafeteria, or the ad hoc catering arrangements in the foyers.
As I discovered in analyzing a church and even a suburban display home
in The Language of Displayed Art (O'Toole, 1994), the rank of Floor on the
chart may not always be valid as such. And yet even in the complex structures of the Opera House, particular spaces below the rank of Building but
above the rank of Room, as likely to be separated horizontally as vertically,
still need to be accounted for experientially. To the list of sub-functions listed
at floor rank on the chart we could add Rehearsing, Recording, Cooking,
Eating, Scenery Construction, Maintenance, Security, and so on. And even
the multifunctional (in the old sense) outdoor areas of the forecourt, broadwalk, arcade and steps involve specific but varying activities (Process-Participant relations).
The point is that a systemic-functional semiotics takes a rank-scale as one
of its starting-points, differentiating the options available at lower ranks in
relation to those available at higher ranks. In the Experiential function this is
partly a matter of common sense (i.e. the shape of a drinks bar or a box
office at Room rank requires different decisions from either the types of
Element (desk, chairs, equipment) with which they will be furnished or the
shape, illumination, ventilation and accessibility or enclosability of
the larger foyer (Floor) spaces in which they are to be found). It is also a
matter of different design specialists, with whole firms and even industries
being responsible for particular Experiential sub-functions (cooking, drinks-serving, ticketing, public relations, etc.).
The heuristic value of the rank-scale becomes more obvious when we
relate these Experiential distinctions to Interpersonal and Textual distinctions at the same ranks. To illustrate how architectural meanings are made
through all three functions I want to start with one of the smallest, most
numerous and most visible elements of the whole structure, the 1,056,000
roof tiles.
Experientially, the roof covering had to be weatherproof to all climatic
conditions and had to be self-cleaning, but curved roofs can be sprayed or
sheeted in copper or bronze. As the architect Harry Seidler relates:
I asked him in his office, 'Why do you want to cover a building like that with tiles?
A curved surface, it could be sprayed.' And he looked surprised and said, 'But
tiles are the best.' And he'd looked all over the world at them, and he'd seen them
in the Middle East and elsewhere, mosques covered in gleaming tiles. And he'd
been to Japan and China, and he was very concerned with the quality that made
them up: what material they used, where they got the clay from and what mixes
they used in the clay, till it eventually satisfied him that it gave a slightly rough
surface. And this was the natural colour, the white, and over that surface was a
very clear glaze, a very shiny glaze.5

The material quality and the rough surface, the texture of the built surface
are primarily Interpersonal considerations. Like the shine and the gleam
they are part of the impact the Opera House shells have on spectators. And
the intertextual references to mosques and Oriental architecture, visual similarities which may jog our cultural memory, are Interpersonal issues. The
impact on the spectator is crucial to Utzon. For him his Opera House is
almost more than a sculpture; it has a human personality:
It tells a story, it's not a calm building, it's awake all the time. You cannot make a
sculpture better than something that's white or off-white. If you look at bronze
sculptures in nature, they're difficult to read. If you had put a copper roof on this
house, you wouldn't have benefitted from the light. You would have seen a green
marvellous colour. So this was my first and only idea for the roof. And Saarinen
said to me, 'Keep it white. Sydney harbour is dark.' And at that time the buildings
were dark. So it's the right answer.6

The older Finnish architect is as alert as his Danish colleague to the effect on
the spectator of the chiaroscuro of a white building against a dark ground
and the quality of light in a city on the water like Sydney, Helsinki or
Copenhagen.
Of course, the Interpersonal function at the rank of Element is not confined to the roof tiles. The concrete ribs of the shells have a primary
Experiential function of binding and supporting the roof, but as soon as one
steps inside, one becomes aware of the contrast between the raw, matt and
unpatterned grey concrete of the ribs and the warm brown satin grain of
internal balustrades and doors. In terms of its textures, the building (apart
from the tiled surfaces) seems to start as rough, raw, grey and abrasive in its
outer layers and become progressively more smooth, polished, colourful and
comforting as we move to the core of the personal artistic experience in our
seat in any of the auditoria: it speaks to us Interpersonally through its shine,
colours, textures, the very warmth or coldness of the materials used. This
play of material qualities has even more impact on the spectator at those
points outside the building where the shells meet the metal struts and sheets
of glass of the windows in an exciting geometry of tiles, raw concrete, metal
and glass (Plate 1.2). As we shall see, this involves an important interplay of
the Interpersonal and Textual functions.
Interpersonal relevance is obviously a key criterion inside a theatrical
building. Audience seats and lighting and sound booths face performers'
spaces; conductors' rostra face orchestras; prompt boxes face actors; bartenders face customers across bars, counters and tables (as Erving Goffman
showed in the 1950s7 - and Fawlty Towers hyperbolized in the 1970s - restaurants and hotels are highly dramatistic spaces). The public relations
mechanisms of display boards, information desks, ticket offices, media
interview spaces and Opera House guide routes all have their structure as
mini-theatres. And where 'projection' in the home may be confined to one
or two TV sets, in theatres it covers the gamut of possibilities from staging,
rostra, lighting, sound projection, security video and telephones (fixed and
mobile) and even the projection of performances to overflow audiences on
closed-circuit television. In all these aspects of a theatre or concert hall you
might say that the Interpersonal is Experiential - but we will argue that there
is still real heuristic value in keeping them separate.
Less obviously 'theatrical' choices at Room rank are involved in the Interpersonal systems of Comfort, Modernity, Opulence and Style. Patrons of
concerts and operas are enveloped in a cocoon of almost perfect acoustics
and seated on luxuriously upholstered seats (Plate 1.3). These seats in
moulded birch ply and contrasting scarlet upholstery carry a message of
Scandinavian 'functionalism' of the 1960s and 1970s: like so much of the
architecture here, they put their working functions on display. The steel
cable tensioning of the concrete columns, the moulded curves of internal
beams and columns, the glass curtain walls, even the acoustic baffles and
plexiglass 'doughnuts' which hang in a ring from the roof of the Concert
Hall 'show the works' - though less stridently than Richard Rogers's Pompidou Centre in Paris and its imitators. The stress here comes from a humanist
'craft' tradition of high-quality but 'natural' materials (wool, varnished ply,
grained parquet, shuttered concrete) with a modest unassertive finish.

Plate 1.2 Texture and geometry

The Interpersonal meaning of many types of building is carried by the
placing and styling of 'sites of power', that is, a building expresses the
political relations between its various users. A building primarily dedicated
to classical musical performance incorporates the power of the conductor's
rostrum over the orchestra and the power of both over the audience. Hidden control booths and 'Private' administration rooms mystify this power
further. The stage and orchestra in the Opera House and other theatrical
spaces carry the same power relations.

Plate 1.3 The Concert Hall

We have pinpointed many of the systems realizing the Interpersonal function at the ranks of Floor, Room and Element, but with the Sydney Opera
House this function begins and ends at the rank of the whole building
complex. Our very opening quotation of Utzon's own words shows the
architect's concern with imposing Size and Verticality and Orientation to the
entrant (systems in the top central box of the Chart): the viewer is induced to
look up, beyond the steps, beyond the shells to the sky and to imagine
themselves into another world of the imagination, even before the official
performance starts. Chthonicity is a particularly interesting system in this
case, because the Opera House deliberately plays with conflicting options:
on the one hand, there seem to be no solid walls embedded in the base. The
shells rear up skywards (anti-chthonically, away from the earth), to such a
degree that their corners hardly seem to touch their footings, seeming to
balance on pinpoints. The smooth spherical curves induce a touch of vertigo and it is no wonder that so many of the photographs of the Opera
House, whether by official agencies or casual tourists, accentuate the
upward thrust of the shells. On the other hand, the podium is highly
chthonic: it has turned Bennelong Point into a rock-like headland and, as we
know, incorporates many of the key functions of the working building. The
light, dynamic, mobile and poetic structures above are embedded in the
solid and prosaic podium.
A building's orientation to its neighbours and the road by which it is
approached are important aspects of its Interpersonal function. Utzon and
Saarinen were keen for the white curves of the sails to stand out against the
predominantly dark water of the harbour and the high-rise buildings of
Sydney's rigidly rectangular central business district at that time. (Since
1973 more of the neighbouring buildings have been constructed in lighter
concrete, marble or glass - perhaps in deference to Utzon's building as well
as in harmony with changing architectural fashions.) The multiple curves,
however, offer visual echoes of Sydney Harbour Bridge (Plate 1.4), Circular
Quay and the bays and headlands of the harbour. Of course, good architectural as well as human relations can be spoiled when bad neighbours
move in. The Opera House's visual relationship with Circular Quay has
been obstructed and, more importantly, the easy natural pedestrian route
from the ferry terminals to the entrance steps has been interrupted by the
rectangular complex of shops and apartments erected in 1997-8,
unpopularly known as 'the East Circular Quay toaster'.
The final heading in the Interpersonal box at the rank of Building on the
chart is 'Intertextuality'. This was a term coined by Mikhail Bakhtin, the
Russian literary theorist and philosopher, to account for the deliberate references, allusions or echoes that a writer makes to other widely known texts.
As with language texts, this would seem to carry primarily an Interpersonal
function in architecture: the writer/architect is saying to the viewer 'Nudge-nudge . . . look at my clever reference here to Stonehenge, or Palladian
villas, or St Peter's in Rome, or the Pompidou Centre in Paris . . . It is up to
you to enrich the meaning further here by your knowledge of that building,
its uses, its tradition, its local cultural significance, etc'. And to some extent
we as viewers interpret the allusion according to our range of references and
our cultural preoccupations at the time. Virtually everyone seeing the Opera
House sees the visual metaphor of sails; many see sharks' jaws or clam
shells; Barry Humphries saw a drowning nun. Utzon claims that the curves
of the sails were inspired by the segments of an orange; the relation between
the outer shell and the inner roof of the auditoria - by the snug fit between
shell and kernel in a walnut; the structural relations between the construction units and the whole building variously by the leaves on a palm tree or
by Meccano toy construction sets. But in terms of other built texts, we have
Utzon's word for it that he had in mind a relationship between water and
built forms at Kronborg Castle, Helsingør; the soaring vaults of Gothic
cathedrals; and the shining segmentation of tiles on a mosque.

Plate 1.4 Visual echoes

The tiles bring us at last to the Textual function (which does not have to be
the last function examined: the three functions are all equally meaningful
and may be considered in any order). At the lowest rank of Element the
finish of the tiles and the chevron patterning create the surface texture of
the Opera House shells. This is texture as such - Textual meaning - as
opposed to their practical (Experiential) function of keeping out the rain and
their decorative or dramatic (Interpersonal) functions.
At the rank of Room, each auditorium, or foyer, or office, or restaurant
has its own scale and proportions, it is lit or in shadow, and has its own
acoustic properties in contrast to other spaces around it. Its relation to
outside carries Textual meaning, so that our response to the isolated and
insulated worlds of the concert hall, opera theatre, drama theatre or cinema
is quite different from how we feel in the foyers, where our gaze is deliberately projected out to the harbour and city views - where we are no longer
fully enclosed in the built text. At this rank we experience a Textual focus as
well as the power relation (Interpersonal) between the rostrum and the
orchestra and the audience. This is facilitated by aisles and stairways within
the auditoria, and all such 'connectors' as corridors, stairs, lifts, escalators,
hatches and interconnecting windows throughout the building are primarily
Textual in function: they work like the cohesive devices of conjunction in
language.
Like cohesive devices in language, these connectors work across several
ranks, since they also work to relate floors and the various auditoria and
other internal spaces to each other. Doors and windows, of course, relate the
internal spaces to exterior parts of the built text: walkways, entrance steps
and terraces, and thence to the Broadwalk and approach road.
The most striking Textual systems of the Opera House at the rank of
Building are listed in the top right-hand box of the Chart. We will consider
them from the bottom up - as if we were moving from near the building to
vantage points further away. Opacity/Reflectivity/Transparency is a system
of options that tends to have meaning when we are near a building. The
shells of the Opera House are opaque, but, being shiny and white or off-white, reflect the light, whereas the podium is opaque and comparatively
matt, giving a denser, less light-responsive texture. The windows, of demi-topaz coloured laminated glass, are highly transparent for the viewer from
inside and for those outside when the interior is lit - after dark, when most
of the building's theatrical functions are at play. Unlike most glass facing
water and sky, they do not reflect much of their environment, except from
an aerial vantage-point; the three distinct surface planes of the northern
foyer windows draw our gaze in rather than reflecting us and the world we
stand in.
These windows also make a major contribution to other kinds of Textual
meaning such as the Roof-Wall relation, Rhythms and Proportions. As Jill
Sykes explains, after Utzon's resignation it took his replacement architects
nearly four years:
to solve the design problems, at first working through trial proposals and then
tackling tricky situations as they arose under construction [. . .]
Linking the curves of the sails to the rectangular lines of the podium required a
concept that combined the aesthetic with the pragmatic. Without a mathematical
relationship between the shape of the shell and that of the podium to use as a
starting point for a geometrical solution to devising the structure of the two
largest glass walls overlooking the Harbour, a new design element had to be
introduced. The result was a combination of three surface planes: vertical at the
top, coming down to a half-circle leaning outwards from the vertical, then pulling
back in a cone shape [Plate 1.5].
This verandah-style approach provided the practical advantage of extending the
area within the building well beyond the feet of the shells, as well as offering non-reflective views over the Harbour through the inward-slanting glass that ended at
floor level.
(Sykes, 1993: 62-63)

Sykes is here describing the resolution of Experiential ('pragmatic') and
Interpersonal ('aesthetic') problems through the Textual functions ('geometry') of shell-glass wall-podium relations and the contrasting shapes, angles
and proportions created by the windows. Even she has difficulty in articulating the sheer visual excitement of this brilliant and unique interplay of the
parallel lines of the vertical mullions, gradually diverging in the other two
planes, with the stepped window spacers and the curve of the intersections
of the planes and the curve of the front of the canopy creating an intricate
harmony with the curve of the shells above: only a musical metaphor can do
justice to the Textual meaning of these mathematical relationships.
We have discussed the Opera House's relation to adjacent buildings and
to the road already in terms of their Interpersonal tensions and mimicry, but
a full account must also recognize the Textual relations created by their
shared geometry (partly discernible in Plate 1.5). The strong vertical fluting
of some of the tower blocks, the proportions of the relations between glass
curtain and solid plane walls and the curves of some towers or roof features
all give the Opera House a distinctive role in the urban texture of Sydney.
Its relation to the city as a whole is highly dynamic. Because of its prominent and open, uncluttered site, it is visible from many vantage points, both
near and distant, low and high. From the foot of the podium steps or a
passing ferry it rears up colossally, as Utzon intended, but from the ferry
terminals or the far side of Circular Quay it has diminished to an imposing
sculpture against the skyline, or another sailboat cutting through the waves,
while from the Harbour Bridge or the roads descending to the Harbour
through North Sydney it has become a tantalizing jewel on a dark
velvet cloth.

Plate 1.5 Intersecting geometries

We have allowed ourselves a lyrical passage here out of deference to the
many encomiums that have been written and spoken and filmed for this
'Eighth Wonder of the Modern World'. So much has been written, indeed,
that we must ask whether anything remains to be said about the Sydney
Opera House. Can a systemic-functional analysis (with or without the lyricism) add anything to the mass of books, magazine, journal and newspaper
articles, architectural, historical or political speeches, interviews or discussions devoted to it?
Apart from attempting to extend the limits of systemic-functional semiotic theory by applying it to a complex three-dimensional work of art (which
is one of the aims of this book), I believe that a functional approach allows
one to see certain features in a new light. In the first place, it counters the
simplistic tendency to interpret 'functionalism' as concerned only or primarily with the utilitarian, Experiential, functions of a building. While recognizing that the practical functions may have a priority in all kinds of text about
buildings, from architects' briefs to security manuals or tourist brochures, it
asserts - and tries to prove - that the Interpersonal and Textual functions are
just as important in the elucidation of what a building 'means', whether to
the individual viewer, the citizens of Sydney, contemporary society or posterity. It does this not by generalizing, but in detail, teasing out the systems
of choice which are available to the architects, engineers and builders at
different ranks of unit - Building Complex - Building - Floor - Room - Element - in each of the three functions. This then enables us to pinpoint
those features of the building where the meaning is 'hottest', where specific
functional meanings overlap, interplay or conflict to produce more complex,
sometimes contradictory interpretations. The process may well generate
new insights we can share with others in an agreed common language.
The chart of systems and functions becomes a kind of hypertext - a non-sequential tool for exploring the hypertext of the building itself: the user can
start with any system in any box of the chart, analyse that part of the
building and interpret it in terms either of higher or lower ranks in the same
function or in terms of related systems in other functions. Like any good
map, it will still help us know where we are - theoretically as well as practically - at any stage of our exploration. Similarly, as I have tried to show, we
may stand on the Opera House podium looking at the tiles on the shells
rearing above and around us. An appreciation of their colour and shine may
lead us to the imposing grandeur of each shell or the whole building complex against the harbour and sky (higher ranks in the Interpersonal function), or the geometrical textures of the tiles and chevrons may draw our
attention to the complex interplay of materials and geometry in the windows which I discussed under the Textual function.


A further theoretically principled step may then be possible if the functional meanings in the 'text' of the building itself are projected onto the
manifestations of the 'social semiotic', that aggregation of opinions,
assumptions and prejudices about what should be built and how it should
look that prevails in a given culture (which might be the political right or the
artistic avant-garde of Sydney in the 1960s, or world architectural opinion
in the 1990s, or mass tourist culture in 2000, Sydney's Olympic Year).
On a more prosaic and technical level, the specification of distinct ranks
of unit in the systemic-functional model allows one to discriminate the
kinds of choices the architect has made and the kinds of construal we
ourselves make as viewers, visitors and users of the building. And the specification of the systems which make up the 'grammar of architecture' helps us
to understand the nature of the choices the architect has made in relation to
the practical, aesthetic, social, political and financial constraints which are
laid on him and his justification in calling it quits when those constraints
become unmanageable.
Notes
1 Jørn Utzon in an interview for the film The Edge of the Possible: Jørn Utzon and the
Sydney Opera House, director: Daniel Dellora, ABC Television, 20.10.98.
2 Michael O'Toole, The Language of Displayed Art (1994), Chap. 3 'A Semiotics of
Architecture', pp. 85-144.
3 Jørn Utzon in an interview for the film The Edge of the Possible.
4 My versions of Halliday's model for the systemic-functional analysis of painting
and sculpture (O'Toole, 1994) use the term 'Compositional function' for this kind
of meaning in those arts which are primarily for display. In the case of architecture, which, like language, is of practical use as well as display, it seems
appropriate to retain Halliday's notion of the 'Textual function'.
5 Harry Seidler in an interview for the film The Edge of the Possible.
6 Jørn Utzon in an interview for the film The Edge of the Possible.
7 Erving Goffman, The Presentation of Self in Everyday Life (1965).

References
Dellora, D. (1998) The Edge of the Possible: Jørn Utzon and the Sydney Opera House. ABC
Television, 20.10.98.
Goffman, E. (1965) The Presentation of Self in Everyday Life. London: Penguin Books.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
Sykes, J. (1993) Sydney Opera House from the Outside In. Sydney: Playbill Proprietary
Ltd/Sydney Opera House Trust.

Making history in From Colony to Nation: a multimodal
analysis of a museum exhibition in Singapore

Alfred Pang Kah Meng


National University of Singapore

Introduction
This paper explores how systemic-functional (SF) theory may be extended
to a social semiotic analysis of the museum exhibition as a multimodal site.
The museum exhibition is obviously multimodal in that different semiotic
resources, such as photographs, three-dimensional physical objects, space
and language, are co-deployed in complex ways to construct meaning. I
sketch here a preliminary SF framework for the multimodal analysis of a
museum exhibition and exemplify its usefulness in articulating the critical
construction of historical meaning by particular displays in From Colony to
Nation, an exhibition at the Singapore History Museum (SHM) that represents the national history of Singapore. By critical, I mean understanding
how the communicative complexity of the exhibition connects with the
discursive institution of the museum as 'a dynamic power-play of competing knowledges, intentions and interests' (Macdonald, 1998: 3). In particular, I reflect on how the making of Singapore's national history in From
Colony to Nation serves to (re)produce particular dominant imaginings of
Singapore as a 'nation'. The general point here is that making history is
never value-free; it is, rather, imbued with power-knowledge relations1
invested in the site of historical production.

From systemic-functional linguistics (SFL) to
systemic-functional semiotics
The project to extend SF theory into the analysis of multimodal terrains
such as the museum exhibition entails, in the first place, an understanding
of the theory. SF theory, as Halliday (1970, 1973, 1978, 1994) originally
formulates it, has principally centred on language as the object of analysis.
Hence, the emergence of SFL as a method of linguistic analysis informed by
the theoretical conception of language as a social semiotic; that is, language
as meaning potential that evolves with the functions it has to serve in social
living (Halliday, 1973; Halliday and Martin, 1993). As Halliday (1973: 34)
asserts, 'Language is as it is because of what it has to do'. From the
standpoint of SFL, then, language constitutes the social practice of
meaning-making.
Recently, there has been much interest among some practitioners of SF
theory in the analysis of specific non-linguistic semiotic modes of meaning
(e.g. visual images in Kress and van Leeuwen, 1996; displayed art
in O'Toole, 1994; movement in Martinec, 1998), as well as in the co-articulation of meaning between them and language (e.g. Lemke, 1998;
O'Halloran, 1999; Ravelli, 2000; Thibault, 2000). Such an interest makes
explicit the fact that conceiving language as meaning potential necessarily
entails a broadening of perspective that recognizes its co-deployment (and
hence co-evolution) with other non-linguistic semiotic resources in meaning-making. As Thibault (1997: 342, emphasis original) argues, '[t]he linguistic
semiotic is strongly coupled with the various other semiotic modalities in social
semiosis'. It follows, then, that language is as it is not only because of what it
has to do, but also what it does with and to other semiotic resources.
Implicit in the choice of SF theory to facilitate an understanding of what
multimodal texts mean is the assumption that the theory has reached a point
of development where the descriptive tools elaborated for analyzing language can be useful in articulating the dynamic processes of meaning-making within and across various semiotic resources (Baldry, 2000). How
valid is this assumption? That is, what are the spaces within SF theory that
render viable (or not) its extrapolation from linguistic to general semiotic
theory able to cope with the analysis of multimodal texts? Unfortunately,
there is no space here to explore these questions in depth.² For the purpose
of this paper, however, it suffices to recognize that the viability of extrapolating SF theory into the field of multimodal analysis may be claimed on the
grounds that the principles that underpin its description of language are
conceptualized at a level of abstraction relevant to social meaning-making
in general (Kress et al., 1997). These principles are:
1. The generality of Halliday's three metafunctions of language (Ideational,
Interpersonal and Textual) as abstract semiotic functions (see Kress and
van Leeuwen, 1996; Lemke, 1998; O'Toole, 1994).
2. The exotropic lens of SF theory, which conceives of the non-accidental
relation between language and social context, potentially affords the
foundation for modelling contextual semiotics. The crucial implication
here is that 'there are no contextless signs' (Harris, 2000: 81). That is, the
language system which powers various instances of text comes into meaningful existence only in their situation within social context. More than
just the socio-cultural environment, the exotropic lens of SF theory, in the
light of multimodality, entails a refining focus on co-contextualizing relations between language and other semiotic modalities.
Notwithstanding the two principles above, it is crucial to recognize what
Lemke (1998: 110) has termed the principle of incommensurability between sign
systems. That is, every semiotic system embodies its own unique complexity
and the co-articulation between two or more semiotic systems in multimodal
texts is multiplicative of the relative specificity of each semiotic (Lemke, 1998).
Hence, the descriptive tools elaborated for language in SFL cannot be directly
imported and applied to the analysis of other non-linguistic modalities. It may
be necessary to formulate SF descriptions of specific non-linguistic semiotic
resources (e.g. O'Toole, 1994). However, this defines only partially the problematic of multimodality. The development of such specific descriptive tools
should hopefully culminate in some means of (un)packing the processes of
intersemiosis, which Ravelli (2000: 508) defines as 'a co-ordination of semiosis
across different sign systems'. In sum, we need to cultivate an integrational
semiology³ to better understand how multimodal texts work.
Towards the analysis of From Colony to Nation
In this section, I sketch a preliminary SF framework for the interpretative
analysis of the museum exhibition as a multimodal text. It is important to
recognize that the conceptualization of any framework to understand any
social phenomenon is inherently a reductive abstraction from the dynamic
worlds that we inhabit. As such, it is not my intention here to insist on a strict
conformal fit between the proposed framework and the myriad exhibition
styles that one encounters in social living. Rather, I aim to explore those
dimensions that can be useful in articulating and negotiating one's (dis)agreement with others about how an exhibition means. I also develop the
framework as far as it allows me to adequately unpack the ideological nature
of particular displays in the exhibition, From Colony to Nation.
A semiological approach towards museum communication is not new.
Delibasic (cited in Maroevic, 1997: 29), for example, has conceived of the
museum as 'filled with signs or systems of signs, which are at the disposal of
those who know how to interpret them'. It is important to recognize,
though, that museum communication is more than the exhibition. As
Hooper-Greenhill (1999) observes, catalogues, books and souvenirs in
museum shops, for example, also form a strategy through which museums
communicate with the public. Nonetheless, the exhibition warrants primary
attention in museum communication as it is still 'a typical museum medium
for expressing the museum message' (Maroevic, 1997: 30).
Broadly speaking, at least two perspectives may be discerned from the
development of various semiological approaches undertaken in museum
studies. The first tends to centre narrowly on the collection of material
objects as the means par excellence of communication in a museum (e.g.
Pearce, 1991, 1994). Noteworthy in such analyses is the conclusion that the
artefactual significance of objects lies in the socio-cultural relations of their
production, circulation and use. However, it is crucial to recognize that the
values of artefactual objects are as much mediated by the institutional
environment of their display in a museum. This leads us to the second
perspective, which emphasizes the (re)appropriation and (re)interpretation
of artefactual objects in relation to the composite design of an exhibition as a
whole (e.g. Hooper-Greenhill, 1999; Kavanagh, 2000; Vergo, 1989). As
Smith (1989: 19) puts it:
artifacts do not exist in a space of their own, transmitting meaning to the spectator, but on the contrary, are susceptible to a multiform construction of meaning
which is dependent on the design, the context of other objects, the visual and
historical representation, the whole environment.
Such a perspective may be increasingly relevant now, given the prevailing
trend to democratize museums through the creation of audience-oriented
exhibitions, where 'a shift in focus from individual objects to a "whole gallery experience"' (Martin, 1997: 36) is encouraged. Herein lies the pressing
motivation to conceive of the exhibition as a multimodal social semiotic,
where objects are rarely left to 'speak for themselves' (Vergo, 1989: 49), but
mean in collaboration with other semiotic modalities such as space, visual
images and language.
Multimodality in an exhibition implies the multi-tiered complexity of
museum messages. While this has been generally acknowledged in various
studies on the museum exhibition (e.g. Belcher, 1991; Hall, 1987; Hooper-Greenhill, 1999), what remains insufficiently elucidated is the 'what' of
these tiers that underlie the exhibitionary construction of meaning. In this
regard, Halliday's (1994) three metafunctions for language - Ideational,
Interpersonal and Textual - provide a useful dimension to organize this
multi-tiered meaning potential of exhibitions '[as] pieces of functional
design with the purpose of doing a specific task' (Belcher, 1991: 41). Indeed,
this tripartite organization of meaning seems latent in Bennett's (1995: 67,
emphasis mine) conception of the exhibitionary complex, which is an 'ability to
organize and coordinate an order of things and to produce a place for the people in
relation to that order'. The museum exhibition performs an Ideational function
in representing a cultural practice that construes social 'realities'. It realizes
an Interpersonal function by powerfully addressing and shaping the interests of visitors in particular ways. The Textual function orders the interconnected flow of both Ideational and Interpersonal meanings to compose an
exhibition as a coherent and cohesive whole.
The metafunctional organization of the meaning potential of a museum
exhibition has been broadly conceived in Ravelli (1997, 2000). According to
her, the exhibition is a site for intersemiosis, which is 'a co-ordination of
semiosis across different sign systems' (Ravelli, 2000: 508). Rather than the
specific analyses of individual semiotic codes per se, Ravelli (2000)
emphasizes the productivity of a macro-level analysis in unpacking the
interaction between them in an integrated way. The framework formulated
here aims to abstract such macro features of meaning that emerge from the
dynamic interplay of various semiotic modalities deployed in an exhibition.
However, it does not (and perhaps should not) preclude the relevance of
micro-level analyses of individual semiotic systems whenever possible.
To recall an earlier discussion, the nature and extent of their interaction
depend on the relative specificities of each semiotic resource co-deployed.
As such, the interpretative framework I suggest is open to the eclectic application of
particular SF descriptions conceptualized for specific semiotic codes
(for example, language in Halliday, 1994; visual images in Kress and van
Leeuwen, 1996; displayed art - including sculptural and architectural texts - in O'Toole, 1994). The point of integrating such descriptions is to
discern the level of detailed analysis required for each sign system so
as to explicate its interconnection with other semiotic modalities. It is thus
my view that analytical approaches to unpacking multimodal texts in general need to maintain a balance between micro- and macro-level perspectives of the range of semiotic resources coordinated. In practice, of course,
this balance is also subject to the purposes of the research analyst.
Apart from the metafunctions, the logic of a rank-scale in SFL also
provides another dimension to conceptualize the multi-tiered complexity in
an exhibition. In the case of a museum exhibition, it is possible, by analogy,
to postulate a rank scale based on a hierarchical layering of spatial constituents: Museum, Gallery, Area and Surface/Item. These rank units, which I
term Sites, are conceived as different environments wherein an exhibition
can be viewed. Each environment presents a set of dimensions that orients
the analytical 'eye' to interpret the (multi)semiotic space of an exhibition
from a particular angle.
Thus, as conceived in Table 2.1, the three metafunctions and the order of
sites may serve as two axes of a matrix of systemic components that characterize the meaning potential of a museum exhibition. There is, however, no
space here to explain in detail each systemic component in the matrix. It is
hoped that the analysis in the section which follows will sufficiently illuminate some of the components in the matrix. At this juncture, it is worth
stressing that the various components in the proposed functional semiotic
model are, in reality, more fluid than their discrete placing in the matrix
suggests; that is, 'certain features [can] either operate in more than one
function or have consequences for other features from other systems, functions or ranks of unit' (O'Toole, 1999: 6).
I also explore how the co-patterning of these options from various semiotic modalities may be organized by the co-evaluation of some phenomena
along some foregrounded parameter. In this respect, I consider the possibility of extending Appraisal Theory (Martin, 2000a) into the domain of multimodal discourse analysis. The point here is that evaluation can serve as an
integrative principle organizing intersemiosis. The basis for this may be
located deeply in the question whether one can mean anything outside
evaluation. That is, when are humans not evaluating if the view is that the
'worlds' which selves inhabit are always created in dynamic relation with, for
and to others? According to Hernadi (1995: 116):
emotive awareness initiates the dialectical process through which the self and its
world 'make' each other so that the former may begin to 'mean' and 'do' - both
cognize and act upon - the latter.

He further suggests that 'if our evolution has enabled us to evaluate in
greater depth, our evaluations enable us to evolve at a far more unsettling
speed than members of other species' (Hernadi, 1995: 135). That the act
of evaluating is behaviour potential (Ravelli, 2000) that possibly co-extends
with meaning-making through some material-semiotic technology in the
exosomatic evolution of the human species lends natural credence to
evaluation as an integrative principle of intersemiosis. To put it another
way, it is difficult, if not impossible, to mean something without also evaluating it. The co-evolution of language with other semiotic modalities is
probably marked with co-evaluation between them, which in turn intertwines with the larger co-evolution of nature and cultures in the ecology
of being human. From this perspective, evaluation is multi-layered
and takes place at many levels in meaning-making. These various levels
could well afford different scales by which multiple semiotic modes are
combined.
It is critical to recognize that access to and selection of possible configurations of these components towards evaluation in an exhibition are as much
regulated by the communities of values and beliefs invested in the ideological
space of the museum (Hodge and D'Souza, 1999; Karp, 1992). As Hooper-Greenhill (1992: 214) cautions:
The total experience (in living history or interactive exhibits), the total immersion
(in gallery workshops and events), can have the function, in the apparently democratized environment of the museum marketplace, of soothing, of silencing, of
quieting questions, of closing minds.

In other words, the current popular paradigm that pushes for the democratization of the museum does not equal the dissolution of power. Instead, it
indexes the powerful capacity of the museum in strategically negotiating its
institutional authority to position the subjectivities of its audience in particular ways. This ideological motivation of meanings construed and construable in an exhibition is taken seriously in an SF framework that emphasizes
a dialectical relationship between social context and semiotic system(s).
Touring From Colony to Nation: 'Communist United Front'
From Colony to Nation is a permanent exhibition at the Singapore History
Museum (SHM) and displays the national political history of Singapore.
This exhibition, which opened on 19 July 1997, was motivated by the formulation of National Education (NE) by the ruling People's Action Party (PAP).
The idea of NE was initiated by Prime Minister Goh Chok Tong at the
Teachers' Day Rally on 8 September 1996 in response to a survey, which
found students ignorant of Singapore's past, particularly '[t]he country's
struggle against communism, and how it went about getting self-rule and
independence' (The Straits Times, 16 September 1996).⁴ According to Goh
(cited in Wee, Business Times, 31 May 1998):

Table 2.1 Systemic functional framework for a museum exhibition

Site: Museum
  Ideational:    Museum type; Disciplinary Field
  Interpersonal: Target Audience; Architectural Appeal; Public/Private
  Textual:       External environment (Relation to city/Relation to adjacent buildings);
                 Internal environment (Relation to Practical Facilities)

Site: Gallery
  Ideational:    Narrative Design; Interplay of Genres; Interplay of Areas
  Interpersonal: Ideal Visitor; Circulation Path (Traffic Flow/Flow Rate);
                 Setting (Mood) (Lighting, Colour, Size, Volume, Kinds of Object Props)
  Textual:       Internal Cohesion (Sequence of Areas; Focal points; Rhythm; Lighting, Colour,
                 Scale of Exhibits, Display Density, Degree of Partition);
                 Information Composition;
                 External Cohesion (e.g. relative prominence in museum, relation to
                 connectors: corridors, stairways)

Site: Area
  Ideational:    Sub-narrative Theme; Interplay of Surfaces (i.e. displays on walls, floors
                 and ceilings)
  Interpersonal: Circulation Path (Traffic Flow/Flow Rate);
                 Setting (Lighting, Colour, Size, Volume, Kinds of Object Props)
  Textual:       Rhythm; Sequence of Surfaces; Relative Prominence of Area

Site: Surface/Item
  Ideational:    Topics (Sub-topics);
                 Relationship Map (Intra-relationship of elements in an item;
                 Inter-relationship of elements across items);
                 Display Style (Classification; Arrangement);
                 Image-Word-Object: Extra-Vocalization, Semiotic Metaphor, Objectification,
                 Metonymy;
                 Visual semiotic: O'Toole (1994)/Kress and van Leeuwen (1996);
                 Linguistic semiotic: Halliday (1994)
  Interpersonal: Interactivity; Gaze and other sensory modes of attention;
                 Interpretive Path (Directional Path; Focus (CVI));
                 Interplay of modal and compositional elements (e.g. Colour, Light, Shape,
                 Size, Lines);
                 Perspective; Viewing height
  Textual:       Visual Salience; Balance: Flank/Spiral; Alignment;
                 Information Composition; Relative Prominence of Surface/Item


National Education . . . is an exercise to develop instincts that become part of the
psyche of every child. It must engender a shared sense of nationhood, an understanding of how our past is relevant to our present and future. It must appeal to
both heart and mind.

Deputy Prime Minister Lee Hsien Loong reiterated this position at the
formal launch of NE on 19 May 1997, saying that it is 'a concerted effort to
imbue the right values and instincts in the psyche of our young' through
teaching 'the Singapore Story - how Singapore succeeded against all odds
to become a nation'. Thus, From Colony to Nation, which is also referred to as
'The story of Singapore' in the exhibition guide (see Plate 2.1),⁵ has a strong
pedagogic purpose that is tightly circumscribed by the ideals of NE, namely
to underscore the constraints and vulnerabilities of Singapore. I discuss now
how the intent of NE motivates a selective remembering of Singapore's
recent political past, with particular focus on one Area, the 'Communist
United Front', which displays the Communist movement in Singapore after
the Japanese Occupation.
It is worthwhile first to contextualize this Area concerning the Communist
movement in terms of the Narrative Design at the rank of Gallery. Typically
referred to as the 'storyline' among exhibition makers, the Narrative Design
is abstracted as that overall thematic content of an exhibition that binds the
particular selection and arrangement of multiple semiotic systems. As Vergo
(1989: 46) puts it:
in the case of most exhibitions at least, objects are brought together not simply for
the sake of their physical manifestation or juxtaposition, but because they are
part of a story one is trying to tell . . . Through being incorporated into an
exhibition, they [objects] become not merely works of art or tokens of a certain
culture or society, but elements of a narrative, forming part of a thread of
discourse which is itself one element in a more complex web of meanings.

The Narrative Design is, then, an 'interpretative strategy' (Dean, 1994: 103),
within which the subject matter of an exhibition is formulated at several
levels of complexity. An aspect of this complexity lies in the Interplay of
Genres, which is worked through the social experience of a museum visit.
An instance of this would be the experience of picking up and glancing
through an exhibition/gallery guide before viewing the actual three-dimensional display. In From Colony to Nation, where no main introductory
panel is installed, the exhibition guide plays a marked role in providing
visitors with an overview of the content of the display. More significantly,
the exhibition guide, in orientating the visitor to '[t]ake a walk through
history and understand why Singapore must prize her independence above
all else', inflects the historical recount displayed as an exemplum. An
exemplum, according to Martin (2000b: 8), 'relate[s] a sequence of events in
order to make a moral point'. The moral point here is the obligation for
Singaporeans to value positively and not take for granted the country's
independence.

Plate 2.1 Exhibition guide to From Colony to Nation (layout plan)


This interplay between the guide and three-dimensional display may be
conceived as a generic chain (Fairclough, 2001)⁶ across media, which co-evaluates Singapore as a vulnerable body politic. The vulnerability of
Singapore, therefore, forms an even more abstract theme that organizes the
interconnectivity between various semiotic resources in the exhibition. This
state of vulnerability is perceived within the Narrative Design through the
erection of points that risk the status quo established by the PAP government. Communism is one such risky point.
Now, I move into the Area 'Communist United Front' (see Plate 2.2) and
discuss how the co-deployment of various semiotic resources (primarily
written language, visual images and space) serves mainly to discredit the
Malayan Communist Party (MCP).
From the outset, the undesirability of the Communists is already indicated
by the thematic classification of this Area under 'Colony in Chaos' (see Table
2.2, p. 40). In the exhibition on the left wall, this classification is indexed by
the use of a red board on which the linguistic text panel is mounted.
Linguistic text panel

Written language is used in the main text panel and in museum labels.
Table 2.3 (see pp. 42-43) contains a linguistic analysis of the main text panel
in terms of its schematic organization and the sub-system of Attitude in
Appraisal Theory (Martin, 2000a).
Attitudinal evaluations of the MCP and pro-Communists are mostly
negative Judgements on propriety. For example, Material Processes like
'infiltrating' (clause 7), 'exploit/ed' (clauses 8 and 11) and 'incited' (clause
21) dramatically construct a negative Judgement of (pro)-Communists as
reactionary, unlawful, manipulative and perhaps even irrational. Noteworthy too is the accumulation of negative Judgement across clauses 3-10,
which function to elaborate the Thesis. It is interesting to observe how the
series of non-finite clauses (6-10) appears to 'quicken' this accumulation
by allowing a jam-pack of New information, which refers back to 'It' (clause
5) as thematized Actor. This 'It', in turn, anaphorically refers to the MCP.
A cluster of attitudinal evaluations is thus rhetorically woven to intensify the
negative evaluative force on Communism.
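The accumulation described above can be checked by tallying the Judgement categories that Table 2.3 assigns to the MCP in clauses 3-10. The Python sketch below is illustrative only: `judgements` is a hypothetical hand transcription of the table's annotations, it does not distinguish inscribed from token values, and it counts the dual coding of clause 6 under both categories. All the values recorded are negative.

```python
from collections import Counter

# Judgement categories on the MCP per clause, transcribed from Table 2.3 (clauses 3-10).
# Clause 5 carries no attitudinal annotation in the table.
judgements = {
    3: ["capacity", "propriety"],
    4: ["propriety"],
    5: [],
    6: ["veracity", "propriety"],  # coded '-veracity / propriety' in the table
    7: ["propriety"],
    8: ["propriety"],
    9: ["propriety"],
    10: ["propriety"],
}

# Flatten the per-clause labels and count each category.
tally = Counter(label for labels in judgements.values() for label in labels)
dominant, count = tally.most_common(1)[0]

print(tally)
print(dominant, count)
```

On this transcription, propriety is the category that recurs in nearly every annotated clause of the Elaboration, which is the pattern the analysis above describes.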
Noteworthy in the analysis presented above is also the embedding of two
historical recounts - the May 13th Incident in 1954 and the Hock Lee Bus
Riots in 1955 - as examples of Communist-instigated violence. This
embedding has the effect of re-interpreting the historical recounts to the point
of the Thesis (clause 2), which generalizes via an intensive identifying
relational process the use of violence as the primary strategy by which the
MCP aimed to achieve power. Indeed, the negative propriety of the Communists is predicated on this use of violence. The point of this linguistic text
is not to recover the specifics of the actual perpetrators and victims in these
acts of violence. Rather, within the genre of an exemplum, the social process here is to moralize violence as socially undesirable in order to discredit

Plate 2.2 Display area of 'Communist United Front'


Table 2.2 Classificatory scheme of From Colony to Nation

1945-50s                             1960s                     1965-present
Colony in Chaos                      Tides of Transition       Nation-Building

World War II & Southeast Asia        Mighty Malaysia Proposal  On Our Own
Divided Population                   Historic PAP split        We Had to Accept Reality
The Maria Hertogh Riots              Battle for Merger         Political Unrest
A Time of Hardship                   Referendum                Who will Protect Us?
A Political Goal: Union with Malaya  Confrontation             The Struggle to Live
Communist United Front               Political Rivalry         Foreign Relations
1955 General Elections               Economic Problems         Defending Ourselves
Self-Government                      Racial Tension            Economic Growth
                                     Racial Riots              Caring for our People (up to 1970s)
                                     Singapore is Out!         Passing the Baton (1984/1990)
                                                               Our Presidents (1965-present)

Communism and Communalism            Division                  Elaboration of 'national' interests in terms of
                                                               what is needed for Singapore to survive
                                                               (Economic Pragmatism)
                                                               Communitarian values > Unity in Diversity

the Communists. Any act of violence which might have been committed by
the police then is from the start tolerated and legitimized as control.
Moving into space
The spatialization of information is a central feature in the three-dimensional text of an exhibition. As Bennett (1995: 6) remarks, 'an exhibitionary space . . . is a place for "organized walking" in which an intended
message is communicated in the form of a (more or less) directed itinerary'.
The framework here conceives this 'organized walking' as the system of
Circulation Path under the Interpersonal function. There are two aspects to
Circulation Path: Traffic Flow, which concerns the routing through a series
of spaces within an exhibition, and Flow Rate, which relates to how a visitor
is paced along the circulation route throughout a gallery and within an area
of an exhibition. The system of Circulation Path is visually represented in
Figure 2.1.
Apart from the application of Circulation Path, I also examine in
this section the operation of semiotic metaphor in the spatial re-representation of the meanings constructed in the linguistic text panel.


Figure 2.1 System of Circulation Path (adapted from Royal Ontario Museum, 1999)

Following O'Halloran (1996, 1999, 2003a, 2003b), semiotic metaphor
relates to the semantic shift that takes place inter-semiotically, during
which the function of an element may be recoded and new functional
elements may be introduced in the movement from one semiotic resource
to another.
In her investigation of secondary school history, Coffin (1997: 202)
notes the linguistic construal of external and internal time in organizing
the past. The linguistic text panel sets up a chronological template in
which external time unfolds categorically through marked Circumstances
(in bold):
(03) In 1948, it failed in an armed uprising during the Emergency
(19) On 13 May 1954, students and police clashed
(21) In May 1955, the pro-communists incited students to join the Hock Lee
Bus workers in a strike.
Internal time is deployed to build up an explanation about the past and this
is linguistically construed in the text panel via logical links of Cause. Now, the
spatial semiotic also affords the capacity to realize external and internal
time, but perhaps in ways less differentiated than language.
The three-dimensional spatialization of external time can be seen to
involve parallel semiotic metaphor. The events dynamically recounted along
a chronological timeline of marked Circumstances in the linguistic text are
physically bounded in a more or less rectangular enclosure with exhibits
displayed along the two longer walls (see Plate 2.2). The left wall consists of


Table 2.3 Main text panel: schematic organization and attitude

(01) Communist United Front

Thesis
(02) The Malayan Communist Party's (MCP) fundamental [Appreciation: -valuation] aim was the
     establishment of a communist state in Malaya (including Singapore) by revolutionary
     violence [token, Affect: insecurity: disquiet > token, Judgement on MCP: -propriety]

Elaboration
(03) In 1948, it failed [Judgement on MCP: -capacity] in an armed uprising during the
     Emergency [token, Judgement on MCP: -propriety]
(04) and went underground [token, Judgement on MCP: -propriety].
(05) It then changed its tactics
(06) to form a communist-controlled united front [token, Judgement on MCP: -veracity /
     propriety]
(07) by infiltrating [Judgement on MCP: -propriety] into legal organizations such as trade
     unions, students' unions, farmer's associations, Women's Federation, cultural groups
     and political parties
(08) to exploit [Judgement on MCP: -propriety] grievances [Affect: unhappiness: antipathy],
(09) expand their influence [token, Judgement on MCP: -propriety]
(10) and eventually gain control of these organizations, [token, Judgement on MCP:
     -propriety]

Summary
(11) The MCP through the communist united front exploited [Judgement on MCP: -propriety]
     anti-colonial feelings [Affect: unhappiness: antipathy], concern about Chinese
     education [Affect: insecurity: disquiet], feelings of social frustration and economic
     injustice [Affect: dissatisfaction: displeasure].

Exemplify I: Recount of May 13th Incident
(12) When the British announced
(13) that 2,500 youths would be drafted under the National Service Ordinance,
(14) the pro-communists fanned discontent [token, Judgement: -propriety]
(15) by claiming [Judgement: -veracity]
(16) that locals were used
(17) to further colonial rule [token, Judgement on the British: -propriety].
(18) Mass student protest demonstrations were staged [token, Judgement on student
     demonstrators: -propriety > token, Judgement on pro-Communists (agent ellipsed):
     -propriety]
(19) On 13 May 1954, students and police clashed
(20) and 48 students were arrested [token, Judgement on students: -propriety > token,
     Judgement on pro-Communists: -capacity; token, Judgement on police: +capacity].

Exemplify II: Recount of Hock Lee Bus Riots
(21) In May 1955, the pro-communists incited [Judgement: -veracity] students to join the
     Hock Lee Bus workers in a strike [token, Judgement on pro-Communists: -propriety].
(22) Violence broke out [token, Affect: insecurity: disquiet > token, Judgement on
     pro-Communists: -propriety],
(23) resulting in 4 people dead and 31 injured [token, Affect: insecurity: disquiet > token,
     Judgement on pro-Communists: -capacity].

items that relate to the May 13th Incident in 1954, while the right wall
exhibits items associated with the Hock Lee Bus Riots in 1955. There
appears to be a shift from the linguistic construal of time as Circumstance to
its spatial experience in the exhibition as a physical material Thing. It is this
semantic shift that enables the further compression of these events into a
period, negatively appraised in its sub-thematic classification as 'Colony in
Chaos'.
The semantic shift is parallel in the sense that no new functional entities
are introduced in this reconstrual although there is an overlay of meaning
enabled by the system of Circulation Path in the spatial semiotic. The sight
of space simultaneously invites its traversal. The continuous material process of 'organized walking' (Bennett, 1995: 6) now topologically enacts the
dynamic unfolding of time (external and internal) in space. The system thus
activated is that of the Circulation Path. From the perspective of Traffic Flow,
this display on the Communists is situated relatively early in an Arterial
pattern (see Plate 2.1) from left to right. This left-right directional flow is
explicitly insisted upon by the instruction on the Exit Door: 'Please enter
exhibition via door on the left'. Interpersonally, the Arterial pattern promotes a didactic stance in that the visitor is given little choice in choosing
her/his pathway through an exhibition. This textures the importance of this
display since a visitor is made to walk through it anyhow.
Now, I focus on the Flow Rate, which is affected by the arrangement of
walls. In this Area, the two longer walls run parallel to each other and are
conjoined by a straight path through. Movement through this pathway
enacts a conjunctive relation in the Interplay of Walls. This conjunction is
not merely an additive of two external timeframes (referenced as 1954 and
1955 from the linguistic text panel), but also expresses their internal relation
as examples of Communist-instigated violence. More significantly, this spatial design, by its relatively low Degree of Partition, affects a Flow Rate that
tends not to be crowd stopping.
Furthermore, following Arnheim (1982: 61), the two longer walls of the
rectangular enclosure tend to emphasize an axial symmetry, which propels
the Ideal Visitor to move forward and ahead of the Area, towards the portrait
painting of Lee Kuan Yew being sworn in as Prime Minister in 1959. This
coloured oil painting, enshrined in a gold frame, stands out in contrast to the
black walls and the black-and-white photographs used. According to Bal
(1999: 176), the portrait is

[a] genre that bestows authority upon its subject. Its history is bound up with that
of capitalism, individualism, bourgeois culture . . . portraits are made to honor
power.

Thus, apart from visual contrast in the display design, the intertextual allusion to such generic conventions about the portrait marks the painting as a
focal point, which indexes the starting-point within the Narrative Design of
how the elected PAP Government (represented metonymically and authoritatively by the figure of Lee) would overcome all odds to build Singapore
into what it is today. From the perspective of Flow Rate, then, the relative
prominence of the Topic 'Communist United Front' is downplayed. It is not
that the Topic has become less important or significant. Rather, what seems
to be enacted by the continuous Flow Rate is perhaps a channelling of that
significance to an appropriateness of distancing oneself from Communism
towards the promise of social prosperity that the PAP Government has
come to stand for. This gesture of distancing is furthermore directed to
reinforce the negative desirability of Communist activism in general.
Photographic images
I examine the collection of thirteen photographs placed immediately after
the text panel along the left Wall (see Plate 2.3). What probably arrests a
visitor's attention to this collection of photographs is the wired fence. The
significance of this wired fence, other than its role as a focal point that draws
a visitor's Gaze to the photographs, is discussed later in this section. For now,
I concentrate my analysis on some of the photographic images displayed.
For the specific analysis of the meanings constructed in each photograph, I
apply eclectically the SF interpretative frameworks formulated in O'Toole
(1994) and Kress and van Leeuwen (1996). The analyst's situation is, however, further complicated in the medium of a museum exhibition, where
how any single photographic image can mean is as much mediated by its
dissemination alongside other photographs through display practices, two
of which are discussed here: museum labelling and setting.
In relation to the exposition set out by the linguistic text panel, these
photographs serve as artefactual evidence that testify to the 'truth' of the May
13th Incident recounted in clauses 12-20. Following O'Toole (1994), the
Representational content expressed (at the rank of Work) in the thirteen
photographs consists of Scenes of police control and arrest, crowd dispersion
and injury, all of which illustrate the non-productive consequences of the
May 13th Incident. In addition, photographs in black-and-white and particularly sepia not only evoke a sense of the past, but also hark back to the
traditional genre of documentary. As Price (2000: 75) writes of documentary
photography, one implicit claim that underlies its historical development is
that 'it offers us a disinterested and true picture of the world'. It is precisely
this naturalistic coding orientation (Kress and van Leeuwen, 1996; Thibault,
2000) that underpins the evidential value of each photographic image.

Plate 2.3 Display of photographs on the May 13th Incident (left Wall), with Image A labelled

There is, in other words, the social assumption that photography realistically captures 'an immediate and transparent identity between image and
referent' (Phillips, 1998: 155). However, as Ryan (1993, cited in Price, 2000:
69) argues:
Despite claims for its accuracy and trustworthiness, however, photography did
not so much record the real as signify and construct it.
Tagg (1988: 187, emphasis mine) similarly reminds us that
[t]he photographer turns his or her camera on a world of objects already constructed as a world of uses, values and meanings, though in the perceptual process these may not appear as such but only as qualities discerned in a 'natural'
recognition of 'what is there'.
Thus, rather than imputing an ontological status to realism, what is underscored so far is its discursive constitution that invests the photographic image
with an authority to authenticate. The photographs exhibited on this wall
are themselves social semiotic constructions whose perceived naturalistic
coding orientation is worked through the genre of documentary to reify the
facticity of the linguistic recount of the May 13th Incident.
The photographs displayed are reproductions rather than 'originals'. It
follows from this reproducibility that photographic images are 'transmutable
objects . . . involved in endless, complex acts of circulation and exchange'
(Price, 2000: 111). That is to say, '[t]he photograph is not a magical "emanation" but a material product of a material apparatus set to work in specific
contexts, by specific forces, for more or less defined purposes' (Tagg, 1988:
3). The key principle here is the recontextualization (Thibault, 2000: 364-365)
of photography in relation to other social practices configured within particular institutional spaces. In this display, the practice of museum labelling
recontextualizes photographs as archival knowledge. The photographs are
minimally labelled as '1954 Student Riots; National Archives Singapore'.
Such a form of labelling generalizes two reference points: first, it reductively
identifies what the photographs are about under a general classification
'1954 Student Riots'; second, it specifies the source ('National Archives
Singapore') from which these images are retrieved and reproduced.
Now, Smith (1989: 12) has argued that museum labelling 'conceals a
complex history' of artefacts on display. In this instance, however, labels do
not simply hide but recreate the historical significance of the photographs as
an archive. As Sekula observes of photographic archives, 'they heap
together images of very different kinds and impose upon them a homogeneity that is a product of their very existence within an archive' (Price and
Wells, 2000: 59). This thrust towards homogeneity is also directed strategically to achieve particular social purposes. That is, archival knowledge is
never created for its own sake but for its appropriation to serve some (dominant) social impulse to recollect and review past times. The larger point
imprinted is that the significance of photographs as material artefacts is as
much shaped within institutionalized relations of their select use.
I move on to probe deeper into the inter-semiotic mechanisms between
language and visual imagery in this display practice of labelling. It is interesting to observe how labelling photographs in terms of their source may be
analogous to the rhetorical technique of Attribution, where the content
'cited' is now construed by the visual semiotic. Attribution here is also significant in enacting the 'network of intertextual connections' (Lemke, 1995:
11) between SHM and the National Archives of Singapore (NAS). Further
complicating the discourse on heritage and history, then, is the collaboration
of social practices between different memory institutions. In this instance,
the institutional authority of NAS as a 'resource centre for the research and
dissemination of information on the history of Singapore' (NHB 2000: 4) is
evoked to authenticate the documentary value of the photographs exhibited. The institutional status of these photographs as official evidence in turn
determines the credibility of the propositions in the text panel. It is in this
respect that the heteroglossic space of the discourse on Communism tends
towards constriction.
The Arrangement of photographs on this wall further positions the visitor
to sympathize with the police as riot victim. Of the thirteen photographs,
the only visual representation of injury is that of a policeman with a bandaged head (see Image A). Image A, in its portrait formatting, seems visually
salient as a pivotal centre balancing the flanked Arrangement of photographs. Interpersonally, Image A is also prominent as the only photograph
that directly addresses the visitor through the direct outward Gaze of the
wounded policeman. The frontal angle of the shot, coupled with the near
central positioning of the injured policeman with his head slightly tilted,
encourages viewer involvement and amplifies sympathy.
Noteworthy in Image A is the observation that the injured policeman is
non-Chinese (most probably Malay). In fact, most of the policemen captured in the photographs are non-Chinese. The student demonstrators are,
on the other hand, predominantly Chinese. To recall, historical research
(for example, Lee, 1996; Wee, 1999) has reported how the Communist
movement in Singapore during the 1950s and 1960s developed in relation
to its capacity to garner and mobilize support from the world of Chinese-speaking Chinese. As sociologist PuruShotam (1998: 55) also notes, 'The
equation according to which language equals culture equals race mirrored
the perceptions of students, supporters and sympathizers of the cause of
Chinese education'. In this light, the Communists' alignment with the
Chinese-educated may be seen as provoking Chinese Communalism against
colonialism. However, aside from the brief mention in the text panel of the
MCP exploiting 'concern about Chinese education' (clause 11), the racial
dimension of the Communist conflict in Singapore remains relatively
unelaborated in the exhibition. Racialization is also only covertly implied
through skin colour in the photographs. In sum, within the institutional
context of the museum, the photographs acquire a documentary value that
not only objectifies the negative appraisal of the Communists and their
activities, but also layers it with the delicate complexity of race.
What probably arrests a visitor's attention to this collection of photographs is the hapticity7 of the wired fence. The wired fence is used here as
an object prop to 'fabricate' the Setting of a prison. It is within this Setting of
imprisonment that the photographs come to be interpreted as ideational
tokens of negative Judgement on riotous behaviour. The visitor walking
through the floor of this area is simultaneously locked in and out from the
Scenes captured in the photographs. This physical barrier serves as a 'safety
net' that 'protects' the visitor from acts of violence. In preventing the visitor
from having any direct tactile contact with the photographs, the wired fence
enacts a form of metaphorical distancing from a riotous past. Even one's
visual interactivity with these images is 'intervened' by the criss-cross of
wire, as if dictating that these riots in the past should not be allowed to
repeat themselves in present time. What may be implied in this construction
is the importance in preserving the 'safety net' that the PAP Government
has thus far spun for the peaceful progress of Singapore as a nation-state.
The wired fence thus amplifies the scale of the undesirability of the
Communist movement. In addition, the perceived risk of physical pain
evoked by the barbed wiring at the top disciplines the visitor into accepting
police control as a necessary and legitimate deterrent against Communism
lest Singapore becomes a totalitarian state. For some, there may seem to be a
dash of irony here since police surveillance is as instrumental in enforcing a
sense of totalitarianism. Yet, any force wielded by police power remains
hidden and naturalized behind a legalistic frame of social order presently
articulated to criminalize the Communist movement during the 1950s.
Ideological motivation
The exhibition, which displays a dominant 'progressivist national narrative'
that stages 'a transition from a colonial society to a modern capitalist one'
(Wee, 1999: 169, 172, emphasis original), suppresses any formative role the
Communists played in the 'nation-ising' of Singapore. The collective multimodal definition of the Communists as a dangerous riotous Other is filtered
through the dominant lens of communitarian ideology (Chua, 1995) presently held by the PAP Government. Communitarian ideology is most
recently articulated and instituted in the Government's 1991 White Paper
on Shared Values.8
Two of these Shared Values are transmitted through this display. First, the
non-legitimate place of revolutionary violence emphasizes PAP's order of
politics, which is one founded on constitutional consensus rather than conflict; this echoes the Shared Value Consensus instead of contention. Second, downplaying the racial script in this display also aligns the exhibition with the
Shared Value of Racial harmony. As Wee (1999: 170) writes of the delicate
racial communal tension that underlay the mobilization of Communism in
Singapore's 'stage of nationalist politics':

The problem was that if the Chinese-speaking of various linguistic stripes
(Mandarin, Hokkien etc.) were the politicized masses who could be mobilized,
they also represented a problem for the imagining of a multiethnic or multicultural 'national' community. In Singapore, a clear-cut national entity could not
be created, as there was not one single 'nation' on which to build the nation-state -
a common enough problem for a former colonial state.

The representation of the 'Communist United Front', in precluding
the historical view that the PAP initially rode on the force of Chinese
communalism as a source of anti-colonial resistance in nationalist politics,
preserves the party's dominant ideal of multiracialism. This emphasis on
multiracialism is in line with the museum's mission of 'preserving and interpreting the nation's history and material culture in the context of its multicultural origins' (NHB 2000: 5, emphasis mine).
'Closing' the tour
In this paper I have tried to exemplify how the SF theory may be extended
to articulate systematically the complex dimensions of meanings construed
in From Colony to Nation. A major emphasis is the collaboration between these
dimensions in co-evaluating the spectacle of history in particular ways.
Through the analysis I hope to have extended evaluation (or Appraisal theory in SF context) as a discursive end realized by the interaction of various
semiotic systems. This extension of Appraisal theory into the multimodal
terrain of the museum exhibition has also led us to appreciate evaluative
dynamics as essentially multi-levelled. Cortazzi and Jin (2000: 119) have
actually conceived of this multi-level complexity from three perspectives:
evaluation in, through and of narrative. These three perspectives may be
extended to social semiotic practices in general. At the close of this paper,
it might be worthwhile to tease out more clearly for the reader the operation
of these three evaluative levels, which have remained implicit in my preceding analysis of the exhibition.
On the first level, there is evaluation in the exhibition's Narrative Design
where the co-evaluative relations between multiple semiotic resources assess
the historical representations of Communist and communal unrest in specific ways. It is worth emphasizing that, even at this level, evaluation is
shown to be simultaneously implicated and complicated in the Interplay of
Genres configured within particular institutional formations.
As reflected in the analysis, the period of Communist insurgency ('Communist United Front') is evaluated within the Narrative Design as traumatic.
Perhaps, as Antze and Lambek (1996: xii) have observed, 'memory worth
talking about - worth remembering - is memory of trauma'. More importantly, foregrounding the traumatic nature of any incident here is also pointing to its control. As Neal (1998: 5) conceives:
A national trauma involves sufficient damage to the social system that discourse
throughout the nation is directed toward the repair work that needs to be done.

This 'repair work' to recover from the trauma of Communist and communal violence allows the reinstatement of the communitarian ideology
espoused by the ruling PAP Government. Herein lies the second evaluative
level, where through the Narrative Design, the national 'self' of Singapore is
positioned as vulnerable; this vulnerability includes especially the delicate
problem of difference posed by race. In this light, communitarianism is
posed as a form of social discipline cultivated to prevent a relapse into a
traumatic past. This social discipline is hardly resisted primarily because of
its pragmatic effectiveness in sustaining Singapore's material progress. The
body politic of Singapore thus risks trauma if there should be a lapse from
this progress. Underscored in all these is also the discursive positioning of
the museum (SHM) as a State apparatus that plays a political role in reproducing PAP's ideals of a Singapore citizenry. Such politicization resides
precisely in their capacity to structure knowledge.
Finally, on the third level, evaluation of the Narrative Design engages the
researcher's subjectivity in her/his analysis. That is, the interpretative analysis I present in this paper is as evaluative, positioning you to view the
exhibition in a particular light. The interpretative stance I adopt towards the
analysis undertaken here aims to trace how From Colony to Nation naturalizes
dominant conceptions of social 'reality'. It is necessary, though, to add the
qualification that the point here is not to denounce the credibility of the past
represented in the exhibition. Indeed, the emphasis on history as an ideological (re)construction throughout this paper does not mean that those
past events recounted did not happen. Nor should it be easily conflated with
a claim of historical falsity. In fact, if one takes the social constructivist view
of history seriously, notions of 'truth' and 'falsity' appear to be in flux since
the crux of the matter now is how any single interpretation of the past
becomes (de)legitimized, by whom and for what purposes. Further, it is the act
of evaluating that is directive of one's sensibilities to the past. Herein lies the
disciplining act of history, whose representation in the museum is a form of
directed remembering. The flipside of this selective remembering is, of
course, a disciplined forgetting motivated by the ideologies of the dominant
in society.
Museums are then strategically placed in history making. The SF
framework formulated here endeavours to be useful as some form of
'meta-language' that enables visitors to 'talk' systematically about how the
exhibition as a primary composite medium construes ideology. Yet, not just
'talking' about, but also potentially 'talking' back to particular unequal representations displayed in exhibitions. In the final analysis, the museum represents a heterogeneous zone that differentially engages multiple social players
in negotiating (or mutually disciplining) the discursive forces of social
change. It is perhaps for this reason that the museum continues to stand as a
site worth (re)visiting.

Notes
1 Foucault (1977, 1980) conceptualizes the mutual constitution of power and
knowledge in social practices. As Foucault (1977: 194) has argued in Discipline and
Punish: 'In fact, power . . . produces domains of objects and rituals of truth'.
2 For a more detailed consideration of the theoretical basis for extending SF
theory into the domain of multimodality, see Pang (2001: 38-54).
3 Harris (2000) first conceived the term integrational semiology to understand the
multimodal character of writing. Under integrational semiology, Harris (2000:
69, emphasis original) explains that 'signs . . . are not invariants: their semiological
value depends on the circumstances and activities in which, in any particular
instance, they fulfil an integrational function'. Though insightful, Harris remains
vague on the what and how of this integrational function. This paper suggests
that: (1) the metafunctional hypothesis and (2) the realizational dialectic between text and
social context in SF theory help elucidate more concretely the shape of this integrational semiology.
4 Results of the survey are also reported in The Straits Times, 16 September 1996.
For a sample of some of the questions asked in this survey, see The Straits Times,
15 September 1996.
5 I refer to the guide here, not for an exhaustive multimodal analysis of it, but to
distil the exhibition's classificatory scheme (see Table 2.2).
6 According to Fairclough (2001): 'An (inter)action may involve a "chain" of
different, interconnected texts which manifest a chain of different genres'.
7 Following O'Toole (1994: 35), hapticity refers to that three-dimensional quality
in sculpture which 'engages our whole body in an identification with [its] mass
and rhythms'.
8 For a detailed discussion on the promulgation of Shared Values as a National
Ideology, see Hill and Lian (1995: 210-219). There are principally five components in this National Ideology: (1) nation before community and society above
self; (2) family as the basic unit of society; (3) regard and community support for
the individual; (4) consensus instead of contention; and (5) racial and religious
harmony.

Acknowledgements
Plates 2.1, 2.2 and 2.3 are reproduced by courtesy of the Singapore History
Museum, National Heritage Board, Singapore.
References
Antze, P. and Lambek, M. (eds) (1996) Tense Past: Cultural Essays in Trauma and
Memory. London: Routledge.
Arnheim, R. (1982) The Power of the Center: A Study of Composition in the Visual Arts.
Berkeley: University of California Press.
Bal, M. (1999) Memories in the museum: preposterous histories for today. In M. Bal,
J. Crewe and L. Spitzer (eds), Acts of Memory: Cultural Recall in the Present. London:
University Press of New England, 171-190.
Baldry, A. P. (ed.) (2000) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Belcher, M. (1991) Exhibitions in Museums. London: Leicester University Press.
Bennett, T. (1995) The Birth of the Museum: History, Theory, Politics. London: Routledge.
Chua, B. H. (1995) Communitarian Ideology and Democracy in Singapore. London:
Routledge.
Coffin, C. (1997) Constructing and giving value to the past: an investigation into
secondary school history. In F. Christie and J. R. Martin (eds), Genre and Institutions:
Social Processes in the Workplace and School. London: Cassell, 196-230.
Cortazzi, M. and Jin, L. (2000) Evaluating evaluation in narrative. In S. Hunston
and G. Thompson (eds), Evaluation in Text: Authorial Stance and the Construction of
Discourse. Oxford: Oxford University Press, 102-120.
Dean, D. (1994) Museum Exhibition - Theory and Practice. London: Routledge.
Fairclough, N. (2001) Genre in Critical Discourse Analysis: Researching Language in New
Capitalism. Handout to keynote address at conference: Genres and Discourses in
Education, Work and Cultural Life: Encounters of Academic Disciplines on
Theories and Practices, 13-16 May 2001, Oslo, Norway.
Foucault, M. (1977) Discipline and Punish. London: Penguin.
Foucault, M. (1980) Power/Knowledge: Selected Interviews and other Writings 1972-1977.
London: Harvester Press.
Hall, M. (1987) On Display: A Design Grammar for Museum Exhibitions. London: Lund
Humphries.
Halliday, M. A. K. (1970) Language structure and language function. In J. Lyons
(ed.), New Horizons in Linguistics. Harmondsworth: Penguin Books, 140-165.
Halliday, M. A. K. (1973) Explorations in the Functions of Language. London: Edward
Arnold.
Halliday, M. A. K. (1978) Language as a Social Semiotic: The Social Interpretation of
Language and Meaning. London: Edward Arnold.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edition). London:
Edward Arnold.
Halliday, M. A. K. and Martin, J. R. (eds) (1993) Writing Science: Literacy and Discursive
Power. London: The Falmer Press.
Harris, R. (2000) Rethinking Writing. London: Athlone Press.
Hernadi, P. (1995) Cultural Transactions: Nature, Self, Society. Ithaca: Cornell University
Press.
Hill, M. and Lian, K. F. (1995) The Politics of Nation Building and Citizenship in Singapore.
London: Routledge.
Hodge, R. and D'Souza, W. (1999) The museum as a communicator: a semiotic
analysis of the Western Australian Museum Aboriginal Gallery, Perth. In
E. Hooper-Greenhill (ed.), The Educational Role of the Museum (2nd edition). London and New York: Routledge, 53-63. (First appeared in Museum 31(4) (1979).)
Hooper-Greenhill, E. (1992) Museums and the Shaping of Knowledge. London: Routledge.
Hooper-Greenhill, E. (1999) Education, communication and interpretation:
towards a critical pedagogy in museums. In E. Hooper-Greenhill (ed.), The Educational Role of the Museum (2nd edition). London: Routledge, 3-27.
Karp, I. (1992) Introduction: museums and communities: the politics of public
culture. In I. Karp, C. M. Kreamer and S. D. Lavine (eds), Museums and Communities. Washington: Smithsonian Institution Press, 1-17.
Kavanagh, G. (2000) Dream Spaces: Memory and the Museum. London: Leicester
University Press.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G., Leite-Garcia, R. and van Leeuwen, T. (1997) Discourse semiotics. In T. A.
van Dijk (ed.), Discourse as Structure and Process. London: Sage Publications, 257-287.
Lee, T. H. (1996) The Open United Front: The Communist Struggle in Singapore 1954-1966.
Singapore: South Seas Society.
Lemke, J. L. (1995) Textual Politics: Discourse and Social Dynamics. London: Taylor &
Francis.
Lemke, J. L. (1998) Multiplying meaning: visual and verbal semiotics in scientific
text. In J. R. Martin and R. Veel (eds), Reading Science: Critical and Functional Perspectives on Discourses of Science. London: Routledge, 87-113.
Macdonald, S. (1998) Exhibitions of power and powers of exhibition: an introduction to the politics of display. In S. Macdonald (ed.), The Politics of Display.
London: Routledge, 1-24.
Maroevic, I. (1997) The museum message: between the document and information.
In E. Hooper-Greenhill (ed.), Museum, Media, Message. London: Routledge, 24-36.
Martin, D. (ed.) (1997) Museum Practice. Issue 5(2/2): 36-38.
Martin, J. R. (2000a) Beyond exchange: Appraisal systems in English. In S. Hunston
and G. Thompson (eds), Evaluation in Text: Authorial Stance and the Construction of
Discourse. Oxford: Oxford University Press, 142-175.
Martin, J. R. (2000b) Grammar meets Genre: Reflections on the 'Sydney School'.
Inaugural Lecture at Sydney University Arts Association.
Martinec, R. (1998) Cohesion in action. Semiotica 120(1/2): 168-180.
National Heritage Board (2000) National Heritage Board (NHB) 1998/1999 Annual
Report.
Neal, A. G. (1998) National Trauma and Collective Memory: Major Events in the American
Century. London: M.E. Sharpe.
O'Halloran, K. L. (1996) The discourses of secondary school mathematics,
unpublished Ph.D. thesis. Murdoch University, Western Australia.
O'Halloran, K. L. (1999) Interdependence, interaction and metaphor in multisemiotic texts. Social Semiotics 9(3): 317-354.
O'Halloran, K. L. (2003a). Educational implications of mathematics as a
multi-semiotic discourse. In M. Anderson, A. Saenz-Ludlow, S. Zellweger, and
V. V. Cifarelli (eds), Educational Perspectives on Mathematics as Semiosis: From Thinking to
Interpreting to Knowing. Ottawa: Legas Publishing, 185-214.
O'Halloran, K. L. (2003b). Intersemiosis in mathematics and science: grammatical
metaphor and semiotic metaphor. In A.-M. Simon-Vandenbergen, M. Taverniers, and L. Ravelli (eds), Grammatical Metaphor: Views from Systemic Linguistics.
Amsterdam: John Benjamins, 337-365.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
O'Toole, M. (1999) From systems to hypertext: navigating semiotic space in the
visual arts. Plenary paper presented at the 6th Congress of the International
Association for Semiotics Studies, Guadalajara, Mexico on Semiotics Bridging Nature
and Culture.
Pang, K. M. A. (2001) Disciplining history: a multimodal analysis of the museum
exhibition 'From Colony to Nation', unpublished MA thesis. National University of
Singapore.
Pearce, S. M. (1991) Objects in structures. In S. M. Pearce (ed.), Museum Studies in
Material Culture. Washington, DC: Smithsonian Institution Press, 47-59.
Pearce, S. M. (ed.) (1994) Interpreting Objects and Collections. London: Routledge.
Phillips, D. (1998) Photo-Logos: photography and deconstruction. In M. A.
Cheetham, M. A. Holly and K. Moxey (eds), The Subject of Art History: Historical
Objects in Contemporary Perspectives. Cambridge, UK: Cambridge University Press,
155-179.
Price, D. (2000) Surveyors and surveyed: photography out and about. In L. Wells
(ed.), Photography: A Critical Introduction. London: Routledge, 65-115.
Price, D. and Wells, L. (2000) Thinking about photography: debates, historically and
now. In L. Wells (ed.), Photography: A Critical Introduction. London: Routledge, 9-63.
PuruShotam, N. S. (1998) Negotiating Language, Constructing Race: Disciplining Difference
in Singapore. Berlin: Mouton de Gruyter.
Ravelli, L. J. (1997) Making meaning: how, what and why? Paper presented at the
conference Museum Making Meanings Communication by Design?, Australian
National Maritime Museum.
Ravelli, L. J. (2000) Beyond shopping: constructing the Sydney Olympics in three-dimensional text. Text 20(4): 489-515.
Royal Ontario Museum (1999) Spatial considerations. In E. Hooper-Greenhill (ed.),
The Educational Role of the Museum (2nd edition). London: Routledge, 178-190.
(First appeared in Communicating with the Museum Visitor (1976).)
Smith, C. S. (1989) Museums, artefacts and meanings. In P. Vergo (ed.), The New
Museology. London: Reaktion Books, 6-21.
Tagg, J. (1988) The Burden of Representation: Essays on Photographies and Histories.
London: Macmillan.
The Straits Times (15 September 1996) Many students ignorant of Singapore history.
The Straits Times (16 September 1996) Students know little of Singapore history:
survey.
Thibault, P. J. (1997) Re-reading Saussure: The Dynamics of Signs in Social Life. London:
Routledge.
Thibault, P. J. (2000) The multimodal transcription of a television advertisement:
theory and practice. In A. P. Baldry (ed.), Multimodality and Multimediality in the
Distance Learning Age. Campobasso, Italy: Palladino Editore, 311-385.
Vergo, P. (1989) The reticent object. In P. Vergo (ed.), The New Museology. London:
Reaktion Books, 41-59.
Wee, C. J. W-L. (1998) The need for National Education in Singapore. In Business
Times, 30-31 May 1998.
Wee, C. J. W-L. (1999) The vanquished: Lim Chin Siong and a progressivist
national narrative. In Lam Peng Er and Kelvin YL Tan (eds), Lee's Lieutenants:
Singapore's Old Guard. St Leonards, NSW: Allen & Unwin, 169-244.

3 A semiotic study of Singapore's Orchard Road and Marriott Hotel
Safeyaton Alias
National University of Singapore

Introduction
Cities are more than a place to live, to work or to play in. As people observe
the city while they move through it (Lynch, 1996), the city serves as a
political and social statement, and in some cases, symbolizes and
encompasses the achievement and political prowess of the country's ruling
elite. This is especially true in the case of Singapore where the city becomes
a showcase of what has been politically and economically achieved by the
People's Action Party (PAP) over the years since independence in 1965.
Within a span of thirty-five years, for instance, the country has achieved one
of the highest living standards in Asia, which has led some economists to
proclaim it a modern miracle. Because Singapore lacks natural resources and has to
rely on its human resources, it has been suggested that 'the capitalist road was [perhaps] the only one open' (Chua, 1995: 59). The number of
buildings and shops in Orchard Road stands as testimony to the realization
of Raffles's vision of a 'bustling emporium' (Jayapal, 1992: 67). A city is
therefore 'man's single most impressive and visible achievement' (Pike,
1996: 243) while remaining nonetheless a 'social institution' (Mumford,
1996: 184).
A city or a 'built world, like a written text, stores information' and 'presents particular transformations and embeddings of a culture's knowledge of
itself and of the world' (Preziosi, 1984: 50-51). The built world is an exhibit
of the culture of a given society, which in some ways reflects the ideologies
that operate within that society. Buildings, for example, 'are not just functional machines; they have signs of their practical functions written all over
them: they signify their function as use' (O'Toole, 1994: 85); that is, 'buildings
are designed to mean something' (Stern, 1994: 47). Architecture is part of a
society's culture which affirms and re-establishes its values and ideals; it is
the representation of power (Betsky, 1994; Stern, 1994) and, whether positive or negative, the city or the built world is the image of the community
(Pike, 1996).
This paper therefore sets out to investigate the nature and manifestation
of the prevailing ideologies within the society of Singapore. To achieve this
purpose Singapore is treated as a text and indeed, it is a discourse worth
investigating, analyzing and interpreting. Like a text, Singapore has been
structurally organized but with a difference: the country is three-dimensional and multimodal. Like a text, it too leaves itself open to interpretation but how it is interpreted depends on one's theoretical perspective.
The interpretation of a text requires the application of theory and, in the
case of a city, involves the interpretation of the integration of the various
meaning-making resources. Although part of a larger research project
(Safeyaton, 2001), due to space constraints the focus of the analysis and
discussion in this paper is restricted to Orchard Road, that significant part
of Singapore popularly known as the 'town'. It is here, and more specifically
at the Marriott Hotel, that East is commonly believed to meet West.
The theoretical approach underlying these analyses is Michael Halliday's
(1994) social semiotic theory of language, which has been extended to visual
images and architecture (O'Toole, 1994). From this perspective, language,
visual images and architecture are viewed as social semiotic resources which
are metafunctional; that is, they simultaneously realize Textual, Interpersonal
and Experiential meaning. The systemic-functional frameworks used in the
analyses are discussed in more detail below.
The semiotics of Singapore
As a city, Singapore is not static; it grows and reinvents itself according to
the changing needs and demands of its society (Tan, 1999). The physical
features of a city such as Singapore, where there is constant development
and redevelopment, are therefore not permanent. Changes to Singapore are
deemed necessary as it continues to aspire to be a 'model city', that is, a city
that is livable, attractive, business-friendly and accessible (URA Annual Report
1998/1999). While national objectives must be met, the planning of a city
needs also to consider the needs of the people who must be assured that
housing is 'affordable and comfortable', that there are 'enough public spaces
to provide [them with] urban relief', that there are the 'necessary telecommunications and fiscal infrastructure' and that there is efficient and
affordable public transport (ibid.: 31). People should be able to move easily
from one designated area to another, for purposes of work or recreation.
This freedom to move about permits the city dwellers to be in touch with the
environment. As a result, a person develops a relationship with his or her
surroundings and that relationship is physical, emotional, mental, cultural
and even religious.
The making and planning of modern-day Singapore, however, has been
an intensive and prolonged enterprise. Its urbanization planning began with
the formulation of the Island Concept Plan, also known as the Singapore Concept
Plan and later as the Master Plan, on 1 January 1952. The Master Plan was
aimed at regulating the development of land through plot zoning and plot/
density controls. Since its implementation, and as required by legislation,
the Master Plan has undergone several reviews involving various additions
and alterations. After Singapore's separation from Malaysia in 1965, the
plan was reviewed and renamed the Concept Plan in 1971 by an appointed
local authority. The Central Area Plans came to fruition between 1974 and 1989
only to be renamed the Revised Concept Plan in 1991. In 1998 the latest
revision of the Concept Plan was presented (Dale, 1993; Fong, 1973; Master
Plan Written Statement, 1993; Tan, J. H., 1972; Tan, S., 1999; URA Annual
Report, 1997/98).
Part of the Concept Plan's objectives is to meet what the authorities perceive as 'the new wants and needs of [the] people' (Dale, 1993: 42). This is
accomplished by improving the living environment and by offering, or
rather prescribing, a better quality of life. This includes a policy of
decentralizing commercial activities to avoid overcrowding in any one area,
specifically the Central Area of which Orchard Road is a part (Dale, 1993).
But the ideas and the benefits outlined in the Concept Plan can only be
successfully implemented with a healthy economic growth. This becomes a
platform through which the authorities can justify their actions and
decisions both politically and in terms of the development practices. On the
business front, for example, one aim of the Concept Plan is to provide 12,000
hectares of land for industrial needs (Keung, 1991). In addition, 'judicious'
investments in the leisure industry will be welcomed because such investments mean 'good business' and will 'enhance [the Singaporeans'] quality
of life' (Liu, 1991: 4). In an effort to add 'life and character' to the streets as
well as making them 'more exciting and lively' (URA Annual Report, 1997/98:
28), the regulations for the setting up of outdoor refreshment areas and
outdoor kiosks along the pedestrian malls in the city were relaxed in July
1996. Previously, these outlets could occupy only 10 per cent of the total
building length but this is now 25 per cent, resulting in more outlets being
set up along pedestrian malls, especially along Orchard Road. These outlets
bring in additional income for the authority in the form of 'payment of
development charges or different premiums' (ibid.). Hence, every metre of
unoccupied space in, around, below and above Singapore has potential for
extra revenue. This provides a boost to the economy with the Singaporeans
themselves helping to sustain that economy; the system and the people
depend on each other.
A visitor to Singapore, however, is likely to have little knowledge of how
the country has been transformed historically although he or she may have
seen where the locals live, how they travel, where they eat, work, play, shop
or seek medical attention. The visitor sees how the country 'operates' but
he or she may not be able to explain how this is possible because, more
likely than not, the visitor is not equipped with the knowledge or the tools
to explain what he or she sees or feels. For the uninitiated, Singapore
'explains' its operations very well because every part of the country, be it a
designated area, its roads, the open spaces or the buildings, transmits
explicit messages. Each of these 'speaks' to or 'addresses' the visitor directly. While each part of the city of Singapore has its specific functions or purposes, linguistically there is also a physical and Textual representation
which transmits messages.

Table 3.1 Functions and systems in Singapore

Unit: Area (Rank 1)
  Experiential:
    Zone: north, north-east, central, west, south
    District: CBD, non-CBD
    Location: mainland, offshore islands
    Theme: business, cultural, educational, entertainment, medical, recreational, religious, residential
    Portrayals: cultural, social, religious
    Interplay between each theme
    Focal: business, cultural, social, religious
    Self-containment
  Interpersonal:
    Size
    Orientation to general amenities
    Orientation to members of the public: accessibility, affordability
    Characterization: Oriental, occidental
    Sites of power
    Message
  Textual:
    Relation to outer areas
    Relation to MRT stations, bus terminals
    Rhythms: contrasting themes, building shapes
    Relation to prestige
    Coherence and cohesion: repetition of themes, new and old buildings (preserved and conserved)

Unit: Roads/MRT (Rank 2.1)
  Experiential:
    Specific functions:
      Expressways (ERP/non-ERP)
      Roads
      Flyovers
      Tunnels
      MRT tracks (aboveground/underground)
  Interpersonal:
    Travelling hours: peak/non-peak period
    Size and spaciousness: main road, minor road, slip road, one-way/two-way traffic, two/three lanes
    Orientation to entrant: user friendliness, accessibility to public transport (bus lanes), general public, fire-fighters, paramedics
    Orientation to buildings
    Characterization: MRT stations, bus-stops, street lights, road names, road signs, signboards
    Lighting
    Openness
    Soft/hard texture: concrete, asphalt, dirt track
  Textual:
    Relation to other roads, MRT stations
    Relation to other areas
    Relation to buildings
    Relation to safety
    External cohesion: relation to connectors, escalators

Unit: Open Space (Rank 2.2)
  Experiential:
    Specific functions:
      Road dividers, islands
      Road shoulders
      Pavements/Footpaths
      Parking space
      Grass verge/Green belt
      Open field: recreational, business
      Burial grounds
      Public space
      Private space
  Interpersonal:
    Spaciousness
    Openness
    Orientation to entrant: accessibility
    View
    Relevance
    Comfort: sheltered/unsheltered walk-ways, shades, benches
    Lighting: natural, artificial
    Hard/soft textures: concrete, asphalt, grass
    Colour
  Textual:
    Relation to bus-stops, taxi stands, roads, MRT stations
    Relation to area/theme
    Relation to buildings
    Relation to safety
    Relation to power and prestige
    Degree of visibility
    Degree of partition
    External cohesion: relation to connectors, stairs, overhead bridges, pedestrian crossings, underground passage
    Permanence of open space
    Permanence of partition

To help 'explain' Singapore, that is, to analyse and interpret the city,
which is three-dimensional and multi-semiotic, a framework featuring a
rank-scale for the functions and systems is proposed (Table 3.1). The multiplicity of the framework means that the city can be read 'backwards or
forwards, upwards or downwards, and inside to outside' (Preziosi, 1984: 55).
The framework may be used to analyse from the whole to the smallest unit
in the city. This means that the semiotic analysis of the city of Singapore
begins with the unit Area at Rank 1, followed by the units Roads/MRT at
Rank 2.1 and the unit Open Space at Rank 2.2 (Table 3.1). The analysis of
the smallest unit in a city, that is, Elements contained in a room or on a floor
in a Building at Rank 2.3 in Table 3.2, completes the semiotic analysis.
Alternatively, because the city is three-dimensional and multimodal, it is
possible to perform the analysis from the lowest rank to the highest, that is,
from the unit Element in a Building (Rank 2.3) upwards to the unit Area
(Rank 1). Although beyond the scope of this paper, Singapore could be
conceived as the total sum of these Areas.
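The two directions of analysis described above can be illustrated with a minimal sketch in Python. It is an illustration only and not part of the framework itself: the names Unit, top_down and bottom_up are hypothetical, and the nesting of Roads/MRT, Open Space and Building inside a single Area is an assumption drawn from the unit labels in Tables 3.1 and 3.2.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Unit:
    """One unit on the rank-scale (hypothetical model of Tables 3.1 and 3.2)."""
    name: str
    rank: str
    children: List["Unit"] = field(default_factory=list)


def top_down(unit: Unit) -> Iterator[str]:
    """Yield unit names from the whole (Area) down to the smallest unit."""
    yield unit.name
    for child in unit.children:
        yield from top_down(child)


def bottom_up(unit: Unit) -> List[str]:
    """Return unit names in the reverse order, from the smallest unit up to the whole."""
    return list(reversed(list(top_down(unit))))


# Assumed nesting: an Area contains roads, open spaces and buildings;
# a Building contains Floors, Rooms and Elements (all at Rank 2.3).
element = Unit("Element", "2.3")
room = Unit("Room", "2.3", [element])
floor = Unit("Floor", "2.3", [room])
building = Unit("Building", "2.3", [floor])
area = Unit(
    "Area",
    "1",
    [Unit("Roads/MRT", "2.1"), Unit("Open Space", "2.2"), building],
)

print(list(top_down(area)))
print(bottom_up(area))
```

Reading the same structure in either order yields the same set of units, which is the sense in which the analysis may proceed from the whole to the part or from the part to the whole.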
As buildings constitute an essential part of a city, O'Toole's (1994: 86)
chart for architecture has been incorporated into Table 3.2. Although the
chart has been amended to suit the Singapore context because 'the existence
of built form is not universal in all cultures' (Preziosi, 1984: 52), the change
is minimal. Elements such as the characterization of a building, that is,
whether it is occidental or Oriental, for example, or how it is oriented
towards the MRT station, have been incorporated into the framework. As
most buildings in Singapore are designed to be either self-contained (for
example, a hotel) or interdependent (for example, a market), they are treated
as individual episodes that help to contribute to or to complement the design
of the whole area. In other words, there is interaction or 'interplay' between
these Episodes.
Functions and systems in Singapore
The built world has functions that are wittingly or unwittingly designated or
prescribed. In land-scarce Singapore, the 'spatial products' or 'the built
forms' are likely to be designed to be multi-functional, that is, the practical
functions of a product very often overlap (Preziosi, 1984). A 'rank' or a 'unit'
links each of these built forms to the other. Major roads link one area or a
'unit' to another (see Table 3.1). Within an area, there are roads and open
spaces, which will eventually lead to buildings where there may be different
levels or storeys, with different rooms for different functions. Depending on
its function, each room may have a different layout or decor. While the
practical functions of an area or a building are considered to realize
Experiential meaning, the relationship between these practical functions and
its design or planning is Textual. The consistency and the repetition of a
specific theme in a particular area in Singapore means that textually it has
been designed to 'blend' and to 'fit' and construct the culture of the people.
Each unit operates or functions in relation to another, usually a neighbouring one, and to its general surroundings or environment. At the same time in
the built world, the 'built forms' (Preziosi, 1984) will directly or indirectly
command the people's involvement and interaction; our senses respond in
specific ways to our natural environment (Kress, 2000). The framework in
Table 3.1 lists the systems which function Interpersonally to engage us with
our environment. In what follows, I analyse and interpret the built forms of
Orchard Road and the Marriott Hotel. These analyses reveal how
specific ideologies are manifested in the city of Singapore.
A semiotic analysis of Orchard Road

Commercial developments in Orchard Road began in the early 1900s when
stores were established to provide residents with fresh produce and food
supplies. In the 1950s, when the late C. K. Tang opened a department store,
it marked the beginning of rapid commercial development within the area.
Entertainment centres and hotels were soon built to cater to the demands of
the locals as well as the buoyant tourist industry. The construction of
Orchard and Somerset MRT stations and their locations within the Central
Business District reaffirmed Orchard Road's importance and its status as a
'dynamic activity corridor' (Orchard Planning Area, 1994: 9).
Orchard Road, a seven-lane and one-way-traffic road, stretches from
Delfi Orchard to Plaza Singapura and is an area where open parking spaces
for cars are limited. Most of these parking spaces are found within the
confines of hotels or shopping malls where parking fees are high. Textually,
this demonstrates the area's 'relation to prestige' and Orchard Road is
indeed a prestigious area where public Housing Development Board (HDB)
flats are not found. Building heights reach their maximum in the vicinity of
Orchard MRT station where buildings reach thirty storeys high. The height
tapers to ten storeys towards the Singaporean Presidential Palace or Istana
and twenty storeys towards the Tanglin zone (Orchard Planning Area, 1994)
(Plate 3.1). We may note that the Size and Verticality of a building is an
indication of its importance and status (O'Toole, 1994). Metaphorically, the
value of commercial or cultural activities reaches its peak at the junction
of Scotts Road and Orchard Road where the Marriott Hotel is located.
Suffice it to say that Electronic Road Pricing (ERP) begins at this junction.
Indeed, the lack of parking spaces means people are encouraged to use 'the
efficient and affordable' public transport (URA Annual Report, 1998/1999:
31) to reduce traffic congestion and to solve parking problems.
On both sides of the road, Open Spaces such as the wide pedestrian walkways facilitate smooth pedestrian flow but people are not induced to visit or
to walk along these Open Spaces if there is no place to sit (Whyte, 1996).
Therefore practical Interpersonal elements such as benches, which are made
of durable and maintenance-free concrete or granite or wrought iron, are
selectively provided. People are socially engineered 'to get into new habits'
(Whyte, 1996: 111) such as walking rather than driving because of the lack
and excess of particular types of Open Spaces along Orchard Road. The

Table 3.2 Functions and systems in a building (adapted from O'Toole 1994: 86)

Unit: Building (Rank 2.3)
  Experiential:
    Practical Function: Business, Cultural, Educational, Entertainment, Governmental, Medical, Private/Public, Recreational, Religious, Residential
    Orientation to light
    Orientation to wind
    Orientation to earth
    Orientation to service (water, power)
    Episode: self-contained, interdependent
    Interplay of episodes
  Interpersonal:
    Size (relation to area and setting)
    Orientation to neighbours, adjacent buildings
    Orientation to road, MRT tracks and stations
    Orientation to entrant
    Facade
    Modernity
    Colour
    Cladding
    Characterization
    Colour
    Intertextuality: reference, mimicry, colour
    Exoticism
  Textual:
    Proportion (height/breadth/length)
    Relation to external area
    Relation to road/MRT station
    Relation to adjacent buildings
    Relation to permanence: old, new, preservation, conservation
    Rhythms: contrasting shapes, angles, colours
    Textures: rough/smooth
    Roof/wall relation
    Opacity
    Reflectivity
    Cohesion: interplay of episodes

Unit: Floor
  Experiential:
    Sub-functions:
      Access
      Working
      Selling
      Administration
      Storing
      Waking
      Sleeping
      Parking
  Interpersonal:
    Height
    Spaciousness
    Accessibility
    Openness
    View
    Hard/soft texture
    Colour
    Sites of power
    Separation of groups
  Textual:
    Relation to other floors
    Relation to outer world
    Relation to connectors, stairs, lifts, escalators (external cohesion)
    Relation of landing/corridor/room/foyer/room (internal cohesion)
    Degree of partition
    Permanence of partition

Unit: Room
  Experiential:
    Specific functions:
      Access
      Entry
      Lobby
      Dining
      Bedroom suites
      Bathroom/toilet
      Fitness centre/games room
      Restaurants, bar, lounge
      Kitchen
      Ensuite
      Servery
      Foyer
      Laundry
      Retreat
  Interpersonal:
    Comfort
    Lighting
    Modernity
    Sound
    Opulence
    Welcome
    Style: rustic, pioneer, colonial, suburban, 'Dallas', working class, tenement, slum
    Foregrounding of functions
  Textual:
    Scale
    Lighting
    Sound
    Relation to outside world
    Relation to other rooms
    Connectors: doors, windows, hatches, intercom
    Focus (e.g. hearth, dais, altar, desk)

Unit: Element
  Experiential:
    Lighting: windows, lamps, curtains, blinds
    Air: window, fan, conditioner
    Sound: carpet, rugs, partitions, acoustic treatment
    Seating: function, comfort
    Table: buffet, dining, coffee, computer
    Counter: cash, reception, bar
  Interpersonal:
    Relevance
    Functionality: convention/surprise
    Texture: rough/smooth
    Newness
    Decorativeness
    Stance
    Stylistic coherence
    Projection
  Textual:
    Texture
    Positioning: to light, other elements
    Finish

Plate 3.1 The shopping map from This Week Singapore

area appears to rely on the concept of 'supply creates demand' (ibid.); that is,
the restricted nature of parking spaces and the open spaces creates the
demand for public transport.
The built forms in Orchard Road place great emphasis on the Interpersonal
metafunction. Commercial developments along the pedestrian routes are
encouraged to 'have activity-generating uses on the [ground floor]' (Orchard
Planning Area, 1994: 20) and as a result, shops and restaurants open directly
to the mall, beckoning pedestrians as well as supporting street activities. The
open spaces are planned so that people instinctively walk into the air-conditioned interiors of the shopping malls to escape the humidity of the
outdoors. The 'progression from street to interior is critical' (Whyte, 1996:
117) and Orchard Road has been planned such that it is hard to tell when
one transition ends and when the other begins. Pedestrians also have visual
access to the products on sale at the ground floor shops, which are encased
behind glass panels. Window displays are usually used to attract the attention
of the female pedestrians and cater first to what are perceived as the primary
needs of women: cosmetics, fine jewellery, clothing as well as their coordinated accessories while condoms at the Lucky Plaza shops are arranged to
resemble a bouquet of flowers. A major part of the business strategy is to
capture the female eye first. Seen textually, sex implicitly becomes the selling
point in Orchard Road in what largely remains a patriarchal society.
The presence of overseas investors in Orchard Road is ubiquitous and
thus there is a reinforcement of the culture of consumerism. For example, at
the time of writing, twenty-five outdoor refreshment outlets (OROs) are
located along Orchard Road. Located on both sides of the road, these
outlets serve coffee and tea and food such as burgers and fries; that is,
foreign imports from the West. It is common to see several outlets promoting
the same items but under different trade names. Patronizing these outlets
has become a way of life. These OROs have built a 'new constituency'
(Whyte, 1996: 111) where people are subconsciously trained to adopt new
habits such as having alfresco lunches. These outlets also act as an avenue
for the people to see and be seen and this has given rise to a new street
culture that is readily embraced. As competition among the various investors intensifies, 'campaigns' are launched to remind consumers, particularly
the young, of the products' existence, which are readily accessible and available to them. Hence, these OROs are located a few hundred metres away
from one another. While these outlets operate textually because they contribute to the thematic 'consistency' of Orchard Road, they have what
O'Toole describes as 'powerful [and serious] Interpersonal implications'
(O'Toole, 1994: 103). The 'repetition of themes' ensures that people would
not miss or forget these products. To invest in the young and impressionable
is therefore to invest in the future of Singapore. Such investments guarantee
the survival of these products and the continuous Western presence. Equally
important, these OROs continue to draw revenues for the authorities.
Ironically though, while these OROs are located at strategic and prime
locations, that is, they are in the Open Space and visible from the road, outlets
serving local Singaporean fare are usually confined within a building, often
at the basement or the back of the building or at a side road and away from
the main road. Although the nature of Asian cooking is highly suitable for
the outdoors, it does not or rather is not allowed to fit into the context
of Orchard Road. Textually, a conscious effort has been made to ensure
Orchard Road projects and reinforces the sophisticatedly developed clean
and green image that has become synonymous with the image of Singapore.
The fact that ideas for these outlets were imported from overseas (URA
Annual Report, 1998/99) and are expected to 'make our streets more exciting
and lively' while 'adding life and character to our streetscape' (URA Annual
Report, 1997/98: 28) suggests the relative value of Asian culture. The hotels
may not necessarily cater solely to European tourists, but nevertheless, a
foreign culture is foregrounded while its Asian counterpart is backgrounded.
The message is clear: anything foreign, imported and specifically Western
excites and sells readily.
Unlike Geylang or Serangoon Road in Singapore, there is a conspicuous
absence of religious symbols along Orchard Road even though a prayer hall
for the Muslims is located off the main street in Bideford Road. The vicinity
is thus constructed to be secular, but not necessarily apolitical. California
Fitness Centres and Planet Hollywood have made their presence felt in
Orchard Road along with Singtel and the Safra Town Club. Unlike these
vibrant institutions whose open concept invites pedestrians to browse, the
Thai Embassy appears inaccessible behind its iron gates and thick foliage. As
the lowest building there, the embassy does not fit into the concept of
Orchard Road because it does not generate sales or draw in the crowds. It
mars the overall outlook and thematic concept of Orchard Road and we
interpret it as 'failing' textually. In contrast, the Singaporean Presidential
Palace or the Istana, situated at the end of Orchard Road and not visible
from the main road, is designed to attract attention. The changing-of-the-guards ceremony has found favour with both the locals and the visitors. This
appeal can be translated as a desirable Interpersonal relation. Officially
closed to the public throughout the year, however, the grounds are opened
on designated public holidays.
Streetscapes such as road signs, street lights and bus shelters appear to be
neutral, but closer inspection reveals a different scenario. While the street
signs are in English, the ethnic group whose presence is strongly represented
is Chinese. The architecture of the Marriott Hotel is an example of how
that presence is reinforced and preserved. Such buildings serve to remind
Singaporeans of their cultural heritage. The one reminder of a multi-racial
society is the mural wall located next to the entrance to Orchard MRT
station where foreigners, especially the Filipinos, congregate on Sundays.
This mural wall depicts the cultural activities and the various landmarks
associated with the four main ethnic groups in Singapore. Discotheques and
pubs are discreetly placed in various corners of buildings and roads, away
from the public eye during the day. However, at night, these entertainment
centres spring to life, while out in the street the action continues. Orchard
Road has fulfilled the expectations and has realized the vision of the authorities to create 'a modern and vibrant commercial corridor alive with day and
night activities' (Orchard Planning Area, 1994: 14).
Hence, it is immaterial whether the people are indoors or outdoors. In
Orchard Road, people are constantly on the move and wherever they may
be, there are ATMs for them to withdraw their money and at the
same time, an outlet for them to spend it. Every visitor to Orchard Road is a
potential customer. Regardless of the time of day, one can be assured that
there are cash transactions in Orchard Road. Textually then, the retail
industry has been successfully turned into one of the cultures of Orchard
Road. Except for Ngee Ann City, no other shopping mall has been prominently featured in postcards, the one form of communication that 'personally' connects Singapore and its visitors to the other parts of the world.
Examination of the postcards available in the local shops reveals that a
postcard of Orchard Road often includes Tang Plaza and the Singapore
Marriott Hotel, usually photographed from various angles and at different
times of day. This inevitably enhances the hotel's status but most importantly, it transforms the hotel into the landmark of Orchard Road.
A semiotic analysis of the Tang Plaza
A landmark is observed externally and serves as a point of reference or a
clue of identity. The choice of a landmark, according to Lynch (1996: 102),
is 'more easily identifiable' if it is 'significant', has 'a clear form', 'contrast[s]
with the background' and 'if there is some prominence of spatial location'.
Hence, I have chosen to investigate the Tang Plaza/Marriott Hotel (henceforth known as 'the complex') as a landmark.
Built in 1982, the 33-storey Singapore Marriott Hotel was formerly
known as the Dynasty Hotel. Acquired by the Marriott Group in 1995, it
underwent extensive interior redecoration and renovation and has since
operated under its present name (The Straits Times, 4 September 2000: 42).
Its strategic location ensures that every vehicle or commuter travelling
down from Tanglin, Scotts and Paterson Roads passes by it. It is situated at
a location where the ERP begins and where vehicles stop at the traffic
lights, the first of the four traffic junctions along Orchard Road. An underground passageway links the Plaza to the Orchard MRT station and the
other buildings across from the hotel, namely Shaw House and Wheelock
Place. Hence, no matter what mode of transportation one uses, the complex is highly accessible and visible to the public. Its prominent location,
that is, its Orientation to the people and its Relation to the road and
MRT station, which are Interpersonal and Textual functions respectively
(Table 3.2 above), have been translated into a form of visual and massive
advertisement that gives the complex an exposure not accorded to any
other shopping centre or hotel along the road. Additionally, its Size and
Verticality acts as a 'clear indication of [its] status' in the vicinity (O'Toole,
1994: 102).
What makes the complex more significant is its colours and its pagoda-like architecture. Using the framework for architecture in Table 3.2, choices
from systems for Interpersonal meanings feature strongly in the design of
the hotel. For example, as an illustration of the functions of the units Modernity and Colour, its former owners had deliberately chosen its present
design to reflect their racial and cultural heritage and, although only eighteen years old, its design is representative of the days of ancient China. As far
as Colour is concerned, the dominant colours in the vicinity of Orchard
Road are blue and brown, as in the Forum the Shopping Mall, Wisma Atria
and Ngee Ann City, but, at the complex, the traditional Chinese colours,
green and red, dominate both the roof tiles and columns of the building.
Unlike the other hotels, which were designed to resemble vertical rectangular blocks that occupy extra lateral space, the Marriott Hotel is
octagonal in shape and is a tall and lean building with a distinctive Façade,
which is a conical top and upturned roof-ends that point towards the sky (see
Figure 3.2). The contrast in building shape and colour is grouped under the
category of the unit Rhythms that operates textually. In addition, the
appearance of the complex has been likened to 'a decent Oriental gentleman' and described as 'a trustworthy place' (Gwee, 1991: 62-63). We note
that the number 'eight' and the colour 'red' are considered lucky and symbolize prosperity within the Chinese community. Such beliefs or practices
are related to a community's social semiotic, which operates Interpersonally.
However, because of its pagoda-like structure and octagonal shape, the
design of Tang Plaza and Marriott Hotel is not consistent with the overall
environmental and architectural structure of Orchard Road. In other
words, the complex does not 'exhibit some kind of "fit" with their neighbours and neighbourhood' (O'Toole, 1994: 87). Although this is apparently
deliberate, textually the inconsistency could be said to 'fail' or be 'undesirable'. This Textual failing means, of course, that Interpersonally the building
attracts attention. The shape of the Marriott Hotel is only prominent from
an aerial view (see Figure 3.2). At eye-level, due to its orientation, distance to
and accessibility from the main road, the Tang Plaza is more distinct (see
Figure 3.1). This disparity may be partly due to proportionality in Size, a
system that operates interpersonally. The hotel seems to be sitting on a base
that is too broad for it (see Figure 3.2) and unlike the Tang Plaza, the
Marriott Hotel is backgrounded. The hotel proper is built in the centre of
the Plaza, which means that it is actually distanced from the main road.
From the environmental point of view, and both interpersonally and textually, this location acts as a buffer to the noise generated by the traffic.
Nevertheless, the hotel draws attention to itself due to its unique roof
design. One needs to raise one's head to view the hotel, and what is first seen
at ground level is the red and green roof (see Figure 3.1). In sum, the
Intertextuality or the difference in overall design, mismatch in size and the
colour scheme give the building its Oriental character, one that provides
that significant 'contrast with the background' (Lynch, 1996: 102) which is
Orchard Road. These differences have naturally proven to be advantageous
because these are the features that are constantly highlighted in various
postcards and travel brochures.

Figure 3.1 Front view of the Marriott Hotel

Figure 3.2 Three-dimensional view of Tang Plaza

The complex is a building with a hotel and the four-storey Tangs Superstore built as a whole unit. Experientially, there are two Episodes operating
simultaneously at the Tang Plaza. One Episode is that of a hotel and the
other, a shopping centre. Each is a different entity but one which has been
integrated and superimposed over the other. Each Episode serves its own
function: the hotel provides lodging, food and entertainment, while
the shopping centre is part of an industry that is responsible for shaping
Singapore into the commonly perceived shoppers' paradise. Both cater to
the needs of the foreigners as well as the locals and fit into the concept of
'under one roof'; that is, shopping, dining, entertainment and lodging within
the same building. This provides the Textual Cohesion in the Episodes. This
Cohesion is also responsible for the great interplay and interaction between
the two Episodes because, seen experientially, for the uninitiated at least, it is
hard to predict where the shopping centre ends and where the hotel begins.
Textually, the foregrounding and the prominence given to Tangs Superstore
ensure that the complex fits into the overall thematic concept of Orchard
Road. Unlike the thematic Malay Village in Singapore, which was designed
to promote the Malay culture as a form of tourist attraction, the Tang
complex has proven to be a successful social, cultural and economic venture.
Even though the complex is located at the junction of Scotts and Orchard
Roads, vehicle access, or the Interpersonally salient Orientation to entrant, to
the complex is only from Scotts Road. A slip road branching out from Scotts
Road leads to the hotel main entrance and subsequently to the main
entrance of Tangs Superstore. For those using public transport, a bus stop
and an underpass to Orchard MRT station are conveniently located opposite the entrance of Tangs Superstore giving commuters, who are also prospective customers, direct access to the shopping centre. The whole complex
is slightly elevated from the main road, which metaphorically puts it in a
position of power or superiority. The protruding roof of the Tang complex
provides a much-needed shelter from both sun and rain while its red columns act as advertisement boards. The width of the pedestrian walkway
skirting the complex indicates that a heavy human traffic flow is anticipated.
Therefore, the open spaces around the complex are put to efficient use.
Benches are provided while OROs, such as Mrs Fields' and Juice & Java,
provide quick snacks and drinks. Textually, unlike in most parts of Singapore,
there is a sloping ramp that caters to the needs of the physically handicapped or those who are wheelchair-bound. And in case pedestrians forget
that the hotel is an octagonal-shaped building, this has been permanently
imprinted on the non-slip tiles of the walkway skirting the complex, while an
octagon circumscribes each column of the complex on the roof. Like the
built form of Orchard Road, there is an overwhelming emphasis on the
Interpersonal function at the complex.
Textually, in keeping with the green image of the area, low-lying shrubs
and palm trees signifying 'a tropical island' line the perimeter of the complex. The hotel entrance, however, has the thickest shrubs. Interpersonally,
other than complementing the colours of the hotel and enhancing its landscape, these plants shield the hotel guests from the main road, providing a
little privacy. The names of the complex's main tenants, that is, 'Marriott'
flanked by 'Tang' on either side, are mounted on the wall facing Scotts Road,
giving the impression that each is vying for the attention of the onlookers. If
one were to miss the hotel's name, the situation has been rectified through a
concrete signboard. This signboard 'announces' its presence in the vicinity
as it is erected directly opposite the hotel entrance and thus faces towards
the junction of Scotts/Orchard/Paterson Roads. Such a signboard, one
that is not part of a hotel proper and located in an open space, is the only
one found in the area. Others, if available, are usually located within the
hotel's premises.

A semiotic analysis of the Singapore Marriott Hotel


Upon arrival at the steps of the hotel entrance, what first attracts the attention of a guest are the open-air sidewalk cafes on the left and on the right
and the side entrance to Tangs Superstore (see Figure 3.3). In other words,
the hotel's outdoors scenery and the activities generated from and around it
function to distract the guest from his or her intended destination. This can
be attributed to two factors. First, at the unit of Floor - colour for Interpersonal function, yellowish marble tiles are used for the floor at the hotel
entrance and black and grey tiles for the walls. While the tiles may add a
touch of class or sophistication and facilitate maintenance, the colours pale
against the eye-catching colours of the sidewalk cafes and the OROs where
red dominates. Second, and more importantly, there is a considerable distance between the complex's driveway and the hotel main entrance.
Although it is directly facing the driveway, the entrance or the Access to the
hotel appears hidden from the view of the guest.
Flanking the passageway leading to the hotel's main entrance are small-scale water fountains, which extend into the hotel. Fengshui, the art of
comprehending how the natural energy of life affects us in our daily life
(Gwee, 1991; Noble, 1994), seems to have played a role in the layout of the
hotel's ground floor. However, depending on one's social and cultural background, it could also be argued that these fountains are for aesthetic reasons.
Though O'Toole (1994: 90) has expressed uncertainty about its actual place
within the framework of his chart for architecture, fengshui, I feel, should be
categorized as Interpersonal. As O'Toole has clearly stated, however, it
depends greatly on one's social semiotic. Noting this, water, being one of the
five elements of nature, the others being earth, wood, fire and metal, symbolizes wealth in the Chinese culture and its employment is intended to
promote fortune (Gwee, 1991; Noble, 1994). Contextually, however, the
fountains could mean different things to different people. For children, they
are a source of entertainment because they may find pleasure in dipping
their hands into the water. To the hotel, the stream of water brings the hope
that success and prosperity continue to flow towards it. This is reinforced by
the motifs on its floor tiles. Here, the two unbroken concentric circles, which
could signify smoothness in perhaps business dealings and continuous prosperity, are divided into four segments, presumably representing the four
corners of the world where the Marriott Group operates. The circles, however, could also be a representation of the Luo Pan Compass that is used in
fengshui to determine the siting and building dynamics (Gwee, 1991; Noble,
1994). This circular pattern is also repeated on the false ceiling. Seen
experientially and at the rank of Element, this false ceiling and the fountain
conceal the light bulbs at the entrance where the lighting remains soft and
warm. Interpersonally, this provides Warmth and Comfort to the guests.
Finally, orchids are used to brighten the passageway and to offset the
dullness of the walls and floor. On the whole, therefore, the entrance of
the hotel, which acts as the introduction to the hotel, employs various
Interpersonal features or strategies to provide the guests with what are perceived as the necessary comforts. Guests are expected to respond visually,
auditorily, mentally as well as emotionally to their immediate surroundings
(Kress, 2000), and, in this case, to the soft lighting, to the orchids and to the
soothing sounds of the flowing water from the fountains.

Figure 3.3 Floor plan of ground floor of Marriott Hotel

The layout of the Foyer of the Marriott Hotel is unique because it does
not conform to the standards adopted by the various hotels in the vicinity. At
the rank of Element and for the Experiential function, for instance, the Foyer
opens to the sky and, as described in its promotional brochure, is 'illuminated by a three-storey skylight', thereby reducing the reliance on artificial
lighting while at the same time giving the air-conditioned lobby an airy
atmosphere and good ventilation. No chandeliers are needed, just wall-mounted lamps and table lamps placed at strategic locations. The warm and
soft lighting is easy on the eyes. The walls and floor are bare as carpets
and ornaments or decorations such as paintings are kept to a minimum.
Instead both the floor and the walls are fully tiled and of similar shades.
Though this means easy maintenance, as the cleaning and mopping process
is easier, the Foyer exudes coldness and appears businesslike. Interpersonal
functions or considerations such as Warmth and Comfort appear to have
been backgrounded.
At the Foyer, guests are not greeted by the traditional Sites of power,
which function interpersonally at the rank of Floor. This is usually the reception counter, where the initial scrutiny of a guest takes place. Instead guests
are 'greeted' by an escalator or a 'connector' leading to the second floor of
the hotel where the banquet rooms and restaurants are located. Signboards displaying the names of the banquet rooms and restaurants are
placed at the foot of the escalator. Thus, guests need not seek directions,
thereby alleviating labour costs. Inevitably this reduces the human interaction between guests and hotel staff. For an establishment that deals with
the service industry, Interpersonally, this is interpreted as another setback.
The distance between guests and hotel is further widened by the location
of the reception counter, which is located at the far end of the lobby and
sandwiched between its side entrance and its emergency exit. Guests either
approach the counter by walking across the Foyer (Path 'A' in Figure 3.3)
or by passing the jewellery, pastry and cigar shops along the passageway
on the right (Path 'B' in Figure 3.3). Initially the location of the counter,
which is part of the hotel's welcoming team and the human face of the
establishment, appears inconvenient to the guests but this apparent
inconvenience is negated because of the close proximity of the lifts that
would eventually lead the guests to their rooms. What can be deduced
here is that at the Foyer, the foregrounding of Textual functions such as the
Relation of the lifts to the reception counter and the Relation of escalator to
signboards, far outweighs the Interpersonal functions such as Comfort, Welcome, and human contact with hotel staff. This also functions to make
surveillance and the official scrutiny of the guests implicit rather than
explicit.


The vertical as well as the horizontal space or the Spaciousness of the
Foyer, which is a system for Interpersonal meaning, is striking and the
absence of physical or permanent Partition in this area suggests an open-concept approach to business. This openness allows guests to move and
interact freely around the Foyer. Guests arriving on the first day, for
example, are free to view the food and beverage menus at the cafe's reservation counter. Simultaneously though, these guests become easy targets of
scrutiny by the hotel's security personnel or even by other guests. If there
were partitions on the ground floor of the hotel, this would not so readily
occur. Except for the Marriott Cafe, which is slightly elevated, the Separation of groups is achieved by utilizing the square or circular columns
around the perimeter of the Foyer. These columns help to distinguish
one area of specific activity from another, such as the bar counter
from the Lobby Lounge and the Lobby Lounge from the reception lobby.
Seen textually, the only Permanent Partition at the Foyer is the hotel lifts,
which are hidden behind four square columns. These columns and lifts act
as markers to indicate the end of general public activities, such as drinking
and dining. They screen from view the most unpleasant sights or spaces
on the ground floor, which are nevertheless the ones that cater to a guest's more personal needs,
that is, the passage to the parking lots and the restrooms. Understandably,
there is a need for the hotel to put forward its best 'face' to the public and
this has been done overtly. There is a distinct separation between public and
private needs or spaces with the former usually foregrounded. Judging from
the location of the Business Centre, a guest's professional needs seem to fall
within his or her private domain; it is located next to the emergency exit and
adjacent to the lifts. On the priority scale, the size of the Centre suggests
that 'business' or 'work' should constitute a major part of a guest's private
life but at the Marriott Hotel, it retreats to the background.
The main physical attraction or distraction at the Foyer (depending on
how one chooses to see it) is the nine preserved palm trees placed almost in
the middle of the Foyer. As a Textual element, these trees serve as the 'focus'
in the area known as the Lobby Lounge. The theme of a 'tropical island' is
thus brought into the interior of the hotel. Hence there is continuity or a
constant repetition of themes in and around the hotel. The preserved palm
trees in the Lobby Lounge are encircled by four semi-circular flower troughs
where, once again, orchids are the choice plants. An artificial garden city or
island is thus created within the hotel's premises. An aerial view of the Lobby
Lounge suggests that it is the physical representation or the built form of the
motifs found on the floor tiles at the hotel entrance. The open spaces under
the palm trees and around the flower troughs are also fully utilized. These
are used as a dining area where diners are served drinks from the bar
counter, food from the Marriott Cafe and cakes and pastries from the Pastry
Shop. In fact, there is an overwhelming emphasis on the food and beverage
business at the hotel. The consumption of alcohol also seems to be widely
promoted and encouraged. There are bar counters located at the Foyer,
at the sidewalk cafe and at the underground pub, Bar None. Guests are
continuously confronted and surrounded by food and drinks and if that is
not enough, the open-concept kitchen at the Marriott Cafe gives them a view
of how food is prepared for consumption. At the same time, the glass panels
at the Cafe allow patrons to view what the other diners at the Crossroad Cafe
are having and vice versa.
The change in hotel management in 1995 did not affect the Marriott
Hotel physically or structurally, because it remains culturally Chinese. The
former Dynasty Hotel, as its name suggested, had created an image of the
existence of Chinese imperialism and a dynasty of Chinese culture and
tradition within the Orchard Road vicinity. This is the image that is captured
in postcards and promotional brochures to represent, ironically, a multiracial and a multicultural society, a fact that has often been stressed by the
government. The interior of the hotel, however, does not reflect its cultural
heritage and dominance, as it is more Occidental than Oriental. The layout
of its ground floor unwittingly reveals the hotel's business philosophy and
demonstrates how it chooses to construct itself in the context of Singapore.
Initially, there is seemingly a lack of customer service in the hotel because,
like the reception counter, the concierge and tour desks are pushed towards
the recesses in the walls of the Foyer. This reduces obstruction and creates a
clear passageway but at the same time it pushes customer service to the
background. But 'reducing obstruction' or eliminating 'tripping hazards'
such as carpets is a 'service' in itself and this aspect of service belongs to the
all-important department in every industry, that is, safety. Unfavourable
Interpersonal functions such as the Orientation to entrant, which is the distance between the hotel entrance and the reception counter, are often
negated by other Experiential and Textual functions such as the Relation to
connectors, that is, the close proximity between the reception counter and
the lifts.
From a popular point of view, a big part of the success of the hotel
appears to lie in its understanding of what the public wants
and in its supplying and capitalizing on those wants. There is an impression that
there is something for everyone. There is day and night entertainment,
jewellery and chocolates for the women, cigars for the men and food for
everyone. The Lobby Lounge in particular has captured the constructed
spirit and image of Singapore and what is being projected is an image of an
ideal tropical island where palm trees flourish under the warm sun. Singapore is constructed as a city where food is in abundance and seen as a
passion, and where eating is perceived to be a favourite pastime. But because
of its open concept and the lack of permanent partition, another side of the
hotel becomes invisible to the public eye. Women especially are vulnerable
to the male gaze. While soliciting is an offence that is punishable by law and
is an activity usually and wrongfully associated with women, the locations of
the hotel side entrance, its emergency exit and the corridor that leads to the
lifts provide the discreet routes (Path 'B' in Figure 3.3) for the guests who
wish to bring in additional company. Similarly, the escalator to the hotel's
second floor does not only connect guests to the restaurants and banquet
rooms but also to the lifts on that level. In the same manner too, the cigar
shop and the underground pub are sites for discreet soliciting. These are,
however, located away from public viewing. What is further implied is that
seeking pleasures and entertainment is the prerogative of the male. The
open concept in the hotel, which symbolizes one's public image, reflects a
closure to reality or to one's private life; it demands discretion because there
still remains the Asian obsession with the subject of 'face'.
Conclusion
While the analysis of Orchard Road entails the construction of a framework
that features a rank-scale with the functions and systems through which
Singapore is constructed, the analysis of the Marriott Hotel requires the
application of O'Toole's (1994: 86) framework for architecture. Through
the integration of both frameworks, the analyses of both Orchard Road and
Marriott Hotel reveal how spaces in and around Singapore are carefully
organized to meet the sociopolitical and socio-economic demands of the
authorities. Every available space is found to be potentially economically
viable. In general, the analyses reveal how Singapore is constructed as a
shopper's paradise, a tropical island and a food haven. What is presented,
however, is a constructed image of a country and a hotel that both the
authorities and the management want the public and the world to see and to
believe. How this is done requires, to a certain degree, the use of women as
commodities. The general perception in Orchard Road and the Marriott
Hotel is that sex sells, thus reflecting the values of a patriarchal society.
Orchard Road demonstrates how foreign cultures, specifically those from
the West, are foregrounded and how the cultures of Singapore's multi-racial
societies are backgrounded. This is perhaps part of a strategy to cater to the
influx of 'foreign talents' and tourists to the country. Business concepts such
as the outdoor refreshment areas, for example, are imported from overseas
in an effort to make the streets 'more exciting and lively' (URA Annual Report,
1997/98: 28). The concepts of excitement and liveliness are therefore
defined by the authorities and Singaporeans are socially engineered to subscribe to these prescribed concepts. These outlets as well as the abundance
of shopping centres in the vicinity are in reality revenue-generating
machines. Profit-making is the key word; the culture of consumerism dominates the area and capitalism is seen as the answer for a land reliant on
human resources. The presence of the Marriott Hotel, however, serves to
remind the people, whether locals or foreigners, of Singapore's cultural
heritage. The building is a potent social and cultural symbol and a reminder
of the prominence of the Chinese community in the country. Amidst the
chaotic cultural scene in Orchard Road, Singaporeans must be reminded of
their cultural heritage and to meet those expectations, a system is implemented and a lifestyle prescribed. The system and the people depend on
one another.

Acknowledgements

The map (Plate 3.1) is provided courtesy of This Week Singapore.


References
Betsky, A. (1994) James Gamble Rogers and the pragmatics of architectural representation. In W. J. Lillyman, M. F. Moriarty and D. J. Neuman (eds), Critical
Architecture and Contemporary Culture. New York: Oxford University Press, 64-84.
Chua, B. H. (1995) Communitarian Ideology and Democracy in Singapore. London:
Routledge.
Dale, O. J. (1993) The Singapore Concept Plan: historical context/current assessment. PLANEWS. Journal of the Singapore Institute of Planners 14(1): 41-46.
Singapore: Straits Printers Pte. Ltd.
Fong, T. W. (1973) Industrial complexes and the garden city - can they co-exist? In
Chua Peng Chye (ed.), Planning In Singapore - Selected Aspects and Issues. Singapore:
Chopmen Enterprises, 16-21.
Gwee, P. K. W. (1991) Fengshui: The Geomancy and Economy of Singapore. Singapore:
Shing Lee Publishers Pte Ltd.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edition). London:
Arnold.
Jayapal, M. (1992) Old Singapore. New York: Oxford University Press.
Keung, J. (1991) Overview on the Concept Plan. Living the Next Lap Blueprints for Business.
Singapore: Urban Redevelopment Authority.
Kress, G. (2000) Multimodality. In B. Cope and M. Kalantzis (eds), Multiliteracies:
Literacy Learning and the Design of Social Futures. South Yarra: Macmillan Publishers
Australia Pty Ltd, 182-202.
Liu, T. K. (1991) Press Release on Living the Next Lap Blueprints for Business. Singapore:
Urban Redevelopment Authority, 15.
Lynch, K. (1996) The city image and its elements (first published 1960). In R. T.
LeGates and F. Stout (eds), The City Reader. London: Routledge, 98-102.
Master Plan. Report of Survey Volume 1. (1955) Singapore: F. S. Horslin, Government
Printer.
Master Plan Written Statement 1993. The Planning Act (Cap 232, revised edn 1990).
Republic of Singapore. Singapore: Ministry of National Development.
Mumford, L. (1996) What is a city? (first published 1937). In R. T. LeGates and
F. Stout (eds), The City Reader. London: Routledge, 183-188.
Noble, S. (1994) Feng Shui in Singapore. Singapore: Graham Brash (Pte) Ltd.
Orchard Planning Area: Planning Report 1994. Singapore: Urban Redevelopment
Authority.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
Pike, B. (1996) The city as image (first published 1981). In R. T. LeGates and
F. Stout (eds), The City Reader. London: Routledge, 242-249.
Preziosi, D. (1984) Relations between environmental and linguistic structure. In
R. P. Fawcett, M. A. K. Halliday, S. M. Lamb and A. Makkai (eds), The Semiotics of
Culture and Language Volume 2. Language and Other Semiotic Systems of Culture. Dover,
New Hampshire: Frances Pinter, 47-67.
Safeyaton, A. (2001) The Lion City as a text - a semiotic study of Singapore's
Orchard Road and Marriott Hotel. Unpublished MA dissertation. The National
University of Singapore.
Stern, R. A. M. (1994) The postmodern continuum. In W. J. Lillyman, M. F.
Moriarty and D. J. Neuman (eds), Critical Architecture and Contemporary Culture. New
York: Oxford University Press, 46-63.
Straits Times, The, 4 September 2000, 42.
Tan, J. H. (1972) Urbanization Planning and National Development Planning in Singapore.
SEADAG Papers On Problems of Development in Southeast Asia. New York: the Asia
Society-SEADAG.
Tan, S. (1999) Home. Work. Play. Singapore: Urban Redevelopment Authority.
URA Annual Report 1997/98. Singapore: Urban Redevelopment Authority of
Singapore.
URA Annual Report 1998/99. Singapore: Urban Redevelopment Authority of
Singapore.
Week Singapore, This, 18-24 December 1999. Singapore: Miller Freeman Pte. Ltd.
Whyte, W. (1996) The design of spaces (first published 1988). In R. T. LeGates and
F. Stout (eds), The City Reader. London: Routledge, 109-117.

Part II
Electronic media and film

Phase and transition, type and instance: patterns
in media texts as seen through a multimodal
concordancer

Anthony P. Baldry
University of Pavia

Introduction
How can we go about analysing a TV advertisement? Despite the long
tradition of analysis of printed advertisements, the prevailing view, until
quite recently, has been that it is impossible, for technical reasons, to analyse
TV adverts in such a way that the interplay of visual and verbal resources
can be reconstructed. Cook (1992: 37-38, see also 2001: 42-44), for
example, states that:
Any analysis of the language of adverts immediately encounters the paradox that
it both must and cannot take the musical and pictorial modes into account as well
[. . .] This problem is more serious with tv than with printed ads, for on paper
pictures stand still (and can even be reproduced), and there is no sound [. . .] In
considering tv ads, where pictures move, music plays, and language comes in
changing combinations of speech, song and writing, reproduction is virtually
impossible, and a video, to be watched while reading, would transform a written
analysis even more than companion illustrations. Many analyses of advertising
solve this problem by ignoring it.

Cook's statement is, in fact, a testimony to the revolution that has taken
place in a decade vis-a-vis film texts and their analysis, often providing
solutions to the concerns he raises, particularly those relating to reproduction: the videocassette can be easily digitalized using an appropriate PC card
and the resulting digital film can be manipulated in many ways, including,
for example, the addition of explanatory captions; the Web, unknown ten
years ago, has spawned new forms of advertising which increasingly include
streaming video capturable through special software programs such as
Camtasia; postproduction software such as Adobe Premiere has made it
possible to convert a film into a sequence of stills and hence into a printable
format.
These technological innovations have given rise to new descriptive practices including: (a) the multimodal transcription (Baldry, 2000b: 81-85;
Thibault, 2000: 374-385) and (b) the construction of PC-based multimodal
corpora accessible by structured queries (Baldry, 2000c: 31). The former
allows a TV advert to be reconstructed in terms of a Table containing
a chronological sequence of frames, a technique that goes a long way
to resolving the difficulties of taking linguistic, musical and pictorial
modes into account. Figure 4.1 shows how, in a TV car advert, the intersection between the Columns and Rows in a Table characterizes the
interplay between resources, not just those mentioned by Cook - speech, song and writing - but also others such as ambient sounds, gaze
and gesture.
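The tabular layout just described can be pictured as a small data structure. The following is a minimal sketch in Python, offered only as an illustration and not as the transcription scheme used in the works cited: the class name `Row`, the column names in `COLUMNS`, the helper `co_occurring` and the sample annotations are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical column names: one per semiotic resource tracked in the Table.
COLUMNS = ("visual_frame", "speech", "song", "writing",
           "ambient_sound", "gaze", "gesture")


@dataclass
class Row:
    """One Row of the Table: a single frame at a given point in time."""
    time_s: float                               # position of the frame in the advert
    cells: dict = field(default_factory=dict)   # column name -> annotation

    def resources(self):
        """Return the columns that are actually filled in for this frame."""
        return [c for c in COLUMNS if self.cells.get(c)]


def co_occurring(rows, first, second):
    """Return the rows in which two resources are both annotated."""
    return [r for r in rows if r.cells.get(first) and r.cells.get(second)]


# Invented sample data, for illustration only.
transcription = [
    Row(0.0, {"visual_frame": "car on road", "ambient_sound": "engine"}),
    Row(1.0, {"visual_frame": "driver", "gaze": "off-screen",
              "speech": "voice-over"}),
    Row(2.0, {"visual_frame": "logo", "writing": "brand name",
              "song": "jingle"}),
]

# Keep the rows in chronological order, as in the printed Table.
transcription.sort(key=lambda r: r.time_s)

# Which frames combine gaze with speech?
for row in co_occurring(transcription, "gaze", "speech"):
    print(row.time_s, row.resources())
```

Each cell holds free-text annotation here; a real scheme would presumably use a controlled vocabulary, which is what would make structured queries over a corpus feasible.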
In keeping with the systemic-functional tradition of multimodality
(Baldry, 2000a; Kress and van Leeuwen, 1996; O'Toole, 1994), a multimodal transcription will also need to show how meaning is built up as a
series of functional units - typically, subphases, phases, but also potentially
macrophases, minigenres and genres. An early example of this work was
presented by the author at the 25th International Systemic-Functional
Congress in 1998 (Taylor and Baldry, 2001a, 2001b), which analysed the
phasal (Gregory, 1995, 2002; Gregory and Malcolm, 1981) and metafunctional (Halliday, 1994) organization of a car advert in such a way as to
show the advert's interleaving of American and British cultural values.
Subsequently, Thibault (2000: 311-385) devised an annotational system,
partly reproduced in Figure 4.1, which provides a systematic description
of the interplay between resources in an Australian bank advert and which
illustrates how a typically Australian identity comes to be created.
Thibault's model, which allows the phasal and metafunctional organization of a text to be described in great detail, has become a reference point
when extending the approach, for example, to subtitling for language-learning purposes (Baldry and Taylor, in press) and to the description of
genres relating to the political arena (see the news report and interview
described in Lombardo, 2001, and party political broadcasts described in
Vasta, 2001: 99-128). At the very least, this work has succeeded in establishing how national identities and values are constantly expressed and
manipulated in many forms of advertising. In the process of this applicative work, the multimodal transcription has begun to change its function,
increasingly being identified with the typical interplays that occur in many
texts; for example, Baldry and Thibault have devised a multimodal transcription which, by incorporating a multimodal tagging system, promotes
an understanding of the notion of type that lies behind a specific instance
(Baldry and Thibault, 2001: 94-98).
But is the multimodal transcription the answer to researchers' needs?
This paper suggests that, at the very least, it needs to be backed up by
other tools that help us understand the workings of multimodal genres.
Indeed this paper describes the initial stages of research into multimodal
concordancing and development of appropriate software that will allow
the relationship between phase and transition to be viewed in terms of type
rather than instance. In so doing it questions some aspects of the traditional
view of the relationship between phase and transition. But the main purpose

ELECTRONIC MEDIA AND FILM

of this paper is to introduce the new field of multimodal concordancing as a means of examining text and text types in relation to their
context of situation and context of culture (Halliday, 1978; Halliday and
Hasan, 1985). Multimodal concordancing thus builds on the foundations
laid by the multimodal transcription and on systemic-functional
approaches to language-only concordancing such as the Systemics Coder
developed by O'Donnell (2002). In so doing it raises questions about how
the study of multimodal discourse might be undertaken in the language-learning classroom (Baldry, 1999, in press; Pavesi and Baldry, 2000) and
more generally how multimodal concordancing might develop in the
future.
The multimodal transcription as an expression of instance or of type in phasal organization?
What role do phases and transitions play in TV adverts? And crucially how
far are multimodal transcriptions limited to giving details of specific
instances of phases and how far can they, instead, give information about
types of phases? Figure 4.1 is a small and highly abridged sample of a

Figure 4.1 A classic multimodal transcription (Phase 1 of The Fan)

'classic' multimodal transcription based on the Table and (with a few
additions and modifications and many abridgements) on, in particular,
Thibault's annotational scheme (see Thibault, 2000: 374-385 for a complete example of a multimodal transcription). It relates to an advert entitled
The Fan in which a male driver, whose car has become overheated, hitches a
lift from a lady driver in an Audi automatic.
The transcription in Figure 4.1 is concerned with instance rather than with
type and may be read from left to right in three major blocks that constitute a
progression from a description of Textual data to one relating to the specific
text's organization into semiotic units: (a) the first two columns relate to the
way in which frames are selected with a periodic regularity, in this case one
frame per second (see Table 4.1 for an explanation of the abbreviations

Table 4.1 List of abbreviations

Time: TS = time in seconds;
Phases: Ph = Phase or ||; SP = Subphase or |; CD = Car Drive; CS = Car Stationary;
Metafunctions: EXP = Experiential; INT = Interpersonal; TEX = Textual;
Visual image: CP = Camera Position; VS = Visual salience; SH = Shot; CLS = Close Shot; D = Distance; MCS = Medium Close Shot; SV = Side view; FV = Front View; IC = Inside Car; OC = Outside Car; ICLO = Inside Car Looking Out; OCLI = Outside Car Looking In; WS = written slogan;
Participants: P = Participant; D = Male Driver; F = Female driver; M = Mascot;
Soundtrack: ST = Soundtrack; AS = Ambient sounds; MIS = Music and singing or JJ + %* with words in italics representing words spoken or sung;
Transitions: T = Transition; | <> | = transition lasting a subphase; > || = a transition crossing a phasal boundary;
Resource Combinations (RC): J = high integration of resources (in particular between body movements and music); L = low integration of resources (in particular between body movements and music); K = medium integration of resources (in particular between body movements and music);
Other symbols: Movt. = Movement; ! or > = the same semiotic selections hold true as compared with the previous frame on the left (i.e. same phase or subphase but some changes have occurred); a double arrow as in > P: >+ F (finger) means that the configuration is as before but there is now a new Participant added to those previously present: the Female driver's finger.

used); (b) the subsequent columns provide descriptions of the individual
semiotic resources, including a description of the soundtrack in terms of
both music and song and ambient sounds; (c) the final columns relate to
semiotically motivated interpretations of how resources combine to form
meaning-making units.
As Thibault (2000: 321) points out, multimodal text analysis does not
accept the notion that the meaning of the text can be divided into a number
of separate semiotic 'channels' or 'codes': the meaning of a multimodal text
is instead the composite product/process of the ways in which different
resources are co-deployed and in which the phase is taken as an enactment
of 'locally foregrounded selections of options'.
Figure 4.2 is a very different kind of multimodal transcription, organized
not so much as a finite Table with its page-based verticality, but more like a
musical score. It unfolds from left to right in a manner that potentially
extends well beyond the confines of the page. Such a transcription remains,
however, in keeping with the overall goal of multimodal text analysis,
which is to specify both the selections made from the various semiotic
modalities and the combinations used to produce a given (phase-specific)
meaning (Thibault, 2000: 321). Vis-a-vis Figure 4.1, it uses a few more
abbreviations.
It should be noted that the car, although not included in the list of
abbreviations, is still to be considered a Participant (in the technical sense
of a Participant in a Participant^Process relationship, see Halliday, 1994:
107-109), its parts (gear, window, etc.) being spelled out in full: thus a transcription of the type P: D: Arm; Car: Gear means that the major Participants
in the construction of meaning are the driver's arm and the car gearstick.
Unlike the 'classic' multimodal transcription described in Figure 4.1, a transcription of the type presented in Figure 4.2 is concerned as much with type
as with instance. It dispenses, for example, with a precise reference to the
text's unfolding in time: in fact the total duration of this text is 35 seconds
and the interval between the individual frames is, as in Figure 4.1, still one
second.
The transcription in Figure 4.2 is in some respects slightly less detailed
than the instantial conception of the multimodal transcription exemplified in
Figure 4.1 since one of its functions is to summarize the major characteristics of the entire text in a concise way, thus demonstrating its greater potential for compression in description. This kind of multimodal transcription
records information about type in the top section, i.e. the Row above the
individual frames, and information about specific instance in the bottom part.
In particular, the Top Row suggests the change that has taken place vis-a-vis
the previous subphase: the text in question relates to the male driver jiving
to the sounds of an Elvis Presley-type song as he drives along, while the
details of what actually happens to the driver and the way this relates to the
song are given in the Bottom Row. The Top Row thus suggests the semiotic
development of the text in terms of its phasal structure and the main
changes in its deployment of resources - the Top Row is thus oriented

Figure 4.2 A multimodal transcription incorporating structure/type of phase/subphase (Top Row) and co-deployments and resource selections (Bottom Row)

towards the constant shifts in the selection of options in keeping with Gregory's
principle that phase and transition can 'be used to capture the dynamic
instantiation of micro-registerial choices in a particular discourse' (Gregory,
2002: 323); the Bottom Row (with its focus on the content of each shot)
describes, on the other hand, the film's unfolding in time, and, though not
excluding the principle of selection from options, is thus oriented more
towards sequential development and specific realizations. Though not
shown here, a multimodal transcription of this type also allows Textual
elements from various texts to be aligned in such a way as to compare their
phasal organization (see Baldry, 2000b: 68-69 for the development of the
comparative multimodal transcription).
Though unusually involving two drivers and two car-drive phases, in
many other ways the text in question illustrates many typical features of
car adverts, in particular the expression of the very strong relationship
between the driver's and the car's identity. As Figure 4.2 indicates,
although other criteria might have been invoked, a good starting point
when defining the division into phases in this text (and we may add the
60 adverts in the current car advert corpus) relates not to the human
participants but instead to the type of representation of the car: in this
case (and in many other cases) whether the car is present and, if so,
whether it is moving or stationary. At the start of the first phase of this
text, there is a typical car-drive phase [+CD], in the second, an essentially car-stationary phase [+CS] (though the second subphase contains
the idea of a car stopping and starting - hence the [+CS, +CD] tag); the
third phase is again a car-drive phase [+CD], while the fourth phase, the
end phase, typically relates to the car abstractly in terms of its make and
manufacturer, and presents all the typical ingredients of one type of end
phase where the car itself is (physically) excluded [-CD, -CS] and where
instead the focus is on oral and written slogans and the manufacturer's
logo.
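
The four-phase division just described can be restated as a small data sketch. The Python fragment below is purely illustrative and is not taken from the author's software: the class name Phase, its field names and the helper methods are assumptions introduced here, and a feature left as None simply means that the text does not mark it for that phase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Phase:
    """One phase of a car advert, tagged for the car's representation.

    Hypothetical model: True renders as '+', False as '-', and None
    means the feature is left unmarked in the tag.
    """
    number: int
    car_drive: Optional[bool] = None       # CD = Car Drive
    car_stationary: Optional[bool] = None  # CS = Car Stationary

    def tag(self) -> str:
        """Render the phase-type tag, e.g. '[+CD]' or '[-CD, -CS]'."""
        parts = []
        for label, value in (("CD", self.car_drive), ("CS", self.car_stationary)):
            if value is not None:
                parts.append(("+" if value else "-") + label)
        return "[" + ", ".join(parts) + "]"

    def car_present(self) -> bool:
        """The car is physically present if it is driving or stationary."""
        return bool(self.car_drive) or bool(self.car_stationary)


# Phase types of The Fan as described in the text above.
THE_FAN = [
    Phase(1, car_drive=True),                         # car-drive phase
    Phase(2, car_stationary=True),                    # essentially car-stationary
    Phase(3, car_drive=True),                         # car-drive phase
    Phase(4, car_drive=False, car_stationary=False),  # end phase: car excluded
]

for phase in THE_FAN:
    print(phase.number, phase.tag(), phase.car_present())
```

Treating the two features as independent, optionally marked values keeps combined tags such as [+CS, +CD] available for subphases without forcing every phase to be marked for both.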
The correlation between driver and car is, of course, a major goal of
the car advert genre, reflected in the genre's phasal organization, which
characterizes the way the car advert unfolds in time. The car is very
much a Participant by definition, at least an equal partner in the
human/non-human participant relationship (and more often than not a
superior). This emerges quite clearly in the type-oriented multimodal transcription of Figure 4.2, which explicitly defines the constant shifts in local
foregrounding in the Top Row, e.g. whether it is the car, the driver, the
mascot or the countryside that is the salient Participant in a particular
phase or subphase.
In fact, the most salient phases in this text are the first and third, with
the first phase being conjunctive-disjunctive in nature and the third, conversely, of the disjunctive-conjunctive type. This reflects the text's foregrounding of potentially conflictual Interpersonal relationships between
the two drivers: the first, an outlandish jive-as-you-drive dude, the second,
a suave, sophisticated female. 'Conjunctive' and 'disjunctive' are here

respectively used to describe specifically the synchronization and non-synchronization of movements among Participants (see Thibault, 2000:
342), which, of course, include the cars and the mascot as well as the
drivers. It is frequently the case in film texts (for example, documentaries)
that the visual and the verbal are out of step, with the visual anticipating
the verbal (see Baldry, 2000b: 74), but what is striking in this text is whether
or not the movements of the participants are synchronized in relation to each
other and to the music. An attempt has been made to track the types of
shift that take place in this respect using symbols that relate to Resource
Combinations (RC).
Resources can thus be deployed, as in this text, in such a way that their
initial synchronization is lost, thereby creating two sets of meanings which
are potentially in conflict. In the first phase, all the resources - body
movement, music and song and the sequence of visual frames - are in
unison but, by the end of the first phase, visual image, kinetic action and
soundtrack are out of step: neither driver nor mascot is swinging in time
with the music and song (which continues); instead with the sudden braking of the car, they remain quite rigid and motionless, an indication that a
second and rather ironical series of rhythms is at work, which, together
with the smoke coming from the gearbox, signals the fact that the dude's
car is on the point of breaking down. Conversely, in the third phase,
desynchronized resources become synchronized: the man and woman
seem, initially, to be in conflict with each other, with gaze significantly
contributing to this meaning - the woman glares reproachfully at the man
who dares to stick his mascot on her impeccably clean windscreen while
the poor man 'defends' himself by looking blankly straight ahead, out of
the windscreen, his facial expression and body position, having become
utterly rigid, in complete contrast to what happens in the first three subphases of the first phase. Resources such as gaze, spatial disposition, body
movement and facial expression are deployed in this text in such a way as
to be deliberately out of step with cultural expectations: two people sitting
next to each other in the confined space of a private car and who have
never met before and who are the car's only occupants will normally look
at and talk to each other, which is precisely what does not happen. Gradually, harmony between the two sets in, as the interplay of semiotic
resources underpinning the first phase is restored: music and song are the
first to be reintroduced when the man hands over his cassette, followed by
the restored rhythms of the mascot, which when prodded by the now-smiling lady, once again swings in keeping with the music; finally, the man's
good humour also returns: he lifts his head up, smiles and starts to chew
gum again, all signs that his unrestrained jiving-'n'-driving is in the process
of being rehabilitated. Important in this process is the focus on the mascot,
which is seen being prodded by the lady who, though not present in the
man's car, seems intuitively to understand that this gesture will cause the
mascot to wriggle and writhe to the music thereby restoring her passenger's
good humour.
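
The contrast between the two main phases can be expressed as a simple rule over the Resource Combination (RC) levels of Table 4.1. The sketch below is an assumption-laden illustration rather than the procedure actually used in the transcription: the names Integration and classify_phase are hypothetical, the rule compares only the first and last subphase, and the two example sequences are invented to mirror the prose description, not read off Figure 4.2.

```python
from enum import Enum


class Integration(Enum):
    """Degree of integration of resources (RC symbols of Table 4.1)."""
    LOW = 1     # symbol L
    MEDIUM = 2  # symbol K
    HIGH = 3    # symbol J


def classify_phase(levels):
    """Label a phase by the direction of change in resource integration.

    Hypothetical rule: only the first and last subphase levels are
    compared; a fuller analysis would inspect every shift.
    """
    if len(levels) < 2:
        raise ValueError("need at least two subphase levels")
    first, last = levels[0], levels[-1]
    if first.value > last.value:
        return "conjunctive-disjunctive"   # synchronization is lost
    if first.value < last.value:
        return "disjunctive-conjunctive"   # synchronization is restored
    return "stable"


# Invented subphase sequences that mirror the description in the text.
phase_1 = [Integration.HIGH, Integration.HIGH, Integration.HIGH, Integration.LOW]
phase_3 = [Integration.LOW, Integration.MEDIUM, Integration.HIGH]

print(classify_phase(phase_1))
print(classify_phase(phase_3))
```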

These two phases are separated by a very brief second phase in which
both cars are essentially motionless and where, vis-a-vis the metafunctions,
rather than Interpersonal elements, Textual and Experiential elements are
prominent: in this short phase, the New is introduced in the form of a new
car, a new driver and an Internet address (note that, in fact, the Internet
address has been 'carried across' from the previous phase, illustrating the
significance of extended transitions and overlaps in phasal organization, a
matter discussed in detail below). The main meaning created is that the
male driver successfully hitches a lift, so that it is important to glimpse one
of the cars stopping - hence the [+CS +CD] tag for this phase. The absence
of salient Interpersonal elements in this phase is striking: the song, for
example, ceases, in contrast to the previous and subsequent phases,
indirectly underscoring the fact that, in many car ads, song is a crucial source
of meaning, often acting as the functional equivalent of a narrator, linking
the viewer to the events at hand and, in part, defining the viewer's expected
response to actions and events.
This advert is no exception in this respect: the final refrain 'you own my
heart', cements the identity between the viewer, the car drivers and the car.
It also coincides with the written slogan - multitronic: Il cambio automatico a variazione continua da Audi [i.e. Audi's gearbox with continuously
variable automatic transmission] - further building on the text's basic thematics, namely that the discordant contrast between the smooth, sophisticated lady and the dude's jive-as-you-drive lifestyle will be resolved in a
harmonious fashion by a relaxing ride in the right car, namely an Audi
automatic.
Significantly, many contemporary car adverts present the car as a
space where social conflicts, potential or real, may be resolved, whether
within the family or, for example, between loving couples. The car can thus
be a sexual space as in the Citroen car advert in the author's corpus
where, in the hyperreal coding orientation (for coding orientation see
Bernstein, 1971; for a multimodal perspective of coding orientation
see Kress and van Leeuwen, 1996: 168-171), the car rolls over and
over as the couple make love; alternatively, it may be a place of protection
where a kissing couple in a lonely lane can successfully fend off an
attack from Zombies. More mundanely, it can also be a space where
children and animals can be safely transported and sometimes even a
space in which members of a sports team can start throwing a
ball around, de facto transcending the narrow confines of the car's
interior.
In this advert, song contributes significantly to this particular meaning of
conflict resolution, built up gradually and multimodally, throughout the
advert with its highly sensual suggestion that the relationship between the
man and the woman will outlive the lift and that it will be the woman
who will provide the initiative in this respect: a prod, as it were, is as
good as a wink. Notice, indeed, how, rather than stopping at the end of
the third phase, the song extends beyond into the final phase, with the

result that the latter, as well as giving the usual information about the
particular model and the manufacturer, also underscores the entire text's
meaning, by suggesting that the conflict between the man and the woman
can be, and indeed, has been resolved by virtue of the car's design
characteristics.
This meaning-making is the result of the artful juxtaposition and overlapping of different types of phases that carry out different functions. The
very notion of phase presupposes that there is some transition between one
phase and another and, to a lesser extent, between the various subphases
that constitute a phase. Moreover, following on from what has been stated
above, as well as phase types, we can also expect various types of transition to be
present in film texts. For example, Thibault (2000: 320-321) suggests that
the points of transition between phases have their own special features that
play an important role in the ways in which observers or viewers recognize
the shift from one phase to the next and that, generally speaking, transition
points are perceptually more salient in relation to the phases themselves.
Thus viewers of texts have no difficulty in perceiving particular Textual
phases thanks to their ability to recognize the transition points or the
boundaries between phases.
However, the notion of transition should not necessarily be associated
with the idea that there is a precise boundary or point at which a transition
occurs. In many cases, this vision of boundaries in the organization of
phases and transitions will work very successfully. But this is not always the
case. As Thibault (2000: 326-327) points out:
Perceptually speaking, transitions between phases are not always clear-cut [. . .]
Thus, the transition point may be characterized by a gradual merging of features
from the two phases in question as one phase decays or fades out and the other
comes into being [. . .] The transitions between subphases are not always so
straightforward. At times, there is an almost imperceptible overlap between subphases.
In this text, for example, each of the two main phases (the first and the third)
contains a series of pivotal transitional points between the various subphases that mark the step-like progression from conjunctive to disjunctive (as
defined above) and vice versa: these are movements relating to the gearbox,
the cassette, the mascot, the drivers and the cars (e.g. braking). In the first
phase, the malfunctioning of the gearbox (we hear an ominous crunching
noise) and sudden braking and cessation of the mascot's movements signal
that disruption is to follow. The driver's jiving also comes to a stop. In the
third phase, the reverse is true: for the second time the camera carefully
focuses on the gearbox, which, in keeping with the demands of the targeted
audience, and quite unlike the first gearbox, is an automatic gearbox for
drivers who like a smooth ride. Significantly, the transition points in this, and
many other adverts, are linked in a chain to form a crescendo which

contributes to the overall coherence of the text. One way in which this
salience is achieved is by changing the camera focus: thus the out-of-focus
mascot suddenly comes into focus. Another is the type of shot used: two
major subphasal transition points in the first phase and a third in the third
phase coincide with the only three shots in which we view the mascot by
looking out of the car through the windscreen: in each case this selection of the
mascot contributes to the underlying conjunctive/disjunctive 'stop-start'
flow of the text: the mascot is shown, in an alternating way, as either static,
carrying with it a negative connotation (a 'stop') that things are wrong, or,
when it sways in all directions, with a positive connotation (a 'continuation'
or a 'restart' after a 'stop'). A similar chain is described in Thibault (2000:
328-329), in terms of:
covariate semantic ties in the visual thematics [. . .] that are progressively defined
in the unfolding text as cohesive chains extending over the entire text. For
example, the foregrounded co-patternings of items deriving from the interacting
cohesive chains of 'smiling', 'rolling the sleeves', and 'moving forward' function
to create global coherence in the text.

In Thibault's example the meaning implied relates to the characterization
of different activities as being fundamentally analogous (the participants, each
in a different context, roll up their sleeves, smile and get on with their
different jobs). But the transition chains in this text carry out a very different
function: they realize step-like crescendos relating to the creation of discord
and the subsequent return to harmony.
That the notion of transition does not necessarily entail the notion of a
single point or a single boundary in any particular phase also emerges in other
ways. Mergings and overlaps between phases are also typical in many film
texts, adverts included. In this particular text, transitions are prominent in
both the second and the fourth phases, where the transition
from Phase 1 to Phase 2 is prolonged over a few seconds in such a way as to
construct the meaning that the male driver is in the process of changing
cars. Thus, in the second phase, phase and transition are partly co-terminous
insofar as an entire subphase (SP2) is taken up with an (albeit rare) split
shot, in which the two different cars are simultaneously foregrounded and
backgrounded, the result of postproduction techniques (but also clever
camera work), whose purpose is to effect the transition from the Given (the
first car) to the New (the second car) in a salient and lingering way, thereby
underscoring the fact that a major change in the events described in the text
is taking place. Moreover, as we have already seen, the end phase, which in
car adverts is typically associated with slogans for the particular car model
and car manufacturer, is merged with the previous phase, thanks to the
precise synchronization between the oral slogan (the final part of the song),
and the written slogan.
Thus, transitions are not necessarily equated with the cutting from one
shot to another, nor indeed with what is happening in the visual. While

transitions will often be related to what is happening in the visual, this
will not always be the case; while phases (see Gregory, 1995, 2002;
Gregory and Malcolm, 1981) relate to 'stretches of text in which there is a
significant measure of consistency and congruity' (Gregory, 2002: 322)
transitions, as Thibault (2000: 320) and Gregory (2002: 323) have pointed
out, essentially relate to changes in the metafunctional organization of the
text and as such may very well be related to changes in the soundtrack
and not just to what happens in the visual. One of the clues to the fact
that, in this text, the second phase really is a separate phase is the fact
that after the noise and commotion of the first phase, this phase uses
'quiet' sounds: no song, no music - just the sound of car tyres and a
barely audible wind. Indeed the Top Row of Figure 4.2 attempts to
record the constantly changing interplay between the types of resource in
the soundtrack: ambient sounds, music and music and song (but never
complete silence).
Transitions, as well as being structural in nature, are thus inherently
and predominantly semiotic, contributing to the entire text's meaning
through their typical organization into chains. They are thus not just part
of the local foregrounding of semiotic selections. Indeed, precisely because
they are salient, transitions are frequently linked to the advert's ultimate
message. Transitions are ultimately bound up with the expectations that
the viewer has about the text and often guide the viewer vis-a-vis these
expectations to the right conclusion. Transitions thus have to do with the
constant interplay between the expected and unexpected in film texts. In
this text, we expect song and music to be restored, which is precisely what
happens.
These expectations are inherently multimodal, the result of the interplay
between many resources. Zago (2002: 62-70) reports an interesting case in
which a drinks advert uses an animated cartoon to represent the transitions
from one experience to another in a sequence of hallucinations each represented as a warping of the face of the protagonist, an exhausted cyclist, and
in the buildings he cycles past. He finally reaches a place where he can drink
a cool pint of Guinness, and thus bring a halt to the spiral of fever-like
experiences that include blue penguins and deformed rubber-like walls.
Here, too, it is the chain of transitions that is important, the text's meaning
being built around a spiralling escalation, interrupted only by the act of
'murdering' a cool pint. I have also reported a similar chain of transitions at
work in Benigni's 1997 film La vita è bella (Life Is Beautiful) (Baldry, 2002)
suggesting that the viewer's expectations are that the chain of transitions
from one phase to another will be linked to the final climax in the film.
Transition chains make their meaning by being typically multimodal. In
many film genres, they will be visual and musical as well as linguistic, the
case, time and again, in the world of advertising. In La vita è bella all three
elements intertwine, with music playing a very significant role: the catchy
music starts off as background music but gradually as the film proceeds
becomes foregrounded and thematized as the only means of communication

in a concentration camp. In saying this, we are suggesting that it is the
transition, rather than the phase, that is the most significant element in the
phasal organization of a text and that a focus on type of transition in multimodal analysis will help clarify that what is salient is ultimately what is most
meaningful.
So far we have posited phase types applicable to this advert as being
describable as: [+CD], [-CD], [+CS] or [-CS] or a combination thereof
(but see also the discussion below for their extension to many other car
adverts) and, from a slightly different perspective, we have also posited the
existence of phases that can be characterized in terms of the conjunctive
and/or disjunctive deployment of resources. But what transition types are
there? In this paper, given that multimodal concordancing is taking its very
first steps, we can do little more than posit their existence.
Indeed, precisely because of its static nature, the multimodal transcription, which seems so far to have been the major research tool used in the
multimodal analysis of film text, is inappropriate when identifying and
describing what is quintessentially dynamic in nature: namely the transition.
All this is reflected in a second type of multimodal transcription presented in
Figure 4.2, concerned as much with multimodal type as with multimodal
instance. Still under development (e.g. as a method of reporting the findings
of multimodal concordancing), this type of transcription tries to highlight
types of cut, types of shot, types of phases and types of transition. To give just
one example, the symbol > has been used to suggest a transition overlap, that
is, points at which there is no clean break between one phase and another
but where instead one or more resources get carried across what otherwise
appears to be a phasal boundary. Thus, the symbol > || JJ + %* means a type
of transition in which music and song are carried across from one phase to
another. Conversely, the symbol | <> | means a type of transition that lasts
for the entire length of a subphase.
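
The notion of a transition overlap lends itself to a simple check: a resource is carried across a phasal boundary if it is active both immediately before and immediately after it. The following Python sketch illustrates this under stated assumptions; the function name carried_across, the representation of resources as lists of (start, end) spans in seconds and all the timings are hypothetical, chosen only to echo the description of the song running on into the end phase of the 35-second advert.

```python
def carried_across(resource_spans, boundary):
    """Return the resources that are active on both sides of a boundary.

    resource_spans maps a resource name to a list of (start, end) spans
    in seconds; boundary is the time of a candidate phasal boundary.
    A span counts as carried across only if it starts strictly before
    the boundary and ends strictly after it.
    """
    return sorted(
        name
        for name, spans in resource_spans.items()
        if any(start < boundary < end for start, end in spans)
    )


# Invented timings for illustration only (the advert lasts 35 seconds).
spans = {
    "music and song": [(0, 10), (20, 35)],
    "ambient sounds": [(10, 20)],
    "written slogan": [(30, 35)],
}

# Hypothetical boundary between the third phase and the end phase.
print(carried_across(spans, 30))
# Hypothetical boundary where no resource straddles the cut.
print(carried_across(spans, 10))
```

A transition that lasts for an entire subphase could be modelled in the same terms, as a span whose start and end coincide with the boundaries of that subphase.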
Looking at types of phase and transition through a multimodal concordancer
In describing The Fan advert, we are beginning to move away from the
multimodal transcription as an expression of instance towards the multimodal transcription as an expression of type, which inevitably raises a whole
series of questions. Are certain types of transitions likely to be found more
frequently in specific genres? Is the absence of speech, or indeed total
silence, one of the typical markers of a transition from one phase to another
in a feature film, but which, because of the need for maximum compression of
meaning in a very short space of time, is unlikely to be present in such
genres as the car advert? Or, on the other hand, do transitions function in
such a way as to introduce a new item of information that builds, sometimes
in a repetitive chain, onto what has previously been constructed in the text?
As viewers we are capable of recognizing phases and transitions; as transcribers, we can reconstruct where they occur. But this does not amount to

the same thing as characterizing the typical ways in which transitions come to
be the salient element in phasal organization.
A multimodal transcription is limited in the amount of information it can
give about types of semiotic units that are found in film texts and cannot
provide anything like the information we need in order to provide motivated
answers to these questions. If we are to pursue our understanding of the co-deployment of semiotic resources more thoroughly we need to understand
how a large number of dynamic texts typically unfold in time.
And in order to be able to identify characteristic patterns, the research
process requires us to build corpora that can be analysed in terms of various
Textual phenomena, including, in particular, a study of the typical phasal
organization of a specific genre which ensures that a film's unfolding in
time, in which the transition, as we have seen, is so significant, can be
captured by in vivo multimodal analysis. Such a requirement dictates the
need to build software programs that are capable of analysing corpora and
not just individual texts.
What then are the characteristics of an online XML-based multimodal
concordancer such as the Multimodal Corpus Authoring (MCA) system,
which has been designed by the author specifically to identify recurrent
patterns in films?
First, as an authoring tool, it enables researchers, however imperfectly, to
view short pieces of film and simultaneously to write multimodal descriptions of them in terms of various parameters, for example, those relating to
a text's metafunctional and phasal organization. Using MCA's editing tool,
researchers can segment a particular film into functional units and, while
viewing these units, type out detailed annotations relating both to the semiotic resources they deploy and the functions they perform within that film.
Indeed, MCA approximates to the researcher's dream of simultaneously
viewing and writing a description of a film in real time (see Baldry and
Taylor, in press).
Second, like a linguistic concordancer, a multimodal concordancer can
also establish patterns that relate to a series of texts, rather than to specific
instances, to a much greater degree than is possible with a multimodal
transcription, even where the latter is oriented towards type rather than
instance. For example, it is possible, using MCA, to determine the ratio of
female to male drivers, or to identify those texts relating to cars that are not
being driven, and hence have no drivers, and those relating to cars which are
instead being driven but where the driver is 'implied' and not actually seen.
It is also possible to identify special cases that involve two drivers, typically
one male and one female, or non-human drivers, typically robots. As with
any corpus approach using information technology, this information can be
obtained within a few seconds. However, unlike many lemma-based
approaches, the researcher must first carry out the work of description-cum-transcription of the texts in the corpus. Not surprisingly, the software design
is such as to incorporate an analytical framework that simplifies this task as
much as possible.
MCA's incorporated relational database allows researchers to search the
corpora created and identify patterns in them, all of which leads to a further
round of hypothesis formulation, segmentation, description and comparison of results. Table 4.2 gives the results of multimodal concordancing
in relation to 60 car adverts and shows that there are, in fact, many cases
where there is either no driver, because the car is stationary (16/60), or
where an unseen driver is driving the car (19/60). There are in fact a total of
24 male drivers (though in 3 cases we assume that the driver is a male from
what goes before and after). There are only six woman drivers and two of
these appear, as in the case of The Fan, in adverts where a man also drives.
Importantly, half the adverts are careful not to show the driver's identity.
Moreover, the relationship between men and women takes on a different
perspective when we look at different participants in the structure of an
advert. When we examine, for example, the ratio between male and female
voiceovers (whose function is usually to act as 'narrators' or 'storytellers'), we
notice that the imbalance begins to redress itself for there are various cases
in the corpus where, vis-a-vis a male or an unidentified driver, a female

Table 4.2 Driver types in 60 TV car adverts

voiceover predominates. As Table 4.3 shows, the search query in this case is
no longer formed by a single parameter (driver) but is a relational search that
links two disparate parameters: driver and storyteller.
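A relational search of this kind can be thought of as a cross-tabulation of two descriptive parameters. The following Python sketch is purely illustrative: the records, the parameter values and the function names `cross_tabulate` and `select` are invented for the example, and they reproduce neither MCA's database nor the data behind Tables 4.2 and 4.3.

```python
from collections import Counter

# Hypothetical annotation records: one dict per advert, one key per parameter.
# The values are invented for illustration and are not taken from the corpus.
adverts = [
    {"id": 1, "driver": "male", "storyteller": "female"},
    {"id": 2, "driver": "unseen", "storyteller": "female"},
    {"id": 3, "driver": "female", "storyteller": "male"},
    {"id": 4, "driver": "none", "storyteller": "male"},
    {"id": 5, "driver": "male", "storyteller": "male"},
]


def cross_tabulate(records, first, second):
    """Count how often each pair of values for two parameters co-occurs."""
    return Counter((r[first], r[second]) for r in records)


def select(records, **conditions):
    """Return the ids of records matching every parameter=value condition."""
    return [r["id"] for r in records
            if all(r.get(k) == v for k, v in conditions.items())]


table = cross_tabulate(adverts, "driver", "storyteller")
for (driver, storyteller), count in sorted(table.items()):
    print(f"driver={driver:<7} storyteller={storyteller:<7} {count}")

# A relational query linking the two parameters.
print(select(adverts, driver="male", storyteller="female"))
```

The point of the sketch is that the query runs over the researcher's descriptions rather than over the film itself, so any two parameters that have been annotated can be related in the same way.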
Thus, unlike many lemma-based linguistic concordancers such as OCP
or WordSmith, but in keeping with the approach adopted by O'Halloran
and Judd (2002), a multimodal concordancer needs to be built around the
notion of the relationship between resources, events and participants. In this
respect, any form of transcription is a hard task, often undertaken by a
researcher without knowing whether the effort will be worth the candle. In
theory, the results described in Table 4.2 could be acquired by watching a
videocassette and marking down the various features using pen and paper.
Though in principle feasible, it would be a time-consuming process. Even
using MCA, which greatly reduces the time taken to provide a description, it
is still a time-consuming process. A much harder task, however, is to relate the
parameter DRIVER with other parameters such as STORYTELLER and ORAL
SLOGAN. This is virtually impossible to achieve using traditional pen-and-paper and cassette methods. A multimodal concordancer, such as MCA,
which is based on these relational principles, can easily identify such patterns through relational searches as Table 4.3 indicates.
Third, a multimodal concordancer, even more than a linguistic concordancer, needs to be built around functional parameters such as those we have
mentioned above, namely Halliday's notion of metafunctions (Halliday,
1994) and Gregory's notion of phase and transition (Gregory, 1995, 2002).
In this respect, one significant step in the development of a corpus relates to
the work of tagging. In their paper on the development of a tagging system,
Baldry and Thibault (2001: 94-98) proposed the use of an annotational
system that defined gesture and language in terms of Halliday's notion of

Table 4.3 A relational search in MCA

Experiential metafunction: thus a tag of the L-MENT:PROJ and G-MENT:PROJ
type means that the text being described contains an instantiation in which
language and gesture are being used together to express mental projection.
MCA will support this type of tagging without any difficulty. However, given
that, as mentioned above, annotational systems in multimodal concordancers are still in their infancy, the system adopted so far has been oriented to a
binary presence/absence distinction of the various descriptive parameters,
which, as described elsewhere (Baldry and Taylor, in press), may be defined
at will by the corpus author.
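To make the shape of such tags concrete, the following Python sketch splits a tag into its parts. It assumes, on the basis of the two examples above only, that a tag has the form RESOURCE-PROCESS:SUBTYPE, with L for language, G for gesture, MENT for mental and PROJ for projection. The function names `parse_tag` and `co_deployed` are hypothetical, and codes not glossed in the text are passed through unchanged rather than guessed at.

```python
import re
from typing import NamedTuple

# Expansions limited to the codes glossed in the text (an assumption of
# this sketch); unknown codes are returned as they stand.
RESOURCES = {"L": "language", "G": "gesture"}
PROCESSES = {"MENT": "mental"}
SUBTYPES = {"PROJ": "projection"}

TAG_PATTERN = re.compile(r"^([A-Z]+)-([A-Z]+):([A-Z]+)$")


class Tag(NamedTuple):
    resource: str
    process: str
    subtype: str


def parse_tag(tag: str) -> Tag:
    """Split a tag of the assumed form RESOURCE-PROCESS:SUBTYPE."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    resource, process, subtype = match.groups()
    return Tag(RESOURCES.get(resource, resource),
               PROCESSES.get(process, process),
               SUBTYPES.get(subtype, subtype))


def co_deployed(tags):
    """True if language and gesture carry the same process and subtype."""
    by_resource = {t.resource: (t.process, t.subtype)
                   for t in map(parse_tag, tags)}
    return ("language" in by_resource and "gesture" in by_resource
            and by_resource["language"] == by_resource["gesture"])


print(parse_tag("L-MENT:PROJ"))
print(co_deployed(["L-MENT:PROJ", "G-MENT:PROJ"]))
```

A binary presence/absence system of the kind currently adopted is simpler still: each parameter is stored as YES or NO for each text, with no internal structure to parse.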
But how does all this contribute, for example, to our understanding of the
phasal organization of texts? Though ultimately more sophisticated mappings of the relationship between phases and metafunctions should be
possible, in the current stage of development, this relationship has been
characterized only in terms of a very preliminary step, namely the analysis
of the major Experiential 'category' in the car-drive phase(s) of 60 car
adverts: the activity of driving and what precedes and follows it.
As Table 4.4 illustrates, this activity has been characterized in terms of
the sub-components associated with the material process of driving, where
SP stands (as indicated in Table 4.1) for a subphase.

Table 4.4 Division of the activity of driving into subphases

SP1: INDICATES INTENTION: e.g. picks up keys (partly a mental process, partly a material process);
SP2: APPROACHES: The driver (a) approaches the car and (b) unlocks the driver's door;
SP3: GETS IN: The driver (a) opens the door, (b) gets in and (c) closes door;
SP4: STARTS UP/DEPARTS: The driver (a) puts the key in the ignition, (b) starts the engine, (c) indicates intention to move off and (d) pulls away;
SP5: CAR-DRIVE: The driver drives along the road (in town/country, by day/by night, in summer/winter, on/off road) towards his/her destination;
SP6: STOPS/SLOWS: The driver (a) stops and (b) slows down at INTERMEDIATE POINTS (e.g. traffic lights, junction, negotiates bend, has accident, calls in a shop, changes cars);
SP7: ARRIVES/PARKS: The driver (a) slows down, (b) stops on reaching destination, (c) parks and (d) switches off engine;
SP8: EXITS: The driver (a) gets out of the car and (b) closes door;
SP9: WALKS OFF: separates himself/herself physically from the car.

Using MCA, this information can be retrieved from the corpus with a
query of the form: SP2: contains YES or SP2: contains NO or even SP2:
contains YES and NO in cases where the matter is not quite so clear (for
example, when a driver opens the door and puts an object or person in the
car rather than himself/herself). Equally, it is possible, with a single search, to
identify all the cases where we see the car being driven and the driver getting
into and out of the car, in this case a query of the type: SP3: contains YES +
SP5: contains YES + SP8 contains YES. Thus 60 adverts were 'tagged' in
terms of the subphases of the car-drive phase (the first subphase has been
excluded on the grounds that it is only partly a material process), in such a
way that the corpus could be searched for the absence or presence of a
particular subphase.
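The logic of such a combined query can be sketched in a few lines of Python. This is not MCA's query engine: the annotations are invented, the names `parse_query` and `search` are hypothetical, and two simplifying assumptions are made, namely that a clause matches only when the stored tag is exactly the value asked for, and that a subphase with no tag counts as NO.

```python
import re

# Hypothetical annotations: advert id -> subphase -> "YES", "NO" or "YES and NO".
# The values are invented; they are not the tags recorded in Table 4.5.
corpus = {
    "advert_a": {"SP3": "YES", "SP5": "YES", "SP8": "YES"},
    "advert_b": {"SP3": "NO", "SP5": "YES", "SP8": "NO"},
    "advert_c": {"SP3": "YES and NO", "SP5": "YES", "SP8": "YES"},
}

# The colon after the subphase label is optional, as in the query quoted above.
CLAUSE = re.compile(r"^(SP\d+):?\s+contains\s+(YES and NO|YES|NO)$")


def parse_query(query):
    """Turn 'SP3: contains YES + SP5: contains YES' into (subphase, value) pairs."""
    clauses = []
    for part in query.split("+"):
        match = CLAUSE.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse clause: {part.strip()!r}")
        clauses.append(match.groups())
    return clauses


def search(annotations, query):
    """Return ids of adverts whose tags satisfy every clause (exact match assumed)."""
    clauses = parse_query(query)
    return sorted(advert for advert, tags in annotations.items()
                  if all(tags.get(sp, "NO") == value for sp, value in clauses))


print(search(corpus, "SP3: contains YES + SP5: contains YES + SP8 contains YES"))
print(search(corpus, "SP3: contains YES and NO"))
```

Treating YES and NO as a third value in its own right, rather than as a match for both YES and NO, keeps the doubtful cases separate so that they can be retrieved and inspected individually.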
As Table 4.5 shows, there is in fact only one advert (n. 21) which comes
anywhere close to instantiating all the possible subphases and even in this
case one subphase is missing and another is doubtful - hence the YES/NO tag
represented as a bracketed tick: this is a case where the driver is seen getting
into the car but only to put his young son in the back seat (see Figure 4.3
below). In all these adverts, visual/verbal ellipsis is constantly at work vis-a-vis the instantiation of the driving experience: there is normally no need to
see all the phases at work, since our own experience of driving allows us to
'fill in the gaps'. With the exception of advert n. 21, in 60 adverts we never
see the driver getting into and out of a car.
Table 4.5 suggests that car adverts do, in fact, fall into three types, which
may be tabulated as follows:
1 Car-drive adverts: The car is seen moving in a glorified way that attempts to
go beyond the daily grind of the ordinary world. The car is in an ideal
world. More often than not the number of participants is limited to one or two people
and in many cases no human participant is foregrounded; the participants never talk
about the car and never talk to each other and only exceptionally to the audience. In
these adverts only subphase 5 is apparent (17 cases);
2 Car-stationary adverts: The car is motionless, a statue to be 'worshipped'
and is typically related to some inconsistency or oddity in the behaviour
of the people surrounding the car who typically talk about the car. In these
adverts, none of the subphases listed in Table 4.4 is present (11 cases
represented in Table 4.5 as grey-shaded columns) or alternatively subphases in which the car is seen moving are absent (a further 6 cases);
3 Hybrid storytelling adverts: where both car-drive and car-stationary elements
are present and where either other genres are exploited to meet the
advert's own ends (e.g. spoofs on cinema and TV genres) or some attempt
is made to define the car in relation to daily activities and (usually) its
enhancement of these. These types include talk but never in the car-drive phase or
subphase. A good example of this is where the car-drive element is not
shown - hence the bracketed tick notation - but is instead realized,
through talk, as a mental and oral fantasy (projection) about the car's
drive potential by the car driver while the car is actually stopped (say at
the traffic lights). This is by far the largest category (26 cases), although it
should be noted that the majority (15) instantiate CD subphases before CS
subphases (The Fan being a rather special case).
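The three-way division can be expressed as a simple decision rule over the subphases tagged as present. The Python sketch below is a deliberate simplification under stated assumptions: the set of subphases in which the car is taken to be moving (SP4 to SP7) is an assumption of the sketch, the function name `classify` is hypothetical, and the rule ignores cases such as the bracketed tick, where the car-drive element is realized only through talk.

```python
# Subphases in which the car is assumed to be seen moving. This grouping is
# an assumption made for the sketch; Table 4.4 does not state it explicitly.
MOVING = {"SP4", "SP5", "SP6", "SP7"}


def classify(present):
    """Assign an advert to one of the three types from its set of YES subphases."""
    present = set(present)
    if present == {"SP5"}:
        return "car-drive"
    if not present & MOVING:
        return "car-stationary"
    return "hybrid"


print(classify({"SP5"}))          # only the car-drive subphase
print(classify(set()))            # no subphase present
print(classify({"SP2", "SP3"}))   # subphases present, but none with the car moving
print(classify({"SP3", "SP5"}))   # car-drive and car-stationary elements together

# Counts reported in the text for the three types.
reported = {"car-drive": 17, "car-stationary": 11 + 6, "hybrid": 26}
print(sum(reported.values()))
```

The counts reported for the three types (17, 11 + 6 and 26) sum to 60, so the typology accounts for every advert in the corpus.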

Table 4.5 Distribution of subphases (material process) in the car-drive phase

A fourth important characteristic of a multimodal concordancer is that it
comes close to functioning as a 'Mark II' version of the multimodal transcription represented in Figure 4.2, one that incorporates the notion of type, in that it can 'print
out' all the characteristics of a specific car advert in terms of the presence (YES) or
absence (NO) of a set of descriptive parameters. Thus Table 4.6 gives the 'printout' (actually a screen illustration) for The Fan advert we have analysed
above.

Table 4.6 Screen illustration of a multimodal transcription generated by MCA

Table 4.7 A multimodal transcription generated by MCA using relational parameters

Finally, an important function of the multimodal concordancer, closely
linked to its capacity to relate the characteristics of a specific car advert to
general trends, lies in its ability to pick the 'odd man' out. Thus, for example,
getting into a car is a comparatively rare event found in only 5 out of 60
adverts. Though by no means the rarest of subphases, its relative absence is
surprising. Moreover, there are only two cases (10, 21) where SP2+SP3+SP4
all occur together. As Table 4.7 shows, they are both marked cases where, as is
very frequently the case in car adverts, the abnormality and unpredictability
of humans (in this case, as Figure 4.3 shows, the stereotypical forgetfulness
of a male driver) is compared to the scientific reliability of cars.
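Picking out the 'odd man' amounts to counting how often each subphase, or combination of subphases, is instantiated and then listing the texts responsible for the rare cases. The Python sketch below illustrates the idea with invented presence tags; the function names are hypothetical and the figures it prints are not those of Table 4.5.

```python
from collections import Counter

# Hypothetical presence tags (advert id -> set of subphases tagged YES).
# Invented for illustration; not the distribution shown in Table 4.5.
tags = {
    1: {"SP5"},
    2: {"SP2", "SP3", "SP4", "SP5"},
    3: {"SP5", "SP7", "SP8"},
    4: set(),
    5: {"SP3", "SP5"},
}


def subphase_frequencies(annotations):
    """Count the adverts in which each subphase is present."""
    return Counter(sp for present in annotations.values() for sp in present)


def adverts_with_all(annotations, subphases):
    """List the adverts in which every one of the given subphases is present."""
    wanted = set(subphases)
    return sorted(a for a, present in annotations.items() if wanted <= present)


def rare_subphases(annotations, threshold):
    """Return subphases found in no more than `threshold` adverts."""
    freq = subphase_frequencies(annotations)
    return sorted(sp for sp, n in freq.items() if n <= threshold)


print(subphase_frequencies(tags)["SP3"])
print(adverts_with_all(tags, ["SP2", "SP3", "SP4"]))
print(rare_subphases(tags, threshold=1))
```

The list returned by a query of the second kind is what sends the researcher back to the individual adverts in order to account for the exception.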
Notice the player symbols on the left-hand side of Tables 4.3, 4.6 and 4.7.
Once we have identified a particularly striking result, we can mouse-click
these symbols and gain immediate access to the advert in question, all of
which allows us to view the precise context and to 'explain' the exception to
the predicted pattern in the manner indicated in Figure 4.3. A multimodal
concordancer is, after all, concerned with giving the researcher immediate
access to phases in film that require careful scrutiny.
Analyses of the results of queries such as Table 4.4, together with exceptions such as those suggested in Table 4.7 and Figure 4.3, all confirm
Thibault's (2000: 343) hypothesis that:
In movement, simultaneity and spatiality rather than linear succession in time
and particulateness (constituency) are important in the realization of Experiential
event and action configurations.

Figure 4.3 Unusual events dictate the need for an extended pre-drive subphase

This might at first seem surprising: undoubtedly, the posited sequence of
8 subphases in the material process of driving might at first be seen as
implying a linear succession in time. However, as we have seen, most of the
subphases are implied rather than actually seen: only 5 out of 60 adverts
explicitly represent more than 4 subphases. Most are more like The Fan,
concerned with the car as a social space rather than as a moving object.
The camera focuses on the spatial location of the car (on a country road)
and on a body or body part (where body = e.g. driver, mascot or car) which
performs a movement as an instigator or a reactor. Nevertheless, car adverts
may, in general, be divided into three main blocks that can be tabulated
as follows:
an initial block consisting of a single phase focusing on a single, individual
entity: a specific car, a person or a place in time;
a main block consisting of one or more phases or subphases contextualizing the initial focus through the specification of the relations of the
selected entity with the 'missing' parameters;
an end block consisting of a single phase: featuring the car logo, name,
manufacturer and, in many cases, some kind of EVALUATING synthesis that
may be used to project beyond the small world of individual entities
shown in the advert to a larger, more complex world (and which, of
course, functions to persuade you, the viewer, by overcoming your resistance to the product).
This phasal organization seems to fit The Fan and many other adverts in the
corpus very well. However, more work using MCA is required to establish
the validity of this suggested typical phasal organization and the division
of adverts into three types. The [+/-CD] and [+/-CS] tagging
system will not, of course, always be distributed as in the current case:
+CD(P1)^+CS^+CS/+CD(P2)^+CD(P3)^-CS/-CD(P4). There are cases,
for example, in which the distribution is essentially the reverse, with the car's
physical presence being confined exclusively to the end phase. But this does
not affect the hypothesis that three basic subtypes exist.
If they do exist, then it may well be that the predominating human figure
in the car advert will turn out to be generically correlated with one of the
specific subtypes mentioned above: the DRIVER (the car-drive only advert),
the INSPECTOR (the car-stationary advert) and the RACONTEUR/STORYTELLER
(the hybrid type alternating car-drive and car-stationary phases and including the subtype which includes an off-screen narrator). A further prediction
is that other roles will be involved, definable, however, in relation to the car (as
opposed to other participants, whether family, colleagues or strangers). That
is, it may prove to be the case that (despite many overlaps between the
categories) the car may be defined in terms of first, second and third person
relationships. The general distribution might well be: (a) car-drive adverts:
driver with his/her car [first person: mine: car and me, driver are the same thing];
(b) car-stationary adverts: inspector with somebody else's car, not mine [third person: otherness: not mine/not yours]; (c) storytelling adverts: raconteur and his/her
dream car for you [second person: yours, likely to include some kind of appeal
of the type: You should be driving it. . .].
Table 4.5 reconstructs the Experiential metafunction of 60 car adverts
analytically and systematically as subphases in the material process of driving,
thereby suggesting the validity of multimodal concordancing as an analytical and teaching approach. But, however systematic this may be, this is
only a provisional finding for if we are to honour the definition of phases in
terms of Gregory's already mentioned concept of consistency and congruity
echoed in Thibault's definition of phases as 'co-patterned semiotic selections that are co-deployed in a consistent way over a given stretch of text'
(Thibault, 2000: 325-326) and if we are to characterize their consequent
close identification with specific metafunctional configurations, we need,
at the very least, to complete the picture by describing patterns that
emerge vis-a-vis the Interpersonal metafunction (many of which are likely
to be stereotypical) and even more crucially the types of configurations
that emerge in relation to Interpersonal meanings when they are mapped
onto the Experiential structure we have sketched out. This is a complex
descriptive operation. Thus, although the previous paragraph gives broad
suggestions as to how this mapping might take place in car adverts, a complete picture of the organization of car adverts into typical patterns of
phases and transitions still needs to be worked out. Such a picture needs to
be ascertained with more robust corpus description than the one currently
available. But the important point to note is that both the type of corpus
description and the corpus querying that this operation requires seem to be
quite in keeping with MCA's capabilities, given that its core feature is
its capacity to relate a wide array of disparate features over a wide range
of texts. But even if the phasal patterns sketched out above prove to be
valid over a still larger corpus, they will not be a point of arrival. Rather they
will still be a point of departure into a more precise understanding of
transitions and transition types, whose careful description, as this paper has
attempted to suggest, is crucial to the success of the multimodal analysis
of film texts.

Conclusion
What is a multimodal transcription and what is a multimodal concordancer?
What is the relation between them and how can they promote English
studies, both from the standpoint of the researcher carrying out detailed
comparisons of texts and, more generally, from the standpoint of teachers
and students of English? Why should we be looking at type as opposed to
instance? Most answers to these questions will, hopefully, have been provided
in what has been stated above. A characterization of phase and transition
types would seem to lead to a better understanding of the features of
dynamic genres of which TV ads are just one exponent, one that at the very
least provides a guiding framework for students taking their first steps in the
analysis of dynamic texts.
A few concluding notes are, however, in order. While the multimodal transcription can be a useful starting point for an understanding of the ways in
which resources such as gaze, gesture and language combine in typical phasal
patterns, it has its limitations, some of which have been noted above. In the
early stages of this work, Baldry and Thibault developed a dynamic version of
the static multimodal transcription, a forerunner of MCA, which allowed the
user to generate the individual rows of a transcription through a query mechanism, and which facilitated understanding of how visual objects and their
movements could be analysed in terms of Halliday's metafunctions.
Unlike a lemma-based linguistic concordancer such as OCP or
WordSmith, MCA does not search through Textual data directly in the
search for patterns but does so indirectly: it searches the corpus for patterns in
descriptions which have been previously created by the researcher using
MCA's annotational tool. The annotational patterns so far used in the construction of a corpus of car adverts relate mainly to the metafunctional and
phasal organization of the texts. As we have seen, in the analysis of The Fan car
advert, driving a car is not just a question of driving: rather a car advert can be
defined in terms of the relationship between the car driver and the car itself,
with car-drive (CD) phases intertwining with car-stationary (CS) phases.
Above all, though, MCA is the result of efforts to create transcription and
annotational tools that meet functional criteria in a way that was not
achieved by the first generations of lemma-based concordancers. In this
respect, it has to be stressed that the needs of the research community have
changed in recent years in such a way as to privilege specialized corpora,
including the analysis, whether comparative or otherwise, of specific texts,
all of which are clearly reflected in the design characteristics of MCA. MCA
has been specifically designed as an online tool so that the research and
teaching community can easily access it. In this respect, work is currently in
progress to establish what integrations can be achieved with other systems,
for example, with HyperContext Web which uses techniques born in artificial intelligence that keep track of the user's progress and which are fundamental in teaching applications of corpora (see Pavesi and Baldry, 2000;
Piastra and Lombardi, 2000).
Multimodal concordancing is in its infancy. MCA may have been online
for more than a year now with a constantly growing user base. But it is still a
prototype that requires inputs and co-developments by various research
teams, including the efforts of specialists in computer-based multimodal
annotational systems. One area, for example, in which MCA and instruments like MCA may be expected to develop further, in particular if they
are to be used as a teaching tool, is in terms of their incorporation of
predefined sets of parameters so as to reflect different linguistic and multimodal theories and traditions. Here MCA will depend heavily on the
experience gained by other research teams, in particular the work carried
out at the National University of Singapore (for example, O'Halloran and
Judd, 2002). Another development will be in relation to subtitling (Baldry,
2002; Baldry and Taylor, in press) where a project is underway to associate
language-learning subtitles with the films in MCA's database. Rather than
as fixed overlays incorporated in the film itself, the subtitles, rather as happens with DVD, will be generated independently of the film text, in the case
of MCA, through specific queries using the relational querying mechanism.

Acknowledgements
This paper is part of research within the Linguatel Project, an Italian inter-University project, co-financed by MURST/MIUR and co-ordinated by
Carol Taylor Torsello, University of Padua and its successor the Didactas
Project, co-ordinated by Chris Taylor, University of Trieste, which is
similarly financed. Michele Beltrami has developed MCA to the author's
design requirements as part of this project. Now in its second release,
MCA is viewable through the Pavia pages of the Linguatel Website:
claweb.cla.unipd.it/Linguatel/Pavia/MCA.htm or directly at: mca.unipv.it
[default User name: guest and default login: iamguest; see also New Registration] using Microsoft Explorer.
I thank Vauxhall Motors for the inclusion of five frames from their
advertisement, and I also wish to thank Antonio Cerlenizza and Oliver
Bartholomay, respectively Direttore Audi Italia and Responsabile MKTAudi of Autogerma, Divisione Audi S.p.A, Verona and Roberta Mottino of
Verba s.r.l. Milan for their kind permission to reproduce parts of The Fan
advert for the Audi A4 model. However appreciative and supportive of the
advert's organization and goals, the interpretation given above remains, of
course, entirely mine.

References
Baldry, A. P. (1999) Multimodality and multimediality. In M. Karagevrekis (ed.),
Compelling Learning Techniques in ESP/EAP, Proceedings of the 3rd ESP Conference, 25th
September 1998. Thessaloniki: Zefyros, 5-32.
Baldry, A. P. (ed.) (2000a) Multimodality and Multimediality in the Distance Learning Age.
Campobasso: Palladino Editore.
Baldry, A. P. (2000b) ESP in a visual society: historical dimensions in multimodality
and multimediality. In A. P. Baldry (ed.), Multimodality and Multimediality in the
Distance Learning Age. Campobasso: Palladino Editore, 41-89.
Baldry, A. P. (2000c) Introduction. In A. P. Baldry (ed.), Multimodality and Multimediality in the Distance Learning Age. Campobasso: Palladino Editore, 11-39.
Baldry, A. P. (2002) Computerized subtitling: a multimodal approach to the learning
of minority languages. In G. Talbot and P. Williams (eds), Essays in Language
Translation and Digital Learning Technologies in Honour of Doug Thompson. London:
Matador-Troubador Books, 69-84.
Baldry, A. P. (in press) Promoting comparative multimodal concordancing: its role in
language education, teacher training, subtitling and minority language learning.
In N. Vasta (ed.), Atti del Convegno Forms of Promotion, Bologna: Patron.
Baldry, A. P. and Taylor, C. (in press) Multimodal corpus authoring system: multimodal corpora, subtitling and phasal analysis. In Proceedings of the LREC Congress,
Las Palmas, June 2002.
Baldry, A. P. and Thibault, P. J. (2001) Towards multimodal corpora. In G. Aston
and L. Burnard (eds), Corpora in the Description and Teaching of English. Bologna:
CLUEB, 87-102.
Bernstein, B. (1971) Class, Codes, and Control, Vol. I: Theoretical Studies Towards a
Sociology of Language. London: Routledge and Kegan Paul.
Cook, G. (1992) The Discourse of Advertising. London: Routledge. (2nd edn 2001)
Gregory, M. (1995) Generic expectancies and discoursal surprises. John
Donne's The Good Morrow. In P. Fries and M. Gregory (eds), Discourse in Society:
Systemic-Functional Perspectives. Meaning and Choice in Language: Studies for Michael Halliday. Norwood, NJ: Ablex, 67-84.
Gregory, M. (2002) Phasal analysis within communication linguistics: two contrastive discourses. In P. Fries, M. Cummings, D. Lockwood and W. Spruiell
(eds), Relations and Functions within and around Language. London: Continuum,
316-345.
Gregory, M. and Malcolm, K. (1981) Generic Situation and Discourse Phase: An Approach
to the Analysis of Children's Talk. Mimeo, Applied Linguistics Research Working
Group. Glendon College, York University, Toronto.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Halliday, M. A. K. (1978) Language as Social Semiotic: The Social Interpretation of Language
and Meaning. London: Edward Arnold.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Edward Arnold.
Halliday, M. A. K. and Hasan, R. (1985) Language, Context and Text: Aspects of Language
in a Social-Semiotic Perspective. Geelong, Victoria: Deakin University Press.
(Republished by Oxford University Press, 1989.)
Lombardo, L. (2001) Selling it and Telling it. A Functional Approach to the Discourse of Print
Ads and TV News. Roma: Istituto Linguistica Moderna, Luiss, Guido Carli.
O'Donnell, M. (2002) Systemics Coder. http://www.wagsoft.com/Coder/index.html
O'Halloran, K. L. and Judd, K. (2002) Systemics 1.0. [CD-ROM]. Singapore: Singapore University Press.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
Pavesi, M. and Baldry, A. P. (2000) Learning to read scientific texts: integrated self-access courseware and corpora for university science students. In A. P. Baldry
(ed.), Multimodality and Multimediality in the Distance Learning Age. Campobasso:
Palladino Editore, 227-245.
Piastra, M. and Lombardi, L. (2000) The HyperContext Web Project: dynamic
authoring for distance learning. In A. P. Baldry (ed.), Multimodality and Multimediality in the Distance Learning Age. Campobasso: Palladino Editore, 247-262.
Taylor, C. and Baldry, A. P. (2001a) Computer assisted text analysis and translation:
a functional approach in the analysis and translation of advertising texts. In
E. Steiner and C. Yallop (eds), Exploring Translation and Multilingual Text Production:
Beyond Content. Berlin: Mouton de Gruyter, 277-305.
Taylor, C. and Baldry, A. P. (2001b) Computer-assisted text analysis and translation
(characteristics of interactive self-access computer modules incorporating a functional approach in the analysis and translation of advertising texts). In
C. Torsello, G. Brunetti and N. Penello (eds), Corpora Testuali per Ricerca, Traduzione
e Apprendimento Linguistico. Studi Linguistici Applicati. Padova: Unipress, 273-292.
Thibault, P. J. (2000) The multimodal transcription of a television advertisement:
theory and practice. In A. P. Baldry (ed.), Multimodality and Multimediality in the
Distance Learning Age. Campobasso: Palladino Editore, 311-385.
Vasta, N. (2001) Rallying Voters: New Labour's Verbal-Visual Strategies. Padova: Cedam.
Zago, S. (2002) A multimodal analysis of six television adverts. Unpublished thesis.
Dipartimento di Lingue e Letterature Anglo-Germaniche e Slave, University of
Padua.

Visual semiosis in film

Kay L. O'Halloran
National University of Singapore

Introduction
The aim of this paper is to investigate a method for capturing and interpreting the spatial and temporal dynamics of visual semiosis. This is achieved
through the description of an analysis of a short segment from the dynamic
medium of film. The analysis is based on a systemic-functional framework
for film, and the use of software which allows the editing of digital video
images in order to display visually the nature of different semiotic choices
across a range of systems. From this point, the problematic nature of such
an enterprise becomes apparent and possible directions for future research
are suggested.
The film medium parallels a significant dimension of our experience of
the world: it involves sequences of change and repetition in the visual and
auditory realm. Film, however, involves playing with time sequences in a
two-dimensional frame to represent our three-dimensional lived-in material
experience of the world where the faculties of hearing, sight, smell, taste
and touch are sources for sensory, and therefore semiotic, input. Thus while
limited in the sense that the discussion presented here only incorporates the
visual aspect of semiotic exchange, this paper is nonetheless a further tentative step towards incorporating the meaning of the dynamic in systemic-functional theory. For it is not only the culmination of choices made across
semiotic resources in their interaction with other resources that makes
meaning, but also the temporal and spatial unfolding of those choices.
Although images of instances frozen in time may become lodged within our
consciousness, generally we do not make meaning from a series of snapshot
images of the world, but rather our daily experience of the world is based on
patterns of change; that is, meanings derived from systems in flux. Our
perceptual apparatus is oriented towards detecting and assimilating change
and contrast, rather than relying on the stability and continuity which, in the
normal course of events, we learn to layer on top of that experience. An
adequate model which accounts for our social construction of the world,
therefore, necessarily needs to account for changing states which have traditionally been the concern of other domains, which include film theory,
mathematics, physics and studies of perception in cognitive science.


Although not reproduced here (Paramount refused copyright permission),1 two short scenes from the film Chinatown, directed by Roman Polanski
(1974), were analysed for this paper. While film is evidently staged and directed behaviour with sequences which have been edited to achieve particular
effects, the analysis of this medium is at least a step in understanding semiosis
in everyday life. That is, despite the scripted and edited nature of film performance, this environment provides us with some means to start investigating everyday discourses-in-flux. Using a systemic-functional framework for
film, this paper is a preliminary attempt at a method for capturing and
analysing the dynamics of visual semiosis in a digitalized video format.
The social semiotic framework presented in this paper is based on
Michael Halliday's (1994) systemic-functional grammar of the English
language. Halliday's theorization of language as a social semiotic with
systems for Interpersonal, Experiential, Logical and Textual meaning has
been extended by O'Toole (1994, 1995, 1999) to the realm of displayed
art; for example, paintings, architecture and sculpture. While O'Toole's
systems for paintings are included in the proposed framework for film, the
former are concerned with analysing the single semiotic of the static visual
image. In film, however, there are multiple semiotic resources being used
spatially and temporally. Thus the multiple resources which result in
change, similarity and contrast are included in the systemic model for film
presented here. In addition, O'Toole (1999) represents his theory in an
interactive CD-ROM format. This method of visual representation in the
electronic environment provides the basis for the investigations undertaken
in this paper.
The focus of early studies in multimodality has primarily been directed
towards the analysis of static texts; notably Lemke's (1998b, 2003) early
pioneering work in scientific discourse and mathematics, Kress and van
Leeuwen's (1996) Reading Images: The Grammar of Visual Design, and other
more recent studies2 (for example, Baldry, 2000; Kress and van Leeuwen,
2001; O'Halloran, 2003a, 2003b; Ventola et al., forthcoming). However,
current research is increasingly turning towards the analysis of the dynamic
text (for example, Baldry, this volume; Callaghan and McDonald, 2002;
Iedema, 2001; Lemke, 1998a, 2000; McInnes, 1998; Martinec, 2000;
Thibault, 2000; van Leeuwen 1999).
With the exception of Baldry's Multimodal Corpus Authoring (MCA)
system (see this volume), however, few (if any) attempts have been made to
analyse dynamic semiosis in digitalized format using computer-based technology. Baldry's MCA is a Web-based instrument which is designed for
analysing dynamic multimodal texts, that is, film and video texts which
display different and constantly varying configurations of sound, image,
gesture, text and language as the text unfolds in time. Baldry harnesses the
potential of computer technology to develop the MCA system with the aim
of developing a metafunctionally based transcription method which can
highlight the types of shots, cuts, phases and transitions. The analyst can
record choices in a relational database format so that comparisons can be
made across a corpus of texts. This concordance instrument thus analyses
the dynamics of semiosis through methods which involve recording annotated entries. As Baldry (this volume: 105) explains:
MCA does not search through Textual data directly in the search for patterns but
does so indirectly: it searches the corpus for patterns in descriptions which have
been previously created by the researcher using MCA's annotational tool. The
annotational patterns so far used in the construction of a corpus of car adverts
relate mainly to the metafunctional and phasal organization of the texts.
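
The indirect search described in this passage can be pictured as a query over annotation records rather than over the footage itself. The following is only a minimal sketch under stated assumptions: the table name, the column names and the sample rows are hypothetical and are not taken from MCA, whose internal design is not described here.

```python
import sqlite3

# Hypothetical schema: one row per annotated phase of a film or video text.
# None of these names come from MCA; they are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotation ("
    " text_id TEXT, phase INTEGER, start_s REAL, end_s REAL,"
    " metafunction TEXT, description TEXT)"
)
rows = [
    ("advert_01", 1, 0.0, 4.5, "interpersonal", "close shot; gaze at viewer"),
    ("advert_01", 2, 4.5, 9.0, "experiential", "car moves left to right"),
    ("advert_02", 1, 0.0, 3.0, "interpersonal", "long shot; gaze averted"),
]
conn.executemany("INSERT INTO annotation VALUES (?, ?, ?, ?, ?, ?)", rows)


def find_phases(conn, metafunction, keyword):
    """Return (text_id, phase) pairs whose description mentions the keyword.

    The search runs over the researcher's descriptions, not over the
    video data, which is the indirect strategy quoted above.
    """
    cur = conn.execute(
        "SELECT text_id, phase FROM annotation"
        " WHERE metafunction = ? AND description LIKE ?"
        " ORDER BY text_id, phase",
        (metafunction, f"%{keyword}%"),
    )
    return cur.fetchall()


print(find_phases(conn, "interpersonal", "gaze"))
```

A query of this kind can only find what the analyst has already described in words, which is the limitation that motivates searching the visual data directly.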

One aim of this paper is to suggest ways in which the user can directly search
for patterns in visual Textual data. In other words, I explain how commercially available software can be used in conjunction with a visual
grammar to capture changing patterns in dynamic text. This exploratory
stage is viewed as a first step towards a new methodology afforded by the
electronic medium which could eventually be included in a system such as
Baldry's MCA. In addition, there is the potential to incorporate software
such as Systemics 1.0 (O'Halloran and Judd, 2002) in such applications
in order to analyse the linguistic choices as they unfold in time. The challenge remains for us to capture and analyse choices across all semiotic
resources in such a way that the dynamics of meaning-making can truly be
investigated.
A visual grammar for visual images
The inspiration for the approach adopted in this study stems from O'Toole's
(1994: 24, 1999) framework for the analysis of paintings where a constituent
structure approach with ranks PICTURE, EPISODE, FIGURE and MEMBER is
adopted. O'Toole's chart documents the systems of meaning for the
Experiential, Interpersonal and Textual metafunctions which are respectively
labelled representational, modal and compositional. While many of these
systems can also be seen to operate within the realm of film, the different
medium of production and the fact that the text unfolds in real time mean
that there are further dimensions to the analysis. Also, given the cause-effect
relations in film narrative, the logical metafunction is also included.
In the innovative CD-ROM, Engaging with Art, O'Toole (1999) creatively
utilizes computer technology in an interactive multimedia hypertext
environment to display choices visually from his systemic-functional framework. For example, in Plate 5.1 O'Toole effectively captures choices from
the system of light which function to engage the viewer in Rembrandt's
painting The Night Watch (1642). In another instance, O'Toole (1999) demonstrates how Vertical Lines are one resource which functions compositionally
in Seurat's Sunday Afternoon on the Island of La Grande Jatte (1884-1886). He
also gives an amusing demonstration of the change in meaning which would
occur in Botticelli's Primavera (1478) with alternative choices for the direction
of Gaze for each of the figures in the painting.

Plate 5.1 Visualizing the system of light (O'Toole, 1999)

O'Toole's (1999) Engaging with Art thus represents a major advance in
theory of semiotic analysis where choices in the visual semiotic are displayed
visually rather than being described linguistically. This method means that
patterns in visual semiosis may be marked in such a way that the viewer can
immediately grasp the significance of such choices. As I describe in this
paper, there also exists the potential for displaying visually the overlapping
dynamic choices in-flux across systems. The advantages of this approach
may be appreciated through a comparison with an alternative method
developed by Thibault (2000).
As a major step in theorizing a comprehensive semiotic analysis of a
television advertisement,3 Thibault (2000: 374-385) proposes a static linguistic description in table format with dimensions 'Visual Image', 'Kinesic
Action', 'Soundtrack' and 'Metafunctional Interpretation of Phases and Subphases' which are defined as constituting 'an intermediate level of analysis
which lies between the microlevel lexicogrammatical, kinesic, and image
selections and the global structuring of the text as a whole' (Thibault, 2000:
365). Following Gregory (1995, 2002), Thibault (2000: 325-326) defines
phase as 'a set of co-patterned semiotic selections that are co-deployed in a
consistent way over a given stretch of text'. Here the change of phase is
marked by a salient metafunctional choice which marks the transition.
Although a highly significant and useful methodology for capturing integratively multimodal social meaning-making, the linguistic description does not
capture the import of such choices and also it fails to map visually the
choices as a sequence of continuity and change.
The potential exists for the viewer to actively engage with the digitalized
film segments to illustrate the impact of different semiotic choices. This is
achievable through the use of facilities in video editing software such as
Adobe Premiere 6.0, which permits the user to segment a digitalized video
clip into sections according to frame number (for example, 1, 2, 4 and
6 frames) or time intervals (for example, 1, 2, 4 seconds). The software allows
the user to manipulate the visual footage in multiple ways; for example, the
image may be adjusted for brightness, contrast, colour (which can be
replaced and matched) and special effects such as blurring, distortion, perspective, edge definition and shadowing (to name but a few) may be applied.
The software also allows the user to create multiple transparent mattes
which act as overlays on the original film footage so that text can be inserted
and lines, vectors, figures, outlines and shadings can be drawn. In addition,
visual transitions between parts of the footage can be marked in various
ways. These facilities allow the user to mark explicitly the nature of visual
semiotic choices which have been made. Just as one enters a linguistic analysis by tagging the linguistic text in software such as Systemics 1.0, so the
analyst can enter the analysis of the visual images through direct Textual
engagement.
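
The segmentation of a clip by frame number or by time interval reduces to simple arithmetic on frame indices. The sketch below is not Adobe Premiere's interface; the function names are hypothetical, and a rate of 24 frames per second is assumed as the default.

```python
def segment_by_frames(total_frames, frames_per_segment):
    """Split a clip into (start, end) frame ranges; end is exclusive.

    The last segment is shorter when the clip length is not an exact
    multiple of the segment length.
    """
    if frames_per_segment <= 0:
        raise ValueError("frames_per_segment must be positive")
    return [
        (start, min(start + frames_per_segment, total_frames))
        for start in range(0, total_frames, frames_per_segment)
    ]


def segment_by_seconds(total_frames, seconds_per_segment, fps=24):
    """Split a clip into time intervals, expressed as frame ranges.

    fps=24 is an assumed frame rate, not a property read from any file.
    """
    frames_per_segment = round(seconds_per_segment * fps)
    return segment_by_frames(total_frames, frames_per_segment)


# A two-second clip at the assumed 24 frames per second has 48 frames.
print(segment_by_frames(48, 6))   # eight segments of six frames each
print(segment_by_seconds(48, 1))  # two one-second segments
```

Each resulting range can then carry its own overlay or annotation, so that a semiotic choice is marked on exactly the frames in which it operates.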
In the following discussion of the analysis of the visual dimensions of the
dynamics of the film footage, I do not consider the soundtrack. Therefore, in
this limited discussion it is important to keep in mind Baldry's (this volume:
94) claim that transitions in phases take many forms:
Thus, transitions are not necessarily equated with the cutting from one shot to
another, nor indeed with what is happening in the visual. While transitions will
often be related to what is happening in the visual, this will not always be the case
[. . .] transitions, as Thibault (2000: 320) and Gregory (2002: 323) have pointed
out, essentially relate to changes in the metafunctional organization of the text and
as such may very well be related to changes in the soundtrack and not just to what
happens in the visual.

Video-editing tools, therefore, allow the user to highlight the different semiotic choices visually and view the impact of such choices when they combine in the text in real time. The method which was adopted for this paper
involved the use of Adobe Premiere 6.0 to explore how salient semiotic
choices may be highlighted in a short extract from the film Chinatown. However, as previously noted, unfortunately it has not been possible to reproduce
still frames from this analysis in this publication due to Paramount Studio's
refusal to give copyright permission. Nonetheless, the results of the visual
analysis are described in some detail.

A systemic-functional framework and Chinatown (1974)


The systemic-functional model proposed here4 has been developed in conjunction with the film theory presented in Bordwell and Thompson's (2001)
Film Art: An Introduction. Bordwell and Thompson are concerned with the
image in the visual frame and the accompanying audio soundtrack. In what
follows, I discuss the proposed systemic framework and demonstrate how
such an approach may be applied for the analysis of compositional and
Interpersonal meaning in two short scenes from Chinatown. In order to situate the analysis, I first briefly discuss this film.
Written by Robert Towne and produced by Robert Evans with director
Roman Polanski and production designer Richard Sylbert, Chinatown is a
detective film set in 1937 in Los Angeles with Jack Nicholson as Jake Gittes
(the private detective), Faye Dunaway as Evelyn Mulwray (the wife of Hollis
Mulwray, chief engineer of Water Energy and Power) and John Huston as
Noah Cross (former partner with Hollis Mulwray of a private water company for LA). The plot unfolds as Jake unearths the corruption behind
Cross's plan to build a new reservoir. This involves investigation of the
murder of Hollis Mulwray who opposes the plan, and unearthing the history of Evelyn Mulwray who was raped by her father Noah Cross at the age
of 15. Cross's partner Hollis Mulwray subsequently married Evelyn and
supported her daughter Katherine. After Jake becomes aware of the reasons
for Evelyn's actions, he organizes her escape from her father with Katherine.
However, Cross forces Jake to disclose their whereabouts with the result that
Evelyn is killed by the police. Jake once again unwittingly aids the death of
someone he is trying to protect, which is a repeated scene from the days in
which he was a police officer in Los Angeles' Chinatown.
Chinatown has been the subject of much discussion in film theory (for
example, Eaton, 1997; Heisner, 1997; Krutnik, 1991; Tuska, 1984).
Based on the history of pumping water to Los Angeles in the first quarter
of the twentieth century, Eaton (1997: 43) explains that Chinatown is 'a
complex detective thriller with dimensions which are political (about the
nature of power), sexual (about the nature of gender), metaphysical (about
the nature of the evil), psychological (about the nature of the self) and
philosophical (about the nature of knowledge)'. According to Eaton (1997)
the subtext is concerned with the theme of American greed. In addition,
Heisner (1997: 63) explains that Robert Towne has explored the 1930s
popular conception of 'the inscrutable Orient' which is 'unknowable; it is
dense and powerful and corrupt'. In the film Chinatown, this view is applied
to the entire world.
The proposed systemic-functional framework involves classifying the film
according to type, form and genre. The semiotic analysis of the film is based
on a metafunctionally organized rank constituent structure with ranks
Film Plot, Sequences, Scene, Mise-en-Scene and Frame. Though beyond the
scope of this paper, the notion of metafunctionally based phases and transitions may be incorporated within this framework. The aim of the analysis
undertaken here, however, is to demonstrate how a visual grammar can be
implemented in the dynamic digitalized environment of film.
Film type: fiction, documentary, experimental and animated
Film form: narrative, categorical, rhetorical, abstract and associational
Genre: multiple types; for example, narrative films include science fiction, western, musical, comedy, suspense, and action thrillers with sub-genres horror, detective, hostage and gangster
Ranks: Film Plot
       Sequences
       Scenes
       Mise-en-Scene (the shot)
       Frame
Film type/form
Bordwell and Thompson (2001) categorize films as fiction, documentary,
experimental and animated based on how the film material was chosen,
arranged and the nature of the filming. They further propose that films
also have a basic film form, or a system of relationships among the parts
which may be categorized as Narrative, Categorical, Rhetorical, Abstract
and Associational. The narrative form, however, is dominant in mainstream
cinema. Bordwell and Thompson (2001: 60) define narrative as 'a chain of
events in cause-effect relationship occurring in time and space'. In a narrative film, the viewer is presented with the plot, 'the arrangement of material
in the film' from which the viewer individually creates the story 'on the basis
of cues in the plot' (ibid.: 62). Most films employ narrative where causality
and time are central.
In classic Hollywood cinema, the action usually springs from individual
characters as causal agents where the narrative usually centres on personal
psychological causes such as decisions, desires, choices and traits of character (Bordwell and Thompson, 2001). The narrative subordinates time,
motivation and other factors to the cause-effect sequence. There is usually
strong closure where the causal chain is completed with a final effect. 'We
usually learn the fate of each character, the answer to each mystery, and the
outcome of each conflict' (ibid.: 77).
In Chinatown Jake Gittes desires to know the truth surrounding Evelyn and
the murder of Hollis Mulwray. As Eaton (1997) explains, Evelyn chooses not
to speak because she knows too much about her father's corruption and
power to share Jake's faith in revelation. Jake considers her a betrayer but he
learns that in fact she is the victim. The cause-effect relations in Chinatown
are extremely complex as new revelations continually occur in the unfolding
of the plot.

Genre
There are no rigid criteria to define the different genres of film (Bordwell and
Thompson, 2001). Some classifications are based on subject/theme (for
example, crime for gangster movies), while others are defined by emotional
effect (for example, amusement for comedy). Genre conventions are also
based on plot, thematic development, film techniques and iconography. Further to this, genres change and new hybrid types are continually emerging.
However, despite this fluidity the audience generally recognizes genre conventions. Genres are seen to be institutionalized and ritualized dramas 'which
are satisfying because they reaffirm cultural values . . . [such as] self sacrificing heroism, the desirability of romantic love' (Bordwell and Thompson,
2001: 99). Bordwell and Thompson (2001) further explain that these reaffirmations distance the viewer from real social problems and the more finite
and anxiety-ridden aspects of life such as death, disease, breakdown and
insecurity. Genres may also be seen to 'exploit ambivalent social values and
attitudes' which 'arouse emotion by touching upon deep social uncertainties
but then channel those emotions into approved attitudes' (ibid.: 99).
Chinatown is a detective story with an investigative structure (Eaton, 1997).
'As Poe so clearly put it, the detective exists "to play the Oedipus"' (ibid.:
17), the truth seeker. Chinatown is a story where 'wrongs can ultimately be
uncovered but the seeker after truth is not only completely incapable of righting them but his very search will only make matters worse' (ibid.: 21).
Chinatown is also recognized as film noir and, more specifically, reflects the
origins of the neo-noir. The subject of much study (for example, Christopher,
1997; Hirsch, 1981; Kaplan, 1998; Krutnik, 1991; Palmer, 1994; Tuska,
1984; Voytilla, 1999), film noir is a descriptive term for American crime films
from the early 1940s to the late 1950s in which doomed men are obsessed with seductive women, as exemplified by Double Indemnity (1944) and Scarlet Street (1945).
In the 1960s and 1970s films with noir flourishes include Klute (1971), Play
Misty for Me (1971), Taxi Driver (1976) and Chinatown (1974).
Definitions of film noir vary but there seems to be general agreement that
the term designates films with a low-key visual style which contrasts to the
bright balanced studio look of the 1930s. There are noir movies of different
genres, for example, mystery, suspense thriller, psychological drama, and
gangster films (Krutnik, 1991). Critics generally agree that there is also an
obliqueness and often confused temporal narrative plot. There is usually a
general mood of dislocation and bleakness, and the noir world is deceptive
and uncertain. ' "The world is a dangerous place" is one of the axioms of
noir' (Hirsch, 1981: 13).
Chinatown, however, is filmed in the non-expressionistic 'classical' style of
Panavision and Technicolour with a straightforward narrative style. Nevertheless, 'the cynicism and despair which permeates the social vision of the film
noir ... is present ... in the final act of this Polish exile's [Roman Polanski's]
film' (Eaton, 1997: 57-58). According to Eaton (1997: 58), the
depiction of Evelyn Cross Mulwray is where the noir-ish influence is most
obvious. '"The dark lady, the spider woman, the evil seductress who tempts
man and brings about his destruction" [Place, 1998: 47] is how the "female
archetype" of film noir has been characterized and this is the image of the
female lead which is now consciously evoked [in Chinatown]' (ibid.: 58).
The figure of the woman in film noir has been the focus of feminist film
theory since Chinatown was produced. The emergent new femme fatale in films
in the 1990s, for example, Basic Instinct (1992), is 'redefined as a sexual
performer within a visual system which owes as much to soft-core pornography as it does to mainstream Hollywood' (Stables, 1998: 172-173). The
new woman takes an active role in initiating sexual practices which are
perceived as deviant, marginal or transgressive to the dominant culture. In
the analysis below, we shall investigate the semiotic construction of Evelyn
Cross in the role of 'spider woman' which has subsequently led to such
constructions of women in contemporary cinema.
The Film Plot and Sequence
The form which gives rise to the plot is the overall interrelation among
various systems of elements and every element in this totality has one or
more functions (Bordwell and Thompson, 2001). In the model presented
here, the Film Plot is constructed from the series of Sequences where the
motivation is similarity and repetition, and difference and variation.
In Chinatown, repetitive elements and motifs are significant (Eaton, 1997;
Heisner, 1997). The Scenes take place in different locations which reinforce
the theme of drought-stricken Los Angeles. The symbolism of water continually appears in the unfolding of the plot with constant screen images
and references to water. A second motif is the lens in the form of glasses, car
mirrors and binoculars which contribute to the theme of distorted vision.
These themes of voyeurism and blindness are 'not simply about seeing, it is
about seeing wrongly' (Eaton, 1997: 29). Other motifs in Chinatown, for
example, the horse and rider, are metaphors for desire and sexuality. In the
Mise-en-Scene analysed below, we shall see these themes reappear in different forms.
Scene and Mise-en-Scene
The Mise-en-Scene is concerned with everything which is seen within the
frame as it unfolds in time together with the accompanying soundtrack. As
soon as the camera shot changes, even though still centred on the same
setting, we will be concerned with a new Mise-en-Scene. The Mise-en-Scene
complex, or the unfolding series of Mise-en-Scene, forms the Scene. The
total of Scenes forms the Sequence, which in film theory is the term for the
fragmentation of the film into segments.
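
The rank structure described here can be pictured as nested containers, with each rank made up of units of the rank below. The following sketch assumes that a Mise-en-Scene (a shot) is stored as a range of frame numbers; the class names follow the ranks above, but the labels and frame numbers in the example are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MiseEnScene:
    """One camera shot, stored as a half-open range of frame numbers."""
    first_frame: int
    end_frame: int  # exclusive

    def frame_count(self) -> int:
        return self.end_frame - self.first_frame


@dataclass
class Scene:
    """A Mise-en-Scene complex: the unfolding series of shots in a setting."""
    label: str
    shots: List[MiseEnScene] = field(default_factory=list)


@dataclass
class Sequence:
    """A segment of the film made up of Scenes."""
    label: str
    scenes: List[Scene] = field(default_factory=list)


@dataclass
class FilmPlot:
    """The highest rank: the series of Sequences."""
    title: str
    sequences: List[Sequence] = field(default_factory=list)

    def total_frames(self) -> int:
        # Walk down the ranks and add up the frames of every shot.
        return sum(
            shot.frame_count()
            for seq in self.sequences
            for scene in seq.scenes
            for shot in scene.shots
        )


# Hypothetical labels and frame numbers, not measurements from any film.
interior = Scene("interior", [MiseEnScene(0, 240), MiseEnScene(240, 600)])
exterior = Scene("exterior", [MiseEnScene(600, 960)])
plot = FilmPlot("example", [Sequence("first sequence", [interior, exterior])])
print(plot.total_frames())  # 960
```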
The Mise-en-Scene forms the basic unit for analysis because the major
systems for each metafunction across the semiotic resources are operational
at this rank. For instance, the higher rank of Sequence does not allow
comprehensive analysis of the choices across semiotic resources, while the
lower rank of Frame frozen in time excludes analysis of speech, music and
other sound effects.
Following Baldry (this volume) and Thibault (2000), the soundtrack can
mark a transition, and in the case of the framework presented here, the
transition may take place within one Mise-en-Scene. In effect, this would
create a 'rankshifted' Mise-en-Scene. That is, if the soundtrack changes to
indicate a transition within the single camera shot, we have a Mise-en-Scene
embedded within the ranking Mise-en-Scene of the camera shot. In a similar
manner, the soundtrack can continue across several Mises-en-Scene to form
a Mise-en-Scene complex. The Mise-en-Scene complex is therefore construed by the nature of the setting and other structural elements which
include the soundtrack.
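
One way to picture a 'rankshifted' Mise-en-Scene is to split a single camera shot wherever the soundtrack marks a transition inside it. The sketch below assumes that shots and transitions are both given as frame numbers; the function name and the sample values are hypothetical.

```python
def split_at_transitions(shot, transitions):
    """Split one shot at the soundtrack transitions that fall inside it.

    shot is a (start, end) frame range with an exclusive end. Transitions
    on or outside the shot boundaries are ignored. A shot with no internal
    transition comes back unchanged as a single range.
    """
    start, end = shot
    cuts = sorted(t for t in set(transitions) if start < t < end)
    bounds = [start] + cuts + [end]
    return list(zip(bounds[:-1], bounds[1:]))


# A shot covering frames 100-220 with one soundtrack change at frame 160
# yields two embedded units within the one camera shot.
print(split_at_transitions((100, 220), [160, 300]))
```

The converse case, a soundtrack that continues across several shots, would group adjacent ranges instead of splitting them, and is not covered by this sketch.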
As displayed in Table 5.1, the Mise-en-Scene is analysed according to
Visual Imagery, Speech, Music, Sound Effects and the subsequent Interweaving of the Visual Imagery and the Soundtrack. For Visual Imagery, the
ranks are Movement-Action-Event in a shot, temporal episode, temporal
figure and temporal member. In addition to making dynamic O'Toole's
systems for paintings, further systems are included for the analysis of the
temporal unfolding of the text. At the rank of Mise-en-Scene, these include
systems for: (a) Interpersonal meaning such as Patterns (Kinesic, Proxemic,
Rhythm, Gaze and Shape), Duration of the Image, Speed of Motion and
Point of View; (b) Representational meaning, for example, Movement-Action
Sequence; (c) Logical meaning, for example, Narrative Cause-Effect Relations; and (d) Compositional meaning, for example, Changes in Gestalt, On-Screen/Off-Screen Space, Camera Angle, Camera Level, Camera Distance
and Mobile Frame. The Mobile Frame allows changes in the camera position
in the Mise-en-Scene. The Mobile Frame thus interpersonally orients the
viewer towards the image and furthermore contributes to the representational meaning in the form of the Point of View constructed within the film.
The analysis described below is concerned with the visual imagery in two
Mises-en-Scene from Chinatown. As the goal of this exercise is to demonstrate
the usefulness of the Textual application of the visual grammar, the discussion is only concerned with selected choices in systems for Interpersonal and
compositional meaning. The original analysis appears in the form of a movie
where choices from the visual systems are marked on the digitalized film clip
from Chinatown as they unfold in real time. We may note that, compositionally, the Framing in Chinatown (which may be marked visually) is widescreen
with ratio 16:9. This allows the action sequences to be framed against an
expansive setting which contributes to establishing one of the key themes of
Chinatown: Los Angeles in a drought.
The analysis of two Mises-en-Scene in Chinatown
The first Mise-en-Scene occurs at the end of the Scene where Jake and
Evelyn meet in a restaurant. Jake is largely unsuccessful in his attempts to get
further information from Evelyn, and in the ensuing Mise-en-Scene outside
the restaurant, a somewhat angry and frustrated Jake informs Evelyn that
her husband may have been murdered. In the newly released 1999 DVD
version of Chinatown, director Roman Polanski states that this scene outside
the restaurant is one of his favourite shots. We shall soon appreciate at least
some of the reasons why Polanski thought this way about this part of the
film. The dialogue which takes place outside the restaurant is reproduced
below.
Key: EM: Evelyn Mulwray; JG: Jake Gittes

EM: Oh no ... I have my own car. Ahh ... the Packard.
JG: Wait a minute sonny [to the car attendant]. I think you [Evelyn] had better
    come with me
EM: But why. There's nothing more to say. Will you get my car please [to the
    attendant].
JG: Okay go home. But in case you're interested, your husband was murdered.
    Somebody's been dumping thousands of tonnes of water from the city's
    reservoirs and we are supposed to be in the middle of a drought. He found
    out about it and he was killed. There's a waterlogged drunk in the morgue -
    involuntary manslaughter if anyone wants to take the trouble which they
    don't. It seems like half the city is trying to cover it all up which is fine by me.
    But Mrs Mulwray. I goddamn near lost my nose and I like it. I like breathing
    through it. And I still think that you're hiding something
EM: Mr Gittes [as JG drives away]

The restaurant Mise-en-Scene

The viewer's perception is attuned to difference rather than prolonged stimuli, and attention is typically focused through contrasting patterns and
movement. However, in the selected Mise-en-Scene which occurs at the end
of the restaurant Scene, the camera focuses on Evelyn (pictured from the
shoulder upwards) who is silent and virtually motionless. Kinesics and
Rhythm through movement are absent. What functions to make this Mise-en-Scene so compelling? Through the analysis, we see that there are many
simultaneous choices at work which focus the viewer's attention on this
portrayal of Evelyn as the 'spider woman'.
The Lighting Quality, Lighting Intensity, Lighting Direction and Lighting
Source in the restaurant scene function to make Evelyn visually salient. The
soft background Lighting may be marked visually through the use of the
special effect 'lens flare' which allows the light source to be highlighted. As
well as providing a contrast for the next Mise-en-Scene, the choice of the
warm reddish colours from the system of Colour Cohesion/Contrast has implications for more immediate Interpersonal and Experiential meanings as we
shall soon see.
At the rank of Member, the Clarity and Focus of Evelyn's beautiful, pale
and sculptured face attracts the viewer's attention. Further to this, Evelyn's

Table 5.1

Functions and systems in the Mise-en-Scene

Semiotic Resources/Rank

Modal

Representational

Logical

C ompositional

MISE-EN-SCENE
COMPLEX

Contrasts

Narrative continuity and


discontinuity

Cause-effect relations

Continuity and
discontinuity

Patterns:
Kinesic
Proxemic
Rhythm
Gaze
Shape
Colours and Contrast
Lighting Quality
Light Intensity
Lighting Direction
Lighting Source
Clarity
Focus
Film Tonality
Special Effects
Duration of Image
Speed of Motion
Point of View (Viewer)

Movement-Action-Event
Sequence
Figures/Objects
Nature of Scene
Props
Lighting Colour
Narrative as Cause
Effect
Relations
Point of View
Visual Motifs

Narrative CauseEffect Relations

Frame Dimension
Frame Shape
Changes in Gestalt:
Framing
Horizontal
Vertical
Diagonal
Colour Cohesion/
Contrast
Perspective Relations
On-Screen/Off-Screen
Space
Camera Angle
Camera Level
Camera Distance
Mobile Frame
Film Editing

(the edited scene)

MISE-EN-SCENE
The Temporal-Spatial Frame
Complex Relation: The Shot
Visual Imagery
Movement-Action-Event in a Shot

Semiotic Resources/Rank

Modal

Representational

Logical

Compositional

Temporal Episode

Relation to MovementAction-Event:
Scale
Depth
Centrality
Relative Prominence
Duration
Clarity
Focus
Light

Sequence of Sub- Actions,


Side Sequences and
Events
Interplay of Actions

Contribution to
Narrative
Cause-Effect
Relations

Relative Relation of
Action in Changing
Gestalt
Subframing
Parallelism and
Opposition
Relative On-Screen/OffScreen Space
Camera Angle
Camera Level
Camera Distance

Temporal Figure

Colour Coordination/
Contrast
Colour Intensity
Costume Style
Frontal View
Change in Size
Change in Prominence
Gaze Pattern
Focus
Depth
Light

Character of Figure
Costume
Body Behaviour/Gesture
Props

Contribution to
Cause-Effect
Relations through
Intertextual
Motif

Relative Position in
Changing Gestalt
Subframing
Parallelism and
Opposition
Relative On-Screen/OffScreen Space
Camera AngleCamera Level
Camera Distance

Temporal Member

Colour
Colour Intensity
Style of Costume Part
Makeup
Facial Expression

Body Part
Makeup
Facial Expression
Gesture
Role in action

Contribution to
Cause-Effect
Relations through
Intertextual
Motif

Relative Position in
Changing Gestalt
Subframing

Semiotic Resources/Rank

Modal

Representational

Logical

Compositional
Parallelism and
Opposition
Relative on-screen/offScreen space
Camera level and angle
Camera Distance

Gesture
Light
Change in Size
Change in Prominence
Focus
Depth
Soundtrack
Speech

Negotiation
Speech Function
Mood
Modality
Polarity
Attitude
Comment
Appraisal
Lexical 'Register'
Tone
Pitch
Volume

Ideation
Transitivity
Tense
Lexical Content
Ergativity
Verbal Motifs

Conjunction and
Continuity
Logico-Semantic
Relations

Identification
Theme
Cohesion
Information

Music

Volume
Pitch
Timbre
Rhythm
Fidelity
Beat

Genre:
Experiential Context
Intertextuality
Musical Motifs

Narrative Cause-Effect
Relations

Sound Perspective
(Diegetic, Non-Diegetic)

Semiotic Resources/Rank

Modal

Representational

Logical

C ompositional

Sound Effects

Volume
Pitch
Timbre
Rhythm
Fidelity
Beat

Experiential Content
Intertextuality
Oral Motif

Narrative Cause-Effect
Relations

Sound Perspective Diegetic and NonDiegetic

Modal: Direction of Engagement through Foregrounded Semiotic Choice; Change in Phase Marking
Representational: Development of the Narrative Plot for Story Line through Directed Content Input
Logical: Development of Cause-Effect Relations
Compositional: Organization of the Unfolding of the Narrative

Visual Imagery + Soundtrack: Interweaving Visual Imagery and Sound

Frame
24 Frames/Second

Viewed as Mise-en-Scene


Gaze towards Jake (which may also be marked visually through vectors) is
oblique and so the viewer can openly scrutinize her face, Makeup and Costume throughout the extended Duration of the Image. After her husband's
funeral, Evelyn is wearing a black dress and a hat with a netted black veil
which covers the top half of her face. Her Gaze in effect is veiled. Jake
comments in the next Mise-en-Scene, 'And I still think you are hiding something'. Here the motif of distorted vision is reinforced. In this case, Jake is
not gazing through a camera or car mirror, rather he is trying to penetrate
the protective veil through which Evelyn views the world.
The use of Colour in the restaurant scene is significant for several reasons.
Digital colour matching (which can be displayed) reveals that Evelyn's red
lipstick exactly matches the colour of the couch upon which she is seated.
The motif of sexuality is represented through this use of the colour red in
Evelyn's makeup which coheres with the intimate setting. The characterization of Evelyn as the 'spider woman' is thus created; she is veiled, oblique,
sexual and potentially dangerous. This portrayal of Evelyn largely remains
in place until the final scenes in the movie.
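
The colour-matching step mentioned above can be illustrated with a short sketch. The chapter does not say which software or colour model was used, so everything here is an assumption: colours are taken to be sampled (R, G, B) triples, the sample values are invented for illustration, and a 'match' is defined as a Euclidean distance under a chosen tolerance.

```python
import math


def colour_distance(a, b):
    """Euclidean distance between two (R, G, B) triples on a 0-255 scale."""
    return math.dist(a, b)


def colours_match(a, b, tolerance=10.0):
    """Treat two colours as matching if their distance is within `tolerance`.

    The tolerance is an arbitrary illustrative threshold, not a standard.
    """
    return colour_distance(a, b) <= tolerance


# Hypothetical pixel samples; these are not measurements from the film.
lipstick = (178, 34, 34)
couch = (180, 36, 30)
street_wall = (230, 200, 90)

print(colours_match(lipstick, couch))        # close reds
print(colours_match(lipstick, street_wall))  # red against bright yellow
```

A perceptual colour space would probably be a better basis for such comparisons than raw RGB; the sketch only shows the principle of testing sampled colours for cohesion or contrast.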
The street Mise-en-Scene

In the next Mise-en-Scene the viewer is confronted with a bright street scene
as Evelyn and Jake walk into the open glare of sunlight outside the restaurant. Compositionally, the contrast in Colour Cohesion/Contrast may be displayed through the use of colour matching and replacement. The analyst
becomes conscious that the dominant background colour of bright yellow
has replaced the subdued colours in the restaurant. The dark quiet world of
the spider woman is contrasted to the stark brightness of the street where
sunlight shines against the buildings and normal day-to-day activity takes
place as the attendant rushes to open the car door for Jake and Evelyn.
Through the use of overlays and drawing tools to mark the perspective
and the placement of the Figures in the Mise-en-Scene, it may be appreciated
that the On-Screen Space initially occupied by the attendant works perfectly
in conjunction with the perspective provided by the buildings. Activities are
ordinary, orderly and public in this Mise-en-Scene where the sound of car
horns is heard and people walk down the street arm in arm.
Jake and Evelyn become the focus of attention as they walk out onto
the street. They continue to occupy the central On-Screen Space in the
remainder of the Mise-en-Scene, and a dynamic visual tracing of the outline
of their two Figures reveals the perfect compositional balance that is achieved
within the widescreen frame format. The Colour Contrast provided by the
bright background also functions to highlight the figures of Evelyn and Jake.
The effect of the light provided by the sun may be marked visually through
the use of 'lens flare' to insert an accentuated light source. The analyst again
becomes aware that the motif of hot dry weather is invoked.
As the Mise-en-Scene unfolds, the exchange between the two central
characters becomes increasingly intense as Jake responds with frustration to
his lack of understanding of the situation. The intense gaze between Jake
and Evelyn, which accompanies her refusal of his offer to drive her home,
may be indicated visually by vectors. The On-Screen Space dominated by
Evelyn and Jake continues to remain perfectly balanced, and the analyst can
begin to appreciate how effectively the camera work and background setting
function in this Mise-en-Scene. In addition, there is a lightly coloured bandage on Jake's nose which is marked with visual prominence despite its
cohesiveness with the background colours. This visual prominence of the
bandage is matched by the linguistic choices in the dialogue which takes
place, as we shall see in a moment.
The triangle of social relationships between Jake, Evelyn and the car
attendant is construed visually as well as linguistically. The attendant is a
minor participant as indicated by his backgrounded physical position in the
Movement-Action-Event when Jake and Evelyn walk out of the restaurant.
Jake's use of the vocative 'sonny' in the command 'Wait a minute sonny'
reinforces this position. Jake's attempts at exercising power over Evelyn,
however, do not succeed.
Jake fails in his bid to drive Evelyn home, and there is a pause before he
turns to confront her. Evelyn remains detached and supposedly nonchalant
by focusing her Gaze on her gloves, which may be indicated visually by line
vectors. Evelyn's hand movements may also be highlighted visually to indicate Gesture. After a short silence, the Interpersonal relations between Jake
and Evelyn intensify. The Gaze becomes direct and focused as the Proxemics,
which may be displayed by visual vectors, decrease. The Mobile Frame has
been brought into play so that the Camera Distance is decreased. This compositional strategy further draws the viewer into the exchange between Jake
and Evelyn. The Interpersonal intensity of Jake's delivery continues as he
explains that Evelyn's husband was murdered. Evelyn's Gaze, which again
may be marked by visual vectors, shifts downwards as Jake refers to her late
husband. Jake, however, continues regardless of Evelyn's silent response.
When Jake refers to a situation where he was physically attacked and his
nose sliced by a knife [hence the bandage], 'but Mrs Mulwray I goddamn
near lost my nose', the Interpersonal intensity of the exchange increases.
The use of vectors may explicitly demonstrate how distance in the Proxemics has again decreased with a resulting increase in the intensity of gaze. In
addition, Jake's use of 'goddamn near' reinforces the affect of his speech to
Evelyn, which is somewhat mocking given that he addresses her as
'Mrs Mulwray'.
The climax in this Mise-en-Scene is reached when Jake accuses Evelyn of
'hiding something'. Here the motif of the truth seeker looking through a veil
of deception is reinforced. While he is correct that Evelyn is withholding
information, it is not exactly the sort that Jake envisages. However, in the
remainder of the street scene, Roman Polanski allows the viewer to gain
some insight into Evelyn's situation.
The final frames of the Mise-en-Scene capture one of the rare moments
in Chinatown where the Point of View switches from Jake to Evelyn. The
viewer is aware of Evelyn's latent appeal to Jake ('Mr Gittes') as he drives
away. Evelyn maintains her position within the Frame but the Mobile
Camera effectively retreats to leave Evelyn pictured completely alone in the
street scene. The appeal is reinforced through Evelyn's downcast Gaze and
Gesture of moving her hand to her throat.
At this stage, the viewer gains an understanding of Evelyn's efforts at self-control. With her eyes temporarily closed, the absence of Gaze and the
continuing Gesture are made salient through the Duration of the Image and
the Framing of Evelyn within the street scene. Jake's departing car is the
only Temporal Episode in relation to Evelyn's Movement-Action-Event. A
somewhat resolute Evelyn opens her eyes with a straight gaze realized as a
horizontal vector as her car is reversed by the attendant. In the final
frames of this Mise-en-Scene, Evelyn has again opened her eyes to a world
which does not understand her position nor the reasons for her actions.

Conclusion
This necessarily incomplete description of the analysis of two Mises-en-Scene from Chinatown seeks to describe how a visual grammar may be
applied to the dynamic visual image. In the discourse analysis of a linguistic
text, the analyst directly engages with the linguistic choices which have been
made in order to interpret the text. In a similar manner, the description of
this analysis seeks to demonstrate the effectiveness of directly engaging visually with a Mise-en-Scene to make salient the choices which have been made.
Through such an analysis, we start to appreciate the reasons why director
Roman Polanski favoured this particular scene in Chinatown.
The bright public street setting marks a stark transition from the intimate
restaurant scene where Evelyn's sexuality is marked. The compositional
aspects of the narrow street setting are perfect; the actors are framed
through perspective, on-screen space, colour cohesion and contrast. The
yellow tones of the background setting with light and shadows provided by
the sun, the buildings and other lighting effects further enhance the visual
salience of the two actors in the setting. The camera moves in to record the
growing intensity of the exchange between Jake and Evelyn against a backdrop of day-to-day life which continues despite the drama being played out
before the viewer's eyes. Through the use of gaze, gesture and proxemics the
visual aspects of the interaction effectively construct Jake's growing frustration and anger with Evelyn in his search for truth. The camera later lingers
to capture a subtle shift in the point of view where the unenviable position
of Evelyn is signalled to the viewer. Jake's arrogance transforms her perceived strength into a web of deceit and corruption which rightfully should
be attributed to her father.
Roman Polanski ensured that the usual generic conventions were not followed in the movie Chinatown. In Robert Towne's original script, Evelyn is
saved and her father exposed. Thus the usual generic tropes such as 'love
triumphs' and 'youth defeats old age' and 'corruption resulting in a new
healthier social order' are eliminated by Polanski. As Heisner (1997:63) states,
in Chinatown 'Evil and power have triumphed, corruption has won out'. As
Heisner further explains, the pessimism of the ending extends beyond Jake's
cynicism. In the final line in the film Jake is told ' "Forget it Jake. It's Chinatown. It's Noah Cross. It's the power structure. It's the world"' (ibid.: 64).
The proposed methodology for analyzing dynamic visual images, however, presents a range of difficulties. First, it proved near impossible to
simultaneously record dynamically the metafunctional choices across the
different semiotic systems, even in the case when each metafunction is considered separately. The reason is twofold: first, the complexity and range of
systems from which options are chosen; and second, the problem of the
temporal unfolding of those choices in real time.
In the first case, visually marking semiotic choices across a range of systems
for one metafunction proves problematic. For example, recording on-screen
space for compositional meaning precluded including choices for colour
cohesion and contrast because the resulting footage became too dense and
confused. In a similar manner, choices from Interpersonal systems such as
lighting and colour could not be combined with the analysis of gaze and
proxemics. This situation gave rise to the second problem. In attempts to
combine the metafunctionally based analysis in real time, the temporal
unfolding of the resultant footage for each metafunction was too fast for the
viewer to grasp the significance of the different aspects of the analysis. It
becomes apparent that we perceive so much visual data in a short time span
that it is impossible to mark this visually in real time. If the analysis for all
four metafunctions were recombined in the footage, the problem would be
exacerbated.
In order to overcome the difficulties described above, it is suggested that
the analysis for one system should be documented and the shifts annotated
within a system such as the MCA. After the analysis for each system has
been entered, the resulting footage could be recombined to mark salient
transition points which occur as the result of the conflation of choices across
the systems. These higher-level transition points could also be recorded in a
database format.
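
The suggested workflow can be sketched in code. This is a minimal illustration under stated assumptions, not the data model of the MCA or of any existing tool: the `Shift` record, its field names, the timings and the one-second window are all hypothetical. Each system is annotated separately as a list of time-stamped shifts, and a higher-level transition point is taken here to be any moment at which shifts from at least two different systems fall within a short window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shift:
    """One annotated change of choice within a single semiotic system."""
    time: float   # seconds from the start of the Mise-en-Scene
    system: str   # e.g. "Gaze", "Proxemics", "Camera Distance"
    choice: str   # the option selected at this point


def transition_points(shifts, window=1.0, min_systems=2):
    """Group shifts lying within `window` seconds of the first shift in a
    group, and keep the groups that involve at least `min_systems` systems."""
    ordered = sorted(shifts, key=lambda s: s.time)
    groups, current = [], []
    for shift in ordered:
        if current and shift.time - current[0].time > window:
            groups.append(current)
            current = []
        current.append(shift)
    if current:
        groups.append(current)
    return [g for g in groups if len({s.system for s in g}) >= min_systems]


# Illustrative annotations only; the timings are invented.
annotations = [
    Shift(12.0, "Gaze", "direct"),
    Shift(12.4, "Proxemics", "decreased"),
    Shift(12.8, "Camera Distance", "decreased"),
    Shift(20.0, "Gaze", "downward"),
]

for group in transition_points(annotations):
    print(group[0].time, sorted(s.system for s in group))
```

Keeping each system's annotations in a flat, time-stamped form means the per-system passes stay independent, and the conflation step can be rerun with a different window or threshold without re-annotating the footage.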
Despite the difficulties of using a visual grammar to interact directly with
the dynamic visual image, the usefulness of such an approach is that the
analyst becomes sensitized to meaning through choice in visual semiosis. In a
manner analogous to language, the analyst can only become attuned to
metafunctionally based choices if one has in a sense directly engaged with the
text. The advances in computer technology mean that this is becoming a very
real option for our investigation of the dynamics of semiosis in real time.
Notes
1 Despite repeated written requests to Paul Hrisko, the Manager for the Film Clip
Licensing Division for Paramount Studios, copyright permission to reproduce
still frames from the movie containing the analysis of Chinatown was not given. I
am, however, most grateful to Roman Polanski who kindly wrote in support of
my requests for copyright permission.
2 See Visual Communication (Sage Publications), a journal devoted to the theory and
analysis of visual images and multimodal texts.
3 See also Baldry (this volume) for the analysis of car advertisements.
4 See Iedema's (2001) social semiotic framework and analysis of a television
documentary. His framework consists of six levels: Frame, Shot, Scene, Sequence,
Generic Stage and Work as a whole.
Acknowledgements

I would like to thank Michael O'Toole for his kind permission to
reproduce Plate 5.1 from the CD-ROM Engaging with Art (Perth: Murdoch
University, 1999) [copyright Michael O'Toole] with acknowledgement to
the Rijksmuseum of Amsterdam for the original image of Rembrandt's The
Night Watch.
References
Baldry, A. P. (this volume) Phase and transition, type and instance: patterns in media
texts as seen through a multimodal concordancer, 83-108.
Baldry, A. P. (ed.) (2000) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Bordwell, D., and Thompson, K. (2001) Film Art: An Introduction (6th edn). New York:
McGraw Hill.
Callaghan, J. and McDonald, E. (2002) Expression, content and meaning in language and music: an integrated semiotic analysis. In P. McKevitt, S. O'Nuallain
and C. Mulvihill (eds), Language, Vision and Music. Selected papers from the 8th International Workshop on the Cognitive Science of Natural Language Processing, Galway, Ireland,
1999. Advances in Consciousness Research, Volume 35. Amsterdam: Benjamins, 205-220.
Christopher, N. (1997) Somewhere in the Night: Film Noir and the American City. New York:
The Free Press.
Eaton, M. (1997) Chinatown. London: British Film Institute.
Gregory, M. (1995) Generic expectancies and discoursal surprises: John Donne's
The Good Morrow. In P. H. Fries and M. Gregory (eds), Discourse in Society: Systemic-Functional Perspectives. Meaning and Choice in Language: Studies for Michael Halliday.
Norwood, NJ: Ablex, 67-84.
Gregory, M. (2002) Phasal analysis within communication linguistics: two contrastive discourses. In P. Fries, M. Cummings, D. Lockwood and W. Spruiell (eds),
Relations and Functions within and around Language. London and New York: Continuum, 316-345.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Arnold.
Heisner, B. (1997) Production Design in the Contemporary American Film. Jefferson: McFarland.
Hirsch, F. (1981) The Dark Side of the Screen: Film Noir. New York: Da Capo Press.
Iedema, R. (2001) Analysing film and television: a social semiotic account of Hospital: An Unhealthy Business. In T. van Leeuwen and C. Jewitt (eds), Handbook of
Visual Analysis. London: Sage, 183-204.
Kaplan, E. A. (ed.) (1998) Women in Film Noir (rev. edn). London: British Film
Institute.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (2001) Multimodal Discourse: The Modes and Media of
Contemporary Communication. London: Arnold.
Krutnik, F. (1991) In a Lonely Street: Film Noir, Genre and Masculinity. London: Routledge.
Lemke, J. L. (1998a) Metamedia literacy: transforming meanings and media. In D.
Reinking, L. Labbo, M. McKenna and R. Kiefer (eds), Handbook of Literacy
and Technology: Transformations in a Post-Typographic World. Hillsdale, NJ: Erlbaum,
283-301.
Lemke, J. L. (1998b) Multiplying meaning: visual and verbal semiotics in scientific
text. In J. R. Martin and R. Veel (eds), Reading Science: Critical and Functional Perspectives on Discourses of Science. London: Routledge, 87-113.
Lemke, J. L. (2000) Multimedia demands of the scientific curriculum. Linguistics and
Education, 10(3): 247-271.
Lemke, J. L. (2003) Mathematics in the middle: measure, picture, gesture, sign and
word. In M. Anderson, A. Saenz-Ludlow, S. Zellweger and V. Cifarelli (eds),
Educational Perspectives on Mathematics as Semiosis: From Thinking to Interpreting to Knowing. Ottawa: Legas Publishing, 215-234.
McInnes, D. (1998) Attending to the instance: towards a systemic-based dynamic
and responsive analysis of composite performance text. Unpublished Ph.D.
thesis. University of Sydney.
Martinec, R. (2000) Construction of identity in Michael Jackson's 'Jam'. Social
Semiotics, 10(3): 313-329.
O'Halloran, K. L. (2003a) Educational implications of mathematics as a multisemiotic discourse. In M. Anderson, A. Saenz-Ludlow, S. Zellweger, and V. V.
Cifarelli (eds), Educational Perspectives on Mathematics as Semiosis: From Thinking to
Interpreting to Knowing. Ottawa: Legas Publishing, 185-214.
O'Halloran, K. L. (2003b) Intersemiosis in mathematics and science: grammatical
metaphor and semiotic metaphor. In A.-M. Simon-Vandenbergen, M. Taverniers, and L. Ravelli (eds), Grammatical Metaphor: Views from Systemic Functional
Linguistics. Amsterdam: John Benjamins, 337-365.
O'Halloran, K. L. and Judd, K. (2002) Systemics 1.0. [CD-ROM]. Singapore:
Singapore University Press.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
O'Toole, M. (1995) A systemic-functional semiotics of art. In P. H. Fries and M.
Gregory (eds), Discourse in Society: Systemic-Functional Perspectives. Meaning and Choice
in Language: Studies for Michael Halliday. Norwood, NJ: Ablex, 159-179.
O'Toole, M. (1999) Engaging with Art. [CD-ROM]. Perth: Murdoch University.
Palmer, R. B. (1994) Hollywood's Dark Cinema: The American Film Noir. New York:
Twayne Publishers.
Place, J. (1998) Women in film noir. In E. A. Kaplan (ed.), Women in Film Noir (rev.
edn). London: British Film Institute, 47-68.
Stables, K. (1998) The postmodern always rings twice: constructing the femme
fatale in 1990s cinema. In E. A. Kaplan (ed.), Women in Film Noir (rev. edn).
London: British Film Institute, 164-201.
Thibault, P. J. (2000) The multimodal transcription of a television advertisement:
theory and practice. In A. P. Baldry (ed.), Multimodality and Multimediality in the
Distance Learning Age. Campobasso, Italy: Palladino Editore, 311-385.
Towne, R. (1974) Chinatown (R. Polanski, Director; and R. Evans, Producer).
Hollywood, CA: Paramount Studio.
Tuska, J. (1984) Dark Cinema: American Film Noir in Cultural Perspective. Westport, CT:
Greenwood Press.
van Leeuwen, T. (1999) Speech, Music, Sound. London: Macmillan.
Ventola, E., Charles, C. and Kaltenbacher, M. (eds) (forthcoming) Perspectives on
Multimodality. Amsterdam: John Benjamins.
Voytilla, S. (1999) Myth and the Movies: Discovering the Mythic Structure of 50 Unforgettable
Films. CA: Michael Wiese Productions.

Multisemiotic mediation in hypertext

Arthur Kok Kum Chiew


National University of Singapore

Introduction
This paper is an attempt to understand how an institution and its objectives
become translated, transmitted and received through the hypertext
medium. The notion of hypertext is first clarified with the aim of abstracting methodological categories which may be used for a semiotic analysis.
Following this, systemic functional models (Halliday, 1994; Kress and van
Leeuwen, 1996; O'Toole, 1994) are employed to examine the semiotic
choices made within a selected webpage, the Singaporean Ministry of Education (MOE) site,1 in order to examine the meanings produced by these
choices and the context circumscribing this choice-making and meaning
production. The interaction of meanings across different semiotic instantiations also features in this analysis.
Genesis of hypertext
The precedence of verbal over written language in human groups is firmly
acknowledged in conventional histories of writing, with only certain cultures
developing a recording-writing system for reasons of trade, religion or politics (Kress and van Leeuwen, 1996: 18-19). In Euro-American history, the
advent of print technology made recordable texts not only vastly replicable
but also more readily available compared to the past. In this sea of data,
however, information retrieval posed a serious difficulty because texts
remained in an unchangeable linear format.
Early theorists concerned with presenting and retrieving information
envisaged a system for providing complete access to the 'endlessly expansive
world of texts' (Tuman, 1992: 55). The term 'hypertext', coined by Ted
Nelson in the 1960s, was used to refer to a form of electronic text where the
mode of publication was characterized by 'non-sequential writing'; that is,
'text that branches and allows choices to the reader' in the form of 'a series
of text chunks connected by links which offer the reader different pathways'
through an interactive screen interface (Landow, 1997: 3). In the late 1960s,
theory moved towards reality when the Advanced Research Projects Agency
(ARPA) of the Department of Defence in the United States of America set
up ARPANET, an inter-computer communication network which was designed
to be impervious to communication disruptions in the event of a nuclear
attack (Moore, 1994: 4). While it initially connected selected academic
institutions, network technology soon expanded the use of hypertext. Software applications, such as web browsers, were made available to online
computer users, and these combined with other software applications (for
example, word-processing software) so that hypertext could be edited,
updated, copied and, in a word, 'acted' on. Juxtaposed to static and linear
print technology, hypertext became dynamic, alterable and multi-sequential.
Interpretations and applications of hypertext
Espen Aarseth (1997), appreciating the interactive co-partnering between
the reader and the creator during Internet surfing, describes hypertext as
'ergodic, using a term appropriated from physics that derives from the Greek
words ergon and hodos, meaning "work" and "path"'. 'Ergodist' has been
coined to refer to the person who interacts with the hypertext in this way
(Lim, 1998: 31). It is perhaps necessary to discuss what is meant by 'ergodic'
so as to more fully investigate the notion of the 'ergodist'.
'Ergodicity' describes, first, the complexity of path predetermination and,
second, how these paths can either be followed or bypassed, thereby creating
new paths. The former implicates a 'creator' of the path, and the latter a
choice-making individual who is faced with these paths. The 'ergodist' is this
choice-making individual who may follow predetermined paths suggested
by hypertext links which connect one webpage to another, or alternatively,
may forge his or her own path. In moving through hypertext, a complex
tripartite relationship exists between the ergodist, the hypertext and the
hypertext creator. As the next section will show, ergodist acquires its definitional fullness at a particular abstraction of hypertext.
The notion of hypertext has been rethought in various fields of study
including deconstruction, structuralism, post-structuralism, reader-response
theory, narratology, critical literacy (see Landow, 1997), and multiliteracy
(Kress, 2003; Kress and van Leeuwen, 2001; Lemke, 1998; Unsworth,
2001). A reactionary view of hypertext sees it as artificial, a threat to face-to-face or 'real' communication, and a usurper of older communicative technologies such as the nostalgic pen(cil) and paper manuscript.2 On the other
hand, certain grandiose pro-hypertext statements claim hypertext to be an
evolutionary superior that will replace linear writing; that better communication will result simply because multiple interpretations and voices are
linked; and that hypertext will democratize society and education, even
surmounting artificial divisions between the disciplines. These and certain
other hyperbolic construals of hypertext detract from an understanding of
the nature of this new technology and what it can and cannot do for us. I
therefore propose a definition which opens up hypertext to further (multi)semiotic investigation (see also Kress, 2003; Kress and van Leeuwen, 2001;
Lemke, 1998; Unsworth, 2001).


Proposed working definition of hypertext


Consensus seems to place hypertext as a new technology or medium for
communication which allows new dimensions of human interaction hitherto not possible. Indeed, hypertext is a means of communication where
multisemiosis as fact impinges upon the user. From these formulations, I
postulate the following working definition:
Hypertext is a computer supported online telecommunication technology that
makes possible the assembly, retrieval, display and manipulation of texts, which
are realizations of a single semiotic resource or a combination of semiotic
resources, some of which include visual, linguistic, phonic and music.

The crucial qualification 'makes possible' arises for two reasons: first, multisemiotic texts can be assembled by technology other than the hypertext;
second, a whole host of factors can curtail what hypertext affords; for
example, 'secure' websites that can only be accessed by certain knowledgeable people (whether one possesses the password or is an expert hacker),
incompatible or missing software, lack of technical savoir-faire, and so on. On
another note, my definition excludes CD-ROM programs for standalone
computer workstations. These CD-ROMs, while possessing certain hypertext features (such as connected scrollable pages and multimedia), are not
related or potentially relatable to other webpages or software in a larger
connected network of workstations. This exclusion holds until a website is
created for supporting the said CD-ROM program in a web-browser window, in effect, making it relatable to other webpages. One is forced to admit
that technological innovation continues to problematize the notion of
hypertext.
Orders of abstraction of hypertext
With this working definition of hypertext in place, it is now possible to
extract what I perceive to be different orders of abstraction with which one
can talk about hypertext. These orders of abstraction should not, however,
be confused with ranks or levels which are posited for different semiotic
resources. Halliday (1994) proposed such constituent ranks for the linguistic
semiotic. Borrowing this notion of levels, O'Toole (1994) suggests rank
scales for visual art, sculpture and architecture. Here the notion of rank
orders and relates systems of meaning-making across the different
metafunctions in what are essentially theoretical formulations of the
'grammar' of different semiotic resources. As such, the ranks operate within
the confines of the 'text' produced. These ranks become useful when one
seeks to uncover the choices made in instantiations of each of the semiotic
resources.
The orders of abstraction posited here for hypertext are methodological
categories construed to handle this to-date slippery technology. As we shall
soon see, these orders of abstraction are not necessarily related to each other
by constituency. Indeed, the orders of abstraction are different in nature to
the aforementioned semiotic ranks because hypertext is not a semiotic
resource, but a platform for the codeployment of different semiotic
resources. The orders of abstraction proposed for hypertext are ITEM, LEXIA,
CLUSTER and WEB. As these terms require theorization, I start with the lowest
order of abstraction and develop these concepts to the highest or most
inclusive category of hypertext.
Item
An ITEM is any instantiation from any meaning-making system that is supportable by hypertext technology, and to date, these semiotic resources
include the linguistic, visual, music and phonic. The question of what
instantiation(s) count as an ITEM is necessarily preceded by a brief discussion of ranks (in italic font below) in semiotic systems.
A linguistic instantiation such as 'I could fly' is easily identified as a
Clause. In contrast, the instantiation 'Move!' is simultaneously a Clause, a
Verbal Group and a Word. O'Toole (1994: 12) observes the same phenomenon in certain paintings where a Work may simultaneously be an Episode,
a Figure or simply a Member. Ostensibly, ranks within any one semiotic
system are not impermeable to each other. In any one semiotic, an ITEM may
therefore be a number of instantiations of different ranks of the one semiotic combining together as a discernible whole. In multisemiotic texts, an
ITEM could be an instantiation of one semiotic resource, or a combination of
instantiations of different ranks of different semiotic resources joining
together as a methodologically justifiable whole. In this light, ITEM encapsulates this permeability of the ranks within and across semiotic resources.
What are the semiotic choices that contribute to a sign or a complex of
signs being designated as an ITEM? For either linguistic or visual semiosis,
they are the choices made in the Textual or Compositional metafunction
respectively. For a combination between the two resources, factors that
separate one ITEM from another crucially rest on the choices made in the
Compositional metafunction. These Compositional choices include those
from the system of Colour Cohesion, the system of Alignment and the
system of Gestalt: Framing (see Table 6.2). This is not meant, however, to
play down the fact that choices made in the other metafunctions in both
semiotic resources also contribute to the discreteness of a sign or complex of
signs, but that the justification for ITEM rests primarily on choices made in
the Compositional metafunction with regard to the Textual organization
of the typographical/graphical instantiation of the linguistic/visual semiotic choices.
As displayed in Plate 6.1, the order of ITEM could apply to a Word, a
boxed-up Clause(s), an Element of a stylized gust of wind, an Episode of a
man swatting a fly, the Work of an evening skyline serving as a background
graphic, or even a complex of signs.

Plate 6.1 Examples of items

So far I have only concentrated on signs or complexes of signs that are
either linguistic or visual as these make up most of what appears on a
webpage. However, hypertext makes available instantiations from other
semiotic resources as well. Can ITEM be extended to instantiations of the
phonic semiotic resource? Perhaps non-linguistic phonic instantiations
broadcast in hypertext may be designated as an ITEM. These may include a
sound clip such as the Microsoft Startup Window chime that is emitted
when the Microsoft platform is launched on the computer. An ITEM may
also be extended to melodic broadcasts, which again overlap between the
phonic and music semiotic resources. Likewise, perhaps in certain cases of
hypertext where audio recordings of linguistic discourse are broadcast, the
entire broadcast might be grouped as an ITEM. It is apparent that further
work in this direction is needed.

Lexia
The word lexia derives from Roland Barthes (1974: 13-14) and stands for
the scrollable webpage; that is, the 'text composed of blocks of texts' that an
ergodist sees on the computer screen (Landow, 1997: 34). ITEMS, which
include hypertext links, become the constituents that make up a LEXIA. In
practice, LEXIAS can be 'short' or 'long' depending on how many ITEMS are
included and how they are organized. It is at this order of abstraction where
(multi)semiotic realizations are organized in some meaningful way in relation to others. 'Reality' is represented (multi)semiotically, and the ergodist
engages with, and is placed in a particular relation to, what is displayed and
the producers of that display. The relation between LEXIA and ITEM is one of
composition where a LEXIA is made up of ITEMS. Instances of LEXIAS and
ITEMS are in turn realized from choices made in the metafunctional systems
for different semiotic resources.

Cluster
CLUSTER refers to a number of connected LEXIAS due to associations created
via hypertext links. These hypertext links are classified as 'LEXIA internal' as
they are located within the LEXIA itself and serve to 'call-up' another LEXIA
should the ergodist click on it. With hypertext links, one agency (institution,
company, collective or individual) can link its many LEXIAS in such a way as
to suggest (and so limit) the multidirectionality of traversing the LEXIAS that
make up one CLUSTER. The notion of CLUSTER thus overlaps with the notion
of a producer-created path, because it is the producers of particular LEXIAS
who place hypertext links that in turn suggest or determine a pathway or
pathways through the CLUSTER. A CLUSTER can appear discrete from others
by means such as strategic placing of 'Back', 'Forward', 'Back to Homepage'
buttons or even a sidebar with hypertext links to other LEXIAS within the
CLUSTER.
A complication to this order of abstraction may be the fact that a number of LEXIAS associated by one agency via hypertext links can join with or
overlap with others as a result of hypertext links put up by the same agency
or some other. This is not merely a remote possibility but an avenue exploited
by agencies who insert a hypertext link on their own LEXIA that links to a
larger number of associated LEXIAS. Pushed to its logical extreme, this
notion breaks down what is authoritatively the CLUSTER belonging to a particular agency. For example, in December 1999, a hypertext link on the
MOE homepage linked directly to a webpage belonging to the Housing
Development Board of Singapore (HDB), which was in turn linked with a
vast series of LEXIAS that the HDB produced. One asks where the MOE
CLUSTER ends and the HDB counterpart begins? This is precisely the problem of designating CLUSTERS based on agency. The notion of CLUSTER is
thus not concerned with agency per se, but with associations formed via hypertext
links. These links are finite, and a CLUSTER 'rounds off', or starts becoming a
more discrete entity from other CLUSTERS with the termination of links.
While the CLUSTER is constituted by LEXIAS based on internal hypertext
links, these are temporal and changeable, thus making the associations
between LEXIAS transient and mutable. CLUSTER is as such 'virtual' and an
observable disjunction occurs between this order and those of LEXIA
and ITEM.
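The idea that a CLUSTER is bounded by the termination of links, rather than by agency, can be illustrated with a small reachability sketch. It rests on an assumption of this illustration only: that LEXIA internal hypertext links can be listed as a simple map from one LEXIA to the LEXIAS it calls up. All page names below are hypothetical and are not taken from the MOE or HDB websites.

```python
from collections import deque

# Hypothetical link map: each key is a LEXIA, each value lists the LEXIAS
# that its internal hypertext links call up. All names are illustrative.
INTERNAL_LINKS = {
    "moe_home": ["moe_highlights", "moe_corporate", "hdb_home"],
    "moe_highlights": ["moe_home"],
    "moe_corporate": ["moe_home"],
    "hdb_home": ["hdb_flats"],
    "hdb_flats": [],
    "unlinked_page": [],
}


def cluster_from(start, links):
    """Return every LEXIA reachable from `start` by following internal links."""
    seen = {start}
    queue = deque([start])
    while queue:
        lexia = queue.popleft()
        for target in links.get(lexia, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


cluster = cluster_from("moe_home", INTERNAL_LINKS)
print(sorted(cluster))
```

In this toy map the set reached from the hypothetical homepage takes in pages of a second agency, because one link crosses over, while a page that nothing links to stays outside. Editing a single entry of the map changes the result, which is one way of picturing why the associations are transient and mutable.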

Web
WEB is the number of LEXIAS associable through hypertext links and other
facilities internal and external to a LEXIA. Facilities that are LEXIA internal
(but are not hypertext links) include search engines situated within a LEXIA,
while LEXIA external facilities are those provided, for example, by the web-browser software. These appear on the web-browser window and include
the 'Forward', 'Back' and 'Home' buttons among other options. LEXIA
external facilities also include the hardware, or the cable connections
between computers. This notion of WEB thus includes LEXIAS potentially
relatable to each other by Local Area Networks (LANs), such as Ethernet,
that join sets of machines within an institution or a part of one and also
Wide Area Networks (WANs) that join multiple organizations in widely
spread geographical locations. The contemporary terms 'Internet' and
'World Wide Web' capture what is believed to be an increased global connectivity since LANs and WANs. WEB therefore characterizes the varying
degrees of associations as well as the different means of forming associations
between LEXIAS and CLUSTERS.
The range of facilities, both LEXIA internal and external, makes potentially
all LEXIAS accessible and traversable. Perhaps here is where hyperbolic statements about hypertext's infinitude arise. In reality, however, even if all
LEXIAS that comprise a WEB were made freely accessible, they are still a
finite number, and, furthermore, restrictions limit access to particular sites.
For example, certain 'secure' websites such as private online email accounts
are accessible only to the person with the password or technical expertise to
bypass the password requirement. WEB as an order of abstraction is what
embraces all potential associations via devices that establish links originally
internal or external to LEXIAS. Actual reading practice, therefore, is an
interaction between the two notions of CLUSTER and WEB, where an ergodist's route occurs either within or without the routes made by the producer
of the websites.
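The interaction between CLUSTER and WEB in actual reading practice can be pictured with a short sketch. It assumes, purely for illustration, that producer-placed links are recorded as pairs of LEXIAS and that an ergodist's route is a simple sequence of LEXIAS visited; the page names and the two labels are hypothetical conveniences, not part of the framework itself.

```python
# Hypothetical producer-placed links (LEXIA internal hypertext links).
PRODUCER_LINKS = {
    ("home", "highlights"),
    ("highlights", "admissions"),
    ("home", "corporate"),
}


def classify_route(route, producer_links):
    """Label each step of a route as following a producer link or not."""
    steps = []
    for source, target in zip(route, route[1:]):
        if (source, target) in producer_links:
            # The step stays within a pathway suggested by the producer.
            steps.append((source, target, "CLUSTER"))
        else:
            # The step relies on some other facility, e.g. a search engine,
            # a typed address or a web-browser button.
            steps.append((source, target, "WEB"))
    return steps


route = ["home", "highlights", "admissions", "corporate"]
for step in classify_route(route, PRODUCER_LINKS):
    print(step)
```

Here the first two steps follow links placed by the producer, while the last step has no such link behind it and so must have been made by other means.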
Orders of hypertext for semiotic mediation and analysis
Much effort has gone into clarifying the notion of hypertext because the
objective of this paper is to give an account of how semiotic resources are
co-deployed in hypertext. With the ordering of hypertext into different
abstractions, it becomes clear that it is at the categories of ITEM and LEXIA
where multisemiosis, or the realization of different semiotic resources,
occurs, and hence where multisemiosis as a fact impinges on the ergodist. It
is at these two abstractions of hypertext where multisemiotic analysis can
meaningfully operate. This is demonstrated in the following section which
focuses on the analysis of the MOE homepage.
Context for the construction of the MOE homepage
Before the semiotic analysis proper, it is essential to include a brief consideration of webpages in relation to semiotic resources and the context of
situation and culture. This relationship is represented diagrammatically in
Figure 6.1.
Both processes of realization and instantiation imply a dialectic activation
to the right and below, and construal to the left and above. For example,
culture activates the use of semiotic resources while choices from the systems
of different semiotic resources construe culture. Likewise, culture activates
situation while situation construes culture. A certain complexity enters into
this relationship, however, when one appreciates that culture is not monolithic, that situations deriving from culture are not uniform and consequently LEXIAS are not entirely identical. Context, constituted by culture
and situation, thus needs to be appreciated as multidimensional.

Figure 6.1 Relation of culture, situation, semiotic resources and lexia (adapted
from Halliday, 1991)

Nonetheless, a particularization of the aspects of a context is useful for
uncovering the circumstances under which a webpage is produced. For
exemplification, I examine the MOE homepage 'frozen' at 7 January 2000,
scaled down and reproduced in Plate 6.2. Note the analysis refers to the
actual size of what is seen onscreen.
With the MOE homepage in view, one aspect of context is the sociopolitical climate constructed by the current ruling party in Singapore, the
People's Action Party (PAP). Through the years, the PAP has selectively
identified and communicated to the local population concerns over
Singapore's lack of natural resources, relative geographical smallness, heterogeneous population and proximity to nations predominantly Malay-Islamic. With this 'crisis narrative', as some have called it, the government
offers economic survival among others as a solution-goal under which to
unify and direct Singaporeans (Heng and Devan, 1995).
Unsurprisingly, the use of hypertext has been discursively predicated on
this larger concern of economic survival in the 'Information Age'. For
example, in the Singaporean newspaper The Straits Times (9 February, 2000)
an editorial entitled 'Internet a driving force' claims that:
In Singapore, where the need to be a communications hub is, if anything, more
acute than it is in places less dependent on global economy, connectivity is not a
slogan. It is a simple pointed imperative. Companies and employees must take
seriously the Government's call for workers to upgrade their skills to find a place
in the new knowledge economy.

Plate 6.2 Lexia of the MOE homepage

As the educational arm of the PAP, the MOE works with such an end in
mind. In a public release, in the section entitled 'Cornerstone of education
policy', the MOE reveals that one of its chief foci is 'the development of
human resources to meet Singapore's need for an educated and skilled
workforce' (Ministry of Education, Singapore, 2000). Out of this context
construed by the PAP and, more specifically, the MOE, the homepage
under consideration is erected.
Another configuration of context, comprising the production norms for
webpages, forms a necessary second step to contextualize the MOE
homepage. LEXIAS can be constructed for a range of purposes. One such
purpose is the display of information. Webpages that only serve this purpose
emerge as 'content heavy'. Other webpages are used for administrative
purposes such as gathering feedback and so possess features whereby the
ergodist can 'enter' whatever he or she wishes. A particular type of webpage
serves the function of welcoming and introducing the ergodist to a series of
linked webpages. Such a webpage is commonly referred to as the
'homepage', since it is held to be the locus point to all the other linked
webpages. Apart from welcoming and introducing the ergodist, homepages
may also serve as an index of varying degrees by having visible hypertext
links to the linked webpages.
The norms associated with a homepage provide an insight into one
aspect of the context that produces it. Most homepages have the generic
layout of masthead in the topmost position with various texts and hypertext links beneath. This layout is generally adopted by commercial and
institutional organizations perhaps because apart from welcoming and
introducing, it foregrounds the corporate identity behind the website. With
the identity of the 'seller' disclosed, the ergodist as consumer may in 'good
faith' accept the material goods, services or information proffered by the
website. Nonetheless, some websites do play with the rigid style of presentation or depart from it altogether to increase their engagement with the
ergodist. This is done either by experimenting with the different semiotic
resources in the hypertext environment or communicating in novel
ways through uniquely hypertext facilities to create a greater sense of
dynamism and unpredictability. For example, homepages may flout convention by duplicating and relocating the masthead vertically at the sides of the
webpage, and such columns of words may flash alternative colours
sequentially.
Whatever the case may be, the purposes served by a homepage are circumscribed by situational and cultural demands of context. Context thus
stands as a necessary preface to any semiotic analysis. With this in mind, one
may enter into an exploration of the semiotic choices and hypertext facilities
employed by the MOE homepage.

Semiotic analysis of the MOE homepage


The semiotic account of the MOE homepage tackles many intersecting
questions: Why are certain semiotic choices made? How do these semiotic
choices work together to give meaning? What meanings are conveyed and
for what purpose? In effect, the following analysis works towards the central
concern of this paper: to explicate the complex question of why the webpage comes to be written in the way it is. Such an analysis in turn necessitates an account of the interaction of meanings between instantiations
of different semiotic resources, and this is explored in the next section of this
paper.
The semiotic examination of the meanings put forth by the MOE
homepage is systematized first at the order of LEXIA followed by the order of
ITEM. Such an analysis relies on the ranked functional systems for linguistic
and visual semiotic resources posited respectively by Halliday (1994) and
O'Toole (1994). Tables 6.1 and 6.2 provide a sketch of these ranked functional systems for both the linguistic and the visual semiotic.
Because Tables 6.1 and 6.2 are essentially 'unfinished' maps, the systems
are to a certain degree open-ended, implying that a greater level of analytical delicacy is always possible. Out of these posited systems, choices are
simultaneously made to produce particular instantiations. Additionally, systems across ranks may also work together for any one instantiation. To
capture this complexity, semiotic choices discussed in this analysis are presented in terms of 'selection expressions'. These expressions use the systems
available in Tables 6.1 and 6.2 as 'entry points', and these are worked to
whatever level of delicacy is needed (see Hasan, 1996 for a detailed presentation of selection expression and entry points). All references to these entry
points in Tables 6.1 and 6.2 are henceforth in plain text with the initial letter
capitalized (for example Focus: Perspective), while those in italics represent
my more delicate contribution (for example, Gestalt: Framing: Bordering).
Pertaining to these selection expressions, there are several things to note: the
left-most element is the entry point for the discussion; colons precede a more
delicate choice in relation to the preceding element; and semi-colons
distinguish elements of the same level of delicacy.
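The notational conventions just listed can be made concrete with a small parsing sketch. It is only an illustration under a simplifying assumption: each colon is taken to open one further level of delicacy, and semi-colons are taken to separate choices within that level. The function name is hypothetical, and more intricate expressions may need a richer treatment than this flat reading allows.

```python
def parse_selection_expression(expression):
    """Split a selection expression into levels of increasing delicacy.

    The first level holds the entry point; each later level is the list
    of choices that share that level of delicacy.
    """
    levels = []
    for part in expression.split(":"):
        choices = [choice.strip() for choice in part.split(";") if choice.strip()]
        if choices:
            levels.append(choices)
    return levels


print(parse_selection_expression("Gestalt: Framing: Bordering"))
print(parse_selection_expression("Contrast and Conflict: Colour; Scale; Light; Line"))
```

Read this way, 'Gestalt: Framing: Bordering' yields three successive levels with one choice each, whereas 'Contrast and Conflict: Colour; Scale; Light; Line' yields an entry point followed by a single level holding four choices of equal delicacy.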
Analysis at order of lexia
Modal and compositional choices
Because Modal choices are very intimately related to Compositional choices,
a discussion of the former cannot avoid invoking the latter. A quick survey
of the MOE homepage gives an impression of five sections represented in
Plate 6.3. These divisions are strongly suggested by the Modal choices from
the system of Scale to Whole, the system of Contrast and Conflict: Colour;
Scale; Light; Line and the system of Relative Prominence. These choices
may be more usefully explicated by complementary Compositional choices.

Table 6.1 Halliday's functional systems for language (adapted from O'Toole, 1999)
Function / Rank: Ideational (Experiential | Logical) | Interpersonal | Textual

Rank: Clause
  Experiential: Transitivity | Types Of Process, Participants and Circumstances | (Identity Clauses) | (Things, Facts and Reports)
  Logical: Condition | Addition | Report | Polarity
  Interpersonal: Mood | Types Of Speech Function | Modality | (The Wh-Function)
  Textual: Theme | Types Of Message | (Identity As Text Relation) | (Identification, Predication, Reference, Substitution)

Rank: Verbal Group
  Experiential: Tense | (Verb Classes)
  Logical: Catenation | Secondary Tense
  Interpersonal: Person | ('Marked' Options)
  Textual: Voice | ('Contrastive' Options)

Rank: Nominal Group
  Experiential: Modification | Epithet Function | Enumeration | (Noun Classes) | (Adjective Classes)
  Logical: Classification | Sub-Modification
  Interpersonal: Attitude | Attitudinal Modifiers | Intensifiers
  Textual: Deixis | Determiners | 'Phoric' Elements | (Qualifiers) | (Definite Articles)

Rank: Adverbial (incl. Prepositional) Group
  Experiential: 'Minor Processes' | Prepositional Relations | (Classes Of Circumstantial Adjunct)
  Logical: Narrowing | Sub-Modification
  Interpersonal: Comment | (Classes Of Comments Adjunct)
  Textual: Conjunction | (Classes Of Discourse Adjunct)

Rank: Word (incl. Lexical item)
  Experiential: Lexical 'Content' | (Taxonomic Organization Of Vocabulary)
  Logical: Compounding | Derivation
  Interpersonal: Lexical 'Register' | (Expressive Words) | (Stylistic Organization Of Vocabulary)
  Textual: Collocation | (Collocational Organization Of Vocabulary)

These include Relative Position In Gestalt, In Episode And To Each Other:
Proximate, Gestalt: Framing: Bordering (for example, those borders under the
masthead and below 'Corporate Information') and groupings of recognizably similar instantiations under headings in Stylization: Font: Font Style:

Table 6.2 Ranked functional systems for the visual semiotic (adapted from O'Toole, 1999)
Function / Unit: Representational | Modal | Compositional

Unit: Work
  Representational: Actions, Events | Agents, Patients, Goals | Scenes, Settings, Features | Portrayals, Sitters | Narrative Themes | Interplay Of Episodes
  Modal: Focus: Perspective | Clarity | Light | Colour | Scale | Volume | Gaze: 'Eyework' | 'Paths' | 'Rhythms' | Intermediaries | Frame | 'Weight' | Modality: Fantasy | Irony | Authenticity | Symbolism | Omission | Intertextuality
  Compositional: Gestalt: Framing | Horizontals | Verticals | Diagonals | Proportion | Line | Rhythm | Geometric Forms | Colour Cohesion | 'Theme'

Unit: Episode
  Representational: Groups And Sub-Actions, Scenes, Portrayals | (Side Sequence) | Interplay Of Actions
  Modal: Scale To Whole | Centrality To Whole | Relative Prominence | Interplay Of Modalities
  Compositional: Relative Position In Gestalt And To Each Other | Alignment | Coherence | Interplay Of Forms

Unit: Figure
  Representational: Character | Act | Stance | Gesture
  Modal: Object | Characterization | Position Relation To Viewer | Gaze | Gesture | Contrast and Conflict: Colour; Scale; Light; Line
  Compositional: Relative Position In Gestalt, In Episode And To Each Other | Parallelism and Opposition | Subframing

Unit: Member
  Representational: Basic Physical Forms: Parts Of Body | Object | Natural Forms | Components
  Modal: Stylization | Attenuation | Chiaroscuro | Synecdoche | Irony
  Compositional: Cohesion: Reference | Parallel | Contrast | Rhythm

Plate 6.3 Sections of the MOE homepage

Bold (such as the ITEMS under 'Web Sites of Interest' and 'Corporate Information'). Because these sections are rectilinear and stacked vertically, the
Gestalt is one that positively suggests stability or negatively an absence of
dynamism (O'Toole, 1994).
The organization of linguistic and visual instantiation of this webpage
reflects a certain trend. If one were to consider the linguistic texts on the
webpage, the selection is Relative Position In Gestalt: Formatting: Left Justified,
meaning strings of words are aligned from the same vertical point of
departure starting from the left. This left justification relates to the reading
practice associated with English texts, which runs from left to right and then to the row below.
Additionally, each of the hypertext links under 'Highlights' and 'Corporate
Information' has a graphic bullet that indicates the start of a 'new point' as
well as a distinct hypertext link. These bullets therefore function to draw the
eye to the right and to signal the intended discreteness of linguistic instantiations. In much the same way, the MOE Shield at the top left corner of the
webpage calls attention to itself while bulleting the 'main point' of the
homepage: the Ministry of Education, Singapore.
More so than in other multisemiotic texts, the 'putting together' or construction of a hypertext involves a heightened awareness of bringing separate elements together in spatial relation to each other. This construction is
fundamentally achieved through Hypertext Markup Language (HTML)
that is used to 'write' computer commands which execute the webpage as
seen on-screen. A source code thus details a particular webpage's HTML
consisting of commands enclosed in pointed brackets, from simple ones such as '<P align=centre>' to more complex ones such as '<TABLE border=0 cellPadding=5
cellSpacing=5 width='101 per cent'>'.
In addition, sequentiality in the source code usually translates to the
actual webpage displayed, as evinced by a simple comparison between the
given source code and the MOE homepage. The HTML of the source code
thus implicates a deliberate writer who is conscious of the spatial ordering of
texts as they appear on a webpage.
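The claim that sequence in the source code carries over to the displayed page can be illustrated by reading a fragment of HTML in order. The sketch below uses the standard Python html.parser module; the fragment itself is hypothetical, written only in the style of the commands quoted above, and is not the actual source code of the MOE homepage.

```python
from html.parser import HTMLParser

# Hypothetical fragment in the style of the tags quoted above; it is not
# the MOE homepage's actual source code.
SOURCE = """
<TABLE border=0 cellPadding=5 cellSpacing=5>
<TR><TD><P align=center>Highlights</P></TD></TR>
<TR><TD><P align=left>Corporate Information</P></TD></TR>
</TABLE>
"""


class TagLister(HTMLParser):
    """Record start tags and text in the order they occur in the source."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        self.tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


lister = TagLister()
lister.feed(SOURCE)
print(lister.tags[0])
print(lister.texts)
```

The texts are recovered in the same order in which they are written in the source, which is also the top-to-bottom order in which a browser would lay out the two table rows.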
Representational choices
The above choices not only underscore the MOE as most salient (and this is
a matter of course since it is the MOE homepage) but they also work in tandem
with Representational choices to construe the MOE's institutional 'face'.
Contextualized with other homepages, the MOE homepage does 'reassure'
with its 'generic' layout of masthead at the top with various texts and hypertext links beneath it. As mentioned in the discussion on context, this layout is
adhered to through a choice in Portrayal to foreground the corporate agency,
and this functions to increase credibility to the end of encouraging the
ergodist to 'buy' what is offered on-screen. In the case of the MOE, it is
information on local education-related issues that is being 'sold'.
Nonetheless, there are websites that play with the rigid style of presentation, or depart from it altogether, to create a greater sense of dynamism and
unpredictability and so increase their engagement with the ergodist. Of
course, a website may go to the extreme and end up deterring the mystified
ergodist. Regardless, the MOE homepage evidently has not experimented
with the semiotic resources nor with hypertext facilities. Consider how the
rigid left to right framework set against the stark-white background makes
the masthead appear as a letterhead. A sense of a printed document, that
'we have it in black and white', thus emerges. Perhaps because the
homepage is contextualized as a government website in an area of such
national preoccupation, in its adherence to the 'standard' layout of
homepages the MOE site has chosen to foreground credibility and background 'playfulness'. The MOE thus forgoes the creativity that different
semiotic resources and hypertext facilities afford, making the website
relatively 'conservative' compared with other webpages. How does such a
webpage act on the reader and what assumptions are embedded in this
representation of the MOE? Such questions are explored following the
more detailed analysis at the order of ITEM.
Scrollability
Before finishing the analysis at the order of LEXIA, one particular hypertext
feature gives cause for further thought. Due to several factors, such as a non-maximized web-browser window or a small monitor display, a LEXIA may
only be presented in part. One facility hypertext opens up is what I call
'scrollability' which determines how the semiotic choices ultimately contact
the ergodist. A deliberately lengthy or wide webpage exploits scrollability
while simultaneously marking it as a feature for the ergodist.
The feature of scrollability has two types: vertical and lateral. As the
default display of webpages is always the topmost and leftmost portion first,
this means that for small displays, the option to scroll laterally arises, in
which case one must always start from the left. The more common case is
the vertical scrolling option, starting always from the top. Noting this default
top-left display, it is not surprising that webpage designers usually situate
what they deem as more important in these 'guaranteed viewing areas'. In
the light of scrollability, the preceding discussion needs re-examination
because even with the largest monitor display presently available and maximization of the web-browser window, the MOE homepage is only fully
'read' by scrolling downwards. The downward scrolling process is reproduced in Plate 6.4.
The initial window rules out all those ITEMS under the heading 'Web Sites
of Interest' and below, ensuring that the already prominent masthead is even
more salient. Ostensibly, the convention of locating the most important
information (in this case the MOE masthead) at the top is a recognition of
the default top-left display.
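The effect of the default top-left display on what the ergodist first encounters can be sketched as a simple geometric check. The sketch assumes, for illustration only, that each ITEM occupies a rectangle measured in pixels from the top-left corner of the LEXIA; the positions, sizes and window dimensions below are hypothetical and are not measurements of the MOE homepage.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    left: int    # distance from the left edge of the LEXIA, in pixels
    top: int     # distance from the top edge of the LEXIA, in pixels
    width: int
    height: int


def initially_visible(items, window_width, window_height):
    """Return names of ITEMS lying wholly inside the unscrolled window."""
    visible = []
    for item in items:
        fits_across = item.left + item.width <= window_width
        fits_down = item.top + item.height <= window_height
        if fits_across and fits_down:
            visible.append(item.name)
    return visible


# Hypothetical layout and window size, not measurements of the MOE homepage.
layout = [
    Item("masthead", 0, 0, 760, 90),
    Item("highlights", 0, 100, 760, 300),
    Item("web sites of interest", 0, 420, 760, 200),
    Item("corporate information", 0, 640, 760, 250),
]
print(initially_visible(layout, window_width=800, window_height=450))
```

With these invented figures only the masthead and the first section fall wholly within the 'guaranteed viewing area'; everything placed lower becomes available only once the ergodist scrolls downwards.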
What is deemed most significant is situated at the said guaranteed viewing
areas with the rest arranged in a descending sequence according to import.
This overall arrangement contributes significantly to how the ITEMS
are read. The questions of how the MOE homepage acts on the reader and
what assumptions are embedded in this representation of the MOE may
now be more fully explored in an analysis at the order of ITEM.

Plate 6.4 Scrolling sequence of the MOE homepage

Analysis at order of item

Working through a reading path


The following brief look at the webpage's ITEMS works through a 'reading
path', a notion which relies on the assumption that 'all forms of semiosis are
read syntagmatically' against the patterned whole of the text (O'Halloran,
1999: 322-324). Whenever a new LEXIA is displayed on-screen, therefore,
some ITEM/S will arrest or compete for the attention of the ergodist. This
starting point through which the ergodist 'enters' the LEXIA is what Mario
Garcia terms the focal point or the Central Visual Impact (CVI) (in Bohle,
1990: 36, cited by Wee, 1999: 21). The CVI is compatible with the notion of
an ITEM because, as we shall see, both can be accounted for by salient
Compositional (and in some cases Modal) choices. From the CVI, the ergodist engages sequentially with other ITEMS of the LEXIA, in effect working
through an idealized 'reading path'. From the above discussion on the feature of scrollability and the semiotic choices instantiated at the order of
LEXIA, I suggest a reading path labelled alphabetically in Plate 6.5.
Being 'bulleted' by the MOE Crest to the left and punctuated to the right
by a Y2K symbol, the masthead captures the initial attention of the ergodist
at Step A for the reasons discussed above. Ostensibly, Portrayals of authority
and preparedness (as embodied by the crest and the 'Year 2000 compliant'
symbol respectively) function to bolster the credibility of the website. The
reading practice of left to right to next line down brings one to the mission
statement 'Moulding the Future of Our Nation' immediately beneath the
masthead. The border below the mission statement 'closes off' Step A,
which constitutes the CVI due to its superordinate position through Compositional choices. The complex interaction between the masthead and the
mission statement and what this interaction means are detailed later.
Next, the ergodist enters Step B via 'Highlights' in Stylization: Font: Font
Style: Bold, which labels particular entities as noteworthy or of news value.
The eye is then quickly drawn by the diagonal arc cutting through the logo
of the 2nd AEMM Education Ministerial Meeting. This option in Gestalt:
Diagonals has a certain dynamism when set against the darker coloured
circle. Presumably, since the '2nd AEMM' hypertext link is topmost and
alongside an image, the linked site is deemed (at least by the MOE) to be of
utmost interest if not importance. Perhaps the 2nd AEMM is ranked higher
because of its international scope, and this reflects a bid by the MOE, and
by extension the PAP, to accredit itself with global relevance.
In Steps C and D, hypertext links realized as linguistic instantiations are
read in the manner of left to right to the row below, due to reading conventions which are reinforced by the bullets. These hypertext links are arranged
in a two-column top to bottom order. With the notable exceptions of the
hypertext links for 'HDB 40th Anniversary Web Site' and 'Teacher: Create
a sense of wonder. Offer new perspectives', all the other hypertext links
contain nominal groups. The lengthy nominalizations, such as 'New University Admission System from 2003', recall headlines, which are a notable
feature of newspaper articles. Perhaps hypertext links are in general constructed to serve as headlines, promoting or giving the gist of their respective linked pages.

Plate 6.5 Suggested reading path for the MOE homepage

In Steps C and D, the 'CL "B" Syllabus and Bonus Points Scheme', 'New
University Admission System from 2003' and other issues that relate in
some way to the centrality of ensuring that one obtains a 'good' education
are deemed newsworthy. As in this case, the packing together of hypertext
links becomes an index for the ergodist to obtain a limited overview of
associated webpages while, on the other hand, allowing the webpage
designer to limit and foreground what are considered important extensions
of the webpage. An index comprised of hypertext links emerges as one way
through which the webpage designers construct and underscore pathway
choices for the ergodist to choose from.
Reading conventions bring the ergodist to Step E through another header
'Web Sites of Interest'. Like the header before, and the final header 'Corporate Information' below, all ITEMS serving as headers in this webpage are
a result of Stylization: Font: Font Style: Bold, deriving a visual distinctiveness
against other linguistic ITEMS. Because of the option in Gestalt: Diagonals,
the tilted magnifying glass draws the eye into the hypertext link of 'Teacher:
Create a sense of wonder. Offer new perspectives'. The eye then moves
rightwards through to Step I, in this case not only because of reading
conventions. Here the subtle reduction in height of the hypertext links, the
increasing colour brightness to the right, together with the diagonals in the
telescope and the magnifying glass of hypertext links 'Educational Television' and 'NE.WS' respectively draw the reader across the page to Step I. A
closer look at the hypertext links in Steps F, H and I shows they relate
directly to what is spelled out by the MOE as the 'Cornerstone of Education
Policy':
Information technology will be used widely as teaching and learning resources to
develop skills in communication and independent learning. National Education is
also taught to foster strong bonds among students and develop in them a sense of
responsibility and commitment to family, community and country . . . capable of
contributing towards Singapore's continued growth and prosperity.
(The Ministry of Education, Singapore, 2000, emphasis mine)

The hypertext links under 'Web Sites of Interest' can thus be seen as primarily expansions of what are institutional-governmental goals rather than
what may be of some interest to the ergodist.
While one may examine any one of these ITEMS for any length of time,
the leftmost ITEM draws back the eye through the hypertext facility of
animation. For the hypertext link 'Teacher: Create a sense of wonder. Offer
new perspectives' the image of the magnifying glass over a flower morphs
into a girl in mid-jump and then back again in perpetual recursion. As a
complex extension of the visual semiotic, animation necessitates further
research which, however, is beyond the scope of this enterprise. Nonetheless,
this conscious use of animation implies that the MOE made a decision to
foreground this particular ITEM. This link's power of attraction is also
enhanced by the possession of two of the only three Mood: Imperative
clauses which, in effect, level a 'direct' address at the ergodist. Both animation and the rhetorical stance carried by this ITEM function to attract and
situate the ergodist as someone who can 'Create a sense of wonder' and
'Offer new perspectives'. In the recent context of a nationwide campaign to
enlarge the teaching workforce, the relative magnetism of this ITEM
becomes meaningful when one recognizes the fact that it serves as a link to
another webpage that encourages individuals to join the teaching profession. Regardless, the general paucity of direct address may be due to an
aspiration towards a formal, objective register which interacts with the
'headline' convention of hypertext links as discussed above.
Steps J and K comprise an ordered bi-column arrangement of linguistic
hypertext links as in Steps C and D. Notably, under 'Corporate Information', a choice from Gestalt: Framing tabulates the hypertext links. Representationally the rectilinear framing is a choice which projects stability and
immutability which is meant to accord with the corporate, definitive nature
of the information. In this light, perhaps among other reasons, the linguistic
hypertext links in Steps C and D are not framed because they are by nature
time sensitive. For example, in January 2000 the hypertext link for the '2nd
AEMM Education Ministerial Meeting' appeared while simultaneously the
hypertext link to 'ThinkQuest-Singapore' was dropped.
In Step L, a border marks off the final portion of the webpage which
contains the MOE's contact information and the 'Last Updated' date in
small fonts. This contact information is obligatory insofar as authenticating
the website and providing an avenue for dialoguing with the MOE. However, this information may perhaps be obscured because it is deemed relatively less newsworthy to the purposes of the website which acts as a media
arm for the MOE. As evidenced by choices in Relative Prominence, contacting the MOE through any of the channels laid out in the contact information is downplayed as an option for the reader. What is instead deliberately
highlighted are the definitive statements found in the LEXIAS that the MOE
has already scripted for the MOE CLUSTER.
Backtracking the steps of the reading path, one finds an increasing significance associated with the ITEMS, with Step A housing the 'main subject'
from which the rest of the webpage is understood: the MOE. As the tour at
the orders of LEXIA and ITEM shows, different texts on a webpage stand out
differently due to various Modal, Compositional and Representational
choices, pulling in the ergodist's gaze at every step of the reading path. In
turn, these choices as a whole reflect an image of the MOE as construed by
the MOE homepage for the ergodist: as the authoritative voice on local
education in service of the governmental goal of economic viability through
an educated workforce in the global marketplace.
So far, the semiotic choices explored at the orders of LEXIA and ITEM
have been rather brief owing to the limitations of space. Nonetheless, this
sketch sets the backdrop against which a more delicate account of semiotic
activity may be further explored and detailed. This activity is exemplified
with a focused examination of Step A in the MOE homepage.
An account of intersemiosis
Visual semiosis

The ITEM that stands out as the CVI in the MOE homepage is that complex of signs constituting the masthead, reproduced in Plate 6.6.

Plate 6.6 The MOE masthead

The Relative Prominence of the masthead at the top of the page is in
sharp relief to the other ITEMS of the webpage, and this arrests the attention of the ergodist and serves as the focal point through which the webpage
is entered. Additionally, options in Scale To Whole realize the masthead as
larger than any other ITEM, thus heightening its prominence. Furthermore,
the Contrast and Conflict: Colour of the rectangle (which is brown) acts to
distinguish the masthead from the white background of the webpage, simultaneously adding salience to the white words it encloses. In a binary fashion,
the white of the masthead words is now in contrast to the rest of the
predominantly dark-coloured linguistic text, even as the latter derive their
clarity from the white background of the webpage. This interlocking contrast is precisely what throws the linguistic instantiations in sharp relief to
one another.
The masthead reveals a more delicate option in Contrast and Conflict:
Scale in terms of Font Size. The words 'Ministry of Education, Singapore'
are Font Size: 24, which is noticeably larger than the rest of the linguistic
instances which are Font Size: 12 or less. Within Stylization, Font serves as a
further specification. While 'Ministry of Education, Singapore' approximates Font: Times New Roman, a large number of the other linguistic instances
are Font: Arial or some derivation through joint options with Font: Font Style:
Bold or Font: Font Style: Italics. The uniformity displayed in the majority of
linguistic instances may be related to the default setting of Font: Arial in
HTML. Any other font in the actual webpage display must be deliberately
chosen at the programming stage from a larger range of font styles. Any
other font style apart from 'Arial' thus implies a certain degree of deliberateness. The masthead with its non-conventional font is thus a deliberate
choice to make it stand out from the rest.
The 'effect' of Modal choices is thus intimately tied to how the texts are
arranged in meaningful relation to each other, that is, the compositional
choices made. Gestalt: Framing is selected for the masthead via a border with
equidistant light and dark intensities of colour, suggesting both variation
and regularity. The strong rectangular frame at once mirrors the rectilinear
frame of the web-browser window and is echoed by the grid-like pattern
within itself. Although the criss-crossing lines segment and may thus fracture
the surface of the masthead, the continuity of the words 'Ministry of Education, Singapore' over the surface evokes at the very least a closely pieced
together surface without chinks. What remains is a Parallelism connecting
these geometric Forms which relate 'to the horizontal axis and the vertical
axis [. . .] [and] contribute to stability and harmony' (O'Toole, 1994: 23).
What is crucially conveyed by the Modal and Compositional choices are
the discreteness, centrality and stability of the masthead. Representationally,
the masthead with its tiled texture and patterned border suggests among
many things some flat human-worked surface. Two important observations
can be made here: first, the range of visual meanings is suggested by the
actual choices instantiated; and second, while the meanings are, according
to Barthes (1977: 38-39), 'polysemous', they are nonetheless finite (Kress
and van Leeuwen, 1996: 16).
Visual-linguistic intersemiosis
Barthes's (1977) attempt to 'fix' visual meanings has been criticized because
it makes visual meanings dependent on linguistic choices, a phenomenon he
called 'anchorage'. Nonetheless, perhaps Barthes observes part of a more
complex process. Analyzing the masthead once again, the uncertainty of the
visual meanings is clarified somewhat as it interacts with the linguistic meanings it frames. The meanings of the Nominal Group may be uncovered by
examining the choices realized in its structure as we may see in Table 6.3.
At the rank of Word, the Lexical 'Content' of the noun head 'Ministry'
allows for these taxonomic meanings:
1 a a government department headed by a minister, b the building which it
occupies. 2 a (prec. by the) the vocation or profession of a religious minister, b the
office of a religious minister, priest etc. c the period or tenure of this. 3 (prec. by
the) [a] the body of ministers of a government or [b] of a religion. 4 a period of
government under one Prime Minister. 5 ministering, ministration.

(Reader's Digest Oxford Complete Wordfinder 1994: 969)


Table 6.3 Nominal Group structure in the masthead

The                           | Ministry | of Education,                  | Singapore
Premodification               | Head     | Postmodification               | Postmodification
Determiner: Article: Definite |          | Postposed Prepositional Phrase | Postposed Noun Phrase Head

At the rank of Nominal Group, the options 2 a, b, c, 3[b] and 4 are excluded
by the following choices in Modification: the premodifying definite article
'The' establishes 'Ministry of Education, Singapore' as unique, monolithic
and authoritative just as the postmodifiers 'of Education, Singapore' specify
the function and sphere of influence of the ministry. The Nominal Group is
thus specified to mean:
(a) From 1a: The Ministry of Education headed by the Minister of Education.
(b) From 1b: The building of this government department.
(c) From 3 [a]: The body of ministers of this government department.
(d) From 5: The administration of this government department.

The polysemy of meanings proposed by the instantiated linguistic choices is
crucially finite. What is represented visually as discussed above now comes
into relation with this range of linguistic meanings, and seems to further
contract the range of linguistic meanings to allow for only (b), that is,
the physical building. I call this 'Specification 1'. However, 'Ministry of
Education, Singapore' also comes into relation with yet another linguistic
instantiation.
Moulding the future of our nation
As mentioned in the above discussion at the order of LEXIA, this relationship between these two discrete linguistic ITEMS is encouraged by the Compositional choices Relative Position To Each Other: Proximate (instantiated as
their top-bottom proximity) and Gestalt: Framing: Bordering (instantiated as
the dotted line 'sectioning out' these two linguistic ITEMS from others). In
addition, because the mission statement is non-finite, it can be thought of as
a dependent Clause, which calls into question what it is dependent on.
Causal relationships may also be implicit with dependent Clauses, and in this
case, one may ask who or what agent is 'moulding the future of our nation'.
As a result of these questions and the Compositional choices mentioned
above, the two linguistic ITEMS come into the following possible relation in
Table 6.4.
Table 6.4 Experiential relations between the masthead and the mission statement

Ministry of Education, Singapore | Moulding          | the Future of Our Nation
Participant: Actor               | Process: Material | Participant: Goal

The mission statement may thus be perceived as entering into Experiential relations with the masthead. The material process 'Moulding' stipulates
a sentient Participant Actor. This is complemented with the option Font: Font
Style: Italics which also imbues the mission statement with a sense of dynamism, implicating an animate Actor. These options act to specify the
linguistic meaning of the masthead as (a) and (c) (see above). I call this
'Specification 2'. As can be observed, a disjunction arises between Specifications 1 and 2. The MOE as imaged by Specification 1 is solid, concrete,
immovable and non-living. In contradistinction, Specification 2 suggests the
MOE as the animate agent shaping Singapore's future. In this sustained
ambiguity, the MOE is (re)presented as a human agency who is at the same
time 'faceless', impenetrable and incontestable. This depiction of the MOE
derives perhaps from the premises of its uncontestable authority with
respect to educational matters and its existence as an arm of the PAP.
Abstraction of intersemiosis

This discussion has been concerned with the way meanings across instantiations of various semiotic resources interact with one another to give a new
meaning or set of meanings. This complex interaction and production of
meanings between instantiations of different semiotic resources is called
'intersemiosis'. Though the prior analysis of Step A is sequenced as Specification 1 followed by 2 in keeping with the suggested reading path,
intersemiosis does not in fact depend on any one sequence, but upon the
meanings first conveyed by each instantiation. In other words, for multisemiotic texts, there is no binding unidirectionality or sequentiality for
meaning interaction. Rather, one instantiation comes into relation with
another, and each simultaneously specifies the other.
Intersemiosis as discussed so far has been circumscribed by Compositional
choices such as Gestalt: Framing and Relative Position: Proximate that relate
instantiations that are spatially 'grouped'. A more complete notion of
intersemiosis recognizes that choices from the Modal and Representational
systems can also bring instantiations that are spatially distant or ungrouped
into significant relations for the interaction and production of meanings.
However, these non-Compositional factors for intersemiosis can only be
pursued outside the confines of this paper.
An abstraction of the stages of visual-linguistic intersemiosis may be
offered at this point as Relation, Intersection and Manifestation (collectively
RIM):
(1) Relation: Compositional, Modal and Representational meaning-making
choices delimit what instantiations are 'connected'. It is in the context of
this connectedness between instantiations of different semiotic resources
that intersemiosis occurs.
(2) Intersection: the range of meanings suggested by an instantiation of a
particular semiotic comes into some relation with another instantiation
of the same or different semiotic resource. Meanings across instantiations that are similar underscore each other to produce one focused meaning, or a specification. Conversely, meanings that
are not similar may be either backgrounded or foregrounded.
(3) Manifestation: specifications across different semiotic instantiations may
either materialize in a single highly determined, focused meaning or a
number of focused meanings. In the latter case, a sustained polysemy
results. Ambiguity results when the polysemous meanings are divergent.
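The Intersection and Manifestation stages can be illustrated with a small sketch in Python that treats each reading of the masthead as a label in a set. This is only an illustration under stated assumptions, not part of the analysis itself: the function names `intersect` and `manifest` are hypothetical, and which readings count as 'supported' or 'divergent' is supplied by hand as the analyst's judgement; the code does not derive it.

```python
# Readings (a)-(d) of the Nominal Group, as listed in the analysis of Step A.
READINGS = {
    "a": "the Ministry headed by the Minister of Education",
    "b": "the building of the government department",
    "c": "the body of ministers of the department",
    "d": "the administration of the department",
}

# Assumption: the analyst has judged these pairs of readings to be divergent
# (an inanimate building versus an animate agent).
DIVERGENT = {frozenset({"a", "b"}), frozenset({"b", "c"})}


def intersect(candidates, supported):
    """Intersection: keep the readings that a related instantiation supports."""
    return set(candidates) & set(supported)


def manifest(specifications, divergent=DIVERGENT):
    """Manifestation: classify the outcome of a list of specifications."""
    meanings = set().union(*specifications)
    if len(meanings) == 1:
        return "single focused meaning"
    pairs = {frozenset({x, y}) for x in meanings for y in meanings if x != y}
    if pairs & divergent:
        return "ambiguity"
    return "sustained polysemy"


# Specification 1: the visual masthead supports only the 'building' reading.
spec1 = intersect(READINGS, {"b"})
# Specification 2: the mission statement supports the animate readings.
spec2 = intersect(READINGS, {"a", "c"})

print(sorted(spec1), sorted(spec2), manifest([spec1, spec2]))
```

With the judgements entered above, Specification 1 retains only (b) and Specification 2 retains (a) and (c). Because the two specifications include readings marked as divergent, the sketch classifies the result as ambiguity, which matches the disjunction described for the masthead.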
Therefore, contra Barthes (1977), it is not only linguistic meaning that
anchors visual meaning, but the reverse as well. Questions of how similar or
divergent meanings may be determined aside, the above approach uncovers
to some degree the complexity of intersemiosis. However, it becomes clear
that further work in this area is needed. Nonetheless, the abstract stages of
Relation, Intersection and Manifestation (RIM) may provide a way to
describe the process of encircling the pool of meanings occurring in
multisemiosis.
Conclusion
This undertaking has been an exercise in increasing specificity. That is,
against an expansive range of discourse on hypertext, four abstract orders
of hypertext are posited, out of which the two lower orders of LEXIA and
ITEM are identified as sites for semiosis. At these lower orders of abstraction,
a multisemiotic analysis was applied to the MOE homepage to uncover the
meaning-making choices which construe the MOE. A further particularization occurs when intersemiosis is demonstrated at the level of delicacy of
two ITEMS. Finally, this exploratory attempt culminated in an abstraction
of the process of intersemiosis, Relation, Intersection, Manifestation
(RIM) which approaches the problem of how to illuminate this complex
phenomenon.
The issue of whether non-linguistic semiotic resources are systemic raises
the question of the validity of extending the notion of the systemic
metafunctions beyond language. The contention that there may not exist a
stratum of 'grammar' for a non-linguistic semiotic resource and that, even if
there is, this stratum is of a comparable nature to that of language becomes
an issue. These theoretical questions remain still very much questions in
themselves and there is no reason to date to reject the notion that nonlinguistic semiotic resources are systemic and tri-metafunctional. This is
not to say that the metafunctional systems between semiotic resources are
identical. That is patently untrue for the simple reason that different semiotic resources have different ways of meaning, and so have in themselves
different meaning-making systems. The systems proposed for non-linguistic
semiotic resources are markedly different from the linguistic. One crucial
question may be whether non-linguistic semiotic resources serve non-social
functions. The notion that semiosis is necessarily social seems to secure the
notion of the three metafunctions (see Kok, 2001).
While exploring the systemic choices in the MOE homepage, my analysis
has worked with a suggested reading path. This does not, however, rule out
the fact that an ergodist can focus initially on an ITEM other than the CVI,
or in a similar fashion, can work through a different sequence of engaging
with the ITEMS on a LEXIA depending upon the immediate contextual
factors such as the number of times the website has been viewed. Furthermore, what is immediately demanding of attention for one particular culture may not be so for another, although acculturation across cultures is
becoming more frequent with the spread of mass media, of which hypertext is a part. Further to this, it appears that various meaning-making
choices and facilities in hypertext, as demonstrated, function to secure certain sites of immediate visual engagement so that a CVI becomes visually
prominent.
This enterprise has been unwilling to divorce hypertext from contextual
use because as a means of communication, hypertext only acquires its
richness and definition from its use in the social realm. The functions of
hypertext are not wholly determined either by technology or society, but
by technology used in society. As future innovations in communicative
technology surface, new ways of meaning-making will be introduced.
What has been suggested in the course of this undertaking are some of
the new systems of meaning-making enabled by hypertext. However, further work is needed to account for the many other systems opened up in
this new platform. Nonetheless, the value of this work lies in its potential
to explicate the process through which semiotic choices are made, how
they are made, for what purposes and to what effect. It is hoped that this
has provided some answers to enquiries concerning the shifting ways of
communication and works towards a fuller disclosure of multisemiotic
activity.
Notes
1 Due to publishing constraints, the MOE homepage could not be reproduced in
colour. As colour is an important resource for meaning, these constraints somewhat compromise the reader's interpretation of the webpage and the analysis
presented here. However, every effort has been made to overcome this deficiency.
2 Although it seems counterintuitive to say that means of writing prior to the
printing press or the typewriter are technologies, 'the papyrus roll and the vellum manuscript also exemplify technologies of writing . . . [as] . . . both required
devices: the reed pen and papyrus in ancient Egypt, and the quill and parchment
in the Middle Ages' (Snyder, 1997: 1).
Acknowledgements

Plates 6.2, 6.3, 6.4, 6.5 and 6.6 are reproduced by courtesy of the Ministry
of Education (MOE), Singapore. The screenshots of the MOE homepage
were captured on 7 January 2000.
References
Aarseth, E. J. (1997) Cybertext: Perspectives on Ergodic Literature. Baltimore: Johns
Hopkins University Press. Available from: http://www.hf.uib.no/cybertext/
Ergodic.html.
Barthes, R. (1974) S/Z (R. Miller, trans.). New York: Hill and Wang.
Barthes, R. (1977) Rhetoric of the image. In R. Barthes (S. Heath, ed. and trans.),
Image-Music-Text. London: Fontana, 32-51.
Bohle, R. (1990) Publication Design for Editors. New Jersey: Prentice-Hall.
Halliday, M. A. K. (1991) The notion of context in language education. In
T. Le and M. McCausland (eds), Language Education: Interaction and Development:
Proceedings of the International Conference, Ho Chi Minh City, Vietnam 30 March-1 April
1991.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Edward Arnold.
Hasan, R. (1996) Ways of Saying: Ways of Meaning. Selected Papers of Ruqaiya Hasan.
London: Cassell.
Heng, G. and Devan, J. (1995) State Fatherhood: the politics of nationalism, sexuality and race in Singapore. In A. Ong and M. G. Peletz (eds), Bewitching Women,
Pious Men: Gender and Body Politics in Southeast Asia. Berkeley: University of California
Press, 195-215.
Kok, K. G. A. (2001) What is material about hypertext? Unpublished masters thesis.
National University of Singapore.
Kress, G. (2003) Literacy in the New Media Age. London: Routledge.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (2001) Multimodal Discourse: The Modes and Media of
Contemporary Communication. London: Arnold.
Landow, G. P. (1997) Hypertext 2.0: The Convergence of Contemporary Critical Theory and
Technology. Baltimore: Johns Hopkins University Press.
Lemke, J. L. (1998) Metamedia literacy: transforming meanings and media. In
D. Reinking, L. Labbo, M. McKenna and R. Kiefer (eds), Handbook of Literacy and
Technology: Transformations in a Post-Typographic World. Hillsdale, NJ: Erlbaum,
283-301.
Lim, B. L. L. (1998) Hypertext fiction: a narrative analysis. Unpublished honours
thesis. National University of Singapore.
Ministry of Education, Singapore. (2000) Education in Singapore; available from
http://wwwl.moe.edu.sg/educatio.htm.
Moore, M. (1994) Introducing the internet. In Wired Magazine: The Internet Unleashed.
Indianapolis: Sams Publishing, 419.
O'Halloran, K. L. (1999) Interdependence, interaction and metaphor in multisemiotic texts. Social Semiotics 9(3): 317-354.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University
Press.
O'Toole, M. (1999) Functions and Systems in Verbal and Visual Texts. Paper presented at
the 26th International Systemic-Functional Congress. Regional Language
Centre (RELC), Singapore, 26-30 July 1999.
Reader's Digest Oxford Complete Wordfinder, The (1994) London: The Reader's Digest
Association Limited.
Snyder, I. (1997) Hypertext: The Electronic Labyrinth. New York: New York University
Press.
Straits Times, The (9 February 2000) Internet a driving force.
Tuman, M. C. (1992) WordPerfect: Literacy in the Computer Age. London: Falmer Press.
Unsworth, L. (2001) Teaching Multiliteracies across the Curriculum: Changing Contexts of
Text and Image in Classroom Practice. Buckingham, UK: Open University Press.
Wee, C. K. A. (1999) Multi-semiotic analysis of advertisements. Unpublished honours thesis. National University of Singapore.

Part III
Print media

The construal of Ideational meaning in print advertisements

Cheong Tin Yuen
National University of Singapore

Introduction
The investigation of the intricacies, complexities and nuances of multisemiotic texts has been the focus of recent research. This arises from the
observation that 'language, and typological modes of semiosis generally,
have evolved to work in partnership with other, often more topologically
grounded, semiotic systems' (Lemke, 1998: 111). O'Toole (1994), Kress and
van Leeuwen (1996, 2001), Lemke (1998), Wee (1999), O'Halloran (1999)
and Baldry (2000) have made significant strides within this area of multisemiotic text analyses from a systemic-functional perspective.
This paper aims to contribute to the development of a theoretical framework and vocabulary for the articulation of meaning in multi-semiotic texts
as research in this realm has not been as extensive as the examination of
purely linguistic texts. To limit text analyses to only the linguistic aspect and
disregard the non-linguistic features such as graphs and diagrams is tantamount to annihilating the efflorescence of meaning that can emerge from a
multi-semiotic analysis. As aptly stated by Wee (1999: vi):
Compared to text with a single semiotic code, the meaning potential of multisemiotic texts is greatly expanded. Hence, meaning creation becomes an interactive, dynamic and symbiotic process.

Research into multi-semiotic texts is indeed underrepresented, which is ironic as 'computer technologies make multimedia genres more convenient
and accessible for all purposes, [thus] it will become increasingly important
to understand how the resources of different semiotic systems have been and
can be combined' (Lemke, 1998: 111). In this information age, it is indeed a
rarity for texts not to be illustrated, and this further signals the need to
invigorate and fortify research in this area.
Cheong (1999) proposes a working systemic-functional model for meaning-making in print advertisements by proposing lexicogrammatical
strategies for Ideational, Textual and Interpersonal meaning. Constraints of
space here allow for only a discussion of the construal of Ideational meaning
in multi-semiotic texts. In this paper the generic structure potential of
advertisements is proposed and illustrated through the examination of
five advertisements. In the following sections I discuss five strategies for
construing Ideational meaning: the Bidirectional Investment of meaning,
Contextual Propensity, Interpretative Space, Semantic Effervescence and
Visual Metaphor. Hasan's (1996) Generic Structure Potential for advertisements and Kress and van Leeuwen's (1996) concept of Given and New
are critiqued, together with a re-examination of Barthes's (1977) notions of
'anchorage' and readerly and writerly texts.
Generic structure potential of a print advertisement
Hasan (1996: 41-42) proposes 'Capture^Focus^Justification' as the generic
structure for an advertisement. Hasan (1996: 41) aims to encapsulate the
multi-semiotic nature of advertisements, with the Capture functioning:
to attract attention . . . [and] realized in the written mode through the management of the visual layout, the typeface patterns and/or the presence of pictures.
According to Hasan (1996: 41), the Focus 'single[s] out that which is being
advertised'. However, while stating that the Focus can be visually realized,
Hasan (1996) does not clarify whether the Focus has a linguistic realization as
well. Hasan (1996) also establishes the presence of a visual aspect to the
Justification, but in a similar manner does not include the component to give
a 'detailed account of other elements of structure for an advertisement'
(Hasan, 1996: 42). Suffice it to say that Hasan's (1996) generic structure captures to some extent the multi-semiotic nature of advertisements.
Following Hasan's proposal, there is a need to provide a more detailed
account of generic structure for advertisements. Hasan's (1996) model does
not make explicit the complexities involved in the interaction between visual
images and linguistic text in advertisements. It is the aim of this paper to
provide a model that best captures the multi-semiotic interaction between
visual images and linguistic text in print advertisements.
Based on this limited study of print advertisements, the Generic Structure
Potential or GSP which '[expresses] the total range of optional and obligatory elements' (Halliday and Hasan, 1985: 64) for advertisements may be
captured as:
Lead^(Display)^Emblem^(Announcement)^(Enhancer)^(Tag)^(Call-and-Visit Information)
Table 7.1 details the generic structure of a print advertisement. In this
framework the various visual and linguistic components in an advertisement
are made explicit, together with the interaction between these semiotic
resources which creates differing levels of Ideational, Interpersonal and
Compositional/Textual meaning.

Table 7.1 Proposed generic structure of print advertisements

Visual components:
  Lead: Locus of Attention (LoA), Complement to the Locus of Attention (Comp.LoA)
  Display: Explicit, Implicit, Congruent, Incongruent (metaphorical)
  Emblem
Linguistic components:
  Announcement: Primary, Secondary
  Enhancer
  Emblem
  Tag
  Call-and-visit information
Interaction to create Interpersonal, Ideational and Compositional/Textual meanings

Five advertisements are analysed in this paper: the Golf, the Epson, the
M1, the Beetle and the Guess? advertisements, which are displayed in
Plates 7.1-7.5. I discuss in the next section why the Lead and the Emblem
are designated obligatory elements while the others are optional.
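The proposed GSP can be read as a simple ordering constraint, and a short Python sketch may make that reading concrete. It rests on assumptions that go beyond the text: the caret is taken to mean a fixed left-to-right order, parenthesised elements are taken as optional, each element is taken to occur at most once, and the function name `conforms` is hypothetical.

```python
# The GSP as an ordered list of (element, obligatory?) pairs.
# Assumption: '^' fixes the order and parentheses mark optional elements.
GSP = [
    ("Lead", True),
    ("Display", False),
    ("Emblem", True),
    ("Announcement", False),
    ("Enhancer", False),
    ("Tag", False),
    ("Call-and-Visit Information", False),
]


def conforms(elements):
    """Return True if a sequence of element names is licensed by the GSP."""
    order = [name for name, _ in GSP]
    # Reject any element the GSP does not name.
    if any(element not in order for element in elements):
        return False
    # Positions must be strictly increasing: correct order, no repeats.
    positions = [order.index(element) for element in elements]
    if positions != sorted(set(positions)):
        return False
    # Every obligatory element must be present.
    required = {name for name, obligatory in GSP if obligatory}
    return required.issubset(elements)


print(conforms(["Lead", "Emblem"]))
print(conforms(["Lead", "Announcement", "Tag"]))
```

Under these assumptions, an advertisement consisting only of a Lead and an Emblem is licensed, while one that lacks the Emblem is not. Whether real advertisements always respect the linear order is a separate empirical question that the sketch does not address.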

The Lead
The discussion that follows details the characteristics and function/s of the
various components that constitute the Generic Structure Potential of a
print advertisement. I will begin with the Lead.
The Lead is thus termed as it is Interpersonally most Salient (Kress and
van Leeuwen, 1996) through choices in size, position and/or colour. The
Lead is illustrated in Plate 7.1. On its own, the Lead has a wide spectrum in
terms of meaning potential, that is, many possible meanings emanate from
the Lead. Interpreted independently of the Announcement, Enhancer, Display and Emblem, the Lead is figuratively an efflorescence of meaning. For
example, the sensual looking female who is the Lead in the Golf advertisement (Plate 7.1) could be calling to attention the new millennium look or she
could be an ambassador for women's rights. Therefore, on its own, the Lead
has a bounty or a kaleidoscope of possible meanings.
As I explain below, the Lead consists of the Locus of Attention (LoA) and
Complements to the Locus of Attention (Comp.LoA). There is an element
in the Lead that by its very Salience, be it an unusual quality that challenges
reality or outstanding size, colour and so forth, arrests the attention of the
viewers. In Plate 7.2 depicting the Epson advertisement, it is the splash of
the water outside the boundaries of the photograph. This attention-arresting element is termed the 'Locus of Attention' (LoA). The LoA
embeds the central idea of the advertisement, that Epson produces lifelike
quality prints. The three-fold functions of the LoA include Interpersonally
attracting attention, and Ideationally construing reality in a way intended by
the advertisers, where the viewer's perception of reality is manipulated.
Textually, it is a springboard for further development of the central idea, for
example, that Epson produces lifelike quality prints in the linguistic text that
follows. 'The text serves to elaborate' the visuals (Kress and van Leeuwen,
1996: 194). But we are left uninformed as to the specific strategies/systems by which it does so.
The following discussion serves to explain this.

Plate 7.1 Generic structure of the Golf advertisement

Plate 7.2 Generic structure of the Epson advertisement

Visually, the LoA encapsulates the central idea that Epson produces lifelike prints. This central idea is reiterated in the linguistic text. That is, there
is a linguistic equivalence (be it in the form of sentences or particular lexis)
that coheres ideationally with this central idea conveyed in the LoA. Ideationally, the following linguistic items, including clauses and nominal groups,
encapsulate tightly and parallel the idea embedded in the LoA, that is,
Epson produces lifelike prints:
(a) EPSON STYLUS PHOTO EX - crystal-clear, photographic quality printing
(b) //Six specially formulated colour inks deliver richer, more lifelike images//
(c) //while EPSON PhotoEnhance provides realistic colour balance every
time//
(d) //The EPSON Stylus Photo EX can transform your photography//
(e) EPSON Stylus. The most advanced inkjets.

If the above linguistic items had occurred in isolation without the
accompaniment of the LoA, they would be mere statements weakened of
their persuasive force to manipulate perception in a way intended by the
advertiser, thus diminishing the influence over viewers to purchase the
product. However, with the LoA conveying visually the idea that Epson
produces lifelike prints, the meaning potential in linguistic items (a)-(e) is
significantly enhanced. Interpersonally, the LoA provides the context in
which linguistic items (a)-(e) are endowed with greater persuasive force to
influence viewers to purchase the product. Meaning-making of (a)-(e)
from the Ideational perspective is enhanced, as the LoA serves textually
as a reference point for readers to make sense of what exactly is meant
by 'crystal-clear photographic printing', 'lifelike images', 'realistic colour balance',
'Epson . . . can transform your photography' and 'EPSON STYLUS. The most
advanced inkjets'.
When the rankshifted clause 'to make a bigger splash with your images' in the
Enhancer (see Plate 7.2) is read within the context of the LoA, it is identified
as a pun. There is an interplay between linguistic item and the visual image
that enhances the meaning potential of the rankshifted clause as well as the
LoA. Without the LoA, there would be no such interplay of meaning, thus
the rankshifted clause 'to make a bigger splash with your images' would not
be interpreted as a pun, reducing the overall affective appeal of the advertisement. Extending Wee's (1999) concept of symbiosis, the LoA and the
linguistic text act on each other, mutually reinforcing and enhancing the
meaning potential of the Lead.
Particular facets of the meaning potential of the Ideational meaning of
the LoA can be articulated linguistically, that is, the Ideational meaning of
the visual code can be translated into the linguistic code. Items (a)-(e) above
are an articulation and representation, in linguistic form, of the meanings
embedded in the LoA. Conversely, it can be stated that the Ideational meanings in Items (a)-(e) are loaded into the LoA. The LoA is a visual compression of the linguistic meaning in (a)-(e). The LoA can thus be interpreted as
a Visual Metaphor as explained below.
Bohle (1990: 36) mentions Garcia's centre of visual impact (CVI), 'where
the reader enters the page . . . without the CVI, a page is a mass confusion
of elements competing for attention'. Wee (1999) further states that the CVI
'becomes the entry point for the reading path of the multi-semiotic text. It is
the Theme of the entire text' (Wee, 1999: 21). Though paralleling the CVI's
function in engaging the viewer Modally, the proposed LoA functions
beyond a mere Interpersonal engaging of viewers' attention. It is not limited
to just being the Theme, 'the point of departure of the message' (Halliday,
1994: 37). Functioning as a Visual Metaphor, the LoA ideationally elucidates
and enhances the Ideational meaning potential of the linguistic text in the
advertisement.
The Complements to the LoA (Comp.LoA) refer to components in the
Lead which are comparatively less Salient than the LoA. They functionally
enhance the Interpersonal and Ideational Salience of the LoA. In other
words, the Comp.LoA plays a subordinate role, to channel and Focus
viewers' attention on particular aspects of the LoA. In the following discussion of the M1 advertisement, accompanied by Plate 7.3 depicting the
Generic Structure of the M1 advertisement, I illustrate how the interaction
between the Comp.LoA and the LoA brings out the Ideational and Interpersonal Salience of the LoA.
In Plate 7.3, which advertises the perks of the M1 telecommunications
service, the woman is Salient (Kress and van Leeuwen, 1996), while in
O'Toole's (1994) terms, she has Prominence. She is the most illuminated by
the Emblem, that is, the logo of the 'Sun' which represents M1. Her megawatt smile lends her affective appeal. The female model is thus the LoA.
Plate 7.3 illustrates two Complements to the LoA in the M1 advertisement:
(1) Comp.LoA1: the two boys beside the LoA, who are reduced in size and
are lacking in affective appeal, and therefore visually less inviting than
the smiling LoA. Their Stylization (O'Toole, 1994) differs from the
model's. They do not hold the M1 placard, and are not smiling, which
implies also that they would never be able to say 'Everything they offer is
brighter, nicer and more fun'. The Comp.LoA1 subordinates itself to bring
the LoA and the Emblem into Focus. The LoA and the Emblem become
the confluence of all attention.
(2) Comp.LoA2: the background, which remains in dark hues and fails to
be illuminated, despite the spotlights. The backgrounded stalls and the
goods the stalls are selling are generally obliterated and unobservable.
This may be contrasted with the LoA who is illuminated by the Emblem
of the 'sun' (that is, M1) she is holding, while the spotlighted background, ironically, fails to brighten up. Thus the Comp.LoA2 underscores the prominence of the LoA and by extension, the prominence of
the product (that is, M1). The Comp.LoA2 thrusts the LoA and the
product (that is, M1) into viewers' attention.
Juxtaposing the LoA with the Comp.LoA1, we see an interplay of meaning
between the visual images, that is, the LoA represents those who have and
enjoy the M1 benefits and therefore are happy, while the Comp.LoA1 represents those excluded from such benefits. The ideology of exclusivity becomes
apparent. Without an M1 subscription, life will not be 'brighter, nicer and
more fun'. Thus, be bright and make the wise choice of subscribing to M1.

Plate 7.3 Generic structure of the M1 advertisement

The Display

Display
  Explicit: pictures of a tangible product
  Implicit: an intangible product or service given tangible form through another medium

  Congruent: product not realized through symbolism
  Incongruent: product realized through symbolism

Figure 7.1 The Display in a print advertisement

The LoA can also function as an Implicit Display in certain advertisements,
where the Display refers to the photographic Display of the product or service in the advertisement. If the product advertised is in a tangible form, for
example the Golf, it is termed Explicit Display. In comparison to the Beetle
advertisement (Plate 7.4) which employs symbolism as an advertising strategy, the Golf in the Golf advertisement can also be construed as a Congruent realization of the product, as no symbolism is involved. Therefore, the
Golf is construed as Explicit:Congruent Display.
However, some products or services are intangible, or difficult to Capture
in tangible form. For instance, the '118 off-peak hours every week' service
provided by M1 is not a tangible product with a physical form that can be
captured in print. Thus, the advertisers find a way of portraying such a
service through the smiling model who has obviously been the beneficiary
of such a service. The model, who has been previously established as the
LoA, is then also the Implicit Display of the product/service. She personifies
the '118 off-peak hours every week' service. As can be seen, a conflation of
functions is possible in an advertisement, that is, the LoA conflates with the
Implicit Display, as illustrated in Plate 7.3.
Arising from creative advertising strategies in the Beetle advertisement
depicted in Plate 7.4, the insect beetle is creatively used as a substitute for
the car, the New Beetle. This substitution strategy makes the beetle an
Implicit Display of the product which is a car. The insect beetle symbolizing
the car operates as an Incongruent realization of the product. Thus it can be
construed as Implicit:Incongruent Display.
The Emblem, the Announcement and the Enhancer

The Emblem may be realized visually as the logo of the product/service
advertised and its linguistic realization is in the form of the brandname of
the product/service. Ideationally and ideologically, it is the stamp of authority bespeaking and validating the authenticity of the product advertised.
The Emblem functions to bestow an identity, as well as to confer status to a
product. The Emblem may be positioned anywhere in the advertisement.
However, it is interpersonally Salient to Capture attention. The Emblem in
the M1 advertisement is the logo of the 'Sun' and the brandname 'M1', as
depicted in Plate 7.3.

Plate 7.4 The Beetle advertisement

In a print advertisement, the most Salient linguistic item/s are termed the
Announcement. The Announcement has Relative Prominence in Scale and
Colour, Font and Size (O'Toole, 1994). Ideationally, the Announcement captures and conveys the essence of an intended message the advertisers wish to
foreground to the consumers. Figure 7.2 displays the functional realizations
of an Announcement. The examples are taken from the Golf and M1
advertisements (Plates 7.1 and 7.3 respectively).

Announcement
  Primary
    Defined as the only announcement in the advertisement
    (E.g. 'It doesn't make a statement. It's for people who already have one' (Golf))
    Defined as the most interpersonally salient announcement among other announcements in the same advertisement
    (E.g. '118 Off-peak hours every week' (M1))
    The catch-phrase of an advertisement
    (E.g. 'Everywhere under the sun' (M1))
  Secondary
    The less interpersonally salient announcement/s among other announcements in the same advertisement
    (E.g. 'I get the feeling that M1 wants me to enjoy value - and enjoy life. Everything they offer is brighter, nicer and more fun!' (M1))
    (E.g. 'Bigger value. Better service. Brighter smiles. Nobody covers it all as nicely as M1' (M1))

Figure 7.2 The Announcement in print advertisements

The Enhancer comprises linguistic items only, usually in paragraph form,
as exemplified by the labelled advertisements above. The Enhancer builds on
or modifies the meaning emanating from the interaction between the Lead
and the Announcement. Interpersonally, its function is to persuade and
influence viewers to purchase the product, thus the Enhancer contains
Interpersonal lexis (in bold print below), which carries an attitudinal and/or
affective thrust. Through Interpersonal lexis, 'texts/speakers attach an intersubjective value or assessment to participants and processes by reference to
emotional responses or to systems of culturally-determined value systems'
(White, 1999). Ideationally, it details the advertisers' reasoning/argument as
to why the product is worth the customers' attention and money, and so
Logical relations and rankshifted clauses are evident. The Golf advertisement
is one illustration of the use of Interpersonal lexis and interdependency
between clauses.
//In its own confident and quiet style [[that have won endless admirations the world over]], the New Golf has come of age with a sophistication beyond comparison//
α //[[Setting itself apart and in a blistering pace]], is a new and awesome 1.8 litre turbo engine//
xβ //to take its performance to a higher level//
//Not only that, the beauty and luxury of the New Golf is also graced with equally exciting refinement both inside and outside//
1 //Truly the New Golf hasn't changed in spirit and valour//
+2 α //but has gotten better//
xβ //to assert itself as the ultimate hatchback//
//No wonder it has been hailed as '. . . a triumph of execution' by UK's Car's Magazine (January '99)//
//And termed by others as the 'Rolls-Royce of hatchbacks'//
The abundance of Interpersonal lexis in the Enhancer suggests room for
the application of Appraisal Theory which 'is concerned more particularly
with the language of evaluation, attitude and emotion . . .' (White,
1999). However, space does not permit the investigation of Appraisal
Theory in this context; no doubt future research in this area would be
productive.
The Tag and Call-and-Visit Information
Certain elements of information about a product/service that are not
included in the Enhancer are captured in the Tag. The Tag is usually in the
form of one-liners in small print and is typically non-Salient as illustrated
in preceding labelled advertisements. Grammatically, Tags are usually
realized as non-finite, for example, 'Based on Super Off-Peak rates of 5c per
min' in the M1 advertisement, and as ellipted Subject and finite element,
exemplified by 'Available in 1.8 Turbo and 1.6 Automatic' in the Golf
advertisement. Grammatically, there could be exceptions to the above but it
is not within the scope of this paper to explore the lexicogrammatical
realizations of Tags. As can be seen in the preceding labelled advertisements, the Call-and-Visit Information is usually in small print and non-Salient, comprising contact information as to where, when, how the
product/service is available to the consumer. For example, from the Golf
advertisement in Plate 7.1, 'Cars and Cars Pte Ltd. 10 Leng Kee Road,
Tel: 474-1111'.

Revisiting the Generic Structure Potential (GSP) for print advertisements
The GSP for advertisements in this paper is stated as:

Lead^(Display)^Emblem^(Announcement)^(Enhancer)^(Tag)^(Call-and-Visit Information)

Table 7.2 is a brief survey of the advertisements analyzed in this paper and
reveals which elements are optional, and which obligatory.
Evident in Table 7.2 is the diversity of choice as to which elements
are included or excluded from the advertisements. This study indicates
that only the Lead and the Emblem occur in all the advertisements
which have been analyzed. Thus the Lead and the Emblem appear to be
obligatory elements, while the rest are optional in the GSP of print
advertisements.
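
The reasoning from Table 7.2 to the GSP can be retraced mechanically. The following is a minimal sketch in Python, not part of the original analysis: it assumes each advertisement can be reduced to the set of elements it contains (as read from Table 7.2), treats an element as obligatory only if it occurs in every advertisement surveyed, and writes optional elements in parentheses. The names `GSP_ORDER`, `ADS`, `obligatory` and `gsp_formula` are illustrative only.

```python
# Elements of the GSP in the linear order given in the formula above.
GSP_ORDER = [
    "Lead", "Display", "Emblem", "Announcement",
    "Enhancer", "Tag", "Call-and-Visit Information",
]

# Elements present in each advertisement, as read from Table 7.2.
ADS = {
    "Golf": set(GSP_ORDER),
    "Epson": set(GSP_ORDER) - {"Tag"},
    "M1": set(GSP_ORDER),
    "Beetle": set(GSP_ORDER) - {"Tag"},
    "Guess?": {"Lead", "Emblem"},
}

def obligatory(ads):
    """Return the elements found in every advertisement, in GSP order."""
    common = set.intersection(*ads.values())
    return [element for element in GSP_ORDER if element in common]

def gsp_formula(ads):
    """Write the GSP with optional elements in parentheses, joined by '^'."""
    must = set(obligatory(ads))
    return "^".join(e if e in must else f"({e})" for e in GSP_ORDER)

print(obligatory(ADS))
print(gsp_formula(ADS))
```

On this small sample the sketch reproduces the formula stated above. It also makes the tentativeness of the result visible: adding a single advertisement without an Emblem to `ADS` would turn the Emblem into an optional element.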
Cook (1992: 216), quoting Barthes, states that advertisements represent a
'restless' discourse type. He explains (ibid.: 217):
The conventions of ads change fast, driven by an internal dynamic, by changes in
society, and by changes in the discourse types on which they are parasitic or in
which they are embedded . . . they are . . . constantly transmuting and
re-combining, so that at present any lasting characterization is impossible. Synchronically, there are too many exceptions. Diachronically, the rules are in a flux.
Deriving a GSP for advertisements is thus made difficult due to this 'restlessness' of advertisements. The GSP for advertisements which I have
derived is at best tentative, insofar as advertisements metamorphose along
with 'changes in society . . . constantly transmuting and re-combining'
(ibid.). I venture further to say that the GSP for advertisements is chameleon-like, slippery to define and ever-evolving.

Table 7.2 Tabulation of Elements in the five advertisements

Advertisement | Element/s present in advertisement | Element/s absent in advertisement
Golf | Lead, Emblem, Display, Announcement, Enhancer, Tag, Call-and-Visit Information | -
Epson | Lead, Emblem, Display, Announcement, Enhancer, Call-and-Visit Information | Tag
M1 | Lead, Emblem, Display, Announcement, Enhancer, Tag, Call-and-Visit Information | -
Beetle | Lead, Emblem, Display, Announcement, Enhancer, Call-and-Visit Information | Tag
Guess? | Lead, Emblem | Display, Announcement, Enhancer, Tag, Call-and-Visit Information

Further research into the GSP of
advertisements, which needs to be conducted in greater breadth and depth,
may produce a different GSP. For example, Hasan (1996) establishes
'Capture^Focus^Justification' as the generic structure for advertisements but
my research has produced a GSP which differs in terms of the degree of
detail and the ability to Capture the complexity of intersemiosis in
advertisements.
Strategies for Ideational meaning-making in multi-semiotic
texts
The section above introduces the Visual Metaphor which ideationally elucidates and enhances the Ideational meaning potential of the linguistic text in
the advertisement. I introduce another four strategies for Ideational meaning in Table 7.3.
Table 7.3 The construal of Ideational meaning in print advertisements

Ideational meaning
Strategies for meaning-making in a multi-semiotic text | 1. Bidirectional investment of meaning; 2. Contextualization Propensity; 3. Interpretative Space; 4. Semantic effervescence
Generic Structure Potential of a print advertisement | Lead^(Display)^Emblem^(Announcement)^(Enhancer)^(Tag)^(Call-and-Visit Information)


The Bidirectional Investment of meaning refers to the cross-investment of
lexicogrammatical meaning from the linguistic text in the Announcement to
the visual image in the Lead and vice-versa. The Contextualization Propensity
(CP) refers to the degree to which linguistic items in a print advertisement
contextualize the meaning of the visual images. In a print advertisement,
viewers have an Interpretative Space (IS) within which to create meaning
and the wider the IS, the greater the Semantic Effervescence (SE) of the
advertisement. The sections below further elaborate.
Lead, Announcement and Enhancer: a triumvirate approach to
meaning-making
The Ideational metafunction is concerned with '[understanding] the
environment' (Halliday, 1994: xiii), '[enabling] humans to ... make sense of
what goes on around them and inside them' (Halliday, 1994: 106). Figure
7.3 outlines the four stages of triumvirate interaction between the Lead,
Announcement and Enhancer in construing Ideational meaning in a
print advertisement. Stages 1-4 detail how new dimensions of meaning may
be accessed and made manifest through the interaction of the Lead,
Announcement and Enhancer.

Figure 7.3 Triumvirate Interaction of Lead, Announcement and Enhancer
- Effervescence of meaning; a kaleidoscope of meaning; low CP, wide IS, high SE
- Contextualization of meaning; options of meaning unintended by advertisers closed off
- Meaning in advertisement funnelled towards a preferred direction intended by advertisers
- Stability in meaning; X number of meanings intended by advertisers communicated to and received by viewer; high CP, narrow IS, low SE

Stage 1
The Lead in the Golf advertisement is the most interpersonally Salient, as
seen in Plate 7.1, and thus this element is first approached by viewers. The
gaze of the LoA locks with the viewer, and the latter is led into the advertisement. If interpreted independently of the Announcement, Enhancer,
Display and Emblem, the Lead is figuratively an effervescence of meaning.
As mentioned above, the Lead could represent an ambassador for women's
rights, or a call to attention to the new millennium look. On its own, a
kaleidoscope of possible meanings characterizes the Lead. There is a wide
scope in terms of meaning potential in the Lead. At this stage, there is low
CP, wide IS and high SE.
Stage 2

In a print advertisement, the next most Salient item is the Announcement,
thus the Primary Announcement is second in the reading path. There is
Bidirectional Investment of meaning between the Announcement (the
linguistic code) and the Lead (the visual code) as illustrated in Figure 7.3.
The term Investment refers to the Bidirectional Investment of meaning
from the lexicogrammatical choices in the Announcement to the visual
image in the Lead and vice-versa. For example, should the Announcement in the
Golf advertisement 'It does not make a statement. It's for people who
already have one' occur elsewhere, for instance on the back of a T-shirt,
it would connote different meanings. Similarly if the Lead of the Golf
advertisement appears in a different context, for instance in a Playboy
magazine, it would have different connotations from what it has here. So
how are the viewers constructed by the advertisers to read the meaning in
the Golf advertisement that the LoA represents someone with a statement-making personality? After all, the Announcement is a linguistic code while
the Lead is a visual one. How does the juxtaposition of two different codes
result in meaning that can be unambiguously conveyed by the advertisers
and unambiguously decoded by the viewers? I propose that the juxtaposition of the linguistic texts and visuals sets up Transitivity processes that
invest meaning from the linguistic code to the visual code and vice versa.
The discussion that follows unravels and explicates the mechanics of this
Bidirectional Investment of meaning from the Announcement to the Lead
and vice-versa.
Stage 2a

In the Golf advertisement (Plate 7.1), there is a Relational:Attributive:
Intensive process between the Primary Announcement and the Lead. The
Attribute 'statement-making personality' is invested from the Primary
Announcement into the Carrier (that is, the LoA in the Lead) by virtue of
their proximity, thus causing viewers to see the LoA as a person with a
statement-making personality (Figure 7.4):

Primary Announcement: 'It does not make a statement. It's for people who already have one'
Investment: Relational: Attributive: Intensive process occurs between Primary Announcement and Lead. The Attribute 'statement-making personality' is invested into the LoA in the Lead. The LoA is construed as Carrier with such an Attribute.
Lead: 'Visual of LoA'

Figure 7.4 Investment of Meaning from Primary Announcement to Lead


Due to the Relational process between Primary Announcement and Lead
which invests meaning from the former to the latter, viewers read the
Experiential meaning in the Golf advertisement:

The LoA | has | a 'statement-making personality'
The LoA | is | statement-making
Statement-making individuals | are | beautiful, sensuous, stylish
Carrier | Attributive: Intensive | Attribute


The Primary Announcement thus acts as a stabilizer for an otherwise
semantically efflorescent Lead. The Primary Announcement provides a context for viewers to adopt/pursue the preferred thread of meaning intended
by the advertisers. Whatever 'it' refers to, 'it' is only for beautiful, sensuous
and stylish statement-making individuals. Even at this early stage, a flux
of ideologies emerges, that of elitism and exclusivity, where only statement-making individuals deserve the Golf. With elitism and exclusivity arise
'social power' (Goldman, 1992: 115) and an endowment of status, also
gender and beauty stereotyping, as the LoA in the Lead defines and epitomizes allure, beauty, charm and desirability.
Stage 2b
Figure 7.5 illustrates how the Lead enriches the Ideational meaning
carried in the Primary Announcement. The sophisticated, sensuous, coy-looking LoA in Plate 7.1 is a visual exemplification of the statement
'people who already have (a statement to make)'. A Relational:Identifying:
Intensive process occurs in the Investment of meaning from Lead to
Announcement.

Primary Announcement: 'It does not make a statement. It's for people who already have one'
Investment: Relational: Identifying: Intensive process occurs between Announcement and Lead. The sophisticated LoA in the Lead is the Value exemplifying the Token 'It's for people who already have one'.
Lead: 'Visual of LoA'

Figure 7.5 Investment of Meaning from Lead to Primary Announcement


Through the Identifying: Intensive process, viewers read the following
meaning in the advertisement:

The LoA | represents | 'people who already have (a statement to make)'
Token | Identifying: Intensive | Value


At this stage, another ideological perspective emerges: the promoting and
endorsing by advertisers of how statement-making women should look.
Should one paradigmatically replace 'It doesn't make a statement. It's for
people who already have one' with 'Dangerous: wanted convict', the LoA
would assume a different meaning. Thus the point is she means what she
means, whether as a statement-making person or as a dangerous convict,
due to Relational processes that invest meaning bidirectionally from the
Primary Announcement to Lead and vice-versa. Barthes's (1977: 40) concept
of 'anchorage' operates when:
the text directs the reader through the signifieds of the image, causes him to avoid
some and receive others; by means of an often subtle dispatching, it remote-controls him towards a meaning chosen in advance. In all these instances of
anchorage, language clearly has a function of elucidation, but this elucidation is
selective . . .
Barthes (1977), however, does not address how this selective elucidation is
achieved. The ongoing discussion on Transitivity processes between the
Announcement and the Lead, resulting in the Investment of meaning bidirectionally, is proposed as the crux to this selective elucidation.
Not all Announcements, however, enter into a Relational process with the
Lead. The Primary and Secondary Announcements in the M1 advertisement
(Plate 7.3) enter into Relational, Verbal, Mental and Material processes with
the Lead (Figure 7.6).

Announcements
  Primary
    Defined as the most Interpersonally Salient Announcement among other Announcements in the same advertisement
    (E.g. (1) '118 Off-peak hours every week')
    The catch-phrase of an advertisement
    (E.g. (2) 'Everywhere under the sun')
  Secondary
    The less Interpersonally Salient Announcement/s among other Announcements in the same advertisement
    (E.g. (1) 'I get the feeling that M1 wants me to enjoy value and enjoy life. Everything they offer is brighter, nicer and more fun!')
    (E.g. (2) 'Bigger value. Better service. Brighter smiles. Nobody covers it all as nicely as M1')

Figure 7.6 The Primary and Secondary Announcements in the M1 Advertisement


At the most obvious level, the LoA in the Lead is the Sayer, with 'I get
the feeling that M1 wants me to enjoy value - and enjoy life. Everything they offer is
brighter, nicer and more fun!' as the Locution and the viewers as the Receiver.
Underscoring this is an ideology of persuasion particularly through the
Attributive:Intensive process in the Locution:

//Everything [[they offer]] | is | brighter, nicer and more fun!//
Carrier | Attributive: Intensive | Attribute


The juxtaposition of the Lead and the Primary Announcement 1 gives rise to
the following possible meanings:

She | has | 118 off-peak hours every week (because she uses M1)
Carrier | Attributive: Possessive | Attribute

She | enjoys | 118 off-peak hours every week (because she uses M1)
Senser | Mental: Affect | Phenomenon

M1 | gives | to her | 118 off-peak hours every week (therefore she is smiling)
Actor | Material | Beneficiary: Recipient | Range

She | is | bright [[to choose M1 (with its 118 off-peak hours perk)]], thus she stands out from the rest
Carrier | Attributive: Intensive | Attribute


Again in Barthes's (1977) words, how does the language 'remote-control'
viewers to read the above meanings in the advertisement? How does the
Announcement selectively elucidate the meaning in the Lead? I propose that
Relational, Mental and Material processes occur between the Announcement
and the Lead, resulting in a preferred reading intended by the advertisers.
Stage 3

The advertisers are able to convey, and viewers are able to receive, the
meaning of a satisfied M1 customer unambiguously because the juxtaposition of Announcements and Lead has resulted in the above Transitivity
processes. Should the Lead be paradigmatically replaced by a visual of
uncongested roads in the city, the meanings conveyed would definitely be
different. Again, although the Announcements are linguistic codes and
the Lead a visual one, their juxtaposition still creates meaning due to the
Transitivity processes between them. These meanings that result from the
interaction between the Announcements and the Lead are built on or modified (depending on the advertisers' intention) by the Enhancer.
An analysis of the Transitivity processes in the Enhancer reveals how it
builds on the meanings generated by the Announcements and the Lead. In
Stage 2b one meaning generated between the Lead and the Primary
Announcement in the M1 advertisement is:

M1 | gives | to her | 118 off-peak hours every week (therefore she is smiling)
Actor | Material | Beneficiary: Recipient | Range

Material processes in the Enhancer build on this meaning, with M1 as Actor.

//Nobody | covers | it all | as nicely | as M1//
Actor | Material | Range | Circumstance: Manner: Quality | Circumstance: Manner: Comparison

//M1 | offers | (to you) | more off-peak hours | than anyone else.//
Actor | Material | Beneficiary: Recipient | Range | Circumstance: Manner: Comparison

//Weekend off-peak hours, | at 50% off | start | at 7 p.m. | every Friday//
Token | Circ: Manner: Quality | Rel Ident: Intensive | Value | Circ: Frequency: Time

+2 //and | last | right through the weekend//
 | Rel Att: Circ | Attribute



Note: Att = Attributive (process), Benef = Beneficiary, Circ = Circumstance,
Ident = Identifying (process), Rel = Relational (process)
M1 is constructed as the Actor providing services benefiting the Recipients.
Thus, the Recipients have only to subscribe to M1 to enjoy all the benefits.
The ideological perspective that emerges is one of persuasion. The Circumstantial:Manner:Comparison ('than anyone else') strengthens this persuasive
voice. So do the Relational participants, which emphasize the frequency and
duration of M1 benefits.
As mentioned earlier in Stage 2b, another meaning that arises from the
interaction of Primary Announcement 1 and Lead in the M1 advertisement
is:

She | enjoys | 118 off-peak hours every week (because she uses M1)
Senser | Mental: Affect | Phenomenon

The Enhancer builds on this meaning of enjoyment through the
clause:

//But that | is | not all [[an M1 customer enjoys]]//
Carrier | Attributive: Intensive | Attribute


In the Relational clauses below, the 'Free talk time' is accorded a Value and an
Attribute, to attract viewers to these benefits that they can enjoy. Herein,
again, lies the ideology of persuasion.

//Free talk time | (is) worth | [illegible] | every month | with the M1 PrimePlan//
Token | Identifying: Intensive | Value | Circumstance: Extent: Frequency | Circumstantial: Accompaniment

//which | can be | as much as 400 min//
Carrier | Attributive: Intensive | Attribute

The reading path for the M1 advertisement is displayed in Figure 7.7,
according to the layout of the original M1 advertisement.

(B) Primary Announcement 1: Second in Salience therefore read second. The meaning of (A) is further enhanced: the LoA is smiling because of the '118 off-peak hours every week'.
(C) Secondary Announcement 1: Third in Salience therefore its reading follows Primary Announcement 1. Meaning of Primary Announcement 1 further enhanced, that is, '118 off-peak hours every week' is not only of good value, it enables one to enjoy life. The '118 off-peak hours every week' is a means to a 'brighter, nicer and more fun' lifestyle.
(A) Lead is visually most Salient therefore viewers interact with it first.
(D) Secondary Announcement 2: Next in Salience after (C) and enhances the meaning of (C).
(E) Enhancer: Read last since it is at the bottom. The Enhancer further builds on what it means to have a 'brighter, nicer and more fun' lifestyle, by describing the benefits.

Figure 7.7 Reading Path for the M1 Advertisement

Halliday's (1994) model of expansion for logical meaning can be
adapted and applied to the M1 advertisement. The notion of expansion
includes:
(a) elaboration, represented by the notation '='
(b) extension, represented by the notation '+'
(c) enhancement, represented by the notation 'x'
In the M1 advertisement, the Logical relations between the elements in the
advertisement may be expressed as:
Lead
X Primary Announcement 1
X Secondary Announcement 1
X Secondary Announcement 2
X Enhancer
Stage 2a mentions the interaction between the Announcement and the Lead
in the Golf advertisement which sets up Relational processes resulting in the
following meanings:

The LoA | represents | 'people who already have (a statement to make)'
Token | Identifying: Intensive | Value

The LoA | has | a 'statement-making personality'
The LoA | is | statement-making
Statement-making individuals | are | beautiful, sensuous, stylish
Carrier | Attributive: Intensive | Attribute

The Enhancer, which is the paragraph detailing additional information
about the car, and by extension the LoA, builds on the meaning of 'statement-making personality'. The following Circumstance types, Carriers and
Attributes are amplifications of what it means to be statement-making:

In its own confident and quiet style [[that has won endless admirations the world over]] | the New Golf | has | come of age | with a sophistication | beyond comparison
Circ: Manner: Quality | Carr | Att: Poss | Attr | Circ: Accomp | Circ: Manner: Quality

//[[Setting itself apart even further ...]] | is | a new and awesome 1.8 litre turbo engine [[to take its performance to a higher level]]//
Carr | Att: Intensive | Attr

//Not only that | the beauty and luxury of the New Golf | is | also | graced | with exciting refinement | both inside and outside//
 | Carr | Att: Poss | Circ: Accomp | Att: Poss | Attr | Circ: Location

Note: Accomp = Accompaniment, Att = Attributive (process), Attr = Attribute,
Carr = Carrier, Circ = Circumstance, Poss = Possessive (process)

The Enhancer functions to amplify the meanings generated between the
Primary Announcement and the Lead. The Display (that is, the Golf), and
the LoA, if unaccompanied by the Enhancer, would not realize the meanings advertisers intended. Figure 7.8 illustrates the reading path according to
the Compositional layout of the Golf advertisement.
Adapting Halliday's (1994) system of Expansion for Logical meaning,
relations in the advertisement may be expressed as:
Lead
= Primary Announcement
X Enhancer

To review, in Stage 1 of Figure 7.3, meaning in the Lead is initially effervescent and unstable. By the time the viewer reaches Stage 3, the initially
effervescent meaning is straitjacketed by the Announcement/s and
Enhancer. The meanings intended by the advertisers become crystal-clear
and are unambiguously communicated by the advertisers and unambiguously received by the viewers. We have moved from a meaning which is
effervescent, unstable and ambiguous to one which is stable and constrained. Any meaning options not intended by the advertisers are
effectively closed off. Suffice to say at this point of the discussion, with
reference to Figure 7.3, that the Interpretative Space is narrower and
the Semantic Effervescence low due to the contextualizing effect of the
Announcement/s and Enhancer in Stage 3. The Contextualization Propensity is thus higher in Stage 3. This is in contrast to the low Contextualization Propensity, wide Interpretative Space and high Semantic Effervescence
of Stage 1.

Figure 7.8 Reading Path for the Golf Advertisement

(A) The Lead is visually most Salient therefore read first. The LoA carries some
meaning but we are not sure yet what meanings advertisers intend her to
have till she interacts with the Announcement.

(B) Primary Announcement read second as it is second in Salience. Through
Relational processes that invest meaning Bi-directionally from
Announcement to Lead and vice-versa, the Announcement serves to define
the Lead as a visual exemplification of the Announcement. There is semantic
equivalence between Lead and Announcement.

(C) Enhancer is least in Salience therefore read last. The meaning
generated through the interaction between Lead and Announcement
is 'the LoA exemplifies people who already have a statement', 'She is
statement-making'. The Enhancer builds on this meaning, that
statement-making people have 'exciting refinement', 'setting
(themselves) even further apart', etc.

Stage 4

The total meaning derived from the interaction between the Lead,
Announcement/s and the Enhancer needs to be read in the socio-cultural
context within which it is placed. The meaning of the entire advertisement,
according to Wernick (1991: 42):
delivers back to the people the culture and values that are their own . . . [it is] a
reinforcement of whatever ideological codes and conditions [that have] come to
prevail.
Moreover, advertisements (Dyer, 1982: 77) project:
the goals and values that are consistent with and conducive to the consumer
economy and [socialize] us into thinking that we can buy a way of life as well as
goods.
However, society's ideologies are in continual evolution and metamorphosis.
The ever-shifting ideologies will influence the way society interprets
advertisements. Whether society reads a marked or unmarked interpretation in the advertisements is 'culturally determined and changes over time
and may also eventually result in a narrowing of the meaning of an option'
(O'Halloran, 1999: 320).
Contextualization Propensity, Interpretative Space and Semantic
Effervescence: a further exploration of Ideational meaning

I discuss in greater detail here the Contextualization Propensity (CP), Interpretative Space (IS) and Semantic Effervescence (SE). As mentioned above,
the generic structure of a print advertisement constitutes visual as well as
linguistic components, and the interaction between these components creates Interpersonal, Ideational and Textual meanings. I further illustrate in
Figure 7.3 that through the Bidirectional Investment of meaning between
visual and linguistic components, the meaning of the visual images, such as
the Lead, is contextualized by the linguistic items, for example the
Announcement/Enhancer. Without the contextualizing function of linguistic
items, the Lead, as previously mentioned, has a bounty, a kaleidoscope of
meaning and has great meaning potential. The CP refers to the degree/
extent to which linguistic items in a print advertisement, be it the Announcement, the Emblem and/or the Enhancer, contextualize the meaning of the
visual images. Thus the degree of interconnectedness and the degree of
interweaving of meaning between the Scene, Episode (O'Toole, 1994) and
the participants/processes in the visual images and linguistic text determine
the degree or extent of contextualization, as illustrated by the Epson advertisement. Such advertisements have high CP. Where a minimum of linguistic items accompany the visual images, and less definable relationships
are established between the linguistic and visual codes, as illustrated in the
Guess? advertisement, the meanings of the visual images are less contextualized. These advertisements exhibit a low CP.
The low CP Guess? advertisement and the high CP Epson
advertisement

Advertisements with a high CP allow viewers to read specific strands of
meaning intended by the advertisers. In the above discussion of the Epson
advertisement, linguistic items (a)-(e) contextualize the meaning of the LoA,
that is, the splash of water in the Epson advertisement, depicted in Plate 7.2.
Linguistic items (a)-(e) are redisplayed below for convenience of reference:
(a) 'EPSON STYLUS PHOTO EX - crystal-clear, photographic quality printing'
(b) //Six specially formulated colour inks deliver richer, more lifelike images//
(c) //while EPSON PhotoEnhance provides realistic colour balance every time//
(d) //The EPSON Stylus Photo EX can transform your photography//
(e) 'EPSON Stylus. The most advanced inkjets.'

Linguistic items (a)-(e) provide the context within which the meanings of the
LoA may be negotiated and established. As the LoA is more contextualized
by linguistic items (a)-(e), the meaning of the LoA becomes more straitjacketed. Such a scenario defines a high CP in an advertisement. With a
high CP, the viewers' interpretation of the LoA is constricted, with a lowered freedom to read other meanings in the LoA given the semantic input by
linguistic items (a)-(e).
The CP, therefore, has ideological implications. A greater Propensity for
Contextualization implies greater effort by the advertisers (through the linguistic items) to introduce specific strands of meanings. One is discouraged
from reading alternative meanings in the LoA given the context provided by (a)-(e). The
viewers thus have limited IS, that is, space to create, invent and author
meaning. This of course does not mean that alternative readings do not
occur. A critical reader can interpret the intended meanings and offer further perspectives other than those intended by the advertiser.
As illustrated in Plate 7.5, the CP is low in the Guess? advertisement as
there is only one lexical item, namely 'Guess?' to contextualize the meaning
of the entire Lead, made up of the LoA, that is, the model whose limbs shine
with metallic sheen, and the Comp.LoA, that is, the background. Apart from
the possible reading that the LoA is in some way related to Guess?, which is
the brandname of a fashion product known for its watches and clothes, and
that there is the underlying message that Guess? fashion is trendy, chic and in
vogue, the entire Lead is an effervescence of meaning as there is a lack of
contextualizing function by linguistic items. Arising from this lack of
contextualization, that is, a low CP, a myriad of interpretations of the LoA is
possible: the LoA with the metallic sheen-like complexion is a probable
personification of the futuristic stance Guess? adopts towards fashion; the
current Guess? trend is the minimalist look, as exemplified by the generous
show of legs and body swathed with a minimum of cloth; the Guess? consumer is bohemian in outlook, as is the LoA whose cascading hair is caressed
by the wind and throws a cold, removed glance at the viewer and the world;
the Guess? consumer looks down at the world in nonchalance, articulating
the superiority of the product and hence the consumer who chooses to use
Guess? products. Guess? is thus selling an attitude, a certain style of living;
Guess? products hint of sexual attractiveness and availability (as signified
through the high-split in the skirt and slightly parted legs), which can be
extended to imply the non-conformist nature of Guess? products, which
challenge the conservative mould of society; Guess? applauds the flat-chested female as opposed to society's fascination with and celebration of the
amply endowed female, again a hint of Guess?'s non-conformist ideology;
dark skies and seas fail to intimidate Guess? consumers, who are able to put
their best foot forward in style and confidence, the Stance (O'Toole, 1994)
adopted by the LoA; Guess? is beyond definition, there is no single aspect to
its fashion statement. Guess? products, it seems in this particular advertisement, have limitless possible interpretations within the semantic realm of
'the desirability' of this label, and that is likely to be the message intended by
the advertisers. The IS in the Guess? advertisement is thus wide. There is a
pun on the Emblem 'Guess?': viewers are left guessing the most likely valid
readings of the advertisement. A low CP is no less ideological than a high
CP. That the advertisers allow viewers a larger and wider IS suggests that
advertisers wish the consumers to purchase the illusion that consumers are
empowered to create meanings for themselves in an advertisement. The
ideology of manipulation is no less evident, for by thinking they have freedom to interpret, the viewers have played themselves into the hands of the
advertisers. They have bought the ideology of Guess?, that is, there is no
single definition of the Guess? fashion statement, so dress the Guess? way
and be open to interpretation by the (admiring?) eye of the public.

Plate 7.5 The Guess? advertisement
Graphical representations of CP, IS and SE
To summarize, a low CP allows a wider IS, as evidenced in the previous
sections. There is greater SE of meaning in the Lead as a result of the lack of
linguistic items, which perform a contextualizing function in a print advertisement. Conversely, a high CP results in a narrower, limited IS, as seen in
the Epson advertisement. There is less effervescence of meaning in the Lead
as viewer choice in the selection of meaning is constrained by the more
abundant linguistic items, which define more tightly the meanings of visual
images in a print advertisement.
The three-way correlation among the CP, IS and SE in the Lead can be
captured graphically, as illustrated below. 'A' in Figure 7.9 indicates the
region in which the Guess? advertisement is likely to be positioned. With few
linguistic items to provide an interpretative context for the Lead (that is, a low
CP arises), there is greater SE in the Lead, and thus greater Interpretative
Space (wide IS) for the viewer to roam and make meaning. This situation
corresponds approximately to Stage 1 in Figure 7.3 above, where the CP is low, the IS wide and the SE high.
The Epson advertisement is likely to be positioned in the vicinity of 'B' in
Figure 7.9, as the advertisement contains an abundance of linguistic text
which provides the context within which the meaning of the LoA may be
derived. That is, there is a high CP, which narrows the IS of the advertisement. The Lead has lower SE due to the high CP. Stage 3 of Figure 7.3
reflects this.
Barthes (1977: 26) writes: 'formerly the image illustrated the text (made it
clearer); today the text loads the image, burdening it with a culture, a moral,
an imagination'. However, how much imagination and by what means the
text loads an image is not explained by Barthes (1977). The CP, IS and SE
which I propose can be used as a tool to elucidate the degree/extent of
imagination invested into a visual image by a linguistic text. The lower the
CP, the greater the SE in the Lead and hence the wider and more open the
IS, indicating greater loading of imagination into the visual image. A high
CP, conversely, limits a freer loading of imagination from text to image as
the IS is narrower.

Figure 7.9 Correlation between IS, CP and SE
CP, IS, SE and Barthes's (1977) notion of readerly and writerly
texts
Barthes introduces the notion that texts vary in the degree to which they let
the reader enter into this creation of meaning from both the Textual and the
extratextual factors. On the opposing ends of the scale, he places 'writerly'
and 'readerly' texts (Bruns, 1998).
Bruns (1998) further quotes Barthes, stating that for readerly texts, the
reader is 'left with no more than the freedom to either accept or reject the
text', as in the case of a technical manual, as opposed to writerly texts, which
offer 'the reader more choice and try much less to push them in one or the
other direction' (Bruns, 1998). Barthes is further quoted, 'The writerly text
. . . has no determinate meaning' and 'can create a number of possible
meanings for readers' (Bruns, 1998).
The Epson advertisement, with its high CP, narrow IS and low SE in the
Lead, may, in the light of Barthes's (1977) readerly and writerly texts, be
construed as tending toward a readerly text, while the Guess? advertisement,
with its low CP, wide IS and high SE in the Lead, lends itself more to being
read as a writerly text.
A new dimension to the 'New' in Kress and van Leeuwen (1996)
Information Value, according to Kress and van Leeuwen (1996: 183), is
Compositionally determined. The information value of a left and right
Composition is construed as Given and New information respectively, where
(ibid.: 187)
[the Given is defined as] something the viewer already knows, as a familiar and
agreed-upon point of departure for the message. For something to be New means
that it is presented as something which is not yet known, or perhaps not yet
agreed upon by the viewer, hence as something to which the viewer must pay
special attention.
However, my proposal is that Given and New information need not be Compositionally determined in this manner of left to right organization. The
Given-New information value may be derived in any print advertisement, in
any layout, whether with left-right or top-down Composition. The Guess?
advertisement is a case in point.

From the low CP in the Guess? advertisement arises a multiplicity of
interpretations of the LoA in the Lead, as discussed above, since there is a
lack of linguistic items to contextualize the meaning of the LoA. That
viewers are given a wider IS to interpret the LoA, and that the LoA remains
Semantically Effervescent indicate that it is the Focus of the advertisement to
present the LoA. To reiterate Kress and van Leeuwen (1996: 187), the New is 'something which is not yet known, or perhaps not yet agreed upon by the viewer'.
Ideationally, the LoA in Plate 7.5 is ambiguous, teeming with possible meanings. The LoA is thus construed as the New, while the Emblem 'Guess?' is the
Given, as viewers are not likely to have any argument with alternative
interpretations of the brandname. Though the Guess? advertisement does
not have a left-right composition, New and Given information can still be
derived, thus strengthening my thesis that there is no need to limit Given-New information to a left-right Composition in a print advertisement.

Figure 7.10 Mapping CP, IS and SE
(a) Degree of Contextualization Propensity: Epson advertisement (greater degree of CP); Guess? advertisement (lesser degree of CP)
(b) Expanse of Interpretative Space: Epson advertisement (narrower IS); Guess? advertisement (wider IS)
(c) Amplitude of Semantic Effervescence: Epson advertisement (lesser amplitude of SE); Guess? advertisement (greater amplitude of SE)
Conclusion: moving towards a topological grammar

The analysis of multi-semiotic texts, such as print advertisements, necessitates the formulation of a topological grammar, one which can handle the
analysis of texts in terms of 'degree, quantity, gradation, continuous change,
continuous co-variation, non-integer ratios, varying proportionality,
complex topological relations of relative nearness or connectedness, or nonlinear relationships and dynamical emergence' (Lemke, 1998: 87). The
concepts proposed in this paper, namely, Contextualization Propensity,
Interpretative Space and Semantic Effervescence, can be seen as topological.
These proposed concepts are resources for articulating gradients and
nuances of meaning, and shades of significance in the multi-semiotic print
advertisement. Figure 7.10 explicates the varying degrees of CP, expanse of
IS and amplitudes of SE.
This paper has proposed a generic structure potential for advertisements,
and, further to this, suggested strategies for construing Ideational meaning in
multi-semiotic texts. There still remains a vast expanse to be traversed, with
exciting opportunities to further explore meaning-making of multi-semiotic
texts from a systemic-functional perspective.
Acknowledgements

Plates 7.1 and 7.4 are reproduced with kind permission of Volkswagen.
Plates 7.2, 7.3 and 7.5 are reproduced with kind permission of Epson,
MobileOne Ltd and Guess? Inc, respectively. The credits for the photograph in Plate 7.5 are due to creative director Paul Marciano and photographer Dah Len.
References
Baldry, A. P. (ed.) (2000) Multi-modality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Barthes, R. (1977) (S. Heath, ed. and trans.) Image-Music-Text. London: Fontana.
Bohle, R. (1990) Publication Design For Editors. New Jersey: Prentice Hall.
Bruns, A. (1998) Major Terms in Structuralism: Text, Reading, Author, Intertextuality,
Discourse, (http://www.uq.au/~zzabruns/uni/en22l-ass05.html).
Cheong, Y. Y. (1999) Construing meaning in multi-semiotic texts - a systemic-linguistics perspective. Unpublished masters thesis. National University of
Singapore.
Cook, G. (1992) The Discourse of Advertising (2nd edn 2001). London: Routledge.
Dyer, G. (1982) Advertising as Communication. London: Routledge.
Goldman, R. (1992) Reading Ads Socially. London: Routledge.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Edward Arnold.
Halliday, M. A. K. and Hasan, R. (1985) Language, Context and Text: Aspects of Language
in a Socio-Semiotic Perspective. Victoria: Deakin University. (Republished by Oxford
University Press, 1989).
Hasan, R. (1996) What's going on: a dynamic view of context in language. In
C. Cloran, D. Butt and G. Williams (eds), Ways of Saying: Ways of Meaning. London: Cassell, 37-50.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (2001) Multimodal Discourse: The Modes and Media of
Contemporary Communication. London: Arnold.
Lemke, J. L. (1998) Multiplying meaning: visual and verbal semiotics in scientific
text. In J. R. Martin and R. Veel (eds), Reading Science: Critical and Functional Perspectives on Discourses of Science. London: Routledge, 87-113.
O'Halloran, K. L. (1999) Interdependence, interaction and metaphor in multi-semiotic texts. Social Semiotics 9(3): 317-354.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
Wee, G. K. A. (1999) A systemic-functional approach to multi-semiotic texts.
Unpublished honours thesis. National University of Singapore.
Wernick, A. (1991) Promotional Culture: Advertising, Ideology and Symbolic Expression.
London: Sage Publications.
White, P. R. R. (1999) An Introductory Tour Through Appraisal Theory. English Language
Research, Department of English, University of Birmingham, (http://
www.grammatics.com/appraisal/Frame.htm).

Multimodality in a biology textbook

Libo Guo
National University of Singapore

Introduction
Introductory biology textbooks in current use in educational institutions
invariably contain words and visual images, for example, schematic drawings, photographs, and mathematical and statistical graphs. Further, it is not
only recently that biology texts have been multimodal; drawings of animals
and plants have been used as an aid to the study of living organisms for
agricultural, medicinal and biological purposes since ancient civilizations
(Ford, 1992).
Sociologists or ethnomethodological researchers, notably Lynch (1990)
and Myers (1990, 1995), have attempted to theorize about the deployment
of visual displays in biology texts. Lynch (1990: 153-154), for instance,
believes that 'visual displays are more than a simple matter of supplying
pictorial illustrations for scientific texts. They are essential to how scientific
objects and orderly relationships are revealed and made analyzable'. In a
similar vein historians and philosophers of science have turned their attention to the evolution and philosophical aspects of scientific (including
biological) illustrations (see, for example, Baigrie, 1996). Although these
investigations have made significant contributions to our knowledge and
understanding, they often seem to lack a coherent framework to explain how
the various visual displays make meaning in their natural and social settings.
These approaches have shown us what is happening through videotape
recordings, verbal accounts and historical documents, but they have not
been explicit enough about the systems and functions that underlie the use
of visual images.
This paper explores the potential of an alternative approach to the study
of meaning-making practices in scientific discourses. This is the social semiotic approach developed by M. A. K. Halliday (1978, 1994) as systemic-functional linguistics (henceforth SFL) and the emerging SFL-informed theory of multimodality (Baldry, 2000a, 2000b; Kress, 2000, 2003; Kress and
van Leeuwen, 1996; Lemke, 1998; O'Halloran, 1999a, 1999b, 2003). Due
to the main purpose of my study, that is, helping non-native university
learners of English cope with English for Specific Purposes (ESP) or English
for Academic Purposes (EAP), I confine myself to the study of textbook
articles in biology, sharing Myers's conviction that textbooks are the type of
writing that university students are 'most likely to face' (Myers, 1992: 3). The
excerpts analysed here are from Chapter 17 Cell Division of Essential Cell
Biology: An Introduction to the Molecular Biology of the Cell (henceforth ECB) by
Alberts et al. (1998). This textbook is used as required reading material for
second-year biology majors for Bachelor of Science degrees at the National
University of Singapore for the module of Cell Biology.
This paper is organized as follows. I first discuss the semantics of biology
and what biologists do that characterizes them as biologists. Second, following O'Toole (1994), Lemke (1998), and O'Halloran (1996, 1999a, 1999b), I
propose frameworks for the analysis of visual images in the textbook, and,
following this, I analyse two multimodal composites and discuss how each
type of resource contributes to meaning-making. This paper concludes by
outlining some of the implications of a multimodal approach for teaching
ESP/EAP to non-native speakers of English.
Biology and multimodality

Biology is 'the study of living things past and present, including their structure, function, chemistry, development, evolution, and environmental interactions', the 'environment' here including both the physical environment
and the biological environment (Purves, 1999: 769). Out of the different
approaches to studying life, two are particularly important to modern
biologists: observation and experimentation. Observation is to experience
the living world and take note of the living organisms. This represents the
naturalist tradition of doing biology, exemplified by Charles Darwin
(1809-1882). And today's biology majors at universities are required to go
on field trips as part of their degree programme. The key to experimentation, on the other hand, is manipulation and control of 'conditions in order
to reveal or produce observations that contribute to the solutions of puzzles'
(Janovy, 1996: 44). 'Certainly molecular biology and all its older relatives
rely on experiments, and experimentation is becoming more a part of ecological field research every day' (ibid.). Observation and experimentation as
two important ways of studying life are reflected in many universities' curricula designed for biology majors. The practical classes for biology majors
in the Department of Biological Sciences at the National University of
Singapore, for instance, account for 27.0 per cent and 39.5 per cent of the
total contact hours of a first- and second-year student's learning life respectively, which strongly suggests that hands-on skills are crucial for a biologist in
training.
On the other hand, a biologist also reads and writes papers, textbooks,
and other documents. Similarly, the bulk of a biology student's learning
time in the first two years at college is spent in classes and tutorials (Haas,
1994: 59-63) where he or she is required to read, write and interpret verbal
and non-verbal messages. Minds-on skills are as important as hands-on ones.
As pointed out by Osborne (2002: 206), 'just as there can be no houses
without roofs or windows, there can be no science without reading, talking
and writing'.
Due to the nature of the inquiry of the discipline and its methodological
approaches, biology texts have always been multimodal, that is, deploying a
range of semiotic resources in addition to natural language. The reason for
this is clear: natural language alone cannot adequately communicate or
construct the process and product of observation and experimentation; the
potential of natural language as a typologically oriented semiotic resource
(Lemke, 1998) falls far short of the semiotic demands of the discipline. For
example, it will be difficult, if not impossible, to describe in natural language
alone the colours, shapes and the flight path of a butterfly.
Like research activities in other fields, many of the investigations in the
biological sciences are quantitative, involving the collection, presentation,
analysis and interpretation of numerical observations. The biological
researchers apparently need an objective method of organizing the data
collected from field trips or experiments. In addition, they must draw sensible conclusions from the analysis of the data. Many have been guided by
statistics. In the US, statistics was first introduced to the university curriculum for biology students as early as 1897 at Harvard (Zar, 1999: x); biostatistics, or biometry, has nowadays become an important part of a biology
student's education. This involves the deployment of appropriate statistical
procedures and graphs in biology texts.
The semiotic demands of the discipline do not stop here. In cell biology,
in particular, recourse to non-linguistic semiotic resources has been necessary since Robert Hooke (1635-1703) first drew a picture of the 'cell' seen
under his microscope as reported in Micrographia (1665). This time-honoured
morphological approach to the studies of the cell, with the help of a light
microscope and an electron microscope, has recently culminated in what we
know as the ultrastructure of the cell. To communicate what was observed
under the microscope, the cell biologists have developed a range of devices,
including light micrographs, electron micrographs, and schematic drawings,
each of which has several sub-types, depending on the techniques adopted.
More recently, however, cell biologists have attempted to investigate the
biochemical basis of the structure and function of the living cell. Rather
than merely describe the mechanical or morphological features of the cellular life, this new approach seeks to account for the cell and cell activity in
terms of the structure and function of its chemical components, the four
major families of small organic molecules (sugars, fatty acids, amino acids
and nucleotides) and the macromolecules (polysaccharides, lipids, proteins
and nucleic acids). Most of the macromolecules normally exist as specific
biologically significant three-dimensional structures called conformations,
for example, the double helix for DNA, the extended chain conformation
for cellulose and the α-helix, β-pleated sheet, β-turn and loop conformations for proteins. As noted by McMurry and Castellion (1999: xvi),
'[u]nderstanding many aspects of chemistry - such as the specificity and
selectivity of enzymes, or the action of drugs - requires understanding the
three-dimensional nature of molecules'. That is, the introduction of biochemistry means that the semiotic demands of the discipline have exponentially increased so that natural language, however important it may be, is
inadequate when deployed as a single resource. As a result, other semiotic
means such as chemical notation, ball-and-stick models, space-filling
models, animations, video recordings and so forth have evolved for communicative purposes. Natural language alone has been inadequate with
morphological research; it is naturally insufficient as a means to describe
both the morphological and the biochemical.
Theoretical frameworks for the analysis of biology texts
In what follows, I present the frameworks for analysing the biology text,
which include the frameworks for analysing schematic drawings and statistical graphs. I also discuss the issue of the reading path in introductory
multi-semiotic texts.
Myers (1990: 233-249) identifies, 'in terms of realism and abstraction'
(ibid.: 247), five categories of visual displays in a sociobiology text: photographs, drawings, maps, graphs/models/tables, and imaginary figures (ibid.:
234). The first three types have some reference to our everyday visual
experience while '[g]raphs, models, and tables redefine space [. . .] so that each
mark has meaning only in relation to the presentation of the claim' (ibid.:
235). In many textbooks on the molecular study of the cell, one of which is
ECB, biochemical symbolism constitutes yet another semiotic resource. For
lack of space, however, I present the frameworks for the analysis of two
common types of visual displays in biology: schematic drawings and statistical graphs.
Framework for analysing schematic drawings

By schematic drawings I refer to those that are designed to depict in a
simplified way some scene or process, actual or imaginary. The functions
and systems chart for the analysis of schematic drawings is displayed in
Table 8.1.
Although the rank scale in the chart follows O'Toole (1994: 24), the
functions and systems are not, unsurprisingly, identical. For instance, in
O'Toole's (1994) model, in the Modal function at the ranks of Work and
Figure, Gaze is an important means deployed by artists to attract the attention
of the viewer. In the biological schematic drawings I have analysed, Gaze
does not appear to figure as an important resource. More importantly, in
the Compositional function, unlike in paintings where usually little more
than a title is provided to indicate what is depicted, in scientific illustrations,
Labelling appears frequently. This feature is related to the pedagogic use of
the schematic drawing. An important part of a biology student's training is
to learn to recognize the shapes of components of an organism and learn
how these components are named by the scientific community; for example,

Table 8.1 Functions and systems in schematic drawing (adapted from O'Toole, 1994: 24)
Unit/
Function

Representational

Modal

Compositional

Work

Overall shape;
Components of the structure;
Whole process;
Phases of the process

Frame;
Size;
Scale;
Perspective;
Full colour or black and white;
Colour contrast;
Shade or light

Gestalt Framing, Horizontals,


Verticals, and Diagonals;
Proportion;
Geometry;
Colour;
Drawing's relation to running text:
Spatial and Colour;
Labelling: Positioning, Colouring and
Leaders

Episode

Shape;
Colour;
Size;
Spatial relation to each other, and to
the structure;
Actions, events

Relative Prominence: Colour, amount


of detail;
Centrality;
Lettering (for label and caption): type
size, style (serif or san serif), Weight;
Line and arrow width;
Numerical sequence

Relative position in the structure or


process;
Colour contrast between components

Figure

Components; Acts

Contrast: Scale, Line, Light, Colour;


Omission of detail

Relative position in the component or


phase;
Colour contrast or similarity;
Subframing

Member

Natural form: Shape, colour, etc. and


spatial relationship to other
components

Stylization;
C onventionalization

Cohesion: Parallel/Contrast in Shape


and Colour;
Reference through language

PRINT MEDIA

201

a certain shape is named the 'stem', or 'root', or 'microtubule'. Labels and


Leaders provide in part the means for the enculturation of the learner into
the discipline of biology. The Representational meaning of the schematic
drawing is what Lemke (1998) calls the 'topological' meaning, especially, the
Shape, Colour, Size, Spatial relation to each other and to the whole structure,
and Action. Such meanings are also typological in that they fall into categories; for instance, the Shape is round, square, rectangular, and so on. But the
predominant aspect of these meanings in biology is topological where the
irregularity defies any linguistic encoding except in the most general terms.
The exact Spatial relations and the moment-to-moment movement in space
can best be shown in a drawing or video recording rather than by verbal
description.
Framework for the analysis of statistical graphs

Statistical graphs for frequency distributions, which include bar graphs, histograms, frequency polygons and so on, derive from data tables, which in
turn originate from linguistic and mathematical expressions of some quantitative relation between a set of variables. In most cases, statistical graphs
make use of the coordinate system, that is, the horizontal x-axis designating
the independent variable and the vertical y-axis designating the dependent
variable stand for the Given, and the space circumscribed by the two axes is
the New where the relations between the two variables are shown. Also the y-axis represents the quantitative information, ratio- or interval-scale data,
which is capable of being scaled. This means that the y-axis's quantitative
values can be turned into visually perceptible heights which can be compared visually: the higher the bar or point, the higher the value of the y
variable for the corresponding x variable. There is nothing in the height of
the bar per se except its assigned meaning concerning the quantitative value.
That is, a statistical graph, quite unlike a photograph, is an abstract theoretical entity although it may have material form: a photograph resembles
the perceptible object while a statistical graph constructs a theoretical object,
which may be invisible to human vision prior to its material formation. The
framework for analysing statistical graphs, adapted from O'Halloran (1996:
161), is presented in Table 8.2.
Table 8.2 Functions and systems in graphs (adapted from O'Halloran, 1996: 161)

Unit: Graph
  Representational: Statistical reality: topological meanings, such as trends, continuous covariations, correlation and frequency distribution; Comparisons of patterns of variation
  Modal: Accompanying text in the form of Caption, Title and Labelling which are emphasized by Size, Positioning, Underlining and Font; Colour, Line width, Shading, Line Solidarity, Arrows; Curvature; Perspective; Framing; Scale; Style of production; Directionality
  Compositional: Gestalt: Framing, Horizontals, Verticals and Diagonals; Positioning; Use of Lines, Curves and Bars; Interconnections established through symbolism and language for the labelling of Participants and Processes; Cohesion: links to the running text

Unit: Episode
  Representational: Change, or Relations between Figures
  Modal: Prominence of interplay
  Compositional: Labelling of interplay

Unit: Figure
  Representational: Participants; Circumstantial features; Portrayal of co-variation associated with process as a Curve, Line or Bar
  Modal: Prominence of individual figures; Displayed trend of process through Line, Bar, Curve
  Compositional: Labelling of Figures through symbolism and/or language; Portrayal of Process between Participants as Axes and Figure with relative Positioning and Size of Figure and salient features as displayed by Lines, Curves, Bars, Colour, Line width, and Shadings

Unit: Part
  Representational: Title; Axes, Scale, Arrows; Labels; Lines, Curves, Intersection points; Slope of Parts of the Figure
  Modal: Stylization; Conventionalization
  Compositional: Cohesion: Parallelism, Contrast; reference through language and/or symbolism

Compared with that for natural language, the Representational meaning
of a statistical graph is more specialized in that it deals only with the relative
numerical relationships between two sets of variables, or how an attribute
of some entities, for example, the height or weight of school children, is
distributed among a sample or a population, the latter being in essence a
comparison between the entities in terms of the attribute. This visually
expressed topological relationship powerfully complements the semantics of
natural language, which is typically typologically oriented (Lemke, 1998).
Depending on the nature of the variable designated by the x-axis, a curve
within a coordinate system may mean either a material process, as the
output of the crop increasing or decreasing, or a relational process, as the
comparison between the entities. Interpersonally and experientially,
although in principle a graph is as reliable or unreliable as the data that
informs its compilation and, for that matter, is subject to the semiotic choices
made in the display (for example, the selection of the scale), whenever a
student encounters a graph in the textbook, he or she is normally expected
to believe it rather than doubt it. That is, the graph carries with it a self-authenticating power and a high modality. Further, as noted by O'Halloran
(1999b: 18), with exceptions '[Interpersonal] strategies for engaging the
viewer of the mathematical visual display do not operate through nuance as
found in forms of art, but rather select for a direct unmarked command,
"look here"'. Compositionally, the x- and y-axes provide the basis or grounding
where the New is expressed in the form of Curve, Line or Bar. The axes
'contribute to stability and harmony' (O'Toole, 1994: 23), while the Curve,
Line or Bar 'create[s] energy and dynamism' (ibid.).
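The point made above, that a graph is subject to semiotic choices such as the selection of the scale, can be illustrated with a minimal sketch in Python. The function name `bar_heights`, the sample frequencies and the 100-unit drawing height are all assumptions introduced for illustration; none of them comes from ECB or from O'Halloran's framework.

```python
def bar_heights(values, axis_min, axis_max, drawing_height=100.0):
    """Map quantitative y values to drawn bar heights on a linear scale.

    axis_min and axis_max are the values placed at the bottom and top of
    the y-axis; drawing_height is the height of the plotting area in
    arbitrary drawing units (a hypothetical figure).
    """
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    span = axis_max - axis_min
    return [(v - axis_min) / span * drawing_height for v in values]


# Hypothetical frequencies for two categories of the x variable.
counts = [50, 60]

# The same data under two different choices of scale for the y-axis.
from_zero = bar_heights(counts, axis_min=0, axis_max=100)
truncated = bar_heights(counts, axis_min=40, axis_max=60)

# Ratio of the second bar's drawn height to the first under each scale.
ratio_from_zero = from_zero[1] / from_zero[0]
ratio_truncated = truncated[1] / truncated[0]
```

With the axis starting at zero, the second bar is drawn 1.2 times as tall as the first; with the axis starting at 40, it is drawn twice as tall, although the underlying values are unchanged. The height of the bar thus carries only the meaning assigned to it by the scale.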
The reading path in introductory texts
Another crucial question with a multi-semiotic text is the reading path it
may create for its hypothetical reader. Underlying this question is the recognition of the page as a Textual unit where various semiotic resources
make meaning (Baldry, 2000b: 42). As O'Halloran (1999a: 322) points out,
'[w]ith multi-semiotic texts, the most important stage is a step-by-step
analysis of the text through the reading path determined by the choices
within different semiotic codes'. It is to be noted that the reading path in a
multi-semiotic text identified by O'Halloran (ibid.) is not linear, from left to
right, or from top to bottom, but typically follows some specific sequence
(see also Kress and van Leeuwen, 1996: 218 ff., 1998: 205-209; Kress, 2003:
156-160).
As will be illustrated below in the analysis of multimodal texts, there seem
to be two aspects to the reading path: the intersemiotic aspect, that is, how the
reader is expected to shift his or her attention from one semiotic to another,
and the intrasemiotic aspect, that is, how the reader is expected to move from
one component to another within one semiotic mode. Very often the
intersemiotic aspect of the reading path in an introductory textbook is that
after a brief 'modal "scanning" of the page' (Kress, 2003: 159), or a quick
perusal of visually salient elements, usually images, the reader moves from
the verbal text (expressed by specific typographical features) to the non-linguistic resources and then to the verbal text again, thus following a back-and-forth reading sequence. Initially the visual image on the page exerts a
strong impact upon the reader through choices such as Colour and Framing.
After the initial visual impact subsides, however, he or she normally begins
to study the verbal text and may later be linguistically instructed to study the
visual image in greater detail. The relative privilege the verbal text enjoys in
the reading path is partially explained by the fact that at this stage of the
student's education it is largely through the verbal language in the running
text that he or she is instructed explicitly when to view and study the
non-linguistic resources and how to interpret them. In other words, how a
multi-semiotic text 'indicate[s] to the reader/viewer the possible ways of
reading the text and the relative information priority to be assigned to the
different component parts of the overall visual composition' (Baldry, 2000b:
42) and how a reader/viewer is expected to respond to the text constitute a
visual semiotic strategy to realize the educational goals within which a multimodal text is constructed and interpreted. As the context of situation and
culture (Halliday, 1978) within which the text operates changes, the reading
path, along with the nature of the semiotic resources employed, also differs.
For example, Kress and van Leeuwen (1996: 219) and Lemke (1996: 216)
have observed that some scientists have non-sequential reading habits and
that many books are not designed 'to be read only in the strict linear order in
which the text appears on the pages' (Lemke, 1996: 216). This applies to
established scholars or scientists, people who are beyond their basic 'military
training' periods (Knight, 1992: 6, 143) and for whom texts are created and
read for research purposes rather than learning purposes. Students of biology or literature or any other field need eventually to learn non-sequential
reading. While they are still students, however, they may need in the multimodal textbook page a clear, pre-coded reading path in order to enter the
paradigms of contemporary science (Kuhn, 1996) and the practices and
conventions that characterize scientific activity.
Analysis of the multi-semiotic text: two examples
Drawing upon the discussion of the frameworks for schematic drawings and
statistical graphs presented above, this section contains an analysis of a
schematic drawing and a statistical graph from ECB.¹
Analysis of a schematic drawing: Figure 17.3

Figure 17.3 (ECB, p. 549), together with the relevant verbal text, is reproduced in Figure 8.1 (see Note 1).
The reader is formally introduced to Figure 17.3 when he or she reads the
following clause:
These two processes together constitute the M phase of the cell cycle
(Figure 17.3).

However, he or she may not wait until being instructed to view Figure 17.3.
Since Figure 17.3 is a full-colour drawing, a picture more attractive than the
largely black and white verbal text, a reader's attention is more likely to be
drawn to the drawing than to the written description. Thus one plausible
reading session may be that a reader, at some point in his or her reading,
turns his or her attention to the figure, and then back to the verbal text for
careful study and then back to the figure again, following a back-and-forth
type of reading path as explained above.

Figure 8.1 Reading path for Figure 17.3


The reading path within Figure 17.3 is marked in Figure 8.1 by the
capitalized and italicized Roman letters A to G. As is clear from Figure 8.1,
the reading path is not linear, from left to right, from top to bottom, but is
determined ideationally by what is in focus in the running text (the M phase
of the cell cycle), and interpersonally by the visual means of directing the
reader's attention (for example, the bright yellow Shading and Capitalization of MITOSIS and CYTOKINESIS and light green Shading of M phase and the
large square bracket embracing MITOSIS and CYTOKINESIS in the original text).
This is, in verbal and common parlance, equivalent to saying 'Hey, look at
what is highlighted first!'. Indeed, in this part of the reading, Steps C and D
are all an experienced reader needs to attend to. The highlighting devices
such as arrows are equivalent to a lecturer's cursor in an actual classroom,
where he or she, while talking to the students, points to relevant parts of the
figures. Although in viewing Figure 17.3, one's gaze, especially that of a
novice, may work from Step G down to Step D due to the Interpersonal
impact of the downward-pointing arrows and the reading habit of a normal
reader, it is nonetheless arguable that the reading path suggested above is
most economical for the experienced reader, that is, one that has followed
the Textual explication up to this point.
At the rank of Work, interpersonally, this figure thus employs an array of
visual means to emphasize various parts of the cell structure and stages of
cell division. Ideationally, the figure is designed to tell a story about what
happens in a cell cycle, in particular the M phase of the cell cycle. The
ideational meanings include: (a) material processes realized by changes in
the shapes at different stages, the arrows and the nominal groups in the
linguistic text; (b) intensive identifying processes realized by the labels, leaders and the pictorial elements or, in the absence of leaders, by the labels,
the pictorial elements and the spatial proximity between them;
and (c) possessive identifying relational processes realized
by the labels, square bracket, the pictorial elements and the linguistic text.
The overriding Experiential content seems to be concerned with material
processes, although the intensive and possessive relational processes contribute significantly to the construction of biological knowledge. And textually, the drawing is not isolated from other parts of the text. It is related to
the main text and the caption and is placed in a specific position on the
page. The drawing is vertically positioned, with the Arrows connecting one
stage with another. Other resources employed for the textual meaning
include Geometry (e.g. circles), Colour Contrast or Similarity, Labelling
(with or without leaders) and Framing. In what follows, I analyse selected
steps in terms of the Interpersonal (Modal) meaning, Ideational (Representational) meaning and Textual (Compositional) meaning, by reference to the
functions and systems chart in Table 8.1.
Step A: the title

Distinctive typographical features, such as the bold face of the title and the
greenness of the figure's caption number, function to attract the reader's
attention and thus attach more importance to this linguistic message. The
title is also the only explicit link to the main text; it is the reader's entrance to
the pictorial world of the figure. It is designed to be read first and taken as
the point of departure for what is to come next.
The title is a nominal group and apparently does not select an Interpersonal stance at the rank of clause in terms of SPEECH FUNCTIONS (Offer or
Demand) and MODALITY and MODULATION (Halliday, 1994). This is a nominal
group whose function is termed by Halliday (1994: 96) as 'Absolute' in that it
'could be either Subject or Complement in an agnate major clause'. Indeed
all the linguistic components except the caption in Figure 17.3 are
'[u]nattached nominals' (ibid.: 395) which function in this way. But such
nominal groups are nonetheless far from being free from any Interpersonal
meaning. As for this title, the nominal group presents the Process of a cell
dividing as a Thing, which is objective, absolute, visible and concrete. Such a
high level of certainty about the state of affairs is attainable through nominal
groups or grammatical metaphor in the form of nominalization (Halliday,
1993, 1998). In other words, distillation of phenomena into entity or transformation of clausal grammar to nominalized form means that the reader is
not in a position to doubt the existence of a phenomenon, but is led to
believe in its absolute, timeless and unconditional existence.
Ideationally, being a nominal group, the title serves to identify, and is thus
equivalent to an intensive identifying clause (Halliday, 1994: 119-120), for
example 'This is the drawing of the M phase of the cell cycle'. It is important to note that the nominal group identifies not only through language but
also by its spatial proximity to the schematic drawing. By itself this nominal
group points to a nominalized process, the M phase of the cell cycle. Thus a
sequence of dramatic events, where one cell splits into two, has been transformed into a thing which has consequently been deprived of all the original
vigour, liveliness and particularities.
Step C: mitosis

This step can be broken into three sub-stages: Step C-1 the word
'MITOSIS', Step C-2 the arrow and Step C-3 the circle and the two overlapping circles which contain the semiotic depiction of the cell.
Step C-2: the arrow
Interpersonally, the single-headed arrow is a Command; it demands that the
reader look in the direction of the arrow, in this case, from top to bottom of
the page. Here, the Command effect is strengthened by the particular darkness and thickness of the arrow.
Ideationally, the arrow serves to signify the process and direction of
movement, change or progression, or the numerous intermediate phases
between the circle above and the circles below. In terms of Peirce's (1985:
912) trichotomy of signs into an index, an icon or a symbol,² the arrow is a
highly stylized icon. That is, the arrow proper does not exist in the actual
world in the process of cell division; the designers have added it to the depiction. Besides, the direction of the arrow in the physical sense, that is, from
top to bottom, is iconic of progression in time.
Step C-3: the circles
Inside the circle (second from top), highlighting devices such as the Colouring of the two pairs of lines and pink Shading in the original text serve to
draw attention to the essential defining features of a cell at this stage. The
blank space (Omission) between the outer ring of the circle and the pink-shaded central area is, in reality, just as occupied as other parts of the cell.
This distortion functions as yet another means of highlighting the two pairs
of lines. The outermost black circle and the adjacent blank space inward
(Omission) surround the central pink-shaded area, serving as a Framing to
give weight to what is highlighted in the centre. Although not evident here in
the black and white reproduction, the Contrast of colour between black, red,
pink and white serves the same highlighting purpose.
Ideationally, the circle is drawn to represent a snapshot of a particular
stage in cell division. It focuses on the separation of the two pairs of
chromosomes, omitting the changes taking place in the cytoplasm. The
Ideational meaning is realized by the changes in shapes and contents of the
pink-shaded area and also by the Diagonal orientations of the two pairs of
lines representing chromosomes. We need to note that this circle is not an
obvious icon (Peirce, 1985). The two pairs of lines inside and the circular
shapes do somewhat look like some types of cell components, hence they are
iconic. But the colours, the circle and the blank space testify to the symbolic
nature of the iconic sign. For instance, the colour of a particular cell component one sees in a micrograph is the result of dyeing technique. However,
what is shown in the micrograph is not necessarily reproduced in a schematic drawing; in a drawing further treatment is carried out to produce
what appears in the final printed book. In other words, what meets a
reader's eye in a schematic drawing is at least two steps away from what is
really there: in terms of choice of colour and diagrammatic transformation.
Compositionally, several devices contribute to the organization of the
text. For instance, Colour Cohesion and Contrast enable the viewer to recognize similarity and difference in the Ideational meaning and Interpersonal meaning: in the colour reproduction appearing in the textbook, the
colours red, pink, black and white serve as a backdrop against which the
Ideational and Interpersonal meanings are expressed. Similarly, the shapes
of the components, that is, the lines, circles, and the Relative Position of
the components also constitute a resource to organize the text. Below I
discuss in greater detail the role of Horizontals, Verticals and Diagonals in
the Textual organization in the schematic drawing.
The two pairs of lines in the first circle in Step C are positioned diagonally relative to the vertical-horizontal frame of the drawing. In the original text, the pair to the right are coloured red and the pair to the left are
black. The red pair resembles the contour of a hill or sea wave, each of
which is perceived as the trace of drastic movement or thrust resulting from
the physical or geographical forces such as the gravitational pull. The axis of
the black pair is approximately 30° anticlockwise to the vertical axis of the
drawing. This tilt or obliqueness creates 'directed tension' (Arnheim, 1974:
424-428), or 'energy and dynamism' (O'Toole, 1994: 23; Thibault, 1997:
315-322). We may note that whereas the shape of the red pair of lines
remains roughly constant throughout the drawing, the black pair tilts most
in Step C. This well fits the Ideational theme of the step, which is concerned
with drastic change in terms of chromosomes in the nucleus. On the other
hand, the Diagonal orientations of the two pairs of lines in the step also
serve to connect this step with the preceding and following steps, thus contributing to the Textual organization or unity of the drawing. In other words,
obliqueness in orientation of the lines is echoed or shared by all the steps in
the drawing albeit to varying degrees. It is true that in the laboratory cell
biologists will know that the cells are undergoing some transformation however they are aligned relative to the mechanical stage of the microscope. But
when cells are represented in micrographs and in particular in schematic drawings, that is, when they are turned into lines, circles and so forth, to contribute to the Textual organization, 'the canons of classical painting' (Bastide,
1990: 199-200) are often respected. One such canon is what has
been discussed in this paragraph, that is, the deployment of oblique lines
to represent 'energy and dynamism' (O'Toole, 1994: 23; Arnheim, 1974:
424-428).
Step E: the caption
The black and white caption has less visual salience through the smaller
font size, normal type (i.e. not bold face) and shorter leading. This suggests that the caption is to be read later in the reading sequence. Space
limitations preclude a detailed analysis of the lexicogrammatical features
of the caption. It is worth noting, however, that Ideationally the caption
presents a possessive identifying relation and circumstantial identifying
relation, realized respectively by the verbal groups 'consists of' and 'followed by'. This repeats the information presented in the main text (for
example in the clause 'These two processes together constitute the M
phase of the cell cycle'). The caption, however, serves in particular to
specify what the square bracket in Step B refers to, that is, a visual iconic
expression of a possessive identifying relation. Here we can appreciate
that while the visual images are important in biological texts they have to
be given categorical meanings by linguistic resources. The value of the
visuals in this figure is that, in addition to representing or constructing the
shapes of biological entities, they are a spatialization or icon of the temporal flow of events and also help to construct a taxonomy of biological
terms (the relations between M phase, mitosis and cytokinesis). However,
language has to specify the relations and their visual transformation
(Barthes, 1977: 38-41).
Step F: chromosome replication
This step can be broken into two sub-stages: Step F-1 the words 'chromosome replication' and Step F-2 the arrow.
Step F-2: the arrow
Compared to the others, this arrow is short, indicating less Prominence in
the figure. This arrow also leads the reader's attention to the next visual
representation. Ideationally, this arrow denotes the process by which one
pair of chromosomes duplicates into two pairs. One needs to note, however,
that the shortness of this arrow misrepresents the length of the time period.
That is, replication in the S phase takes much longer than the M phase. A
typical eucaryotic cell spends only a small fraction of its cell cycle time in the M phase,
and most of it in interphase, as noted in ECB, p. 549. For example, a
mammalian cell of a 24-hour cell cycle requires only about one hour for the
M phase to complete. This misrepresentation of the temporal dimension
functions to highlight the M phase of the cell cycle.
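The extent of this temporal compression can be made concrete with a small calculation. The sketch below uses only the two figures cited above (a 24-hour cell cycle and an M phase of about one hour); the variable names are illustrative, and interphase is taken here simply as the remainder of the cycle.

```python
# Figures taken from the example above: a mammalian cell with a
# 24-hour cell cycle and an M phase of about one hour.
cycle_hours = 24
m_phase_hours = 1

# Share of the whole cycle occupied by the M phase and by the rest
# of the cycle (interphase).
m_fraction = m_phase_hours / cycle_hours
interphase_fraction = 1 - m_fraction

print(f"M phase: {m_fraction:.1%} of the cycle")
print(f"Interphase: {interphase_fraction:.1%} of the cycle")
```

On these figures the M phase occupies roughly 4 per cent of the cycle, whereas in the drawing it is given the greater part of the visual space.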
Step G: the structure of the cell

Step G is located at the top of the figure and an uninitiated reader may
begin viewing the figure here as this step provides the background for
what follows. The labels in this step and the leaders functioning as the
identifying processes disappear in the later depictions. This means that
once they have fulfilled their contextualizing function, they are discarded
and are no longer made visible. Having previously established the structure of the cell in ECB, the reader is now invited to study in detail the M
phase. As argued above, the experienced reader reads Step A first and this
step last or simply skips this step, as would perhaps a lecturer in the
classroom. This step can be read in two sub-stages: G-1 the circle and
G-2 the labels.
Step G-2: the labels
Like 'chromosome replication' in Step F, the words in Step G-2 are made
least prominent by means of smaller font size, no Shading and no Capitalization. The leaders are also made insignificant by means of Length and Weight.
Ideationally, they identify the major components of a cell, as if saying, for
example, 'This is the nucleus of the cell'.
Having explored the systems and functions that a schematic drawing in a
cell biology textbook has created and drawn upon to make meaning, I now
undertake a partial analysis of a visual image of a different kind, a statistical
graph, before discussing the implications of this research.
Analysis of a statistical graph: Figure Q17.1

Figure Q17.1 (ECB, p. 550) is a statistical graph which appears in Question
17.1. Here the students are expected to solve the problem by reference to
information from the main text and the verbal section of the question and
the graph. The Question including the graph is reproduced in Figure 8.2
(see Note 1). The discussion below briefly deals with the reading path,
the Ideational meaning of the graph and how the graph contributes to the
problem-solving required to answer the question.³
The expected reading path for this multimodal composite involves a shuttling between the verbal and the visual: from the main text to the Question
(including the graph), then to the 'problem' part of the Question and relevant main text, and finally to the graph again. Within the graph, after
locating the orientations of the graph and identifying what the horizontal x-axis and the vertical y-axis refer to, the reader would survey the green-shaded curve which is supposed to carry the New. At this stage the reader
may have to mark the graph to solve the problem.

Question 17.1 Cells from a growing population were stained with
a dye that becomes fluorescent when it binds to DNA, so that the
amount of fluorescence is directly proportional to the amount of DNA
in each cell. To measure the amount of DNA in each cell, the cells were
then passed through a fluorescence-activated cell sorter (FACS), an
instrument that registers the level of fluorescence in individual cells.
The number of cells with a given DNA content were plotted on a graph,
as shown in Figure Q17.1. Indicate on the graph where you would
expect to find cells that are in the following stages: G1, S, G2 and mitosis.
Which is the longest phase of the cell cycle in this population of cells?

Figure Q17.1

Figure 8.2 Reproduction of Figure Q17.1

Ideationally, at the rank of Graph, the graph shows visually the Result (or
part of the Result) of an experiment, the frequency distribution of cells with
different DNA contents in a population of growing cells. The x values refer
to the DNA content per cell, as the label indicates, and the y values the
number of cells with a given DNA content. In other words, cells in the
population are divided into various types according to the amount of DNA
the cell contains: the type of cells on the right of the x-axis contains more
DNA than a type on the left. The value on the y-axis records the number or
frequency of occurrence of each type of cells in the population. A higher
point on the graph means that the number of cells of a particular type is
greater. Thus Ideationally the graph is a visual equivalent to a group of
linguistic relational processes through its Curvature. In addition, this graph
shows the 'conceptual relations, and not actual data' (Lemke, 1998: 102).
For instance, we are not told how many cells there are in the population, the
exact number of cells with different DNA contents, nor how much DNA
each cell contains, as there is no indication of the unit of measurement on
either the x- or y-axis. We are provided with the theoretical relation between the
two variables: the type of cell defined by its DNA content and its frequency
of occurrence in the population.
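How a frequency distribution of this kind is derived from individual measurements can be sketched as follows. This is only an illustration under stated assumptions: the function `frequency_distribution`, the relative DNA contents and the division into four classes are hypothetical, and the numbers are not data from ECB.

```python
def frequency_distribution(dna_contents, lowest, highest, n_bins):
    """Count how many cells fall into each DNA-content class.

    The classes (bins) divide the range [lowest, highest] into n_bins
    equal intervals; the returned list gives the y value (number of
    cells) for each class along the x-axis.
    """
    width = (highest - lowest) / n_bins
    counts = [0] * n_bins
    for content in dna_contents:
        if not lowest <= content <= highest:
            continue  # ignore measurements outside the plotted range
        # The top edge of the range belongs to the last class.
        index = min(int((content - lowest) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical relative DNA contents per cell, in arbitrary units.
cells = [1.0] * 6 + [1.3, 1.5, 1.7] + [2.0] * 3

counts = frequency_distribution(cells, lowest=1.0, highest=2.0, n_bins=4)

# A crude text rendering: each '#' stands for one cell, so a longer
# row corresponds to a higher point on the graph.
for i, n in enumerate(counts):
    print(f"class {i + 1}: {'#' * n}")
```

The individual measurements disappear in the result: what remains is the relation between each class of DNA content and its frequency of occurrence, which is the relation the graph displays.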
It is worth noting that this Ideational meaning resides uniquely in a graph
and that it cannot be expressed as effectively by a verbal text or a mathematical equation. For Figure Q17.1 visually expresses the general abstract
pattern, or spatializes the quantitative relationship. It is a document with
visual impact, one that enables the viewer or reader to 'take in' the pattern
at a glance. However well a verbal clause or clause complex or a mathematical
equation may express the trend or relationship, a graph always does so with
a strong visual impact.
I would also like to note that just as the move from concrete data recording to the abstract relationship between the values of two variables may
involve grammatical metaphor (Halliday, 1998), the visualization of the
abstract relationships may involve semiotic metaphor as formulated by
O'Halloran (1999a, 2003, forthcoming). By semiotic metaphor, O'Halloran
(2003: 357) refers to the phenomenon in which 'when a functional element
is reconstrued using another semiotic code' there may occur 'a shift in the
function and the grammatical class of [the] element, or the introduction of
new functional elements'. The formulation of semiotic metaphors involved
in the movements between natural language, mathematical symbolisms and
visual displays is crucial for the ultimate solution to mathematical problems,
as demonstrated by O'Halloran (1999a, 1999b, 2003, forthcoming).
Here I analyse the movements between the verbal text and the visual text
in Question 17.1, which involves instances of semiotic metaphor. 'The
number of cells with a given DNA content' in the verbal section of Question 17.1 functions as one participant, the Goal, with 'The number of cells'
as the head and 'with a given DNA content' as the embedded Postmodifier
(Halliday, 1994: 191-192). Experientially the 'cells' functions as the Thing
and 'with a given DNA content' the Qualifier. But the elements 'The number of cells' and 'with a given DNA content' do not mean only within
language; they are also to mean intersemiotically, that is, in relation to the
visual text. In other words, the Head and Postmodifier composite in the
linguistic text is transformed into two separate participants in the visual
text, the two variables represented by the x-axis and the y-axis perpendicular to each other. This shift from one linguistic participant to two visual
participants of equal status may be considered an example of 'parallel
semiotic metaphor' (O'Halloran, 1999a: 348) in that the two participants in
the second semiotic derive from the Goal in the first. This movement from
the linguistic to the visual code permits, however, the exploitation of the
meaning potential of the visual semiotic. Once this shift has taken place, it
is possible to represent the relationship between the number of cells and the
amount of DNA content per cell in terms of the height of the points or
lines in the coordinate system and to make visual comparisons and even
hypothesize some mathematical relationship between the two variables.4 The
precise shape of the curve in the visual text did not exist in the linguistic
text and thus may be considered as a case of 'divergent semiotic metaphor'
(ibid.) because a new participant is introduced with the movement from the
language to the visual image. In this case the divergent semiotic metaphor
(the curve) occurs as a consequence of the parallel semiotic metaphor (the
introduction of two participants). As will be clear shortly, the solution of the
problem depends to a large extent on how much sense the student can
make of the two instances of semiotic metaphor together with the information contained in the main text. In what follows I discuss two questions: (a)
how do the Question and the graph relate to the main text? and (b) how do
the main text and Question (including the graph) contribute to the solution
of the problem?
The relationship between the question, the graph and the main text
The relevant main text reads:
During S phase (S = synthesis), the cell replicates its nuclear DNA, [. . .] S phase
is flanked by two phases where the cell continues to grow. The G1 phase (G =
Gap) is the interval between the completion of M phase and the beginning of S
phase (DNA synthesis). The G2 phase is the interval between the end of S phase
and the beginning of M phase. (Figure 17.4).
(ECB, p. 550)
This means that if a cell in G1 phase has 2n units of DNA content, then by
the end of S phase ('replicates its nuclear DNA'), it has doubled the amount
of nuclear DNA content and in the G2 and M phases, it has 4n units of
DNA content. That is, the amount of DNA per cell in G2 and mitosis is
twice the amount in G1, and S phase is the transition from 2n to 4n units.
Then how do Question 17.1 and the graph relate to such information
contained in the main text? The main text reveals the general facts, the
'laws' in biology, the conclusion, and/or the theory, which scientists arrive at
from numerous experiments (as can be seen in the use of simple present
tense in the quotation above). Question 17.1 (including the graph), on the
other hand, reports just one experiment, complete with Method and Results
of an experimental report (the verb tense in some of the first few clauses in
the Question is the simple past, for example, 'were stained' and 'were then
passed'). That is, the main text presents the conclusion and the Question
presents one of the experiments leading to such a general conclusion.
Question 17.1 is not, however, a real experimental report, but rather it is a
textbook question. In a real experimental report, the conclusion is presented
in the final part while in the textbook question the conclusion is the point of
departure and the student is expected to apply this general rule to solve a
practical problem.
The contribution of the main text, question text and the graph in solving
the problem
There are two parts to the Question. The first part reads: 'Indicate on the
graph where you would expect to find cells that are in the following stages:
G1, S, G2, and mitosis'. To answer this question, the student must understand
the change in the amount of DNA content at different stages of the cell
cycle. That is, he or she must understand the relevant part of the main text
quoted above. Then in relation to the question he or she must also know
how to interpret the x-axis and know that at point b, the point on the x-axis
corresponding to Peak B (which is not displayed in Figure 8.2) the amount
of DNA per cell is twice that at point a, the point on the x-axis corresponding to Peak A (again not displayed in Figure 8.2), and that Peak B is therefore
the place where one would expect to find cells in G2 and mitosis phases
(chromosomes replicated, doubled) and Peak A the place to find cells in G1
phase (chromosomes not yet replicated). Here the ability to deduce b = 2a
on the x-axis is crucial to the solution of the problem. To know where to find
the cells that are in the S phase, the student must again understand the
relevant main text. He or she must also be able to translate such main text
information into the line segment ab on the x-axis and know that cells that
are in the S phase can be found between Peaks A and B.
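The reading of the x-axis described above can be illustrated with a short sketch. It is not part of ECB or of Question 17.1: the function name phase_for_dna_content, the tolerance parameter tol and the numerical values are assumptions introduced purely for illustration, since the graph itself gives no units.

```python
def phase_for_dna_content(x, a, tol=0.05):
    """Return the cell-cycle stage(s) expected at DNA content x.

    a   -- DNA content at Peak A, in arbitrary (hypothetical) units
    tol -- assumed relative tolerance for treating x as lying "at" a peak
    """
    b = 2 * a  # the key deduction in the text: b = 2a
    if abs(x - a) <= tol * a:
        return "G1"                      # chromosomes not yet replicated
    if abs(x - b) <= tol * a:
        return "G2 or mitosis"           # chromosomes replicated
    if a < x < b:
        return "S"                       # replication in progress, on segment ab
    return "outside the expected range"


# Example with an invented value a = 1.0 for Peak A
for x in (1.0, 1.5, 2.0):
    print(x, phase_for_dna_content(x, a=1.0))
```

The sketch makes explicit that the graph alone cannot separate G2 from mitosis, because both stages share the same DNA content.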
The second sub-question reads: 'Which is the longest phase of the cell
cycle in this population of cells?'. To answer this question, the student needs
to interpret the divergent semiotic metaphor, that is, he or she needs to know
how to interpret the frequency graph. That is, Peak A is the highest, indicating that the number of cells with this DNA content, that is, cells at G1 phase,
is the largest. This further suggests that G1 is the longest phase of the cell
cycle, assuming that the cells were selected on a random basis.
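The inference from frequency to duration can likewise be sketched in a few lines. The counts below are invented for illustration (the Question reports no numbers), and the function name longest_phase is hypothetical. The sketch relies on the same assumption as the argument above, namely that the cells were sampled at random, so that the share of cells found in a phase approximates the share of the cycle spent in that phase.

```python
def longest_phase(cell_counts):
    """Return the phase holding the most cells and each phase's share.

    Assumes random sampling, so that a phase's share of the cells
    approximates its share of the total cycle time.
    """
    total = sum(cell_counts.values())
    shares = {phase: n / total for phase, n in cell_counts.items()}
    return max(shares, key=shares.get), shares


# Invented counts, chosen only so that Peak A (G1) is the highest
counts = {"G1": 600, "S": 250, "G2 or mitosis": 150}
phase, shares = longest_phase(counts)
print(phase, shares)
```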
In this subsection I have discussed the essential role that a knowledge of
the linguistic and visual resources and how they interact with each other
plays in the solution of an in-text problem in ECB. In the final section of the
paper I discuss the implications of the preceding analyses.
Multimodal meaning-making: some concluding remarks
This paper has proposed tentative frameworks for the analysis of two types
of visual displays common in biology texts and has attempted to apply them
to the analysis of multimodal meaning-making in the biology text. As may
be clear from the above discussion, the visual images in the biology text are
not redundant with language in meaning-making; they extend and complement it. The words, on the other hand, specialize in a range of typological meanings and certain Interpersonal and Textual meanings and thus
'anchor' and constrain the many possible meanings made in the visual
(Barthes, 1977: 38-41). One is dependent upon and co-contextualizes the
other (Thibault, 2000: 312). To understand the text, as in Figure 17.3 in
ECB, or solve a problem, as in Question 17.1, the reader must be able to
integrate the meanings made in the linguistic and the visual codes.
My analysis has also shown that each type of visual display carries with it
different sets of conventions of meaning-making, not only in the deployment and interpretation of combinations of ink or paint (dots, lines, curves,
etc.) but also in their relations to the verbal text. For example, a schematic
drawing, such as Figure 17.3 analysed above, spatializes the ideational
meanings made in the verbal text, while a statistical graph, such as
Figure Q17.1, transforms a set of quantitative data into a visually perceptible object. Although many aspects of the interstratal relationship
between the visual signifiers and their signifieds remain to be explored
(Thibault, 1997: 329-334), my analysis in the paper has shown that the
visual displays in disciplinary discourses as exemplified in the biology text
are important for meaning-making and that what they mean and how they
mean it are not always self-evident or universal.
What does all this mean for teaching and research in ESP/EAP?
Research in these areas, both for native speakers of English and for non-native
speakers, has almost exclusively concentrated on language issues (see, for
example, Swales's (2001) review of the developments of ESP/EAP in the
past forty years), assuming that once the learner crosses the language barrier,
he or she will achieve academic success. Language, of course, constitutes
our major means of meaning-making and may continue to be one of the
problems that hinder one's progress through his or her career. But as I have
shown in this paper, following Myers (1990, 1995) and Thibault (2001), in
biology textbook genres language is only one resource for making certain
kinds of meaning. It is simply not able to make certain topological meanings
required in certain contexts and it means what it does mean in the first place
only in co-deployment and co-contextualizations with other resources
(Thibault, 2000: 312, 362). In professional scientific practice, as well-attested
by Lynch and Woolgar (1990) and Lemke (1998), 'as the fine edge and the final
stage' of some laboratory research, the 'tiny set of figures' drawn on the
paper rather than the '[b]leeding and screaming rats' in the lab 'is all that
counts' (Latour, 1990: 39-40; emphasis in original). And the grant-proposals
in engineering must be written and designed in a way that enables the peer
reviewers 'to find the abstract, [mathematical] formulas, tables, illustrations,
and references with ease' (Johns, 1993: 82). Thanks to the pioneering work
of Kress et al. (2001), Lemke (2000), O'Halloran (1996, 2000), Scott and
Jewitt (2003) and Johns (1998) we have been able to see that in science
classrooms 'learning can no longer be treated as a process which depends on
language centrally, or even dominantly [. . .] Learning happens through (or
[. . .] learners actively engage with) all modes as a complex activity in which
speech or writing [are] involved among a number of modes' (Kress et al.,
2001: 1). Therefore, we ESP/EAP teachers and researchers need to take
seriously the multimodal nature of meaning-making in academic
apprenticeship and professional life and refocus our research and teaching
agenda so as to better prepare our students for their current and future
academic and professional life. We need, for example, to complete more
research into the nature of the interactions between the verbal and the
visual in various genres and in various disciplines rather than assuming a
universal model. This is particularly important when the learner of ESP/
EAP is a university student from a non-English-speaking background, where
the visual images need to be related to the verbal resources in English. ESP/
EAP teachers will be expected to 'give students a visual grammar that
supplements the grammar of English' (Baldry, 2000b: 53) and organize
practical classroom activities that are geared towards the development of
the 'multimodal communicative competence' (Royce, 2002: 192). For want
of a thorough and systemic description of the 'full system of relations' that
several semiotic codes simultaneously enter into (Thibault, 2000: 362), we
may introduce our students to some basic multimodal analytical tools and
principles and then encourage them to reflect on the intersemiosis in specific
instances in their disciplines. Following Baldry (2000b) we may guide our
students to compare the contemporary multi-semiotic meaning-making
with that of the past. This guided reflection may benefit the students as
well as ESP/EAP teachers. In many circumstances it is also desirable for
the ESP/EAP teachers to consult the expert staff about the intersemiotic
meaning-making in their teaching and professional research (Johns, 1998:
193). It is encouraging to note that since the mid-1990s Baldry (2000b) and
Pavesi and Baldry (2000) have taken significant steps to design and offer
multimodal ESP/EAP courses to both complete beginners and more
advanced students. This includes the development of multimedia environment self-access courseware and corpora. Finally, I also suggest that the
multimodal construction of meaning should be reflected in ESP/EAP
assessment, although this is largely absent in many parts of the world. With
the new pair of spectacles called multimodal social semiotics, the nature and
complexity of scientific discourse and how they might be more effectively
taught to ESP/EAP students may be further explored.
Notes
1 Unfortunately, it has not been possible to reproduce these Figures in colour as they
appear in the textbook. However, the following glosses are provided on the
colours used in the original text.
Figure 17-3, reproduced in Figure 8.1
The words 'Figure 17-3' are green. The nucleus of the cell is shaded in pink. In
the top circle the chromosome on the left is black, the one on the right red, and
this scheme is retained throughout. 'MITOSIS' and 'CYTOKINESIS' are
shaded in bright yellow, and 'M phase' light green.
Figure Q17-1, reproduced in Figure 8.2
The Question is framed by a box which is marked by its yellow background. In a
similar manner, the actual graph is framed inside the yellow box by a white
background. The curve, the A and B, and the area underneath are green.
2 On the basis of how a signifier relates to the signified, Peirce (1985) classifies
signs into an icon, an index and a symbol. In simple terms, an icon is a sign that
relates to its object in terms of their resemblance. This resemblance can be
similarity in 'simple qualities', as in images or photographs, or in 'relations', as in
diagrams and algebraic formulae, or it can be 'a parallelism' as in metaphors
(Peirce 1985: 10-11). An index is 'a sign which refers to the Object that it
denotes by virtue of being really affected by that Object' (ibid.: 8), for example,
smoke as an indication of fire. A symbol is a sign that derives its meaning by
conventions, by agreement between people (ibid.), for example, the phonological
or graphological feature of the word 'man' and its meaning. The reason for
applying Peirce's trichotomy to the present analysis is that it brings to light the
fact that the relationship between the signified and the signifier is not always
identical or straightforward. Thus signs vary in the degree of the potential
semiotic load they pose for students. An iconic photograph of some familiar
object is easy to decipher, less so the schematic drawing, and even less so the
symbolic signs such as 5'-UGC-3'.
3 For lack of space I do not analyse the graph in a step-by-step manner, as I did
with Figure 17.3 above. However, one can always explore the visual resources
the graph exploits by reference to Table 8.2.
4 According to Tilling (1975: 200-211), quantitative graphs were not only used by
scientists such as J. H. Lambert (1728-1777) to present experimental data graphically but also to help analyse them, for example to derive a mathematical relationship between the variables (e.g. the rate of water evaporation as a function of
temperature as reported in one of Lambert's papers (Tilling, 1975: 201)).
Apparently, the student reader in this question is not required to derive an
equation from the graph but just to interpret the results displayed in the graph
and draw some conclusions.

Acknowledgements

Figures 8.1 and 8.2: Copyright 1998 from Essential Cell Biology: An Introduction
to the Molecular Biology of the Cell by Alberts, B., Bray, D., Johnson, A., Lewis,
J., Raff, M., Roberts, K. and Walter, P. Reproduced by permission of
Routledge/Taylor & Francis Books, Inc.

References
Alberts, B., Bray, D., Johnson, A., Lewis, J., Raff, M., Roberts, K. and Walter,
P. (1998) Essential Cell Biology: An Introduction to the Molecular Biology of the Cell. New
York: Garland.
Arnheim, R. (1974) Art and Visual Perception: A Psychology of the Creative Eye. Berkeley:
University of California Press.
Baigrie, B. S. (ed.) (1996) Picturing Knowledge: Historical and Philosophical Problems
Concerning the Use of Art in Science. Toronto: University of Toronto Press.
Baldry, A. P. (ed.) (2000a) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Baldry, A. P. (2000b) English in a visual society: comparative and historical dimensions in multimodality and multimediality. In Baldry (ed.), 2000a: 41-89.
Barthes, R. (1977) Rhetoric of the image. In R. Barthes (S. Heath, ed. and trans.),
Image-Music-Text. New York: Hill and Wang, 32-51. (Originally published in
1964.)
Bastide, F. (1990) The iconography of scientific texts: principles of analysis. In
Lynch and Woolgar (eds), 187-229.
Ford, B. J. (1992) Images of Science: A History of Scientific Illustration. New York: Oxford
University Press.
Haas, C. (1994) Learning to read biology: one student's rhetorical development in
college. Written Communication 11(1): 43-84.
Halliday, M. A. K. (1978) Language as Social Semiotic: The Social Interpretation of Language
and Meaning. London: Edward Arnold.
Halliday, M. A. K. (1993) On the language of physical science. In M. A. K. Halliday
and J. R. Martin, Writing Science: Literacy and Discursive Power. London: The Falmer
Press, 54-68.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Edward Arnold.
Halliday, M. A. K. (1998) Things and relations: regrammaticizing experience as
technical knowledge. In J. R. Martin, and R. Veel (eds), Reading Science: Critical and
Functional Perspectives on Discourses of Science. London: Routledge, 185-235.
Janovy, J., Jr. (1996) On Becoming a Biologist. Lincoln: University of Nebraska Press.
Johns, A. (1993) Written argumentation for real audiences: suggestions for teacher
research and classroom practice. TESOL Quarterly 27(1): 75-90.
Johns, A. (1998) The visual and the verbal: a case study in macroeconomics. English
for Specific Purposes 17(2): 183-197.
Knight, D. (1992) Ideas in Chemistry: A History of the Science. New Brunswick, NJ:
Rutgers University Press.
Kress, G. (2000) Multimodality. In B. Cope and M. Kalantzis (eds), Multiliteracies:
Literacy Learning and the Design of Social Futures. South Yarra: Macmillan Publishers
Australia Pty Ltd, 182-202.
Kress, G. (2003) Literacy in the New Media Age. London: Routledge.
Kress, G., Jewitt, C., Ogborn, J. and Tsatsarelis, C. (2001) Multimodal Teaching and
Learning: The Rhetorics of the Science Classroom. London: Continuum.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (1998) Front pages: (the critical) analysis of newspaper layout. In A. Bell and P. Garrett (eds), Approaches to Media Discourse. Oxford:
Blackwell, 186-219.
Kuhn, T. S. (1996) The Structure of Scientific Revolutions (3rd edn). Chicago: University
of Chicago Press.
Latour, B. (1990) Drawing things together. In Lynch and Woolgar (eds), 19-68.
Lemke, J. L. (1996) Hypermedia and higher education. In T. M. Harrison and
T. Stephen (eds), Computer Networking and Scholarly Communication in the Twenty-First-Century University. New York: State University of New York Press, 215-231.
Lemke, J. L. (1998) Multiplying meaning: visual and verbal semiotics in scientific
text. In J. R. Martin and R. Veel (eds), Reading Science: Critical and Functional Perspectives on Discourses of Science. London: Routledge, 87-113.
Lemke, J. L. (2000) Multimedia literacy demands of the scientific curriculum.
Linguistics and Education 10(3): 247-271.
Lynch, M. (1990) The externalized retina: selection and mathematization in the
visual documentation of objects in the life sciences. In Lynch and Woolgar (eds),
153-186.
Lynch, M. and Woolgar, S. (eds) (1990) Representation in Scientific Practice. Cambridge,
MA: The MIT Press.
McMurry, J. and Castellion, M. E. (1999) Fundamentals of General, Organic, and
Biological Chemistry (3rd edn). Upper Saddle River, NJ: Prentice-Hall.
Myers, G. (1990) Every picture tells a story: illustrations in E. O. Wilson's Sociobiology. In Lynch and Woolgar (eds), 231-265.
Myers, G. (1992) Textbooks and the sociology of scientific knowledge. English for
Specific Purposes 11(1): 3-17.
Myers, G. (1995) Words and pictures in a biology textbook. In T. Miller (ed.),
Functional Approaches to Written Text: Classroom Applications Vol. I, The Journal of
TESOL France, Paris, in association with US Information Service, Paris, 113-126.
O'Halloran, K. L. (1996) The discourses of secondary school mathematics.
Unpublished Ph.D. thesis. Murdoch University, Western Australia.
O'Halloran, K. L. (1999a) Interdependence, interaction and metaphor in multisemiotic texts. Social Semiotics 9(3): 317-354.
O'Halloran, K. L. (1999b) Towards a systemic-functional analysis of multi-semiotic
mathematics texts. Semiotica 124(1/2): 1-29.
O'Halloran, K. L. (2000) Classroom discourse in mathematics: a multi-semiotic
analysis. Linguistics and Education 10(3): 359-388.
O'Halloran, K. L. (2003) Intersemiosis in mathematics and science: grammatical
metaphor and semiotic metaphor. In A.-M. Simon-Vandenbergen, M. Taverniers, and L. Ravelli (eds), Grammatical Metaphor: Views from Systemic Functional Linguistics. Amsterdam: John Benjamins, 337-365.
O'Halloran, K. L. (forthcoming) Mathematical Discourse: Language, Symbolism and Visual
Images. London: Continuum.
Osborne, J. (2002) Science without literacy: a ship without a sail? Cambridge Journal of
Education 32(2): 203-218.
O'Toole, M. (1994) The Language of Displayed Art. London: Leicester University Press.
Pavesi, M. and Baldry, A. P. (2000) Learning to read scientific texts: integrated self-access courseware and corpora for university science students. In Baldry (ed.),
2000a: 227-245.
Peirce, C. S. (1985) Logic as semiotic: the theory of signs. In R. E. Innis (ed.),
Semiotics: An Introductory Anthology. London: Hutchinson, 4-23.
Purves, W. K. (1999) Biology. In The Encyclopedia Americana (international edition)
Vol. 3. Danbury, CT: Grolier, 769-778.
Royce, T. (2002) Multimodality in the TESOL classroom: exploring visual-verbal
synergy. TESOL Quarterly 36(2): 191-205.
Scott, P. and Jewitt, C. (2003) Talk, action and visual communication in teaching
and learning science. School Science Review 84(308): 117-124.
Swales, J. (2001) EAP-related linguistic research: an intellectual history. In J. Flowerdew and M. Peacock (eds), Research Perspectives on English for Academic Purposes.
Cambridge: Cambridge University Press, 42-54.
Thibault, P. J. (1997) Re-reading Saussure: The Dynamics of Signs in Social Life. London:
Routledge.
Thibault, P. J. (2000) The multimodal transcription of a television advertisement:
theory and practice. In Baldry (ed.), 2000a: 311-385.
Thibault, P. J. (2001) Multimodality and the school science textbook. In C. Torsello,
G. Brunetti, and N. Penello (eds), Corpora Testuali per Ricerca, Traduzione e
Apprendimento Linguistico. Studi Linguistici Applicati. Padova: Unipress, 293-335.
Tilling, L. (1975) Early experimental graphs. The British Journal for the History of
Science 8(30): 193-213.
Zar, J. H. (1999) Biostatistical Analysis (4th edn). Upper Saddle River, NJ: Prentice-Hall.

Developing an integrative multi-semiotic model

Victor Lim Fei


National University of Singapore

Introduction
In this age of the multimedia, there is an increasing awareness that meaning
is rarely made with language alone. As Baldry (2000), Kress (2003) and
Kress and van Leeuwen (2001) note, we live in a multimodal society which
makes meaning through the co-deployment of a combination of semiotic
resources. Visual images, gestures and sounds often accompany the linguistic semiotic resource in semiosis. As such, there is a pressing need to
understand the dynamics of meaning-making, or semiosis, in multimodal
discourse. Academic disciplines that focus on mono-modality, such as that
of linguistics, must come into dialogue with other fields of research, for
instance, visual communication studies and media studies, to facilitate the
interdisciplinary nature of multimodal research.
In this paper, the Integrative Multi-Semiotic Model (IMM) (Lim, 2002) is
proposed as a 'meta-model' for the analysis of a page or frame which
involves the use of both language and pictures as semiotic resources. The
term 'meta-model' is used to describe the IMM as a model which brings
together and incorporates the systemic-functional matrices and frameworks
currently available in the field of multimodal studies. This is undertaken
with the aim of unifying these contributions for the expression, content and
communicative planes of language and visual images in the IMM. There is
a need, however, to further develop the model into one that can account for
meaning arising from other semiotic resources in dynamic environments
such as video texts and hypertext.
Systemic-functional linguistics (SFL), developed by Michael Halliday
(1978, 1994) and extended by Martin (2002) and Martin and Rose (2003),
provides the theory for this investigation into semiosis involving language
and visual images. Although originally conceived for the semiotic resource
of language, the application of SFL to other semiotic resources has been
productive. Pioneering work in the application of systemic-functional theory to visual images, architecture and sculpture includes O'Toole's (1994)
The Language of Displayed Art and Kress and van Leeuwen's (1996) Reading
Images. Following this, further applications of SFL to other semiotic
resources for the analysis of multimodal discourses in mathematics, science,
and three-dimensional museum displays have provided insights into the
nature of intra-semiosis - meaning within different semiotic resources, and
inter-semiosis - meaning across different semiotic resources (for example,
Baldry, 2000; Baldry and Thibault, forthcoming; Pang, this volume; Lemke,
2002; O'Halloran, 1999a, 1999b, 2000, forthcoming; Royce, 1998a, 1998b,
2002).
Of particular interest in this paper is the development of the theory of
the interaction and integration between language and pictures in cases
where these semiotic resources co-occur on a page as found, for example, in
children's picture books and advertisements. Iedema (2003: 30) refers to
such intersemiotic shifts as 'resemioticization' which he defines as 'the analytical means for . . . tracing how semiotics are translated from one into
another as social processes unfold'. In this respect, of significance are Lemke's
(1998) observation of the 'multiplication of meaning' which takes place in
multimodal texts and O'Halloran's (1999a, 1999b) identification of'semiotic metaphor' which refers to the new 'semantic reconstruals' which occur
intersemiotically with shifts between semiotic codes. Royce (1998b) also proposes an 'intersemiotic complementarity' which describes the deployment
of intersemiotic resources in a multimodal text. Further to this, Thibault
(2000, forthcoming) uses phase theory to effectively conceptualize a framework to analyse the integration of language, visual images, sound and music
in television advertisements.
While the direct adoption of a linguistic theory for other semiotic
resources has been criticized (for example, Saint-Martin, 1990), Sonesson
(1993: 343) cautions that 'the outright rejection of the linguistics model must
be at least naive, and as epistemologically unsound as its unqualified acceptance'. As such, a delicate balance between the adoption and rejection of
linguistics theories to visual analysis and intersemiotic processes must be
maintained. That is, theories and concepts used in linguistics may not
belong solely to the study of language and could be productive in their
applications to other semiotic resources. For example, the systemic-functional theory and the tri-metafunctional organization of semiotic
resources, although originally applied to language, rest essentially on the
basic assumption of language as a social semiotic. Therefore, it is appropriate to interpret SFL as a semiotic theory rather than a particular theory of
language.

Proposing an IMM
Despite the advances made in recent research, there remains a lack of
understanding of how meanings arise in multimodal texts. Apart from
Thibault's (2000; forthcoming) comprehensive framework for the analysis
of television advertisements and Baldry and Thibault's (2001) conception of
phase in dynamic video texts, an overarching model and a meta-language to
describe the processes involved in semiosis and intersemiosis in multimodal
texts are lacking. As such, the IMM and the related concepts introduced in
this paper are proposed as a tentative step to account for the different
aspects of meaning arising from the use of multiple semiotic resources. The
IMM, which may be used for the analysis of a printed text involving the two
semiotic resources of language and visual images, is a modest step. Nonetheless, the necessity of developing a 'meta-model' with an accompanying
'meta-language' to describe semiotic processes in multimodal discourse is
demonstrated through the discussion of the IMM and the issues raised by
such a model.
The IMM, as displayed in Figure 9.1, demonstrates topologically the
complex multifaceted nature of meaning made in a multi-semiotic text.
The rectangular blocks are used metaphorically to represent the strata,
planes and dimensions of meaning within and across language and visual
images. Following Martin (1992), three planes are conceptualized for these
two semiotic resources. That is, the language and visual image plane
consists of an Expression plane and a Content plane (which is further
divided into grammar and discourse semantics strata), and the Context
plane which consists of register, genre and ideology as displayed in
Figure 9.1.
The top view of the model appropriately displays the Expression plane
which is referred to as 'Typography' for language and 'Graphics' for visual
images. This is significant as the Expression plane is the interface between
the text and the reader. As seen in Figure 9.1, this interface is mediated by
the medium and materiality of the text, which also mediates the other
planes. This mediation may be seen in operation in the simple case of a
wedding invitation card which is usually printed on certain types of paper.
This demonstrates that the Content, register and genre of the text (the wedding invitation) are related to the materiality options of the medium (the
type of paper and print). Together, these choices carry ideological implications, which in this case concern the elevated status of weddings in Western
society.

Figure 9.1 The IMM (Lim, 2002: 37)

An elevated platform between the linguistic and pictorial modalities can
be seen from the top of the IMM. This is called the Space of Integration
(SoI), which is the theoretical platform where intersemiosis occurs through
contextualizing relations. The elevation of the SoI signifies topologically the
semantic expansions that result from the interaction and negotiation
between semiotic resources in what Lemke (1998) terms 'the multiplication of meaning'. Below the Expression plane is the Content plane which
consists of the lexicogrammatical and discourse semantics strata for language, and the visual grammar and discourse semantics strata for visual
images. As seen in Figure 9.1 the SoI also operates on the Content plane.
The lexicogrammatical and discourse systems for language are organized
according to the three metafunctions proposed by Halliday (1994): the Ideational, Interpersonal and Textual metafunctions. The theory of metafunctionality has been extended to the systems which constitute the grammar of
other semiotic resources. For example, Kress and van Leeuwen (1996) and
O'Toole (1994) extend the metafunctional hypothesis to the systems of a
visual grammar. O'Toole (1994) proposes a detailed metafunctionally based
matrix for the analysis of paintings. In addition to the lexicogrammatical
and grammatical systems, a discourse semantics stratum is also recognized
for the pictorial modality as well as for the linguistic in the IMM. Although
not developed here, this extension follows from Martin's (1992) metafunctionally based discourse systems for language. The discourse semantics
stratum for language and visual images is useful for analysing children's
picturebooks, for example, which consist of a sequence of pictures and text
(Lim, 2002).
The systems of meaning in the Expression and Content planes for language and visual images are seen to be organized metafunctionally in the
IMM. The metafunctional distinctions within the systems on the grammar
and discourse strata in the IMM are indicated through the three rectangular
boxes of different Tone in Figure 9.1. Thibault (2000: 362) proposes that
'metafunctions are best seen as a principle of integration for approaching the
Experiential, Interpersonal, logical and Textual dimension of the text as a
whole'. The commonalities of metafunctional organization across semiotic
resources are drawn upon and metafunctional distinction is used as a means
of conceptualizing meaning across the different strata in the IMM.
The term system-metafunction fidelity is used to signify the degree of
dedication of a system towards a specific metafunction. Although meaning
is organized around the metafunctional classifications, the system-metafunction fidelity of the visual grammar is less rigid than that of the
lexicogrammar in language. In other words, the metafunctional categories
by which the systems for visual images on the grammar stratum are organized
may be more fluid than depicted by the three rectangles in Figure 9.1. For
example, the system of Rhythm in the grammar for visual images (O'Toole,
1994) may be oriented towards Interpersonal meaning (to capture attention), Textual meaning (to cohesively link parts of a text) or Experiential
meaning (to indicate an action) in different instantiations of the system in
text. The orientation of the system towards one metafunction rather than
another depends upon the surrounding co-text in the visual image.
The problem of system-metafunction fidelity is also relevant to the systems which operate on the Expression plane. That is, the Expression plane
for language and visual images, referred to as Typography and Graphics
respectively, is also seen to be organized metafunctionally. However, the
major systems on this plane are not always dedicated primarily to a single
metafunction. The system-metafunction fidelity is even lower than noted
above for the grammar of visual images. Although it is possible to distinguish meanings as being Ideational, Interpersonal or Textual, the systems
on the Expression plane which are responsible for these meanings overlap
with regard to their metafunctional capabilities. For instance, the system of
Colour can realize Ideational, Interpersonal, and Textual meanings as noted
by Kress and van Leeuwen (2002). The tendency of the instantiation of a
system to be orientated towards a particular metafunction is discussed in
more detail in relation to the notion of Critical Impetus (see below). However,
at this stage we may note that a cline, rather than a categorical tri-metafunctional distinction, is used in Figure 9.1 to show the fluidity of the
systems which operate on the Expression plane. This cline is represented
by graduation in the system of Tone in Figure 9.1.
Kress and van Leeuwen (2002) adopt an alternative approach to deal
with the metafunctional diversity of the systems which operate on the
Expression plane by proposing that Colour, for example, is a semiotic
modality in its own right. Thus, rather than positing colour as a system
which operates on the Expression plane, Kress and van Leeuwen (2002)
attempt to locate Colour on the grammar stratum as a semiotic modality
which possesses its own grammatical systems - or rather scales - of meaning; for example, Saturation, Purity, Modulation and Hue. However, as
admitted by Kress and van Leeuwen, Colour differs from other semiotic
modalities such as language and visual images in that it cannot exist on its
own: 'It can survive only in a multimodal environment' (Kress and van
Leeuwen, 2002: 351). In order to accommodate this limitation of Colour as
a social semiotic, an alternative perspective is provided here. That is, Colour
is conceived to be a system with a low system-metafunction fidelity on the
Expression plane in the IMM. Systems such as Hue, Tone, Saturation and
so forth, are seen as sub-systems of the system of Colour. To account for the
metafunctional diversity of a system such as Colour, the notion of 'critical
impetus' is developed.
The IMM rests entirely upon the Context planes of register, genre and
ideology. This is significant because meaning is located within the Context
of Situation and Context of Culture. Martin (1992) suggests that the sociosemantic variables of Field, Tenor and Mode 'hook up' with the metafunctions on both the communication planes of Register and Genre. Another
layer, ideology, is also proposed by Martin (1992) to look at positions within
discourse formations manifested across a range of texts. Meanings made on
this intertextual level are also heteroglossic in nature according to different
reading positions and subjectivities.
The IMM aims to provide the apparatus for the analysis of a text which
utilizes both the linguistic and the pictorial semiotic resources. Using the
IMM as an approach also allows for a systematic evaluation of the meaning
made on various strata and planes and, at the same time, provides a platform for understanding the interaction between modalities and examining
the occurrence of semantic expansion during intersemiosis. In what follows,
two particular dimensions of the IMM which can contribute to a deeper
understanding of the dynamics of intersemiosis are explored: the systems
which operate on the Expression plane, and the Space of Integration (SoI).

The scope of this paper

There is only space to investigate two issues raised by the IMM. The first
aspect is concerned with the systems such as Colour which operate on the
Expression plane of language and visual images. This is an undertheorized
area of research as the focus of interest has tended to be the Content plane
which consists of the grammar and discourse systems. The second aspect is
concerned with the interaction and negotiation between the two semiotic
modalities on the SoI in a multimodal environment. An understanding of
the intersemiotic processes which take place in the SoI is critical for an
understanding of how meaning is made in a multimodal environment.
In the first case, the Expression plane of language, or the 'Typography'
of printed texts, has often been neglected in linguistic theory. Likewise, the
Expression plane of the pictorial semiotic, referred to here as 'Graphics',
has also been undertheorized. Responding to this need, system networks are
proposed to account for the Typographic and Graphic selections made from
within the linguistic and pictorial semiotics respectively (see Figures 9.4 and
9.5). The system networks, still very much at an exploratory stage, are
conceived in the tradition of SFL and thus are seen to be organized
metafunctionally. The proposed networks represent a deliberate effort to
give recognition to the role of the Expression plane in contributing to the
functions and meaning of discourse which is traditionally seen to be located
within the realm of the Content plane.
In the second case, Gestalt theory in art has long observed that the whole is always greater than the sum of its parts (Gombrich,
1960). Likewise, in the interaction and integration between the linguistic
and pictorial semiotic resources, the total meaning made is more than the
sum of the meanings made by each modality independently. In other
words, semantic expansion or a 'multiplication of meaning' (Lemke, 1998)
occurs during this co-deployment. To account for this expansion of meaning,
an SoI in the IMM is proposed so that the contextualizing relations between
two modalities can be studied. As explained in a following section, semantic
expansion can occur through the mechanisms of 'Homospatiality' and 'Semiotic Metaphor' (O'Halloran, 1999a, 1999b, 2003, forthcoming) in the SoI.
Other intersemiotic mechanisms have also been proposed: semiotic cohesion,
semiotic mixing, semiotic adoption, juxtaposition and semiotic transition
(O'Halloran, forthcoming). However, in this discussion only homospatiality
and semiotic metaphor are considered. Before discussing these contextualizing
relations, the Expression plane for language and visual images is examined.

The Expression and Content planes for language and visual images

Halliday (1978: 39) proposes that language is 'a system of meaning potential'. Seen to operate on the levels of the Content and Expression planes,
meaning potential is conceived as a network of options where meaning is
made through paradigmatic selections from the available system networks.
Language is an abstraction (the system network) until it is materialized or
expressed through either speech or writing (the process in the form of a
text). When the linguistic semiotic is expressed through sound, the Expression plane consists of Phonology. When language is materialized as writing, the Expression plane is Graphology, or in the case of a printed text
under consideration here, Typography.
The visual image is similarly a tool for meaning construction. That
is, the pictorial semiotic resource is also seen as a conceptual abstraction
with systems of meaning constituting the meaning potential. As shown in
Figure 9.2, language is conceived to possess abstract lexicogrammatical systems of meaning where choices are expressed on the Expression plane
through Typography in printed texts. In the same manner, the grammar of
visual images is also an abstraction which is instantiated through choices from
networks of systems (such as Form, Perspective, Layout and Strokes) on the
Graphics Expression plane. The separation of the Expression and grammar strata for the pictorial semiotic may be perceived as an uneasy one due
to the interwoven nature of the elements on both strata in meaning-making.
Nonetheless, it is useful and necessary to differentiate between the two strata
in order to investigate the systems' potential and understand the meaning-making process. Figure 9.3 is used to discuss the theoretical distinction
between the Expression and visual grammar planes.

Figure 9.2 Instantiation of language and pictures

Figure 9.3 An iconic face

The Expression plane of the figure involves, for example, the systems
of Colour and Form used to make meaning. This refers to choices in the form
of the black thin line, the two small black circles and the larger circle in
Figure 9.3. Should any of the choices be altered at the rank of the Expression plane (for example, should the eyes become green, or the thin black
line become a red brushstroke), the meaning of the picture would change.
The choices from systems in the Expression plane (see Figure 9.5) are significant in terms of the meaning of the picture. This illustrates that choices
made from systems on the Expression plane contribute or feed through to
the meanings made through systems operating on the Content plane. This
point is further discussed below.
The grammar stratum, as extensively theorized by O'Toole (1994, 1995)
and Kress and van Leeuwen (1996), relates one disparate element to
another and explains how the whole functions cohesively to make meaning.
Just as the grammar of language concerns itself with chaining words to
form coherent sentences, the grammar of visual images is about piecing
one item together with another to construct a coherent message. The relations of
the parts to a whole, for instance, how the various shapes form the iconic
face in Figure 9.3, operate on the grammar stratum. This grammar is culturally dependent and governs the way a reader 'reads' and understands
images such as the iconic face in Figure 9.3.
Following O'Toole (1994: 24), a hierarchy of different ranks, analogous to
Halliday's (1978) rank scale for language, is proposed for the visual grammar. In this way, it is possible to examine the meaning made on each of the
rank units, which are Member, Figure, Episode and Work. This adoption of a
rank scale operating within the principle of constituency, where one rank is
constitutive of the next higher rank in the hierarchy, facilitates a more
systematic analysis of the meaning made in the different units on the visual
grammar stratum.
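
The constituency principle behind this rank scale can be sketched as a small data structure. The following is only an illustrative sketch and not part of the IMM itself: the class names `Rank` and `Unit`, the helper `units_at`, and the labels given to the parts of the iconic face in Figure 9.3 are all assumptions introduced here for exemplification.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class Rank(IntEnum):
    """Rank units of the visual grammar, lowest to highest (after O'Toole, 1994)."""
    MEMBER = 1
    FIGURE = 2
    EPISODE = 3
    WORK = 4


@dataclass
class Unit:
    """A unit on the visual grammar stratum (hypothetical representation)."""
    label: str
    rank: Rank
    parts: list[Unit] = field(default_factory=list)

    def add(self, part: Unit) -> Unit:
        # Constituency: a unit is made up of units of the next lower rank.
        if part.rank != self.rank - 1:
            raise ValueError(
                f"{part.rank.name} cannot directly constitute {self.rank.name}"
            )
        self.parts.append(part)
        return self


def units_at(unit: Unit, rank: Rank) -> list[str]:
    """Collect the labels of all units at one rank, for rank-by-rank analysis."""
    found = [unit.label] if unit.rank == rank else []
    for part in unit.parts:
        found.extend(units_at(part, rank))
    return found


# The iconic face of Figure 9.3, with assumed labels for its parts.
face = Unit("face", Rank.FIGURE)
for name in ("left eye", "right eye", "mouth", "outline"):
    face.add(Unit(name, Rank.MEMBER))
episode = Unit("face on page", Rank.EPISODE).add(face)
work = Unit("Figure 9.3", Rank.WORK).add(episode)

print(units_at(work, Rank.MEMBER))
```

Restricting `add` to the next lower rank is one possible reading of 'one rank is constitutive of the next higher rank'; an analysis that allows rank-shifting would need to relax this check.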
In a sense, the delicate distinction between the Expression plane and
grammar stratum can be made with the Expression plane being largely
concerned with the surface instantial features of the text and the Content
plane with the interaction and negotiation between the different elements in
the text. In the same way that Context mediates the meaning of a text, the
Expression plane mediates the choices made from the grammatical and
discourse systems operating on the Content plane. The notion here is one of
'mutual engendering' which has been used to describe the relationship
between language and social Context (Martin, 1992). In this case, the
mutual engendering encompasses the Expression plane and the Content
plane, the materiality and medium of the text, and the social and cultural
Context within which the text was produced.

Perceptual equity between language and visual images

Saint-Martin (1990) claims that pictures are primarily objects of visual perception and therefore are distinct from language in many ways. While
acknowledging this, it is also recognized that the linguistic semiotic resource
instantiated through the system of Typography is also a visual experience.
With the adoption of this position, some of the assumptions based on the
commonalities between the two modalities are discussed before introducing
the unique systems through which each semiotic operates to make meaning.
Since both the linguistic and the pictorial semiotics are expressed through
the visual medium on a page and experienced visually through the sense of
sight, it appears reasonable to assume co-equal statuses between the two
modalities. This assumption challenges the conventional privileging of language over the visual image. Here it is recognized that both the linguistic
and the pictorial semiotic resources serve different, though complementary,
functions. Therefore, both are equally important as signifying systems
through the different roles they perform. This point is developed below.
Until recently, the pictorial text has often been relegated to the status of
mere illustrations to the linguistic text. In the field of semiology, recent
interest in visual communication may be traced to Barthes's (1977) influential work, Rhetoric of the Image, where the visual images are seen to play a
somewhat attendant role to language. That is, Barthes proposes that language serves to 'anchor' (by elaborating) or 'relay' (by extending) the meaning of the visual text. However, it is important to recognize that despite the
constant co-deployment of language and pictures in multimodal texts, both
the linguistic and the pictorial semiotic modalities have the potential to
function independently. Some instances of these include the popularity of
wordless picture books, such as Monique Felix's (1980) The Story of A Little
Mouse Trapped in a Book, and the increasing use of wordless instruction sheets
to transcend language barriers, such as the Swedish-based but internationally marketed IKEA furniture which utilizes only the pictorial semiotic in
the assembly instructions. The success of these examples of visual communication attests to the ability of the pictorial modality to operate as an
independent semiotic resource.
The stance that both the linguistic and the pictorial modalities should share an equal status is now widely adopted (for example,
Baldry, 2000; Kress and van Leeuwen, 2001; O'Halloran, 2000; Thibault,
2000). Van Leeuwen (2000), for instance, criticizes the negative comparisons
between language and visual images in his refutation of Barthes's
(1977) earlier proposition that words have 'fixed meaning' while images
are 'polysemous'. In addition to this, van Leeuwen (2000) confronts some
misconceptions regarding the pictorial semiotic such as the assertion that
visual images cannot represent negative polarity. Van Leeuwen (2000: 179)
also argues that visual semiotics should focus 'not only on the image as
representation, but also on the image as (inter)act'.
However, it is important to remember that each semiotic resource (for example, language, visual images, mathematical symbolism and gesture) has
evolved to be used in conjunction with other semiotic resources, and this rather
obvious but often neglected fact has serious implications for the way we view
the functions and resulting grammatical and discourse systems of each
resource. Examining one semiotic resource in isolation, for example language, results in an impoverished view of how that resource is organized for
meaning. The grammatical and discourse systems of each semiotic resource
need to be considered in relation to how they are organized to interact with
systems in other semiotic resources to accomplish particular functions
within the whole realm of what can be achieved semiotically.
Lemke (1998), for example, observes that language and visual images
each have their individual functions and strengths. He summarizes some of
the key distinctions by noting that language is more adept in encapsulating
typological meaning, or meaning by category. It is also a more time-sensitive
semiotic where the linear progression of time can be reflected. The pictorial
semiotic, on the other hand, has resources for the representation of topological meaning, or meaning by degree. It is also a more space-sensitive
semiotic that surpasses the linguistic mode in representing spatial relations.
Since each has its own niche, it is hardly surprising to find them serving
different functions in a multimodal text. In addition, the co-deployment of
these two modalities in a multimodal text can lead to meaning expansions as
well. Nevertheless, it is important to understand that systems within each
resource independently have the potential to realize unique meanings that
may not necessarily be integrated during intersemiosis. This is the meaning
made by each independent modality on each stratum and is topologically
reflected in the model as the area outside the SoI as shown in Figure 9.1.
The assumption of equal status means that both are accorded co-equal
value in meaning-making. The implication of this on the Expression stratum
is that of perceptual equity between the two semiotic modalities. It must be
noted, however, that having the same status does not translate to the claim
that both the semiotic resources of language and pictures have the same
degree of influence on each other in a text. It is not unusual to find that in
one particular text, the linguistic semiotic may be more dominant in terms
of meaning than the pictorial semiotic, and yet in another text, the visual
semiotic may be the primary semiotic source for meaning. This point is
further elaborated when intersemiosis between language and visual image
on the SoI is discussed below. The proposed model shown in Figure 9.1
allocates equal space for each semiotic resource, thereby signifying topologically the equity in status between the two semiotic modalities.

Reading path

The assumption of perceptual equity on the Expression plane has profound implications for our approach to the analysis of the multimodal text.
The Expression plane is the interface the reader experiences upon reading
the text. In this paper, the term 'reading', despite being a term derived
from the study of language, is taken to include visual perception or viewing.
Following Sardar and van Loon's (2000: 44) work in media studies, reading
is defined as 'the process of interaction when a text is analysed as well as the
final result of that process, the interpretation'. Hence, in any multimodal
text, it is useful to chart a typical reading path that the hypothetical reader may
follow in the reading of different episodes on a page. In a sense, the reading
path is the order by which the reader may process different episodes in a
multimodal text.
As previously mentioned, Thibault (2000, forthcoming) and Baldry and
Thibault (2001, 2004) use phasal analysis in their deconstruction of a film
segment, where salience or the 'use of foregrounding strategies' allows for
certain modalities to be thrust into prominence. Analysis is therefore guided
by the contrastive salience of a specific semiotic resource in each particular
instance. This presupposes and builds upon the theory of a 'reading path'
where the viewer reads according to the contrastive salience of the semiotic
resources at each instantiation. O'Halloran (1999: 323) proposes that a practical approach to analysing a multi-semiotic text is a progressive analysis following the 'reading path determined by the choices within
different semiotic codes'.
The notion of a linear or uni-directional reading path, however, deserves
to be more closely scrutinized. This conception seems to be appropriate for
a reader reading a book or magazine, navigating across the pages or frames
in a linear reading pattern, governed by literacy conventions. Following
Pang (2000), however, this would more suitably be termed a directional path
rather than a reading path. The usefulness of a restrictive and regulated
reading path breaks down when analysing the multimodal text on a page or
frame. The reading path on a multimodal frame is seldom only unidirectional, as the hypothetical reader's eyes are led through contrastive
salience, possibly even in a back and forth fashion between two items or
Episodes (O'Toole, 1994) on a page. In other words, the path, although
sequential due to constraints of human visual perception, may not be unidirectional but is free to be bidirectional (Pang, 2000) or multidirectional as
displayed in Plate 9.1. Following the assumption of perceptual equity, the
reading path may disregard the distinction between linguistic and pictorial
semiotic resources as the reader is drawn by the contrastive salience of a
section or Episode.
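
As a rough illustration of a reading path led by contrastive salience, the sketch below orders the Episodes on a page by a salience score while ignoring their modality. It is a minimal sketch under stated assumptions: the numeric `salience` values, the `position` index standing in for a conventional page order, and the names `Episode`, `reading_path` and `is_unidirectional` are all hypothetical, since the chapter does not quantify salience.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    label: str
    modality: str    # "linguistic" or "pictorial"; deliberately unused when ordering
    position: int    # place in a conventional left-to-right, top-to-bottom order
    salience: float  # hypothetical contrastive-salience score (an assumption)


def reading_path(episodes: list[Episode]) -> list[Episode]:
    """Order Episodes by descending salience, ignoring modality (perceptual equity)."""
    return sorted(episodes, key=lambda e: e.salience, reverse=True)


def is_unidirectional(path: list[Episode]) -> bool:
    """True if the path never doubles back against the conventional page order."""
    positions = [e.position for e in path]
    return positions == sorted(positions)


page = [
    Episode("headline", "linguistic", position=1, salience=0.6),
    Episode("photograph", "pictorial", position=2, salience=0.9),
    Episode("body copy", "linguistic", position=3, salience=0.3),
]
path = reading_path(page)
print([e.label for e in path])   # ['photograph', 'headline', 'body copy']
print(is_unidirectional(path))   # False
```

In this invented example the path remains sequential, but it doubles back from the photograph to the headline, which is the sense in which a reading path need not be unidirectional.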
Kress and van Leeuwen (1998) introduce the notion of scanning which
clarifies their earlier claim that readers tend to read in a left-to-right and top-to-bottom pattern. They describe scanning as a process that occurs before
reading. The 'scanning process sets up connections between the different elements, relating them to each other in terms of their relative importance'
(1998: 205). This 'relative importance' is determined by the contrastive
salience between Episodes. The scanning process first locates our eyes on the
Centre of Visual Impact (CVI), which signals the beginning of our reading process. The scanning pattern is closely related to the salience of semiotic choices
within the multimodal page and the Context of the reader's literacy
conventions.

Plate 9.1 Unrestrictive bidirectional reading path across three Episodes, reproduced from Wong (2000: 6)

The notion of a CVI is an interesting one. Bohle (1990) cites Garcia's
proposal of the CVI as the focal point where the reader enters the page.
Working in the tradition of Gestalt psychology of picture perception, Sonesson
(1993: 378) claims that evidence has been found for 'the existence, if not for
an order of reading, then at least of certain points of fixation where the
glance tends to cluster'. The initial point of fixation or the CVI is the
hypothetical reader's point of entry into the multimodal text, which initiates
the entire process of visual perception. Thus, on a web page for instance,
although there may exist in theory multiple entry points into the text, in
practice semiotic choices function to ensure that the viewer's attention is
initially focused on one part of the text. This can be explained, for example,
by the relative Interpersonal salience of semiotic choices, a point which is
developed below.

Critical impetus in metafunctional meaning in the Expression plane

The purpose of this section is to introduce concepts which require further
development and theorization, in particular the notion of 'critical impetus'
which is used to explore the metafunctional diversity of the systems operating
on the Expression plane. This includes developing the notion that a viewer
is drawn towards interpersonally salient components in a multimodal text.
While system networks for some of the more prominent systems of the
linguistic and pictorial modalities on the Expression stratum are proposed in
the next section of this paper, these are not exhaustive and remain very
much at a preliminary stage.
Although the meanings made through the systems in the grammar stratum are organized metafunctionally, the tri-metafunctional distinction
appears to be more uncertain on the Expression plane as previously discussed. These systems with a low system-metafunction fidelity can be more
appropriately described as functioning on a cline and, as such, the classification of the systems is not based on metafunctionally based discrete categories in Figure 9.1. Instead, systems operating on the Expression plane can
contribute to the Ideational, Interpersonal and Textual meanings in a text. It
is therefore useful to examine the critical impetus, or the necessary conditions
and circumstances which reveal which particular metafunctional meaning is
likely to emerge from choices within systems on the Expression plane.
The critical impetus for a dominant Interpersonal meaning on the Expression plane is Salience, and this can be achieved through contrast of Colour,
Shape, Size, and so forth. The critical impetus for Textual meaning on the
Expression plane is the presence of Textual unity and cohesiveness. But first,
what is the nature of the Ideational meaning made on the Expression
plane? Visual semioticians Floch (1986) and Thürlemann (1990) have
observed a double layer of signification in pictures. They term the first level
as 'iconic' and the second as 'plastic'. Sonesson (1993: 325) explains that 'on
the iconic level, the picture is supposed to stand for some object recognizable
from the ordinary perceptual lifeworld, while concurrently on the plastic
level, simple qualities of the pictorial Expression serve to convey abstract
concepts' within the lifeworld as well. Lifeworld, according to Husserl, is the
'world taken for granted'. To extend this rather crudely into SFL terms,
lifeworld can be compared to the Context of Situation and Context of
Culture, the social reality in which the individual operates.
Doonan (1993: 15), working on picture books from a literary perspective,
also recognizes the 'two modes of referring' in pictorial images. She simplifies 'Denotation' as the representation of an object in a particular Context of
culture. 'Exemplification', on the other hand, is the mode by which
'abstracted notions, conditions and ideas' (1993: 15) are represented within
that culture. This approach to the representation and composition of pictorial semiotics is congruent with our proposed formulations in this paper,
which draws expediently upon some of these ideas. Modifying the original
sense of denotation and connotation as proposed by Barthes (1977), the
terms Denotative Value and Connotative Value are used to describe the two types
of Ideational meanings made on the Expression and Content strata.
The Denotative Value is understood as the literal or iconic meaning. For
instance, the denotative value of the colour red is confined to the perception
and reference of the reddish hue. Saint-Martin (1990) observes that two
persons can look at one colour and yet see it differently. Hence, it must be
added that the use of denotative value is qualified by acknowledging
the reader's culturally based subjectivities. This contrasts with Barthes's
(1977) use of denotation as a rather non-Context-dependent Platonic ideal.
In other words, the denotative value is understood in this paper as the literal
but Context-dependent meaning. Like Floch's (1986) and Thürlemann's
(1990) conception of the 'plastic' and Doonan's proposal of 'Exemplification', the Connotative Value comprises the ideas and abstractions evoked from the
literal image. For instance, the connotative value of the colour red refers to
the abstract concepts which the colour evokes in the reader. Dependent on
the Context of culture, situation and co-text, the red hue could connote
antithetical ideas ranging from danger in a European Context to good fortune in Chinese culture.
The Interpersonal meaning dominates when system choices on the
Expression stratum generate Salience; in other words, when salience has a
critical impetus. This salience can sometimes be achieved through contrasts
in, for example, Size, Shape and/or Colour as mentioned above. The critical
impetus of salience can be linked to the notion of 'markedness' in Halliday's
(1994) conceptions. The notion of 'markedness' could be helpful to account
for the meaning expansion on the grammar stratum as well as on the Expression plane. Markedness in Halliday's (1994) original usage means to
'stand out' as an atypical choice. The choices made in Typography for most
texts, for example, are usually stereotypical options according to their genre.
For instance, in the Context of a piece of formal academic writing such as a
dissertation, a particular selection of appropriate Typography is expected. In
addition, because of the association of certain Typography with particular
genres, any departure from the convention or mismatch between Typography and genre would render those typographical choices 'marked'.
This is consistent with Halliday's (1994) observation that there is an order in
a clause which is usually expected in a particular clause type, for example,
the nominal group functioning as Subject is usually the first item in a clause
which has a declarative mood. When this order is not adhered to, the clause
is marked. A marked selection in Typography is similarly meaningful.
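
The idea of a marked typographical selection can be sketched as a comparison between the choices made in a text and those conventionally expected for its genre. This is only an illustration: the `EXPECTED` table and its values for a dissertation are invented for the example and are not given in this paper, and the function name `marked_choices` is likewise hypothetical.

```python
# Hypothetical genre conventions; the values are assumptions made for illustration.
EXPECTED = {
    "dissertation": {"Type": "Roman", "Size": "12 point", "Leading": "double"},
}


def marked_choices(genre: str, selections: dict[str, str]) -> dict[str, str]:
    """Return the selections that depart from the genre's expected Typography."""
    expected = EXPECTED[genre]
    return {
        system: choice
        for system, choice in selections.items()
        if system in expected and choice != expected[system]
    }


# A dissertation set in a script typeface departs from the assumed convention.
choices = {"Type": "Script", "Size": "12 point", "Leading": "double"}
print(marked_choices("dissertation", choices))  # {'Type': 'Script'}
```

A selection that matches the assumed convention is simply not reported, which mirrors the unmarked case; what a marked choice means in a given text remains a matter for analysis and is not modelled here.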
The notion of critical impetus is thus useful when included in systemic
analysis of both linguistic and multimodal discourse. The critical impetus is
used to identify the environment in which certain Interpersonal meanings
may dominate through the notion of marked choices. In the same manner,
Textual meanings are usually observed when the critical impetuses of
Unity and Cohesiveness in a text are in operation. For example, in a tapestry
design, the system of Saturation and Hue in Colour and the geometric
forms through the system of Shape operate to create unity and cohesion in
the text. As may be seen from this very preliminary discussion, however,
further research is needed to understand the conditions under which certain
metafunctional orientations are realized through choices from the systems
operating on the Expression strata. Provisional networks for these systems
are given in the next section of this paper.

Linguistic and visual system networks on the Expression plane

The predominant linguistic systems which operate on the Expression stratum, referred to as Typography, are displayed in Figure
9.4. Given their metafunctional diversity, the systems are not classified
according to metafunctional categories. As seen in Figure 9.4, meaning on
the Expression plane is made through the selections in Typography within
the systems of Font and Layout. The system of Font in Typography has
three sub-systems, Type, Size and Colour. Paradigmatic options are also
available within each of the three sub-systems.
As the meaning potential of particular systems is theoretically infinite, it is
not possible to list all the possible options. Thus, the network represented in
Figure 9.4 confines itself to a few common selections for the purposes of
exemplification. For instance, the system of typeface keeps expanding with
new font types being created. The options shown within this system in
Figure 9.4 are merely the Typeface families, within which many other Typefaces are classified. For instance, the Typeface Times New Roman is categorized within the Roman family. The system of Size similarly contains too
many options to be listed here, and thus the option of 12 point is an example
and the sign indicates the system's infinite potential. It is the ultimate cline.
As displayed in Figure 9.4, the system of Layout includes the systems of
Spacing and Justification. The system of Spacing has four sub-systems. Leading is the spacing between lines on a page, which includes options for double
to single spacing. Kerning, on the other hand, is the adjustment of space
between the letters of a word. The Internal Space refers to the space
between words and the system of Justification is the alignment of the sentences. Finally, the choices for Indentation allow a clearer demarcation
between paragraphs and function primarily to signal a shift in direction or
text-type from the preceding lines.
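
The Typography network described above can also be written down as a nested data structure. The following Python sketch is offered only as an illustration and is not part of the IMM itself: the names TYPOGRAPHY and system_paths are hypothetical, the options listed are limited to the few mentioned in the text, and the grouping of Leading, Kerning, Internal Space and Indentation as the four sub-systems of Spacing is one reading of the description.

```python
# Illustrative encoding of the Typography network (an assumption-laden sketch).
# Dictionaries are systems with sub-systems; lists hold sample options only,
# since the chapter stresses that the options cannot be listed exhaustively.
TYPOGRAPHY = {
    "Font": {
        "Type": ["Roman"],        # a Typeface family; Times New Roman falls within it
        "Size": ["12 point"],     # one sample option on an open-ended cline
        "Colour": [],             # options are not enumerated in the text
    },
    "Layout": {
        "Spacing": {
            "Leading": ["single", "double"],
            "Kerning": [],
            "Internal Space": [],
            "Indentation": [],
        },
        "Justification": [],
    },
}


def system_paths(network, prefix=()):
    """Yield the path from the root to every terminal system in the network."""
    for name, sub in network.items():
        path = prefix + (name,)
        if isinstance(sub, dict):
            # A dictionary is a system that opens onto further sub-systems.
            yield from system_paths(sub, path)
        else:
            # A list marks a terminal system holding (sample) options.
            yield path


paths = [" > ".join(p) for p in system_paths(TYPOGRAPHY)]
for line in paths:
    print(line)
```

Listing the paths in this way makes the paradigmatic organization explicit: each path names one system in which a selection is made, while the lists remain deliberately open-ended.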
The systems operating on the Expression plane for pictures on the
Graphics stratum, namely Perspective and Form, are displayed in Figure
9.5. Perspective, according to Doonan (1993: 34), is 'the way an artist controls space in the picture'. Perspective has two sub-systems: Deep Space (DS)
and Point of View (PoV). DS portrays an illusion of a three-dimensional
world through a two-dimensional image on a page thereby generating a
sense of illusionary depth. DS can be achieved through Contrasting Size,
Converging Lines or Chiaroscuro. The use of Contrasting Size, for
example, in Picture A in Plate 9.2, shows that illusionary depth is created as
the play slide is represented as located further back in the picture world. In a
two-dimensional surface, the figure of the duck is shown to be larger than
the slide. This interpretation, however, defies the hypothetical reader's cultural knowledge. Hence, in order to make sense and maintain relevance, the
reader assumes that DS through Contrasting Size generates the impression
of a three-dimensional world. This interpretation, as opposed to a world
of enormous ducks, fits more congruently with the reader's world. The
theoretical assumption behind this interpretation is consistent with Sperber
and Wilson's (1986) theory of relevance in verbal communication, which
suggests that their observations can be extended to visual communication
as well.

Figure 9.4 System network for Typography

In Plate 9.2, the use of Converging Lines to produce DS is seen in Picture
B taken from Satoshi Kitamura's (1986) When Sheep Cannot Sleep. The series
of converging vectors gives a sense of the illusionary depth and adds a sense
of three-dimensionality into the picture world. Finally, Chiaroscuro is the
application of light and shadows to create DS in Picture C. The example of
the Singaporean Merlion statue shows how shading can suggest a sense of
three-dimensionality on a two-dimensional plane.

Figure 9.5 System network for Graphics

Plate 9.2 Perspective
Picture A: Contrasting sizes. Reproduced from Wong (2000: 9)
Picture B: Converging lines. Reproduced from Kitamura (1986: 15)
Picture C: Chiaroscuro. Reproduced from Yee (1998: 13)
Picture D: Low tilt. Reproduced from Sallustio (1999a: 4-5)
Picture E: A high tilt. Reproduced from Kitamura (1986: 11)
Picture F: Close-up and medium shot. Reproduced from Sallustio (1999b: 24-25)

PoV is the viewpoint through which the reader is presented with a scene
in the picture. Following cinematography theory, Bordwell and Thompson
(1997: 241) explain that there are systems available in a cinematic shot
which determine the reader's entry into the story world. Two main systems
are Angle and Distance. Angle is the tilt at which the visual image is presented. A high tilt may place the viewer in a somewhat voyeuristic position.
This can be seen in the frame shown in Picture E in Plate 9.2, where the
reader is 'situated' in the position of an intrusive outsider. A sense of alienation and detachment or feelings of superiority could result from a skilful
use of the high tilt. Correspondingly, a low tilt may lead the reader to feel
overwhelmed, usually with the character positioned to be 'towering' over the
reader. An example can be seen in Picture D, where the pile of toys is
emphasized and the children are portrayed above the clutter. Finally, the
system of Distance has the categories of Long Shot, Medium Shot and Close-up. Although these categories are relative, they are typically discernible, as
displayed in Picture F, and have a powerful effect.
The system of Form displayed in Figure 9.5 contains four sub-systems,
those of Colour, Shape, Line and Strokes. Colour, following Doonan (1993),
operates through three sub-systems. Hue or pigment distinguishes the colour across the spectrum, making it possible to discriminate, for instance,
blue from purple. Tone 'is a measure of light and dark of an area regardless
of its colour, and its quality of a surface as measured purely by its position in
the scale between black and white' (Doonan, 1993: 30). Tone or shading can
render the effects of texture and lighting. Saturation refers to the purity of a
colour. The primary colours such as red, yellow and blue are hues with the
highest level of intensity or saturation.
The system of Shape includes the options Geometric and regular or Non-Geometric and irregular. The selection of shapes adds to the multifarious
meaning made in the text. For instance, a picture composed of largely
regular shapes positioned horizontally or vertically could suggest stability
and even a sense of rigidity. The system of Line 'creates contour, modelling,
shading and a sign for movement. A contour puts a Line round objects and
figures and gives them individuality and character' (Doonan, 1993: 23).
Lines such as those used to create varying tone could render the effect of
lighting conditions. Finally, the system of Strokes in Graphics refers to the
way in which colour is applied. Some common options available are Brush,
Pencil, Paint and Crayon. Once again, these systems are not exhaustive, but
rather they are presented to illustrate how systems on the Expression plane
contribute to the overall meaning of the text.
Space of integration (SoI)
The SoI functions as the theoretical platform for discussion of the dynamics
in the interaction between language and visual images for meaning-making
in a multi-semiotic text. SoI topologically reflects the semantic multiplication brought about by the interaction and integration in intersemiosis
between the two semiotic resources. Thibault (2000: 362) explains that it is
'on the basis of co-contextualizing relations that meaning is created'. This
paper proposes contextualizing relations as the meaningful relationships that are
present between two modalities. Intersemiosis is therefore a result of the
contextualizing relations between the two semiotic modalities.
One of two types of contextualizing relations can be found whenever two
modalities operate in a multimodal text. In cases where the meaning of one
modality seems to 'reflect' the meaning of the other through some type of
convergence, the two resources share co-contextualizing relations. On the other
hand, in cases where the meaning of one modality seems to be at odds with
or unrelated to the other, their semantic relationship is one that diverges.
Here the resources share re-contextualizing relations. The implications of these
two contextualizing relations are apparent in the semantic expansion that
consequently occurs with the co-deployment of language and visual images.
It may be helpful to differentiate between the nature of the interaction
between the semiotic modalities and the extent or degree to which the
linguistic item contextualizes the meaning of the visual image. Both semiotic
modalities can either co-contextualize or re-contextualize the other, regardless of the degree of contextualization on each other. The nature of the
interaction between the two semiotic modalities thus refers to whether
the two resources are co-contextualizing or re-contextualizing. Further to
this, the interaction between the semiotic resources is seen to be mutually
contextualizing at every instantiation, as opposed to Barthes' (1977: 26) conception of the visual image which 'illustrates the text. . . [or] the text loads
the image'.
Cheong (1999; this volume) refers to intersemiosis between the two modalities as the Bidirectional Investment of Meaning. Cheong's (1999) analysis of
advertisements as multimodal texts suggests that the degree of interconnectedness and the degree of interweaving of meaning between language
and visual images can be measured through a scale known as Contextualization
Propensity (CP). CP 'refers to the degree/extent to which the linguistic items
contextualize the meaning of the visual images' (Cheong, 1999: 44). In other
words, CP measures the strength of the influence the modalities exercise on
each other. Cheong (1999) also shows that CP in turn has a direct influence
on the Interpretative Space (IS) of the reader resulting in either a high or low
Semantic Effervescence (SE) of the text. For example, a multimodal text with a
high CP will lead to a low IS thus resulting in a low SE. Essentially, Cheong's
(1999) proposals provide us with the meta-language to look at the degree and
extent of contextualization the two semiotic resources have on each other
and the implications of these contextualizing relations.
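
The chain of influence reported from Cheong (1999) can be made explicit in a short sketch. The Python function below is a hypothetical illustration only: the name interpretative_effects and the two-valued 'high'/'low' scale are assumptions made for the example. The mapping for a high Contextualization Propensity follows the example given in the text, whereas the mapping for a low Contextualization Propensity is inferred as its inverse and is not stated in the source.

```python
def interpretative_effects(cp: str) -> dict:
    """Map a CP level to the IS and SE levels it is said to result in.

    'high' -> low IS -> low SE follows the example in the text;
    'low' -> high IS -> high SE is an inferred inverse (an assumption).
    """
    if cp not in ("high", "low"):
        raise ValueError("cp must be 'high' or 'low'")
    inverse = {"high": "low", "low": "high"}
    interpretative_space = inverse[cp]                   # CP constrains the reader's IS
    semantic_effervescence = interpretative_space        # SE follows the level of IS
    return {"CP": cp, "IS": interpretative_space, "SE": semantic_effervescence}


print(interpretative_effects("high"))
```

A two-valued scale is a simplification adopted for the sketch; the text describes CP as a matter of degree.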
The focus here, however, is the nature of the interaction between the two
semiotic modalities. Understanding this phenomenon can contribute significantly to a clear understanding of the mechanisms at work on the SoI.
For example, further expansion of meaning may occur on the SoI through
the process of Homospatiality in the Expression plane and Semiotic Metaphor in the Content plane. The avenue by which this multifaceted semantic multiplication occurs on each plane is discussed in the following
sections. As will become evident in this discussion, further research is needed
to understand the range of mechanisms through which semantic expansions
take place intersemiotically (see also O'Halloran, forthcoming).
SoI on the Expression plane
One mechanism which can result in a semantic expansion on the Expression
plane is Homospatiality. The term is adapted from Carroll's (2001: 355)
conceptualization of 'disparate elements in one spatially bonded homogenous entity'. Carroll (2001) proposes the term for the analysis of visual metaphors. However, its incorporation into IMM to describe the related
phenomenon of two systems sharing the same spatial coordinates on the
Expression plane appears to be useful. This integration of the two different semiotic systems where one superimposes on the other usually results in
semantic multiplication on the Content plane, where the meaning made is
reinforced or where new meanings are made.
An example of Homospatiality is shown in Picture A on Plate 9.3, where
the linguistic text, 'Snaaap', realized through the system of Font on the
Typography, shares the same spatial coordinates as the visual image realized
through the systems in Graphics of the word breaking into two. The mechanism of Homospatiality reinforces the meaning of a strong force breaking
an object into two with an accompanying 'image' of sound. Hence, an
expansion of meaning through reinforcement results from this process of
Homospatiality. Another example is found in Picture B in Plate 9.3 where the
visual image of the smoke emitted by the campfire functions simultaneously
as the Typography for the word 'hot'. Thus, through the mechanism of
Homospatiality, an intensified sense of heat and smoke from the fire is represented. These extensions of the meaning stem from the intersemiosis on the
Expression plane of the multimodal text, which engenders the meaning
arising from choices in the Content plane.

Plate 9.3 Homospatiality (Picture A and Picture B). Picture A reproduced from Sallustio (1999b: 4)

SoI on the Content plane
Semiotic Metaphor is a mechanism, proposed by O'Halloran (1999a, 1999b,
2003, forthcoming), which operates on the Content plane, more specifically in the SoI existing between the lexicogrammar of language and the
visual grammar. O'Halloran defines semiotic metaphor as an intersemiotic
process whereby a shift in the functional status of an element arises
through a shift between semiotic resources. That is, with a movement
between semiotic codes, 'the new functional status of the element does not
equate with its former status in the original semiotic or, alternatively, a new
functional element is introduced in the new semiotic which previously did
not exist' (O'Halloran, 1999a: 348). O'Halloran gives two examples of
semiotic metaphor which occurred in a lesson on trigonometry in a secondary high school. In the first case, a process realized by the verb 'look' in
the teacher's oral linguistic statement 'and he looks down of course'
becomes an entity in the form of a line segment in the visual diagram on
the blackboard (O'Halloran, 1999b: 24, forthcoming). O'Halloran explains
that a second semiotic metaphor occurs in this lesson when the new entity of
a triangle is constructed visually in the mathematical diagram on the
blackboard. This entity did not exist prior to the visual semiotic representation of the trigonometric problem. Although originally proposed for the
intersemiotic reconstrual of elements occurring across language, visual
images and mathematical symbolism in mathematical discourse, the notion
of semiotic metaphor is productive in its extension to other semiotic
resources.
An example of Semiotic Metaphor is shown in Plate 9.4. The visual image
of the diamond is juxtaposed with the linguistic clause 'because he loves
me'. This association of the visual image of a diamond with the linguistic
clause implies the gift of a diamond is an expression of love. Here the
dynamic process of 'love' is reconstrued as a fixed entity in the form of a
diamond, and thus is an example of a semiotic metaphor. Indeed, it could
be argued that diamonds (as gems, not cutting agents) are in themselves
always semiotic metaphors. This may be true of all social symbols, where
people are encouraged to attach a range of complex and dynamic meanings
to fixed entities in the form of consumer goods. The new meanings are
cultivated and circulated through the co-deployment of different modalities
in the media. As such, advertisements can be seen to specialize in the creation of semiotic metaphors.
Plate 9.4 Semiotic Metaphor (reproduced from http://www.hearts-on-fire.com)

O'Halloran (1999a) distinguishes between different forms of semiotic
metaphor, parallel and divergent, which may be seen to function as opposite
ends of a cline. A parallel Semiotic Metaphor has 'an expanded semantic field but
also one which is situated within the old' (1999a: 348). Although there could
be redundant meanings due to overlaps, 'new layers of meaning are [essentially] simultaneously added to the original representation'. The reconstrual
of elements in a divergent Semiotic Metaphor, however, is more far-reaching.
Here 'the functional element is reconstrued into a new semantic field'
(ibid.). The metaphorical shift in meaning accompanying such divergent
reconstruals is substantial as the functional element is literally relocated in
a semantic field which is not typically intertextually related to the first.
O'Halloran (1999a) explains that the types of semantic shifts involved
in divergent semiotic metaphors, however, gradually become naturalized
over time.
A possible by-product of the meaning made through parallel semiotic
metaphors is semantic redundancies. These redundancies are realized when
there is a duplication of the meaning made by the semiotic resources. These
meanings, though actualized when the modalities are independent, serve a
reinforcing function when the two systems combine in the SoI. A by-product
of divergent semiotic metaphor, on the other hand, could be the surfacing
of conflicting meanings. These conflicts or examples of 'ideological disjunction' are a possible result 'of the complex, often intricate, relations of interfunctional solidarity among the various semiotic resource systems that are
co-deployed' (Thibault, 2000: 321). However, the SoI usually brings about a
harmonization of these disjunctions and conflicts 'in the service of the
semiotic project of this particular text' (ibid.). In a multimodal text where the
modalities share co-contextualizing relations, there is a stronger likelihood
for parallel semiotic metaphors to arise, where the new meaning made
remains situated within the old. Divergent semiotic metaphors where
new, previously unrealized meanings are being made through the process
are more likely to emerge from a text where its modalities share re-contextualizing relations.

Conclusion
As a meta-model, the IMM attempts to synthesize various research efforts
by situating them on the strata, planes and metafunctional dimensions of
the IMM where there is greater centrality and focus. For instance, the field
of materiality and medium of resource is located within this larger theoretical multi-semiotic model, in this case, across the communication planes.
The IMM is designed to help unify diverse research efforts in the field by
locating their contributions into a single model, which takes into account the
complexities of multimodal meaning-making.
However, some qualifications exist with respect to the IMM. The problem
of addressing a dynamic phenomenon with a typological description and
framework is a perennial quandary. Hence, the IMM may bear the criticism, like other frameworks, of being reductionist and even rigid in the
categorization of systems according to the metafunctions, despite the usefulness of the metafunction as a principle of theoretical integration. The
severity of this criticism, however, will be somewhat alleviated in the IMM
with the construction of a model that can reflect more effectively topological
meaning in dynamic environments such as those afforded by film and hypertext. In addition, at this stage the categories in actuality are more fluid than
can be represented by clearly delineated and neat classifications of systems
in the model.
Apart from recognizing the fluidity of the classifications, it is useful to
note that each of the metafunctions may not be equally dominant on a
multimodal page. O'Toole (1994) discusses the monofunctional tendencies
of certain schools of paintings, where a single metafunction may tend to
dominate in a certain work. Similarly, not all metafunctions are equally
salient in a multimodal text, despite the appearance of the equal topological
space allocated to each metafunction in the abstract theoretical construction
of the IMM. Hence, it is not surprising to find a particular metafunction
having a greater role in certain multimodal texts.
O'Toole (1999) also comments that since only some options within the
systems in the matrix are selected in the construction of any one text, it is
not necessary to account for every system in the analysis of a text. Likewise,
in the IMM, there are many systems used to describe and analyse a multimodal text. However, not every single system needs to be accounted for in
an analysis; rather, the model is to serve our purpose of understanding how
meaning is made in a multimodal text through the choices which have been
made in the text.
Despite these possible weaknesses, a categorical framework for the
analysis of a multimodal text that pays attention to the meaning made on
the Expression plane as well as on the Space of Integration is helpful.
IMM may be likened to a neat (although at this stage underequipped)
toolbox. The toolbox contains concepts and a theoretical meta-language to
describe and account for phenomena which arise in the multimodal construction of meaning. Just as one does not use all the equipment in a
toolbox in any one instance, the analyst selects the tools most useful for the
analysis of the text. However, it is also realized that the IMM and the
accompanying conceptual apparatus are provisional and exploratory.
There remains much work to be done in the theory and practice of multimodal analysis.

Acknowledgements
Plate 9.1 and Picture A in Plate 9.2 are reproduced with kind permission
from SNP Panpac Pte Ltd, Singapore from the children's picturebook
Dominic Duck Goes to School (2000) written by Maeli Wong and illustrated by
Don Low. I thank Zuraidah Jaffar for generously waiving the copyright fees
for reproducing these pictures.
Picture B and Picture E in Plate 9.2 are reproduced from When Sheep
Cannot Sleep written and illustrated by Satoshi Kitamura with kind permission from Andersen Press. Thanks also to Red Fox who currently publish the
paperback version of the book.
Picture C in Plate 9.2 is reproduced from the book Rhyming Round Singapore
(Yee, 1998), written by Patrick Yee, Kathleen Chia and Linda Gan, Girl's
Brigade, Singapore. Thanks to Linda Gan for kindly granting permission to
reproduce the picture.
Picture D in Plate 9.2 is reproduced from The Tidy-Up Race and Picture
F in Plate 9.2 and Plate 9.3 are reproduced from Lightning and Thunder by
E. Sallustio with kind permission from the Educational Publishing House
(Singapore) with special thanks to Margaret Tan for her assistance.
Plate 9.4 is reproduced from the website http://www.hearts-on-fire.com
with kind permission from 'Hearts on Fire - The World's Most Perfectly Cut
Diamond'.

References
Baldry, A. P. (2000) (ed.) Multimodality and Multimediality in the Distance Learning Age.
Campobasso, Italy: Palladino Editore.
Baldry, A. P. (this volume). Phase and transition type and instance: patterns in media
texts as seen through a multimodal concordancer, 83-108.
Baldry, A. P. and Thibault, P. J. (2001) Towards Multimodal Corpora. In G. Aston
and L. Burnard (eds), Corpora in the Description and Teaching of English, Bologna:
CLUEB, 87-102.
Baldry, A. P. and Thibault, P. (forthcoming) Multimodal Transcription and Text.
London: Equinox.
Barthes, R. (1977) Rhetoric of the image. In R. Barthes (S. Heath, ed. and trans.),
Image-Music-Text. London: Fontana, 32-51.
Bohle, R. (1990) Publication Design for Editors. New Jersey: Prentice-Hall.
Bordwell, D. and Thompson, K. (1997) Film Art: An Introduction. New York:
McGraw-Hill.
Carroll, N. (2001) Beyond Aesthetics: Philosophical Essays. Cambridge: Cambridge University Press.
Cheong, Y. Y. (1999) Construing meaning in multi-semiotic texts: a systemic-functional approach. Unpublished masters thesis. National University of
Singapore.
Cheong, Y. Y. (this volume) The construal of ideational meaning in print advertisements.
Doonan, J. (1993) Looking at Pictures in Picture Books. Exeter: Short Run Press.
Felix, M. (1980) The Story of a Little Mouse Trapped in a Book. La Jolla, CA: The Green
Tiger Press.
Floch, J.-M. (1986) Les Formes de l'empreinte. Périgueux: Pierre Fanlac.
Gombrich, E. (1960) Art and Illusion. London: Phaidon Press.
Halliday, M. A. K. (1978) Language as Social Semiotic. London: Edward
Arnold.
Halliday, M. A. K. (1994) An Introduction to Functional Grammar (2nd edn). London:
Arnold.
Iedema, R. (2003) Multimodality, resemioticization: extending the analysis of discourse as a multi-semiotic practice. Visual Communication, 2(1): 29-57.
Kitamura, S. (1986) When Sheep Cannot Sleep: The Counting Book. New York: Farrar
Straus Giroux.
Kress, G. (2003) Literacy in the New Media Age. London: Routledge.
Kress, G. and van Leeuwen, T. (1996) Reading Images: The Grammar of Visual Design.
London: Routledge.
Kress, G. and van Leeuwen, T. (1998) Front page: (the critical) analysis of newspaper layout. In A. Bell and P. Garrett (eds), Approaches to Media Discourse. Oxford:
Blackwell, 186-219.
Kress, G. and van Leeuwen, T. (2001) Multimodal Discourse: The Modes and Media of
Contemporary Communication. London: Arnold.
Kress, G., and van Leeuwen, T. (2002) Colour as a semiotic mode: notes for a
grammar of colour. Visual Communication, 1(3): 343-368.
Lemke, J. L. (1998) Multiplying meaning: visual and verbal semiotics in scientific
text. In J. R. Martin and R. Veel (eds), Reading Science: Critical and Functional Perspectives on Discourses of Science. London: Routledge, 87-113.
Lemke, J. L. (2002) Notes on multimedia and hypertext. Available online: http://www-personal.umich.edu./~jaylemke/
Lim, F. V. (2002) The analysis of language and visual images: an integrative multisemiotic approach. Unpublished masters thesis. National University of Singapore.
Martin, J. R. (1992) English Text: System and Structure. Amsterdam: Benjamins.
Martin, J. R. and Rose, D. (2003) Working with Discourse: Meaning Beyond the Clause.
London: Continuum.
O'Halloran, K. L. (1999a) Interdependence, interaction and metaphor in multisemiotic texts. Social Semiotics 9(3): 317-354.
O'Halloran, K. L. (1999b) Towards a systemic-functional analysis of multisemiotic mathematics texts. Semiotica, 124(1/2): 1-29.
O'Halloran, K. L. (2000) Classroom discourse in mathematics: a multi-semiotic
analysis. Linguistics and Education, 10(3): 359-388.
O'Halloran, K. L. (2003) Intersemiosis in mathematics and science: grammatical
metaphor and semiotic metaphor. In L. Ravelli, A.-M. Simon-Vandenbergen and
M. Taverniers (eds), Grammatical Metaphor: Views from Systemic Functional Linguistics.
Amsterdam: Benjamins.
O'Halloran, K. L. (forthcoming). Mathematical Discourse: Language, Visual Images and
Mathematical Symbolism. London: Continuum.
O'Toole, M. (1994) The Language Of Displayed Art. London: Leicester University
Press.
O'Toole, M. (1995) A systemic-functional semiotics of art. In P. H. Fries and
M. Gregory (eds), Discourse in Society: Systemic-Functional Perspectives: Meaning and
Choice in Language: Studies for Michael Halliday (159-179). Norwood, NJ: Ablex,
159-182.
O'Toole, M. (1999) Engaging with Art. [CD-ROM] Murdoch University, Perth
Western Australia.
Pang, K. M. A. (2000) Designing Children In Changing World, Changing Hopes: A
Multi-semiotic Analysis of a Museum Exhibition. Unpublished Honours Dissertation. National University of Singapore.
Pang, K. M. A. (this volume). Making history in From Colony to Nation: a multimodal
analysis of a museum exhibition in Singapore, 28-54.
Royce, T. (1998a) Intersemiosis on the Page: A Metafunctional Interpretation of
Composition in the Economist Magazine. In P. Joret and A. Remael (eds),
Language and Beyond. Amsterdam: Rodopi, 157-176.
Royce, T. (1998b) Synergy on the Page: Exploring Intersemiotic Complementarity
in Page-Based Multimodal Text. JASFL Occasional Papers, 1(1), 25-49.
Royce, T. (2002) Multimodality in the TESOL Classroom: Exploring Visual-Verbal
Synergy. TESOL QUARTERLY, 36(2), 191-205.
Saint-Martin, F. (1990) Semiotics of Visual Language. Bloomington, IN: Indiana University Press.
Sallustio, E. (1999a) The Tidy-Up Race. Singapore: The Educational Publishing
House Pte Ltd.
Sallustio, E. (1999b) Lightning and Thunder. Singapore: The Educational Publishing
House Pte Ltd.
Sardar, Z. and Van Loon, B. (2000) Introducing Media Studies. Cambridge: Icon Books.
Sonesson, G. (1993) Pictorial semiotics, gestalt theory, and the ecology of perception. Semiotica 99 (3/4): 319-399.
Sperber, D. and Wilson, D. (1986) Relevance, Communication and Cognition. Oxford:
Blackwell.
Thibault, P. J. (2000) The multimodal transcription of a television advertisement:
theory and practice. In Baldry (ed.), Multimodality and Multimediality in the Distance
Learning Age, 311-384.
Thibault, P. J. (forthcoming) The theory and practice of multimodal analysis of
video texts. In Baldry and Thibault (forthcoming).
Thürlemann, F. (1990) Vom Bild zum Raum. Beiträge zu einer semiotischen Kunstwissenschaft. Köln: DuMont.
van Leeuwen, T. (2000) Some notes on visual semiosis. Semiotica 129(1/4): 179-95.
Wong, M. (2000) Dominic Duck Goes to School. Singapore: SNP Education Pte Ltd.
Yee, P. (1998) Rhyming Round Singapore. Singapore: Girl's Brigade Singapore.

Index

Aarseth, E. 132
Advertisements (printed) 4, 83
Alberts, B. et al. 4,197,217
see also ECB
Anderson, B. 3
Announcement 164-5, 171, 173, 175-82,
186
Primary 165-7, 170, 173, 179-81,
183-7
Secondary 165, 167, 170, 173, 180-1,
184-5
Antze, P. and Lambek, M. 49
Appraisal 32, 38-40, 42-3
see also language systems
architecture 11-27, 55-6, 66, 68, 72, 110
Arnheim, R. 43, 208-9, 217
Australian Broadcasting Commission
(ABC)
15
Baigrie,B. S. 196,217
Bakhtin, M. 21
Bal, M. 43
Baldry, A. P. 1, 2, 3, 5, 29, 83, 84, 85, 89,
90,94, 106, 110, 111, 113, 118, 163,
196, 203-4, 215-17, 219, 220, 228
Baldry, A. P. and Taylor, C. 84, 96, 99, 106
Baldry, A. P. and Thibault, P. J. 5, 84, 98
Barthes, R. 135, 153, 156, 164, 175, 180,
191-2, 209, 214, 217, 228, 232, 239
Bastide, E 209,217
Beetle 165, 171-2, 175
Belcher, M. 31
Bennelong Point 21
Bennett, T. 31,40,43
Bernstein, B. 91
Betsky, A. 55
Bi-directional investment of meaning 4,
164, 176-8, 188
see also intersemiotic mechanisms
biology texts 4, 196-219
see also Essential Cell Biology (ECB]

Bohle,R. 148,168
Bordwell, D. and Thompson, K. 114, 115,
116, 117,238
Bruns,A. 192
buildings
see also Sydney Opera House 556, 60,
66
Business Times 33
Callaghan, J. and McDonald, E. 1,
110
Call-and-Visit Information 164-7, 170,
174-6
camera work 86, 88-9, 93, 119, 125, 126,
238
Capture 164, 176
Centre of Visual Impact (C VI) 148,
231
Cheong,YY. 163
chiaroscuro 18,234
Chinatown 4,110,113-27
Christopher, N. 116
chthonicity in buidings 21
Chua, B. H. 48, 55
Circular Quay, Sydney 21
circulation path (in exhibitions) 40-4
clarity and focus 119
cluster 135-7
Coffin, C. 41
collocation 1
colour 119,124,224,225
see also language systems
colour cohesion and contrast 68, 72, 124,
125, 126, 127
see also language systems
Communist Party 38-54
Comp.LoA see Complement to the Locus of
Attention
Complement to the Locus of Attention
165-97, 169, 170, 189
compositional balance 124,125

248

INDEX

compositional (textual) meaning


buildings 2, 11-13, 15, 23-5, 26, 60-1,
66-8,71
cities 61,66-7
film 120-3, 124-7
hypertext 141-5
museum exhibitions 31, 41-4, 47-8
schematic drawings 200, 206, 208-9
statistical graphs 202-3
concordancer see Multi-modal Corpus Authoring
(MCA) system
conjunction 11
connectors in buidings 23
Contextual Propensity (CP) 4, 164, 176-8,
186, 188-94, 239
see also intersemiotic mechanisms
Contextualising Relations 223, 239
see also intersemiosis
Cook, G. 83,175
Cortazzi, M. and Jin, L. 49
costume 124
culture of consumerism 65, 77
Dale, O. J. 57
Darwin, C. 197
Dean, D. 36
Dellora, D. 27
Display 164-7, 170, 175-6, 186
congruent 1657
explicit 165-7
implicit 165, 170
incongruent 165, 170
DoonanJ. 232-8
duration image 126
Dyer, G. 187
eating spaces 1618
Eaton, M. 114,115,116-17
Emblem 164-7, 170, 175-6
English for Specific Purposes (or English for
Academic Purposes) 196, 215-16
Enhancer 164-7, 170-1, 173, 175-7, 182-7
Epson 165-8, 175, 188, 191-3
Essential Cell Biology (ECB) 197, 204-14
experiential meaning see representational
meaning
expression plane 226-38
see also language system, graphics and
typography
Fairclough, N. 38
Fan car advertisement 85-105
Fawlty Towers 18
Focus 164, 176
Fong, T. W. 57
Ford, B. J. 196, 217
film 109-30
see also multimodal framework
framing 124, 125, 126
From Colony to Nation 2, 33-54
functionalism in architecture 15, 26
functionalism in design 18
gaze 76, 84, 90-1, 105, 124, 125, 126, 127
Generic Structure 164, 166-7, 170
Generic Structure Potential (GSP) 163-5,
174-6, 194
genre
advertisements (print) 4
dynamic 95-6, 105-6
film 116-17
television advertisements 84, 96, 105
Gestalt theory 225
gesture 84, 106, 125, 126
Given and New Information 164, 192, 194
Goffman, E. 18, 27
Goldman, R. 179
Golf 164, 166, 171, 173-5, 177, 179, 185-7
grammatical metaphor 206-7
graphics 222, 234-8
see also expression plane, language system,
graphics and typography
perspective 234
form 238
strokes 238
graphology 226
Gregory, M. 84, 87, 89, 94, 112, 113
Gregory, M. and Malcolm, K. 84, 94, 98
Guess? 165, 175, 188-94
Gwee, P. K. W. 68, 72
Haas, C. 197, 217
Hall, M. 31
Halliday, M. A. K. 1, 4, 5, 27, 28-9, 32, 56,
84, 85, 87, 98, 110, 131, 133, 142,
176, 184, 186, 196, 204, 206-7, 212,
217-18, 220, 233
Halliday, M. A. K. and Hasan, R. 85, 164
Halliday, M. A. K. and Martin, J. R. 28
Harris, R. 29
Hasan, R. 164, 176
Heisner, B. 114, 117, 127
Heng, G. and Devan, J. 138
Hernadi, P. 32-3
Hirsch, E. 116
Hodge, R. and D'souza, W. 33
homospatiality 5
see also intersemiotic mechanisms
Hooke, R. 198
Hooper-Greenhill, E. 30, 31, 33

Humphries, B. 21
hypertext 4, 26, 131-59
Iedema, R. 1, 110
Integrative Multisemiotic Model (IMM)
220-46
see also intersemiosis, intersemiotic
mechanisms and multimodal
frameworks
contrastive salience 231-3
critical impetus 224, 232
salience 232-3
textual unity 232
cohesiveness 232
saturation 233
hue 233
connotative value 232-3
denotative value 232-3
homospatiality 223
perceptual equity 228-9
space of integration 223, 238
system-metafunction fidelity 224,
232
Item 134-5, 137, 141-59
interpersonal (modal) meaning
buildings 11-13, 15, 17-19, 23-5, 61,
65-8, 71-2
cities 66
film 119, 120-3, 124-7
hypertext 141-55
museum exhibitions 32, 38-40, 42-3,
47-8
schematic drawings 199-200, 204-7,
209-10
statistical graphs 202-3, 210-11
television car advertisements 86, 90-91
Interpretive Space (IS) 4, 164, 176-8, 186,
188-94, 239
see also intersemiotic mechanisms
intersemiosis
see also intersemiotic mechanisms
advertisements (print) 163-95
biology text 196-219
buildings 61, 55-79
children's picture books 220-246
cities 55-79
hypertext 4, 152-59
Integrative Multisemiotic Model (IMM)
5, 220-46
film 109-130
museum exhibitions 29-54
multimodal concordancing 83-108
scientific text 4, 203-5, 210-12, 214
television advertisements 83-4, 87,
90-5, 95-106

intersemiotic mechanisms
Bi-directional investment of meaning 4,
164, 176-8, 188, 239
conjunction/disjunction 92-3
Contextual Propensity (CP) 4, 164,
176-8, 186, 188-94, 239
homospatiality 5, 223, 240
Integrative Multisemiotic Model (IMM)
220-46
Interpretive Space (IS) 4, 164, 176-8,
186, 188-94, 239
RIM (Relation, Intersection,
Manifestation) 155-6
Semantic Effervescence (SE) 4, 164,
176-8, 186, 188, 191-4, 239
semiotic metaphor 5, 61, 71, 241
synchronization 90-1, 93
Visual Metaphor 4, 164, 168-9, 176, 240
intertextuality 3, 21, 23, 68
intertextual motif 124, 125
Janovy, J. Jr. 197, 218
Jayapal, M. 55
Jewitt, C. 1
Johns, A. 4, 215-16, 218
Justification 164, 176
Kaplan, E. A. 116
Kavanagh, G. 31
Keung, J. (57)
kinesics 85-95, 119
Knight, D. 204, 218
Kok, K. C. A. 157
Kress, G. 1, 61, 74, 132, 196, 203-4, 215,
218
Kress, G. and van Leeuwen, T. 1, 4, 5, 29,
31, 44, 84, 91, 110, 131, 132, 153,
163-6, 169, 192-3, 196, 203-4, 218,
220, 224, 227-8, 230
Kress, G. et al. 1, 29, 215, 218
Kronberg Castle, Helsingor 23
Krutnik, F. 114, 116
Kuhn, T. S. 204, 218
Landow, G. 131, 132, 135
language systems
see also intersemiosis
Appraisal 32, 38-40, 42-3, 174
Colour 68, 72, 141, 152, 173, 234
Font 142-3, 148, 150, 152, 155, 173,
234
Modality 11
Mood 11
Position 165
Prominence 141, 169, 173


Salience 165, 169, 173, 177-8, 181


Scale 141, 173
Size 141, 173
Latour, B. 215, 218
Lead 164-5, 167, 170, 175-89, 191
Lee, T. H. 47
Lemke, J. L. 1, 4, 5, 29-30, 47, 110,
132, 163, 194, 196-8, 201, 204, 211,
215, 218, 221, 229
Lexia 135-7, 141-59
lighting 74, 119, 124, 127
Lim, B. L. L. 132
LoA see Locus of Attention
Locus of Attention 165-71, 178-81,
185-9, 191, 193
logical meaning
elaboration 185
enhancement 185
expansion 184
extension 185
film 120-3
Lombardo, L. 84
Liu, T. K. 57
Lynch, K. 55, 67-8
Lynch, M. 196, 215, 218
Lynch, M. and Woolgar, S. 215, 218

multimodal frameworks
see also intersemiosis and intersemiotic
mechanisms
architecture 68, 77
advertisements (print) 4, 163-95
buildings 2, 60, 62-3, 77
children's picture books 220-46
cities 3, 58-9, 77
film 3, 114-18, 120-3
hypertext 4, 133-7, 143
museum exhibitions 3, 28, 34-5
schematic drawings 4, 199-201
statistical graphs 4, 201-3
television car advertisements 85-95
see also phase and transition
multimodal software see Multimodal Corpus
Authoring (MCA) system
multimodal concordancer see Multimodal
Corpus Authoring (MCA) system
multimodal transcription 3, 83-95, 96,
106, 110-11, 112-13
Multimodal Corpus Authoring (MCA) system
3, 27, 83-4, 85, 95-106, 110-11, 127
Mumford, L. 55
museum exhibitions 3, 28-54
Myers, G. 196-7, 199, 215, 218

Ml 165, 169-71, 173-5, 180-5


McDonald, S. 28
McInnes, D. 110
McMurry, J. and Castellion, M. E. 198,
218
makeup 124
Maroevic, I. 30
Marriott Hotel (Singapore) 3, 56, 61, 66-8,
71-7
Martin, J. R. 1, 31, 32, 36, 220
Martinec, R. 110
mathematical and statistical graphs 4
MCA see Multi-modal Corpus Authoring (MCA)
system
metafunctional organization text 84, 85,
86-95, 98-9, 110-11, 113, 127
meta-language 222
metaphor see visual metaphor, semiotic
metaphor and intersemiosis
Ministry of Education (MOE) Singapore
4, 131, 136, 137-57
Modality see language systems
Mood see language systems
Moore, M. 132
movement-action-event 86, 88-93, 125,
126
multimodal communicative competence
216

National Heritage Board (Singapore) NHB 47, 49
Neal, A. G. 49
Noble, S. 72
on-screen space 124, 125, 126, 127
O'Donnell, M. 85
O'Halloran, K. L. 1, 2, 3-4, 5, 29, 41, 110,
148, 163, 188, 196-7, 201-3, 212, 215,
218-19, 221, 228, 241
O'Halloran, K. L. and Judd, K. 98, 106,
110
opacity 23
Osborne, J. 197-8, 219
Orchard Road (Singapore) 3, 55-7, 61,
65-8, 71-2, 76-7
O'Toole, M. 1, 2, 3, 4, 11-13, 16, 27, 29, 30,
32, 44, 55-6, 60-1, 65, 67-8, 72, 77,
84, 110, 111-12, 118, 131, 133, 134,
142, 143, 145, 153, 163, 169, 173,
188-9, 197, 199-200, 203, 208-9,
219, 227, 243
Ove Arup and Partners 16
Palmer, R. B. 116
Pavesi, M. and Baldry, A. P. 85, 105, 216,
219
Pearse, S. M. 30

Peirce, C. S. 207-8, 216 n.2, 219
perspective 55, 124, 126, 234
see also graphics
phase 3, 84, 85-95, 95-106, 110-11,
112-13
phonology 226
photography 44-8
Phillips, D. 46
Piastra, M. 105
Pike, B. 55
Place, J. 117
Polanski, R. 4, 110, 114, 116, 119, 125,
126-7, 127-8, 127-8 n.1
politics of buildings 19
Pompidou Centre 19
Preziosi, D. 3, 55, 60-1
Price, D. 44, 46
Price, D. and Wells, L. 46
proxemics 19, 125, 126, 127
PuruShotam, N. S. 47
Purves, W. K. 197, 219
Ravelli, L. 29, 30, 31, 33
reading path 148-52, 157, 168, 178, 184,
187, 203-5, 210-11, 230
reflectivity 23
representational (Experiential) meaning
advertisements (print) 164-7, 170-1,
173-7, 185-7, 192-3
buildings 2, 12-13, 15, 60, 70-2, 74
film 94-5, 104, 119, 120-3
hypertext 145-6, 148-159
museum exhibitions 29, 36-40, 44-8
schematic drawings 200-1, 206-10
statistical graphs 201-3, 211-12
television advertisements 89-93, 95-104
rhythm 90, 119
Rogers, R. 19
roof-tiles 17-18
Royce, T. 1, 215, 219, 221
Saarinen 18, 21
Safeyaton, A. 56
schematic drawings 4, 199-201, 204-10
scientific texts
see biology texts, schematic drawings, and
statistical graphs
Scott, P. and Jewitt, C. 215, 219
sculpture 110
Seidler, H. 17, 27
Semantic Effervescence (SE) 4, 164,
176-8, 186, 188, 191-4, 239
see also intersemiotic mechanisms
semiotic metaphor 5, 41-3, 212, 214, 241
see also intersemiotic mechanisms


setting 124, 125, 126


Singapore History Museum (SHM) 3,
28-54
Singapore Master Plan 56-7
Singapore Ministry of Education (MOE) see
Ministry of Education
Smith, C. S. 31, 46
software see also Multimodal Corpus Authoring
(MCA) system
Adobe Premiere 4, 83-4, 113
HyperContext Web 105
OCP 98
Systemics 1.0 111
video editing tools 109, 110, 111, 113,
118-27
Wordsmith 98
soundtrack 84, 85-95, 112, 122-3
space 40-4
Stables, K. 117
statistical graphs 201-3, 210-14
Stern, R. A. M. 55
Swales, J. 215, 219
Sydney Harbour 24
Sydney Harbour Bridge 21
Sydney Opera House 2, 11, 14-27
Sydney Symphony Orchestra 15
Sykes, J. 15, 16, 24, 27
systemic functional frameworks see
intersemiotic mechanisms, language
systems, multimodal frameworks and
visual images
system network 225
Tag 164-6, 170, 174-6
Tagg, J. 46
Tan, J. H. 57
Tan, S. 56, 57
Taylor, C. and Baldry, A. P. 84
television advertisements 85-106, 112-13
texture 17
textual meaning see compositional meaning
theatre 16
The Straits Times 33, 67, 138-40
Thibault, P. J. 1, 5, 29, 44, 46, 83, 84, 86,
87, 90, 92, 93, 94, 102, 104, 110,
112-13, 118, 208, 214-15, 219, 221,
230
This Week Singapore 64
Tilling, L. 217 n.4, 219
topological and typological meaning
198-9, 201-2
topological grammar 194
Towne, R. 114
transition 3, 65, 84, 92-95, 95-106, 110,
113, 118


Tuman, M. C. 131
Tuska, J. 114, 116
typography 222, 234-237
see also expression plane
font 234
layout 234
spacing 234
URA Annual Report (Singapore) 56-7, 61,
66-7
Unsworth, L. 1, 132
Utzon, J. 2, 11, 15, 16, 17, 19, 21, 23, 24,
27
van Leeuwen, T. 110, 228-9
Ventola, E., et al. 1, 110
Vergo, P. 31, 36
verticality in buildings 21
video texts 3, 110-11
see also television advertisements and film
visual images
see also intersemiosis, intersemiotic
mechanisms and multimodal
frameworks
advertisements (print) 3, 163-95
biology texts 4, 196-219

film 3-4, 109-30


hypertext 141-59
Integrative Multisemiotic Model (IMM)
220-246
museum exhibitions 44-8
paintings 110, 111-12, 118
scientific drawings 4, 196-219
software for analysis see multimodal
software
television advertisements 83-106
Vasta, N. 84
Visual Metaphor 4, 164, 168-9, 176,
240
see also intersemiotic mechanisms
Voytilla, S. 116
web 136-7
Wee, O. K. 163, 168
Wee, C. J. W-L 47, 48-9
Wernick, A. 187
White, P. R. R. 173-4
Whyte, W. 61, 65

Zago, S. 94
Zammit, K. and Callow, J.
Zar, J. H. 198, 219
