In its most basic sense, multimodality is a theory of communication and social semiotics. Multimodality describes communication practices in terms of the textual, aural, linguistic, spatial, and visual resources – or modes – used to compose messages. Where media are concerned, multimodality is the use of several modes (media) to create a single artefact. The collection of these modes, or elements, contributes to how multimodality affects different rhetorical situations, or opportunities for increasing an audience’s reception of an idea or concept. Everything from the placement of images to the organization of the content creates meaning. This is the result of a shift away from reliance on isolated text as the primary source of communication and towards more frequent use of the image in the digital age. While multimodality as an area of academic study did not gain traction until the twentieth century, all communication, literacy, and composing practices are and always have been multimodal.
Information is presented through the design of digital media, which engages with multimedia to offer a multimodal principle of composition. Standard words and pictures can be presented as moving images and speech in order to enhance the meaning of words. Joddy Murray wrote in “Composing Multimodality” that both discursive and non-discursive rhetoric should be examined in order to see the modes and media used to create such compositions. Murray also describes the benefits of multimodality, which lends itself to “acknowledge and build into our writing processes the importance of emotions in textual production, consumption, and distribution; encourage digital literacy as well as nondigital literacy in textual practice.” Murray shows a new way of thinking about composition, one that allows images to be “sensuous and emotional” symbols of what they represent, with less focus on the “conceptual and abstract.”
Drawing on Richard Lanham’s The Electronic Word: Democracy, Technology, and the Arts, Murray writes in his article that “discursive text is in the center of everything we do,” and goes on to say that students coexist in a world that “includes blogs, podcasts, modular community web spaces, cell phone messaging…”, urging that students be taught how to compose through rhetorical minds in these new, and not-so-new, texts. “Cultural changes, and Lanham suggests, refocuses writing theory towards the image”, demonstrating a change in alphabet-to-icon ratios in electronic writing. One prime example can be seen in the Apple iPhone, in which “emojis” appear as icons on a separate keyboard to convey what words would once have delivered. Another example is Prezi. Often likened to Microsoft PowerPoint, Prezi is a cloud-based presentation application that allows users to create text, embed video, and make visually aesthetic projects. Prezi’s presentations zoom the eye in, out, up and down to create a multi-dimensional appeal. Users can also combine different media within this medium, which is itself unique.
The term ‘affordance’ has its origins in the work of the psychologist James Gibson (1979) on perception and action. Working from an ‘interactionist’ perspective, Gibson focuses on agent-situation interaction, which means that he defines affordances as all ‘action possibilities’ latent in an environment, where the potential uses of a given object arise from its perceivable properties and always in relation to the actor’s capabilities and interests (because perception is always selective). Donald Norman (1988) took up these ideas in relation to the design of objects, emphasizing social as well as material aspects. Adapted by Kress (e.g. 2010), the term ‘modal affordance’ has particular currency in multimodality. It refers to the potentialities and constraints of different modes – what it is possible to express and represent or communicate easily with the resources of a mode, and what is less straightforward or even impossible – and this is subject to constant social work. From this perspective, the term ‘affordance’ is not a matter of perception, but rather refers to the materially, culturally, socially and historically developed ways in which meaning is made with particular semiotic resources.
The affordance of a mode is shaped by its materiality, by what it has been repeatedly used to mean and do (its ‘provenance’), and by the social norms and conventions that inform its use in context – and this may shift over timescales and along spatial trajectories (Lemke, 2000; Massey, 2005). Each mode – as it has been shaped and is socially contextualized – possesses certain ‘logics’. The logic of sequence in time is characteristic of speech: one sound is uttered after another, one word after another, one syntactic and textual element after another. In producing possibilities for putting things first or last, or somewhere else, in a temporal arrangement, this sequentiality becomes an affordance. In contrast, still images are more strongly governed by the logic of space and simultaneity, because items are represented concurrently. Not without its critics, the term ‘affordance’ is subject to ongoing debate.