Through the Flat Canvas: The Motor Meaning of Realistic Paintings

Silvano Zipoli Caiani


It is a common experience that objects in the environment indicate how we are to interact with them. For example, a door handle suggests an interaction based on whole-hand grips, whereas a mug handle suggests that we interact with it by means of a precision grip. Additionally, the type of interaction that is suggested by an object is biased by our motor intentions and goals. A spoon, for example, suggests an inter-digit grip when we need to stir our coffee, but it suggests a clenched grip when we need to feed a baby.

Over the last few decades, researchers in the cognitive science of vision have attempted to understand how we perceive the possibilities of action in our environment. With the increasing consensus regarding an embodied approach to cognition (e.g., Shapiro [2011] for a moderate view; Chemero [2009] for a more radical approach), it has become increasingly evident not only that perception serves the planning and execution of action, but also that our planning and execution of actions shape the way we perceive the environment (Hurley [2001], Noë [2004]). As a consequence of this view, perception and action should be understood as two intimately related cognitive processes, so that it is nowadays generally agreed that we access our perceptual experience through the lens of our motor competence and goals.

Although interactions between perception and action have been the focus of a large and rapidly growing field of theoretical and empirical research over the past several decades, there are still divergences in the philosophical and experimental literature. An important point of controversy among scholars concerns the perception of action possibilities that are evoked by bi-dimensional representations of action-related objects such as pictures and paintings. Notably, there is no consensus concerning how bi-dimensional representations suggest opportunities for action and what epistemic value pertains to this sort of perception. The reason is that pictures and paintings do not allow for the execution of the actions that are suggested by the objects that they represent. This raises the question of whether those possibilities for action that are evoked by bi-dimensional representations can be genuinely perceived or whether they are instances of mere illusion.

It is common to conceive of an illusion as “any perceptual situation in which a physical object is actually perceived, but in which that object perceptually appears other than it really is” (Smith [2002]: 23). Thus, the perceptual experience of potential actions that are evoked by bi-dimensional representations can be regarded as an instance of illusory perception because the properties of objects as they are perceptually represented do not reflect the properties of objects as they actually are.

Following this line of thought, James Gibson (1979) and later scholars (Turvey et al. [1981]) have developed an influential framework that suggests that perceiving possibilities of action involves interacting with the action-related dispositional properties of the environment. According to this view, action possibilities in the environment are specified by the external structure of the ambient light array and are directly perceivable without the need for any internal representation. Consequently, because action-related information is obtained directly from the environment, a misrepresentation results in cases in which perceptual information does not reflect the environment.

For Gibson, pictures and paintings should be understood as nothing but surfaces treated in such a way that a delimited optic array to an available point of observation contains the same type of information that is found in the natural array of light. However, this occurs without specifying the presence of any genuine environmental property (Gibson [1971]: 31). According to this view, because a picture or a painting of an object may contain false information concerning the presence of possible motor interactions that are not available in the nearby environment, the perceptual experience of pictures and paintings may be understood as a prototypical case of illusion. As a consequence, a significant part of our experience of works of art, such as works of photography, hyper-realistic painting or trompe l’oeil effects, should be understood as only an instance of false experience.

However, there is a way to resist the temptation to write off a great deal of aesthetic experience as mere illusion. As an alternative to this view, we might begin by assuming that the attribution of motor meanings to perception is the natural way of making sense of our perceptual experiences of the action-related three-dimensional objects that are situated in our natural environment. Consistent with this view, we can hypothesize that the ascription of motor meanings to bi-dimensional objects is also part of the natural way in which we pragmatically understand the system of representations that makes up our cultural environment.

It is interesting to note that over the last decades, a large amount of evidence has been accumulated to support the view that our perceptual access to the world is substantially shaped by our motor abilities and intentions. In particular, it has been shown that the activation of the motor system plays a functional part in object perception, allowing for the pragmatic interpretation of our perceptual experience to plan and execute motor actions. Remarkably, the activation of the motor system during perceptual tasks does not distinguish between three-dimensional and bi-dimensional targets; rather, it appears to be functionally relevant in both cases.

The aim of this paper is to maintain that the attribution of motor meanings to perceptual experience is not merely a matter of correspondence with the environment but is the way we commonly interpret and understand the motor significance of both real and represented objects. My argument is based on an analysis of a large amount of experimental data showing that the functional role of the motor system is pervasive in perceptual experience. Accordingly, the paper is divided into two parts. The first part introduces details concerning the dispositional account of perception. Here, I focus on the notion of affordance as it is developed within the ecological approach to perception that was discussed by Gibson and his followers. Then, I show that this notion of affordance has serious consequences for the way we understand the perceptual experience of pictures and paintings. The second part is devoted to an introduction of the dual-stream theory of perception, which stresses the role of pragmatic information in perception. In this section, I review the relevant evidence showing that the activation of the motor system complements our conceptual knowledge and contributes to the way we give meaning to our perceptual experience. Finally, considerations concerning the role of the motor system in meaning attribution and aesthetic experience will be outlined in the conclusions.

2. The Dispositional Account of Affordance

2.1 Affordances as Dispositional Properties

After Gibson introduced the concept of affordance (Gibson [1966], [1979]), the nature and function of action in perception became a distinguished topic of research in many areas of the cognitive sciences. According to Gibson, the motor meaning of objects in the environment is specified by the ambient array of light, so no internal information is required to complement the perceptual stimulus and execute a planned action. Gibson argued that the perception of objects consists of the detection of possible interactions without involving high-level cognitive skills such as reasoning about properties and categories.

In particular, Gibson (1979) focused on the ambient array of light as the only source of perceptual information that specifies and makes available to the agent all of the visual information about the environment that is necessary for the agent’s survival. According to Gibson, the agent’s ability to pick up the information from the optic array does not require computation of internal representations. Importantly, the possibilities of action of any given object do not depend only on the intrinsic properties of the object, but also on the bodily properties of the perceiving agent. A hammer, for example, affords grasping actions for agents with hands that are endowed with opposable thumbs, but not for dogs, for which it might afford biting and licking. Moreover, in the case of object manipulation, an agent who sees a tool does not notice its shape and colors and then infer its possible use; rather, he or she quickly and unreflectively perceives its properties against the motor-related properties of his or her body. The quickness that characterizes the perception of such motor properties, usually called affordances (Gibson [1979]), relates to the fact that sensing the environment is a dynamical interaction that does not involve conceptualization and propositional skills. Motor relatedness, in turn, implies that perception is for action, and that our visual experience of the environment is shaped by the agent’s motor abilities, which are made available by his or her bodily shape (Gibson [1979]: Chap. 8).

Among post-Gibsonian attempts to develop a theory of affordance, the most influential is the view that considers affordances to be dispositional properties that pertain to objects and that are complemented by an organism’s features (e.g., Turvey et al. [1981], Scarantino [2003]). There are many ways to conceive of a property in dispositional terms, but a commonly accepted definition frames dispositions in terms of counterfactual conditions and emphasizes the way in which things are disposed to behave given certain changes in the environmental circumstances.

Typical cases of dispositional properties are the solubility of sugar when it is immersed in water at the proper concentration and temperature and the fragility of a piece of glass that is manifested by breaking events, such as when we hit a window with a ball. Importantly, whenever we attribute a dispositional property to a certain object X, we also attribute a complementary property to another object or situation. In the case of the solubility of sugar, for example, it is necessary for water to have a solvent power to dissolve sugar, whereas in the case of the fragility of a glass window, it is necessary for the ball to be endowed with the correct amount of force and hardness to break the window.

Consistent with this definition of dispositionality, post-Gibsonian approaches to perception (e.g., Turvey et al. [1981], Shaw et al. [1982]) have conceived of affordances as dispositions of objects that behave in a certain way given complementary conditions. Therefore, action-related affordances should be viewed as properties that pertain to a class of objects that, given the appropriate background circumstances, allow an organism that is endowed with suitable properties to execute a motor action. Accordingly, when there is a positive probability that an agent will act on a target, then the target should be considered to be the bearer of a dispositional property that is related to that agent and to that action. For example, the bottle that stands in front of me is the bearer of a grasping-related affordance for me if and only if there is some minimal positive probability that I will successfully grasp it in the future (Scarantino [2003]: 956, 959–960).
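
Read schematically, the probabilistic condition just stated can be written as a biconditional. The notation is an illustrative assumption introduced here, not Scarantino's own formalism: $o$ stands for the object, $a$ for the agent, $\varphi$ for the type of action, and $C$ for the appropriate background circumstances.

```latex
\[
\mathrm{Affords}(o, \varphi, a, C)
\;\iff\;
\Pr\bigl(a \text{ successfully performs } \varphi \text{ on } o \,\bigm|\, C\bigr) > 0
\]
```

On this reading, the affordance is a relational property: changing the agent (a dog rather than a human) or the circumstances can change the truth value of the left-hand side even though the object's intrinsic properties stay the same.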

It should be noted that because an affordance is a property of the environment and its actualization gives rise to a possible action, on a dispositional account perceiving an affordance necessarily involves the presence of an object that allows the actualization of that action. Perceiving an affordance, therefore, depends on the presence of a property bearer that is appropriate for a related motor interaction with the environment, so that it becomes senseless to expect the actualization of an action without postulating the presence of a referent that is appropriate for the execution of that action. According to this view, to say that affordances are the dispositional properties of the environment is to say that the environment and the agent’s body join in a single physical system (Turvey et al. [1981]). Thus, to assess whether an object, such as a handle, actually affords a grasping action, the properties of the environment should be considered part of a coupled system together with the relevant properties of the agent’s body.

2.2 Affordances and Illusions

According to the dispositional theory (see Section 2.1), an affordance is understood in such a way that perceiving it cannot be divorced from the presence in the environment of a property bearer that is suitable for action. Following this line of reasoning, an attempt to perceive an affordance in the absence of an appropriate object that allows its actualization results in a sort of contradiction. It is a general assumption of a dispositional view that either we perceive an affordance and a suitable target for action is truly present in the environment, or there are no actual possibilities of action in the immediate area and thus we do not perceive a genuine affordance (note that this conclusion coincides with the main claim of the disjunctivist theory of perception: e.g., Austin [1962], Haddock & Macpherson [2008], Brewer [2011]).

The fact that the perception of the environment quickly evokes possible patterns of motor interactions is a natural antecedent of the planning and execution of motor actions. If this were not so (that is, if the perceptual experience of the environment were silent about how to interact with the objects around us), we would have serious difficulties in deciding how to take action. To understand this point, consider the following example. Suppose you are standing at a bar counter waiting for a coffee while chatting with your colleague. The discussion is quite involving, and your attention is entirely absorbed by a series of pointed arguments. Meanwhile, the barman places a mug of coffee in front of you. Soon you will have to decide how to grasp the mug to drink. This implies the identification of possible ways to interact with the mug, the selection of the most comfortable among these ways, and then the planning of how to actualize the selected interaction. Although these may seem to be demanding tasks, this is a common experience that does not require that we divert our attention from what we are doing to reach for the mug and drink the coffee. Grasping a mug while you chat with a friend, pulling a door handle while you remember your last holiday in Malaysia or grasping the gear lever of your car as you listen to the news on the radio are common examples in which perception successfully guides action without resorting to conscious cognitive abilities.

However, it is also a common experience that sometimes perception goes wrong and presents confounding information about the source of our experience. Relying on visual experience is not always appropriate because it can occasionally lead to discrepancies between the actual and perceived properties of objects in the environment. The perception of possibilities of action is not free of this condition of uncertainty. For example, common errors in perception for action concern apertures that appear large enough to accommodate the passage of our body but turn out to be smaller than they appear, or things that appear suitable for sitting on despite being inappropriate for this purpose because they are too soft or too fragile.

As a result, perception can guide us to the execution of incorrect actions, suggesting motor plans that might be unsuitable when they are compared to our intentions and goals. Remarkably, according to a dispositional view (see Section 2.1), because our visual experience suggests a potential for action that is not actually present in the environment, these cases are not genuine instances of affordance perception but rather cases in which we have a false experience of them.

2.3 Affordances and Art

Our world is not made only of concrete three-dimensional objects that allow for potential actions. We are immersed in a cultural environment that is filled with flat representations that mimic the possibilities for action that are allowed by common three-dimensional objects. However, in contrast to three-dimensional objects, bi-dimensional surfaces do not make available the interactions that are permitted by the objects that they represent. Pictures and drawings are cases of this type. For example, a picture of an apple may evoke a grasping action when viewed in particular circumstances. Similarly, a drawing of a door may suggest passage through a wall when it is observed from a particular distance and angle. Among these cases, the trompe l’oeil (“fool the eye”) effect is a well-known painting technique that uses bi-dimensional imagery to create an optical experience such that the depicted objects appear to exist in three dimensions and thus become possible targets of motor interaction.

The trompe l’oeil technique has its roots in antiquity. Pliny the Elder’s “Natural History”, for example, tells us the story of a rivalry between two painters, Zeuxis and Parrhasius. According to Pliny, Zeuxis was able to paint a representation of bunches of grapes that was so realistic that it attracted hungry birds that intended to eat them. Zeuxis’ rival Parrhasius was so impressed by this result that he asked Zeuxis to visit his studio to see his work. Zeuxis went to Parrhasius’ studio, where he was intrigued by a painting that was draped by a curtain, but when he approached the painting and tried to pull the curtain back to reveal the canvas, he was surprised to find that the curtain was painted. Unfortunately, no examples of such works have survived, which makes it difficult to assess their similarity to the illusionistic paintings that began to fill the walls of churches and buildings from the Renaissance onward.

A famous example of trompe l’oeil in this period is located in the church of Santa Maria presso San Satiro in Milan. The architect Donato Bramante was commissioned to create the illusion of space behind the altar that visually appeared to be three or four times deeper than it is in reality. Bramante was successful, and the area behind the altar, which actually has a depth of only one meter due to the presence of a previously existing road behind the church, was replaced by the artist with a fresco that gives the illusion of a large room down to the main aisle. As a result, a spectator who enters the church has the perceptual experience of a possible walking path that extends through the wall in front of him, at least until he or she reaches a closer distance from the wall or moves toward one of the side aisles. In such cases, indeed, the pattern of optical stimulation changes its configuration and stops mimicking that of real spatial depth.

Andrea Pozzo’s ceiling painting, the Apotheosis of St Ignatius, in the church of Sant’Ignazio in Rome, is another famous example of trompe l’oeil. The actual form of the ceiling is cylindrical, but Pozzo was able to make it appear to be a dome. Pozzo left extensive documentation of how he conceived and constructed this illusion following the principles of perspective. The illusion of a dome-shaped room instead of a cylindrical room is successful because of the skillful way in which the artist integrated the structures of the building, such as walls and columns, with their painted continuations. However, the true shape of the ceiling becomes clear as soon as the spectator moves away from the ideal viewing point. The optical experience of a dome, indeed, is only accessible from one position that is marked by a star on the floor of the church, which is the point of view that allows the viewer to experience the patterns of optical stimulation of a real dome. Different points of view do not allow this pattern of stimulation.

More recent examples of trompe l’oeil are paintings that were created by a group of American artists who were active in the late 19th century. Typical of these American trompe l’oeil paintings are William Harnett’s and John Peto’s oils on canvas, which are so striking in their realism that they are sometimes perceived as actual objects that hang on a wall. These American artists achieved their illusory effects not by the implementation of an original technique but by combining extremely fine and realistic renderings with a careful choice of subject matter. Their works demonstrate that skillful painting technique can induce a surrogate of the perceptual experience of three-dimensional objects without adding anything to the surface of a flat canvas.

During the 20th century, American Hyperrealism was an art movement that represented an American inheritance of the trompe l’oeil tradition. Hyperrealist paintings create in the observer a tangible sense of physical presence and solidity of the depicted object through the accurate choice of lighting, shading effects and colors. The spectator usually experiences the details of the depicted object closest to the front of the painting, as if they appear just beyond the flat surface of the canvas, with more clarity than can be found in natural experience.

Among the contemporary hyperrealist painters, Jason de Graaf has produced an interesting body of artworks that represent common objects on canvas but with an additional three-dimensional effect. The Canadian artist infuses the painting with a quality that pushes the spectator to search within the image for a clue to reveal its unreality; such clues, however, are almost impossible to find. Although de Graaf’s paintings may appear to be merely realistic representations of photos, the artist is able to create an illusion of depth and a sense of presence of objects that is not usually found in photographs. Things appear to be at the spectator’s fingertips, easily reachable and graspable, as in the case of the shiny balls that rest on the snow represented in “Ice Palace” or the group of glasses that are painted in “Vessels” (for a sense of de Graaf’s production, see the artist’s website).

The visual experience of one of de Graaf’s artworks, like the visual experience of the earlier examples of trompe l’oeil paintings, suggests the possibility of interaction with the depicted objects. This amounts to a sort of deception to which we voluntarily submit each time we choose to observe a hyperrealistic painting or a classical trompe l’oeil rendering. Interestingly, when we approach these types of artworks we know that we are facing nothing more than a bi-dimensional representation, but we are nevertheless subject to the same illusory effect: we have the perceptual experience of possibilities of action that are not actually present in the environment. In other words, our experience induces us to judge the environment as endowed with affordances that, paradoxically, are not available to us. How is it that we let ourselves be fooled by a trick, when we are perfectly aware of the deceptive nature of the artwork?

To answer this question, we must rethink the illusory nature of our perceptual interaction with bi-dimensional representations of objects such as trompe l’oeil and hyperrealistic paintings. Many findings from the cognitive sciences of vision address the functioning of the perceptual apparatus and reveal the importance of experiencing possibilities of action even when they are not actually available in the environment. To illustrate this point, I will present some evidence below.

3. The Informational Account of Perception for Action

3.1 Bi-dimensional Representations and Behavioral Effects

We are constantly surrounded by bi-dimensional representations of objects, which perceptual experience presents to us as offering a plurality of possible interactions that are not actually available in the environment. Accordingly, because the perception of possible interactions with the environment is quick and motor related (Section 2.1), one may hypothesize that the action-related value of bi-dimensional representations could be immediately obtained by the observer with regard to his or her bodily motor properties.

According to this view, the detection of specific patterns of interaction in the perceptual experience of a bi-dimensional representation of an object might influence the planning and the execution of actions according to the possibilities that are evoked by the represented object. This is precisely the case that is illustrated by much of the behavioral evidence concerning the perception of affordances, which shows that compatibility effects occur when agents are asked to execute a motor task that is congruent with one of the possible actions afforded by the visual target (Ellis & Tucker [2000]). What follows is a brief elaboration on this concept.

In influential research conducted by Tucker and Ellis (1998), subjects were presented with images of graspable targets, such as mugs, pans and other action-related objects, and asked to specify whether the depicted objects were right-side up or upside down by pressing keys on a keyboard using their left or right hands. This experiment showed that a significant facilitation effect occurred when the hand used in the response was on the same side as the action-related part of the observed object (e.g., the handle of a mug), even though the horizontal object orientation was irrelevant to the assigned task. The same type of compatibility effect that is related to the perception of affordances has been found when the image of a single target is presented in a set of many others rather than on its own (Derbyshire et al. [2006]). Interestingly, Costantini et al. (2010) showed that this type of compatibility effect is strictly related to an individual’s apparent possibility of interacting with a visual target. To test the hypothesis that what is relevant in perception for action is the detection of patterns of motor interaction with the environment, the authors divided the depicted space into reachable and non-reachable subspaces by showing the image of a mug either in front of or behind the image of a transparent panel. The authors found a compatibility effect only when the image of the target was apparently reachable by the observer, that is, when it was apparently located within the subject’s peripersonal space. A similar result was achieved in another experiment conducted by Vainio et al. (2011), which revealed the presence of a compatibility effect when an agent executed actions while observing the image of a graspable object but not when he or she observed the orientation of a depicted arrow.
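
As a rough illustration of how a compatibility effect of this kind is typically quantified, the following sketch computes the difference in mean reaction time between trials in which the responding hand and the handle are on opposite sides and trials in which they are on the same side. The trial format, the function name `compatibility_effect` and all reaction times are hypothetical, invented here for illustration only; they are not data or analysis code from any of the studies cited above.

```python
from statistics import mean

# Hypothetical trials: (handle_side, response_hand, reaction_time_ms).
# The values are invented for illustration and are NOT experimental data.
trials = [
    ("left", "left", 510), ("right", "right", 505),
    ("left", "left", 498), ("right", "right", 515),
    ("left", "right", 540), ("right", "left", 545),
    ("left", "right", 532), ("right", "left", 551),
]


def compatibility_effect(trials):
    """Return mean RT on incompatible trials minus mean RT on compatible ones.

    A trial counts as compatible when the responding hand is on the same
    side as the object's handle. A positive result means that responses
    were faster on compatible trials (a facilitation effect).
    """
    compatible = [rt for side, hand, rt in trials if side == hand]
    incompatible = [rt for side, hand, rt in trials if side != hand]
    if not compatible or not incompatible:
        raise ValueError("need at least one trial of each type")
    return mean(incompatible) - mean(compatible)


effect_ms = compatibility_effect(trials)
print(f"Compatibility effect: {effect_ms} ms")
```

A real analysis would compute this difference per participant and test it statistically; the sketch only shows the basic contrast that the term "compatibility effect" refers to.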

This evidence also suggests that the perceptual experience of a bi-dimensional representation of an object can evoke possible interactions with the environment and influence the execution of motor actions that are concomitant or immediately subsequent to perception. Interestingly, the visual detection of action possibilities in images of common objects influences spatial attention even when the subject’s ability to recognize the identity of the perceptual targets is not involved in the task or is even disrupted. Handy et al. (2003), for example, showed that normal subjects who are presented with images of action-related targets automatically direct their attention to the visual hemifield that is dominant for the execution of the suggested action, re-allocating their attention in the direction of the suggested motor interaction. Experiments performed on patients who suffer from specific brain lesions that impair attention are particularly significant to illustrate this phenomenon. For example, patients who suffer from extinction are an interesting target for experiments that aim to assess the interaction between perception and action. Extinction patients show a deficit in visual attention to stimuli that are presented toward the contralesional side of space while competing stimuli appear on the other side. However, Di Pellegrino et al. (2005) showed that contralateral extinction was significantly reduced in patients who were presented with images of pairs of cups that were endowed with handles in both of the visual fields, but no reduction was registered when the cups were deprived of their functional part. This means that it is easier for the image of a cup that is endowed with a handle to elicit the experience of possible motor interactions, enabling appropriate reports of its presence in the perceptual field. This type of evidence suggests that information concerning the presence of possible actions in the surrounding space can be extracted even when the observer is not able to semantically recognize the target identity, suggesting that categorization and identification abilities are not necessary for the detection of action possibilities (see also Riddoch [2003]).

3.2 Bi-dimensional Representations and Motor System Activations

The behavioral evidence that seeing the representation of action-related objects induces modification of the agent’s behavioral response to the stimulus calls attention to the role of the motor system in perceptual processing. According to a general hypothesis (Gallese & Sinigaglia [2011]), if perceptual information and motor intentions are coded together, it becomes possible to account for motor and attentional facilitations that characterize the perception of real objects and bi-dimensional images of common targets of action. Consistent with this view, a great deal of evidence supports the hypothesis that visually presented images of action-related objects activate the same cortical areas that are functionally implicated in the planning and execution of those actions afforded by represented objects.

For example, in a classic experiment, Grafton et al. (1997) assessed whether observing and naming frequently used tools activates premotor areas even in the absence of any overt motor task. As expected, the findings show that observing and naming tools activates the left dorsal premotor cortex, which is the area of the brain that is normally involved in planning motor interactions with the environment. This result is confirmed by another classic study conducted by Chao and Martin (2000). In this experiment, the subjects were visually presented with different categories of objects while they were scanned in an fMRI machine. The findings clearly reveal the sensitivity of the left ventral premotor cortex to pictures of tools, which supports the hypothesis that the ability to detect motor-related information in a perceptual stimulus may depend on the function of the same cortical regions related to planning motor interactions with the environment.

Experiments based on transcranial magnetic stimulation (TMS) can be used to assess where and when the activation of the motor system occurs during the observation of bi-dimensional representations of action-related objects. In an experiment conducted by Buccino et al. (2009), motor-evoked potentials (MEPs) from an area of the hand that is involved in grasping actions were recorded while the participants were presented with images of objects characterized by a whole or a broken handle. At the same time, the participants’ left hand motor area in the brain was stimulated 200 ms after the visual stimulus presentation. Interestingly, the results show that at this early stage of processing, the MEPs were larger when the whole handle was located ipsilaterally to the monitored hand than when the graspable part was broken or located on the contralateral side. In another TMS experiment, Cardellicchio et al. (2011) showed that MEPs can be modulated by the orientation of the action-related part of the object, as in the previous case, and by its spatial location inside or outside the observer’s peripersonal space.

To summarize, imaging and TMS experiments support the hypothesis that the perception of the potential for action results in a somatotopic involvement of the motor apparatus. This allows for a matching between optical patterns of stimulation and the perceiver’s vocabulary of motor acts (Rizzolatti et al. [1988]), a matching that takes place already at a very early stage of the perceptual process. This finding lends empirical support to the idea that perceiving bi-dimensional representations of objects may consist of picking up sensorimotor patterns of optical stimulation that elicit specific motor representations, even in cases in which no action-related dispositional property is present locally.

3.3 The Pragmatic Meaning of Visual Perception

Over the last few decades, great progress has been made toward a better understanding of the mechanism that underlies our ability to visually detect action possibilities in the environment. A major discovery was that a large proportion of the neurons in the premotor area transform the visual layout properties of objects into the appropriate information for motor execution (Rizzolatti et al. [1988], Rizzolatti & Luppino [2001]). These “visuomotor” neurons are characterized by a specific selectivity for certain solids as opposed to others; the difference between them depends on the type of grip that is afforded by those objects. Further evidence (Sakata et al. [1995], Murata et al. [1997], [2000]) demonstrates that the same sensorimotor selectivity is present in the anterior intraparietal area, which projects its connections to the premotor cortex (Borra et al. [2008]). A large proportion of the neurons in this area discharge during object fixation and are selective for action-related properties, such as the object’s shape, size, and orientation (Verhoef et al. [2010]).

Interestingly, the above-mentioned evidence has been commonly interpreted as supporting a dual stream model of visual processing (Milner & Goodale [1995], Jacob & Jeannerod [2003]). According to this now-classic model, the dorsal pathway and the ventral pathway perform two different functions along the course of visual information processing. The dorsal stream extracts sensorimotor information from the perceptual stimulus, which allows for the detection of a series of affordances distributed on a visual map (Goodale & Milner [1992], Rizzolatti & Luppino [2001]). Concurrently, the ventral pathway assigns an identity to visual patterns, which provides for effective detection and selection of action possibilities in the environment according to the agent’s motor intentions and goals. Accordingly, dorsal and ventral representations of action-oriented objects come apart in two different ways: the latter supports our semantic recognition abilities, whereas the former helps us to detect possibilities of action in the environment to guide our motor interactions with visual targets.

Remarkably, the dorsal stream performs transformations that convert information about the shape and the location of the stimulus’ source into parameters that are suitable for action planning and execution. Along this path, indeed, different areas may concurrently represent information that involves several possibilities of action, each of which has a different salience and probability (Cisek & Kalaska [2010]). In line with this finding, Baumann et al. (2009) provide compelling evidence that the multiple possibilities of action that are offered by a single object can evoke the activation of different grasp-related areas in the anterior intraparietal cortex. This finding suggests that this portion of the brain encodes information about the variety of visual features that are related to the possible actions that the agent may be able to direct toward the same target.

In light of the above-mentioned evidence, there are good reasons to suggest that perceptual access to the motor value of bi-dimensional representations of objects is obtained through the dorsal processing of visual information (Nanay [2015]). If we consider that the attribution of motor relevance to depicted surfaces relies on the same dorsal processing by means of which we attribute motor salience to three-dimensional objects, we can explain why the perception of images of action-related objects influences motor behavior. More precisely, because the network of sub-streams that constitute the dorsal pathway automatically maps the information that is contained in the perceptual stimulus onto a specific motor plan for action (Gallese & Sinigaglia [2011]), real and represented objects can evoke the same perceptual experience of possible interactions with the environment. Notably, the information processing that occurs within the dorsal pathway subserves the detection of patterns of optical stimulation that are associated with action opportunities without distinguishing between three-dimensional and bi-dimensional sources of stimulation.

We have seen that the dorsal stream does not process information concerning the identity of the visual target because this latter role is primarily allocated to the ventral stream. The function of the dorsal stream is not to provide visual targets with an objective identity but rather to attribute motor meanings to the patterns of optical stimuli that continuously reach our eyes, which allows pragmatic access to the visual perception of the environment. Accordingly, because the motor representations that are processed in the dorsal stream do not assign an objective identity to the visual target, it is incorrect to consider them in terms of true or false statements about something (Butterfill & Sinigaglia [2014]).

4. Concluding Remarks: Seeing Images and Understanding Language

Some realistic images of objects actually cause their observers to question reality. These images, which are exemplified by the implementation of trompe l’oeil techniques and hyperrealistic paintings, not only instill doubts about reality but also demonstrate how a work of art can be powerful and effective. Some paintings are able to mimic the real environment to such an extent that the spectator cannot avoid feeling completely absorbed by the subject, which induces in observers the experience of a reality that is not actually accessible. But what exactly are realistic paintings?

According to a dispositional approach to visual perception (Section 2), realistic paintings should be conceived as bi-dimensional representations of objects that are treated in such a way that the optic array that originates from them mimics the optic array that originates from the real objects that they represent (Gibson [1971], [1979]). Following this line of thought, the painting of a graspable object mimics the same optical patterns that are generated by a real graspable object despite the fact that the properties of the two things are very different. Whereas the optic array of a real graspable object specifies the object’s disposition to be grasped, the same optic array generated by the depicted representation of that object does not relate to a disposition to be grasped. Thus, on the basis of a dispositional view, because the patterns of optical stimulation induced by a bi-dimensional representation of an object contain only false information about the presence of possible motor interactions with the represented object, the perceptual experience of realistic paintings might be understood as a mere case of illusion.

However, recent findings in the cognitive science of vision suggest that the illusory character of our perceptual experience of bi-dimensional representations, such as trompe l’oeil and hyperrealistic paintings, is only part of the story. Our perceptual apparatus is organized such that there is a functional portion of this system that is specialized in detecting patterns of stimulation that are normally, but not necessarily, associated with the presence of action possibilities in the environment (Section 3). This functional specialization is made possible by the reuse of motor representations that are also involved in action execution, without a distinction between three-dimensional and bi-dimensional causes of the stimulation. Without this visuo-motor specialization of the perceptual apparatus, it would be virtually impossible to assign a motoric meaning to the objects in the environment so quickly and to pragmatically access the manifold systems of representation that constitute our cultural environment.

Among such systems, the use of words may be the most abstract way to represent concrete objects in a format that is not three-dimensional. As with pictorial representations, the use of words is semantically associated with the presence of possibilities of action in the environment despite the fact that no motor interaction is made available by linguistic items. Evidence derived from cognitive linguistics shows that our understanding of action-related words and verbs relies on motoric cognitive schemas that are encoded by the sensorimotor apparatus (Lakoff & Johnson [1999], Peruzzi [2000a], [2000b]). Interestingly, theories concerning the evolution of language suggest that pragmatic abilities are the cognitive source of semantic abilities, establishing a primacy of motoric understanding in the abstract categorization of the world (Corballis [2002], Arbib [2012]). According to this view, our ability to give significance to linguistic gestures has its origin in our ability to understand others’ actions by means of a sensorimotor matching system which codifies the gestures of others in terms of one’s own vocabulary of motor acts (Rizzolatti & Arbib [1998]).

It is nowadays well known that pre-motor regions involved in action planning are activated both when the agent is involved in the execution of an action and when she is involved in understanding motor-related concepts. Several studies devoted to investigating the correspondences between language comprehension and cerebral cortical activations support the idea that part of the process underlying semantic understanding is grounded in the sensorimotor system (see Borghi & Binkofski [2014] for a review). A pivotal piece of evidence supporting this view has been provided by Buccino et al. (2005), who have shown a decrease of motor-evoked potentials (MEPs) recorded from hand muscles and leg muscles when agents were required to process, respectively, hand-related verbs and foot-related verbs. Moreover, using brain-imaging techniques, it has been established that when agents process words and sentences with contents related to effectors, such as hands and feet, somatotopic activations of pre-motor and motor cortical areas occur (Hauk et al. [2004]; Tettamanti et al. [2005]). In the same vein, Aziz-Zadeh et al. (2006) have shown that reading sentences describing hand actions activates the same premotor areas as watching videos that show an actress performing a manual action, revealing the involvement of the mirror system in language processing (Gallese [2008]). Importantly, it has been shown that the activation of effector-related areas of the motor system occurs in the early phase of processing (Glenberg & Gallese [2012]), supporting the hypothesis that motor effects during language processing do not merely reflect the agent’s imagery after language understanding is completed (Gallese & Lakoff [2005]).

More recently, de Vega et al. (2014) have shown that processing action-related sentences elicits stronger activations than visual sentences in the primary motor area, as well as in regions generally associated with the planning and understanding of actions. Interestingly, these motor activations occur not only with affirmative cases (e.g., I unwrapped the gifts), but also with negations (e.g., I did not unwrap the gifts) and with counterfactual cases (e.g., I would have unwrapped the gifts), suggesting that the motor system is involved in language understanding by default, even when the actions described by the sentences are not happening or are merely hypothetical. Following this result, Marino et al. (2014) have demonstrated that processing images and processing nouns of natural graspable objects share the same neural substrate, eliciting the same motor responses during a semantic recognition task. This evidence supports the hypothesis that the recruitment of the motor system during the presentation of images and nouns is crucial to performing semantic tasks, allowing pragmatic access to both the figurative and the linguistic representations of action-related objects.

To conclude, it should be noted that, although hearing action-related words and verbs induces a somatotopic activation of the motor system that is congruent with the action possibility to which they refer, the perception of such linguistic items is not treated as a mere case of illusion. Instead, according to an embodied approach to language processing, the activation of the motor system is viewed as a constitutive part of the cognitive mechanism underlying the understanding of action-related words and sentences. This consideration should be extended to the perceptual experience of bi-dimensional representations of action-related objects, in particular to those works of art that aim at inducing a sense of reality in the spectator.


Arbib, M., 2012: How the Brain Got Language. The Mirror System Hypothesis, New York and Oxford: Oxford University Press.

Austin, J.L., 1962: Sense and Sensibilia, Oxford: Oxford University Press.

Aziz-Zadeh, L., Wilson, S. M., Rizzolatti, G., Iacoboni, M., 2006: Congruent embodied representations for visually presented actions and linguistic phrases describing actions, “Current Biology”, 16, pp. 1818–1823.

Baumann, M., Fluet, M., & Scherberger, H., 2009: Context-specific grasp movement representation in the macaque anterior intraparietal area, “Journal of Neuroscience”, 29, pp. 6436-6448.

Borghi, A., & Binkofski, F., 2014: Words as Social Tools: An Embodied View on Abstract Concepts, New York: Springer.

Borra, E., Belmalih, A., Calzavara, R., Gerbella, M., Murata, A., Rozzi, S., Luppino, G., 2008: Cortical Connections of the Macaque Anterior Intraparietal (AIP) Area, “Cerebral Cortex”, 18, pp. 1094–1111.

Brewer, B., 2011: Perception and its Objects, Oxford: Oxford University Press.

Buccino, G. S., Cattaneo, L., Rodà, F., & Riggio, L., 2009: Broken affordances, broken objects: a TMS study, “Neuropsychologia”, 47, pp. 3074–3078.

Buccino, G., Riggio, L., Melli, G., Binkofski, F., Gallese, V., Rizzolatti, G., 2005: Listening to action-related sentences modulates the activity of the motor system: a combined TMS and behavioral study, “Cognitive Brain Research”, 24, pp. 355–363.

Butterfill, S., Sinigaglia, C., 2014: Intention and motor representation in purposive action, “Philosophy and Phenomenological Research”, 88(1), pp. 119–145.

Buxbaum, L., & Kalenine, S., 2010: Action knowledge, visuomotor activation, and embodiment in the two action systems, “Annals of the New York Academy of Sciences”, 1191, pp. 201–218.

Cardellicchio, P., Sinigaglia, C., Costantini, M., 2011: The space of affordances: a TMS study, “Neuropsychologia”, 49(5), pp. 1369–1372.

Chao, L. L., Martin, A., 2000: Representation of manipulable man-made objects in the dorsal stream, “NeuroImage”, 12, pp. 478–484.

Chemero, A., 2009: Radical Embodied Cognitive Science, Cambridge (MA): MIT Press.

Cisek, P., Kalaska, J., 2005: Neural Correlates of Reaching Decisions in Dorsal Premotor Cortex: Specification of Multiple Direction Choices and Final Selection of Action, “Neuron”, 45, pp. 801–814.

Cisek, P., Kalaska, J. F., 2010: Neural mechanisms for interacting with a world full of action choices, “Annual Review of Neuroscience”, 33, pp. 269–298.

Corballis, M. C., 2002: From hand to mouth. The origins of language, Princeton: Princeton University Press.

Costantini, M., Sinigaglia, C., 2011: Grasping affordance: a window onto social cognition, in Joint attention: new developments, ed. A. Seeman, pp. 431–470, Cambridge, MA: MIT Press.

Costantini, M., Ambrosini, E., Tieri, G., Sinigaglia, C., Committeri, G., 2010: Where does an object trigger an action? An investigation about affordance in space, “Experimental Brain Research”, 207, pp. 95–103.

de Vega, M., Leon, I., Hernàndez, J. A., Valdés, M., Padron, I., Ferstl, E. C., 2014: Action Sentences Activate Sensory Motor Regions in the Brain Independently of Their Status of Reality, “Journal of Cognitive Neuroscience”, 26, 7, pp. 1363-1376.

Derbyshire, N., Ellis, R., & Tucker, M., 2006: The potentiation of two components of the reach-to-grasp action during object categorisation in visual memory, “Acta Psychologica”, 122, pp. 78–94.

Di Pellegrino, G., Rafal, R., Tipper, S., 2005: Implicitly evoked actions modulate visual selection: evidence from parietal extinction, “Current Biology”, 15, pp. 1469–1472.

Ellis, R., & Tucker, M., 2000: Micro-affordance: the potentiation of components of action by seen objects, “British Journal of Psychology”, 91, pp. 451–471.

Gallese, V., 2008: Mirror neurons and the social nature of language: The neural exploitation hypothesis, “Social Neuroscience”, 3, pp. 317–333.

Gallese, V., Lakoff, G., 2005: The Brain’s concepts: the role of the Sensory-motor system in conceptual knowledge, “Cognitive Neuropsychology”, 22, pp. 455–479.

Gallese, V., Sinigaglia, C., 2011: What is so special about embodied simulation?, “Trends in Cognitive Sciences”, 15(11), pp. 512–519.

Gibson, J. J., 1966: The senses considered as perceptual systems, Boston: Houghton Mifflin.

Gibson, J. J., 1971: The information available in pictures, “Leonardo”, 4, pp. 27–35.

Gibson, J. J., 1979: The ecological approach to visual perception, Boston: Houghton Mifflin.

Glenberg, A. M., Gallese, V., 2012: Action-based language: a theory of language acquisition, comprehension, and production, “Cortex”, 48, 7, pp. 905-922.

Grafton, S., Fadiga, L., Arbib, M., Rizzolatti, G., 1997: Premotor cortex activation during observation and naming of familiar tools, “NeuroImage”, 6, pp. 231–236.

Haddock, A., Macpherson, F., 2008: Disjunctivism: Perception, Action, Knowledge, Oxford: Oxford University Press.

Handy, T. C., Grafton, S. T., Shroff, N. M., Ketay, S., Gazzaniga, M. S., 2003: Graspable objects grab attention when the potential for action is recognized, “Nature Neuroscience”, 6, pp. 421–427.

Hauk, O., Johnsrude, I., Pulvermüller, F., 2004: Somatotopic representation of action words in human motor and premotor cortex, “Neuron”, 41, pp. 301–307.

Hoshi, E., Tanji, J., 2007: Distinctions between dorsal and ventral premotor areas: anatomical connectivity and functional properties, “Current Opinion in Neurobiology”, 17(2), pp. 234–242.

Hurley, S., 2001: Perception and Action: Alternative Views, “Synthese”, 129, pp. 3-40.

Jacob, P., Jeannerod, M., 2003: Ways of Seeing: The Scope and Limits of Visual Cognition, Oxford: Oxford University Press.

Lakoff, G., Johnson, M., 1999: Philosophy In The Flesh: the Embodied Mind and its Challenge to Western Thought, New York: Basic Books.

Marino, B., Sirianni, M., Dalla Volta, R., Magliocco, F., Silipo, F., Quattrone, A., Buccino, G., 2014: Viewing photos and reading nouns of natural graspable objects similarly modulate motor responses, “Frontiers in Human Neuroscience”, 8, pp. 1-10.

Milner, A. D., Goodale, M. A., 1995: The visual brain in action, Oxford: Oxford University Press.

Murata, A., Fadiga, L., Fogassi, L., Gallese, V., Raos, V., Rizzolatti, G., 1997: Object representation in the ventral premotor cortex (area F5) of the monkey, “Journal of Neurophysiology”, 78, pp. 2226–2230.

Murata, A., Gallese, V., Luppino, G., Kaseda, M., Sakata, H., 2000: Selectivity for the shape, size and orientation of objects in the hand-manipulation-related neurons in the anterior intraparietal (AIP) area of the macaque, “Journal of Neurophysiology”, 83, pp. 2580–2601.

Nanay, B., 2015: Trompe l’oeil and the Dorsal/Ventral Account of Picture Perception, “Review of Philosophy and Psychology”, 6, pp. 181-197.

Noë, A., 2004: Action in Perception, Cambridge (MA): MIT Press.

Peruzzi, A., 2000a: An Essay on the Notion of Schema, in Shapes of Form, ed. L. Albertazzi, Amsterdam: Kluwer.

Peruzzi, A., 2000b: The Geometric Roots of Semantics, in Meaning and Cognition: A Multidisciplinary Approach, ed. L. Albertazzi, pp. 169–201, Amsterdam: Benjamins Publishing Company.

Riddoch, M. J., Humphreys, G., Edwards, S., Baker, T., Willson, K., 2003: Seeing the action: neuropsychological evidence for action-based effects on object selection, “Nature Neuroscience”, 6(1), pp. 82–90.

Rizzolatti, G., & Arbib, M., 1998: Language within our grasp, “Trends in Neurosciences”, 21, pp. 188–194.

Rizzolatti, G., Luppino, G., 2001: The cortical motor system, “Neuron”, 31, pp. 889–901.

Rizzolatti, G., & Matelli, M., 2003: Two different streams form the dorsal visual system: anatomy and functions, “Experimental Brain Research”, 153, pp. 146–157.

Rizzolatti, G., Camarda, R., Fogassi, L., Gentilucci, M., Luppino, G., Matelli, M., 1988: Functional organization of inferior area 6 in the macaque monkey. II. Area F5 and the control of distal movements, “Experimental Brain Research”, 71, pp. 491–507.

Scarantino, A., 2003: Affordance explained, “Philosophy of Science”, 70, pp. 949–961.

Shapiro, L., 2011: Embodied Cognition, New York: Routledge.

Shaw, R. E., Turvey, M., Mace, W. M., 1982: Ecological Psychology. The consequence of a commitment to realism, in Cognition and the Symbolic Processes, eds. W. Weimer & D. Palermo, pp. 159–226, Hillsdale, N. J.: Lawrence Erlbaum Associates.

Shikata, E., Hamzei, F., Glauche, V., Koch, M., Weiller, C., Binkofski, F., Büchel, C., 2003: Functional properties and interaction of the anterior and posterior intraparietal areas in humans, “European Journal of Neuroscience”, 17, pp. 1105–1110.

Smith, A. D., 2002: The Problem of Perception, Cambridge (MA): Harvard University Press.

Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., et al., 2005: Listening to action-related sentences activates fronto-parietal motor circuits, “Journal of Cognitive Neuroscience”, 17, pp. 273–281.

Tucker, M., Ellis, R., 1998: On the relations between seen objects and components of potential actions, “Journal of Experimental Psychology: Human Perception and Performance”, 24, pp. 830–846.

Turvey, M., Shaw, R., Reed, E., Mace, W., 1981: Ecological Laws of Perceiving and Acting: In Reply to Fodor and Pylyshyn, “Cognition”, 9, pp. 237–304.

Vainio, L., Hammarén, L., Hausen, M., Rekolainen, E., Riskilä, S., 2011: Motor inhibition associated with the affordance of briefly displayed objects, “Quarterly Journal of Experimental Psychology”, 64(6), pp. 1094–1110.

Verhoef, B. E., Vogels, R., Janssen, P., 2010: Contribution of inferior temporal and posterior parietal activity to three-dimensional shape perception, “Current Biology”, 20, pp. 909–913.


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License (CC-BY-4.0)

Firenze University Press