SY_17. Interactions between vision and language: The state of the art
Saturday, October 1st, 2011 [14:20 - 16:00]
University of Wisconsin-Madison
Historically, researchers have paid little notice to the ways in which language may not only change how visual experiences are reported, but actually change ongoing visual processing. In this symposium we present a broad range of empirical and computational evidence arguing for deep and transformative effects of language on visual processing. In learning a language we learn associations between arbitrary cues (words) and objects, actions, and relations. For example, we learn to associate the word “up” with a direction of motion, “lemon” with oblong yellow objects of a certain size, and “red” with a color category. With this word-to-world association in place, language influences visual processing in ways that are surprisingly robust and pervasive. Gerry Altmann will show that language affects oculomotor control as fast as neuroanatomy allows: within about 100 ms. Not only saccades but also pursuit eye movements are subject to rapid influences of (task-irrelevant) language. Eiling Yee will demonstrate how language can be used to probe the unfolding in time of object representations, showing how different object attributes exert influences at different times during visual search. Michael Spivey will provide complementary evidence showing that the incremental delivery of information using language augments the process of visual search. Emre Ozgen will provide novel sources of evidence that color categorical perception is a product of perceptual reorganization. Gary Lupyan will show that hearing object words can enable more accurate detection of stimuli that are otherwise invisible. He will also demonstrate that linguistic cues activate visual representations more effectively than nonlinguistic cues. Together, the research presented in this symposium calls for a renewed effort to understand the extremely rapid, pervasive, and deep interactions between language and vision.
The interdisciplinary scope of this symposium, a combination of psychophysics and psycholinguistics, will present the audience with fresh perspectives on familiar phenomena.
SY_17.1 - The Evocative Power of Words: Language modulates (even low-level) visual processing
University of Wisconsin-Madison
Beyond making linguistic communication possible, words (verbal labels) affect nonverbal cognitive processes such as categorization, memory, and cognitive control. Can simply hearing a word also affect visual processing, and if so, how deep do such effects go? I will present findings from a variety of paradigms showing that verbal labels modulate ongoing perceptual processing even in low-level visual tasks such as simple object detection. Hearing a label can un-suppress an object made invisible through continuous flash suppression or backward masking (increasing visual sensitivity). Hearing an entirely redundant verbal label also facilitates the deployment of attention, in parallel, to all objects on a screen that match the label. The long-term experience of using words in a referential manner appears to make them particularly effective in activating visual representations of the denoted object category. I will present a series of results comparing verbal and nonverbal cues in activating visual information, showing that, controlling for familiarity, verbal cues activate visual information more effectively than nonverbal cues. This verbal advantage appears to arise because representations activated by verbal means are more categorical and more similar from subject to subject than representations activated without the overt use of language. In sum, performance on a wide range of visual tasks, tasks that have been presumed to be immune from linguistic influence, is in fact deeply affected by language. I will argue that these effects are best explained in terms of language as a form of top-down modulation (the Label Feedback Hypothesis).
SY_17.2 - Roses are red. Jeans are blue. Frisbees are round, and triangles can be too.
Yee, E. 1 , Huffstetler, S. 2 & Thompson-Schill, S. 2
1 Basque Center on Cognition, Brain and Language
2 University of Pennsylvania
When looking at an object (e.g., pizza), we become aware not only of what it looks like in its current instantiation (a reddish triangular slice, let’s say), but also of what other forms that object can take (e.g., round), and what it is used for (food). We describe several eye tracking studies demonstrating that when searching for a named object, non-visible shape and function properties can guide visual attention, with different kinds of knowledge influencing visual attention at different times. Participants viewed multi-object displays and clicked on the picture corresponding to a heard word. In critical trials, the conceptual representation of one of the objects in the display was similar in shape, color, or function to the heard word. Importantly, this similarity was not apparent in the visual depictions (e.g., for the target “frisbee”, the shape-related object was a triangular slice of pizza, a shape that a frisbee cannot take); preferential fixations on the related object were therefore attributable to activation of the relevant features of the conceptual representations. Shape-, color-, and function-related objects were preferentially fixated, but function effects occurred later than shape and color effects. These findings show that visual object recognition is a dynamically unfolding process in which function follows form, and that when searching for a named object, visual attention is influenced by top-down conceptual knowledge about the properties of other objects in the scene.
SY_17.3 - Language-mediated eye movements: Interactions between language, attention, and oculomotor control
Altmann, G. T.
Department of Psychology, University of York, UK
The influence of language on eye movements is, to all intents and purposes, as fast as it could possibly be. Here, we consider exactly what this must mean in terms of the processes that support eye movement control and the processes and representations that are engaged by the language comprehension system. The first set of studies explores when the earliest influences of language can be observed on the oculomotor system. We find influences as early as 100 ms, at which point they most likely reflect the cancellation of already-planned saccades, due to competition between covert attention towards an endogenously cued target (i.e., to where we were going to move our eyes) and covert attention towards an exogenously cued location (i.e., to where the language then told us to instead move our eyes); cf. the double-step paradigm. The second set of studies explored such competitive influences in the context of pursuit eye movements: verbs denoting upward or downward motion (e.g., ‘climb’/‘dive’) were presented auditorily as participants pursued a dot moving vertically or horizontally. When the directionality implied by the verb was congruent with the direction in which attention had to be deployed to track the target, eye velocity increased. When incongruent, it decreased. This interaction, between attention during pursuit and the task-irrelevant (but attentionally modulating) language, suggests a process in which language can activate representations that compete with those that regulate oculomotor control. Taken together, the data argue for a tight theoretical linkage between language comprehension, attention, and oculomotor control.
SY_17.4 - Support for the perceptual re-organisation account of categorical perception effects
Recent evidence suggests that categorical perception (CP) effects in colour result from the activity of a linguistic code, rather than from perceptual re-organisation (warping) as previous research suggested. I will review recent evidence from our labs that suggests otherwise. The first point of consideration is the erroneous use of colour metrics to test hypotheses. In one study, performance on a low-level discrimination task and a “high-level” task was compared, leading to the conclusion that low-level discrimination does not show categorical effects, while high-level tasks, where linguistic codes can and must be used, do. But this study used two very different colour metrics to equate stimuli on the two tasks. We show that these results can be entirely attributed to the confound of colour space, and that depending on the space used, “CP effects” can also be observed in low-level discrimination. We also suggest a less commonly used metric that gives good results. Another line of support for the verbal coding hypothesis comes from the study of hemispheric asymmetries, which suggests that CP effects are lateralised to the left hemisphere, providing a direct link with language processing. However, evidence from our labs suggests that, at least in low-level discrimination, there are no hemispheric asymmetries in categorical effects. If anything, any hemispheric asymmetry seems to have more to do with cone sensitivity: there is a right hemisphere advantage for blue region discrimination, while no asymmetry is observed in the green region or on the blue-green boundary. It is possible that this “blue” effect is falsely interpreted as evidence for hemispheric asymmetry in CP. Finally, I will review evidence from our labs on the effects of category learning on colour discrimination thresholds. Category learning seems to selectively improve discrimination at the category boundary. This is consistent with a perceptual change account of CP.
SY_17.5 - Preferential inspection of recent real-world events
Cognitive Interaction Technology (CITEC), Bielefeld, Germany.
When people listened to a sentence that could refer either to a clipart event they had recently seen, or to another, future, event, they preferred to look more at the recent event target. We examined whether this inspection preference also holds for real-world events, and whether it is sensitive to how often people see recent (vs. future) events. In a first study, we observed the same inspection preference with real-world events. When people saw a real-world action and heard a sentence (verb) that was temporarily ambiguous between that recent action and an equally plausible future action, they immediately looked more often at the target of the recent (vs. the other, future) action event. However, since none of the future events were acted out, this inspection preference could reflect a frequency bias. Alternatively, it could index a preferred grounding of verbs in performed actions. In a subsequent study, the experimenter performed equally many future and recent actions. A corpus study ensured that the verbs and adverbs in our sentences (indicating past versus future actions) were equally frequent. Nevertheless, when recent and future actions (and their corresponding verbs) were equally frequent and predictive, listeners still initially preferred to look more at the recent than the future event target during sentence comprehension. Thus, recent real-world actions can rapidly influence comprehension, and even when recent and future actions and their corresponding verbs/adverbs do not bias towards the recent past, people prefer to inspect recent (vs. future) events. A simple frequency-of-experience account cannot accommodate these findings.