
Table of Contents

Keynote lectures ............................................................................................................................................................................................. S6


The perception of others' goal-directed actions ............................................................................................................................. S6
Body ownership, self-location, and embodied cognition ............................................................................................................ S6
Life as we know it ................................................................................................................................................................................... S6
Elements of extreme expertise ............................................................................................................................................................. S7
Dynamic Field Theory: from the sensory-motor domain to embodied higher cognition ................................................. S7
How t(w)o perform actions together .................................................................................................................................................. S7
Symposia ........................................................................................................................................................................................................... S8
DRIVER COGNITION ............................................................................................................................................................................. S8
The CSB model: A cognitive approach for explaining speed behavior ................................................................................. S8
Validation of the Driving by Visual Angle car following model ............................................................................................ S8
The effects of event frequency and event predictability on drivers' attention allocation ................................................ S9
Integrated modeling for safe transportation (IMoST 2): driver modeling & simulation .................................................. S9
Simulating the influence of event expectancy on drivers' attention distribution ................................................................ S9
PROCESSING LANGUAGE IN CONTEXT: INSIGHTS FROM EMPIRICAL APPROACHES ............................... S10
Investigations into the incrementality of semantic interpretation: the processing of quantificational restriction ... S10
When the polar bear fails to find a referent: how are unmet presuppositions processed? .............................................. S10
Deep or surface anaphoric pronouns?: Empirical approaches ................................................................................................. S10
Comparing presuppositions and scalar implicatures ................................................................................................................... S11
The time course of referential resolution ....................................................................................................................................... S11
COGNITION OF HUMAN ACTIONS: FROM INDIVIDUAL ACTIONS TO INTERACTIONS .............................. S11
Signaling games in sensorimotor interactions............................................................................................................................... S11
Perceptual cognitive processes underlying the recognition of individual and interactive actions ............................... S11
Neural theory for the visual processing of goal-directed actions ........................................................................................... S11
From individual to joint action: representational commonalities and differences ............................................................ S12
Neural mechanisms of observing and interacting with others ................................................................................................. S12
CORTICAL SYSTEMS OF OBJECT GRASPING AND MANIPULATION ..................................................................... S12
Influences of action characteristics and hand used on the neural correlates of planning and executing object
manipulations ........................................................................................................................................................................................... S13
Attention is needed for action control: evidence from grasping studies .............................................................................. S13
Effects of object recognition on grasping ...................................................................................................................................... S13
The representation of grasping movements in the human brain ............................................................................................. S13
Avoiding obstacles without a ventral visual stream ................................................................................................................... S14
Action and semantic object knowledge are processed in separate but interacting streams: evidence from fMRI
and dynamic causal modelling........................................................................................................................................................... S14
EYE TRACKING, LINKING HYPOTHESES AND MEASURES IN LANGUAGE PROCESSING ........................ S14
Conditional analyses of eye movements......................................................................................................................................... S14
Rapid small changes in pupil size index processing difficulty: the index of cognitive activity in reading, visual world,
and dual task paradigms ...................................................................................................................................................................... S15
Measures in sentence processing: eye tracking and pupillometry.......................................................................................... S15
Improving linking hypotheses in visually situated language processing: combining eye movements and event-related
brain potentials ........................................................................................................................................................................................ S16
Oculomotor measurements of abstract and concrete cognitive processes ........................................................................... S16
MANUAL ACTION ................................................................................................................................................................................ S16
The Bremen-Hand-Study@Jacobs: effects of age and expertise on manual dexterity .................................................... S17

Planning anticipatory actions: on the interplay between normative and mechanistic models ...................................... S17
Identifying linguistic and neural levels of interaction between gesture and speech during comprehension using EEG
and fMRI ................................................................................................................................................................................................... S17
Neural correlates of gesture-syntax interaction ............................................................................................................................ S18
Interregional connectivity minimizes surprise responses during action perception .......................................................... S18
The development of cognitive and motor planning skills in young children ..................................................................... S18
PREDICTIVE PROCESSING: PHILOSOPHICAL AND NEUROSCIENTIFIC PERSPECTIVES ............................. S18
Bayesian cognitive science, unification, and explanation ......................................................................................................... S19
The explanatory heft of Bayesian models of cognition ............................................................................................................. S19
Predictive processing and active inference .................................................................................................................................... S19
Learning sensory predictions for perception and action ............................................................................................................ S19
Layer resolution fMRI to investigate cortical feedback and predictive coding in the visual cortex .......................... S19
HOW LANGUAGE AND NUMERICAL REPRESENTATIONS CONSTITUTE MATHEMATICAL
COGNITION ........................................................................................................................................................................................... S20
Influences of number word inversion on multi-digit number processing: a translingual eye-tracking study .......... S20
On the influence of linguistic and numerical complexity in word problems ..................................................................... S21
Linguistic influences on numerical understanding: the case of Welsh................................................................................. S21
Reading space into numbers: an update ......................................................................................................................................... S21
How language and numerical representations constitute mathematical cognition: an introductory review ............. S21
Language influences number processing: the case of bilingual Luxembourg .................................................................... S21
Language differences in basic numerical tasks ............................................................................................................................ S22
Cognitive components of the mathematical processing network in primary school children: linguistic and language
independent contributions .................................................................................................................................................................... S22
It does exist! A SNARC effect amongst native Hebrew speakers is masked by the MARC effect........................... S22
MODELING OF COGNITIVE ASPECTS OF MOBILE INTERACTION .......................................................................... S22
Creating cognitive user models on the basis of abstract user interface models ................................................................ S22
Expectations during smartphone application use ......................................................................................................................... S22
Evaluating the usability of a smartphone application with ACT-R ....................................................................................... S23
Simulating interaction effects of incongruous mental models................................................................................................. S24
"Special offer! Wanna buy a trout?" Modeling user interruption and resumption strategies with ACT-R ......... S24
Tutorials .......................................................................................................................................................................................................... S25
Introduction to probabilistic modeling and rational analysis ................................................................................................... S25
Modeling vision ...................................................................................................................................................................................... S25
Visualization of eye tracking data .................................................................................................................................................... S25
Introduction to cognitive modelling with ACT-R ....................................................................................................................... S25
Dynamic Field Theory: from sensorimotor behaviors to grounded spatial language ...................................................... S25
Poster presentations ..................................................................................................................................................................................... S27
The effect of language on spatial asymmetry in image perception ....................................................................................... S27
Towards formally founded ACT-R simulation and analysis.................................................................................................... S27
Identifying inter-individual planning strategies ............................................................................................................................ S28
Simulating events. The empirical side of the event-state distinction .................................................................................... S29
On the use of computational analogy-engines in modeling examples from teaching and education ......................... S30
Brain network states affect the processing and perception of tactile near-threshold stimuli ........................................ S31
A model for dynamic minimal mentalizing in dialogue ........................................................................................................... S32
Actions revealing cooperation: predicting cooperativeness in social dilemmas from the observation of everyday
actions ........................................................................................................................................................................................................ S33
The use of creative analogies in a complex problem situation ............................................................................................... S34
Yes, that's right? Processing "yes" and "no" and attention to the right vs. left........................................................ S35
Perception of background color in head mounted displays: applying the source monitoring paradigm ................... S36
Continuous goal dynamics: insights from mouse-tracking and computational modeling .............................................. S37
Looming auditory warnings initiate earlier event-related potentials in a manual steering task ................................... S38
The creative process across cultures ................................................................................................................................................ S38

How do human interlocutors talk to virtual assistants? A speech act analysis of dialogues of cognitively impaired people
and elderly people with a virtual assistant..................................................................................................................................... S40
Effects of aging on shifts of attention in perihand space ......................................................................................................... S41
The fate of previously focused working memory content: decay or/and inhibition? ...................................................... S41
How global visual landmarks influence the recognition of a city ......................................................................................... S42
Explicit place-labeling supports spatial knowledge in survey, but not in route navigation .......................................... S44
How important is having emotions for understanding others' emotions accurately? ...................................................... S45
Prosody conveys speakers' intentions: acoustic cues for speech act perception ............................................................... S46
On the perception and processing of social actions.................................................................................................................... S46
Stage-level and individual-level interpretation of multiple adnominal adjectives as an epiphenomenon: theoretical
and empirical evidence ......................................................................................................................................................................... S47
What happened to the crying bird? Differential roles of embedding depth and topicalization modulating syntactic
complexity in sentence processing ................................................................................................................................................... S48
fMRI-evidence for a top-down grouping mechanism establishing object correspondence in the Ternus display . S48
Event-related potentials in the recognition of scene sequences .............................................................................................. S49
Sensorimotor interactions as signaling games .............................................................................................................................. S50
Subjective time perception of verbal action and the sense of agency .................................................................................. S51
Memory disclosed by motion: predicting visual working memory performance from movement patterns ............. S52
Role and processing of translation in biological motion perception ..................................................................................... S53
How to remember Tübingen? Reference frames in route and survey knowledge of one's city of residency ......... S53
The effects of observing other people's gaze: faster intuitive judgments of semantic coherence .............................. S54
Towards a predictive processing account of mental agency .................................................................................................... S55
The N400 ERP component reflects implicit prediction error in the semantic system: further support from a connectionist
model of word meaning ....................................................................................................................................................................... S56
Similar and differing processes underlying carry and borrowing effects in addition and subtraction: evidence from eye-
tracking ...................................................................................................................................................................................................... S57
Simultaneous acquisition of words and syntax: contrasting implicit and explicit learning ........................................... S58
Towards a model for anticipating human gestures in human-robot interactions in shared space ............................... S59
Preserved expert object recognition in a case of unilateral visual agnosia ......................................................................... S60
Visual salience in human landmark selection ............................................................................................................................... S60
Left to right or back to front? The spatial flexibility of time ................................................................................................. S61
Smart goals, slow habits? Individual differences in processing speed and working memory capacity moderate
the balance between habitual and goal-directed choice behavior .......................................................................................... S62
Tracing the time course of n - 2 repetition costs ...................................................................................................................... S62
Language cues in the formation of hierarchical representation of space............................................................................. S63
Processing of co-articulated place information in lexical access ........................................................................................... S64
Disentangling the role of inhibition and emotional coding on spatial stimulus devaluation ........................................ S65
The role of working memory in prospective and retrospective motor planning ............................................................... S66
Temporal preparation increases response conflict by advancing direct response activation ......................................... S67
The flexibility of finger-based magnitude representations ....................................................................................................... S68
Object names correspond to convex entities ................................................................................................................................. S69
The role of direct haptic feedback in a compensatory tracking task .................................................................................... S71
Comprehending negated action(s): embodiment perspective ................................................................................................... S71
Effects of action signaling on interpersonal coordination ........................................................................................................ S72
Physiological changes through sensory augmentation in path integration: an fMRI study ........................................... S73
Do you believe in Mozart? The influence of beliefs about composition on representing joint action outcomes
in music ..................................................................................................................................................................................................... S73
Processing sentences describing auditory events: only pianists show evidence for an automatic space-pitch
association ................................................................................................................................................................................................ S74
A free energy approach to template matching in visual attention: a connectionist model ............................................ S75
ORAL PRESENTATIONS ....................................................................................................................................................................... S77
Analyzing psychological theories with F-ACT-R: an example F-ACT-R application .................................................... S79

F-ACT-R: defining the ACT-R architectural space .................................................................................................................... S81
Defining distance in language production: extraposition of relative clauses in German ............................................... S81
How is information distributed across speech and gesture? A cognitive modeling approach ...................................... S84
Towards formally well-founded heuristics in cognitive AI systems ..................................................................................... S87
Action planning is based on musical syntax in expert pianists. ERP evidence................................................................. S89
Motor learning in dance using different modalities: visual vs. verbal models .................................................................. S90
A frontotemporoparietal network common to initiating and responding to joint attention bids .................................. S93
Action recognition and the semantic meaning of actions: how does the brain categorize different social actions? S95
Understanding before language ......................................................................................................................................................... S95
An embodied kinematic model for perspective taking .............................................................................................................. S97
The under-additive effect of multiple constraint violations ................................................................................................... S100
Strong spatial cognition...................................................................................................................................................................... S103
Inferring 3D shape from texture: a biologically inspired model architecture .................................................................. S105
An activation-based model of execution delays of specific task steps ............................................................................... S107
How action effects influence dual-task performance ............................................................................................................... S110
Introduction of an ACT-R based modeling approach to mental rotation .......................................................................... S112
Processing linguistic rhythm in natural stories: an fMRI study............................................................................................ S114
Numbers affect the processing of verbs denoting movements in vertical space ............................................................. S115
Is joint action necessarily based on shared intentions? ........................................................................................................... S117
A general model of the multi-level architecture of mental phenomena. Integrating the functional paradigm
and the mechanistic model of explanation................................................................................................................................... S119
A view-based account of spatial working and long-term memories: Model and predictions ..................................... S120
Systematicity and Compositionality in Computer Vision ....................................................................................................... S123
Control and flexibility of interactive alignment: Möbius syndrome as a case study..................................................... S125
Efficient analysis of gaze-behavior in 3D environments ........................................................................................................ S127
The role of the posterior parietal cortex in relational reasoning .......................................................................................... S129
How to build an inexpensive cognitive robot: Mind-R ........................................................................................................... S131
Crossed hands stay on the time-line .............................................................................................................................................. S134
Is the novelty-P3 suitable for indexing mental workload in steering tasks? .................................................................... S135
Modeling perspective-taking by forecasting 3D biological motion sequences ................................................................ S137
Matching quantifiers or building models? Syllogistic reasoning with generalized quantifiers .................................. S139
What if you could build your own landmark? The influence of color, shape, and position on landmark salience S142
Does language shape cognition? ..................................................................................................................................................... S144
Ten years of adaptive rewiring networks in cortical connectivity modeling. Progress and perspectives ............... S146
Bayesian mental models of conditionals ...................................................................................................................................... S148
Visualizer verbalizer questionnaire: evaluation and revision of the German translation ............................................. S151
AUTHOR INDEX ..................................................................................................................................................................................... S155

Disclosure: This issue was not sponsored by external commercial interests.

Cogn Process (2014) 15 (Suppl 1):S1–S158
DOI 10.1007/s10339-014-0632-2

ABSTRACTS

Special Issue: Proceedings of KogWis 2014

12th Biannual conference of the German cognitive science society


(Gesellschaft für Kognitionswissenschaft)
Edited by Anna Belardinelli and Martin V. Butz

Keynote lectures

The perception of others' goal-directed actions

Harold Bekkering
Donders Institute for Brain, Cognition and Behavior, Radboud University Nijmegen, The Netherlands

It is widely assumed that perception of the world is based on internal models of that world and that models are shaped via prior experiences that modulate the likelihood of a certain action given a certain context. In this talk, I will outline some experimental and theoretical ideas how humans perceive goal-directed actions of others on the basis of object and movement knowledge. I will also discuss a potential role for language in improving our world model, including a better perception of other agents' goal-directed actions.

Body ownership, self-location, and embodied cognition

H. Henrik Ehrsson
Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden

Ask any child if his hands belong to him and the answer will be "Of course!" However, how does the brain actually identify its own body? In this talk, Dr. Ehrsson will describe how cognitive neuroscientists have begun to address this fundamental question. One key idea is that parts of the body are distinguished from the external world by the patterns of the correlated information they produce from different sensory modalities (vision, touch and muscle sense). It is hypothesized that these correlations are detected by neuronal populations in premotor and posterior parietal areas that integrate multisensory information from the space near the body. Dr. Ehrsson and his team have recently used a combination of functional magnetic resonance imaging (fMRI) and human behavioral experiments to present experimental results that support these predictions. To change the feeling of body ownership, perceptual illusions were used so that healthy individuals experienced a rubber hand as their own, their real hand being disowned, or that a mannequin was their body.

Dr. Ehrsson will also describe recent experiments that investigate how we come to experience our body as being located at a specific place in the world, and how this sense of self-location depends on body ownership. To this end an out-of-body illusion was used to perceptually teleport participants' bodily self to different locations during high-resolution fMRI acquisition. It was found that activity patterns in the hippocampus, retrosplenial, posterior cingulate, and posterior parietal cortices reflected the sense of self-location, and that the functional interplay between self-location and body ownership was mediated via the posterior cingulate cortex, suggesting a key role of this structure in generating the coherent experience of the bodily self in space.

In the final part of his talk Dr. Ehrsson will discuss recent studies that have investigated how the central construct of the bodily self influences other higher cognitive functions such as the visual perception of the world and the ability to remember personal events (embodied cognition). These experiments suggest that the representation of one's own body affects visual perception of object size by rescaling the visual representation of external space, and that efficient hippocampus-based episodic-memory encoding requires a first-person perspective of the spatial relationship between the body and the world. Taken together, the studies reviewed in this lecture advance our understanding of how we come to experience ownership of a body located at a single place, and unravel novel basic links between central body representation, visual perception of the world and episodic memory.

Life as we know it

Karl Friston
Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, UK

How much about our interaction with, and experience of, our world can be deduced from basic principles? This talk reviews recent attempts to understand the self-organized behavior of embodied agents, like ourselves, as satisfying basic imperatives for sustained exchanges with our world. In brief, one simple driving force appears to explain nearly every aspect of our behavior and experience. This driving force is the minimization of surprise or prediction error. In the context of perception, this corresponds to (Bayes-optimal) predictive coding that suppresses exteroceptive prediction errors. In the context of action, simple reflexes can be seen as suppressing proprioceptive prediction errors. We will look at some of the phenomena that emerge from this formulation, such as hierarchical message passing in the brain and the perceptual inference that ensues. I hope to illustrate these points using simple simulations of how life-like behavior emerges almost inevitably from coupled dynamical systems, and how this behavior can be understood in terms of perception, action and action observation.
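
As a purely illustrative companion to Friston's abstract: the claim that perception amounts to suppressing prediction error is often demonstrated with a single-level, linear-Gaussian toy model. The sketch below is such a toy in Python; the one-dimensional setup, the variable names, and the learning rate are assumptions made here for illustration, not Friston's generative model or simulation code.

import numpy as np

# Toy predictive-coding loop: an internal estimate mu tracks a noisy sensory
# signal by gradient descent on the squared prediction error. In the simplest
# linear-Gaussian case, minimizing free energy reduces to exactly this update.
rng = np.random.default_rng(0)
true_cause = 1.5   # hidden cause generating the sensations (assumed)
noise_sd = 0.1     # sensory noise level (assumed)
mu = 0.0           # internal estimate (posterior expectation)
rate = 0.1         # gradient step size (assumed)

for t in range(200):
    sensation = true_cause + noise_sd * rng.standard_normal()
    error = sensation - mu     # exteroceptive prediction error
    mu += rate * error         # perception: update the estimate to suppress it

print(f"estimate after 200 steps: {mu:.2f} (true cause {true_cause})")

Action can be sketched in the same loop by letting the agent change the sensation itself (rather than the estimate) until a proprioceptive error vanishes, which is the reflex reading mentioned in the abstract.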


Elements of extreme expertise

Wayne D. Gray
Rensselaer Polytechnic Institute, Troy, NY, USA

We are studying the acquisition and deployment of extreme expertise during the real-time interaction of a single human with complex, dynamic decision environments. Our dilemma is that people who have the specific skills we wish to generalize to (such as helicopter piloting, laparoscopic surgery, and air traffic control) are very rare in the college population and too expensive to bring into our lab. Our solution has been to study expert and novice video game players. Our approach takes the position that Cognitive Science has been overly fixated on isolating small components of individual cognition. That approach runs the danger of overfitting theories to paradigms. Our way out of this dilemma is to bring together (a) powerful computational models, (b) machine learning techniques, and (c) microanalysis techniques that integrate analyses of cognitive, perceptual, and action data collected from extreme performers to develop, test, and extend cognitive theory.

Since our January 2013 start, we have built our experimental paradigm, collected naturalistic and laboratory data, published journal and conference papers, won Rensselaer Undergraduate research prizes, developed single-piece optimizers (SPOs, i.e., machine learning systems), compared machine performers to human performers, and begun analyzing eye and behavioral data from two 6 h human studies. Our tasks have been the games of Tetris and Space Fortress. Future plans include (a) using our SPOs to tutor piece-by-piece placement, (b) developing integrated cognitive models that account for cognition, action, and perception, and (c) continued exploration of the differences between good players and extreme experts in Tetris and Space Fortress.

Games such as Tetris and Space Fortress are often dismissed as merely requiring reflex behavior. However, with an estimated total number of board configurations of 2^199 (approx. 8 followed by 59 zeroes), Tetris cannot be merely reflex behavior. Our preliminary analyses show complex goal hierarchies, dynamic two-piece plans that are updated after every episode, sophisticated use of subgoaling, and the gradual adaptation of strategies and plans as the speed of play increases. These are very sophisticated, human strategies, beyond our current capability to model, and are a challenging topic for the study of the Elements of Extreme Expertise.

Dynamic Field Theory: from the sensory-motor domain to embodied higher cognition

Gregor Schöner
Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany

The embodiment stance emphasizes that cognitive processes are closely linked to the sensory and motor surfaces. This implies that cognitive processes share with sensory-motor processes fundamental properties including graded state variables, continuous time dependence, stability, and continuous metric contents. According to the embodiment hypothesis these properties are pervasive throughout cognition. This poses the challenge to understand how seemingly categorical states emerge, on which cognitive processes seem to operate at discrete event times. I will review Dynamic Field Theory, a theoretical framework that is firmly grounded in the neurophysiology of population activation in the higher nervous system. Dynamic Field Theory has its origins in the sensory-motor domain where it has been used to understand movement preparation, sensory-motor decisions, and motor memory. In the meantime, however, the framework has been extended to understand elements of visual cognition such as scene representations, object recognition, change detection, and binding. Sequences of cognitive or motor operations can be understood in this framework, which begins to reach into language by providing simple forms of grounding of spatial and action concepts. Discrete events emerge from instabilities in the underlying neural dynamics. Categories emerge from inhomogeneities in the underlying neural populations that are amplified into macroscopic states by dynamic instabilities. I will illustrate how the framework makes contact with psychophysical and neural data, but can also be used to create artificial cognitive systems that act and think based on their own sensory and motor systems.
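
As background to Schöner's abstract (stated here from the general Dynamic Field Theory literature, not quoted from the abstract), the dynamics it refers to are usually written as an Amari-type neural field over a continuous metric dimension x:

    \tau\,\dot{u}(x,t) = -u(x,t) + h + s(x,t) + \int w(x-x')\,g\bigl(u(x',t)\bigr)\,dx'

Here u(x,t) is the activation of the field, h < 0 the resting level, s(x,t) the external (sensory) input, w an interaction kernel with local excitation and lateral inhibition, and g a sigmoidal output function. Self-stabilized peaks of supra-threshold activation arise through instabilities of this dynamics and play the role of the discrete, categorical states discussed in the abstract.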


How t(w)o perform actions together

Natalie Sebanz
SOMBY LAB, Department of Cognitive Science, Central European University, Budapest, Hungary

Humans are remarkably skilled at coordinating their actions with one another. Examples range from shaking hands or lifting a box together to dancing a tango or playing a piano duet. What are the cognitive and neural mechanisms that enable people to engage in joint actions? How does the ability to perform actions together develop? And why is it so difficult to have robots engage in smooth interactions with humans and with each other? In this talk, I will review recent studies addressing two key ingredients of joint action: how individuals include others in their action planning, and how they achieve the fine-grained temporal coordination that is essential for many different types of joint action. This research shows that people have a strong tendency to form representations of others' tasks, which affects their perception and attention, their action planning, and their encoding of information in memory. To achieve temporal coordination of their actions, people reduce the variability of their movements, predict the actions of their partners using their own motor system, and modulate their own actions to highlight critical information to their partner. I will discuss how social relations between individuals and groups and the cooperative or competitive character of social interactions modulate these processes of action planning and coordination. The next challenge for joint action research will be to understand how joint action enables learning. This will allow us to understand what it takes for people to become experts in particular joint actions, and how experts teach individual skills through performing joint actions with novices.

Symposia

DRIVER COGNITION

Convenor: Martin Baumann
Ulm University, Germany

From a psychological point of view driving is a highly complex task, despite the fact that millions of people perform this task in a safe and efficient way each day. It involves many mental processes and structures, such as perception, attention, memory, knowledge, manual control, decision making, and action selection. These processes and structures need to work closely integrated to master the challenges of driving a vehicle in a highly dynamic task environment: our daily traffic. On the other hand, despite all advances in traffic safety in recent years, still about 31,000 people were killed in 2010 on European roads. A high percentage of these fatalities are due to human error, which reflects a break-down of the interplay between the aforementioned cognitive processes. Therefore, understanding the cognitive processes that underlie driver behavior is not just a highly interesting academic endeavor to learn how the human mind masters highly dynamic tasks but is also vital for further improvement of traffic safety.

The papers presented in this symposium address different aspects of driver cognition, demonstrating the variety of processes relevant in the study of driver cognition. They all have in common that their empirical work is based on models of the underlying mental processes, ranging from conceptual models to quantitative and computational models. Two papers present recent results on models addressing the driver's longitudinal control behavior. Whereas Käthner and Kuhl present results on the validation of a specific car following model that is based on those input variables that are actually available to the human driver, Brandenburg and Thüring present empirical results on the validation of a general model of speed behavior based on the interplay of bottom-up and top-down processes. Weber presents the results of a joint research project that aimed at developing an integrated driver model within a computational cognitive architecture, called CASCaS, allowing simulations of driver behavior in a real-time driving simulation environment. The papers of Wortelen and of Kaul and Baumann both investigate factors influencing the distribution of attention while driving. Wortelen implemented a computational model of attention distribution within the cognitive architecture CASCaS to model the effects of expectations about event frequencies and of information value on attention distribution. Kaul and Baumann investigated the effects of event predictability in comparison to event frequency on attention distribution to explain related findings on rear-end accidents.

The CSB model: A cognitive approach for explaining speed behavior

Stefan Brandenburg, Manfred Thüring
Cognitive Psychology and Cognitive Ergonomics, TU Berlin, Germany

Based on Daniel Kahneman's (2012) distinction between highly automated, fast processes (system 1) and conscious, slower cognition (system 2), the Components of Speed Behavior (CSB) Model explains the driver's longitudinal control of a vehicle by the interplay of bottom-up and top-down processes.

System 1 is active in common and uncritical situations. The regulation of speed is determined by sensory data from the environment that are processed automatically without demanding many resources. The resulting visual, auditive, haptic and kinesthetic sensations are integrated into a subjective speed impression. An unconscious and automated control process continuously matches this impression against the driver's skills and resources. When his capabilities are exceeded, the driver decelerates the vehicle; when they are underchallenged, he accelerates it. In case both components are balanced, he keeps the speed constant. The driver's behavior determines the objective speed of the vehicle, which in turn impacts his sensations and thus his subjective speed impression. Hence, in the dynamic situation of driving, system 1 is considered a closed-loop process that requires but little attention and controls the speed of the car in an automated way. This process is monitored by system 2, which is responsible for tactic and strategic actions. It takes over control when a critical situation demands specific maneuvers under attention or when decisions for way finding and navigation are required.

The assumptions of the CSB Model with respect to system 1 were tested in four experiments using a simple driving simulator. Their results support the basic characteristics of the model. In the most complex study, features of the environment were varied together with the driver's mental workload. As predicted by the model, these variables influenced the subjective impression of speed as well as the objective speed. Besides such supporting evidence, additional influences were detected which served to state some components more precisely and to expand the model. The final CSB version is published in Brandenburg (2014).

References

Brandenburg S (2014) Geschwindigkeitswahl im Straßenverkehr: Theoretische Erklärung und empirische Untersuchungen. SVH-Verlag, Saarbrücken

Kahneman D (2012) Schnelles Denken - langsames Denken. Siedler Verlag, München
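
To make the closed-loop character of system 1 more tangible, here is a minimal simulation sketch of the control cycle described above (integrate sensations into a speed impression, compare it with the driver's capabilities, then accelerate, decelerate, or hold). All function names, gains, and numerical values are illustrative assumptions and not part of the published CSB model (Brandenburg 2014).

# Illustrative sketch of the system-1 speed-regulation loop of the CSB model.
# Every parameter below is an assumption chosen for demonstration only.

def speed_impression(objective_speed, workload):
    # Environmental features and mental workload modulate the subjective impression.
    return objective_speed * (1.0 + 0.2 * workload)

def regulate(objective_speed, workload, capability=100.0, gain=0.05, tolerance=5.0):
    impression = speed_impression(objective_speed, workload)
    if impression > capability + tolerance:    # capabilities exceeded -> decelerate
        return objective_speed - gain * (impression - capability)
    if impression < capability - tolerance:    # driver underchallenged -> accelerate
        return objective_speed + gain * (capability - impression)
    return objective_speed                     # balanced -> keep speed constant

speed = 80.0
for step in range(50):
    speed = regulate(speed, workload=0.5)
print(round(speed, 1))  # settles inside the tolerance band around the assumed capability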


Validation of the Driving by Visual Angle car following model

David Käthner, Diana Kuhl
Deutsches Zentrum für Luft- und Raumfahrt, Braunschweig, Germany

Development and validation of Advanced Driver Assistance Systems require both an understanding of driver behavior as well as the means to quickly test systems under development at various stages. Quantitative models of human driver behavior offer both capabilities. Conventional models attempt to reproduce or predict behavior based on arbitrary input, whereas psychological models seek to emulate human behavior based on assumed cognitive functions. One common driving task is car following. As this is a straightforward control problem, a plethora of control models exist (Brackstone, McDonald 1999). But typical car following models use input variables that are not directly accessible to human drivers, such as the speed of or distance to a lead vehicle. One example of such a model is the classic Helly car following model (Helly 1959). Andersen and Sauer (2007) argued that to a human driver the only available input parameter is the visual angle of a lead vehicle. They substituted both velocities and distances in Helly's model with the visual angle, changing the properties of the controller considerably. They showed their Driving by Visual Angle (DVA) model to be superior to other car following models but did not compare it directly to the Helly model. In a simulator pre-study, we recreated Andersen and Sauer's experimental setting to gather information on the DVA parameter properties and compared them to the original findings. To test the model's usability in real world settings, we conducted an extensive data collection in real traffic. On a 70 km course through urban and rural settings as well as on a motorway, 10 subjects were instructed to follow a lead vehicle driven by a confederate. We will present findings on the model's quality, properties of the model's parameters such as their stability, and compare them to similar models of car following.

References

Andersen GJ, Sauer CW (2007) Optical information for car following: the driving by visual angle (DVA) model. Human Factors 49:878-896

Brackstone M, McDonald M (1999) Car-following: a historical review. Transp Res Part F 2(4):181-196

Helly W (1959) Simulation of bottlenecks in single lane traffic flow. In: International symposium on the theory of traffic flow, New York, NY, USA

The effects of event frequency and event predictability on drivers' attention allocation

Robert Kaul (1), Martin Baumann (2)
(1) Deutsches Zentrum für Luft- und Raumfahrt e.V., Institut für Verkehrssystemtechnik, Braunschweig, Germany; (2) Department Human Factors, Ulm University, Germany

Safe driving requires the appropriate allocation of visual attention to the relevant objects and events of a traffic situation. According to the SEEV model (e.g., Horrey, Wickens, Consalus 2006) the allocation of visual attention to a visual information source is influenced by four parameters: (i) the salience of the visual information, (ii) the effort to allocate attention to this source, (iii) the expectancy, i.e. the expectation that at a given location new relevant information will occur, and (iv) the value or importance of the piece of information perceived at an information source. Whereas the first two reflect more or less bottom-up processes of attention allocation, the latter two reflect top-down processes. According to the SEEV model, the expectancy itself is mainly determined by the frequency of events at that information source or location. But it seems plausible to assume that these top-down processes, represented in the expectancy parameter of the model, are also influenced by the predictability of events at a certain information source. That is, many predictable events in a channel cause less attention allocation than a single but unexpected event.

In a driving simulator experiment, conducted within the EU project ISi-PADAS, we compared the effects of event frequency and event predictability on the allocation of visual attention. 20 participants took part in this experiment. They had to drive in an urban area with a lead car changing its speed either frequently or not at all on a straight section before a crossing, braking either predictably (stop sign) or unpredictably (priority sign) at the crossing, and simultaneously performing a visual secondary task with either high frequency or low frequency stimulus presentation. Drivers' gaze behavior was recorded while driving. The results show that drivers' allocation of visual attention is mainly determined by the predictability of the lead car's behavior, demonstrating the importance of the driver's ability to predict events as a major determinant of driving behavior.

References

Horrey WJ, Wickens CD, Consalus KP (2006) Modeling drivers' visual attention allocation while interacting with in-vehicle technologies. J Exp Psychol Appl 12(2):67-78

Integrated modeling for safe transportation (IMoST 2): driver modeling & simulation

Lars Weber
OFFIS, Institute for Information Technology, Oldenburg, Germany

IMoST 2 [1] is an interdisciplinary research project between the three partners C.v.O. University of Oldenburg, OFFIS and the DLR Brunswick (2010-2013). The project addresses the question of completing the scope of model-based design to also incorporate human behavior. The application area is the design of advanced driver assistance systems (ADAS) in the automotive domain. Compared to the predecessor project IMoST 1, which addressed a single driving maneuver only (entering the autobahn), IMoST 2 increased the scope of the scenario and deals with typical driving maneuvers on the autobahn, including lane changes as well as free-flow and car-following.

The presentation will give an overview of the final state of driver modeling and simulation activities conducted in the project. During the 3 years of the project a driver model was implemented based on the cognitive architecture CASCaS. The architecture incorporates several psychological theories about human cognition and offers a flexible component-based approach to integrate various human modeling techniques. The presentation will provide a brief overview of the various sub-models like multimodal perception, situation representation, decision making and action selection/execution, and how this architecture can be used to model and simulate human-machine interaction in the domain of driver modeling. Additionally, some of the empirical study results will be presented which were used to parameterize the model.

[1] Integrated Modeling for Safe Transportation 2 (funded by MWK Niedersachsen (VW Vorab)).


Simulating the influence of event expectancy on drivers' attention distribution

Bertram Wortelen
OFFIS, Institute for Information Technology, Oldenburg, Germany

The distribution of attention is a critical aspect of driving. The increased use of assistance and automation systems as well as new infotainment systems changes the distribution of attention. This work presents the Adaptive Information Expectancy (AIE) model, a new model of attention distribution, which is based on Wickens' SEEV model. It can be integrated into cognitive architectures which are used to simulate task models. The AIE model enables a very detailed simulation of the distribution of attention in close interaction with the simulation of a task model. Unlike the SEEV model, simulations using the AIE model allow one to derive several measures of human attention distribution besides the percentage gaze distribution, like gaze frequencies and gaze transition probabilities. Due to the tight integration with the simulation of task models, it is also possible to simulate the resulting consequences on the operator's behavior (e.g. steering behavior of drivers). The AIE model considers two factors which have a great impact on drivers' attention: the expectancy of events and the value of information. The main focus is on the expectancy of events. The AIE model provides a new method to automatically determine the event expectancy from the simulation of a task model.

It is shown how the AIE model is integrated in the cognitive architecture CASCaS. A driving simulator study is performed to analyze the AIE model in a realistic driving environment. The simulation scenario is driven by human drivers as well as by a driver model developed with CASCaS using the AIE model. This scenario investigates the effects of both factors on drivers' attention distribution: event expectancy and information value. Comparing the behavior of the human drivers to model behavior shows a good model fit for the percentage distribution of attention as well as gaze frequencies and gaze transition probabilities.

PROCESSING LANGUAGE IN CONTEXT: INSIGHTS FROM EMPIRICAL APPROACHES

Convenors: Christian Brauner, Gerhard Jäger, Bettina Rolke
Project B2, SFB 833, University of Tübingen, Germany

Discourse understanding does not only mean integrating semantic knowledge along syntactic rules. It rather needs a Theory of Mind, entails the inclusion of context information, and presupposes that pragmatic principles are met. Moreover, data from brain imaging studies suggest that language is embodied within the motor and sensory processing systems of the brain. Thus, it seems clear that the faculty of language does not constitute a single, encapsulated processing module. Instead it requires the interoperation of several different processing modules serving to aid an unambiguous discourse understanding.

Important processing prerequisites for successful discourse understanding are the ability to make references to previously established knowledge and to integrate new information into a given context. There are several linguistic tools which help to signal the requirement for suitable referents in a given discourse and which provide additional meaning aspects. One example is presuppositions. These carry context assumptions aside from the literal meaning of the words. For example, the sentence "The cat ran away" asserts that some cat ran away, whereas it presupposes that there exists a cat and that the cat that is mentioned is unique in the discourse. This symposium will have its main focus on the cognitive processing of such semantic and pragmatic phenomena.

The interconnection of the faculty of language with different cognitive processing modules confronts us with questions that seem to escape a uniform analysis by one single academic discipline. Hence, research into cognitive language processing and pragmatics in particular is a fruitful interdisciplinary interface between linguistics and cognitive psychology. While linguists have mainly focused on theoretical aspects of pragmatics, cognitive psychologists have aimed to identify the cognitive processing functions involved. The symposium will provide an interdisciplinary platform for linguists and cognitive psychologists to discuss questions pertaining to the cognitive processing of language. Our speakers will present their research obtained by means of different empirical approaches.

Investigations into the incrementality of semantic interpretation: the processing of quantificational restriction

Petra Augurzky, Oliver Bott, Wolfgang Sternefeld, Rolf Ulrich
SFB 833, University of Tübingen, Germany

Language comprehenders have the remarkable ability to restrict incoming language, seemingly effortlessly, in a way that it optimally fits the referential domain of discourse. We present a study which investigates the incremental nature of this update process, in particular, whether the semantic processor immediately takes into account the context of the utterance to incrementally compute and, if necessary, reanalyze the semantic value of yet partial sentences.

When the polar bear fails to find a referent: how are unmet presuppositions processed?

Christian Brauner, Bettina Rolke
SFB 833, University of Tübingen, Germany

Discourse understanding fails when presuppositions, i.e., essential context information, are not given. We investigated the time-course of presupposition processing by presenting presupposition triggers such as the definite article or the iterative "again" in a context which contrasted with the presupposed content of the trigger or was compatible with it. By means of reading-time studies and event-related brain potentials we show that readers notice semantic inconsistencies at the earliest time point during reading. Our results additionally suggest that different presupposition processing strategies were employed depending on the type of required reference process.

Deep or surface anaphoric pronouns?: Empirical approaches

Pritty Patel-Grosz, Patrick Grosz
SFB 833, University of Tübingen, Germany

Anaphoric expressions, such as pronouns (he/she/it/they), which generally retrieve earlier information (e.g. a discourse referent), are typically taken to be central in establishing textual coherence. From a cognitive/processing perspective, the following question has been posed by Hankamer, Sag (1976) and Sag, Hankamer (1984): do all anaphoric expressions involve the same cognitive mechanisms, or is there a difference between deep anaphora (which retrieves information directly from the context) vs. surface anaphora (which operates on structural information/syntactic principles)? This question has largely been investigated for phenomena such as "do it" anaphora vs. VP ellipsis, but it also bears relevance for pronouns proper: in the course of categorizing pronouns into weak vs. strong classes (cf. Cardinaletti, Starke 1999), Wiltschko (1998) argues that personal pronouns are deep anaphoric (by lacking an elided NP), whereas demonstrative pronouns are surface anaphoric (and contain an elided NP). We present new empirical evidence and argue that a distinction between deep anaphoric vs. surface anaphoric pronouns must be rejected, at least in the case of personal vs. demonstrative pronouns, and that the observed
differences between these classes can be deduced at the level of prag- view regarding the relationship between individual actions and
matics, employing economy principles in the spirit of Cardinaletti, Starke interactions. It will provide new insights from several research fields
(1999) and Schlenker (2005). including decision making, neuroscience, philosophy of neuroscience,
computational neuroscience, and psychology. The aim of the sym-
References posium is give a state of the art overview about commonalities and
Cardinaletti A, Michal S (1999) The typology of structural deficiency: differences of the perceptual cognitive processes underlying indi-
a case study of three classes of pronouns. In Henk van Riemsdijk vidual actions and social interactions.
(ed) Clitics in the languages of Europe. Mouton, Berlin,
pp 145233
Hankamer J, Ivan S (1976) Deep and surface anaphora. Linguist Inq
7:391426
Signaling games in sensorimotor interactions
Ivan S, Hankamer J (1984) Toward a theory of anaphoric processing.
Linguist Philos 7:325345 Daniel Braun
Schlenker P (2005) Minimize restrictors! (Notes on definite descrip- Max Planck Institute for Biological Cybernetics, Tubingen, Germany
tions, condition C and epithets. In Proceedings of Sinn und In our everyday lives, humans not only signal their intentions through
Bedeutung 2004, pp 385416 verbal communication, but also through body movements, for instance
Wiltschko M (1998) On the syntax and semantics of (relative) pro- when doing sports to inform team mates about ones own intended
nouns and determiners. J Comper German Linguisti 2:143181 actions or to feint members of an opposing team. Here, we study such
sensorimotor signaling in order to investigate how communication
emerges and on what variables it depends on. In our setup, there are
Comparing presuppositions and scalar implicatures two players with different aims that have partial control in a joint
motor task and where one of the two players possesses private infor-
mation the other player would like to know about. The question then is
Jacopo Romoli under what conditions this private information is shared through a
University of Ulster, UK signaling process. We manipulated the critical variables given by the
In a series of experiments sentences were used containing a presup- costs of signaling and the uncertainty of the ignorant player. We found
position that was either compatible or incompatible to a context that the dependency of both players strategies on these variables can
sentence. This was compared to a sentence in a context containing be modeled successfully by a game-theoretic analysis.
either a compatible or incompatible scalar implicature. The talk will
draw some conclusions on the cognitive cost of presuppositions in
relation to the putative cost of scalar implicatures.
Perceptual cognitive processes underlying
the recognition of individual and interactive actions
The time course of referential resolution
Stephan de la Rosa
Max Planck Institute for Biological Cybernetics, Tubingen, Germany
Petra Schumacher
University of Mainz, Germany Humans are social beings whose physical interactions with other
people require rapid recognition of the other person actions, for
Referential expressions are essential ingredients for speaker-hearer example when shaking hands. Previous research has investigated the
interactions. During reference resolution incoming information perceptual cognitive processes involved in action recognition using
must be linked with prior context and also serves information open loop experiments. In these experiments participants passively
progression. Speakers use different referential forms and other view actions during recognition. These studies identified several
means of information packaging (e.g., linear order, prosody) to important bottom-up mechanisms in action recognition. However, in
convey additional meaning aspects. Using event-related brain daily life participants often recognize action for or during action
potentials, we can investigate the time course of reference reso- production. In order to fully understand action recognition under more
lution and examine how comprehenders exploit multiple cues realistic conditions, we examined visual action perception in classical
during the construction of a mental representation. In this talk, I open-loop (participants observe actions), semi-closed (participants
present data that indicate that reference resolution is guided by two interact with an avatar which carries out prerecorded actions), and
core mechanisms associated with i) referential accessibility and closed loop experiments (two participants interact naturally with each
expectation (N400) and ii) accommodation and mental model other using feedback loops). Our results demonstrate the importance
updating (Late Positivity). of considering high level factors that are under top-down control in
action recognition.

COGNITION OF HUMAN ACTIONS:


FROM INDIVIDUAL ACTIONS TO INTERACTIONS Neural theory for the visual processing of goal-directed
actions
Convenor: Stephan de la Rosa
Max Planck Institute for Biological Cybernetics, Tubingen, Germany Martin. A. Giese
Previous research has focused on the perceptual cognitive processes Section for Computational Sensomotorics, Dept. for Cognitive
involved in the execution and observation of individual actions such Neurology, HIH and CIN, University Clinic Tubingen, Germany
as a person walking. Only more recently research started to investi- The visual recognition of biological movements and actions is an
gate the perceptual-cognitive processes involved in the interaction of important visual function that involves computational processes that
two or more people. This symposium provides an interdisciplinary link neural representations for action perception and execution.

123
S12 Cogn Process (2014) 15 (Suppl 1):S1S158

This fact has made this topic highly attractive for researchers in social information processing. Essentially, two different neural
cognitive neuroscience, and a broad spectrum of partially highly specu- systems have been established in this research domain that appear to
lative theories have been proposed about the computational processes constitute two different routes of processing underlying our social
that might underlie action vision in primate cortex. In spite of this very cognitive capacities in everyday social encounters, namely the so-
active discussion about hypothetical computational and conceptual the- called mirror neuron system (MNS) and the social neural net-
ories, our detailed knowledge about the underlying neural processes is work (SNN, also theory of mind network or mentalizing network).
quite limited, and a broad spectrum of critical experiments that narrow The functional roles of both systems appear to be complementary.
down the relevant computational key steps remain yet to be done. The MNS serves comparatively early stages of social information
I will present a physiologically-inspired neural theory for the pro- processing that are more related to spatial or bodily signals
cessing of goal-directed actions, which provides a unifying account for expressed in the behaviour of others and supports the detection of
existing neurophysiological results on the visual recognition of hand potential social salience, including observation of other persons
actions in monkey cortex. At the same time, the model accounts for actions. Complementary to the functional role of the MNS, the SNN
several new experimental results, where a part of these experiments were serves comparatively late stages of social information processing
motivated by testing aspects of the proposed neural theory. Importantly, that are more related to the evaluation of emotional and psy-
the present model accounts for many basic properties of cortical action- chological states of others that have to be inferred as inner mental
selective neurons by simple physiologically plausible mechanisms that experience from the behaviour of this person. Empirical studies on
are known from visual shape and motion processing, without necessi- the neural mechanisms of ongoing social interactions with others
tating a central computational role of motor representations. show that essentially SNN components are recruited during the
The same model also provides an account for experiments on the experience of social encounters together with the reward system of
visual perception of causality, suggesting that simple forms of cau- the brain.
sality perception might be a side-effect of computational processes
that mainly subserve the recognition of goal-directed actions. Exten-
sions of the model might provide a basis for the investigation of the
neurodynamic phenomena in the visual processing of action stimuli. CORTICAL SYSTEMS OF OBJECT GRASPING
AND MANIPULATION
Acknowledgments
Research supported by the EC FP7 projects AMARSi, Koroibot, Convenor: Marc Himmelbach
ABC, and Human Brain Project, and by the BMBF and the DFG. Division of Neuropsychology, Hertie-Institute for Clinical Brain
Research, Centre for Integrative Neuroscience, University
of Tubingen, Germany
From individual to joint action: representational Reaching for objects, grasping them, and finally using or manipu-
commonalities and differences lating these objects are typical human capabilities. Although several
non-human species are able to do these things, the anatomical
adaptation of our hands for an extraordinarily precise and flexible use
Hong Yo Wong in the interaction with an infinite number of different target objects
CIN, University of Tubingen, Germany makes humans unique among the vertebrate species. The unique
To what extent do the structures underpinning individual action differ anatomy of our hands is matched by a cortical sensorimotor control
from those underpinning joint action? What are the representational system connecting multiple areas in the frontal and parietal lobes of
commonalities and differences between individual and joint action? the human cortex, which underwent a considerable enlargement
Can an individual account of planning intentions be extended to cover across the primate species. Although our hands by themselves, their
the case of joint action (as suggested by Bratman)? What is the flexible and precise use, and the capacities of our cortical hand motor
phenomenology of acting together? Is an adequacy condition on a systems already distinguish us from all other species, the use of
theory of action that it must account for the action of an arbitrary objects as tools to act on further objects and thereby mediate and
number of agents (as suggested by Butterfill)? This talk will approach transform our actions, makes us truly human. Although various non-
these questions from the point of view of the philosophy of action. We human species use tools in some situations, the versatility of human
will draw on recent empirical studies on joint action to reflect on tool use is totally unrivalled. Neuropsychological and neuroimaging
prominent philosophical accounts of joint action, using this as an research showed that dedicated cortical tool use systems overlap
opportunity to reflect on the significance of a philosophy of action for partially with the arm/hand sensorimotor systems but include addi-
the science of action (and vice versa). tional frontal, parietal, and temporal cortical structures. While most of
the structures that seem to be relevant for tool use beyond the arm-
hand sensorimotor system have been identified, we are still missing a
satisfactory description of their individual functional contributions.
Neural mechanisms of observing and interacting Across the whole range from simple grasping to the use of objects as
with others tools on other objects, investigations of interdependencies and inter-
actions between these cortical system components are still at the
Kai Vogeley beginning. The speakers of this symposium together cover the range
University Hospital Cologne, Germany from simple grasping to tool use and will present their current
behavioral, neuropsychological, and neuroimaging findings that fur-
Over the last decade, cognitive neuroscience has started to sys- ther specify the functional description of the human object grasping
tematically study the neural mechanisms of social cognition or and manipulation systems.

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S13

Influences of action characteristics and hand used Effects of object recognition on grasping
on the neural correlates of planning and executing
object manipulations Marc Himmelbach
Division of Neuropsychology, Hertie-Institute for Clinical Brain
Research, Centre for Integrative Neuroscience, University
Joachim Hermsdorfer1, Marie-Luise Brandi1,2, Christian Sorg2,
of Tubingen, Germany
Georg Goldenberg3, Afra Wohlschlager2
1
Department of Sport and Movement Science, Technical University Grasping a manipulable object requires action programming and
Munich, Germany; 2 Department of Neurology, Technical University object recognition, two processes that were supposed to be ana-
Munich, Germany; 3 Department of Neuropsychology, Bogenhausen tomically segregated in a dorsal and a ventral visual subsystem.
Hospital, Germany Our studies investigated interactions between these proposed sub-
systems studying the influence of familiar everyday objects on
Studies using functional magnetic resonance imaging (fMRI) techniques
grasp programming and its cortical representation in humans. Our
have revealed a wide-spread neural network active during the naming or
behavioral studies revealed an effect of learned identity-size
imagination of tool action as well as during pantomimes of tool use.
associations on reach-to-grasp movements under binocular viewing
Actual tool has however only rarely been investigated due to methodo-
conditions, counteracting veridical binocular depth and size
logical problems. We have constructed a tool carousel to enable the
information. This effect of object recognition on grasp program-
controlled and quick presentation and use of a variety of everyday tools
ming was further supported by differences in the scaling of the
and corresponding recipients, while restricting body movements to lower
maximum grip aperture between grasping featureless cuboids and
arm and hand. In our paradigm we compared the use of tools as well as the
grasping recognizable everyday objects in healthy humans. A
goal-directed manipulation of neutral objects with simple transportation.
subsequent fMRI experiment showed that during grasping every-
We tested both hands in 17 right-handed healthy subjects. An action
day objects relative to grasping featureless cuboids BOLD signal
network including parietal, temporal as well as frontal areas was found.
levels were not only increased at the lateral occipital cortex but
Irrespectively of the exact characteristics of the action, planning was
also at the anterior intraparietal sulcus, suggesting that object-
strongly lateralized to the left brain and involved similar areas, which
identity information is represented in the dorsal subsystem. Mea-
remained active during actual task execution. Handling a tool versus a
suring reach-to-grasp kinematics in two patients with lateral
neutral bar and using an object versus simple transportation strengthens
occipito-temporal brain damage we observed significant behavioral
the lateralization of the action network towards the left brain. The results
deficits in comparison to a large healthy control group, suggesting
support the assumption that a dorso-dorsal stream is involved in the on-
a causal link between visual processing in the ventral system and
line manipulation of objects according to orientation and structure
grasp programming. In conclusion, our work shows that the rec-
independent of object knowledge. Regions of a ventral-dorsal pathway
ognition of a particular object not only affects grasp planning, i.e.
process and code the specific knowledge of how a common tool is used.
the selection of a broad motor plan, but also the parameterization
Temporal-ventral areas identify objects and may code semantic tool
of reach-to-grasp movements.
information. Use of the left-hand leads to a larger recruitment of action
areas, possibly to compensate for the lack of routine and automatism
when using the non-dominant hand.
The representation of grasping movements
in the human brain
Attention is needed for action control: evidence
from grasping studies Angelika Lingnau
Center for Mind/Brain Sciences, Department of Psychology
Constanze Hesse and Cognitive Science, University of Trento, Italy
School of Psychology, University of Aberdeen, UK Daily life activities require skillful object manipulations. Whereas
It is well known that during movement preparation, attention is we begin to understand the neural substrates of hand prehension in
allocated to locations which are relevant for movement planning. monkeys at the level of single cell spiking activity, we still have a
However, until now, very little research has examined the influence of limited understanding of the representation of grasping movements
distributed attention on movement kinematics. In our experiments, we in the human brain. With recent advances in human neuroimaging,
investigated whether the execution of a concurrent perceptual task such as functional magnetic resonance imaging (fMRI) repetition
that requires attentional resources interferes with movement planning suppression (fMRI-RS) and multi-variate pattern (MVP) analysis,
(primarily mediated by the ventral stream) and/or movement control it has become possible to characterize some of the properties
(primarily mediated by the dorsal stream) in grasping. Participants represented in different parts of the human prehension system. In
had to grasp objects of varying sizes whilst simultaneously per- this talk, I will present several studies using fMRI-RS and MVP
forming a perceptual identification task. Movement kinematics and analysis that investigated the representation of reach direction,
perceptual identification performance in the dual-task conditions were wrist orientation, grip type and effector (left/right hand) of simple
compared to the baseline performance in both tasks (i.e. performance non-visually guided reach-to-grasp movements. We observed a
levels in the absence of a secondary task). Furthermore, movement preference for reach direction along the dorsomedial pathway, and
kinematics were measured continuously such that interference effects overlapping representations for reach direction and grip type along
could also be detected at early stages of the movement. Our results the dorsolateral pathway, in line with a growing literature that
indicate that both movement planning (as indicated by prolonged casts doubts on a clear-cut distinction between separate pathways
reaction times) as well as movement control (as indicated by a for the reach and grasp component. Moreover, we were able to
delayed adjustment of the grip to the objects size) are altered when distinguish between premotor areas sensitive to grip type, wrist
attention has to be shared between a grasping task and a perceptual orientation and effector, and parietal areas that are sensitive to grip
task. These findings suggest that the dorsal and the ventral stream type across wrist orientation and grip type. Our results support the
share common attentional processing resources and that even simple view of a hierarchical representation of movements within the
motor actions such as grasping are not completely automated. prehension system.

123
S14 Cogn Process (2014) 15 (Suppl 1):S1S158

Avoiding obstacles without a ventral visual stream EYE TRACKING, LINKING HYPOTHESES
AND MEASURES IN LANGUAGE PROCESSING
Thomas Schenk
Department of Neurology, University of Erlangen-Nuremberg, Convenors: Pia Knoeferle, Michele Burigo
Germany Bielefeld University, Germany
When reaching for a target it is important to avoid knocking over The present symposium focuses on a core topic of eye tracking in
objects that stand in the way. We do this without thinking about it. language processing, viz. linking hypotheses (the attributive rela-
Experiments in a hemiblind patient demonstrate that obstacles that are tionship between eye movements and cognitive processes). One
not perceived can be avoided. To account for such dissociations the central topic will be eye-tracking measures and their associated
two visual-streams model suggests that perception is handled in the linking hypotheses in both language comprehension and produc-
ventral visual stream while visually-guided action depends on visual tion. The symposium will discuss both new and established gaze
input from the dorsal stream. The model also assumes that the dorsal measures and their linking assumptions, as well as ambiguity in
stream cannot store visual information. Consequently it is predicted our linking assumptions and how we could begin to address this
that patients with dorsal stream damage will fail in the obstacle- issue.
avoidance task, but succeed when a short delay is introduced between
obstacle presentation and response onset. This has been confirmed in
patients with optic ataxia. In contrast ventral stream damage should
allow normal obstacle avoidance but destroy the patients ability to Conditional analyses of eye movements
avoid obstacles in a delayed condition. We tested these predictions in
DF. As expected we found that she can avoid obstacles in the standard Michele Burigo, Pia Knoeferle
condition. More surprisingly she is equally good in the delayed Bielefeld University, Germany
condition and a subtle change in the standard condition is sufficient to
impair her obstacle-avoidance skills. The implications of these find- In spoken language comprehension fixations guided by the verbal
ings for the control of reaching will be discussed. input have been interpreted as reflecting a referential link between
words and corresponding objects (Tanenhaus, Spivey-Knowlton,
Eberhard, Sedivy 1995). However, they cannot reveal other aspects of
how comprehenders interrogate a scene (e.g., attention shifts from one
Action and semantic object knowledge are processed object to another). Inspections, on the other hand, are, by definition, a
in separate but interacting streams: evidence good reflection of attentional shifts much like saccades (see Altmann,
Kamide 2004 for related discussion). One domain where attentional
from fMRI and dynamic causal modelling shifts and their direction are informative is spatial language (e.g., the
plant is above the clock). Some models predict that objects are
Peter H. Weiss-Blankenhorn inspected as they are mentioned while others predict that attention
Department of Neurology, University Hospital Cologne, Germany & must shift from a later-mentioned object (the clock) to the earlier
Cognitive Neuroscience, Institute of Neuroscience & Medicine (INM- mentioned located object (the plant). To assess these model predic-
3), Research Centre Julich, Germany tions, we examined in particular the directionality of attention shifts
While manipulation knowledge is differentially impaired in patients via conditional analyzes. We ask where people look next, as they hear
suffering from apraxia, function knowledge about objects is selec- the spatial preposition and after they have made one inspection to the
tively impaired in patients with semantic dementia. These clinical clock. Will they continue to inspect the clock or do they shift attention
observations fuelled the debate whether manipulation and function back to the plant? Three eye tracking experiments were used to
knowledge about objects rely on differential neural substrates, as the investigate the directionality of attention shifts during spatial lan-
processing of function knowledge may be based on either the action guage processing. The results from these conditional analyzes
or the semantic system. revealed, for the first time, the overt attentional shifts from the ref-
By using new experimental tasks and effective connectivity ana- erence object (the clock) to the located object (the plant) in sentences
lysis, fMRI studies can contribute to this debate. Behavioral data such as The plant is above the clock). In addition conditional ana-
revealed that functional object knowledge (= motor-related semantic lyzes of inspections may provide a useful approach for further
knowledge) and (non-motor) semantic object knowledge are pro- refining the linking hypotheses between eye movements and cognitive
cessed similarly, while processing manipulation-related action processes (Fig. 1).
knowledge took longer. For the manipulation task compared to the
two (motor and non-motor) semantic tasks, a general linear model References
analysis revealed activations in the bilateral extra-striate body area Altmann GTM (1999) Thematic role assignment in context. J Mem-
and the left intra-parietal sulcus. The reverse contrast led to activa- ory Lang 41:124145
tions in the fusiform gyrus and inferior parietal lobe bilaterally as well Regier T, Carlson L (2001) Grounding spatial language in perception:
as in the medial prefrontal cortex. Effective connectivity analysis an empirical and computational investigation. J Exp Psychol Gen
demonstrated that action and semantic knowledge about objects are 130:273298
processed along two separate, but interacting processing streams with Tanenhaus MK, Spivey-Knowlton MJ, Eberhard KM, Sedivy JE
the inferior parietal lobe mediating the exchange of information (1995) Integration of visual and linguistic information in spoken
between these streams. language comprehension. Science 268:16321634

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S15

Beatty J, Lucero-Wagoner B (2000) The pupillary system. Cambridge


University Press, Cambridge
Engelhardt PE, Ferreira F, Patsenko EG (2010) Pupillometry reveals
processing load during spoken language comprehension. Quart J
Exp Psychol 63:639645
Frank S, Thompson R (2012) Early effects of word surprisal on pupil
size during reading. In: Miyake N, Peebles D, Cooper RP (eds)
Proceedings of 34th annual conference cognitive science society,
pp 15541559
Gutirrez RS, Shapiro LP (2010) Measuring the time-course of sen-
tence processing with pupillometry. In: CUNY conference on
human sentence processing
Hess E, Polt J (1960) Pupil size as related to interest value of visual
stimuli. Science
Hess E, Polt J (1964) Pupil size in relation to mental activity during
simple problem-solving. Science
Hyona J, Tommola J, Alaja A (1995) Pupil dilation as a measure of
processing load in simultaneous interpretation and other lan-
guage tasks. Quart J Exp Psychol 48(3):598612
Fig. 1 The plant is above the clock. The 5 9 6 grid was used to Just MA, Carpenter PA (1993) The intensity dimension of thought:
define the objects locations and was invisible to participants pupillometric indices of sentence processing. Can J Exp Psychol
47(2)
Kahneman D, Beatty J (1966) Pupil diameter and load on memory.
Rapid small changes in pupil size index processing Science
difficulty: the index of cognitive activity in reading, Marshall S (2000) US patent no. 6,090,051
visual world, and dual task paradigms Marshall S (2002) The index of cognitive activity: Measuring cog-
nitive work-load. In: Proceedings of 7th conference on human
factors and power plants, IEEE, pp 57
Vera Demberg
Marshall S (2007) Identifying cognitive state from eye metrics. Aviat
Saarland University, Saarbrucken, Germany
Space Environ Med 78(Supplement 1):B165B175
The size of the pupil has long been known to reflect arousal (Hess, Schluroff M (1982) Pupil responses to grammatical complexity of
Polt 1960) and cognitive load in a variety of different tasks such as sentences. Brain Lang 17(1):133145
arithmetic problems (Hess, Polt 1964), digit recall (Kahneman, Beatty Zellin M, Pannekamp A, Toepel U, der Meer E (2011) In the eye of
1966), attention (Beatty 1982) as well as language complexity (Sch- the listener: pupil dilation elucidates discourse processing. Int J
luroff 1982; Just, Carpenter 1993; Hyona et al. 1995; Zellin et al. Psychophysiol
2011; Frank, Thompson 2012), grammatical violations (Gutirrez,
Shapiro 2010) and context integration effects (Engelhardt et al. 2010).
All of these studies have looked at the macro-level effect of the
overall dilation of the pupil as response to a stimulus. Recently, a Measures in sentence processing: eye tracking
micro-level measure of pupil dilation has been proposed, called the and pupillometry
Index of Cognitive Activity or ICA (Marshall 2000, 2002, 2007),
which does not relate processing load to the overall changes in size of Paul E. Engelhardt1, Leigh B. Fernandez2
the pupil, but instead counts the frequency of rapid small dilations, 1
University of East Anglia, UK; 2 University of Potsdam, Germany
which are usually discarded as pupillary hippus (Beatty, Lucero-
Wagoner 2000). In this talk, we will present data from two studies that measured pupil
Some aspects which make the ICA particularly interesting as a diameter as participants heard temporarily ambiguous sentences. In the
measure of cognitive load are that the ICA a) is less sensitive to first study, we examined visual context. Tanenhaus et al. (1995) found
changes in ambient light and fixation position b) is more dynamic, that in the context of a relevant visual world containing an apple on
which makes it easier to separate the effect of stimuli in close a towel, an empty towel, and a box, listeners will often incorrectly
sequence and c) is faster than overall pupil size, i.e., it can usually be parse an instruction, such as put the apple on the towel in the box. The
measured in the time window of 3001,200 ms after stimulus. misinterpretation is that the apple must be moved on to the empty
If it reliably reflects (linguistic) processing load, the ICA could towel, and thus, the primary dependent measure is rate of saccadic eye
hence constitute a useful new method to assess processing load using movements launched to the empty towel. Eye movements to the empty
an eye-tracker, in auditory experiments, visual world experiments, as towel do not occur when the visual world contains more than one
well as in naturalistic environments which are not well suited for the apple (Ferreira et al. 1995). In the first study, we examined the role that
use of EEG, e.g. while driving a car, and could therefore usefully visual context plays on the processing effort associated with garden-
complement the range of experimental paradigms currently used. path sentence processing (see example A). Pupil diameter was mea-
In this talk I will report experimental results on the index of sured from the key (disambiguating) word in the sentence (e.g.
cognitive activity (ICA) in a range of reading experiments, auditory played). Our main hypothesis was that relevant visual context (e.g. a
language plus driving experiments as well as a visual world experi- picture of a woman dressing herself) would be associated with reduced
ment, which all indicate that the ICA is a useful index of linguistic processing effort (i.e. no increase in pupil size). In contrast, when the
processing difficulty. visual context supported the garden-path misinterpretation (e.g. a
picture of a woman dressing a baby) pupil diameter would reliably
increase.2 Results were consistent with both predictions.
References
Beatty J (1982) Task-evoked pupillary responses, processing load,
2
and the structure of processing resources. Psychol Bull 91(2):276 The prosodic boundary between clauses was also manipulated.

123
S16 Cogn Process (2014) 15 (Suppl 1):S1S158

A. While the woman dressed (#) the baby that was cute and cuddly movements and ERPs). This naturally limits the conclusions we can draw
played on the floor. from this research with regard to language comprehension (theory). In
B. The superintendent learned [which schools/students] the pro- further refining our theories of sentence comprehension, better linking
posal [that expanded/to expand] upon the curriculum would motivate hypotheses would thus be an essential step.
____ during the following semester.3 The present contribution argues that combining eye-tracking and
In the second study, we examined a special type of filler gap event-related brain potentials would improve the interpretation of
dependency, called parasitic gap constructions. Filler gap dependen- these two individual measures, the associated linking hypotheses, and
cies occur when a constituent within a sentence has undergone correspondingly insights into situated language comprehension pro-
movement (e.g. Whati did the boy buy ti?). In this sentence, what has cesses (Knoeferle, in press).
moved from its canonical position as the object of buy, and thus, the
parser must be able to keep track of moved constituents and correctly Acknowledgments
associate them with the correct verbs (or gap sites). Difficulty arises This research was funded by the Cognitive Interaction Technology
when (1) there are multiple verbs in the sentence, and (2) when those Excellence Center (DFG).
verbs are optionally transitive (i.e. have the option to take a direct
object or not). Parasitic gaps are a special type of construction References
because a filler is associated with two gaps. An example is What did
Knoeferle P (in press) Cognitive Neuroscience of Natural Language
the attempt to fix _ ultimately damage _?. Even more interestingly,
Use, Cambridge University Press, Cambridge, chap Language
from a linguistic perspective, is that the first gap occurs in an ille-
comprehension in rich non-linguistic contexts: combining eye
gal position. Phillips (2006) used a self-paced word-by-word reading
tracking and event-related brain potentials
paradigm to test sentences containing parasitic gap like constructions
(see example B). He found slowdowns only in situations in which
parasitic gap dependency was allowed (i.e. with to expand). Fur-
thermore, reading times were influenced by plausibility (i.e. it is Oculomotor measurements of abstract and concrete
possible to expand schools but not students). In our second study, we cognitive processes
used similar materials to investigate parasitic gaps using changes in
pupil diameter over time as an index of processing load. Our data
indicates that the parser actively forms dependencies as soon as Andriy Myachykov
possible, regardless of semantic fit. Northumbria University, Newcastle-upon-Tyne, UK
In summary, this talk will compare and contrast findings from eye Analysis of oculomotor behavior has long been used as a window into
tracking and reading times with pupil diameter. Both of our studies the cognitive processes underlying human behavior. Eye tracking
showed similarities to the original works, but at the same time, also allows recording of highly accurate categorical and chronometric
showed novel dissociations. The relationship between discrete mea- data, which provides experimental evidence about various aspects of
sures, such as saccadic eye movements, and continuous measures, human cognition including, but not limited to, retrieval and activation
such as pupil diameter and mouse tracking, will be discussed. of information in memory and allocation and distribution of visual
attention. As a very diverse and accurate experimental tool, eye
References tracking has been used for the analysis of low-level perceptual pro-
Ferreira F, Henderson JM, Singer M (1995) Reading and language cesses as well as for the investigation of higher cognitive processes
processing: Similarities and differences. In Henderson JM, Singer including mental arithmetic, language, and communication. One
M, Ferreira F (Eds) Reading and language processing. Erlbaum, example of the latter is research using visual world paradigm,
Hillsdale, pp 338341 which uses eye movements of language users (listeners and speakers)
Phillips C (2006) The real-time status of island phenomena. Language in order to understand cognitive processes underlying human lin-
795823 guistic communication.
Tanenhaus MK, Spivey-Knowlton MJ, Eberhard KM, Sedivy JC In the first part of my talk, I will offer a broad overview of eye
(1995) Integration of visual and linguistic information in spoken tracking methodology with a specific focus on measurements and
language comprehension. Science 268(5217):16321634 their evidential value in different cognitive domains. The second part
will discuss results of a number of recent eye-tracking studies on
sentence production and comprehension as well as number
processing.
Improving linking hypotheses in visually situated
language processing: combining eye movements
and event-related brain potentials
MANUAL ACTION
Pia Knoeferle
Bielefeld University, Germany Convenor: Dirk Koester
Bielefeld University, Germany
Listeners eye movements to objects in response to auditory verbal input,
as well as their event-related brain potentials (ERPs) have revealed that The hand is one of our most important tools for interacting with the
non-linguistic cues contribute rapidly towards real-time language com- environment, both physically and socially. Manual actions and the
prehension. While the findings from these two measures have contributed associated processes of motor control, both sensorimotor and cogni-
important insights into context effects during real-time language com- tive, have received much attention. This research strand has a focus
prehension, there is also considerable ambiguity in the linking between on the complexity of movement details (e.g. kinematics, dynamics or
comprehension processes and each of these two measures (eye degrees of freedom). At the same time, in a seemingly different
research field, manual actions have been scrutinized for their com-
municative goals or functions, so-called co-speech gestures. Here, a
3
The critical items were taken from Phillips (2006) and were focus is on what kind of information is supported by such actions;
simplified for auditory presentation. whether meaning is conveyed but also synchronization, akin to

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S17

kinematics, is currently under investigation. A tight functional inter- Our results confirmed a decline in basic components of manual
relation between manual action control and language has long been dexterity, finger force control and tactile perception, with increasing
proposed (e.g. Steklis, Harnad 1976). Not only hand movements are age, even already during middle adulthood. Also age-related changes
relevant and have to be controlled, also the environmental context in underlying neurophysiological correlates could be observed in
(i.e., the situation) has to be taken into account in order to fully middle-aged adults. Performing manual tasks on a comparable level
understand manual actions. Furthermore, technical advances permit to younger adults required more frontal (i.e. cognitive) brain resour-
also the deeper investigation of the neural basis, in addition to the ces in older workers indicating compensatory plasticity. Furthermore,
cognitive basis, of (manual) action control. Regarding other cognitive in both the motor and tactile domain expertise seemed to counteract
domains, recent evidence points towards a tight functional interaction age-related decline and to postpone age effects for about 10 years.
of grasping with other cognitive domains such as working memory or Although older adults generally performed at a lower baseline per-
attention (Spiegel et al. 2013; Logan, Fischman 2011). Whats more, formance level, they were able to improve motor and tactile
manual actions may be functional for abstract cognitive processing, functioning by short term practice or stimulation interventions. Par-
e.g., numerical reasons (as suggested by the phenomenon of finger ticularly in the tactile domain such an intervention was well suited to
counting). attenuate age-related decline. Overall, our data suggest that the aging
In this symposium we will bring together latest research that process of manual dexterity seems to start slowly but continuously
explores the manifold functions and purposes of manual actions such goes on during the working lifespan and can be compensated by
as exploring and manipulating objects, the development of such continuous use (expertise) or targeted interventions.
control processes for grasping and the changes associated with aging.
Different models of action control will be presented and evaluated.
Also, evidence for the role of manual gestures in interacting and
communicating with other people will be presented. That is, not only Planning anticipatory actions: on the interplay
the (physical) effects of manual actions in the environment will be between normative and mechanistic models
discussed but also the interpretation of gestures, i.e., communicative
goals will be debated. The symposium will shed light on new con- Oliver Herbort
cepts of and approaches to understanding the control of manual Department of Psychology, University of Wurzburg, Germany
actions and their functions in a social and interactive world.
Actions frequently foreshadow subsequent actions. For example, the
hand orientation used to grasp an object depends on the intended
References
object manipulation. Here, I examine whether such anticipatory grasp
Logan SW, Fischman MG (2011) The relationship between end-state
selections can be described purely in terms of their function or
comfort effects and memory performance in serial and free
whether the planning process also has to be taken into account. To test
recall. Acta Psychol 137:292299
functional accounts, three posture-based cost functions were used to
Spiegel MA, Koester D, Schack T (2013) The functional role of
predict grasp selection. As an example for a model of the planning
working memory in the (re-)planning and execution of grasping
process, I evaluated the recently proposed weighted integration of
movements. J Exp Psychol Human Percept Performance
multiple biases model. This model posits that grasp selection is
39:13261339
heuristically based on the direction of the intended object rotation as
Steklis HD, Harnad SR (1976) From hand to mouth: Some critical
well as other factors. The models were evaluated using two empirical
stages in the evolution of language. Annal N Y Acad Sci
datasets. The datasets were from two experiments, in which partici-
280(1):445455
pants had to grasp and rotate a dial by various angles. The models
were fitted to the empirical data of individual participants using
maximum likelihood estimates of the models free parameters. The
The Bremen-Hand-Study@Jacobs: effects of age model including the planning process provided a closer fit to the data
of both experiments than the functional accounts. Thus, human
and expertise on manual dexterity actions can only be understood as the superimposition of their func-
tion and computational artifacts imposed by the limitations of the
Ben Godde, Claudia Voelcker-Rehage central nervous system.
Jacobs Center on Lifelong Learning and Institutional Development,
Jacobs University, Bremen, Germany
A decline in manual dexterity is common in older adults and has been
demonstrated to account for much of the observed impairment in Identifying linguistic and neural levels of interaction
everyday tasks, like pouring milk into a cup, preparing meals, or between gesture and speech during comprehension
retrieving coins from a purse. Aiming at the understanding of the using EEG and fMRI
underlying mechanisms, the investigation of the regulation and
coordination of isometric fingertip forces has been given lot of
Henning Holle
attention during the last decades. Also tactile sensitivity is increas-
Department of Psychology, University of Hull, UK
ingly impaired with older age and deficits in tactile sensitivity and
perception and therefore in sensorimotor feedback loops play an Conversational gestures are hand movements that co-occur with
important role for age-related decline in manual dexterity. Within the speech but do not appear to be consciously produced by the speaker.
Bremen-Hand-Study@Jacobs our main focus was on the question of The role that these gestures play in communication is disputed, with
how age and expertise influence manual dexterity during middle some arguing that gesture adds only little information over and above
adulthood. In particular, we were interested in the capacity of older what is already transmitted by speech alone. My own work has pro-
employees to enhance their fine motor performance through practice. vided strong evidence for the alternative view, namely that gestures
To reach this goal, we investigated basic mechanisms responsible for add substantial information to the comprehension process. One level
age-related changes in precision grip control and tactile performance at which this interaction between gesture and speech takes place
as well as learning capacities (plasticity) in different age and expertise seems to be semantics, as indicated by the N400 of the Event Related
groups on a behavioral and neurophysiological (EEG) level. Potential. I will also present findings from a more recent study that

123
S18 Cogn Process (2014) 15 (Suppl 1):S1S158

has provided evidence for a syntactic interaction between gesture and activation for the surprising than for the non-surprising context in the
speech (as indexed by the P600 component). Finally, fMRI studies parietal and temporal multi-modal association cortices (ACs) that are
suggest that areas associated with the detection of semantic mis- known to process context. Fronto-insular cortex (FIC) was more
matches (left inferior frontal gyrus) and audiovisual integration (left active for surprising actions compared to non-surprising actions.
posterior temporal lobe) are crucial components of the brain network When the non-surprising action was perceived, functional connec-
for co-speech gesture comprehension. tivity between brain areas that represent action surprise and
contextual surprise was enhanced. The findings suggest that the
strength of the interregional neural coupling minimizes surprising
sensations necessary for perception of others goal-directed actions
Neural correlates of gesture-syntax interaction and provide support for a hierarchical predictive model of brain
function.
Leon Kroczek, Henning Holle, Thomas Gunter
Max-Planck-Institute for Human Cognitive and Brain Sciences,
Leipzig, Germany
In a communicative situation, gestures are an important source of
The development of cognitive and motor planning skills
information which also impact speech processing. Gesture can for in young children
instance help when speech perception is troubled by noise (Obermeier
et al. 2012) or when speech is ambiguous (Holle et al. 2007). Kathrin Wunsch1, Roland Pfister2, Anne Henning3,4,
Recently, we have shown that not only meaning, but also structural Gisa Aschersleben4, Matthias Weigelt1
1
information (syntax) used during language comprehension is influ- Department of Sport and Health, University of Paderborn, Germany;
2
enced by gestures (Holle et al. 2012). Beat gestures, which highlight Department of Psychology, University of Wurzburg, Germany;
3
particular words in a sentence, seem to be able to disambiguate Developmental Psychology, University of Health Sciences Gera,
sentences that are temporarily ambiguous with respect to their syn- Germany; 4 Department of Psychology, Saarland University,
tactic structure. Here we explored the underlying neural substrates of Germany
the gesture-syntax interaction with fMRI using similar ambiguous
The end-state comfort (ESC) effect signifies the tendency to avoid
sentence material as Holle et al. (2012). Participants were presented
uncomfortable postures at the end of goal-directed movements and
with two types of sentence structures which were either easy (Subject-
can be reliably observed during object manipulation in adults, but
Object-Verb) or more difficult (Object-Subject-Verb) in their syn-
only little is known about its development in children. Therefore, the
tactic complexity. A beat gesture was shown either at the first or the
present study investigated the development of anticipatory planning
second noun phrase (NP). Activations related to syntactic complexity
skills in children and its interdependencies with the development of
were primarily lateralized to the left (IFG, pre-SMA, pre-central
executive functions. Two hundred and seventeen participants in 9 age
gyrus, and MTG) and bilateral for the Insula. A ROI-based analysis
groups (3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-year-olds, and adults) were tested
showed interactions of syntax and gesture in the left MTG, left pre-
in three different end-state comfort tasks and three tasks to assess
SMA, and in the bilateral Insula activations. The pattern of the
executive functioning (Tower of Hanoi, Mosaic, and the D2 attention
interaction suggests that a beat on NP1 facilitates the easy SOV
endurance task). Regression analysis revealed a robust developmental
structure and inhibits the more difficult OSV structure and vice versa
trend for each individual end-state comfort task across all age groups
for a beat on NP2. Because the IFG was unaffected by beat gestures it
(all p \ .01). Somewhat surprisingly, there was no indication of
seems to play an independent/isolated role in syntax processing.
generalization across these tasks, as correlations between the three
motor tasks failed to reach significance for all age groups (p [ .05).
Furthermore, we did not observe any systematic correlation between
Interregional connectivity minimizes surprise responses performance in the end-state comfort tasks and the level of executive
during action perception functioning. Accordingly, anticipatory planning develops with age,
but the impact of executive functions on this development seems to be
rather limited. Moreover, motor planning does not seem to be a
Sasha Ondobaka, Marco Wittmann, Floris P de Lange, holistic construct, as the performance in the three different tasks was
Harold Bekkering not correlated. Further research is needed to investigate the interde-
Donders Institute for Brain, Cognition and Behavior, Radboud pendencies of sensory-motor skill development with other cognitive
University Nijmegen, Netherlands abilities.
The perception of other individuals goal-directed actions requires the
ability to process the observed bodily movements and the surrounding
environmental context at the same time. Both action and contextual
processing have been studied extensively (Iacoboni et al. 2005; PREDICTIVE PROCESSING: PHILOSOPHICAL
Shmuelof and Zohary 2005; Bar et al. 2008), yet, the neural mecha- AND NEUROSCIENTIFIC PERSPECTIVES
nisms that integrate action and contextual surprise remain elusive.
The predictive account describes action perception in terms of a
Convenor: Alex Morgan
hierarchical inference mechanism which generates prior predictions
CIN, University of Tubingen, Germany
to minimize surprise associated with incoming action and contextual
sensory input (Friston et al. 2011; Koster-Hale and Saxe 2013). Here, The idea that the brain makes fallible inferences and predictions in
we used functional neuroimaging to establish which brain circuits order to get by in a world of uncertainty is of considerable vintage,
represent action and contextual surprise and to examine the neural but it is now beginning to achieve maturity due to the development of
mechanisms that are responsible for minimizing surprise-related a range of rigorous theoretical tools rooted in Bayesian statistics that
responses (Friston 2005). Participants judged whether an action was are increasingly being used to explain various aspects of the brains
surprising or non-surprising dependent on the context in which the structure and function. The emerging Bayesian brain approach in
action took place. They first viewed a surprising or non-surprising neuroscience introduces novel ways of conceptualizing perception,
context, followed by a grasping action. The results showed greater cognition, and action. It also arguably involves novel forms of

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S19

neuroscientific explanation, such as an emphasis on statistical optimality. The science is moving rapidly, but philosophers are attempting to keep up, in order to understand how these recent developments might shed light on their traditional concerns about the nature of mind and agency, as well as concerns about the norms of psychological explanation. The purpose of this symposium is to bring together leading neuroscientists and philosophers to discuss how the Bayesian brain approach might reshape our understanding of the mind-brain, as well as our understanding of mind-brain science.

Bayesian cognitive science, unification, and explanation

Matteo Colombo
Tilburg Center for Logic and Philosophy of Science, Tilburg University, Netherlands

It is often claimed that the greatest value of the Bayesian framework in cognitive science consists in its unifying power. Several Bayesian cognitive scientists assume that unification is obviously linked to explanatory power. But this link is not obvious, as unification in science is a heterogeneous notion, which may have little to do with explanation. While a crucial feature of most adequate explanations in cognitive science is that they reveal aspects of the causal mechanism that produces the phenomenon to be explained, the kind of unification afforded by the Bayesian framework to cognitive science does not necessarily reveal aspects of a mechanism. Bayesian unification, nonetheless, can place fruitful constraints on causal-mechanical explanation.

The explanatory heft of Bayesian models of cognition

Frances Egan, Robert Matthews
Department of Philosophy, Rutgers University, USA

Bayesian models have had a dramatic impact on recent theorizing about cognitive processes, especially about those brain-environment processes directly implicated in perception and action. In this talk we critically examine the explanatory character of these models, especially in light of so-called new mechanist claims to the effect that these models are not genuinely explanatory, or at least are little more than explanation sketches. We illustrate our points with examples drawn from both classical dynamics and cognitive ethology. We conclude with a discussion of the import of these models for the presumption, common among neuropsychologists, that commonsense folk psychological concepts such as belief and desire have an important role to play in cognitive neuroscience.

Predictive processing and active inference

Karl Friston
Institute of Neurology, University College London, UK

How much about our interaction with, and experience of, our world can be deduced from basic principles? This talk reviews recent attempts to understand the self-organized behavior of embodied agents, like ourselves, as satisfying basic imperatives for sustained exchanges with the environment. In brief, one simple driving force appears to explain many aspects of action and perception. This driving force is the minimization of surprise or prediction error that, in the context of perception, corresponds to Bayes-optimal predictive coding. We will look at some of the phenomena that emerge from this principle, such as hierarchical message passing in the brain and the perceptual inference that ensues. I hope to illustrate the ensuing brain-like dynamics using models of bird songs that are based on autonomous dynamics. This provides a nice example of how dynamics can be exploited by the brain to represent and predict the sensorium that is, in many instances, generated by ourselves. I hope to conclude with an illustration of the tight relationship between the pragmatics of communication and active inference about the behavior of self and others.

Learning sensory predictions for perception and action

Axel Lindner
Hertie Institute for Clinical Brain Research, University of Tubingen, Germany

Perception and action are not only informed by incoming sensory information but also by predictions about upcoming sensory events. Such sensory predictions allow, for instance, to perceptually distinguish self- from externally produced sensations: by comparing action-based predictions with the actual sensory input, the sensory component that is produced by one's own actions can be isolated (attenuated etc.). Likewise, action-based sensory predictions allow the motor system to react more rapidly to predictable events and, thus, to be less dependent on delayed sensory feedback. I will demonstrate that the cerebellum, a structure intimately linked to plasticity within the motor domain, accounts for learning action-based sensory predictions on a short time scale. I will further show that this plasticity is not solely related to the motor domain; it also influences the way we perceptually interpret the sensory consequences of our behavior. Specifically, I will present experiments in which we use virtual reality techniques to alter the visual direction subjects associate with their pointing movements. While we were able to change the predicted visual consequences of pointing in healthy individuals, such recalibration of a sensory prediction was dramatically compromised in patients with lesions in the cerebellum. Extending these results on sensory predictions for self-produced events, I will show that the cerebellum also underlies the learning of sensory predictions about external sensory events, independent of self-action. In contrast to healthy controls, cerebellar patients were significantly impaired in learning to correctly predict the re-occurrence of a moving visual target that temporarily disappeared behind an occluder. In summary, our research suggests that the cerebellum plays a domain-general role in fine-tuning predictive models irrespective of whether sensory predictions are action-based (efference copies) or sensory-based, and irrespective of whether sensory predictions support action, perception, or both.

Layer resolution fMRI to investigate cortical feedback and predictive coding in the visual cortex

Lars Muckli
Institute of Neuroscience and Psychology, University of Glasgow, UK

David Mumford (1991) proposed a role for reciprocal topographic cortical pathways in which higher areas send abstract predictions of the world to lower cortical areas. At lower cortical areas, top-down predictions are then compared to the incoming sensory stimulation. Several questions arise within this framework: (1) do descending predictions remain abstract, or do they translate into concrete level predictions, the "language" of lower visual areas? (2) how is incoming sensory information compared to top-down predictions? Are input signals subtracted from the prediction (as proposed in the predictive
coding framework) or are they combined in some other way (as proposed by other models, i.e. biased competition or adaptive resonance theory)? Contributing to the debate of abstract or concrete level information, we aim to investigate the information content of feedback projections with functional MRI. We have exploited a strategy in which feedforward information is occluded in parts of visual cortex: i.e. along the non-stimulated apparent motion path, behind a white square that we used to occlude natural visual scenes, or by blindfolding our subjects (Muckli, Petro 2013). By presenting visual illusions, contextual scene information or by playing sounds we were able to capture feedback signals within the occluded areas of the visual cortex. MVPA analysis of the feedback signals reveals that they are more abstract than the feedforward signal. Furthermore, using high resolution MRI we found that feedback is sent to the outer cortical layers of V1. We also show that feedback to V1 can originate from auditory information processing (Vetter, Smith, Muckli 2014). We are currently developing strategies to reveal the precision and potential functions of cortical feedback. Our results link into the emerging paradigm shift that portrays the brain as a prediction machine (Clark 2013).

References
Clark A (2013) Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav Brain Sci 36(3):181-204
Muckli L, Petro L (2013) Network interactions: non-geniculate input to V1. Curr Opin Neurobiol 23(2):195-201
Mumford D (1991) On the computational architecture of the neocortex: the role of the thalamo-cortical loop. Biol Cybern 65(2):135-145
Vetter P, Smith FW, Muckli L (2014) Decoding sound and imagery content in early visual cortex. Curr Biol 24(11):1256-1262
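The MVPA step mentioned in the abstract above can be illustrated with a generic cross-validated decoding analysis. The sketch below is only a minimal stand-in using scikit-learn: the arrays voxels and scene_labels are hypothetical single-trial patterns from an occluded region and their scene identities, and the code does not reproduce the authors' pipeline.

# Minimal sketch of a cross-validated MVPA decoding analysis (not the
# authors' pipeline). `voxels` stands in for an (n_trials, n_voxels) array of
# single-trial activity patterns sampled from an occluded portion of V1, and
# `scene_labels` codes which visual scene surrounded the occluder.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
voxels = rng.normal(size=(120, 400))         # hypothetical trial-by-voxel data
scene_labels = rng.integers(0, 2, size=120)  # hypothetical scene identity

decoder = make_pipeline(StandardScaler(), LinearSVC())
scores = cross_val_score(decoder, voxels, scene_labels, cv=5)
print("decoding accuracy: %.2f (chance = 0.50)" % scores.mean())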
HOW LANGUAGE AND NUMERICAL REPRESENTATIONS CONSTITUTE MATHEMATICAL COGNITION

Convenor: Hans-Christoph Nuerk
University of Tubingen, Germany

Mathematical or numerical cognition has often been studied with little consideration of language and linguistic processes. The most basic representation, the number magnitude representation, has been viewed as amodal and non-verbal. Only in the last years has the influence of linguistic processes again received more interest in cognitive research. Now we have evidence that even the most basic tasks, like magnitude comparison and parity judgment, and even the most basic representations, such as the spatial representation of number magnitude, are influenced by language and linguistic processes.

The symposium brings together international researchers from different fields (Linguistics, Psychology and Cognitive Neuroscience) with at least three different foci within the general symposium topic: (i) How is the spatial representation of number influenced by reading and writing direction? (ii) How do number word structures of different languages influence mathematical and numerical performance? (iii) How are linguistic abilities of children and linguistic complexity of mathematical tasks related to mathematical performance?

After an overview given by the organizer, the symposium starts with a presentation by Fischer and Shaki, who have shaped the research about reading and writing influences on the relation between space and number in recent years. They give an update about explicit and implicit linguistic influences on spatial-numerical cognition. Tzelgov and Zohar-Shai may partially challenge this view, because they show that related linguistic effects, namely the Linguistic Markedness Effect, may mask seemingly observed null effects of number-space relations in Hebrew. Concluding the first part, Soltanlou, Huber, and Nuerk examine how different basic numerical effects, including the SNARC (spatial-numerical association of response codes) effect, are influenced by linguistic and other cultural properties.

The next three talks are concerned with the question of how number word structure influences numerical and mathematical processing in children and adults. In the last years, it has been shown repeatedly that intransparent number word structures specifically interfere with mathematical performance. Schiltz, Van Rinsveld, and Ugen make use of the fact that all children in Luxembourg are taught bilingually (French, German). They are therefore able to examine the influence of different number word structures in within-participant designs. They observe linguistic influences on mathematical cognition, which, however, are mediated by a child's proficiency in a given language. Dowker, Lloyd, and Roberts compare performance in the English and Welsh languages. Welsh number word structure is fully transparent (13 = ten-three; 22 = two-tens-two) for all two-digit number words, which is commonly only found in Pacific Rim countries. Since in Wales some children are taught in Welsh, and some in the less transparent English, the impact of the transparency of a number word system can be studied within one culture, thereby avoiding confounds of language and culture common in cross-cultural studies. The results suggest that children benefit in specific numerical tasks, but not in arithmetic performance in general. Finally, Bahnmueller, Goebel, Moeller, and Nuerk used eye-tracking methodology to examine, in a translingual eye-tracking study, which processes underlie linguistic influences on numerical cognition. They show that, at least for a sub-group, language guides attentional processing of multi-digit Arabic numbers in a way consistent with the number word structure.

In the final part of the symposium, Szucs examined the differential contribution of language-related and language-independent skills to mathematical performance. He observed that, on the one hand, phonological decoding skills predict mathematical performance in standardized tests, but that, on the other hand, children with pure dyscalculia do not show deficits in verbal and language functions. Finally, Daroczy, Wolska, Nuerk, and Meurers used an interdisciplinary (Linguistics, Psychology) approach to study mathematical word problems. They systematically varied linguistic and numerical complexity within one study and examined how both factors contribute to mathematical performance in this task.

The symposium concludes with a general discussion about how language and numerical representations constitute mathematical cognition.

Influences of number word inversion on multi-digit number processing: a translingual eye-tracking study

Julia Bahnmueller1,2, Silke Goebel3, Korbinian Moeller1, Hans-Christoph Nuerk1,2
1 IWM-KMRC Tubingen, Germany; 2 University of Tubingen, Germany; 3 University of York, UK

Differences in number word systems become most obvious for multi-digit numbers. Therefore, the investigation of multi-digit numbers is crucial to identify linguistic influences on number processing. One of the most common specificities of a number word system is the inversion of number words with respect to the digits of a number (e.g., the German number word for 27 is "siebenundzwanzig", *seven and twenty). While linguistic influences of the number word system have been reliably observed over the last years, the specific cognitive contributions underlying these processes are still unknown.

Therefore, the present study aimed at investigating the underlying cognitive processes and language specificities of three-digit number
processing. More specifically, it was intended to clarify to which degree three-digit number processing is influenced by parallel and/or sequential processing of the involved digits and modulated by language. English- and German-speaking participants were asked to complete a three-digit number comparison task while their response latencies as well as their eye movements were recorded. Results showed that in both language groups there were indicators of both parallel and sequential processing, with clear-cut language-based differences being observed. Reasons for the observed language-specific differences contributing to a more comprehensive understanding of mathematical cognition are discussed.

On the influence of linguistic and numerical complexity in word problems

Gabriella Daroczy1, Magdalena Wolska1, Hans-Christoph Nuerk1,2, Detmar Meurers1
1 University of Tubingen, Germany; 2 IWM-KMRC Tubingen, Germany

Word problems, in which a mathematical problem is given as a reading text before arithmetic calculation can begin, belong to the most difficult mathematical problems for children and adults. Different linguistic factors, e.g., text complexity (nominalization vs. verbal phrases), numerical factors (carry or non-carry, addition or subtraction), and the relation between linguistic text and mathematical problem (order consistency) can all contribute to the difficulty of a word problem. Our interdisciplinary group systematically varied linguistic and numerical factors in a within-participant design. The results showed that both linguistic and numerical complexity as well as their interrelation contributed to mathematical performance.

Linguistic influences on numerical understanding: the case of Welsh

Ann Dowker1, Delyth Lloyd2, Manon Roberts3
1 Dept of Experimental Psychology, University of Oxford, England; 2 University of Melbourne, Australia; 3 Worcester College, Oxford, England

It is sometimes suggested that a reason why children in Pacific Rim countries excel in mathematics is that their counting systems are highly transparent: e.g. 13 is represented by the equivalent of "ten-three"; 22 by the equivalent of "two-tens-two", etc. This may make both counting and the representation of place value easier to acquire than in many other languages. However, there are so many cultural and educational differences between, for example, the USA and China that it is hard to isolate the influence of any particular factor. In Wales, both a regular counting system (Welsh) and an irregular counting system (English) are used within a single region. Approximately 20 % of children in Wales receive their education in the Welsh medium, while following the same curriculum as those attending English medium schools. This provides an exceptional opportunity for studying the effects of the regularity of the counting system, in the absence of major confounding factors. Studies so far suggest that Welsh-speaking children do not outperform their English-speaking counterparts in all aspects of arithmetic, but that they do show superiority in some specific aspects: notably in reading and comparing 2-digit numbers, and in the precision of their non-verbal number line estimation.

Reading space into numbers: an update

Martin H. Fischer1, Samuel Shaki2
1 University of Potsdam, Germany; 2 Ariel University, Israel

Number-space associations, and the SNARC effect in particular, were extensively investigated in the past two decades. Still, their origin and directionality remain unclear. We will address the following questions: (a) Does the SNARC effect reflect recent spatial experience or long-standing directional habits? (b) Does the SNARC effect spill over from reading habits for numbers or from reading habits for words? (c) What is the contribution of other directionality cues (e.g., vertical grounding such as "more is up"; cultural metaphors)? Finally, we will consider the impact of empirical findings from an Implicit Association Test.

How language and numerical representations constitute mathematical cognition: an introductory review

Hans-Christoph Nuerk
University of Tubingen and IWM-KMRC Tubingen, Germany

Mathematical or numerical cognition has often been studied largely independently of language and linguistic processes. Only in the last years has the influence of such linguistic processes received more interest. Now it is known that even the most basic tasks, like magnitude comparison and parity judgment, and even basic representations, such as the spatial representation of number magnitude, are influenced by language and linguistic processes.

Within the general topic of language contributions to mathematical cognition and performance we can distinguish at least three different foci: (i) How is the spatial representation of number influenced by reading and writing direction? (ii) How do number word structures of different languages influence mathematical and numerical performance? (iii) How are linguistic abilities of children and linguistic complexity of mathematical tasks related to mathematical performance?

A short overview of the state of research on the above topics is given, and it is outlined which open questions are addressed in this symposium.

Language influences number processing: the case of bilingual Luxembourg

Christine Schiltz, Amandine Van Rinsveld, Sonja Ugen
Cognitive Science and Assessment Institute, University of Luxembourg, Luxembourg

In a series of studies we investigated how language affects basic number processing tasks in a German-French bilingual setting. The Luxembourg school system indeed progressively educates pupils to become German-French bilingual adults, thanks to extensive language courses in both German and French, as well as a progressive transition of the teaching language from German (dominant in primary school) to French (dominant in secondary school). Studying numerical cognition in children and adults successfully going through the Luxembourg school system thus provides an excellent opportunity to investigate how progressively developing bilingualism impacts numerical representations and computations. Studying this question in Luxembourg's German-French bilingual setting is all the more
interesting, since the decades and units of two-digit number words follow opposite structures in German (i.e. unit-decade) and French (decade-unit). In a series of experiments, pupils from grades 7, 8, 10 and 11, and adults, made magnitude comparisons and additions that were presented in different formats: Arabic digits and number words. Both tasks were performed in separate German and French testing sessions, and we recorded correct response rates and response times. The results obtained during magnitude comparison show that orally presented comparisons are performed differently by the same participants according to task language (i.e. different compatibility effects in German vs. French). For additions it appears that the level of language proficiency is crucial for the computation of complex additions, even in adults. In contrast, adults tend to retrieve simple additions equally well in both languages. Taken together, these results support the view of a strong language influence on numerical representations and computations.

Language differences in basic numerical tasks

Mojtaba Soltanlou, Stefan Huber, Hans-Christoph Nuerk
University of Tubingen and IWM-KMRC Tubingen, Germany

Connections between knowledge of language and knowledge of number have been suggested on theoretical and empirical grounds. Chomsky (1986) noted that both the sentences of a language and the numbers in a counting sequence have the property of discrete infinity, and he suggested that the same recursive device underlies both (Bloom 1994 and Hurford 1987). Numerical researchers have therefore begun to examine the influences of linguistic properties.

In this internet study, we explored adults from various countries in some basic numerical tasks consisting of symbolic and non-symbolic magnitude comparison and parity judgment, and recorded responses to determine error rates and reaction times. The results suggest that not only do distinct languages influence these kinds of tasks differentially, but that other cultural and individual factors also play an important role in numerical cognition.

Cognitive components of the mathematical processing network in primary school children: linguistic and language independent contributions

Denes Szucs
University of Cambridge, UK

We have tested the cognitive components of mathematical skill in more than one hundred 9-year-old primary school children. We aimed to separate the contributions of language-related and language-independent skills. We used 18 cognitive tests and 9 custom experiments. We identified phonological decoding efficiency and verbal intelligence as important contributors to mathematical performance (measured by standardized tests). In addition, spatial ability, visual short term and working memory were also strong predictors of arithmetic performance. Further, children with pure developmental dyscalculia only showed impaired visuo-spatial processing but no impairment in verbal and language function. The results can shed light on the differing role of language and visual function in arithmetic and on the co-morbidity of language and arithmetic disorders.

It does exist! A SNARC effect amongst native Hebrew speakers is masked by the MARC effect

Joseph Tzelgov, Bar Zohar-Shai
Ben-Gurion University of the Negev, Israel

The SNARC effect has been found mainly with participants who speak Germanic languages. The effect in these studies implies that the mental number line spreads from left to right. Therefore, it was suggested that the effect derives from the experience of writing from left to right. Commonly, studies of spatial-numerical associations in Hebrew speakers report a null SNARC effect with the standard design, in which participants are asked to perform the parity task twice, each time with a different parity-to-hand mapping. It has been argued that this is due to different reading directions of words and numbers: Hebrew is written from right to left, while numbers are written by Hebrew writers from left to right, as in Germanic languages. In this paper, we show that a SNARC effect in native Hebrew speakers does exist when the design minimizes the MARC effect. Furthermore, even though Hebrew is written from right to left, the mental number line as estimated by the SNARC effect spreads from left to right, as in Germanic languages. These findings challenge the assumption that direction of reading is the main source of the direction of spatial-numerical association.

MODELING OF COGNITIVE ASPECTS OF MOBILE INTERACTION

Convenors: Nele Russwinkel, Sabine Prezenski, Stefan Lindner
TU Berlin, Germany

Interacting with mobile devices is gaining more and more importance in our daily life. Using those devices provides huge comfort, but nevertheless entails specific challenges. In contrast to the classical home computer setting, mobile device usage is more prone to disruptions, more influenced by time pressure and more likely to be affected by earlier interaction experiences. An important issue in this context consists in interfaces that fit the users' cognitive abilities best. These abilities display a high variety between different groups of users. How can developers and designers adapt an interface to meet the users' skills and preferences? For these purposes, cognitive modeling provides an appealing opportunity to gain insights into the users' skills and cognitive processes. It offers a theoretical framework as well as a computational platform for testing theories and deriving predictions.

The scope of this symposium lies in introducing selected approaches to user modeling and showing their application to the domain of mobile interaction. In this context we are particularly interested in criteria like learnability and efficiency from a cognitive as well as a technical point of view. Moreover, research concerning individual differences, interruption and expectancy is presented. Overall, we aim to show that the mobile interaction scenario offers an interesting research area to test modeling approaches in real-life applications, but also to discuss cognitive processes that are relevant within those tasks. We will look at these different cognitive aspects of mobile interaction and the role of modeling in building cognitively appropriate applications.

Creating cognitive user models on the basis of abstract user interface models

Marc Halbrugge
TU Berlin, Germany

The recent explosion of mobile appliances creates new challenges not only for application developers and content creators, but also for usability professionals. Conducting a classical usability study of a mobile user interface (UI) on an exhaustive number of devices is more or less impossible. One approach to tackle the engineering side of the problem is model-based user interface development, where an abstract UI model is adapted to the target device at runtime (Calvary et al. 2003). When this method is applied, the application flow is modeled first and user controls are abstractly identified by their roles therein (e.g. command, choice, output). The elements of the final UI as presented to the users (e.g. buttons, switches, labels) are all representations of those, enriched by physical properties like position, size, and textual content.

While knowing the sizes and positions of the UI elements already allows predictions of completion times for previously specified tasks, e.g. by creating simple cognitive models using CogTool (John et al. 2004), the additional information encoded into the abstract UI model allows one to go much further. It contains machine-readable knowledge about the application logic and the UI elements that are to be visited to attain a specified goal, which creates a significant opportunity for machine translation into more precise cognitive models (Quade et al. 2014). In this talk, I will show how completion time predictions can be improved based on abstract UI model information. Data from two empirical studies with a kitchen assistance application are presented to illustrate the method and quantify the gain in prediction accuracy.

References
Calvary G, Coutaz J, Thevenin D, Limbourg Q, Bouillon L, Vanderdonckt J (2003) A unifying reference framework for multi-target user interfaces. Interact Comput 15(3):289-308
John BE, Prevas K, Salvucci DD, Koedinger K (2004) Predictive human performance modeling made easy. In: CHI '04: Proceedings of the SIGCHI conference on human factors in computing systems. ACM Press, pp 455-462
Quade M, Halbrugge M, Engelbrecht KP, Albayrak S, Moller S (2014) Predicting task execution times by deriving enhanced cognitive models from user interface development models. In: Proceedings of EICS 2014, Rome, Italy (in press)
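How an abstract UI model can feed a completion-time prediction may be easier to see with a small sketch. The toy Python example below only illustrates the general idea under simplifying assumptions: the AbstractElement record, the task path and the Fitts'-law constants are all made up, and the code does not reproduce the abstract UI format or the model-generation tooling used in the talk.

# Toy sketch: an abstract UI element enriched with physical properties, plus a
# naive completion-time estimate for a task path. Illustration only; element
# names, roles and constants are hypothetical.
from dataclasses import dataclass
from math import hypot, log2
from typing import List

@dataclass
class AbstractElement:
    name: str
    role: str          # e.g. "command", "choice", "output"
    x: float           # center position in mm
    y: float
    width: float       # target size in mm

def fitts_time(distance_mm: float, width_mm: float,
               a: float = 0.1, b: float = 0.15) -> float:
    """Fitts'-law pointing-time estimate in seconds (illustrative constants)."""
    return a + b * log2(distance_mm / width_mm + 1)

def predict_completion_time(path: List[AbstractElement]) -> float:
    """Sum pointing times along the sequence of elements a task has to visit."""
    total, pos = 0.0, (0.0, 0.0)
    for el in path:
        total += fitts_time(hypot(el.x - pos[0], el.y - pos[1]), el.width)
        pos = (el.x, el.y)
    return total

# Hypothetical task: open a recipe list, pick an entry, confirm.
task = [AbstractElement("recipes", "command", 20, 90, 12),
        AbstractElement("entry_3", "choice", 40, 55, 8),
        AbstractElement("ok", "command", 70, 10, 10)]
print("predicted completion time: %.2f s" % predict_completion_time(task))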
Expectations during smartphone application use

Stefan Lindner
TU Berlin, Germany

Expectations serve a multitude of purposes and play a large role in the adoption and use of new technological devices. I will briefly discuss a classification of expectations, implementation ideas in ACT-R, and their role during smartphone app use.

In a general sense, expectations coordinate our goals and desires with the current and the future state of the environment. They are necessary for any kind of intentions, help in action preparation (Umbach et al. 2012), and play a prominent role in action-perception feedback loops (Friston, Kiebel 2009).

Experience-based expectations are expectations that result from the individual learning history. Both the utility and activation mechanisms of ACT-R can be interpreted as reflecting experience-based expectations about our environment. One possible way to model the formation of experience-based expectations from past experiences using the partial matching and blending algorithms of ACT-R is described in Kurup et al. (2012). Other implementations are possible (Lindner, Russwinkel 2013). Universal expectations are expectations that result from the universally inherited pre-structuring of the environment. In ACT-R, universal expectations are in part already reflected in the modeler's decisions regarding the content of the model environment, memory items and production elements.

Both types of expectations play a dynamic role during the adaptation and use of a technical device. Using a new smartphone app, users will first rely on general expectations derived from past use of other smartphone apps or computer programs. Universal expectations, especially in the form of assumed form-function contingencies, play an important role in this phase as well. With time, however, users will increasingly rely on expectations that are in line with specific knowledge acquired during use.

References
Friston K, Kiebel S (2009) Predictive coding under the free-energy principle. Philos Trans R Soc Biol Sci 364:1211-1221
Kurup U, Lebiere C, Stentz A, Hebert M (2012) Using expectations to drive cognitive behavior. In: Proceedings of the 26th AAAI conference on artificial intelligence
Lindner S, Russwinkel N (2013) Modeling of expectations and surprise in ACT-R. In: Proceedings of the 12th international conference on cognitive modeling, pp 161-166. Available online: http://iccm-conference.org/2013-proceedings/papers/0027/index.html
Umbach VJ, Schwager S, Frensch PA, Gaschler R (2012) Does explicit expectation really affect preparation? Front Psychol 3:378. doi:10.3389/fpsyg.2012.00378

Evaluating the usability of a smartphone application with ACT-R

Sabine Prezenski
TU Berlin, Germany

The potential of using ACT-R-based (Anderson 2007) cognitive models for evaluating different aspects of usability is demonstrated using a shopping list application for Android.

Smartphone applications are part of our everyday life. A successful application should meet the standard of usability as defined in EN ISO 9241-110 (2008) and EN ISO 9241-11 (1999). In general, usability testing is laborious and requires vast resources. In this work, we demonstrate how cognitive models can answer important questions concerning efficiency, learnability and experience in a less demanding and rather effective way. Further, we outline how cognitive models provide explanations about the underlying cognitive mechanisms which affect usability.

Two different versions of a shopping list application (Russwinkel and Prezenski 2014) are evaluated. The versions have a similar appearance but differ in menu-depth. User tests were conducted, and an ACT-R model able to interact with the application was designed. The task of the user and of the model, respectively, consists in selecting products for a shopping list. In order to discover potential learning effects, repetition of the task was required.

User data show that for both versions time on task decreases as user experience increases. The version with more menu-depth is less efficient for novice users. The influence of menu-depth decreases as user experience increases. Learning transfer between different versions is also found. Time on task for different conditions is approximately the same for real users and the model. Furthermore, our model is able to explain the effects displayed in the data. The learning effect is explained through the building of application-specific knowledge chunks in the model's declarative memory. These application-specific knowledge chunks further resolve why expertise is more important than menu-depth.

References
Anderson JR (2007) How can the human mind occur in the physical universe? Oxford University Press, New York, p 304
EN ISO 9241-110 (2008) Ergonomics of human-system interaction. Part 110: Dialogue principles (ISO 9241-110:2006). International Organization for Standardization, Genf
EN ISO 9241-11 (1999) Ergonomic requirements for office work with visual display terminals (VDTs). Part 11: Guidance on usability. International Organization for Standardization, Genf
Russwinkel N, Prezenski S (2014) ACT-R meets usability. In: Proceedings of the sixth international conference on advanced cognitive technologies and applications. COGNITIVE
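The practice effect that the preceding abstract attributes to newly built declarative chunks can be illustrated with ACT-R's standard base-level learning equation, B_i = ln(sum_j t_j^(-d)), in which activation grows with each additional use of a chunk while retrieval latency shrinks as F*exp(-A). The stand-alone Python sketch below only illustrates these textbook equations with made-up retrieval times; it is not the shopping-list model described above.

# Minimal illustration of ACT-R's base-level learning equation,
#   B_i = ln( sum_j t_j ** (-d) ),
# where t_j are the times since each past use of a chunk and d is the decay
# parameter (default 0.5). Retrieval latency F * exp(-A) shrinks as activation
# grows, which is the kind of practice effect exploited above. Stand-alone
# sketch; not the actual task models presented in the talks.
from math import exp, log

def base_level_activation(use_times, now, d=0.5):
    return log(sum((now - u) ** (-d) for u in use_times if now > u))

def retrieval_time(activation, latency_factor=1.0):
    return latency_factor * exp(-activation)

# A chunk for one menu item, used once per task repetition (times in seconds).
uses = []
for trial, t in enumerate([0, 60, 120, 180, 240], start=1):
    uses.append(t)
    a = base_level_activation(uses, now=t + 30)
    print(f"trial {trial}: activation = {a:+.2f}, "
          f"retrieval time = {retrieval_time(a):.2f} s")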

Simulating interaction effects of incongruous mental models

Matthias Schulz
TU Berlin, Germany

Traditional usability evaluations involving older adults are difficult to conduct (Dickinson et al. 2007), and the results may also be misleading, as often only the cognitively and physically fittest seniors participate (Hawthorn 2000). In addition to this, older adults often lack experience in using modern devices (Hanson 2011). Furthermore, it is reasonable to assume that older adults often have problems operating new devices if they inappropriately transfer prior experience using other devices (Arning, Ziefle 2007). Such an inappropriate transfer would result in an increase of wrong or redundant interaction steps, which in turn may lead to unintended actions being recognized by the system (Bradley et al. 2011).

To simulate the effects of incongruous mental models or the inappropriate transfer of prior experience using other devices, an existing tool for automatic usability evaluation, the MeMo workbench, was extended. The goal of the enhancement was to simulate the interaction of users with a smartphone, including mistakes and slips. According to Reason (1990, p 12 ff.), mistakes, lapses, and slips are the primary error types which can be used to classify errors in human-computer interaction. To simulate mistakes, errors which result from incongruous mental models or inappropriately transferring prior experience, a new processing module was added. This processing module uses 4 generalized linear models (GLMs) to compute what kind of interaction the user model intends to apply to the touchscreen. To simulate slips, we added a new execution module which computes the probability that the user model interaction is not executed as intended (e.g. missing a button when trying to hit it).

Our results show that it is possible to simulate interaction errors (slips and mistakes) and to describe interaction parameters for younger and older adults operating a touchscreen by using the improved MeMo workbench.

References
Arning K, Ziefle M (2007) Understanding age differences in PDA acceptance and performance. Comput Human Behav 23(6):2904-2927
Bradley M, Langdon P, Clarkson P (2011) Older user errors in handheld touchscreen devices: to what extent is prediction possible? In: Stephanidis C (ed) Universal access in human computer interaction. Users diversity, vol 6766 of lecture notes in computer science. Springer, Berlin, pp 131-139
Dickinson A, Arnott JL, Prior S (2007) Methods for human-computer interaction research with older people. Behav Inf Technol 26(4):343-352
Hanson VL (2011) Technology skill and age: what will be the same 20 years from now? Univ Access Inf Soc 10:443-452
Hawthorn D (2000) Possible implications of aging for interface designers. Interact Comput 12(5):507-528
Reason JT (1990) Human error. Cambridge University Press, Cambridge
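As a rough illustration of the slip-simulation idea in the abstract above (a module that estimates the probability that an intended touch is not executed as planned), the sketch below uses a single logistic model with invented coefficients and predictors (target size, pointing distance, age group). It is an assumption-laden toy, not one of the four GLMs in the extended MeMo workbench.

# Illustrative logistic (GLM-style) slip model: probability that an intended
# tap misses its target, as a function of target size, pointing distance and
# age group. Coefficients and predictors are made up for illustration.
from math import exp

def slip_probability(target_mm: float, distance_mm: float,
                     older_adult: bool) -> float:
    # logit(p) = b0 + b1*size + b2*distance + b3*older
    b0, b1, b2, b3 = -1.5, -0.25, 0.01, 0.8   # hypothetical coefficients
    logit = b0 + b1 * target_mm + b2 * distance_mm + b3 * older_adult
    return 1.0 / (1.0 + exp(-logit))

for size in (6, 9, 12):
    print(f"{size} mm button: "
          f"younger {slip_probability(size, 40, False):.2f}, "
          f"older {slip_probability(size, 40, True):.2f}")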
"Special offer! Wanna buy a trout?" Modeling user interruption and resumption strategies with ACT-R

Maria Wirzberger
TU Berlin, Germany

Interruption is a frequently appearing phenomenon users have to deal with in interactions with technical systems. Especially when using mobile applications on smartphones, they are confronted with a variety of distractors, induced by the system itself (e.g., product advertisement, system crash) or resulting from the mobile context (e.g., motion, road traffic). Such interruptions might be critical especially in periods of already enhanced demands on working memory, resulting in increased experienced workload. Based on a time course model of interruption and resumption to a main task, developed by Altmann and colleagues (e.g., Altmann, Trafton 2004), this research explores an interruption scenario due to product advertisement while using a simple shopping app. Product advertisement is an omnipresent and at the same time cognitively demanding kind of interruption, as it forces a decision for or against the offered product.

We developed an ACT-R model able to perform an interrupted product selection task under alternating workload conditions, resuming by either cognitively or visually tying in with the product selection. In brief, the task consists of searching and selecting a set of predefined products in several runs, while being interrupted by product advertisement at certain times. Different levels of workload are induced by shopping for one vs. three people. Model validation is performed experimentally with a sample of human participants, assessing workload by collecting pupil dilation data.

Our main focus of analysis consists in how execution and resumption performance differ with workload, and what strategies users apply to react to interruptions. In detail, we expect impaired task performance and extended resumption times with increasing workload. Moreover, strategies while resuming the product selection might differ in terms of varying workload levels. Important results concerning the assumed effects will be addressed within this talk.

References
Altmann EM, Trafton JG (2004) Task interruption: resumption lag and the role of cues. In: Proceedings of the 26th annual conference of the Cognitive Science Society, Chicago, Illinois

Tutorials

Introduction to probabilistic modeling and rational analysis

Organizer: Frank Jakel
University of Osnabruck, Germany

The first part of the course is a basic introduction to probability theory from a Bayesian perspective, covering conditional probability, independence, Bayes' rule, coherence, calibration, expectation, and decision-making. We will also discuss how Bayesian inference differs from frequentist inference. In the second part of the course we will discuss why Bayesian decision theory provides a good starting point for probabilistic models of perception and cognition. The focus here will be on rational analysis and ideal observer models that provide an analysis of the task, the environment, the background assumptions and the limitations of the cognitive system under study. We will go through several examples, from signal detection to categorization, to illustrate the approach.
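For readers who want to preview the formal core of the course, the two identities underlying its first and second part are Bayes' rule and the expected-utility (Bayes-optimal) decision rule. The notation below is generic and not taken from the course materials:

\[
P(h \mid d) = \frac{P(d \mid h)\,P(h)}{\sum_{h'} P(d \mid h')\,P(h')},
\qquad
a^{*} = \arg\max_{a} \sum_{h} U(a, h)\,P(h \mid d).
\]

Here h ranges over hypotheses, d is the observed data, and U(a, h) is the utility of action a when h is true; rational analyses and ideal observer models instantiate these quantities for a specific task and environment.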
Modeling vision

Organizer: Heiko Neumann
University of Ulm, Germany

Models of neural mechanisms underlying perception can provide links between experimental data from different modalities such as, e.g., psychophysics, neurophysiology, and brain imaging. Here we focus on visual perception.

The tutorial is structured into three parts. In the first part the role of models in vision science is motivated. Models can be used to formulate hypotheses and knowledge about the visual system that can be subsequently tested in experiments which, in turn, may lead to model improvements. Modeling vision can be described at various levels of abstraction and using different approaches (first-principles approaches, phenomenological models, dynamical systems). In the second part specific models of early and mid-level vision are reviewed, addressing topics such as, e.g., contrast and motion detection, perceptual grouping, motion integration, figure-ground segregation, surface perception, and optical flow. The third part focuses on higher-level form and motion processing and building learning-based representations. In particular, object recognition, biological/articulated motion perception, and attention selection are considered.

Visualization of eye tracking data

Organizer: Michael Raschke
Contributors: Tanja Blascheck, Michael Burch, Kuno Kurzhals, Hermann Pfluger
University of Stuttgart, Germany

Apart from measuring completion times and recording accuracy rates of correctly given answers during the performance of visual tasks, eye tracking experiments provide an additional technique to analyze how the attention of an observer changes on a presented stimulus. Besides using statistical algorithms to compare eye tracking metrics, visualization techniques allow us to visually analyze different aspects of the recorded data. However, in most cases only state-of-the-art visualization techniques are used, such as scan path or heat map visualizations.

In this tutorial we will present an overview of further existing visualization techniques for eye tracking data and demonstrate their application in different user experiments and use cases. The tutorial will present three topics of eye tracking visualization:
1.) Visualization for supporting the general analysis process of a user experiment.
2.) Visualization for static and dynamic stimuli.
3.) Visualization for understanding cognitive and perceptual processes and refining parameters for cognition and perception simulations.

This tutorial is designed for researchers who are interested in eye tracking in general or in applying eye tracking techniques in user experiments. Additionally, the tutorial could be of interest for psychologists and cognitive scientists who would like to evaluate and refine cognition and perception simulations. It is suitable for PhD students as well as for experienced researchers. The tutorial requires a minimal level of prerequisites. Fundamental concepts of eye tracking and visualization will be explained during the tutorial.

Introduction to cognitive modelling with ACT-R

Organizers: Nele Russwinkel, Sabine Prezenski, Fabian Joeres, Stefan Lindner, Marc Halbrugge
Contributors: Fabian Joeres, Maria Wirzberger
Technische Universitat Berlin, Germany

ACT-R is the implementation of a theory of human cognition. It has a very active and diverse community that uses the architecture in laboratory tasks as well as in applied research. ACT-R is oriented on the organization of the brain and is called a hybrid architecture because it holds symbolic and subsymbolic components. The aim of working on cognitive models with a cognitive architecture is to understand how humans produce intelligent behavior.

In this tutorial the cognitive architecture ACT-R is introduced (Anderson 2007). In the beginning we will give a short introduction to the background, structure and scope of ACT-R. Then we would like to start with hands-on examples of how cognitive models are written in ACT-R.

At the end of the tutorial we will give a short overview of recent work and its benefit for applied cognitive science.

References
Anderson JR (2007) How can the human mind occur in the physical universe? Oxford University Press, New York

Dynamic Field Theory: from sensorimotor behaviors to grounded spatial language

Organizers: Yulia Sandamirskaya, Sebastian Schneegans
Ruhr University Bochum, Germany

Dynamic Field Theory (DFT) is a conceptual and mathematical framework in which cognitive processes are grounded in sensorimotor behavior through the continuous-in-time and continuous-in-space dynamics of Dynamic Neural Fields (DNFs). DFT originates in Dynamical Systems thinking, which postulates that the moment-to-moment behavior of an embodied agent is generated by attractor dynamics, driven by sensory inputs and interactions between dynamic variables. Dynamic Neural Fields add representational power to the Dynamical Systems framework through DNFs, which formalize the dynamics of neuronal populations in terms of activation functions defined over behaviorally relevant parameter spaces. DFT has been successfully used to account
for the development of visual and spatial working memory, executive control, scene representation, spatial language, and word learning, as well as to guide the behavior of autonomous cognitive robots. In the tutorial, we will cover the basic concepts of Dynamic Field Theory in several short lectures. The topics will be: the attractors and instabilities that model elementary cognitive functions; the couplings between DNFs and multidimensional DNFs; coordinate transformations and the coupling of DNFs to sensory and motor systems; and autonomy within DFT. We will show, on an exemplary architecture for the generation of flexible spatial language behaviors, how DNF architectures may be linked to sensors and motors and generate real-world behavior autonomously. The same architecture may be used to account for behavioral findings on spatial language. The tutorial will include a hands-on session to familiarize participants with the MATLAB software framework COSIVINA, which allows complex DNF architectures to be built with little programming overhead.
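To make the field dynamics mentioned above concrete, the sketch below simulates a single one-dimensional Amari-style neural field in plain Python/NumPy rather than in COSIVINA; the kernel, resting level and stimulus values are chosen only for illustration and are not taken from the tutorial materials.

# Minimal 1-D Dynamic Neural Field (Amari-style), plain NumPy rather than the
# COSIVINA framework mentioned above; parameters are illustrative only.
#   tau * du/dt = -u + h + s(x, t) + integral w(x - x') f(u(x')) dx'
import numpy as np

n, dt, tau, h = 101, 1.0, 20.0, -5.0            # field size, time step, resting level
x = np.arange(n)

def gauss(center, sigma, amp):
    return amp * np.exp(-0.5 * ((x - center) / sigma) ** 2)

# lateral interaction kernel: local excitation, broader inhibition
kernel = gauss(n // 2, 4, 10.0) - gauss(n // 2, 12, 4.0)
f = lambda u: 1.0 / (1.0 + np.exp(-u))          # sigmoidal output nonlinearity

u = np.full(n, h, dtype=float)                  # field activation
stimulus = gauss(center=30, sigma=5, amp=7.0)   # localized input

for step in range(400):
    interaction = np.convolve(f(u), kernel, mode="same")
    u += (dt / tau) * (-u + h + stimulus + interaction)

peak = int(np.argmax(u))
print(f"self-stabilized peak at x = {peak}, u(peak) = {u[peak]:.1f}")

With a Mexican-hat kernel like this one, a transient input leaves behind a self-sustained activation peak, which is the kind of attractor state the tutorial uses to model elementary cognitive functions such as working memory.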

Poster presentations

The effect of language on spatial asymmetry in image perception

Zaeinab Afsari, Jose Ossandon, Peter Konig
Osnabruck University, Germany

Image viewing studies recently revealed that healthy participants demonstrated a leftward spatial bias while performing a free viewing task. This leftward gaze bias has been suggested to be due to lateralization in the cortical attention system, but the horizontal spatial bias during free viewing of images might alternatively be explained or influenced by reading direction. Four eye-tracking experiments were conducted, using different groups of bilingual subjects and paragraph primes with different reading directions. Participants first read a text and subsequently freely viewed nine images while their eye movements were recorded. Experiment 1 investigates the effect of reading direction among bilingual participants with right-to-left (RTL) and left-to-right (LTR) text primes. Those participants were native Arabic/Urdu speakers. In concordance with previous studies, after reading an LTR prime, a leftward shift in the first second of image exploration was observed. In contrast, after reading RTL text primes, participants displayed a rightward spatial bias. This result demonstrates that the reading direction of text primes influences later exploration of complex stimuli. In Experiment 2, we investigated whether this effect was due to a systematic influence of native vs. secondary language, independently of the direction of reading. For this purpose, we measured German/English bilinguals with German/English LTR reading direction text stimuli. Here, participants showed a leftward spatial bias after reading LTR texts in either case. This demonstrates that, for the present purpose, the difference between primary and secondary language is not important. In Experiment 3, we investigated the relative influence of scanning direction and actual reading direction. LTR bilingual participants were presented with normal (LTR) and mirrored left-to-right (mLTR) texts. Upon reading the primes, reading direction differed markedly, reflecting mirrored and not mirrored conditions. However, we did not observe significant differences in the leftward bias. The bias is even slightly stronger after reading mLTR. This experiment demonstrates that the actual scanning direction did not influence the asymmetry on later complex image stimuli. In Experiment 4, we studied the effect of reading direction among bilingual participants with LTR as primary language and RTL as secondary language. These participants were native Germans and Arabic Germans who learned Arabic mainly later in life. They showed a leftward bias after reading both LTR and RTL text primes. In conclusion, although it seems like the reading direction was the main factor for modulating the perceptual bias, there could be another explanation: the innate laterality systems in our brain (left-lateralized linguistic and right-lateralized attentional systems) play a role in increasing/decreasing the bias.

Towards formally founded ACT-R simulation and analysis

Rebecca Albrecht1, Michael Giewein2, Bernd Westphal2
1 Center for Cognitive Science, University of Freiburg, Germany; 2 Software Engineering, University of Freiburg, Germany

Abstract
The semantics of the ACT-R cognitive architecture is today defined by the ACT-R interpreter. As a result, re-implementations of ACT-R which, e.g., intend to provide a more concise syntax cannot be proven correct. We present a re-implementation of ACT-R which is based on a formal abstract semantics of ACT-R.

Keywords
ACT-R Implementation, Formal Semantics

Introduction
ACT-R (Anderson 1983, 2007) is a widely used cognitive architecture. It provides an agent programming language to create a cognitive model and an interpreter to execute the model. A model consists of a set of chunk types, a set of production rules, and the definition of an initial cognitive state. An execution of a model is a sequence of time-stamped cognitive states where one cognitive state is obtained by the execution of a production rule on its predecessor in the sequence.

Over the past thirty years the ACT-R interpreter has been extended and changed immensely based on findings in psychological research. Unfortunately, the relations between concepts of the ACT-R theory and the implementation of the ACT-R interpreter are not always clear. So today, strictly speaking, only the Lisp source code of the ACT-R interpreter defines the exact semantics of an ACT-R model, and it is often felt that modelers merely "write computer code that mimics the human data" (Stewart, West 2007). Due to this situation, it is unnecessarily hard to compare different ACT-R models for similar tasks, and ACT-R modelling is often perceived to be rather inefficient and error-prone (Morgan, Haynes, Ritter, Cohen 2005) in the literature. To overcome these problems, we propose a formal abstract syntax and semantics for the ACT-R cognitive architecture (Albrecht 2013; Albrecht, Westphal 2014b). The semantics of an ACT-R model is the transition system which describes all possible computations of that model.

In this work, we report on a proof-of-concept implementation of the formal semantics given in Albrecht (2013), which demonstrates a formally founded approach to ACT-R model execution and provides a basis for new orthogonal analyses of (partial) ACT-R models, e.g., for the feasibility of certain sequences of rule executions (Albrecht, Westphal 2014a).

Related Work
Closest to our work is the deconstruction and reconstruction of ACT-R by Stewart and West (2007). Their work aims to ease the evaluation of variations in the structure of computational models of cognition. To this end, they analyzed the Lisp implementation of the ACT-R 6 interpreter and re-engineered it, striving to clarify fundamental concepts of ACT-R. To describe these fundamental concepts they use the Python programming language and obtain another working ACT-R interpreter called Python ACT-R. To validate Python ACT-R, they statistically compare predictions of both implementations on a set of ACT-R models.

In our opinion, firstly, there should be an abstract, formal definition of ACT-R syntax and semantics to describe fundamental concepts. Only secondly should another interpreter be implemented based on this formal foundation, which may, as Python ACT-R does, also offer a more convenient concrete syntax for ACT-R models. This two-step approach in particular allows one to not only test but formally verify that each re-implementation conforms to the formal semantics.

The ACT-UP (Reitter, Lebiere 2010) toolbox for rapid prototyping of complex models is likewise not built on a formal basis. ACT-UP offers higher-level means to access fundamental concepts of the ACT-R theory for more efficient modelling, but the aim is not to clarify these fundamental concepts. Re-implementations of ACT-R in the Java programming language (jACT-R 2010; ACT-R: The Java Simulation and Development Environment 2013) have the main purpose of making ACT-R accessible for other applications written in Java. They do not contribute to a more detailed understanding of basic concepts of the ACT-R theory.

Implementation
We implemented the formal ACT-R semantics provided by Albrecht (2013) and Albrecht and Westphal (2014b) in the Lisp dialect Clojure, which targets the Java Virtual Machine (JVM). As a Lisp dialect, it makes it possible to establish a close relation between the formalization and the implementation. By targeting the JVM, our approach subsumes the work of Buttner (2010) without the need for TCP/IP-based interprocess communication.

In the formal semantics the signature for the abstract syntax is described using relation symbols, function symbols, and variables. Chunk types are given as functions and production rules as tuples over the signature. An ACT-R architecture is defined as a set of interpretation functions for the symbols used in the signature. The components can be directly translated into a Clojure implementation.

The current implementation supports ACT-R tutorial examples for base-level learning and spreading activation using its own declarative module (Giewein 2014). The results of the ACT-R 6 interpreter are reproduced up to small rounding errors.

Conclusion
Our implementation of an ACT-R interpreter based on a formal semantics of ACT-R demonstrates the feasibility of the two-step approach of separating the clarification of fundamental concepts from a re-implementation. In future work, we plan to extend our implementation to support further models. Technically, our choice of Clojure allows us to more conveniently interface Java code and cognitive models. Conceptually, we plan to use our implementation as a basis for more convenient modelling languages and as an intermediate format for new, exhaustive analyses of cognitive models based on model-checking techniques and constraint solvers.

References
ACT-R: The Java Simulation and Development Environment (2013) Retrieved from http://cog.cs.drexel.edu/act-r/about.html, 16 May 2014
Albrecht R (2013) Towards a formal description of the ACT-R unified theory of cognition. Unpublished master's thesis, Albert-Ludwigs-Universitat Freiburg
Albrecht R, Westphal B (2014a) Analyzing psychological theories with F-ACT-R. In: Proceedings of KogWis 2014, to appear
Albrecht R, Westphal B (2014b) F-ACT-R: defining the architectural space. In: Proceedings of KogWis 2014, to appear
Anderson JR (1983) The architecture of cognition, vol 5. Psychology Press
Anderson JR (2007) How can the human mind occur in the physical universe? Oxford University Press, Oxford
Buttner P (2010) Hello Java! Linking ACT-R 6 with a Java simulation. In: Proceedings of the 10th international conference on cognitive modeling, pp 289-290
Giewein M (2014) Formalisierung und Implementierung des deklarativen Moduls der kognitiven Architektur ACT-R [Formalization and implementation of the declarative module of the cognitive architecture ACT-R]. Bachelor's thesis, Albert-Ludwigs-Universitat Freiburg
jACT-R (2010) Retrieved from http://jactr.org, 16 May 2014
Morgan GP, Haynes SR, Ritter FE, Cohen MA (2005) Increasing efficiency of the development of user models. In: SIEDS, pp 82-89
Reitter D, Lebiere C (2010) Accountable modeling in ACT-UP, a scalable, rapid-prototyping ACT-R implementation. In: Proceedings of the 10th international conference on cognitive modeling (ICCM), pp 199-204
Stewart TC, West RL (2007) Deconstructing and reconstructing ACT-R: exploring the architectural space. Cogn Syst Res 8(3):227-236
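To give a flavour of the state-transition view described in the abstract above (an execution is a sequence of cognitive states, each obtained by applying a production rule to its predecessor), the following is a deliberately simplified Python paraphrase. It is not the F-ACT-R formalization or its Clojure implementation; it ignores buffers, timing and all subsymbolic quantities, and the toy rules are invented.

# Deliberately simplified paraphrase of the state-transition view of ACT-R
# execution: a cognitive state is a goal chunk, a production rule is a
# (name, condition, action) tuple, and an execution is the sequence of states
# produced by repeatedly applying the first matching rule. Illustration only;
# not the F-ACT-R formalization (no buffers, no timing, no subsymbolic layer).
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]                      # a chunk: slot -> value
Rule = Tuple[str, Callable[[State], bool], Callable[[State], State]]

def run(state: State, rules: List[Rule], max_steps: int = 20) -> List[State]:
    trace = [state]
    for _ in range(max_steps):
        fired = next((r for r in rules if r[1](state)), None)
        if fired is None:                      # no rule matches: model halts
            break
        state = fired[2](state)                # successor cognitive state
        trace.append(state)
    return trace

# Toy count-up model: two rules over a goal chunk of type "count".
rules: List[Rule] = [
    ("stop",      lambda s: not s["done"] and s["current"] >= s["target"],
                  lambda s: {**s, "done": True}),
    ("increment", lambda s: not s["done"] and s["current"] < s["target"],
                  lambda s: {**s, "current": s["current"] + 1}),
]
for step, s in enumerate(run({"type": "count", "current": 2, "target": 5,
                              "done": False}, rules)):
    print(step, s)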
Press multigraph which includes a set of vertices Vp which represent all
Anderson JR (2007) How can the human mind occur in the physical states traversed by any participant or the cognitive model, a set of
universe? Oxford University Press, Oxford edges Ep which represent the application of legal actions in the task, a
Buttner P (2010) Hello Java! Linking ACT-R 6 with a Java simula set of initial states Sp , Vp and a set of goal states Gp , Vp Note that
tion. In: Proceedings of the 10th international conference on the strategy graph may include multiple edges (for different agents)
cognitive modeling, pp 289290 between two states. An example for a partial strategy graph with a
Giewein M (2014) Formalisierung und Implementierung des de planning depths of three in a task from the Rush Hour planning
klarativen Moduls der kognitiven Architektur ACT-R. (Bache- domain (Flake, Baum 2002) is shown in Fig. 1.
lors Thesis, Albert- Ludwigs-Universitat Freiburg) Secondly, parameter values for which the cognitive model best
jACT-R (2010). Retrieved from http://jactr.org, 16 May 2014 replicates human participants strategies are identified based on
Morgan GP, Haynes SR, Ritter FE, Cohen MA (2005) Increasing effi similarity measures calculated for each pair of parameter values and
ciency of the development of user models. In SIEDS, pp 8289 human participants. The similarity of two strategies is restricted to
Reitter D, Lebiere C (2010) Accountable modeling in ACT-UP, a values between 0 and 1 and is calculated based on strategies given in
scalable, rapid-prototyping ACT-R implementation. In: Pro- the strategy graph. In the evaluation, each participant is assigned to a
ceedings ofthe 10th international conference on cognitive set of parameter values where the cognitive models strategy is
modeling (ICCM), pp 199204 maximally similar to the participants strategy. The parameter values
Stewart TC, West RL (2007) Deconstructing and reconstructing assigned to participants are identified as the planning profile of the
ACT-R: exploring the architectural space. Cogn Syst Res participant. In this step, several similarity measures are possible, e.g.
8(3):227236 the Waterman-Smith algorithm (Smith, Waterman 1981).
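To make the strategy-graph representation concrete, the following minimal Python sketch shows one possible container for such a graph, with vertices, agent-labelled multi-edges, and the initial and goal state sets named above. The class and method names are illustrative assumptions and are not taken from the authors' implementation.

    from dataclasses import dataclass, field

    @dataclass
    class StrategyGraph:
        """Illustrative strategy graph for one problem instance p."""
        vertices: set = field(default_factory=set)        # V_p: all states traversed by any agent
        edges: list = field(default_factory=list)         # E_p: (source, target, agent) labelled multi-edges
        initial_states: set = field(default_factory=set)  # S_p, a subset of V_p
        goal_states: set = field(default_factory=set)     # G_p, a subset of V_p

        def add_strategy(self, agent, path):
            """Insert one strategy, i.e. the state sequence walked by a participant or model run."""
            self.vertices.update(path)
            self.initial_states.add(path[0])
            self.edges.extend((s, t, agent) for s, t in zip(path, path[1:]))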


Fig. 1 Example of a partial strategy graph for a planning depth of three in the Rush Hour problem domain. States are possible Rush Hour board configurations. Dashed edges indicate moves on optimal solution paths. Solid edges represent moves of participants in a behavioral experiment or moves of cognitive agents. The circle around the state in the center of the figure indicates a so-called decision point where several moves can be considered optimal. The dashed game objects in problem states at the bottom of the figure indicate the game object which was moved

With respect to the presented method, the quality of the cognitive model is given by the mean similarity of strategies used by participants and strategies used by the cognitive model for the best replicating parameter settings.
Preliminary Evaluation Results
We evaluated the proposed method preliminarily in the Rush Hour planning domain (Steffenhagen, Albrecht, Ragni 2014). Human data was collected in a psychological experiment with 20 participants solving 22 different Rush Hour tasks. The cognitive model was programmed to use means-end analysis (Faltings, Pu 1992; McDermott 1996) with different parameters to control local planning behavior with respect to assumed individual factors. Similarity was calculated with the Smith-Waterman algorithm for local sequence alignment (Smith, Waterman 1981). For each of the 20 participants, a set of parameter values controlling the cognitive model was identified (1) constantly over all tasks and (2) for each task separately. The evaluation reveals that this cognitive model can predict 44 % of human strategies for (1) and 76 % of human strategies for (2).
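As an illustration of this evaluation step, the sketch below computes a Smith-Waterman-style local alignment score between two state sequences and scales it to the interval [0, 1], then picks the best-matching parameter setting for a participant. The scoring values and the normalisation by the length of the shorter sequence are assumptions made for the example, not necessarily the exact choices used by the authors.

    def strategy_similarity(a, b, match=1.0, mismatch=-1.0, gap=-1.0):
        """Smith-Waterman local alignment of two non-empty state sequences, scaled to [0, 1]."""
        h = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
        best = 0.0
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                score = match if a[i - 1] == b[j - 1] else mismatch
                h[i][j] = max(0.0, h[i - 1][j - 1] + score,
                              h[i - 1][j] + gap, h[i][j - 1] + gap)
                best = max(best, h[i][j])
        return best / (match * min(len(a), len(b)))  # one possible normalisation to [0, 1]

    def planning_profile(participant_path, model_paths_by_params):
        """Parameter values whose model strategy is maximally similar to the participant's strategy."""
        return max(model_paths_by_params,
                   key=lambda p: strategy_similarity(participant_path, model_paths_by_params[p]))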
Conclusion
We present a method to qualitatively evaluate cognitive models by analyzing user strategies, i.e. sequences of states traversed in the solution of a task. The state space of a planning problem, e.g. the Rush Hour problem space, might be very complex. As a result, user strategies and, therefore, the underlying cognitive processes cannot be analyzed by hand. With the presented method, human strategies are analyzed automatically by identifying cognitive models which traverse the same problem states as human participants.
In cognitive architectures, numerical parameters are often used to control the concrete behavior of a cognitive model, e.g. the decay rate in ACT-R. Often, these parameters also influence the planning strategies of a model. Although parameter values may differ between individuals, they are usually held constant over all executions of the model. With respect to the outlined similarity measure it is possible to analyze which parameter values induce strategies similar to those of an individual.
Acknowledgment
This work has been supported by a grant to Marco Ragni within the project R8-[CSpace] within the SFB/TR 8 Spatial Cognition.
References
Faltings B, Pu P (1992) Applying means-ends analysis to spatial planning. In: Proceedings of the 1991 IEEE/RSJ international workshop on intelligent robots and systems, pp 80–85
Flake GW, Baum EB (2002) Rush hour is PSPACE-complete, or why you should generously tip parking lot attendants. Theor Comput Sci 270:895–911
McDermott D (1996) A heuristic estimator for means-ends analysis in planning. In: Proceedings of the 3rd international conference on AI planning systems, pp 142–149
Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147:195–197
Steffenhagen F, Albrecht R, Ragni M (2014) Automatic identification of human strategies by cognitive agents. In: Proceedings of the 37th German conference on artificial intelligence, to appear

Simulating events. The empirical side of the event-state distinction

Simone Alex-Ruf
University of Tübingen, Germany

Since Vendler (1957) an overwhelming amount of theoretical work on the categorization of situations concerning their lexical aspect has emerged within linguistics. Telicity, change of state, and punctuality vs. durativity are the main features used to distinguish between events and states. Thus, the VPs (verbal phrases) in (1) describe atelic stative situations, the VPs in (2) telic events:
(1) to love somebody, to be small
(2) to run a mile, to reach the top
Although there are so many theories about what constitutes an event or a state, the empirical studies concerning this question can be counted on one hand. This is quite surprising, since the notion of lexical aspect is a central issue within verb semantics. Even more surprising is the fact that these few studies provide results pointing in completely opposite directions:
The studies in Stockall et al. (2010) and Coll-Florit and Gennari (2011) report shorter RTs (reaction times) to events than to states and, therefore, suggest that the processing of events is easier. In contrast, Gennari and Poeppel (2003) found shorter RTs after reading states than after reading events. They explain this result by the higher level of complexity in the semantics of verbs describing events, which requires longer processing times.
A closer look at these studies, however, reveals that in nearly all of them different verbs or VPs were compared: Gennari and Poeppel (2003), for example, used eventive VPs like to interrupt my father and stative VPs like to resemble my mother. One could argue that these two VPs not only differ in their lexical aspect, but perhaps also in their emotional valence and in the way the referent described by the direct object is affected by the whole situation, and that these features may therefore have acted as confounding variables, influencing the results in an undesirable way.
To avoid this problem, in the present study German ambiguous verbs were used: Depending on the context, verbs like füllen (to fill), schmücken (to decorate) and bedecken (to cover) lead to an eventive or a stative reading. With these verbs sentence pairs were created, consisting of an eventive (3) and a stative sentence (4) (= target items). The two sentences of one pair differed only in their grammatical subject, but contained the same verb and direct object:
Target items:
(3) Der Konditor/füllt/die Form/[…].
The confectioner/fills/the pan/[…].
(4) Der Teig/füllt/die Form/[…].
The dough/fills/the pan/[…].
In a self-paced reading study participants had to read these sentences phrase-by-phrase and in 50 % of all trials answer a comprehension question concerning the content of the sentence.


Note that in the event sentences all referents described by the grammatical subjects were animate, whereas in the state sentences all subjects were inanimate. Many empirical studies investigating animacy suggest that animate objects are remembered better than inanimate objects (see, for example, Bonin et al. 2014). Therefore, shorter RTs on the subject position of event sentences than of state sentences were expected, resulting in a main effect of animacy. Since this effect could influence the potential event-state effect measured on the verb position as a spillover effect, control items containing the same subjects, but different, non-ambiguous verbs like stehen (to stand) were added:
Control items:
(5) Der Konditor/steht/hinter der Theke/[…].
The confectioner/stands/behind the counter/[…].
(6) Der Teig/steht/hinter der Theke/[…].
The dough/stands/behind the counter/[…].
The results confirmed this assumption: Mean RT measured on the subject position was significantly shorter for the animate than for the inanimate referents, F(1, 56) = 9.65, p = .003 (587 vs. 602 ms). Within the control items, this animacy effect influenced the RTs on the verb position: After animate subjects, RTs on the (non-ambiguous) verb were shorter than after inanimate subjects (502 vs. 515 ms), revealing the expected spillover effect.
However, within the target items mean RT measured on the position of the (ambiguous) verb showed the opposite pattern: After animate subjects it was significantly longer than after inanimate subjects, F(1, 56) = 4.12, p = .047 (534 vs. 520 ms). Here no spillover effect emerged, but a main effect which can be attributed to the different lexical aspect of the two situation types.
If processing times are indeed longer for events than for states, how could this effect be explained? The simulation account, proposed, for example, by Glenberg and Kaschak (2002) and Zwaan (2004), provides an elegant solution. A strong simulation view of comprehension suggests that the mental representation of a described situation comes about in exactly the same way as when this situation is perceived in real time. This means that language is "made meaningful by cognitively simulating the actions implied by sentences" (Glenberg and Kaschak 2002:595).
Imagine what is simulated during the processing of a state like the dough fills the pan: The simulation contains a pan and some dough in this pan, but nothing more. In contrast, the simulation of an event like the confectioner fills the pan not only requires additional participants like the confectioner and perhaps a spatula, but also action (of the confectioner), movement (of the confectioner, the dough, and the spatula), situation change (from an empty to a full pan) and a relevant time course. The simulation of a state can be envisioned as a picture; for the simulation of an event, a film is needed. In short, the simulation evoked by an event is more complex than that of a state.
Under the assumption that a simulation constitutes at least a part of the mental representation of a situation, it seems comprehensible that the complexity of such a simulation has an influence on its processing and that the higher degree of complexity in the simulation of events leads to longer RTs.
References
Bonin P, Gelin M, Bugaiska A (2014) Animates are better remembered than inanimates: further evidence from word and picture stimuli. Mem Cognit 42:370–382. doi:10.3758/s13421-013-0368-8
Coll-Florit M, Gennari SP (2011) Time in language: event duration in language comprehension. Cogn Psychol 62:41–79
Gennari SP, Poeppel D (2003) Processing correlates of lexical semantic complexity. Cognition 89:B27–B41
Glenberg AM, Kaschak MP (2002) Grounding language in action. Psychon Bull Rev 9:558–565
Stockall L, Husband EM, Beretta A (2010) The online composition of events. Queen Mary's Occasional Papers Advancing Linguistics 19
Vendler Z (1957) Verbs and times. Philosoph Rev 66:143–160
Zwaan RA (2004) The immersed experiencer: toward an embodied theory of language comprehension. In: Ross BH (ed) The psychology of learning and motivation, vol 44. Academic Press, New York, pp 35–62

On the use of computational analogy-engines in modeling examples from teaching and education

Tarek R. Besold
Institute of Cognitive Science, University of Osnabrück, Germany

Abstract
The importance of analogy for human cognition and learning has widely been recognized, and analogy-based methods are also being explicitly integrated into the canon of approved education and teaching techniques. Still, the actual level of knowledge about analogy as an instructional means and device is, as of today, rather low. In this summary report on preliminary results from an ongoing project, I propose the use of computational analogy-engines as methodological tools in this domain of research, additionally motivating this attempt at connecting AI and the learning sciences by two worked application case studies.
Keywords
Computational Analogy-making, Artificial Intelligence, Education, Cognitive Modeling, Computational Modeling
Introduction: Analogy in Education and Cognitive Modeling
Analogical reasoning (i.e., the ability of perceiving and operating on dissimilar domains as similar with respect to certain aspects based on shared commonalities in relational structure or appearance) is considered essential for learning abstract concepts (Gentner et al. 2001) and in general for children's process of learning about the world (Goswami 2001).
Concerning an educational context, analogies facilitate learners' construction processes of new ideas and conceptions on the grounds of already available concepts (Duit 1991), and can be used for facilitating the understanding of concepts and procedures in abstract and formal domains such as mathematics, physics or science (Guerra-Ramos 2011). Still, analogy is not a cure-all, as unsuccessful analogies may produce misunderstandings and can result in harmful misconceptions (Clement 1993).
Analogy has also been actively investigated in artificial intelligence (AI), bringing forth numerous computational frameworks and systems for automated analogy-making and analogical reasoning. And indeed, computational analogy frameworks have also found entrance into the context of education and teaching: for instance, in Thagard et al. (1989) the authors present a theory and implementation of analogical mapping that applies to explanations of unfamiliar phenomena as e.g. used by chemistry teachers, and Forbus et al. (1997) show how an information-level model of analogical inferences can be incorporated in a case-based coach that is being added to an intelligent learning environment. Siegler (1989) conjectures how the Structure-Mapping Engine (Falkenhainer et al. 1989) could be used to gain insights about developmental aspects of analogy use.
Analogy Engines in the Classroom: Worked Examples
Building on the outcome of these and similar research efforts, in Besold (2013) I firstly advocated expanding research applying analogy-engines to problems from teaching and education into a proper program in its own right, opening up a new application domain to computational analogy-making.


In order to provide factual grounding and initial worked examples for the possible applications of computational analogy-engines, Besold (2013) and Besold et al. (2013) feature two case studies. In both cases, the Heuristic-Driven Theory Projection (HDTP) analogy-making framework (Schmidt et al. 2014) was applied to modeling real-world examples taken from a classroom context. Besold (2013) provides an HDTP model of the string circuit analogy for gaining a basic understanding of electric current (Guerra-Ramos 2011), used in science classes for 8 to 9 year old children. Besold et al. (2013) give a detailed and fairly complex formal model of the analogy-based Calculation Circular Staircase (Schwank et al. 2005), applied in teaching basic arithmetic and the conception of the naturals as ordinal numbers to children attending their initial mathematics classes in primary school. The Calculation Circular Staircase (i.e., a teaching tool shaped like a circular staircase with the steps being made up by incrementally increasing stacks of balls, grouped in expanding circles of ten stacks per circle corresponding to the decimal ordering over the naturals) offers children a means of developing an understanding of the interpretation of numbers as results of transformation operations by enabling a mental, functional, motor skill-based way of accessing the foundational construction principles of the number space and the corresponding basic arithmetic operations. The HDTP model gives a precise account of how the structure of the staircase and the declarative counting procedure memorized by children in school interact in bringing forth the targeted conception of the natural number space. Summarizing, in both case studies the respective formal model proves to be highly useful in uncovering the underlying structure of the method or teaching tool, together with the consecutive steps of reasoning happening on the level of computational theory.
Conclusion
By providing a detailed formal description of the involved domains and their relation in terms of their joint generalization and the corresponding possibility for knowledge transfer, our models try to explicate the structural relations and governing laws underlying the respective teaching tools. We also point out how the identified constructive and transformation-based conceptualizations can then provide support and a deeper-rooted model for the children's initially very flat and sparse conceptions of the corresponding domains.
In general, modeling educational analogies sheds new light on a particular analogy, in terms of which information is transferred, what the limitations of the analogy are, whether it makes unhelpful mappings, and what potential extensions might be needed. On this basis, we hope to acquire a deeper understanding of the basic principles and mechanisms underlying analogy-based learning in fairly high-level and abstract domains.
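As a toy illustration of the "joint generalization" idea underlying such models, the sketch below computes a least general generalization of two terms represented as nested tuples. This is a drastic simplification for exposition only: HDTP itself operates on sorted first-order theories with restricted higher-order anti-unification (Schmidt et al. 2014), and the example domain terms are invented.

    def anti_unify(s, t, subs=None):
        """Least general generalization of two terms given as nested (functor, args...) tuples."""
        if subs is None:
            subs = {}
        if s == t:
            return s
        if (isinstance(s, tuple) and isinstance(t, tuple)
                and len(s) == len(t) and s[0] == t[0]):
            # same functor and arity: generalize argument by argument
            return (s[0],) + tuple(anti_unify(a, b, subs) for a, b in zip(s[1:], t[1:]))
        # mismatch: introduce (and reuse) a shared variable for this pair of subterms
        return subs.setdefault((s, t), "X{}".format(len(subs)))

    # Invented example of shared relational structure across two domains:
    # anti_unify(("causes", ("flow", "water"), ("turn", "wheel")),
    #            ("causes", ("flow", "current"), ("light", "bulb")))
    # -> ("causes", ("flow", "X0"), "X1")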
References
Besold TR (2013) Analogy engines in classroom teaching: modeling the string circuit analogy. In: Proceedings of the AAAI Spring 2013 symposium on creativity and (early) cognitive development
Besold TR, Pease A, Schmidt M (2013) Analogy and arithmetics: an HDTP-based model of the calculation circular staircase. In: Proceedings of the 35th annual meeting of the cognitive science society. Cognitive Science Society, Austin, TX
Clement J (1993) Using bridging analogies and anchoring intuitions to deal with students' preconceptions in physics. J Res Sci Teach 30:1241–1257
Duit R (1991) The role of analogies and metaphors in learning science. Sci Educ 75(6):649–672
Falkenhainer B, Forbus K, Gentner D (1989) The structure-mapping engine: algorithm and examples. Artif Intell 41:1–63
Forbus K, Gentner D, Everett J, Wu M (1997) Towards a computational model of evaluating and using analogical inferences. In: Proceedings of the 19th annual conference of the cognitive science society, pp 229–234
Gentner D, Holyoak K, Kokinov B (eds) (2001) The analogical mind: perspectives from cognitive science. MIT Press
Goswami U (2001) Analogical reasoning in children. In: Gentner D, Holyoak K, Kokinov B (eds) The analogical mind: perspectives from cognitive science. MIT Press, pp 437–470
Guerra-Ramos M (2011) Analogies as tools for meaning making in elementary science education: how do they work in classroom settings? Eurasia J Math Sci Technol Educ 7(1):29–39
Schmidt M, Krumnack U, Gust H, Kühnberger KU (2014) Heuristic-driven theory projection: an overview. In: Prade H, Richard G (eds) Computational approaches to analogical reasoning: current trends. Springer, Berlin, pp 163–194
Schwank I, Aring A, Blocksdorf K (2005) Betreten erwünscht – die Rechenwendeltreppe [Entering welcome – the calculation circular staircase]. In: Beiträge zum Mathematikunterricht. Franzbecker, Hildesheim
Siegler R (1989) Mechanisms of cognitive development. Annu Rev Psychol 40:353–379
Thagard P, Cohen D, Holyoak K (1989) Chemical analogies: two kinds of explanation. In: Proceedings of the 11th international joint conference on artificial intelligence, pp 819–824

Brain network states affect the processing and perception of tactile near-threshold stimuli

Christoph Braun 1,2,3,4, Anja Wühle 1,5, Gianpaolo Demarchi 3, Gianpiero Monittola 3, Tzvetan Popov 6, Julia Frey 3, Nathan Weisz 3
1 MEG-Center, University of Tübingen, Germany; 2 CIN, Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Germany; 3 CIMeC, Center for Mind/Brain Sciences, University of Trento, Italy; 4 Department of Psychology and Cognitive Science, University of Trento, Italy; 5 CEA, DSV/I2BM, NeuroSpin Center, F-91191 Gif-sur-Yvette, France; INSERM, U992, Cognitive Neuroimaging Unit, F-91191 Gif-sur-Yvette, France; Univ Paris-Sud, Cognitive Neuroimaging Unit, F-91191 Gif-sur-Yvette, France; 6 Radboud University Nijmegen, Donders Institute for Brain, Cognition, and Behavior, 6500 HE Nijmegen, The Netherlands

Introduction
Driving a biological or technical system to its limits reveals more detailed information about its functional principles than testing it at its standard range of operation. We applied this idea to get a better understanding of how tactile information is processed along the somatosensory pathway. To get insight into what makes a stimulus become conscious, i.e. reportable, we studied the cortical processing of near-threshold touch stimuli that are either perceived (hits) or not (misses). Following the concept of win2con proposed by Weisz et al. (2014), we tested the hypothesis that the state of functional connectivity within the sensory network determines up to which level in the sensory processing hierarchy misses are processed as compared to hits. The level of sensory processing was inferred by studying somatosensory evoked responses elicited by the near-threshold stimuli. Since the amplitudes of near-threshold somatosensory stimuli are low, a paired-pulse paradigm was used in which inhibitory effects of the near-threshold stimuli onto a subsequently applied supra-threshold stimulus were assessed. Results show that the state of a widespread cortical network prior to the application of the tactile stimulus is crucial for a tactile stimulus to elicit activation of SII and to be finally perceived.
Subjects and Methods
Twelve healthy subjects participated in the study. Using a piezo-electric stimulator (Quaerosys, Schotten, Germany) we applied tactile stimuli to the tip of the index finger of the left hand. Intensities of the near-threshold stimuli were adjusted to subjects' personal sensory thresholds using a staircase procedure.


The near-threshold stimulus was followed by a supra-threshold stimulus to probe the cortical activation of the first stimulus. Control conditions in which the first stimulus was omitted and in which the first stimulus was delivered at supra-threshold intensities were also added. Subjects reported in all trials how many stimuli they had perceived.
Pre-stimulus network states and post-stimulus cortical processing of the sensory input were studied by means of magnetoencephalography. To assess the cortical network prior to stimulation, source activity was estimated for nodes of an equally spaced grid and all-to-all imaginary coherence was calculated. Alterations in power and graph-theoretical network parameters were estimated. Since secondary somatosensory cortex (SII) appears to play a crucial role in the processing of consciously perceived tactile stimuli, we used it as a seed region for identifying the related brain network. In order to assess post-stimulus processing and its dependency on the pre-stimulus network state, evoked responses were recorded. Since evoked responses to near-threshold stimulation are rather weak, the activation induced by the near-threshold stimulus was probed by subsequently applying a supra-threshold test stimulus. To determine the source activity, a spatio-temporal dipole model with one source for primary somatosensory cortex (SI) contralateral to the stimulation site and two dipoles for ipsi- and contralateral SII was used. The model was applied to both the direct evoked responses of the near-threshold stimuli and the activation evoked by the probe stimulus. Since the duration of activation differs across the different sensory brain areas, in the paired-pulse approach varying ISIs of 30, 60, and 150 ms between the near-threshold and the test stimulus allowed for probing the sensory processing of the near-threshold stimulus at different levels (Wühle et al. 2010).
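For readers unfamiliar with the connectivity measure mentioned above, the sketch below estimates the imaginary part of coherency between two source time courses from Welch-style averaged cross-spectra. Window length, overlap and taper are illustrative choices, and the function is a generic textbook estimator rather than the authors' analysis pipeline.

    import numpy as np

    def imaginary_coherence(x, y, fs, nperseg=256):
        """Imaginary part of coherency between two signals x, y of shape (n_epochs, n_samples)."""
        win = np.hanning(nperseg)
        sxx = syy = sxy = 0.0
        n = 0
        for xe, ye in zip(x, y):
            for start in range(0, xe.size - nperseg + 1, nperseg // 2):  # 50 % overlap
                X = np.fft.rfft(win * xe[start:start + nperseg])
                Y = np.fft.rfft(win * ye[start:start + nperseg])
                sxx += (X * np.conj(X)).real
                syy += (Y * np.conj(Y)).real
                sxy += X * np.conj(Y)
                n += 1
        coherency = (sxy / n) / np.sqrt((sxx / n) * (syy / n))
        return np.fft.rfftfreq(nperseg, d=1.0 / fs), np.abs(coherency.imag)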
Results
Network analysis for the prestimulus period yielded increased alpha power in trials in which the near-threshold stimulus was not detected. On a global level, brain networks appeared to be more strongly clustered for misses than for hits. In contrast, on a local level, clustering coefficients were stronger for hits than for misses, in particular for contralateral SII. A detailed analysis of the connectedness of SII revealed that, except for connections to the precuneus, SII was more strongly connected to other brain areas such as ipsilateral inferior frontal/anterior temporal cortex and middle frontal gyrus for hits than for misses. Results suggest that the state of the prestimulus somatosensory network, involving particularly middle frontal gyrus, cingulate cortex and fronto-temporal regions, determines whether near-threshold tactile stimuli elicit activation of SII and are subsequently perceived and reported.
Studying poststimulus activation, no significant difference between hits and misses was found on the level of SI, neither for the direct evoked response of the near-threshold stimulus nor for its effects on the subsequent probe stimulus. In contrast, on the level of SII a significant difference between hits and misses could be shown in response to the near-threshold stimuli. Moreover, the SII response to the probe stimulus was inhibited by the previously applied near-threshold stimulus exclusively at an ISI of 150 ms, but not for shorter ISIs (Wühle et al. 2011).
Discussion
The study reported here emphasizes the importance of the prestimulus state of brain networks for the subsequent activation of brain regions involved in higher-level stimulus processing and for the conscious perception of sensory input. In tactile stimulus processing, secondary somatosensory cortex appears to be the critical region that is embedded in a wide brain network and that is relevant for the gating of sensory input to higher-level analysis. This finding corresponds with the established view that processing of sensory information in SII is strongly modulated by top-down control. Network analyses indicated that the sensory network involving SII, middle frontal gyrus, cingulate cortex and fronto-temporal brain regions has to be distinguished from the global brain network. For stimuli to be perceived consciously, it seems that the sensory network has to reveal increased coupling in a local (clustering) as well as a long-range (efficiency) sense.
Combining a sensory task at the limit of sensory performance with elaborated techniques for brain network analyses and the study of brain activation, the current study provided insight into the interaction between brain network states, brain activation and conscious stimulus perception.
References
Weisz N, Wühle A, Monittola G, Demarchi G, Frey J, Popov T, Braun C (2014) Prestimulus oscillatory power and connectivity patterns predispose conscious somatosensory perception. Proc Natl Acad Sci USA 111(4):E417–E425
Wühle A, Mertiens L, Rüter J, Ostwald D, Braun C (2010) Cortical processing of near-threshold tactile stimuli: an MEG study. Psychophysiology 47(3):523–534
Wühle A, Preissl H, Braun C (2011) Cortical processing of near-threshold tactile stimuli in a paired-stimulus paradigm – an MEG study. Eur J Neurosci 34(4):641–651

A model for dynamic minimal mentalizing in dialogue

Hendrik Buschmeier, Stefan Kopp
Social Cognitive Systems Group, CITEC and Faculty of Technology, Bielefeld University, Germany

Spontaneous dialogue is a highly interactive endeavor in which interlocutors constantly influence each other's actions. As addressees they provide feedback of perception, understanding, acceptance, and attitude (Allwood et al. 1992). As speakers they adapt their speech to the perceived needs of the addressee, propose new terms and names, make creative references, draw upon established and known-to-be-shared knowledge, etc. This makes dialogue a joint activity (Clark 1996) whose outcome is not determined up front but shaped by the interlocutors while the interaction unfolds over time.
One of the tasks interlocutors need to carry out while being engaged in a dialogue is keeping track of the dialogue information state. This is usually considered to be a rich representation of the dialogue context, most importantly including which information is grounded and which is still pending to be grounded (and potentially much more information; see, e.g., Ginzburg 2012). Whether such a detailed representation of the information state is necessary – and whether it is a cognitively plausible assumption – for participating in dialogue is a topic of ongoing debate.
On the one hand, Brennan and Clark (Brennan and Clark 1996; Clark 1996) state that speakers maintain a detailed model of common ground and design their utterances to the exact needs of their communication partners – even to the extent that approximate versions of mutual knowledge may be necessary to explain certain dialogue phenomena (Clark and Marshall 1981). On the other hand, Pickering and Garrod (2004) argue that – for reasons of efficiency – dialogue cannot involve heavy inference on common ground, but is an automatic process that relies on priming and activation of linguistic representations and uses interactive repair upon miscommunication. A position that falls in between this dichotomy is Galati and Brennan's (2010) lightweight one-bit partner model (e.g., has the addressee heard this before or not) that can be used instead of full common ground when producing an utterance.


We propose that interlocutors in dialogue engage in dynamic minimal mentalizing, a process that goes beyond the single properties in the focus of Galati and Brennan's (2010) one-bit model, but is comparable in computational efficiency. We assume that speakers maintain a probabilistic, multidimensional (consisting of a fixed number of state variables), and dynamic attributed listener state (Buschmeier and Kopp 2012). We model this as a dynamic Bayesian network representation (see Fig. 2) that is continuously updated by the addressee's communicative feedback (i.e., short verbal-vocal expressions such as "uh-huh", "yeah", "huh?"; head gestures; facial expressions) seen as evidence of understanding in response to ongoing utterances.
The proposed model is multidimensional because it represents the listener's mental state of listening in terms of the various communicative functions that can be expressed in feedback (Allwood et al. 1992): is the listener in contact? Is he or she willing and able to perceive and understand what is said? And does he or she accept the message and agree to it? Instead of making a decision conditioned on the question whether the interlocutor has heard something before, this model makes it possible to use the still computationally feasible but richer knowledge of whether he or she has likely perceived, understood, etc. a previously made utterance.
Further, the model is fully probabilistic since the attributed mental states are modelled in a Bayesian network. Each dimension is represented as a random variable and the probabilities over the states of each variable (e.g., low, medium, high understanding) are interpreted in terms of the speaker's degree of belief in the addressee being in a specific state. This is a graded form of common ground (Brown-Schmidt 2012) and presupposition (e.g., this knowledge is most likely in the common ground; see variables GR and GR′ in Fig. 2), which can be accommodated by, e.g., interactively leaving information out or adding redundant information, or by making information pragmatically implicit or explicit.
Finally, since the model is based on a dynamic Bayesian network, the interpretation of incoming feedback signals from the addressee is influenced by the current belief state, and changes of the attributed listener state are tracked over time. Representing these dynamics provides speakers with a broader basis for production choices as well as enabling strategic placement of feedback elicitation cues based on informational needs. It also allows for a prediction of the addressee's likely future mental state, thus enabling anticipatory adaptation of upcoming utterances.
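To illustrate the kind of update such a model performs, the sketch below tracks a single attributed-state dimension (graded understanding) with a discrete Bayesian filter: a transition step between time slices followed by conditioning on observed feedback. All numbers, state names and feedback categories are invented for the example; the actual model is multidimensional and its parameters are not reproduced here.

    import numpy as np

    STATES = ["low", "medium", "high"]           # graded understanding attributed to the listener
    TRANSITION = np.array([[0.80, 0.15, 0.05],   # illustrative persistence between time slices
                           [0.10, 0.80, 0.10],
                           [0.05, 0.15, 0.80]])
    LIKELIHOOD = {"uh-huh": np.array([0.10, 0.40, 0.50]),   # P(feedback | understanding), invented
                  "huh?":   np.array([0.70, 0.25, 0.05]),
                  "none":   np.array([0.30, 0.40, 0.30])}

    def update(belief, feedback):
        """One dynamic-Bayesian-network style step: predict, then weigh by feedback evidence."""
        predicted = TRANSITION.T @ belief
        posterior = predicted * LIKELIHOOD[feedback]
        return posterior / posterior.sum()

    belief = np.full(3, 1 / 3)                    # speaker's initial degree of belief
    for fb in ["none", "uh-huh", "huh?"]:
        belief = update(belief, fb)
        print(dict(zip(STATES, belief.round(2))))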
In current work, the model of dynamic minimal mentalizing is being applied and evaluated in a virtual conversational agent that is able to interpret its user's communicative feedback and adapt its own language accordingly (Buschmeier and Kopp 2011, 2014).

Fig. 1 The dynamic Bayesian network model for dynamic minimal mentalizing. The network consists of the mental state variables for contact (C), perception (P), understanding (U), acceptance (AC), agreement (AG), and groundedness (GR) attributed to the listener

Acknowledgments
This research is supported by the Deutsche Forschungsgemeinschaft (DFG) through the Center of Excellence EXC 277 Cognitive Interaction Technology.
References
Allwood J, Nivre J, Ahlsén E (1992) On the semantics and pragmatics of linguistic feedback. J Semant 9:1–26. doi:10.1093/jos/9.1.1
Brennan SE, Clark HH (1996) Conceptual pacts and lexical choice in conversation. J Exp Psychol Learn Memory Cogn 22:1482–1493. doi:10.1037/0278-7393.22.6.1482
Brown-Schmidt S (2012) Beyond common and privileged: gradient representations of common ground in real-time language use. Lang Cogn Process 62–89. doi:10.1080/01690965.2010.543363
Buschmeier H, Kopp S (2011) Towards conversational agents that attend to and adapt to communicative user feedback. In: Proceedings of the 11th international conference on intelligent virtual agents, Reykjavik, Iceland, pp 169–182. doi:10.1007/978-3-642-23974-8_19
Buschmeier H, Kopp S (2012) Using a Bayesian model of the listener to unveil the dialogue information state. In: SemDial 2012: proceedings of the 16th workshop on the semantics and pragmatics of dialogue, Paris, France, pp 12–20
Buschmeier H, Kopp S (2014) When to elicit feedback in dialogue: towards a model based on the information needs of speakers. In: Proceedings of the 14th international conference on intelligent virtual agents, Boston, MA, USA, pp 71–80
Clark HH (1996) Using language. Cambridge University Press, Cambridge. doi:10.1017/CBO9780511620539
Clark HH, Marshall CR (1981) Definite reference and mutual knowledge. In: Joshi AK, Webber BL, Sag IA (eds) Elements of discourse understanding. Cambridge University Press, Cambridge, pp 10–63
Galati A, Brennan SE (2010) Attenuating information in spoken communication: for the speaker, or for the addressee? J Memory Lang 62:35–51. doi:10.1016/j.jml.2009.09.002
Ginzburg J (2012) The interactive stance. Oxford University Press, Oxford
Pickering MJ, Garrod S (2004) Toward a mechanistic psychology of dialogue. Behav Brain Sci 27:169–226. doi:10.1017/S0140525X04000056

Actions revealing cooperation: predicting cooperativeness in social dilemmas from the observation of everyday actions

Dong-Seon Chang, Heinrich H. Bülthoff, Stephan de la Rosa
Max Planck Institute for Biological Cybernetics, Dept. of Human Perception, Cognition and Action, Tübingen, Germany

Introduction
Human actions contain an extensive array of socially relevant information. Previous studies have shown that even brief exposure to visually observed human actions can lead to accurate predictions of the goals or intentions accompanying human actions. For example, motion kinematics can enable predicting the success of a basketball shot, or whether a hand movement is carried out with cooperative or competitive intentions. It has also been reported that gestures accompanying a conversation can serve as a rich source of information for decision making when judging the trustworthiness of another person. Based on these previous findings we wondered whether humans could actually predict the cooperativeness of another individual by identifying visible social cues. Would it be possible to predict the cooperativeness of a person by just observing everyday actions such as walking or running?


We hypothesized that even brief excerpts of human actions depicted and presented as biological motion cues (i.e. point-light figures) would provide sufficient information to predict cooperativeness. Using a motion-capture technique and a game-theoretical interaction setup we explored whether prediction of cooperation was possible merely by observing biological motion cues of everyday actions, and which actions enabled these predictions.
Methods
We recorded six different human actions – walking, running, greeting, table tennis playing, choreographed dancing (Macarena) and spontaneous dancing – in normal participants using an inertia-based motion capture system. We used motion capture technology (MVN Motion Capture Suit from Xsens, Netherlands) to record all actions. A total of 12 participants (6 male, 6 female) participated in motion recording. All actions were then post-processed to short movies (ca. 5 s) showing point-light stimuli. These actions were then evaluated by 24 other participants in terms of personality traits such as cooperativeness and trustworthiness, on a Likert scale ranging from 1 to 7. The original participants who provided the recorded actions then returned a few months later to be tested for their actual cooperativeness performance. They were given standard social dilemmas used in game theory such as the give-some game, stag hunt game, and public goods game. In those interaction games, they were asked to exchange or give tokens to another player, and depending on their choices they were able to win or lose an additional amount of money. The choice of behavior for each participant was then recorded and coded for cooperativeness. This cooperativeness performance was then compared with the perceived cooperativeness based on the different ratings of their actions performed and evaluated by other participants.
Results and Discussion
Preliminary results showed a significant correlation between cooperativeness ratings and actual cooperativeness performance. The actions showing a consistent correlation were walking, running and choreographed dancing (Macarena). No significant correlation was observed for actions such as greeting, table tennis playing or spontaneous dancing. A similar tendency was consistently observed across all actions, although no significant correlations were found for all social dilemmas. The ratings of different actors and actions were highly consistent across different raters and high inter-rater reliability was achieved. It seems possible that natural and constrained actions carry more social cues enabling prediction of cooperation than actions showing more variance across different participants. Further studies with higher numbers of actors and raters are planned to confirm whether accurate prediction of cooperation is really possible.

The use of creative analogies in a complex problem situation

Melanie Damaskinos 1, Alexander Lutsevich 1, Dietrich Dörner 1, Ute Schmid 1, C. Dominik Güss 1,2
1 Otto-Friedrich-Universität Bamberg, Germany; 2 University of North Florida, USA

Keywords
Analogy, creativity, dynamic decision making, complex problem solving, strategies

Analogical reasoning is one key element of creative thinking and one of the key human abilities of domain-general cognitive mechanisms (Keane 1988). A person takes the structure and elements of one domain and tries to apply them to the new, problematic domain. Experimental research has shown the transfer of knowledge from one domain to another (e.g., Wiese et al. 2008).
Analogical reasoning has often been studied in classrooms and related to mathematical problems, but the use of analogies in complex and uncertain domains has rarely been studied in the laboratory. Yet the study of creative analogy use in complex problem solving would be highly relevant considering the demands of most real-life problems. The goal of the current study is to examine how helpful or hindering certain analogies can be for solving a complex and dynamic problem such as improving the living conditions of a fictional tribe in the MORO simulation (Dörner 1996). We expect an analogy story that highlights a dynamic system (blood sugar) to prime participants and facilitate problem solving more than an analogy story that highlights linear processing (visual perception) or no analogy story at all (control). The facilitating analogy story will make participants more sensitive to the interconnectedness of the system variables in the complex problem and therefore will lead to more reflection time at the beginning of the simulation, more in-depth information collection, and fewer actions.
Method
Participants were 29 psychology students from Otto-Friedrich University Bamberg, Germany. (More data will be collected.)
We used three different analogy stories (facilitating systems analogy story – blood sugar; distracting linear analogy story – visual perception; control – no story). Participants received either the blood-sugar story, the visual perception story, or no story prior to working with the MORO simulation. The stories were 1.5 pages long, including two figures each. The blood-sugar story described the changes in blood sugar dependent on food intake. It also showed the long-term consequences of high sugar consumption. It showed that the body is a dynamic system. The visual perception story described the linear process of perception from stimulus to processing in cortex. The blood-sugar story will prime systemic thinking, considering side- and long-term effects of actions and considering the balance of the system. The visual-perception story will prime linear, one-dimensional thinking. The control group will not receive any story and not be primed.
MORO is a computer simulation of a tribe of semi-nomads in the Sahel zone (Lutsevich 2013). MORO is especially suited to study complex problem solving due to the huge number of variables involved and the demand to come up with novel solutions and to coordinate the decisions. Participants take the role of developmental aid assistants and try to help improve the living conditions of the MORO tribe. Participants sit in front of a computer screen and can select information and make decisions using the mouse. A file documenting all of the participant's decisions is automatically saved to the hard drive. For the current study we focused only on the first 12 min of played time, because we postulated that especially the initial time would be influenced mostly by the analogy story presented. Later, demands of the problem situation will become more influential.
A demographic questionnaire was administered to control for potential confounding variables, assessing age, sex, major, and student status.
Results
We are still continuing with data collection, but preliminary results refer to the three dependent variables: A – number of actions, IS – number of information searches, and RT – reflection time periods greater than 15 s where no action and no information search took place. These variables were assessed for the first 12 min, in intervals of 4 min each, from the participants' MORO log files and then combined.
Results did not confirm our expectations. Initial data analysis showed that the group primed with the system blood sugar analogy story, compared to the two other groups, did not show more reflection periods and more information searches in the first 12 min.
For the first 12 min, participants of the systems analogy story followed a more balanced strategy compared to the two other groups. The control group followed blind actionism, engaging in most actions and information searches, but fewest reflection times. The group primed with the linear analogy spent most time reflecting, but made the fewest actions and information searches.


The means for actions, information searches, and reflection times of the systems analogy group were between the means of the linear prime group and the control group (see Fig. 1). Mean differences among the three groups were not significant for Actions, F(2, 26) = 1.81, p = .18, but were significant for Information Searches, F(2, 26) = 5.63, p = .009, and for Reflection Times, F(2, 26) = 4.81, p = .02.
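For completeness, a between-group comparison of this kind can be reproduced with a standard one-way ANOVA. The sketch below uses SciPy with invented per-group counts purely as placeholders for the log-file measures described above; it is not the authors' analysis script.

    import numpy as np
    from scipy.stats import f_oneway

    # Placeholder data: information-search counts for the three priming groups (invented values)
    linear_prime = np.array([4, 5, 3, 6, 4, 5, 4, 6, 5, 4])
    system_prime = np.array([7, 8, 6, 9, 7, 8, 7, 6, 8, 7])
    control = np.array([10, 11, 9, 12, 10, 11, 10, 9, 11])

    groups = [linear_prime, system_prime, control]
    f_value, p_value = f_oneway(*groups)
    df_between = len(groups) - 1
    df_within = sum(len(g) for g in groups) - len(groups)
    print(f"F({df_between}, {df_within}) = {f_value:.2f}, p = {p_value:.3f}")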
An alternative explanation for the strategic differences among the three groups could be individual difference variables. We assessed the need to reduce cognitive uncertainty, final high school leaving examination grade, age, and gender. None of the four variables correlated significantly with either actions, information searches, or reflection time periods (see Table 1). Yet the three decision-making measures correlated significantly with each other. Obviously, the more time spent on reflection, the fewer actions and the fewer information searches took place. Or, the more actions and the more information searches, the fewer reflection times took place.
Conclusion
Creativity has rarely been studied in relation to complex microworlds. Thus, a process analysis of creative analogical reasoning in a complex, uncertain, and dynamic microworld is a novel research topic, and other researchers have expressed the need to experimentally assess creativity in a complex and novel problem situation and to focus on idea evaluation and implementation (Funke 2000).
Further data analysis will also include the correlation of strategy and performance in MORO. Preliminary results of the current study showed that the presented analogy stories primed decision making and problem solving, but not in the expected direction. Participants primed with the systems story followed a balanced approach where numbers of actions, information searches, and reflection times were similarly frequent. Participants primed with the linear story spent most time reflecting and searching information, perhaps because they were primed that a decision leads to a linear consequence. Participants who did not receive any story showed most actions and fewest reflection times. It is possible that receiving no story provided no helpful cues and led to most uncertainty and to actionism (see Dörner 1996). These findings could have implications for training programs and education which focus on teaching children, students, and experts to be sensitive to the characteristics of complex, uncertain, and dynamic problems.

Fig. 1 Means of actions, information searches, and reflection time periods of 15 s or longer for the first 12 min of participants working on the MORO simulation (panels: Actions, Information Searches, Reflection Times; groups: Linear Prime, System Prime, Control)

Table 1 Correlations of individual difference variables and behavioral complex-problem solving measures

                        Cognitive uncertainty   Final high school grade   Age     Gender   Actions    Information searches
Actions                 -.05                    -.11                      -.20    -.14
Information searches    -.24                     .17                      -.18     .09      .27
Reflection times         .24                     .04                       .30     .07     -.70***    -.53**

*** p < .001; ** p < .005; * p < .05

Acknowledgments
This research was supported through a Marie-Curie IIF Fellowship to the last author.
References
Dörner D (1996) The logic of failure. Metropolitan Books, New York
Funke J (2000) Psychologie der Kreativität [Psychology of creativity]. In: Holm-Hadulla RM (ed) Kreativität. Springer, Heidelberg, pp 283–300
Keane MT (1988) Analogical problem solving. Ellis Horwood, Chichester
Lutsevich A, Dörner D (2013) MORO 2 (completely revised new version). Program documentation. Otto-Friedrich-Universität Bamberg
Wiese E, Konerding U, Schmid U (2008) Mapping and inference in analogical problem solving – as much as needed or as much as possible? In: Love BC, McRae K, Sloutsky VM (eds) Proceedings of the 30th annual conference of the cognitive science society. Lawrence Erlbaum, Mahwah, pp 927–932

Yes, that's right? Processing yes and no and attention to the right vs. left

Irmgard de la Vega, Carolin Dudschig, Barbara Kaup
University of Tübingen, Germany

Recent studies suggest that positive valence is associated with the dominant hand's side of the body and negative valence with the non-dominant hand's side of the body (Casasanto 2009). This association is also reflected in response times, with right- and left-handers responding faster with their dominant hand to positive stimuli (e.g., love), and with their non-dominant hand to negative stimuli (e.g., hate; de la Vega et al. 2012; see also de la Vega et al. 2013). Interestingly, a similar finding emerges for yes- and no-responses: right-handed participants respond faster with their dominant hand to yes, and with their non-dominant hand to no (de la Vega et al. in prep).
The present study tested whether the association between yes/no and the (non-)dominant hand is reflected in a visual attention shift. Spatial attention has been shown to be influenced by various categories. For example, the association between numbers and horizontal space (SNARC effect; Dehaene et al. 1993) is also reflected in visual attention: in a target detection task, participants responded faster to a target presented on the left after a low digit, and to a target on the right after a high digit (Fischer et al. 2003; see also Dudschig et al. 2012).
We adapted the target detection task from Fischer et al. (2003) to investigate visuospatial attention shifts after yes or no. In line with the results obtained by Fischer et al. (2003), we expected faster detection of a target located on the right after yes, and of a target on the left after no. Twenty-two volunteers (1 male; mean age 23.0 years, SD 5.3) participated in the study. The word yes (in German: Ja) or no (in German: Nein) appeared centrally on the computer screen for 300 ms, followed by a target on the right or on the left.


Participants' task was to press a key as soon as they had detected the target in the left or right box. Responses under 100 ms were excluded from the analysis (1.1 %). The remaining RTs were submitted to a 2 (word: yes vs. no) × 2 (target location: left vs. right) ANOVA. Visuospatial attention was influenced by the words yes or no, as indicated by an interaction between word and target location. However, contrary to our hypothesis, an interference effect showed (see Fig. 1): target detection occurred faster on the left after yes, and faster on the right after no, F(1, 21) = 6.80, p = .016.
One explanation for this unexpected pattern might be inhibition of return (see Posner, Cohen 1984). Upon perceiving the word yes or no, attention might move immediately to the right or to the left, but after it is withdrawn, participants might be slower to detect a stimulus displayed in this location. Using variable delays between word and target presentation should clarify this issue. Another possibility is that the observed pattern does not result from an association between yes/no and right/left stemming from handedness, but rather corresponds to the order in which the words yes and no are usually encountered. When used together in a phrase, yes is usually used before no (e.g., "What's your answer – yes or no?"); as a result, in left-to-right writing cultures, yes might become associated with the left side, no with the right side. We are planning to investigate this possibility, as well as the question under which conditions an association between yes and the left side vs. yes and the right hand becomes activated, in future studies.

Fig. 1 Mean response times in the target detection task. Error bars represent confidence intervals (95 %) for within-subject designs and were computed as recommended by Masson and Loftus (2003)

References
Casasanto D (2009) Embodiment of abstract concepts: good and bad in right- and left-handers. J Exp Psychol Gen 138:351–367
Dehaene S, Bossini S, Giraux P (1993) The mental representation of parity and number magnitude. J Exp Psychol Gen 122:371–396
de la Vega I, De Filippis M, Lachmair M, Dudschig C, Kaup B (2012) Emotional valence and physical space: limits of interaction. J Exp Psychol Hum Percept Perform 38:375–385
de la Vega I, Dudschig C, De Filippis M, Lachmair M, Kaup B (2013) Keep your hands crossed: the valence-by-left/right interaction is related to hand, not side, in an incongruent hand-response key assignment. Acta Psychol 142:273–277
de la Vega I, Dudschig C, Kaup B (in prep) Faster responses to yes with the dominant hand and to no with the non-dominant hand: a compatibility effect
Dudschig C, Lachmair M, de la Vega I, De Filippis M, Kaup B (2012) From top to bottom: spatial shifts of attention caused by linguistic stimuli. Cogn Process 13:S151–S154
Fischer MH, Castel AD, Dodd DD, Pratt J (2003) Perceiving numbers causes spatial shifts of attention. Nat Neurosci 6:555–556
Posner MI, Cohen Y (1984) Components of visual orienting. In: Bouma H, Bouwhuis D (eds) Attention and performance, vol X. Erlbaum, pp 531–556

Perception of background color in head mounted displays: applying the source monitoring paradigm

Nele M. Fischer, Robert R. Brauer, Michael Unger
University of Applied Sciences Leipzig, Germany

Monocular look-around Head Mounted Displays (HMDs), for instance the Smart Glasses Vuzix M100, are wearable devices that enrich visual perception with additional information by placing a small monitor (e.g., an LCD) in front of one eye. While having access to various kinds of information, users can engage in other tasks, such as reading assembly instructions on the HMD while performing a manual assembly task. To reduce the distraction from the main task, the information should be presented in a way that is perceived as comfortable and requires as little effort as possible. It is likely that display polarity has an impact on information perception, since positive polarity (i.e. black font on white background) is widely recognized for better text readability. However, in specific viewing conditions the bright background illumination of positive polarity was found to reduce word recognition and induce discomfort compared to negative polarity (white font on black background) (Tai, Yan, Larson, Sheedy 2013). Since perception of HMDs might differ to some extent from stationary displays (e.g., Naceri, Chellali, Dionnet, Toma 2010) and color has an impact on information perception (e.g., Dzulkifli, Mustafar 2013), we investigated the impact of polarity on perception in a monocular look-around HMD. If one type of polarity (positive or negative) is less distracting from the presented content, we would expect enhanced recognition due to deeper processing of the material (Craik, Lockhart 1972). Meanwhile, the memory of the polarity itself should decrease when it is less distracting (source monitoring: Johnson, Hashtroudi, Lindsay 1993). Furthermore, subjective preference ratings should match the less distracting polarity (Tai et al. 2013).
To test this, we conducted a recognition test within the source monitoring paradigm (Johnson et al. 1993) and asked participants for their polarity preference. In our experimental setting, 32 single-item words were presented in sequence with either positive or negative polarity on the LCD screen of the monocular look-around HMD Vuzix M100. Directly afterwards participants rated their preferred polarity. Following a short distraction, the recognition and source memory test was conducted. All previously presented (old) words were mixed with the same number of new distracter words. For each item, participants decided whether the item had been previously presented or was new and, if judged old, they had to determine the item's polarity (positive or negative).
The results of our study on polarity for the monocular look-around display Vuzix M100 indicated that negative polarity increased word recognition and was preferred by participants. Contrary to our assumptions, recognition of negative polarity (source monitoring) increased as well, which might be an effect of the higher recognition rate for items having negative polarity. These results not only support a design decision, they also corroborate the subjective preference ratings of participants with data from memory research.

appear to be a good indicator for issues of user perception. Based on these results, we recommend the use of negative polarity to display short text information, e.g. assembly instructions, in monocular look-around HMDs with near-to-eye LCD display (e.g., approximately 4 cm distance to the eye in Vuzix M100), since it appears to be less distracting and more comfortable than positive polarity. Due to the small sample size, further examination is needed on this topic.

References
Craik FIM, Lockhart RS (1972) Levels of processing: a framework for memory research. J Verbal Learn Verbal Behav 11:671–684
Dzulkifli MA, Mustafar MF (2013) The influence of colour on memory performance: a review. Malaysian J Med Sci 20:3–9
Johnson MK, Hashtroudi S, Lindsay DS (1993) Source monitoring. Psychol Bull 114:3–28
Naceri A, Chellali R, Dionnet F, Toma S (2010) Depth perception within virtual environments: comparison between two display technologies. Int J Adv Intell Syst 3:51–64
Tai YC, Yan SN, Larson K, Sheedy J (2013) Interaction of ambient lighting and LCD display polarity on text processing and viewing comfort. J Vis 13(9), article 1157

Continuous goal dynamics: insights from mouse-tracking and computational modeling

Simon Frisch, Maja Dshemuchadse, Thomas Goschke, Stefan Scherbaum

Technische Universität Dresden, Germany

Goal-directedness is a core feature of human behavior. Therefore, it is mandatory to understand how goals are represented in the cognitive system and how these representations shape our actions. Here, we will focus on the time-dependence of goal representations (Scherbaum, Dshemuchadse, Ruge, Goschke 2012). This feature of goal representations is highlighted by numerous task-switching studies which demonstrate that setting a new goal is associated with behavioral costs (Monsell 2003; Vandierendonck, Liefooghe, Verbruggen 2010). Moreover, participants have difficulties ignoring previously relevant goals (perseveration, PS) or attending to previously irrelevant goals (learned irrelevance, LI, cf. Dreisbach, Goschke 2004). Thus, goals are not switched on or off instantaneously but take time to build up and decay. This is also assumed by connectionist models of task switching (e.g. Gilbert, Shallice 2002), where goal units need time to shift between different activation patterns.

While both empirical evidence and theory underline the dynamic nature of goals, models and empirical findings have mostly been linked by comparing modelled and behavioral outcomes (e.g. response times). However, these discrete values provide only loose constraints for theorizing about the processes underlying these measures. Here, we aim towards a deeper understanding of continuous goal dynamics by comparing the continuous performance of a dynamic neural field (DNF) model with a continuous measure of goal-switching performance, namely mouse movements. Originally, the two phenomena of PS and LI were studied by Dreisbach and Goschke (2004) in a set-switching task: participants categorized a number presented in a cued color (target) while ignoring a number in another color (distracter). After several repetitions, the cue indicated to attend to a new color. Two kinds of switches occurred: In the PS condition, the target was presented in a new color while distracters were presented in the previous target color (e.g. red). In the LI condition, the target was presented in the previous distracter color while distracters were presented in a new color (e.g. green). While the results indicated typical switch patterns in response times for both conditions, the processes underlying the observed switch costs remained unclear. For example, Dreisbach and Goschke (2004) could only speculate whether the LI effect was driven by difficulties to activate a goal that had been ignored beforehand or by a novelty boost that draws attention towards the distracting color.

Addressing these open questions, we created a DNF model of the task. Instead of including additional mechanisms to incorporate processes like attentional capture or goal-specific inhibition, we built the most parsimonious model that relies exclusively on continuously changing levels of goal activation. In this respect, DNFs are suited exceptionally well to model dynamic goal-directed behavior as they embrace cognition as a deeply continuous phenomenon that is tightly coupled to our sensorimotor systems (Sandamirskaya, Zibner, Schneegans, Schöner 2013). Our model consists of three layers, similarly to previous models of goal-driven behavior and goal-switching (cf. Gilbert, Shallice 2002; Scherbaum et al. 2012). A goal layer represents the cued target color by forming a peak of activation at a specific site. When activation reaches a threshold, it feeds into an association layer representing colors and magnitudes of the current stimuli. The emerging pattern of activation is then projected into a response layer, resulting in a tendency to move to the left or right. Notably, as is typical for DNF models, all layers are continuous in representational space. This allowed us to study the model's behavior continuously over time instead of obtaining discrete threshold responses. Crucially, the inert activation dynamics inherent to DNFs provide a simple mechanism for the time-consuming processes of goal setting and shifting observed in behavioral data.

A simulation study of the original paradigm indicated similar costs in response times for PS and LI switches as observed by Dreisbach and Goschke (2004). However, continuous response trajectories provided differential patterns for PS and LI trials: PS switches yielded response trajectories that were deflected towards the previously relevant information, while LI switches yielded a tendency to keep the response neutral for a longer time before deciding for one alternative. We validated these predictions in a set-switching experiment that was similar to the one conducted by Dreisbach and Goschke (2004). However, instead of responding with left or right key presses, participants moved a computer mouse into the upper left or right corner of the screen. As expected, goal switches induced switch costs in response times. More intriguingly, mouse movements replicated the model's dynamic predictions: PS switches yielded movements strongly deflected to the alternative response, whereas LI switches yielded indifferent movements for a longer time than in repetition trials.

In summary, our DNF model and mouse-tracking data suggest that continuously changing levels of goal activation constitute the core mechanism underlying goal setting and shifting. Therefore, we advocate the combination of continuous modelling with continuous behavioral measures, as this approach offers new and deeper insights into the dynamics of goals and goal-directed action.

References
Dreisbach G, Goschke T (2004) How positive affect modulates cognitive control: reduced perseveration at the cost of increased distractibility. J Exp Psychol Learn Memory Cogn 30(2):343–353. doi:10.1037/0278-7393.30.2.343
Gilbert SJ, Shallice T (2002) Task switching: a PDP model. Cogn Psychol 44(3):297–337. doi:10.1006/cogp.2001.0770
Monsell S (2003) Task switching. Trends Cogn Sci 7(3):134–140. doi:10.1016/S1364-6613(03)00028-7
Sandamirskaya Y, Zibner SKU, Schneegans S, Schöner G (2013) Using dynamic field theory to extend the embodiment stance toward higher cognition. New Ideas Psychol 31(3):322–339. doi:10.1016/j.newideapsych.2013.01.002
Scherbaum S, Dshemuchadse M, Ruge H, Goschke T (2012) Dynamic goal states: adjusting cognitive control without
conflict monitoring. NeuroImage 63(1):126–136. doi:10.1016/j.neuroimage.2012.06.021
Vandierendonck A, Liefooghe B, Verbruggen F (2010) Task switching: interplay of reconfiguration and interference control. Psychol Bull 136(4):601–626. doi:10.1037/a0019791
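As an illustration only of the three-layer architecture described in the abstract above: the abstract reports no equations or parameter values, so the following minimal Python sketch is an assumption-laden toy, not the authors' implementation. It shows how a cascade of one-dimensional dynamic fields with inert dynamics (goal, association, response) yields gradual, continuous goal setting and shifting; the field equation, gating scheme, and all numbers are invented for illustration.

import numpy as np

# Toy three-layer dynamic field cascade (goal -> association -> response).
# All parameters are hypothetical; only the layered structure follows the abstract.

def field_step(u, ext_input, tau=10.0, h=-2.0, dt=1.0, w_exc=2.0, w_inh=0.5, sigma=3.0):
    # One Euler step of a 1-D field with local excitation and global inhibition.
    out = 1.0 / (1.0 + np.exp(-4.0 * u))                        # sigmoidal field output
    x = np.arange(u.size)
    kernel = w_exc * np.exp(-0.5 * ((x - u.size // 2) / sigma) ** 2)
    lateral = np.convolve(out, kernel, mode="same") - w_inh * out.sum()
    return u + (dt / tau) * (-u + h + ext_input + lateral)

n = 101                                                          # feature dimension (e.g., color)
goal = np.full(n, -2.0); assoc = np.full(n, -2.0); resp = np.full(n, -2.0)
cue = np.zeros(n); cue[30] = 5.0                                 # cue for the currently relevant color
stimuli = np.zeros(n); stimuli[30] = 3.0; stimuli[70] = 3.0      # target and distracter input

trajectory = []
for t in range(300):
    goal = field_step(goal, cue)
    gate = 1.0 / (1.0 + np.exp(-4.0 * goal))                     # goal output gates the stimulus input
    assoc = field_step(assoc, stimuli * gate)
    resp = field_step(resp, 1.0 / (1.0 + np.exp(-4.0 * assoc)))
    trajectory.append(resp.argmax())                             # continuous read-out over time

# Because each field integrates slowly (tau), a previously active goal peak decays only
# gradually; this inertia is the kind of mechanism the abstract links to PS and LI costs.

Under these assumptions, PS and LI trials could be sketched by initializing the goal field with a decaying peak at the previously cued color instead of a flat resting level; the continuous read-out then plays the role of the mouse trajectories described above.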

Looming auditory warnings initiate earlier event-related potentials in a manual steering task

Christiane Glatz, Heinrich H. Bülthoff, Lewis L. Chuang
Max Planck Institute for Biological Cybernetics, Tübingen, Germany

Automated collision avoidance systems promise to reduce accidents and relieve the driver from the demands of constant vigilance. Such systems direct the operator's attention to potentially critical regions of the environment without compromising steering performance. This raises the question: What is an effective warning cue?

Sounds with rising intensities are claimed to be especially salient. By evoking the percept of an approaching object, they engage a neural network that supports auditory space perception and attention (Bach et al. 2008). Indeed, we are aroused by and faster to respond to looming auditory tones, which increase heart rate and skin conductance activity (Bach et al. 2009).

Looming sounds can differ in terms of their rising intensity profiles. While looming can be approximated by a sound whose amplitude increases linearly with time, an approaching object that emits a constant tone is better described as having an amplitude that increases exponentially with time. In a driving simulator study, warning cues that had a veridical looming profile induced earlier braking responses than ramped profiles with linearly increasing loudness (Gray 2011).

In the current work, we investigated how looming sounds might serve, during a primary steering task, to alert participants to the appearance of visual targets. Nine volunteers performed a primary steering task whilst occasionally discriminating visual targets. Their primary task was to minimize the vertical distance between an erratically moving cursor and the horizontal mid-line, by steering a joystick towards the latter. Occasionally, diagonally oriented Gabor patches (10° tilt; 1° diameter; 3.1 cycles/deg; 70 ms duration) would appear on either the left or right of the cursor. Participants were instructed to respond with a button-press whenever a pre-defined target appeared. Seventy percent of the time, these visual stimuli were preceded by a 1,500 ms warning tone, 1,000 ms before they appeared. Overall, warning cues resulted in significantly faster and more sensitive detections of the visual target stimuli (F(1,8) = 7.72, p < 0.05; F(1,8) = 9.63, p < 0.05).

Each trial would present one of three possible warning cues. Thus, a warning cue (2,000 Hz) could either have a constant intensity of 65 dB, a ramped tone with linearly increasing intensity from 60 dB to approximately 75 dB, or a comparable looming tone with an exponentially increasing intensity profile. The different warning cues did not vary in their influence on the response times to the visual targets and recognition sensitivity (F(2,16) = 3.32, p = 0.06; F(2,16) = 0.10, p = 0.90). However, this might be due to our small sample size. It is noteworthy that the different warning tones did not adversely affect steering performance (F(2,16) = 1.65, p < 0.22). Nonetheless, electroencephalographic potentials to the offset of the warning cues were significantly earlier for the looming tone, compared to both the constant and ramped tone. More specifically, the positive component of the event-related potential was significantly earlier for the looming tone by about 200 ms, relative to the constant and ramped tone, and sustained for a longer duration (see Fig. 1).

Fig. 1 The topographical plot shows the 500 ms after sound offset, with scalp maps plotted every 50 ms, for the constant (row 1), the ramped (row 2), and the looming tone (row 3). The looming cues evoked a strong positive deflection about 200 ms earlier than the other sounds. The black bar at the bottom of the figure indicates where the significance level of 0.01 was exceeded using a parametric test on the combined Fz, FCz, Cz, and Pz activity

The current findings highlight the behavioral benefits of auditory warning cues. More importantly, we find that a veridical looming tone induces earlier event-related potentials than one with a linearly increasing intensity. Future work will investigate how this benefit might diminish with increasing time between the warning tone and the event that is cued for.

References
Bach DR, Schächinger H, Neuhoff JG, Esposito F, Salle FD, Lehmann C, Herdener M, Scheffler K, Seifritz E (2008) Rising sound intensity: an intrinsic warning cue activating the amygdala. Cerebral Cortex 18(1):145–150
Bach DR, Neuhoff JG, Perrig W, Seifritz E (2009) Looming sounds as warning signals: the function of motion cues. Int J Psychophysiol 74(1):28–33
Gray R (2011) Looming auditory collision warnings for driving. Human Factors 53(1):63–74

The creative process across cultures

Noemi Göltenboth1, C. Dominik Güss1,2, Ma. Teresa Tuason2

1 Otto-Friedrich-Universität Bamberg, Germany; 2 University of North Florida, USA

Keywords
Creativity, Culture, Artists, Cross-cultural comparison

Creativity is the driving force of innovation in societies across the world, in many domains such as science, business, or art. Creativity means to come up with new and useful ideas (e.g., Funke 2008). Past research has focused on the individual, the creative process and its product, and the role of the social environment when evaluating creative products. According to previous research, individual difference variables such as intelligence and extraversion can partially predict creativity (e.g., Batey and Furnham 2006). Researchers have also shown the importance of the social environment when labeling products as creative or not (e.g., Csikszentmihalyi 1988). Although creativity could be influenced by and differ among cultures, the influence of culture on creativity has been rarely studied.

Creativity and Culture
Culture can be defined as the knowledge base used to cope with the world and each other, shared by a group of people and transmitted from generation to generation (e.g., Güss et al. 2010). This knowledge encompasses, for example, declarative world knowledge, values and behaviors (e.g., norms, rituals, problem-solving strategies). Following this definition, different cultures could value different aspects of creativity (e.g., Lubart 1990).
The current study is based on two recommendations of creativity researchers. First, it is important to study creativity across cultures, as Westwood and Low (2003, p 253) summarized: "Clearly personality and cognitive factors impact creativity and account for individual differences, but when it comes to differences across cultures the picture is far from clear." Second, researchers recommend ethnographic or socio-historical analyses and case studies of creativity in different countries to study emic conceptions and to study the interaction of societal, family and other factors in creativity (e.g., Simonton 1975). The current study addresses these recommendations by investigating creativity across cultures, focusing on experts from Cuba, Germany, and Russia.

Method
Going beyond traditional student samples, we conducted semi-structured interviews with experts, i.e., 10 Cuban, 6 Russian, and 9 German artists. Informed consent was obtained. All of the artists have received awards and fellowships for their creative work (i.e., compositions, books, poems, paintings). The interviews focused on a) their personal history, b) the creative process, and c) the role of culture during the creative process. These interviews lasted between 30 min and 1 h 43 min. They were transcribed verbatim, and domains and themes were derived from these interviews using consensual qualitative research methodology (Hill et al. 2005). This means that at least 3 raters independently read and coded each transcribed interview. Then the raters met and discussed the codings until they obtained consensus.

Results
Several categories were mentioned by more than three quarters of all 25 participants. These categories refer to the following domains: 1) How I became an artist, 2) What being an artist means to me, 3) Creating as a cognitive process, 4) Creating as a motivational process, and 5) The role of culture in creating.

Table 1 shows that German artists generally talk about financial problems and the problem of selling their work, a topic rarely mentioned by Cuban and Russian artists. Russian and German artists generally recognize persistence and hard work in creativity, and how a daily routine is helpful. A daily routine is rarely mentioned by Cuban artists. All artists, regardless of culture, recognize the universality of creativity, but acknowledge culture-specific expressions.

Table 1 Some cultural differences in category frequencies

Category | Cuba | Russia | Germany
Being an artist means being financially uncertain | Variant | Typical | General
Being an artist means to deal with the necessary evil of marketing and selling the work | Rare | Rare | Typical
Being creative is natural to human beings | Variant | Typical | Variant
Creativity is persistence and hard work | Variant | General | General
It helps me to have a daily regular routine | Rare | Variant | Typical
Creativity is universal, but culture provides specific expressions (forms and circumstances) for creativity | Typical | Variant | Typical

General: >90 % of the group (Cuba 9–10, Russia 6, Germany 8–9 artists); Typical: 50–89 % (Cuba 5–8, Russia 4–5, Germany 5–7); Variant: 11–49 % (Cuba 3–4, Russia 2–3, Germany 3–4); Rare: <10 % (Cuba 1–2, Russia 1, Germany 1–2)

Discussion
The current study is innovative as it investigates cultural differences among famous artists from Cuba, Russia, and Germany, including different groups of artists. The semi-structured interviews reveal a wealth of different domains and categories related to creativity, and highlight the need for a holistic, action-oriented, and system-oriented approach when studying creativity. The findings also broaden a narrow cognitive view on creativity, highlighting also the role of motivational and socio-cultural factors during the creative process (for the role of societal context in creativity see also Nouri et al. 2014).

Whereas most artists experience similar creative processes, we also found themes highlighting the influence of the artists' cultural background. Results are beneficial for further developing a comprehensive theory of the creative process taking cultural differences into consideration and perhaps integrating them in computational creativity models (e.g., Colton and Wiggins 2012).

Acknowledgments
This research was supported through a Marie-Curie IIF Fellowship to the second author and a Fellowship of the Studienstiftung des deutschen Volkes to the first author. We would like to thank the artists for participating and allowing us a glimpse into their world that we may learn from their experiences.

References
Batey M, Furnham A (2006) Creativity, intelligence, and personality: a critical review of the scattered literature. Genetic Soc Gen Psych Monogr 132:355–429
Colton S, Wiggins GA (2012) Computational creativity: the final frontier? In: Proceedings of the 20th European conference on artificial intelligence (ECAI). Montpellier, France, pp 21–26
Csikszentmihalyi M (1988) Society, culture and person: a systems view of creativity. In: Sternberg RJ (ed) The nature of creativity: contemporary psychological perspectives. Cambridge University Press, New York, pp 325–339
Funke J (2008) Zur Psychologie der Kreativität. In: Dresler M, Baudson TG (eds) Kreativität. Beiträge aus den Natur- und Geisteswissenschaften [Creativity: contributions from natural sciences and humanities]. Hirzel, Stuttgart, pp 31–36
Güss CD, Tuason MT, Gerhard C (2010) Cross-national comparisons of complex problem-solving strategies in two microworlds. Cogn Sci 34:489–520
Hill CE, Knox S, Thompson BJ, Williams EN, Hess SA, Ladany N (2005) Consensual qualitative research. J Couns Psychol 52:196–205. doi:10.1037/0022-0167.52.2.196
Lubart TI (1990) Creativity and cross-cultural variation. Int J Psychol 25:39–59
Nouri R, Erez M, Lee C, Liang J, Bannister BD, Chiu W (2014) Social context: key to understanding culture's effects on creativity. J Org Behav. doi:10.1002/job.1923
Simonton DK (1975) Sociocultural context of individual creativity: a trans-historical time-series analysis. J Pers Soc Psych 32:1119–1133
Westwood R, Low DR (2003) The multicultural muse: culture, creativity and innovation. Int J Cross Cult Manag 3:235–259

How do human interlocutors talk to virtual assistants? A speech act analysis of dialogues of cognitively impaired people and elderly people with a virtual assistant

Irina Grishkova1, Ramin Yaghoubzadeh2, Stefan Kopp2, Constanze Vorwerg1

1 University of Bern, Switzerland; 2 Bielefeld University, Germany

An artificial daily calendar assistant was developed to provide valuable support for people with special needs (Yaghoubzadeh et al. 2013). Users may interact differently when they communicate with an artificial system. They normally tend to adapt their linguistic behavior (Branigan et al. 2010), but different users may have different interaction styles (Wolters et al. 2009). In this study, we investigated how people with cognitive impairments and elderly people talk to their virtual assistant, focusing on pragmatic aspects: the speech acts performed, and the linguistic means used to perform them.

A starting point of our analysis is the observation that the patterns in which linguistic actions occur, and which provide socially shaped potentials for achieving goals (Ehlich, Rehbein 1979), are not necessarily linear, but often manifest characteristic recursivity, decision points, supportive accessory patterns, and omissions of pattern elements (Grießhaber 2001). In addition, the linguistic means used to perform linguistic action units may vary considerably. We addressed two questions: (1) What communication patterns between a human and an artificial assistant occur in each of three groups of users (elderly people, people with cognitive impairments, control group) when making a request to enter an appointment? (2) What linguistic forms are typically used by the three user groups for making those requests? To answer these questions, we carried out a pragmatic analysis of conversations between participants of these three groups and the artificial assistant based on Searle's speech act theory (Searle 1969, 1976), and techniques of functional-pragmatic discourse analysis (Grießhaber 2001).

Three user groups participated in the study: cognitively impaired people (A), where all participants had light to medium mental retardation (approximately F70–F71 on the APA DSM scale [American Psychiatric Association 2000]), elderly people (B), and a control group (C) (Yaghoubzadeh et al. 2013). The participants were handed cards with appointments and asked to plan the appointments for the following week by speaking to the virtual assistant as if it were a human being. The assistant was presented on a TV screen as being able to understand the user and speak to him, using a Wizard-of-Oz technique.

All interactions between the participants and the assistant were recorded and transcribed. We split all dialogues into dialogue phases and annotated the speech acts performed by both the human interlocutor and the artificial assistant within a conversation. Each dialogue phase was therefore split into minimal communication items – speech acts (Searle 1969) – using a pattern-oriented description (Hindelang 1994). For each speech act, we provided its definition in terms of illocutionary force and rules for performance (Searle 1969), as well as the complete list of linguistic forms used in the conversations.

We modeled the structures of the pertinent dialogue phases (greeting, making an appointment, farewell) for each of the three groups as sequence patterns in the form of network structures (with speech acts as nodes and possible reactions as linking arrows). The smallest units in these structures were the speech acts determined by the definitions provided. Based on this, sequences of speech acts were analyzed. We also investigated the range and frequency of reactions found in the dialogues to a particular speech act. The relative frequencies of speech act sequences were determined for greeting and farewell phases as well as for particular speech acts, such as expressives and assertives, for each of the user groups. The politeness of discourse was determined by the number of expressive speech acts, and the complexity of speech in terms of the number of assertive speech acts (used to specify a request or explain an appointment) following a directive speech act.

Results show that the elderly interlocutors have a more complicated dialogue structure when communicating with an artificial assistant. They use more assertive utterances like explaining, repeating, and specifications. Furthermore, we have found that some of the elderly speakers use more different expressive speech acts, compared to cognitively impaired people, demonstrating more politeness towards an artificial assistant.

The analysis of linguistic means has yielded a number of different forms when requesting a virtual assistant to enter an appointment in the virtual calendar. The linguistic forms used in the dialogues were classified as I- and we-form, form of third person, or neutral form. The most frequently used forms were the I-form and the neutral form. Participants from A use the neutral form twice as much as the I-form. In contrast, C users use the I-form twice as much as the neutral form. Participants from B also use the I-form most frequently, but in contrast to A or C, they also use the we-form and the form of third person.

Altogether, the results show that there are no fundamental differences in dialogue patterns between groups; however, there is a larger heterogeneity in group A, and especially in group B, as compared to group C. Group B also seems to display a larger diversity in linguistic means.

References
American Psychiatric Association (2000) Diagnostic and statistical manual of mental disorders DSM-IV-TR, 4th ed. American Psychiatric Publ., Arlington, VA
Branigan HP, Pickering JM, Pearson J, McLean JF (2010) Linguistic alignment between people and computers. J Pragmat 42:2355–2368
Ehlich K, Rehbein J (1979) Sprachliche Handlungsmuster. In: Soeffner HG (Hrsg.) Interpretative Verfahren in den Sozial- und Textwissenschaften. Metzler, Stuttgart, pp 243–274
Grießhaber W (2001) Verfahren und Tendenzen der funktional-pragmatischen Diskursanalyse. In: Iványi Z, Kertész A (Hrsg.) Gesprächsforschung. Tendenzen und Perspektiven. Peter Lang, Frankfurt am Main, pp 75–95
Hindelang G (1994) Sprechakttheoretische Dialoganalyse. In: Fritz G, Hundsnurscher F (Hrsg.) Handbuch der Dialoganalyse. Niemeyer, Tübingen, pp 95–112
Searle J (1969) Sprechakte. Ein sprachphilosophischer Essay. Übersetzt von Wiggershaus R und R. Suhrkamp Taschenbuch Wissenschaft, Frankfurt am Main
Searle J (1976) A classification of illocutionary acts. Lang Soc 5(1):1–23
Wolters M, Georgila K, Moore JD, MacPherson SE (2009) Being old doesn't mean acting old: how older users interact with spoken dialog systems. ACM Trans Accessible Comput 2(1):2
Yaghoubzadeh R, Kramer M, Pitsch K, Kopp S (2013) Virtual agents as daily assistants for elderly or cognitively impaired people. In: Intelligent virtual agents. Springer, Berlin, pp 79–91
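The network bookkeeping described in the abstract above (speech acts as nodes, observed reactions as weighted links) can be illustrated with a small Python sketch. The speech-act labels, the example dialogue, and the helper name are invented for illustration; the study's actual annotation scheme and tooling are not specified in the abstract.

from collections import Counter, defaultdict

# One annotated dialogue = an ordered list of (speaker, speech_act) pairs (labels are illustrative).
dialogue = [
    ("assistant", "greeting"), ("user", "greeting"),
    ("user", "directive"), ("assistant", "question"),
    ("user", "assertive"), ("assistant", "confirmation"),
    ("user", "expressive"), ("assistant", "farewell"), ("user", "farewell"),
]

# Network structure: for each speech act, count which acts follow it (the "linking arrows").
transitions = defaultdict(Counter)
for (_, act), (_, reaction) in zip(dialogue, dialogue[1:]):
    transitions[act][reaction] += 1

def reaction_profile(act):
    # Relative frequency of each observed reaction to a given speech act.
    total = sum(transitions[act].values())
    return {reaction: count / total for reaction, count in transitions[act].items()}

print(reaction_profile("directive"))   # e.g. {'question': 1.0}

Aggregating such profiles over all dialogues of a user group would give the group-wise relative frequencies of speech-act sequences that the abstract reports comparing.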

Effects of aging on shifts of attention in perihand space

Marc Grosjean1, Nathalie Le Bigot2

1 Leibniz Research Centre for Working Environment and Human Factors, Dortmund, Germany; 2 University of Bretagne Occidentale & CNRS (Lab-STICC UMR 6285), Brest, France

It is well established that visual processing is altered for stimuli that appear near the hands, that is, in perihand space (for a recent review, see Brockmole et al. 2013). For example, placing one's hands near a display has been shown to increase visual sensitivity (Dufour and Touzalin 2008), enhance attentional engagement, such as the ability to detect changes in dynamic displays (Tseng and Bridgeman 2011), but also to slow down attentional disengagement, as evidenced by longer search times when trying to find a target stimulus in a cluttered display (Abrams et al. 2008). A number of studies suggest that these hand-proximity effects, as they are known, are modulated by the functionality of the hands and that visual processing is altered at locations where action is more likely to occur (e.g., Le Bigot and Grosjean 2012; Reed et al. 2010).

Although it is well documented that cognitive processing generally becomes slower and less accurate over the lifespan (e.g., Verhaeghen and Salthouse 1997), hand-proximity effects have rarely been studied with regard to aging. Of particular relevance for the present study, sensorimotor abilities are also known to deteriorate with age, especially for hand movements (Ranganath et al. 2001). These age-related changes presumably reduce the overall functionality of the hands, which in turn could influence how visual processing changes in perihand space. To test this notion, we sought to examine whether visual processing, in general, and shifts of attention, in particular, are affected by hand proximity in the same way for younger and older individuals. In a covert-orienting task (Posner 1980), younger (mean age < 25 years) and older (mean age > 65 years) right-handed adults were asked to discriminate between a target (letter) and a distractor stimulus that could appear at a peripheral left or right location. The stimulus was preceded by an uninformative peripheral cue (stimulus-onset asynchrony = 100 ms) that was presented either at the upcoming stimulus location (valid trial) or at the opposite location (invalid trial). Participants performed the task under four hand-position configurations: left only, right only, both hands, or no hands (control condition) near the display.

As expected, older adults were overall slower to respond than younger adults, and both age groups showed a reliable cueing effect: responses were faster on valid than on invalid trials. Interestingly, younger adults also revealed an interaction between cue validity and hand position, which reflected that the cueing effects were larger when their dominant hand was near the display. The latter finding is in line with those of Lloyd et al. (2010), who also observed that involuntary shifts of attention are affected by hand proximity (for younger adults) and that this effect seems to be limited to the right (dominant) hand. More generally, these findings suggest that hand proximity affects visual processing in different ways for younger and older adults. This may reflect how the functionality of the hands and people's representation of peripersonal space change when cognitive and motor skills become slower and less accurate over the lifespan. Consistent with this notion, it has been shown that older individuals tend to have a more compressed representation of peripersonal space (Ghafouri and Lestienne 2000) than younger adults and tend to spatially allocate their attention more around the trunk of their body than around their hands (Bloesch et al. 2013).

Both age groups also showed evidence of a right hemifield advantage (i.e., faster responses to stimuli presented to the right than to the left of fixation), which is most likely due to a left-hemisphere (right-hemifield) advantage in processing linguistic stimuli (Geffen et al. 1971). However, the latter effect was modulated by hand position for older adults only. In particular, the advantage was larger when their dominant hand was near the display. These results further suggest that visual processing is differentially affected by hand proximity in younger and older adults. In contrast to younger adults, who showed an effect of hand proximity on the involuntary shifting of attention, hand position seems to only affect the attentional prioritization of space in older adults (Reed et al. 2006).

References
Abrams RA, Davoli CC, Du F et al. (2008) Altered vision near the hands. Cognition 107:1035–1047
Bloesch EK, Davoli CC, Abrams RA (2013) Age-related changes in attentional reference frames for peripersonal space. Psychol Sci 24:557–561
Brockmole JR, Davoli CC, Abrams RA, Witt JK (2013) The world within reach: effects of hand posture and tool-use on visual cognition. Curr Direction Psychol Sci 22:38–44
Dufour A, Touzalin P (2008) Improved visual sensitivity in the perihand space. Exp Brain Res 190:91–98
Geffen G, Bradshaw JL, Wallace G (1971) Interhemispheric effects on reaction time to verbal and nonverbal visual stimuli. J Exp Psychol 87:415–422
Ghafouri M, Lestienne FG (2000) Altered representation of peripersonal space in the elderly human subject: a sensorimotor approach. Neurosci Lett 289:193–196
Le Bigot N, Grosjean M (2012) Effects of handedness on visual sensitivity in perihand space. PLoS ONE 7(8):e43150
Lloyd DM, Azañón E, Poliakoff E (2010) Right hand presence modulates shifts of exogenous visuospatial attention in near perihand space. Brain Cogn 73:102–109
Posner MI (1980) Orienting of attention. Quart J Exp Psychol 32:3–25
Ranganath VK, Siemionow V, Sahgal VS, Yue GH (2001) Effects of aging on hand function. J Am Geriatrics Soc 49:1478–1484
Reed CL, Betz R, Garza JP, Roberts RJ Jr (2010) Grab it! Biased attention in functional hand and tool space. Attention Perception Psychophys 72:236–245
Reed CL, Grubb JD, Steele C (2006) Hands up: attentional prioritization of space near the hand. J Exp Psychol Human Percept Performance 32:166–177
Tseng P, Bridgeman B (2011) Improved change detection with nearby hands. Exp Brain Res 209:257–269
Verhaeghen P, Salthouse TA (1997) Meta-analyses of age-cognition relations in adulthood: estimates of linear and nonlinear age effects and structural models. Psychol Bull 122:231–249

The fate of previously focused working memory content: decay or/and inhibition?

Johannes Groer, Markus Janczyk

Department of Psychology III, University of Würzburg, Germany

Working memory is thought to allow short-term storage of information in a state in which this information can be manipulated by ongoing cognitive processes. Evidence from various paradigms suggests that at any time only one item held in working memory is selected for possible manipulation. Oberauer (2002) has thus suggested a 1-item focus of attention within his model of working memory. Conceivably, this focus of attention needs to shift between several items during task performance, and the following question is unresolved: What happens to a formerly selected, but now de-selected, item?

Several studies have addressed this question, with opposing results. Bao, Li, Chen, and Zhang (2006) investigated verbal
working memory with an updating task where participants count the number of occurrences of (three) different sequentially presented geometric objects (e.g., Garavan 1998; see also Janczyk, Grabowski 2011). In particular, they employed the logic typically used to show n - 2 repetition costs in task-switching experiments and found slower updating in ABA than in CBA sequences, i.e., evidence for an active inhibition of de-selected items (but see Janczyk, Wienrich, Kunde 2008, for no signs of inhibition with a different paradigm). Rerko and Oberauer (2013) investigated visual working memory with the retro-cue paradigm. Participants first learned an array of briefly presented colored items. Long after encoding, one, two, or three retro-cues (arrows) were presented one after another, with always the last one pointing to the particular location that is subsequently tested with a change detection task. (The retro-cue effect refers to the finding of improved performance after valid compared with neutral cues.) In the critical condition, Rerko and Oberauer presented three retro-cues to employ the n - 2 repetition logic and found evidence for passive decay of de-selected items. These diverging results obviously come with many differences between experiments: verbal vs. visual working memory, three working items vs. six working items, two different groups of participants, and so on. Here we present ongoing work aiming at identifying the critical factor(s).

As a first step, we attempted to replicate the results of Bao et al. (2006) and Rerko and Oberauer (2013) within one sample of participants. A group of n = 24 students took part in two experiments (we excluded participants with less than 65 % correct trials; 10 in Exp 1, 3 in Exp 2). In Experiment 1, participants performed a three-objects updating task and we compared performance in ABA and CBA trials. ABA trials yielded longer RTs (see Fig. 1, left panel), thus pointing to inhibitory mechanisms just as Bao et al. (2006) reported. In Experiment 2, participants performed a retro-cue task with 1, 2, or 3 retro-cues presented one after another. Most importantly, in the 3 retro-cue condition the cues either pointed to three different locations (CBA) or the first and the third cue pointed to the same location (ABA). We did not observe a difference in accuracy in this case, but RTs were longer in CBA than in ABA trials (see Fig. 1, right panel), thus pointing to passive decay but not to inhibitory mechanisms.

Thus, with one single sample of participants we were able to largely replicate the diverging results from two tasks that were designed to answer the same research question. Given this, it appears worthwhile to us to continue this work and to isolate critical factors. This work is currently in progress.

Fig. 1 Response times (RT) in milliseconds (ms) of Experiments 1 and 2 as a function of trial sequence (CBA [control] vs. ABA [inhibition])

References
Bao M, Li ZH, Chen XC, Zhang DR (2006) Backward inhibition in a task of switching attention within verbal working memory. Brain Res Bull 69:214–221
Garavan H (1998) Serial attention within working memory. Mem Cogn 26:263–276
Janczyk M, Grabowski J (2011) The focus of attention in working memory: evidence from a word updating task. Memory 19:211–225
Janczyk M, Wienrich C, Kunde W (2008) On the costs of refocusing items in working memory: a matter of inhibition or decay? Memory 16:374–385
Oberauer K (2002) Access to information in working memory: exploring the focus of attention. J Exp Psychol Learn 28:411–421
Rerko L, Oberauer K (2013) Focused, unfocused, and defocused information in working memory. J Exp Psychol Learn 39:1075–1096

How global visual landmarks influence the recognition of a city

Kai Hamburger, Cate Marie Trillmich, Franziska Baier, Christian Wolf, Florian Röser

University of Giessen, Giessen, Germany

Abstract
What happens if characteristic landmarks are taken out of a city scene or are interchanged? Are we still able to recognize the city scene itself, or are we fooled by the missing or misleading information? What information is then represented in our mind, and how? Findings are discussed with respect to attentional capture and decision making.

Keywords
Spatial cognition, Visual landmarks, Recognition, Attention, Decision making

Introduction
Famous cities are hard to recognize if the characteristic global landmark is taken out of the city scene. In this context we define a global landmark as a (famous) building that may be used for orientation purposes from multiple viewpoints (however, other objects such as trees, mountains, rivers, etc. may also represent landmarks). Here, we focus on visual information processing and show that a global landmark in the form of a famous building by itself does not necessarily lead to successful recognition of major city scenes. Thus, we assume that the landmark (object) alone is very helpful for spatial representations and spatial orientation, but the context/surrounding (city scene) is often required for a full and correct mental representation. Thus, the isolated objects sometimes lead to inappropriate mental representations and may also lead us totally astray, especially when they are interchanged.

Evans et al. (1984) stated that landmarks and the pathways' grid configuration facilitate geographic knowledge and that especially visual landmarks improve comprehension of place locations. But the authors also noted that manipulations of the grid configuration and landmark placement in a simulated environment setting cause changes in environmental knowledge.

According to Clerici and Mironowicz (2009) it is important to distinguish between landmarks acting as markers, which could therefore be replaced by direction signs and indicators, and landmarks acting as marks and brands of a specific city, which can be considered as a key factor for the quality of urban life (e.g., Big Ben in London or the Golden Gate Bridge in San Francisco). So, what is the relevant visual information characterizing a city scene?

Methods
The experiment to examine the influence of a famous landmark on city recognition was conducted on a standard PC presenting the different
combinations of (isolated/interchanged) landmarks and their corresponding cities. Each city scene/landmark only occurred once (between-subject factor). Participants were assigned to the different combinations randomly. An example is given in Fig. 1, while Table 1 presents the questions raised with all further experimental details and results.

Fig. 1 Original (left): city scenes of Berlin with the TV Tower (Alex) and Paris with the Eiffel Tower; modified (center and right): without TV and Eiffel Tower, and (right) Berlin with the Eiffel Tower of Paris and vice versa

Results
To summarize the results: 1. In general, many city scenes (46 %) could be identified correctly if landmark and surrounding were a match (original city scene); 2. Participants had severe difficulties recognizing some of the given cities when the characteristic landmark was missing (e.g., Berlin without the TV Tower, Paris without the Eiffel Tower, Sydney without the opera); 3. Some cities could still be recognized very well without the characteristic landmark (London, Venice); and 4. Most participants were totally fooled when other (deceptive) landmarks were shown instead of the original ones.

Table 1 Research questions and results for the 31 observers (each cell gives the response measure and the mean response time in ms)

Cities and landmarks (example: Paris with Eiffel Tower; Figure 1 bottom, left)
1. Do you know this city? (affirmations [%]): 64 %, 2,386 ms
2. What is the name of the city? (correct labeling [%]): 46 %, 1,887 ms
3. How confident are you with your answer? (scale from 1 = very confident to 7 = very insecure): 2.10, 2,024 ms

Cities without landmarks (example: Paris without Eiffel Tower; Figure 1 bottom, middle)
1. Do you know this city? (affirmations [%]): 35 %, 2,801 ms
2. What is the name of the city? (correct labeling [%]): 19 %, 2,037 ms
3. How confident are you with your answer? (scale from 1 = very confident to 7 = very insecure): 2.83, 2,744 ms

Cities with deceptive landmarks (example: Paris with the TV Tower of Berlin; Figure 1 bottom, right)
1. Do you know this city? (affirmations [%]): 50 %, 3,268 ms
2. What is the name of the city? (correct labeling [%]): 8 %, 1,982 ms
3. How confident are you with your answer? (scale from 1 = very confident to 7 = very insecure): 3.05, 2,864 ms
Correct labeling of the city the landmark is really located in: 31 %

Participants answered three questions in the three conditions. N = 31 (students of the University of Giessen), 18 female, 13 male, mean age 25 years (SD = 4.4)

Discussion
We demonstrate that a city scene without a characteristic global landmark may be recognized correctly in some cases and wrongly in others, while an object presented in a new context may lead to incorrect or inappropriate information retrieval from memory (semantic network). Presented in a different context, the most prominent landmark is more important (e.g., dominates the decision/judgment) than its immediate surroundings (including other potential landmarks and landscapes, e.g., mountains). But sometimes the city scene seems to contain more important information than just one characteristic landmark, and it can still be recognized successfully without it (e.g., London, Venice).

In our experiment, the object pops out from the city scene and captures our attention (bottom-up). This attentional capture might prevent information from the visual scene/surrounding city from being considered for recognition. The recognition process is therefore only based on information about the deceptive landmark (top-down). In this case, the attentional capture might be caused by the high contextual salience of the landmark (Caduff, Timpf 2008), as it is clearly distinguishable from the rest of the scenery. This phenomenon could as well be explained within a semantic network with two contradicting associations: one is based on the deceptive landmark while the other is based on the surroundings. The attentional capture on the deceptive landmark inhibits any information of the further city scene from being considered for recognition.

Another possible interpretation could come from the research field of decision making: According to dual-process theories (type 1 versus type 2
processing), decisions (here: what city is represented?) could be made consciously and unconsciously (e.g., Markic 2009). One key aspect of the unconscious, automatic process is associative learning (Evans 2003), which might explain that a single landmark stores all of the relevant information for the context (object = city = explicit knowledge). This experiment shows some important connections between perception and recognition of spatial information on one side and theories of attention and decision making on the other. This could serve as a valuable basis for future research on visuo-spatial information processing.

References
Caduff D, Timpf S (2008) On the assessment of landmark salience for human wayfinding. Cogn Process 9(4):249–267
Clerici A, Mironowicz I (2009) Are landmarks essential to the city – its development? In: Schrenk M, Popovich VV, Engelke D, Elisei P (eds) REAL CORP 2009: Cities 3.0 – smart, sustainable, integrative: strategies, concepts and technologies for planning the urban future. Eigenverlag des Vereins CORP – Competence Center of Urban and Regional Planning, pp 23–32
Evans J St B (2003) In two minds: dual-process accounts of reasoning. Trends Cogn Sci 7(10):454–458
Evans GW, Skorpanich MA, Gärling T, Bryant KJ, Bresolin B (1984) The effects of pathway configuration, landmarks and stress on environmental cognition. J Exp Psychol 4:323–335
Markic O (2009) Rationality and emotions in decision making. Interdiscip Descrip Complex Syst 7(2):54–64

Explicit place-labeling supports spatial knowledge in survey, but not in route navigation

Gregor Hardiess, Marc Halfmann, Hanspeter Mallot

Cognitive Neuroscience, University of Tübingen, Germany

The knowledge about navigational space develops with landmark and route knowledge as the precursors of survey (map-like) knowledge (Siegel, White 1975) – a scheme that is widely accepted as the dominant framework. Route knowledge is typically based on an egocentric reference frame, and learning a route is simply forming place-action associations between locations (places) and the actions to take in the sequence of the route. On the other hand, in survey knowledge, places need to be represented independently of viewing direction and position. Furthermore, survey representations include configural knowledge about the relations (topologic, action-based, or graph-like) between the places in the environment. In wayfinding, it seems that navigators can draw upon different memory representations and formats of spatial knowledge depending on the task at hand and the time available for learning.

The hierarchy of spatial representation comprises different levels of granularity. At the finest level, the recognition of landmarks (i.e., salient and permanent patterns or objects available in the environment) has to be considered. Grouping spatially related landmarks together leads to the concept of a place, the fundamental unit of routes and maps. Building a route involves the connection of places with the corresponding spatial behavior. At this intermediate level, several routes can exist in parallel, also with spatial overlap but without interactions with each other (Mallot, Basten 2009). Route combination occurs first at the level of survey representations. Here, the embedding of places as well as routes in a so-called cognitive map as a configural representation of the environment enables the creation of novel routes and shortcuts to find the goal (Tolman 1948). On top of the hierarchy, the coarsest level of granularity is provided by the formation of regions (Wiener, Mallot 2003), where spatially related parts of the map cluster together. Depending on task demands and the time available for spatial learning, the coding of space can be supported at each of the levels of granularity or in combination.

The interaction of language and space has been studied on a wide variety of aspects, including the acquisition of spatial knowledge from verbal descriptions, verbal direction giving, influences of spatial reference frames which are employed in specific languages on the judgment of similarity of spatial configurations, or retrospective reports of spatial thinking. Little is known, however, about possible functions of language-based or language-supported representations in actual navigational, or wayfinding, behavior. In a dual-task study, Meilinger et al. (2008) showed that verbal distractor tasks are more detrimental to route navigation than distractor tasks involving visual imagery or spatial hearing. In an ongoing study, Meilinger et al. (2009) investigate advantages of different types of verbal place codes, i.e. names describing local landmarks vs. arbitrary names. In that study, descriptive naming leads to better navigational results than arbitrary naming.

In the present study, the role of language-supported representations of space was assessed in two wayfinding experiments (using virtual reality) with labeling of places, using a route and a survey knowledge task, respectively. In the association phase of both tasks, subjects were requested to label the places either with semantically meaningful names (word condition) or icons (icon condition) to build up a link between sensorimotor and language representation. In a control condition no labeling was required. In the route task, subjects simply learned to repeat a route (containing 10 places) from a given starting point to a goal location in a stereotyped way (route phase). In the survey task, subjects first had to learn a set of four intersecting routes (containing 45 places) and then were asked to infer four novel routes by recombining sections of learned routes (survey phase). Wayfinding performance was assessed by the distance subjects travelled to find the goal in PAO (percentage above optimum). Overall, we found no differences between word-based and icon-based labeling. Labeling supported wayfinding not in the route task (no effect of label condition on distance), but in the survey knowledge task. There, subjects performed the survey phase in the word as well as in the icon condition with reduced walking compared to the control condition. Furthermore, this supporting effect was more pronounced in subjects with good wayfinding scores. We conclude that the associated place-labels supported the formation of abstract place concepts and further the inference of novel routes from known route segments, which are useful in the more complex (higher hierarchy and representational level) survey task, but not in the simple route task, where just stereotyped stimulus–response associations without planning are needed.

References
Mallot HA, Basten K (2009) Embodied spatial cognition: biological and artificial systems. Image Vision Comput 27(11):1658–1670
Meilinger T, Knauff M, Bülthoff HH (2008) Working memory in wayfinding – a dual task experiment in a virtual city. Cogn Sci 32(4):755–770
Meilinger T, Schulte-Pelkum J, Frankenstein J, Laharnar N, Hardiess G, Mallot HA, Bülthoff HH (2009) Place naming – examining the influence of language on wayfinding. In: Taatgen N, van Rijn H (eds) Proceedings of the thirty-first annual conference of the cognitive science society. Cognitive Science Society
Siegel AW, White SH (1975) The development of spatial representations of large-scale environments. Adv Child Dev Behav 10:9–55
Tolman EC (1948) Cognitive maps in rats and man. Psychol Rev 55:189–208
Wiener JM, Mallot HA (2003) Fine-to-coarse route planning and navigation in regionalized environments. Spatial Cogn Comput 3(4):331–358
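PAO (percentage above optimum), the wayfinding measure used in the abstract above, is conventionally the excess of the travelled distance over the optimal distance, expressed as a percentage of the optimum. The abstract does not spell out its exact formula, so the short Python helper below is an assumption based on that common definition; the function name and example values are invented.

def percentage_above_optimum(travelled: float, optimal: float) -> float:
    # PAO = how much longer the travelled path was than the shortest possible path,
    # expressed as a percentage of that shortest path (0 = perfectly efficient).
    return 100.0 * (travelled - optimal) / optimal

# Example: a subject walks 130 m to a goal whose shortest route is 100 m.
print(percentage_above_optimum(130.0, 100.0))  # 30.0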

How important is having emotions for understanding others' emotions accurately?

Larissa Heege, Albert Newen

Ruhr-University Bochum, Germany

Mirror neuron theory for understanding others' emotions
According to the research group which discovered mirror neurons in Parma, emotions can be understood through cognitive elaborations of visual emotional expressions and without a major involvement of mirror neuron mechanisms. They assume, though, that this provides only a pale and detached account of others' emotions (Rizzolatti et al. 2004):
"It is likely that the direct viscero-motor mechanism scaffolds the cognitive description, and when the former mechanism is not present or malfunctioning, the latter provides only a pale, detached account of the emotions of others." (Rizzolatti et al. 2004)
Mirror neurons in reference to emotions are neurons that fire when we have an emotion as well as when we observe somebody else having the same emotion. It is assumed that mirror neuron mechanisms evoke in the observer an understanding of others' emotions which is based on resonances of the observer's own emotions. This way an automatic first-person understanding of others' emotions originates (Rizzolatti and Sinigaglia 2008; Rizzolatti et al. 2004):
"Side by side with the sensory description of the observed social stimuli, internal representations of the state associated with these […] emotions are evoked in the observer, as if they […] were experiencing a similar emotion." (Rizzolatti et al. 2004)
Thus somebody who is not able to have a specific emotion would also not be able to have a first-person "as if" understanding of this emotion in others. Resonances of own emotions could not be produced; the mirror neuron mechanism would not be present or could not work appropriately. If this person used instead primarily cognitive elaborations to understand this emotion in others, his emotion understanding should be pale and detached, according to mirror neuron theory.

Psychopaths and having the emotion of fear
Primary (low-anxious) psychopaths demonstrated in the PANAS (positive affect, negative affect scales) a significant negative correlation with having the emotion of fear (-.297) (Del Gaizo and Falkenbach 2008). Furthermore, an experiment showed that psychopaths, in contrast to non-psychopaths, do not get anxious when they breathe in the stress sweat of other people (Dutton 2013). Psychopaths also have a reduced amygdala activity (Gordon et al. 2004) and a reduced startle response (Herpertz et al. 2001).

Psychopaths and understanding fear in others
In a study, 24 photographs showing different facial expressions (happy, sad, fearful, angry, disgusted and neutral) were presented to psychopathic inmates and non-psychopaths. The psychopathic inmates demonstrated a greater skill in recognizing fearful faces than the non-psychopaths (Book et al. 2007):
"[A] general tendency for psychopathy [is] to be positively associated with increased accuracy in judging emotional intensity for facial expressions in general and, more specifically, for fearful faces." (Book et al. 2007)
Psychopaths also identify bodily expressions which are based on fear/anxiety significantly better than non-psychopaths: Ted Bundy, a psychopathic serial killer, stated that he could identify a good victim due to her gait. Relating to this statement, in a study twelve videos of people walking through a corridor were shown to psychopaths and non-psychopaths; six of the walking people had been victims in their past. The psychopaths and non-psychopaths had to decide how likely the persons in the videos were to get mugged. The study found a robust, positive correlation between primary (low-anxious) psychopathic traits and accuracy in naming the persons, who had been victims in their past, to be the ones most likely to get mugged. Secondary (high-anxious) psychopaths did not demonstrate such a skill (Wheeler et al. 2009).

In a similar study, five students had to walk through a lecture hall in front of other students with many and few psychopathic traits. One of the walking students carried a hidden handkerchief. The students with many and few psychopathic traits had to guess who hid the handkerchief. Seventy percent of the students with many psychopathic traits named the right student; of the students with few psychopathic traits, just thirty percent named the student with the handkerchief (Dutton 2013).

In another study, people with many psychopathic traits showed a decreased amygdala activity during emotion-recognizing tasks. The people with primary psychopathic traits also showed an increased activity in the visual and the dorsolateral prefrontal cortex. So primary psychopaths use far more brain areas associated with cognition and perception when they solve emotion-recognizing tasks (Gordon et al. 2004).

Conclusions
Primary psychopaths use primarily cognitive elaborations to understand others' emotions and (almost) do not have the emotion of fear. Thus, according to mirror neuron theorists, psychopaths should have a pale, detached account of fear in others (see end of first paragraph).
Psychopaths are surely not able to have a first-person "as if" understanding of others' fear: they cannot feel fear with others. In this way it is possible to say that psychopaths have a pale, detached account of others' emotions.
However, it cannot be said that the outcome of their understanding of others' fear is pale and detached. In fact, they recognize others' fear often more accurately than people who are able to have fear. Also, we can conclude that (at least for psychopaths) having an emotion is not important for understanding this emotion in others accurately.

References
Book AS, Quinsey VL, Langford D (2007) Psychopathy and the perception of affect and vulnerability. Crim Justice Behav 34(4):531–544. doi:10.1177/0093854806293554
Del Gaizo AL, Falkenbach DM (2008) Primary and secondary psychopathic-traits and their relationship to perception and experience of emotion. Pers Indiv Differ 45:206–212. doi:10.1016/j.paid.2008.03.019
Dutton K (2013) The wisdom of psychopaths. Arrow, London
Gordon HL, Baird AA, End A (2004) Functional differences among those high and low on a trait measure of psychopathy. Biol Psychiatry 56(7):516–521. doi:10.1016/j.biopsych.2004.06.030
Herpertz SC, Werth U et al. (2001) Emotion in criminal offenders with psychopathy and borderline personality disorder. Arch Gen Psychiatry 58:737–745. doi:10.1001/archpsyc.58.8.737
Rizzolatti G, Gallese V, Keysers C (2004) A unifying view of the basis of social cognition. Trends Cogn Sci 8(9):396–403. doi:10.1016/j.tics.2004.07.002
Rizzolatti G, Sinigaglia C (2008) Mirrors in the brain. Oxford University Press, Oxford
Wheeler S, Book AS, Costello K (2009) Psychopathic traits and perceptions of victim vulnerability. Crim Justice Behav 36:635–648. doi:10.1177/0093854809333958

Prosody conveys speakers' intentions: acoustic cues for speech act perception

Nele Hellbernd, Daniela Sammler
Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany

Recent years have seen a major change in views on language and language use. During the last decades, language use has been more and more recognized as an intentional action (Grice 1957). In the form of speech acts (Austin 1962; Searle 1969), language expresses the speaker's attitudes and communicative intents to shape the listener's reaction. Notably, the speaker's intention is often not directly coded in the lexical meaning of a sentence, but rather conveyed implicitly, for example via nonverbal cues such as facial expressions, body posture, and speech prosody. The theoretical work of intonational phonologists seeking to define the meaning of specific vocal intonation profiles (Bolinger 1986; Kohler 1991) demonstrates the role of prosody in conveying the speaker's conversational goal. However, to date only little is known about the neurocognitive architecture underlying the comprehension of communicative intents in general (Holtgraves 2005; Egorova, Shtyrov, Pulvermüller 2013), and the distinctive role of prosody in particular.
The present study aimed, therefore, to investigate this interpersonal role of prosody in conveying the speaker's intents and its underlying acoustic properties. Taking speech act theory as a framework for intention in language (Austin 1962; Searle 1969), we created a novel set of short (non-)word utterances intoned to express different speech acts. Adopting an approach from emotional prosody research (Banse, Scherer 1996; Sauter, Eisner, Calder, Scott 2010), this stimulus set was employed in a combination of behavioral ratings and acoustic analyses to test the following hypotheses: if prosody codes for the communicative intention of the speaker, we expect (1) above-chance behavioral recognition of different intentions that are merely expressed via prosody, (2) acoustic markers in the prosody that identify these intentions, and (3) independence of acoustics and behavior from the overt lexical meaning of the utterance.
The German words "Bier" (beer) and "Bär" (bear) and the non-words "Diem" and "Dahm" were recorded from four (two female) speakers expressing six different speech acts in their prosody: criticism, wish (expressives), warning, suggestion (directives), doubt, and naming (assertives). Acoustic features for pitch, duration, intensity, and spectral features were extracted with PRAAT. These measures were subjected to discriminant analyses, separately for words and non-words, in order to test whether the acoustic features have enough discriminant power to classify the stimuli into their corresponding speech act category. Furthermore, 20 participants were tested for the behavioral recognition of the speech act categories with a six-alternative forced-choice task. Finally, a new group of 40 participants performed subjective ratings of the different speech acts (e.g. "How much does the stimulus sound like criticism?") to obtain more detailed information on the perception of different intentions and to allow, as a quantitative variable, further analyses in combination with the acoustic measures.
The discriminant analyses of the acoustic features yielded high above-chance predictions for each speech act category, with an overall classification accuracy of about 90 % for both words and non-words (chance level: 17 %). Likewise, participants were behaviorally very well able to classify the stimuli into the correct category, with a slightly lower accuracy for non-words (73 %) than for words (81 %). Multiple regression analyses of participants' ratings of the different speech acts and the acoustic measures further identified distinct patterns of physical features that were able to predict the behavioral perception.
These findings indicate that prosodic cues convey sufficient detail to classify short (non-)word utterances according to their underlying intention, at acoustic as well as perceptual levels. Lexical meaning seems to be supportive but not necessary for the comprehension of different intentions, given that participants showed a high performance for the non-words, but scored higher for the words. In total, our results show that prosodic cues are powerful indicators of the speaker's intentions in interpersonal communication. The present carefully constructed stimulus set will serve as a useful tool to study the neural correlates of intentional prosody in the future.

References
Austin JL (1962) How to do things with words. Oxford University Press, Oxford
Banse R, Scherer KR (1996) Acoustic profiles in vocal emotion expression. J Pers Soc Psychol 70(3):614–636
Bolinger D (1986) Intonation and its parts: melody in spoken English. Stanford University Press, Stanford
Egorova N, Shtyrov Y, Pulvermüller F (2013) Early parallel processing of pragmatic and semantic information in speech acts: neurophysiological evidence. Front Human Neurosci 7
Grice HP (1957) Meaning. Philos Rev 66(3):377–388
Holtgraves T (2005) The production and perception of implicit performatives. J Pragm 37(12):2024–2034
Kohler KJ (ed) (1991) Studies in German intonation (No. 25). Institut für Phonetik und digitale Sprachverarbeitung, Universität Kiel
Sauter D, Eisner F, Calder A, Scott S (2010) Perceptual cues in nonverbal vocal expressions of emotion. Quart J Exp Psychol 63(11):2251–2272
Searle JR (1969) Speech acts: an essay in the philosophy of language. Cambridge University Press, Cambridge
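The classification step described above can be illustrated with a small sketch. The following is a hypothetical example, not the authors' PRAAT/analysis pipeline: it assumes one acoustic feature vector per utterance and six speech-act labels, and uses a cross-validated linear discriminant classifier in place of the discriminant analyses reported here; all sizes and data are placeholders.

```python
# Hypothetical sketch of the acoustic classification step (not the authors' code):
# classify speech-act categories from per-utterance acoustic feature vectors
# (pitch, duration, intensity, spectral measures) with linear discriminant analysis.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_utterances, n_features = 240, 12                  # assumed sizes for illustration
X = rng.normal(size=(n_utterances, n_features))     # placeholder acoustic features
y = rng.integers(0, 6, size=n_utterances)           # six speech-act categories

lda = LinearDiscriminantAnalysis()
accuracy = cross_val_score(lda, X, y, cv=5).mean()  # chance would be ~1/6, i.e. about 17 %
print(f"cross-validated classification accuracy: {accuracy:.2f}")
```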
On the perception and processing of social actions

Matthias R. Hohmann, Stephan de La Rosa, Heinrich H. Bülthoff
Max Planck Institute for Biological Cybernetics, Tübingen, Germany

Action recognition research has mainly focused on investigating the perceptual processes in the recognition of isolated actions from biological motion patterns. Surprisingly little is known about the cognitive representation underlying action recognition. A fundamental question concerns whether actions are represented independently or interdependently. Here we examined whether the cognitive representations of static (action image) and dynamic (action movie) actions are dependent on each other and whether cognitive representations for static and dynamic actions overlap.
Adaptation paradigms are an elegant way to examine the presence of a relationship between different cognitive representations. In an adaptation experiment, participants view a stimulus, the adaptor, for a prolonged amount of time and afterwards report their perception of a second, ambiguous test stimulus. Typically, the perception of the second stimulus will be biased away from the adaptor stimulus. The presence of an antagonistic perceptual bias (adaptation effect) is often taken as evidence for the interdependency of the cognitive representations of the test and adaptor stimuli.
We manipulated the dynamic content (dynamic vs. static) of the test and adaptor stimulus independently. The ambiguous test stimulus was created by a weighted linear morph between the spatial positions of the two adapting actions (handshake, high five). 30 participants categorized the ambiguous dynamic or static action stimuli after being adapted to dynamic or static actions. Afterwards, we calculated the perceptual bias for each participant by fitting a psychometric function to the data. We found an action-adaptation after-effect in some but not all experimental conditions. Specifically, the effect was only present if the presentation of the adaptor and the test stimulus was congruent, i.e. if both were presented in either a dynamic or a static manner (p < 0.001). This action-adaptation after-effect indicates a
dependency between cognitive representations when adaptor and test stimuli have the same dynamic content (i.e. both static or dynamic). Future studies are needed to relate those results to other findings in the field of action recognition and to incorporate a neurophysiological perspective.
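To make the bias estimate concrete, the following minimal sketch (an assumption-laden illustration, not the authors' analysis code) fits a logistic psychometric function to hypothetical categorization data; the fitted 50 % point (the point of subjective equality) is the quantity whose shift between adaptation conditions would index the after-effect.

```python
# Illustrative sketch (assumed data and morph levels, not the authors' analysis):
# estimate the perceptual bias after adaptation by fitting a logistic psychometric
# function to the proportion of "high five" responses across morph levels.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, pse, slope):
    return 1.0 / (1.0 + np.exp(-(x - pse) / slope))

morph_levels = np.linspace(0.0, 1.0, 7)                          # 0 = handshake, 1 = high five
p_high_five = np.array([0.05, 0.1, 0.2, 0.55, 0.8, 0.9, 0.97])   # placeholder response proportions

(pse, slope), _ = curve_fit(logistic, morph_levels, p_high_five, p0=[0.5, 0.1])
print(f"point of subjective equality: {pse:.2f}")   # shift relative to baseline = adaptation bias
```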
Stage-level and individual-level interpretation of multiple adnominal adjectives as an epiphenomenon: theoretical and empirical evidence

Sven Kotowski, Holden Härtl
Institut für Anglistik und Amerikanistik, Universität Kassel, Germany

As observed by various authors (among others Bolinger 1967; Cinque 2010; Larson 1998), certain adjectives in several languages are semantically ambiguous in different adnominal positions. These ambiguities concern semantic oppositions such as intersective vs. non-intersective, restrictive vs. non-restrictive, or individual-level vs. stage-level. Thus, the time-honored examples in (1a/b) are argued to have two distinct interpretations:

(1) a. the visible stars
    b. the stars visible

In (1a), visible can either have an occasion/stage-level (SL) or a characterizing/individual-level (IL) reading. The postnominal adjective in (1b), however, is non-ambiguous and allows for the SL-reading only (cf. Kratzer 1995 for test environments). Furthermore, when the same adjective occurs twice prenominally (2), the two interpretations are linked to rigidly ordered positions (cf. Cinque 2010; Larson 1998):

(2) the visible[SL] visible[IL] stars

In this paper, we argue that the order of multiple prenominal adjectives in German (and possibly cross-linguistically) cannot be derived on the basis of an inherent dichotomy between SL- and IL-predicates, but requires a more general analysis of adnominal adjective order. SL and IL are not intrinsically ordered along the lines of (2), i.e. SL > IL. Rather, they are found in this very order due to different adjectival functions in a layered structure around the NP's head. Crucially, in such adjective doublets, the second adjective always receives a generic reading, i.e. the [A2 N] in such [A1 [A2 N]] expressions functions as a complex common name that denotes a subkind of the kind denoted by the head noun (Gunkel, Zifonun 2009): in (1)/(2) above, if star denotes the kind STAR, (1a) is ambiguous between a subkind and a qualifying reading, while in (2) the cluster visible2 stars is interpreted as a subkind VISIBLE STAR and thus disambiguated. Accordingly, we assume that doublets increase in general acceptability if A2Ns fulfil establishedness conditions and pass tests with kind-selecting predicates (like INVENT etc.; see e.g. Krifka et al. 1995). For example, a taxonomic subkind reading is triggered by the indefinite NP in (3a), while no such downward-projecting taxonomic inference occurs for non-established complex expressions (3b):

(3) a. John invented a potato peeler. → a kind of POTATO PEELER
    b. John invented a cauliflower steamer. ↛ the kind CAULIFLOWER STEAMER

As regards general restrictions on adnominal adjective order, we assert a lack of descriptive adequacy for purely formal/syntactic (in particular cartographic) as well as for purely semantic and/or communicative-functional NP models. Instead, we argue that prenominal adjective sequences include at least three distinct semantic-syntactic layers: a classifying layer (CLAS; e.g. relational adjectives like musical), an absolute-qualifying layer (QA; e.g. basic color terms), and a relative-qualifying layer (QR; e.g. dimensional adjectives). The former two share certain semantic and morphosyntactic features (-gradable), yet are set apart with respect to possible occurrence in predicative position. The latter two's relation shows the reverse characteristics (both +predicative use, yet they differ in gradability). Adjective order at the right prenominal edge of Germanic NPs tends to follow the sequence QR > QA > CLAS > N. At the same time, classifying adjectives (either inherent classifiers such as relational adjectives or other adjectives functioning as classifiers in established names) typically function as modifiers of complex names: just as in, e.g., NN-compounds, where the modifying N is non-referential, CLAS-adjectives do not locate the NP-referent spatio-temporally but classify it as a member of a certain kind. Therefore, the IL-interpretation of A2 in e.g. (2) is an epiphenomenon of more global constraints on modifier order: in doublets they are interpreted as CLAS, a layer for which SL-interpretations are not available.
To test our hypothesis, we conducted two questionnaire studies on German adjective order. Study 1 was a 100-point split task designed to test subjects' order preferences when confronted with polysemous adjectives in combination with another adjective (i.e. either a time-stable reading (e.g. wild & [not domesticated]) or a temporary reading (wild & [furious]) in combination with a time-stable adjective, e.g. big). Introductory context paragraphs promoted both adjectives' readings within an item. Subjects had to distribute 100 points over two possible follow-up sentences, with alternating A1-A2 orders, according to which sentence was more natural given the context. Crucially, the time-stable AN-syntagms did not denote established kinds, i.e. the task tried to elicit order preferences based on a potential IL-SL distinction only. While control items following clear-cut order regularities described in the literature irrespective of temporality (e.g. small French car) scored significantly better than either of the above test categories, differences between the IL- and SL-categories were clearly non-significant.
In a follow-up study currently being conducted, subjects are presented with introductory sentences containing AAN-clusters not further specified as regards interpretation. Again, alternating adjectival senses are utilized. In each test sentence one A is a deverbal adjective ending in -bar (the rough German equivalent of English -ible/-able; e.g. ausziehbar "extendable"), which displays a systematic ambiguity between an occasion and a habitual reading (Motsch 2004). Combined with respective nouns, these adjectives in one reading encode established kinds (e.g. ausziehbarer Tisch "pull-out table"; CLAS modification), while the respective second adjective encodes a time-stable property that does not exhibit a kind reading in an AN-syntagm (e.g. blauer ausziehbarer Tisch "blue pull-out table"). Subjects are then asked to rate follow-up sentences according to their naturalness as discourse continuations; these systematically violate the occasion reading, and we hypothesize that continuations will score higher for [A≠KIND AKIND N] than for [AOCCASION(POTENTIAL KIND) A≠KIND N] expressions. Should this hypothesis be confirmed, we take the results, together with the findings from study 1, as support for the above reasoning that observed adjective interpretations as in (2) do not derive primarily from a grammatical distinction between IL- and SL-predicates, but need to be understood as an epiphenomenon of more general constraints on adjective order and kind reference.

References
Bolinger D (1967) Adjectives in English: attribution and predication. Lingua 18:1–34
Cinque G (2010) The syntax of adjectives. A comparative study. MIT Press, Cambridge, MA
Fernald T (2000) Predicates and temporal arguments. Oxford University Press, Oxford
Kratzer A (1995) Stage-level and individual-level predicates. In: Carlson GN, Pelletier FJ (eds) The generic book. The University of Chicago Press, Chicago, pp 125–175
Krifka M, Pelletier FJ, Carlson GN, ter Meulen A, Chierchia G, Link G (1995) Genericity: an introduction. In: Carlson GN, Pelletier FJ (eds) The generic book. The University of Chicago Press, Chicago, pp 1–124
Larson R (1998) Events and modification in nominals. In: Strolovitch D, Lawson A (eds) Proceedings from Semantics and Linguistic Theory (SALT) VIII. Cornell University Press, Ithaca, pp 145–168
Motsch W (2004) Deutsche Wortbildung in Grundzügen. Walter de Gruyter, Berlin

What happened to the crying bird? Differential roles of embedding depth and topicalization modulating syntactic complexity in sentence processing

Carina Krause 1, Bernhard Sehm 1, Anja Fengler 1, Angela D. Friederici 1, Hellmuth Obrig 1,2
1 Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; 2 University Hospital Leipzig, Clinic for Cognitive Neurology, Germany

"The rat the cat the dog bit chased escaped." Previous studies provide evidence that the processing of such hierarchical syntactic structures involves a network including the inferior frontal gyrus and temporo-parietal regions (Friederici 2009; Fengler et al. in press) as two key players. While most studies locate the processing of syntactically complex sentences in Broca's area (BA44/45), some studies also report the involvement of BA47 and BA6 (Friederici 2011), and of temporo-parietal areas (Shetreet et al. 2009). Why is there so much variation in localizing the syntactic complexity effect? The interpretation of multiply embedded sentence structures represents a particular challenge to language processing, requiring syntactic hierarchy building and verbal working memory. Thereby, syntactic operations may differentially tax general verbal working memory capacities, preferentially relying on temporo-parietal (TP) regions (Meyer et al. 2012), and more syntax-specific working memory domains, preferentially relying on IFG structures (Makuuchi et al. 2009). To disentangle the specific contribution of each subsystem, we developed stimulus material that contrasts syntactic complexity and the working memory aspects. The goal of our project is to use this material in facilitation (tDCS study) and impairment (lesion study) to allow ascribing causal roles of the above brain areas to these three aspects of syntax processing.

Methods
20 healthy participants (mean age: 24) performed an auditory sentence-picture-matching task. Both reaction times and error rates were recorded.

Paradigm
In a number of pilot studies (each with 10–15 participants), task complexity was varied (number of choice options, distractors, presentation order). Our stimulus set is based on material used in previous studies (Antonenko et al. 2013; Fengler et al. in press) and consists of 132 German transitive sentences. It has a 2×3-factorial design tapping argument order (A: subject- vs. B: object-first) and depth of syntactic embedding (0: no, 1: single, 2: double embedding):

A0: Der Vogel ist braun, er wäscht den Frosch, und er weint.
B0: Der Vogel ist braun, ihn wäscht der Frosch, und er weint.
A1: Der Vogel, der braun ist, und der den Frosch wäscht, weint.
B1: Der Vogel, der braun ist, und den der Frosch wäscht, weint.
A2: Der Vogel, der den Frosch, der braun ist, wäscht, weint.
B2: Der Vogel, den der Frosch, der braun ist, wäscht, weint.

(A0, for example, translates as "The bird is brown, it washes the frog, and it cries.")

Results and Conclusion
In healthy subjects, only successive presentation of auditorily presented sentences and the ensuing pictures (three distractors) yields robust behavioral differences. As a function of both (i) level of embedding and (ii) topicalization, we find highly significant effects in terms of increasing reaction times (embedding: F(2,32) = 46.610, p < .001; topicalization: F(1,16) = 25.003, p < .001) as well as decreased accuracy (embedding depth: F(2,32) = 20.826, p < .001; topicalization: F(1,16) = 10.559, p = .005). Interestingly, the factors do not interact, suggesting partially independent factorial influences on syntactic processing. Currently the paradigm is used in a study with facilitatory transcranial direct current stimulation (tDCS) of each key area (IFG vs. TP region). Additionally, patients with circumscribed acquired brain lesions are tested on different versions of the paradigm adapted to the requirements of language-compromised patients.

References
Antonenko D, Brauer J, Meinzer M, Fengler A, Kerti L, Friederici A, Flöel A (2013) Functional and structural syntax networks in aging. Neuroimage 83:513–523
Friederici A (2009) Pathways to language: fiber tracts in the human brain. Trends Cogn Sci 13(4):175–181
Friederici A (2011) The brain basis of language processing: from structure to function. Physiol Rev 91(4):1357–1392
Makuuchi M, Bahlmann J, Anwander A, Friederici A (2009) Segregating the core computational faculty of human language from working memory. PNAS 106(20):8362–8367
Meyer L, Obleser J, Anwander A, Friederici A (2012) Linking ordering in Broca's area to storage in left temporo-parietal regions: the case of sentence processing. Neuroimage 62(3):1987–1998
Shetreet E, Friedmann N, Hadar U (2009) An fMRI study of syntactic layers: sentential and lexical aspects of embedding. Neuroimage 48(4):707–716
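For illustration only, a repeated-measures analysis of the kind reported above could be set up as in the following sketch; the data frame, factor codings and reaction-time values are placeholders, not the authors' data or analysis script.

```python
# Hypothetical sketch of a 2x3 repeated-measures ANOVA on reaction times with the
# factors argument order (subject- vs. object-first) and embedding depth (0, 1, 2),
# assuming one mean RT per participant and design cell.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
rt_data = pd.DataFrame({
    "subject":   [s for s in range(1, 21) for _ in range(6)],   # 20 participants x 6 cells
    "order":     ["subject_first", "object_first"] * 60,
    "embedding": [d for _ in range(40) for d in (0, 1, 2)],
    "rt":        rng.normal(1.0, 0.2, size=120),                # placeholder RTs in seconds
})

result = AnovaRM(rt_data, depvar="rt", subject="subject",
                 within=["order", "embedding"]).fit()
print(result)
```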
fMRI evidence for a top-down grouping mechanism establishing object correspondence in the Ternus display

Katrin Kutscheidt 1, Elisabeth Hein 2, Manuel Jan Roth 1, Axel Lindner 1
1 Department of Cognitive Neurology, Hertie Institute for Clinical Brain Research, Tübingen, Germany; 2 Department of Psychology, University of Tübingen, Germany

Our visual system is constantly confronted with ambiguous sensory input. However, it is rarely perceived as being ambiguous. It is, for instance, possible to keep track of multiple moving objects in parallel, even if occlusions or eye blinks might prevent the unique assignment of objects' identities based on sensory input. Hence, neural mechanisms, bottom-up or top-down, must disambiguate conflicting sensory information. The aim of this study was to shed light on the underlying neural mechanisms establishing object correspondence across space and time despite such ambiguity.
To this end, we performed a functional magnetic resonance imaging (fMRI) study using a variant of the Ternus display (Ternus 1926). The Ternus display is an ambiguous apparent motion stimulus in which two sets of three equidistant disks are presented in the following way: while two disks are always presented at the same position, a third disk alternates between a position to the left and a position to the right of these two central disks. This display either leads to the
percept of group motion (GM), where an observer has the impression that all three disks move coherently as one group, or, alternatively, to the percept of element motion (EM), in which the outermost disk is seen as jumping back and forth over stationary central disks. The way the Ternus display is perceptually interpreted thereby depends on both low-level features (e.g. the inter-frame interval [IFI]; Petersik, Pantle 1979) and higher-level factors (e.g. context information; He, Ooi 1999).
Our Ternus display consisted of three white disks presented on a grey background. The disks were shown for 200 ms in alternating frames. Each stimulus block lasted five minutes, during which participants (n = 10) had to fixate a central fixation cross and to manually indicate their respective motion percept, GM or EM, using a button box. Due to the ambiguous nature of the stimulus, participants' perceptual interpretation constantly changed during the course of the experiment. The average percept duration across individuals was ~11 s for GM and ~8 s for EM. To guarantee comparable percept durations also within participants, we individually estimated the IFI at which EM and GM were perceived equally often in a pre-experiment. The IFI in the MRI experiment was then adjusted accordingly. The experiment comprised six blocks, each preceded by a 30 s baseline period without stimulus presentation.
Functional (TR = 2 s) and anatomical MRI images were acquired on a 3 T Siemens TRIO scanner and processed using SPM8. In participant-specific first-level analyses, we specified general linear models including three regressors: (i) onset of the GM percept; (ii) onset of the EM percept; (iii) stimulus presentation. All regressors were convolved with the canonical haemodynamic response function. The initial fixation period was not explicitly modelled and served as a baseline. In each participant, we individually identified task-related regions of interest (ROIs) by contrasting stimulus presentation (iii) vs. baseline. Only those areas were considered ROIs that also surfaced in a second-level group analysis of the same contrast. Task-related bilateral ROIs were the lingual gyrus (LG), V3a, V5 and the intraparietal sulcus (IPS). For each individual and for each ROI, we then extracted the time course of fMRI activity in order to perform time-resolved group analyses of activity differences between EM and GM percepts. Analyses of the simultaneously recorded eye data helped to exclude influences of eye blinks, saccades, eye position and eye velocity on the motion percepts, as no difference between conditions was revealed.
In all ROIs a perceptual switch was accompanied by a significant peak in fMRI activity around the time of the indicated switch (p < .05). While the amplitude of these peaks did not differ between perceived GM and EM across all ROIs (p > .05, n.s.), we observed significant differences in the temporal onset of the switch-related fMRI response in GM and EM (p < .01). Specifically, there was a particularly early rise in switch-related fMRI activity in IPS for GM, which occurred about three seconds before the participant finally switched from EM to GM. In the case of EM, on the other hand, this switch-related increase in fMRI activity in IPS seemed to occur rather after the perceptual switch. Area V5 exhibited comparable results but showed less of a temporal difference between GM and EM (p < .05). In contrast, in areas LG and V3a the rise in fMRI activity was rather time-locked to the perceptual switch per se, being indistinguishable between GM and EM (p > .05, n.s.).
Our results revealed significant peaks of fMRI activity that were correlated with a switch between two perceptual interpretations (GM or EM) of a physically identical stimulus in LG, V3a, V5 and IPS, brain regions which are also involved in visual motion processing (e.g. Sunaert, Van Hecke, Marchal, Orban 1999). Importantly, the time course of switch-related activity in IPS additionally suggests a potential top-down influence on other areas (cf. Sterzer, Kleinschmidt, Rees 2009), here to mediate the perception of GM. The specific role of IPS could thereby relate to the spatial binding of individual objects into a group (cf. Zaretskaya, Anstis, Bartels 2013). This idea is consistent with the theory of Kramer and Yantis (1997), suggesting that object correspondence in the Ternus display could be determined by top-down spatial binding of the discs within particular frames.

Acknowledgments
This work was supported by a grant from the BMBF (FKZ 01GQ1002 to A.L.).

References
He ZJ, Ooi TL (1999) Perceptual organization of apparent motion in the Ternus display. Perception 28:877–892
Kramer P, Yantis S (1997) Perceptual grouping in space and time: evidence from the Ternus display. Percept Psychophys 59:87–99
Petersik JT, Pantle A (1979) Factors controlling the competing sensations produced by a bistable stroboscopic motion display. Vision Res 19(2):143–154
Sterzer P, Kleinschmidt A, Rees G (2009) The neural bases of multistable perception. Trends Cogn Sci 13:310–318
Sunaert S, Van Hecke P, Marchal G, Orban GA (1999) Motion-responsive regions of the human brain. Exp Brain Res 127:355–370
Ternus J (1926) Experimentelle Untersuchungen über phänomenale Identität [Experimental investigations of phenomenal identity]. Psychologische Forschung 7:81–136
Zaretskaya N, Anstis S, Bartels A (2013) Parietal cortex mediates conscious perception of illusory gestalt. J Neurosci 33:523–531
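The first-level model described above can be sketched schematically as follows; this is an illustrative approximation with an assumed HRF shape and invented onset times, not the authors' SPM8 design.

```python
# Schematic sketch of the first-level design described above (not the authors'
# SPM8 pipeline): three event regressors (GM-percept onsets, EM-percept onsets,
# stimulus presentation) convolved with a simple double-gamma HRF.
import numpy as np
from scipy.stats import gamma

TR = 2.0                      # repetition time in seconds
n_scans = 150                 # assumed length of one block
t = np.arange(0, 32, TR)
hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 16)   # assumed canonical HRF shape
hrf /= hrf.sum()

def make_regressor(onsets_s):
    """Stick function at event onsets (in seconds), convolved with the HRF."""
    sticks = np.zeros(n_scans)
    sticks[(np.asarray(onsets_s) / TR).astype(int)] = 1.0
    return np.convolve(sticks, hrf)[:n_scans]

design = np.column_stack([
    make_regressor([30, 75, 160]),     # hypothetical GM-percept onsets
    make_regressor([50, 120, 210]),    # hypothetical EM-percept onsets
    make_regressor([10]),              # stimulus presentation onset
    np.ones(n_scans),                  # constant term
])
# fit the GLM to a placeholder voxel time course
betas, *_ = np.linalg.lstsq(design, np.random.randn(n_scans), rcond=None)
```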
Event-related potentials in the recognition of scene sequences

Stephan Lancier, Julian Hofmeister, Hanspeter Mallot
Cognitive Neuroscience Unit, Department of Biology, University of Tübingen, Germany

Many studies have investigated event-related potentials (ERPs) associated with the recognition of objects and words. Friedmann (1990) showed in an old/new task that correctly recognized new pictures of objects evoked a larger frontal-central N300 amplitude than familiar pictures of objects. This indicates that participants are able to discriminate between old and new pictures 300 ms after stimulus onset. Rugg et al. (1998) found different neural correlates for the recognition of implicitly and explicitly learned words. In the so-called mid-frontal old/new effect, recognized, implicitly learned words were characterized by a lower N400 amplitude in contrast to recognized new words. The explicitly learned words could be dissociated from implicitly learned words by their larger P600 amplitude, which was called the left parietal old/new effect. Rugg et al. concluded that recognition memory can be divided into two distinct processes, a familiarity process for implicit learning and a recollection process for explicit learning. These neural correlates were also shown for the recognition of pictures of objects (Duarte et al. 2004). In fast recognition tasks, pictures of scenes are identified as fast as pictures of isolated objects. Schyns and Oliva (1994) suggest that a coarse-to-fine process extracts a coarse description for scene recognition before finer information is processed. In this case the workload for recognizing a scene would not differ substantially from the workload required in object recognition. In the present study, we investigate the recognition of target scenes from scene sequences and compare the elicited neural correlates to those of former studies. We hypothesize that the recognition of scene identity and of scene position in the sequence evoke dissociable neural correlates.
At the current stage of this study, five students of the University of Tübingen participated. Each of them completed two sessions on different days. The experiment consisted of 100 trials. Each trial was divided into a learning phase and a test phase (see Fig. 1).

Fig. 1 Schematic illustration of the learning phase and the test phase. After the learning phase the lettering "test phase" was presented on the display for three seconds. ERPs were triggered with the onset of the test scene.

During the learning phase, eight hallways, each with two doors, were shown.
In each hallway the participants had to choose one door which they wanted to pass through. This decision had no impact on the further presentation but was included to focus attention on the subsequent scene. After this decision, two pictures of indoor scenes were presented, each for 600 ms. The first was the target scene, which the participants had to detect in the test phase. This picture was marked with a green frame. The second picture showed the distractor scene and was marked with a red frame. The test phase followed immediately after all eight hallways had been presented. During the test phase, all hallways were tested. The number of the current hallway was presented as a cue, followed by a test scene. In a yes/no task, participants were asked to press the corresponding mouse button if the presented scene was the target scene they had encountered in the corresponding hallway during the learning phase. Fifty percent of the presented test scenes were sequence-matching target scenes known from the learning phase (correct identity and position), and the other 50 percent were homogeneously distributed over distractor scenes of the same hallway (false identity, correct position), new scenes which were not presented in the learning phase (false identity and position), and target scenes which did not match the corresponding hallway (correct identity, false position). In addition to the psychophysical measurements, ERPs were measured by EEG and were triggered with the test scene presentation.
Behaviorally, the hit rate (correct recognition of scene identity and position) was about 80 %. Overall correct rejection (either identity or position incorrect) was about 85 %. Correct target scenes appearing at incorrect positions were rejected at a rate of about 60 %. Target scenes appearing at a false position were more likely to be rejected as the distance between their presentation in the learning and test sequences increased. The ERPs depended on the combination of decision and task condition. The hit condition differed from all other task/response combinations in a relatively weak N300. Especially at the frontal sites, the non-hit combinations lacked a P300 wave, except for the false alarms of non-sequence-matching target scenes, where the ERP approached the level of the P300 of the hit condition abruptly after the peak of the N300. Note that in both of these conditions, scene identity was correctly judged whereas position was ignored. These conditions also differ from the other task/response combinations in the N400, which is relatively weak. The parietal P600 wave of the hits differed only from the correct rejections of distractor scenes, the novel scenes and the missed target scenes. Between 650 and 800 ms, the parietal electrodes recorded a positive voltage shift for the correct rejections of the non-sequence-matching scenes and a negative voltage shift for the false alarms of the non-sequence-matching scenes. No such potentials were found for the other task/response combinations.
The mid-frontal old/new effect of Rugg et al. (1998) seems to be comparable to the N400 effect in our preliminary data. In addition, our results also showed a parietal old/new effect, but without a left lateralization. The results of our experiment cannot be assigned conclusively to one of the postulated memory processes. Furthermore, in the tasks involving non-sequence-matching scenes, the time course of the ERP was reversed after 650 ms. We assume that this effect is a neural correlate of sequence recognition processing.

References
Duarte A, Ranganath C, Winward L, Hayward D, Knight RT (2004) Dissociable neural correlates for familiarity and recollection during the encoding and retrieval of pictures. Cogn Brain Res 18:255–272
Friedmann D (1990) Cognitive event-related potential components during continuous recognition memory for pictures. Psychophysiology 27(2):136–148
Rugg MD, Mark RE, Walla P, Schloerscheidt AM, Birch CS, Allan K (1998) Dissociation of the neural correlates of implicit and explicit memory. Nature 392:595–598
Schyns PG, Oliva A (1994) From blobs to boundary edges for time- and spatial-scale-dependent scene recognition. Psychol Sci 5(4):195–200
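As a rough illustration of how such stimulus-locked ERPs are obtained (not the authors' recording or analysis pipeline), epochs can be cut around the test-scene onsets and averaged per condition; channel count, sampling rate and event samples below are invented placeholders.

```python
# Minimal sketch: cut stimulus-locked epochs from a continuous EEG recording and
# average them per condition to obtain ERPs, e.g. to compare "hit" vs. "correct
# rejection" waveforms in the N300/N400/P600 windows (placeholder data throughout).
import numpy as np

fs = 500                                   # sampling rate in Hz (assumed)
eeg = np.random.randn(64, 60 * fs)         # channels x samples, placeholder data
events = {"hit": [1000, 9000, 17000], "correct_rejection": [5000, 13000]}

def epochs(onsets, pre=0.2, post=0.8):
    """Stack epochs from -pre to +post seconds around each onset sample."""
    return np.stack([eeg[:, o - int(pre * fs): o + int(post * fs)] for o in onsets])

erps = {cond: epochs(onsets).mean(axis=0) for cond, onsets in events.items()}
print({cond: erp.shape for cond, erp in erps.items()})   # channels x time per condition
```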
Sensorimotor interactions as signaling games

Felix Leibfried 1,2,3, Jordi Grau-Moya 1,2,3, Daniel A. Braun 1,2
1 Max Planck Institute for Biological Cybernetics, Tübingen, Germany; 2 Max Planck Institute for Intelligent Systems, Tübingen, Germany; 3 Graduate Training Centre of Neuroscience, Tübingen, Germany

In our everyday lives, humans signal their intentions not only through verbal communication, but also through body movements (Sebanz et al. 2006; Obhi and Sebanz 2011; Pezzulo et al. 2013), for instance when doing sports to inform team mates about one's own intended actions or to feint members of an opposing team. We study such sensorimotor signaling in order to investigate how communication emerges and on what variables it depends. In our setup, there are two players with different aims who have partial control in a joint motor task and where one of the two players possesses private information the other player would like to know about. The question then is under what conditions this private information is shared through a signaling process. We manipulated the critical variables given by the cost of signaling and the uncertainty of the ignorant player. We found that the dependency of both players' strategies on these variables can be modeled successfully by a game-theoretic analysis. Signaling games are typically investigated within the context of non-cooperative game theory, where each player tries to maximize their own benefit given the other player's strategy (Cho and Kreps 1987). This allows defining equilibrium strategies where no player can improve their performance by changing their strategy unilaterally. These equilibria are called Bayesian Nash equilibria, a generalization of the Nash equilibrium concept in the presence of private information (Harsanyi 1968). In general, signaling games allow both for pooling equilibria, where no information is shared, and for separating equilibria with reliable signaling.
In our study we translated the job market signaling game into a sensorimotor task. In the job market signaling game (Spence 1973), there is an applicant (the sender) who has private information about his true working skill, called the type. The future employer (the receiver) cannot directly know about the working skill, but only through a signal (for example, educational certificates) that is the more costly to acquire, the less working skill the applicant has. The sender can choose a costly signal that may or may not transmit information about the type to the receiver. The receiver uses this
signal to make a decision by trying to match the payment (the action) to the presumed type (working skill) that she infers from the signal. The sender's decision about the signal trades off the expected benefits from the receiver's action against the signaling costs.
To translate this game into a sensorimotor task, we designed a dyadic reaching task that implemented a signaling game with continuous signal, type and action spaces. Two players sat next to each other in front of a bimanual manipulandum, such that they could not see each other's faces. In this task, each player controlled one dimension of a two-dimensional cursor position. No communication other than the joint cursor position was allowed. The sender's dimension encoded the signal that could be used to convey information about a target position (the type) that the receiver wanted to hit, but did not know about. The receiver's dimension encoded her action, which determined the sender's payoff. The sender's aim was to maximize a point score that was displayed as a two-dimensional color map. The point score increased with the reach distance of the receiver, so there was an incentive to make the receiver believe that the target is far away. However, the point score also decreased with the magnitude of the signal, so there was an incentive to signal as little as possible due to the implied signaling costs. The receiver's payoff was determined by the difference between her action and the true target position, which was revealed after each trial. Each player was instructed about the setup, their aim and the possibility of signaling. The question was whether players' behavior converged to Bayesian Nash equilibria under different conditions in which we manipulated the signaling cost and the variability of the target position. By fitting participants' signaling variance, we could quantitatively predict the influence of signaling costs and target variability on the amount of signaling. In line with our game-theoretic predictions, we found that increasing signaling costs and decreasing target variability led in most dyads to less signaling. We conclude that the theory of signaling games provides an appropriate framework to study sensorimotor interactions in the presence of private information.

References
Cho I, Kreps D (1987) Signaling games and stable equilibria. Quart J Econ 102(2):179–222
Harsanyi J (1968) Games with incomplete information played by Bayesian players, I–III. Part II. Bayesian equilibrium points. Manag Sci 14(5):320–334
Obhi SS, Sebanz N (2011) Moving together: toward understanding the mechanisms of joint action. Exp Brain Res 211(3–4):329–336
Pezzulo G, Donnarumma F, Dindo H (2013) Human sensorimotor communication: a theory of signaling in online social interactions. PLoS ONE 8(11):e79876
Sebanz N, Bekkering H, Knoblich G (2006) Joint action: bodies and minds moving together. Trends Cogn Sci 10(2):70–76
Spence M (1973) Job market signaling. Quart J Econ 87(3):355–374
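A toy version of the payoff structure described above can be written down as follows; the functional forms, the linear signaling rule and all parameter values are assumptions for illustration, not the actual task implementation or the fitted model.

```python
# Toy sketch of the sender/receiver incentives described above (assumed forms):
# the sender earns more the farther the receiver reaches, minus a cost that grows
# with the signal magnitude; the receiver loses the squared distance between her
# action and the true target (the "type").
import numpy as np

signal_cost = 0.5          # manipulated signaling cost (hypothetical value)
target_sd = 1.0            # manipulated target variability (hypothetical value)

def sender_payoff(signal, receiver_action):
    return receiver_action - signal_cost * np.abs(signal)

def receiver_payoff(receiver_action, target):
    return -(receiver_action - target) ** 2

# In a separating equilibrium the receiver can invert the sender's strategy,
# e.g. a linear rule signal = k * target implies the best-response action signal / k.
k = 0.8                    # hypothetical slope of the sender's signaling rule
rng = np.random.default_rng(0)
targets = rng.normal(0.0, target_sd, size=1000)
signals = k * targets
actions = signals / k
print(f"average sender payoff under full signaling: {sender_payoff(signals, actions).mean():.3f}")
```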
Subjective time perception of verbal action and the sense of agency

Hannah Limerick 1, David Coyle 1, James Moore 2
1 University of Bristol, UK; 2 Goldsmiths, University of London, UK

The sense of agency (SoA) is the experience of initiating actions to influence the external environment. Traditionally, SoA has been investigated using experimental paradigms where a limb movement is required to initiate an action. However, less is known about the SoA for verbal commands, which are a prevalent mode of controlling our external environment. Examples of this are interacting with other agents in our environment or controlling technology via voice interfaces. Here we investigate the SoA during verbal control of the external environment using the intentional binding paradigm.
Intentional binding is a phenomenon whereby the perceived action-outcome interval for voluntary actions is shorter than for equivalent passive movements (Haggard, Clark, Kalogeras 2002). In this experimental paradigm, participants report the perceived time of voluntary action initiation and of the consequent effects using the so-called Libet clock. Haggard, Clark and Kalogeras (2002) found that when participants caused an action, their perceived time of initiation and the perceived time of the outcome were brought closer together, i.e. the perceived interval between voluntary actions and outcomes was smaller than the actual interval. In the case of involuntary actions, the perceived interval was found to be longer than the actual interval. Importantly, intentional binding is thought to offer a reliable implicit measure of the SoA (Moore, Obhi 2012).
In this study we developed a novel adaptation of the intentional binding paradigm in which participants performed both verbal commands (saying the word "go") and limb movements (key-presses) that were followed by an outcome (an auditory tone) after a fixed 500 ms interval. Participants sat at a desk in front of a 24-inch monitor, which displayed the Libet clock. The experiment used a within-subjects design with one independent variable, input modality: speech input or keyboard input. A keyboard and a microphone were used to register their actions. The trials were separated into individual blocks: operant blocks required the participant to act (either via button press or verbal command) to cause a beep. During the operant trials, participants reported the time of the critical event (either the action or the outcome). Baseline trials had either an action from the participant (with no outcome) or the beep occurring in isolation. During baseline conditions, the participant was likewise required to report the time of the critical event (action or outcome).
We investigated:
1) the subjective time of action perception for verbal commands;
2) the SoA for verbal commands.
Firstly, we found that the average perceived time of action corresponded to the beginning of the utterance. This offers an intriguing insight concerning the cognitive processes underlying action perception for speech. One possible explanation for the action being perceived as occurring at the beginning of the utterance is that the perception of action arises once people receive sensory information about their verbal command. Theoretically, this possible explanation is in line with the cue integration theory of agency. Cue integration holds that both internal motor cues and external situational information contribute to the SoA (Wegner, Sparrow 2004; Moore et al. 2009; Moore, Fletcher 2012). It has been suggested that the influence of these cues upon our SoA depends on their reliability (Moore, Fletcher 2012). According to the cue integration concept, multiple agency cues are weighted by their relative reliability and then optimally integrated to reduce the variability of the estimated origins of an action. For speech, it may be the case that hearing one's own voice is a highly reliable agency cue and enough to label the action as initiated. Of course, further investigation is required; a larger sample size and other measurements of action perception (such as EEG) will be vital in determining the perception of action for verbal commands. These insights will be valuable, particularly for designers of speech recognition software and voice-based interfaces.
To address question 2) above, we tested whether binding was occurring within each input modality. We conducted a 2×2 repeated-measures analysis of variance comparing event type (action/outcome) and context (operant/baseline). The key-press condition resulted in a significant interaction between context and event. Follow-up t-tests comparing the action operant and action baseline conditions showed a significant difference, t(13) = -5.103, p < .001. This shows that operant actions
were perceived later than the baseline. A t-test comparing the perceived times of the operant tone condition and the baseline tone condition showed a significant difference, t(13) = 2.374, p < .05; therefore operant tones were perceived earlier than the baseline. The same analysis was repeated for the speech condition, which resulted in a trend towards significance for the interaction between context and event (F(1,13) = 3.112, p = .101). Because this was a preliminary investigation, we performed exploratory follow-up paired t-tests: comparing action operant and action baseline, we found a significant difference, t(12) = -2.257, p < .05, indicating that operant actions are perceived later than the baseline and thus that action binding is occurring. A t-test comparing outcome operant and outcome baseline gave a non-significant difference, t(13) = .532, p = .604. Therefore the outcome operant condition was not perceived as significantly earlier than the baseline, and the outcome binding component of intentional binding was not present.
Although intentional binding was present for limb movements (consistent with the existing literature), it was absent for verbal commands. There are several possible explanations for this. One possible explanation is that intentional binding is a phenomenon that does not occur for verbal commands. It is also possible that intentional binding is present at different scales across different sensorimotor modalities. Another explanation, in line with the cue integration approach to the SoA (described above), is that there are differences in the amount of sensory cues provided to the participant to confirm that the action has occurred. Key-presses involve proprioceptive, visual, haptic and auditory cues, which are all integrated to influence the SoA for an action. For verbal commands, there are fewer sensory cues: proprioceptive and auditory. Fewer agency cues involved in verbal commands may result in no intentional binding effect. Therefore, further investigation should determine whether different factors within the experimental setup have an impact on intentional binding for verbal commands. Alterations such as longer or shorter timescales, different forms of outcome (e.g. non-auditory) or additional agency cues may alter intentional binding.
There may also be experimental factors that lead to no intentional binding being present for the verbal condition. Typically, a speech recognizer would need to process the entire utterance and perform recognition before deeming it an action. However, as we discussed above, the user typically considered their utterance as an action roughly at the beginning of the utterance, thus giving a variable delay between action and outcome. Intentional binding studies have found that the binding phenomenon breaks down beyond 650 ms (Haggard, Clark, Kalogeras 2002). This may also explain the lack of tone binding found here. Interestingly, further exploratory analyses of the speech data suggest that the action component of intentional binding was present but the outcome component was absent (hence the apparent lack of overall binding). This suggests that an element of binding is occurring here.

References
Haggard P, Clark S, Kalogeras J (2002) Voluntary action and conscious awareness. Nature Neurosci 5(4):382–385. doi:10.1038/nn827
Moore J, Fletcher P (2012) Sense of agency in health and disease: a review of cue integration approaches. Conscious Cogn 21(1):59–68. doi:10.1016/j.concog.2011.08.010
Moore J, Obhi S (2012) Intentional binding and the sense of agency: a review. Conscious Cogn 21(1):546–561. doi:10.1016/j.concog.2011.12.002
Moore JW, Wegner DM, Haggard P (2009) Modulating the sense of agency with external cues. Conscious Cogn 18(4):1056–1064. doi:10.1016/j.concog.2009.05.004
Wegner DM, Sparrow B (2004) Authorship processing. In: Gazzaniga M (ed) The cognitive neurosciences III. MIT Press, Cambridge, MA, pp 1201–1209
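For readers unfamiliar with the measure, the binding components discussed above are typically computed from Libet-clock judgment errors roughly as in this sketch; the numbers are invented placeholders, not the reported data.

```python
# Illustrative sketch (not the authors' analysis): intentional binding quantified as
# the shift of perceived action and outcome times in operant relative to baseline trials.

# hypothetical mean judgment errors in ms (perceived minus actual event time)
action_operant, action_baseline = 15.0, -10.0   # actions reported later when they cause a tone
tone_operant, tone_baseline = -40.0, 20.0       # tones reported earlier when caused by an action

action_binding = action_operant - action_baseline    # positive = action shifted toward outcome
outcome_binding = tone_operant - tone_baseline       # negative = outcome shifted toward action
overall_binding = action_binding - outcome_binding   # total compression of the 500 ms interval
print(action_binding, outcome_binding, overall_binding)
```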
Memory disclosed by motion: predicting visual working memory performance from movement patterns

Johannes Lohmann, Martin V. Butz
Cognitive Modeling, Department of Computer Science, University of Tübingen, Germany

Abstract
Embodied cognition proposes a close link between cognitive and motor processes. Empirical support for this notion comes from research applying hand-tracking in decision-making tasks. Here we investigate whether similar systematics can be revealed in the case of a visual working memory (VWM) task. We trained recurrent neural networks (RNNs) to predict memory performance from the velocity patterns of mouse trajectories. Compared to previous studies, the responses were not speeded. The results presented here reflect work in progress and more detailed analyses are pending; especially the low generalization performance on unknown data requires a more thorough investigation. So far, the results indicate that even small RNNs can predict participants' working memory state from raw mouse-tracking data.

Keywords
Mouse Tracking, Recurrent Neural Networks, Visual Working Memory

Introduction
With the embodied turn in cognitive science and due to the reconsideration of cognition in terms of a dynamic system (Spivey and Dale 2006), the dynamic coupling between real-time cognition and motor responses has become a prominent topic in cognitive psychology. Freeman et al. (2011) provided a first review of this body of research, concluding that movement trajectories convey rich and detailed information about ongoing cognitive processes. Most studies investigating this coupling applied speeded responses, where participants were instructed to respond as accurately and as fast as possible. Here we investigate whether movement characteristics are also predictive of higher cognitive functions in the case of non-speeded responses. More precisely, we analyze mouse trajectories obtained in a visual working memory (VWM) experiment and try to predict recall performance (how well an item was remembered) from the movement characteristics.

Experimental Setup
Mouse trajectories were obtained during a VWM experiment applying a delayed cued-recall paradigm with continuous stimulus spaces (see Zhang and Luck 2009 for a detailed description of the paradigm). In each trial, participants had to remember three or six stimuli. After a variable interstimulus interval (ISI), they had to report the identity of one of them. The stimuli consisted of either colored squares or Fourier descriptors. Memory performance in terms of precision was quantified as the angular distance between the reported and the target stimulus. At the end of the ISI, one of the previous stimulus locations was highlighted and the mouse cursor appeared at the center of the screen. Around the center either a color or a shape wheel, depending on the initial stimuli, was presented, and participants had to click at the location that matched the stimulus at the cued location. The responses were not speeded and participants were instructed to take as much time as they wanted for the decision. The trajectory of the mouse cursor was continuously tracked at a rate of 50 Hz. We obtained 4,000 trajectories per participant.

Network Training
We used the trajectory data to train Long Short-Term Memory (LSTM; Gers et al. 2003) networks to predict the memory performance based on the velocity pattern of the first twenty samples of a mouse trajectory. We chose LSTMs instead of other possible classifiers since LSTMs are well suited to precisely identify predictive temporal dependencies in time series, which is difficult for other algorithms, such as Hidden Markov Models. We used the raw velocity
vectors of the trajectories as inputs, without applying any normalization.
We did not require the network to learn a direct mapping between movement trajectories and reported angular distances (referred to as D in the plots). Rather, we labeled each trajectory, based on the data obtained in the respective condition, as either low distance or high distance and trained the network as a binary classifier. Trajectories that led to an angular distance below the 33 % quantile (Q(33) in Fig. 1) were labeled as low distance; trajectories that led to angular distances above the 66 % quantile (Q(66) in Fig. 1) were labeled as high distance. The intermediate range between the 33 % and 66 % quantiles was not used for training. Labels were assigned based on the response distribution of the respective experimental condition. Hence, the same angular distance did not always lead to the same label assignment. Thus, a suitable network had to learn to perform a relative quality judgment instead of a mere median split. Half of the 4,000 trajectories of a participant were used for the training of a single network. From these 2,000 trajectories, the 33 % labeled as low distance and the 33 % labeled as high distance were used in the training. We compared the performance of networks consisting of either 5, 10, or 20 LSTM blocks. For each network size, ten networks were trained.

Results
The presented results were obtained with the 4,000 trajectories of one participant. Depending on the network size, classification performance increased from 60 % after the first learning epochs up to 80 % at the end of the training.
Fig. 1 provides an overview of the results. One-sample t tests revealed that both the proportion of correct classifications as well as of misclassifications differed significantly from chance level for the two trained categories. Paired t tests indicated that the proportions of correct classifications and misclassifications differed significantly within the trained categories. Despite the apparent ability of the networks to acquire criteria to distinguish trajectories associated with either low or high angular distance, cross-validation results were rather poor, yet still significantly above chance level.

Discussion
In this study we investigated whether motion patterns are still predictive of cognitive performance in the case of non-speeded responses. We trained comparatively small recurrent neural networks to predict the precision of memory recall from mouse movement trajectories. Even if the generalization performance obtained so far is rather low, our preliminary results show that characteristics of non-speeded movements can be predictive of the performance of higher cognitive functions like VWM state and retrieval.

References
Freeman JB, Dale R, Farmer TA (2011) Hand in motion reveals mind in motion. Front Psychol 2:59
Gers FA, Schraudolph NN, Schmidhuber J (2003) Learning precise timing with LSTM recurrent networks. JMLR 3:115–143
Spivey MJ, Dale R (2006) Continuous dynamics in real-time cognition. Curr Dir Psychol 15(5):207–211
Zhang W, Luck SJ (2009) Sudden death and gradual decay in visual working memory. Psychol Sci 20(4):423–428
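A minimal sketch of such a classifier, assuming two-dimensional velocity input over the first 20 samples and using a generic PyTorch LSTM in place of the Gers et al. (2003) architecture, could look as follows; it is illustrative only, with placeholder data, not the authors' implementation.

```python
# Rough sketch of the classifier described above (assumed architecture details):
# an LSTM reads the first 20 velocity samples of a mouse trajectory and outputs
# a binary low- vs. high-distance label.
import torch
import torch.nn as nn

class TrajectoryClassifier(nn.Module):
    def __init__(self, hidden_size=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, 1)

    def forward(self, velocities):                 # (batch, 20, 2) raw x/y velocities
        _, (h_n, _) = self.lstm(velocities)
        return self.readout(h_n[-1]).squeeze(-1)   # logit for "high distance"

model = TrajectoryClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

velocities = torch.randn(32, 20, 2)               # placeholder batch of trajectories
labels = torch.randint(0, 2, (32,)).float()       # 0 = low distance, 1 = high distance
loss = loss_fn(model(velocities), labels)
loss.backward()
optimizer.step()
```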
response distribution of the respective experimental condition. Hence,
the same angular distance not always led to the same label assign-
ment. Thus, a suitable network had to learn to perform a relative
quality judgment instead of a mere median split. Half of the 4,000
Role and processing of translation in biological motion
trajectories of a participant were used for the training of a single perception
network. From these 2,000 trajectories, the 33 % labeled as low
distance and the 33 % labeled as high distance were used in the Jana Masselink, Markus Lappe
training. We compared the performance of networks consisting of University of Munster, Germany
either 5, 10, or 20 LSTM blocks. For each network size ten networks
Keywords
were trained.
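For concreteness, the labelling and training setup described above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation (which used the LSTM of Gers et al. 2003 rather than PyTorch); the array names and shapes are assumptions.

```python
# Sketch: label trials by the 33 %/66 % quantiles of the reported angular
# distance and train a small LSTM on the raw velocity samples.
# Assumed inputs: `velocities` with shape (n_trials, 20, 2) holding the first
# twenty x/y velocity samples per trial, `angular_distance` with one value
# (D) per trial.
import numpy as np
import torch
import torch.nn as nn

def quantile_labels(angular_distance, lo=1/3, hi=2/3):
    """0 = low distance (below Q(33)), 1 = high distance (above Q(66));
    the intermediate range gets -1 and is excluded from training."""
    q_lo, q_hi = np.quantile(angular_distance, [lo, hi])
    labels = np.full(len(angular_distance), -1)
    labels[angular_distance < q_lo] = 0
    labels[angular_distance > q_hi] = 1
    return labels

class TrajectoryLSTM(nn.Module):
    """Small LSTM (e.g. 5, 10 or 20 hidden units) feeding a binary read-out."""
    def __init__(self, n_hidden=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=n_hidden, batch_first=True)
        self.out = nn.Linear(n_hidden, 1)

    def forward(self, x):                   # x: (batch, 20, 2) raw velocities
        _, (h, _) = self.lstm(x)            # final hidden state
        return self.out(h[-1]).squeeze(-1)  # logit for "high distance"

def train_classifier(velocities, angular_distance, epochs=50, lr=1e-3):
    labels = quantile_labels(angular_distance)
    keep = labels >= 0                      # drop the middle third
    x = torch.tensor(velocities[keep], dtype=torch.float32)
    y = torch.tensor(labels[keep], dtype=torch.float32)
    model = TrajectoryLSTM()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    return model
```

Because labels are assigned within each experimental condition, such a classifier has to learn a relative quality judgment rather than a fixed angular cut-off, as noted above.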
Human walking, Biological motion processing, Translation
Results
The presented results were obtained with the 4,000 trajectories of one Visual perception of human movement is often investigated using
participant. Depending on the network size, classification perfor- point-light figures walking on a treadmill. However, real human
mance increased from 60 % after the first learning epochs up to 80 % walking does not only consist of a change in the position of the limbs
at the end of the training. relative to each other (referred to as articulation), but also of a change
Fig. 1 provides an overview of the results. One-sample t tests in body localization in space over time (referred to as translation). In
revealed that both the proportion of correct- as well as of misclassi- point-light displays this means that the motion vector of each dot is
fications differed significantly from chance level for the two trained composed of both an articulatory and a translatory component. We
categories. Paired t tests indicated that the proportion of correct and have examined the influence and processing mechanisms of this
misclassifications differed significantly within the trained categories. translation component in perception of point-light walkers. In three
Despite the apparent ability of the networks to acquire criteria to experiments each with a two-alternative forced-choice task observers
distinguish trajectories associated with either low or high angular judged the apparent facing orientation and articulation respectively
distance, cross-validation results were rather poor, yet still signifi- in terms of walking direction or forward/backward discriminationof
cantly above chance level. a point-light walker viewed from the side. Translation could be either
Discussion consistent or inconsistent with facing/articulation or not existent at all
In this study we investigated if motion patterns are still predictive for (treadmill walking). Additionally, stimuli differed in point lifetime to
cognitive performance in case of non- speeded responses. We trained manipulate the presence of local image motion. Stimuli were pre-
comparatively small recurrent neural networks to predict the precision sented for 200 ms to prevent eye movements to the translating
of memory recall from mouse movement trajectories. Even if the stimulus. Although participants were explicitly instructed to judge
generalization performance obtained so far is rather low, our pre- facing orientation and articulation regardless of translation, results
liminary results show that characteristics of non-speeded movements revealed an effect of translation in terms of response bias in trans-
lation direction in all three tasks. As translation even had an effect on
walkers with absent local motion signal in facing orientation and
walking direction task, we conclude that the global motion of the
center-of-mass of the dot pattern is relevant to processing of trans-
lation. Overall, translation direction seems to influence both
perception of form and motion of a walker. This supports the idea that
translation interacts with both the posture-based analysis of form and
the posture-time-based analysis of articulation in the perception of
human body motion.
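To make the stimulus logic explicit, the following is a minimal sketch (not the authors' stimulus code) of how each dot's displayed position can be composed of an articulatory and a translatory component; the articulation function is a simple placeholder, not a real gait model.

```python
# Each dot's motion vector = articulation (limb motion relative to the body)
# + translation (whole-body displacement). With treadmill=True the
# translation component is removed, as in classical treadmill walkers.
import numpy as np

def articulation(t, n_dots=13, freq=1.0):
    """Placeholder articulatory offsets of the dots relative to the body."""
    phase = np.linspace(0, np.pi, n_dots)
    dx = 0.05 * np.sin(2 * np.pi * freq * t + phase)
    dy = 0.02 * np.cos(2 * np.pi * freq * t + phase)
    return np.stack([dx, dy], axis=1)            # (n_dots, 2)

def walker_frames(duration=0.2, fps=60, speed=1.4, treadmill=False):
    """Dot positions for a short presentation (e.g. 200 ms)."""
    frames = []
    for t in np.arange(0.0, duration, 1.0 / fps):
        shift = np.array([0.0 if treadmill else speed * t, 0.0])
        frames.append(articulation(t) + shift)   # per-dot position
    return np.array(frames)                      # (n_frames, n_dots, 2)

# A translation that is inconsistent with facing/articulation can be produced
# by flipping the sign of `speed` relative to the facing direction.
```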

Fig. 1 Aggregated evaluation of the 30 networks after 50 learning epochs. Black bars indicate correct classification performance for the two trained categories. Error bars indicate the standard error of the mean. Significant differences between classifications within one category are marked with an asterisk

How to remember Tübingen? Reference frames in route and survey knowledge of one's city of residency

Tobias Meilinger1, Julia Frankenstein2, Betty J. Mohler1, Heinrich H. Bülthoff1
1 Max Planck Institute for Biological Cybernetics, Tübingen, Germany; 2 Cognitive Science, ETH Zurich, Switzerland

Knowledge underlying everyday navigation is distinguished into route and survey knowledge (Golledge 1999). Route knowledge allows re-combining and navigating familiar routes. Survey

knowledge is used for pointing to distant locations or finding novel directed gaze should be more likely to perceive a coherent relation in
shortcuts. We show that within ones city of residency route and the objects that they see being looked at. To test this, the present study
survey knowledge root in separate memories of the same environment used the Remote Associates Test (Mednick 1962) in which subjects
and are represented within different reference frames. decide whether word triads are coherent by means of allowing
Twenty-six Tubingen residents who lived there for seven years in meaningful combinations with a fourth word. Before each decision, a
average faced a photo- realistic virtual model of Tubingen and dot moved across the words and subjects were either told that it
completed a survey task in which they pointed to familiar target represented the eye movements of a human trying to find word
locations from various locations and orientations. Each participants associations, or a computer-generated control. It was hypothesized
performance was most accurate when facing north, and errors that interpreting the dot as someones gaze would increase the fre-
increased as participants deviation from a north-facing orientation quency and reduce the time for intuitive judgments, namely those
increased. This suggests that participants survey knowledge was for which subjects assume a coherent relation but cannot name a
organized within a single, north-oriented reference frame. solution.
One week later, 23 of the same participants conducted route Methods
knowledge tasks comprising of the very same start and goal Sixteen subjects participated in the experiment and their eye move-
locations used in the survey task before. Now participants did not ments were tracked with an SR EyeLink 1000. Within each trial there
point to a goal location, but used arrow keys of a keyboard to enter was a preview video with cursor overlay and a word triad. Videos
route decisions along an imagined route leading to the goal. showed a 5 9 4 grid of rectangles containing 20 words, three of
Deviations from the correct number of left, straight, etc. decisions which had to be rated for coherence later. A purple dot cursor (15 px)
and response latencies were completely uncorrelated to errors and moved across the grid, either resting on the three words that were
latencies in pointing. This suggests that participants employed chosen later, or on three other words. Contrary to what subjects were
different and independent representations for the matched route told, the cursor always was a real eye movement recording. Each
and survey tasks. subject saw 100 triads, one after each video. All triads were composed
Furthermore, participants made fewer route errors when asked to of words from the respective video, but only in half of the trials these
respond from an imagined horizontal walking perspective rather than words had been cued by the cursor.
from an imagined constant aerial perspective which replaced left, Subjects were instructed that the cursor depicted eye movements
straight, right decisions by up, left, right, down as in a map with the (gaze) or a computer-generated control (dot). No strategy of using the
order tasks balanced. This performance advantage suggests that par- cursor was instructed. Each trial started with a video which was
ticipants did not rely on the single, north-up reference used for followed by a triad that remained on the screen until subjects pressed
pointing. Route and survey knowledge were organized along different a key to indicate whether it was coherent or not. If they negated, the
reference frames. response was counted as incoherent. After a positive response, they
We conclude that our participants route knowledge employed were asked to submit the solution word. If they gave no solution or a
multiple local reference frames acquired from navigation whereas wrong solution, this was counted as a yes + unsolved response,
their survey knowledge relied on a single north-oriented reference whereas trials with correct solution words were classified as yes + -
frame learned from maps. Within their everyday environment, people solved. Subjects worked through two blocks of 50 trials, with each
seem to use map or navigation-based knowledge according to which block corresponding to one of the cursor conditions.
best suits the task. Results
The frequency distributions of the three response types (yes + -
Reference solved, yes + unsolved, incoherent) were compared between both
Golledge RG (1999) Wayfinding behavior. The John Hopkins Uni- cursors and there was no difference, v2 (2) = 1.546, p = .462.
versity Press, Baltimore Specifically, the amount of yes + unsolved (intuitive) responses
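A hypothetical sketch of how the survey-task analysis outlined above might be set up; the circular error measure and the correlation of error with deviation from north are illustrative assumptions, not the authors' actual analysis pipeline.

```python
# Pointing error vs. deviation of the imagined facing direction from north.
import numpy as np
from scipy import stats

def circular_error(pointed_deg, true_deg):
    """Absolute angular difference in degrees, wrapped to the range 0-180."""
    d = np.abs(np.asarray(pointed_deg) - np.asarray(true_deg)) % 360
    return np.minimum(d, 360 - d)

def deviation_from_north(facing_deg):
    """Absolute deviation of the facing orientation from north (0 deg)."""
    d = np.abs(np.asarray(facing_deg)) % 360
    return np.minimum(d, 360 - d)

def survey_analysis(pointed_deg, true_deg, facing_deg):
    err = circular_error(pointed_deg, true_deg)
    dev = deviation_from_north(facing_deg)
    # A positive correlation is what a single, north-oriented reference
    # frame predicts: the further from north, the larger the error.
    return stats.pearsonr(dev, err)
```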
was similar for gaze and dot (24.6 and 22.5 %), and this also did
not depend on whether the triad had been cued by the cursor
during the video or not, both Fs \ 1, both ps [ .3. Mean response
The effects of observing other peoples gaze: faster times did not differ between gaze and dot overall (9.5 vs. 9.4 s),
intuitive judgments of semantic coherence F \ 1, p [ .8, but cursor interacted with response type,
F(2,28) = 6.052, p = .007, indicating that only the yes + unsolved
responses were faster for gaze than for dot (8.9 vs. 11.3 s),
Romy Muller p = .02. In contrast, there was no difference for yes + solved and
Technische Universitat Dresden, Germany incoherent responses, both ps [ .6. There was no main effect or
Introduction interaction with cueing, both Fs \ 1, both ps [ .6, suggesting that
Our actions are modulated by an observation of others behavior, the speed advantage for yes + unsolved responses in gaze was
especially when we represent others as intentional agents. However, unspecific, i.e. it also occurred for triads that had not been cued by
inferring intentions can even be accomplished on the basis of seeing the gaze cursor (Fig. 1).
someones gaze. Do eye movements also exert a stronger influence on To investigate the impact of the two cursors on subjects visual
an observer when he is ascribing them to a human instead of a attention, subjects eye movements were analyzed in terms of the time
machine? Indeed, reflexive shifts of attention in response to gaze spent on the three cued areas within a grid. This time did not differ
shifts are modulated by subjects beliefs (Wiese et al. 2012): A human between gaze and dot (39.0 vs. 41.9 %), t(15) = 1.36, p = .193.
face elicited stronger gaze cueing effects than a robot face, but this Thus, although there was quite some interindividual variation in
difference disappeared when the instruction stated that both stimuli subjects strategies of using the cursor, most subjects looked at the
were of the same origin (i.e. either produced by a human or a gaze and dot cursor in a similar manner.
machine). This suggests that beliefs about someone elses visual Discussion
attention can exert a direct influence on our own processing of the The present results indicate that observing another persons eye
same stimuli. movements can affect the coherence we assume in the things being
A possible way in which the interpretation of gaze as human can looked at. When subjects believed that they saw a depiction of gaze
affect our processing is that we try to infer the meaning of the things on word triads, their intuitive classifications as coherent were no more
another person is attending to. In this case, observers of object- frequent (perhaps due to a lack of sensitivity) but faster than when

philosophical accounts of mental agency. (III) Finally, we will highlight aspects of mental agency that need to be explained by predictive processing accounts and, more specifically, suggest possible conceptual connections between mental actions and active inference.
(I) What is active inference? Two aspects of agency explanations
have been emphasized in the predictive processing literature:
1. The initiation of action: According to the framework provided
by Karl Fristons free-energy principle, agency emerges from active
inference. In active inference, changes in the external world that can
be brought about by action are predicted. In order to cause these
changes (instead of adjusting the internal model to the sensory input),
the organism has to move. In beings like us, this involves changing
Fig. 1 Percentage of responses (A) and response times (B) depending the states of our muscles. Therefore, changes in proprioceptive sen-
on cursor and response type. The percentage of time spent on the cued sors are predicted. These evoke proprioceptive prediction errors
areas for every single subject (C) was similar for both cursors (PPE). If these errors are just used to adjust proprioceptive predic-
tions, no action occurs. Therefore, PPEs that are sent up the
they interpreted the exact same cursor as non-human. Thus, it appears processing hierarchy have to be attenuated by top-down modulation,
that seeing someone else looking at objects makes people assume that in other words: their expected precision must be lowered (Brown et al.
there must be something in it, especially when they cannot name it. 2013). Overly precise PPEs just lead to a change of the hypothesis,
Interestingly, the effect was not specific to cued triads, suggesting that while imprecise PPEs lead to action. The initiation of action therefore
with gaze transfer the overall readiness for assuming coherence was crucially depends on precision optimization at the lower end of the
higher. In the light of this result, it is possible that gaze increased processing hierarchy (the expected precision of bottom-up sensory
subjects openness for uncertain judgments more than it affected their signals has to be low, relative to the precision of top-down proprio-
actual processing of the objects. This question will have to remain for ceptive predictions).
future research. 2. The choice and conductance of action: Agency (a goal-directed
In contrast to what could be predicted on the basis of previous kind of behavior) has been explained as active inference (e.g., Friston
work (Wiese et al. 2012), subjects visual attention allocation did not et al. 2013; Moutoussis et al. 2014). Agents possess a representation
differ between gaze and dot. First, this rules out the possibility that of a policy which is a sequence of control states (where control states
differences between both cursors only occurred because subjects had are beliefs about future action, cf. Friston et al. 2013, p 3): [A]ction
ignored the presumably irrelevant dot. Moreover, it raises the ques- is selected from posterior beliefs about control states. [] these
tion to what degree and on what level of processing more abstract posterior beliefs depend crucially upon prior beliefs about states that
depictions of intentional behavior (such as cursors) can exert an will be occupied in the future. (Friston et al. 2013, p 4). In this
influence. This has implications for basic research on social attention process, precision is argued to play a dual biasing role: biasing per-
and joint action as well as for applied topics such as the visualization ception toward goal states and enhancing confidence in action choices
of eye movements or computer-mediated cooperation with real and (cf. 2013, p 11). The latter fact may influence the phenomenology of
virtual agents. the agent (cf. Mathys et al. 2011, p 17).
From the point of view of predictive processing, two aspects are
References central to the explanation of agency: precision and the fact that
Mednick SA (1962) The associative basis of creativity. Psychol Rev possible, attainable counterfactual states are represented. Determining
69(3):220232 which counterfactual states minimize conditional uncertainty about
Wiese E, Wykowska A, Zwickel J, Muller HJ (2012) I see what you hidden states corresponds to action selection (cf. Friston et al. 2012,
mean: how attentional selection is shaped by ascribing intentions p 4). Optimizing precision expectations enables action, which is
to others. PLoS ONE 7(9):e45391 ultimately realized by attenuating proprioceptive prediction error
through classical reflex arcs (Brown et al. 2013, p 415).
Anil Seth (2014) also emphasizes the importance of counterfac-
tually-rich generative models: models that [] encode not only the
Towards a predictive processing account of mental likely causes of current sensory inputs, but also the likely causes of
agency those sensory inputs predicted to occur given a large repertoire of
possible (but not necessarily executed) actions []. (p 2). Seth
(2014) argues that counterfactually-rich generative models lead to the
Iuliia Pliushch, Wanja Wiese experience of perceptual presence (subjective veridicality). This
Johannes Gutenberg University Mainz, Mainz, Germany
suggests that counterfactual richness could possibly also play a role in
The aim of this paper is to sketch conceptual foundations for a predictive explaining the phenomenal sense of mental agency.
processing account of mental agency. Predictive processing accounts (II) What is a mental action? Here, we briefly review accounts
define agency as active inference (as opposed to perceptual inference, proposed by Joelle Proust (2013) and Wayne Wu (2013), respectively.
cf. Hohwy 2013, Friston 2009, Friston et al. 2012, Friston et al. 2013). According to Proust, mental actions depend on two factors: an
Roughly speaking, perceptual inference is about modeling the causal informational need and a specific epistemic norm (cf. 2013, p 161).
structure of the world internally; active inference is about making the As an example of an informational need, Proust gives remembering
world more similar to the internal model. Existing accounts, however, the name of a play. Crucially, the agent should not be satisfied with
so far mainly deal with bodily movements, but not with mental actions any possible name that may pop into her mind. Rather, the agent must
(cf. Proust 2013, Wu 2013; the only conceptual connection between be motivated by epistemic norms like accuracy or coherence. Agents
active inference and mental action we know of is made in Hohwy 2013, who are motivated by epistemic norms have epistemic feelings
pp 197199). Mental actions are important because they do not just reflecting the extent to which fulfilling the informational need is a
determine what we do, they determine who we are. feasible task: These feelings predict the probability for a presently
The paper is structured as follows. (I) First, we will briefly explain activated disposition to fulfill the constraints associated with a given
the notion of active inference. (II) After that, we will review purely norm []. (2013, p 162). Wu (2013) defines mental action as

selecting a path in behavioral space with multiple inputs (memory (cf. Feldman & Friston 2010; Hohwy 2012). Precision estimates,
contents) and outputs (possible kinds of behavior). The space of and therefore attention, play a crucial role both in active inference
possible paths is constrained by intentions (cf. 2013, p 257). This is and in mental agency. However, some attentional processes, like
why it constitutes a kind of agency, according to Wu. volitional attention, have also been described as a kind of mental
(III) In what follows, we provide a list of possible conceptual action (cf. Metzinger 2013, p 2; Hohwy 2013, pp 197199). It is
connections between active inference and mental agency, as well as thus an open challenge to show how attentional processes that are a
targets for future research. constitutive aspects of mental action differ from those that are a
kind of mental action themselves.
1. Mental and bodily actions have similar causal enabling conditions.
The initiation of mental as well as bodily action depends on the
right kind of precision expectations. This renders mental and bodily Acknowledgment
actions structurally similar. Mental actions can be initiated at every The authors are funded by the Barbara Wengeler foundation.
level of the processing hierarchy. At each level, the magnitude of
expected precisions may vary. As in bodily active inference, the
precisions of prior beliefs about desired events must be high References
enough; otherwise, the informational need will simply be ignored. Brown H, Adams RA, Parees I, Edwards M, Friston K (2013) Active
Furthermore, allocating attention may be a factor contributing to inference, sensory attenuation and illusions. Cogn Process
the success of mental actions (e.g., attending away from ones 14(4):411427
surroundings if one wants to remember something). Feldman H, Friston KJ (2010) Attention, uncertainty, and free-energy.
2. The contents of a mental action cannot typically be determined at Front Human Neurosci 4:215
will prior to performing the action (cf. Proust 2013, p 151). Friston K (2009) The free-energy principle: a rough guide to the
Example: I cannot try to remember the name John Wayne. (But I brain? Trends Cogn Sci 13(7):293301
can try to remember the name of the famous American western Friston K, Adams RA, Perrinet L, Breakspear M (2012) Perceptions
movie actor.) Similarly, in active inference, actions themselves as hypotheses: saccades as experiments. Front Psychol 3:151
need not be represented, only hidden states that are affected by Friston K, Schwartenbeck P, FitzGerald T, Moutoussis M, Behrens T,
action (cf. Friston et al. 2012, p 4). The system possesses Dolan RJ (2013) The anatomy of choice: active inference and
counterfactual representations whose content is [] what we agency. Front Human Neurosci 7
would infer about the world, if we sample it in particular way. Frith C (2012) Explaining delusions of control: the comparator model
(Friston et al. 2012, p 2) In the case of perception, it could be the 20 years on. Conscious Cogn 21(1):5254
[] visual consequences of looking at a bird. (p 4) In the case Hohwy J (2012) Attention and conscious perception in the hypothesis
of remembering, it could be the consequences that the remem- testing brain. Front Psychol 3:96
bered content would produce in the generative model. A central Hohwy J (2013) The predictive mind. Oxford University Press,
question that remains to be answered here is to what extent this Oxford
would call for an extension of the predictive processing Mathys C, Daunizeau J, Friston KJ, Stephan KE (2011) A Bayesian
framework, in the sense that counterfactuals about internal foundation for individual learning under uncertainty. Front
consequences would also have to be modeled. Interestingly, Human Neurosci 5
conducting mental actions may often be facilitated by refraining Metzinger T (2013) The myth of cognitive agency: subpersonal
from certain bodily actions. Imagining, for instance, may be thinking as a cyclically recurring loss of mental autonomy. Front
easier with ones eyes closed. In terms of predictive processing, Psychol 4:931
this means that visual input is predicted to be absent, and bodily Moutoussis M, Fearon P, El-Deredy W, Dolan RJ, Friston KJ (2014)
action ensues in order to make the world conform to this Bayesian inferences about the self (and others): a review.
prediction (i.e., one closes ones eyes). Conscious Cogn 25:6776
3. Both bodily and mental actions can be accompanied by a Proust J (2013) Philosophy of metacognition: mental agency and self-
phenomenal sense of agency. For the sense of bodily agency, awareness. Oxford University Press, Oxford
comparator models have been proposed (cf. Frith 2012; Seth Seth AK (2013) Interoceptive inference, emotion, and the embodied
2013). For the sense of mental agency (cf. Metzinger 2013), at self. Trends Cogn Sci 17(11):565573
least the following questions need to be answered: (1) Is it Seth AK (2014) A predictive processing theory of sensorimotor
possible to explain the sense of mental agency with reference to a contingencies: explaining the puzzle of perceptual presence and
comparison process? (2) If yes, what kinds of content are its absence in synesthesia. Cogn Neurosci 122
compared in this process? A possible mechanism could compare Wu W (2013) Mental action and the threat of automaticity. In Clark
the predicted internal consequences with the actual changes in the A, Kiverstein J, Vierkant T (eds) Decomposing the will. Oxford
generative model after the mental action has been performed. University Press, Oxford, pp 244261
4. Proust (2013) argues that mental agency is preceded and followed
by epistemic feelings. The latter reflect the uncertainty that the
right criteria for the conductance of a mental action have been
chosen and that it has been performed in accordance with the The N400 ERP component reflects implicit prediction
chosen criteria. We speculate that the phenomenal certainty that a error in the semantic system: further support
mental action will be successful depends both on the prior from a connectionist model of word meaning
probability of future states, and on the conditional probabilities of
those states given (internal) control states (thereby, it indirectly
Milena Rabovsky1, Daniel Schad2, Ken McRae3
depends on counterfactual richness: the more possibilities to 1
Department of Psychology, Humboldt University at Berlin,
realize a future state, the higher the probability that the state will
Germany; 2 Charite, Universitatsmedizin Berlin, Germany;
be obtained). 3
University of Western Ontario, London, Ontario, Canada
5. A possible problem for predictive processing accounts of mental
agency arises from the role of attention. Predictive processing Even though the N400 component of the event-related brain
accounts define attention as the optimization of precision estimates potential (ERP) is widely used to investigate language and semantic

processing, the specific mechanisms underlying this component are Green et al. 2007; Moeller et al. 2011a; Moeller et al. 2011b).
still under active debate (Kutas, Federmeier 2011). To address this However, so far, there are only few studies using this methodology to
issue, Rabovsky and McRae (2014) recently used a feature-based better understand the processes involved in mental arithmetic with a
connectionist attractor model of word meaning to simulate seven specific focus on addition (Green et al. 2007; Moeller et al. 2011a;
N400 effects. We observed a close correspondence between N400 Moeller et al. 2011b). In this context, Moeller and colleagues (2011b)
amplitudes and semantic network error, that is, the difference suggested that successful application of the carry-over procedure in
between the activation pattern produced by the model over time and addition (e.g., 23 + 41 = 64 vs. 28 + 36 = 64) involves at least three
the activation pattern that would have been correct. Here, we present underlying processes. First, the sum of the unit digits is computed
additional simulations further corroborating this relationship, using already in first pass encoding (i.e., 3 + 1 = 4 vs. 8 + 6 = 14 in above
the same network as in our previous work, with 30 input units examples). Second, based on this unit sum the need for a carry-over
representing word form that directly map onto 2,526 semantic fea- procedure is evaluated (with the need for a carry-over indicated by a
ture units representing word meaning, according to empirically unit sum of C 10). Third, the carry-over procedure has to be executed
derived semantic feature production norms (McRae et al. 2005). The by finally adding the decade digit of the unit sum to the sum of the
present simulations focus on influences of orthographic neighbors, tens digits of the summands to derive the correct result (i.e.,
which are words that can be derived from a target by exchanging a 2 + 4 + 0 = 6 vs. 2 + 3 + 1 = 6). Interestingly, the authors found
single letter, preserving letter positions. Specifically, empirical ERP that the first two processes were specifically associated with the
research has shown that words with many orthographic neighbors processing of the unit digits of the summands reflecting increased
elicit larger N400 amplitudes. We found that a model analogue of processing demands when the sum of the unit digits becomes B 10
this measure (i.e., the number of word form representations differing and it is recognized that a carry is needed. In particular, it was found
in a single input unit from the target) increases network error. that already during the initial encoding of the problem first fixation
Furthermore, the frequency of a words orthographic neighbors has durations (FFD) on the second summand increased continuously with
been shown to play an important role, with orthographic neighbors the sum of the unit digits indicating that unit sum indeed provides the
that occur more frequently in language producing larger N400 basis for the decision whether a carry is needed or not. Additionally,
amplitudes than orthographic neighbors that occur less frequently. after the need for a carry procedure was detected carry addition
Again, our simulations showed a similar influence on network error. problems were associated with particular processing of the unit digits
In psychological terms, network error has been conceptualized as of both summands as indicated by an increase in refixations.
implicit prediction error, and we interpret our results as yielding In the current study, we aimed at valuation of how far these results
further support for the notion that N400 amplitudes reflect implicit of the specific processing of unit digits associated with the carry-over
prediction error in the semantic system (McClelland 1994; Rabov- procedure in addition can be generalized to the borrowing procedure
sky, McRae 2014). in subtraction. Similar to the case of the carry-over procedure, the
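To make the two model quantities concrete, the following is an illustrative sketch, not the authors' attractor network: semantic network error as the mismatch between produced and correct feature activations, and the model analogue of orthographic neighbourhood size defined over the 30 word-form input units.

```python
import numpy as np

def network_error(produced, target):
    """Summed squared difference between the activation pattern produced over
    the semantic feature units and the correct (target) pattern."""
    return float(np.sum((np.asarray(produced) - np.asarray(target)) ** 2))

def n_orthographic_neighbours(word_form, lexicon):
    """Count lexicon entries whose word-form vector differs from the target in
    exactly one input unit (the analogue of a one-letter neighbour)."""
    diffs = np.sum(np.asarray(lexicon) != np.asarray(word_form), axis=1)
    return int(np.sum(diffs == 1))

# Toy example with random binary word forms over 30 input units:
rng = np.random.default_rng(0)
lexicon = rng.integers(0, 2, size=(100, 30))
print(n_orthographic_neighbours(lexicon[0], lexicon))
```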
necessity of a borrowing procedure can also be evaluated when
References processing the unit digit of the subtrahend during first pass encoding
Kutas M, Federmeier KD (2011) Thirty years and counting: finding (i.e., by checking whether the difference of the unit digits of minuend
meaning in the N400 component of the event-related brain and subtrahend is B 0) (Geary et al. 1993; Imbo et al. 2007). Fur-
potential (ERP). Ann Rev Psychol 62:621647 thermore, after the need for a borrowing procedure was detected, later
McClelland JL (1994) The interaction of nature and nurture in processing stages may well involve particular processing of the unit
development: a parallel distributed processing perspective. In digits of minuend and subtrahend. Therefore, we expected the influ-
Bertelson PEP, dYdewalle G (ed) International perspectives on ence of the necessity of a borrowing procedure in subtraction
psychological science, vol 1. Erlbaum, UK problems on participants eye fixation behavior to mirror the influence
McRae K, Cree GS, Seidenberg MS, McNorgan C (2005) Semantic of the carry-over procedure in addition.
feature production norms for a large set of living and nonliving Forty-five students [9 males, mean age: 23.9 years;
things. Behav Res Methods 37(4):547559 SD = 7.2 years] solved both 48 addition and 48 subtraction problems
Rabovsky M, McRae K (2014) Simulating the N400 ERP component in a choice reaction time paradigm. Their fixation behavior was
as semantic network error: insights from a feature-based con- recorded using an EyeLink 1000 eye-tracking device (SR-Research,
nectionist attractor model of word meaning. Cognition 132:6889 Kanata, Canada) providing a spatial resolution of less than 0.5
degrees of visual angle at a sampling rate of 500 Hz. In a 2 x 2 design
arithmetic procedure (addition vs. subtraction) and the necessity of a
carry-over or borrowing procedure was manipulated orthogonally
Similar and differing processes underlying carry with problem size being matched. Problems were displayed in white
and borrowing effects in addition and subtraction: against a black background in non-proportional font Courier New
(style bold, size 50). Each problem was presented together with two
evidence from eye-tracking solution probes of which participants had to indicate the correct one
by pressing a corresponding button. The order, in which participants
Patricia Angela Radler1, Korbinian Moeller2,3, Stefan Huber2, Silvia completed the addition and subtraction task, was counterbalanced
Pixner1 across participants. For the analysis of the eye-tracking data, areas of
1
Institute for Psychology, UMITHealth and Life Sciences interest were centered around each digit (height: 200 pixels, width: 59
University, Hall, Tyrol, Austria; 2 Knowledge Media Research pixel). All fixations falling within a respective area of interest were
Center, Tubingen, Germany; 3 Department of Psychology, considered fixations upon the corresponding digit.
Eberhard-Karls University, Tubingen, Germany Generally, additions were solved faster than subtractions
(3,766 ms vs. 4,581 ms) and carry/borrow problems were associated
with longer reaction times (4,783 ms vs. 3,564 ms). Importantly,
Keywords
however, effects of carry-over and borrowing were also observed in
Eye fixation behavior, Addition, Subtraction, Carry-over, Borrowing
participants eye fixation behavior. Replicating previous results the
Recent research indicated that investigating participants eye fix- necessity of a carry-over led to a specific increase of FFD on the unit
ation behavior (Rayner 1998; Rakoczi 2012) can be informative to digit of the second summand (323 ms vs. 265 ms) during first pass
evaluate processes underlying numerical cognition (Geary et al. 1993; encoding. Interestingly, this was also observed for a required

borrowing procedure. FFD were specifically elevated on the unit research dedicated to this issue. However, comparatively little
digits of the subtrahend (415 ms vs. 268 ms). However, in contrast to research has focused on the implicit learning of vocabulary and, to
our hypothesis we did not observe such a congruity between the our knowledge, no study has examined whether syntax and vocabu-
influences of carry in addition and borrowing in subtraction on later lary can be acquired simultaneously. This is an important question,
processing stages. While the need for a carry procedure led to a given that in language acquisition outside of the experimental lab,
specific increase of the processing of the unit digits of both summands subjects are exposed to (and learn) many linguistic features at the
(as indicated by an increase of fixations on these digits, 2.04 vs. 1.55 same time. This paper reports the results of an experiment that
fixations), this specificity was not found for borrowing subtraction investigated the implicit learning of second language (L2) syntax and
problems, for which the number of fixations increased evenly on tens vocabulary by adult learners. The linguistic focus was on verb
(2.15 vs. 1.74 fixations) and units (2.33 vs. 1.84 fixations) due to the placement in simple and complex sentences (Rebuschat, Williams
need for a borrowing procedure. 2009, 2012; Tagarelli, Borges, Rebuschat 2011, in press). The novel
Taken together, these partly consistent but also differing results for the vocabulary items were ten pseudowords, taken from Hamrick and
carry-over procedure in addition and the borrowing procedure in sub- Rebuschat (2012, 2013).
traction indicate that evaluating the need for both is associated with Sixty native speakers of English were exposed to an artificial
specific processing of the unit digit of the second operand (i.e., the second language consisting of German syntax and English words, including
summand or the subtrahend). This is plausible, as in both addition and ten pseudowords that followed English phonotactics. Subjects in the
subtraction the sum or the difference between the unit digits is indicative incidental group (n = 30) did not know they were going to be tested,
of whether a carry-over or borrowing procedure is necessary. Impor- nor that they were supposed to learn the grammar or vocabulary of a
tantly, both the sum of the unit digits as well as their difference can only novel language. The exposure task required subjects to judge the
be evaluated after having considered the unit digit of the second operand. semantic plausibility of 120 different sentences, e.g. Chris placed
However, later processes underlying the carry-over and borrowing pro- today the boxes on the dobez (plausible) and Sarah covered usually
cedure seem to differ. While the need for a carry procedure is associated the fields with dobez (implausible). The task thus required subjects
with specific reprocessing of the unit digits of both summands this was to process the sentences for meaning. Subjects were provided with a
not the case for a required borrowing procedure. Thereby, these data picture that matched the meaning of the pseudowords, in the exam-
provide first direct evidence, suggesting that similar cognitive processes ples above with a black-and-white drawing of a table underneath the
underlie the recognition whether a carry or borrowing procedure is sentence. Subjects in the intentional group (n = 30) read the same
needed to solve the problem at hand. On the other hand, further pro- 120 sentences but were asked to discover the word-order rules and to
cessing steps may differ between addition and subtraction. Future studies memorize the meaning of the pseudowords. In the testing phase, all
are needed to investigate the processes underlying the execution of the subjects completed two tests, a grammaticality judgment task to
borrowing procedure in subtraction more closely. assess whether they had learned the novel syntax and a forced-choice
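The unit-digit criterion discussed above can be stated compactly. The sketch below is an illustration, not the authors' materials: an addition requires a carry when the unit digits sum to 10 or more, and a subtraction requires a borrow when the unit digit of the minuend is smaller than that of the subtrahend (i.e., their difference is negative).

```python
def needs_carry(summand1, summand2):
    """Carry needed if the unit digits sum to 10 or more."""
    return (summand1 % 10) + (summand2 % 10) >= 10

def needs_borrow(minuend, subtrahend):
    """Borrow needed if the minuend's unit digit is smaller than the subtrahend's."""
    return (minuend % 10) < (subtrahend % 10)

# The examples used above:
assert needs_carry(28, 36) and not needs_carry(23, 41)      # 28 + 36 vs. 23 + 41
assert needs_borrow(64, 28) and not needs_borrow(64, 41)    # 64 - 28 vs. 64 - 41
```

In both cases the decision can only be made after the unit digit of the second operand has been encoded, which is why first-pass fixations on that digit are diagnostic.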
task to assess their knowledge of the pseudowords. In both tasks,
subjects were also asked to report how confident they were and to
References indicate what the basis of their judgment was. Confidence ratings and
Geary DC, Frensch PA, Wiley JG (1993) Simple and complex mental source attributions were employed to determine whether exposure had
subtraction: strategy choice and speed-of-processing differences resulted in implicit or explicit knowledge (see Rebuschat 2013, for a
in younger and older adults. Psychol Aging 8(2):242256 review).
Green HJ, Lemaire P, Dufau S (2007) Eye movements correlates of Data collection has recently concluded but data have not yet been
younger and older adults strategies for complex addition. Acta fully analyzed. Given our previous research (e.g. Tagarelli et al. 2011,
Psychol 125:257278. doi:10.1016/j.actpsy.2006.08.001 in press; Grey, Williams, Rebuschat 2014; Rogers, Revesz, Rebus-
Imbo I, Vandierendonck A, Vergauwe E (2007) The role of working chat, in press; Rebuschat, Hamrick, Sachs, Riestenberg, Ziegler
memory in carrying and borrowing. Psychol Res 71:467483. 2013), we predict that subjects will be able to acquire both the syntax
doi:10.1007/s00426-006-0044-8 and the vocabulary of the artificial language simultaneously and that
Moeller K, Klein E, Nuerk HC (2011a) (No) Small adults: childrens the amount of implicit and explicit knowledge will vary depending on
processing of carry addition problems. Dev Neuropsychol the learning context, with subjects in the incidental group acquiring
36(6):702720 primarily implicit knowledge and also some explicit knowledge, and
Moeller K, Klein E, Nuerk HC (2011b) Three processes underlying vice versa in the intentional group The paper concludes with impli-
the carry effect in additionevidence from eye tracking. Br J cations for future research.
Psychol 102:623645. doi:10.1111/j.2044-8295.2011.02034.x
Rakoczi G (2012) Eye Tracking in Forschung und Lehre. References
Moglichkeiten und Grenzen eines vielversprechenden Er- Grey S, Williams JN, Rebuschat P (2014) Incidental exposure and
kenntnismittels. In Gottfried C, Reichl F, Steiner A (Hrsg.), L3 learning of morphosyntax. Stud Second Lang Acquis
Digitale Medien: Werkzeuge fur exzellente Forschung und Le- 36:134
hre (S. 8798). Munster: Waxmann Hamrick P, Rebuschat P (2012) How implicit is statistical learning?
Rayner K (1998) Eye movements in reading and information pro- In Rebuschat P, Williams JN (eds) Statistical learning and
cessing: 20 years of research. Psychol Bull 124:372422 language acquisition. Mouton de Gruyter, Berlin, pp 365382
Hamrick P, Rebuschat P (2013) Frequency effects, learning conditions,
and the development of implicit and explicit lexical knowledge.
In Connor-Linton J, Amoroso L (eds) Measured language:
Simultaneous acquisition of words and syntax: quantitative approaches to acquisition, assessment, processing
contrasting implicit and explicit learning and variation. Georgetown University Press, Washington
Rebuschat P, Williams JN (2009) Implicit learning of word order. In
Taatgen NA, van Rijn H (eds) Proceedings of the 31st annual
Patrick Rebuschat, Simon Ruiz
conference of the cognitive science society. Cognitive Science
Lancaster University, UK
Society, Austin, pp 425430
The topic of implicit learning plays a central role in cognitive psy- Rebuschat P (2013) Measuring implicit and explicit knowledge in
chology, and recent years have witnessed an increasing amount of second language research. Lang Learn 63(3):595626

Rebuschat P, Hamrick P, Sachs R, Riestenberg K, Ziegler N (2013) Shared-space interaction study


Implicit and explicit knowledge of form-meaning connections: Identifying corresponding human communication strategies requires
evidence from subjective measures of awareness. In Bergsleithner J, studying humans in free interaction. Therefore, we investigate face-
Frota S, Yoshioka JK (eds) Noticing: L2 studies and essays in honor to-face, goal-oriented interactions in a natural setting which com-
of Dick Schmidt. University of Hawaii Press, Honolulu, pp 255275 prises spatial references with gaze and pointing gestures. In a route
Rogers J, Revesz A, Rebuschat P (in press) Implicit and explicit knowledge planning scenario, participants are to plan paths to rooms on three
of L2 inflectional morphology: an incidental learning study floors of a university building. The three corresponding floor plans are
Tagarelli KM, Borges Mota M, Rebuschat P (in press 2014) Working located on a table between them. The scale of the 32x32 cm plans is
memory, learning context, and the acquisition of L2 syntax. In approximately 1:180, each floor having about 60 rooms. The floor
Zhisheng W, Borges Mota M, McNeill A (eds) Working memory plans are printed on a DIN A0 format poster. This way, each par-
in second language acquisition and processing: theory, research ticipant has one floor plan directly in front of him or her, one is shared
and commentary. Multilingual Matters, Bristol with the interaction partner, and one plan is not reachable. The dif-
Tagarelli K, Borges Mota M, Rebuschat P (2011) The role of working ficulty of the task is increased by introducing blocked areas in the
memory in the implicit and explicit learning of languages. In hallways: Detours have to be planned (forcing participants to
Carlson L, Holscher C, Shipley T (eds) Proceedings of the 33rd repeatedly change the floor), which lead to more complex interactions
annual conference of the cognitive science society. Cognitive ensuring a lively interaction and not a rigid experimental design with
Science Society, Austin, pp 20612066 artificial stimuli.
Recorded data
In the experiments, multimodal data were recorded: Two video
cameras observed the participants during the interactions. One par-
Towards a model for anticipating human gestures ticipant was equipped with mobile eye-tracking glasses. Pointing
in human-robot interactions in shared space directions and head positions of both participants were recorded by
an external tracking system. As analyzing eye-tracking data usually
requires time-consuming manual annotation, an automatic approach
Patrick Renner1, Thies Pfeiffer2, Sven Wachsmuth2 was developed combining fiducial marker tracking and 3D-modeling
1
Artificial Intelligence Group, Bielefeld University, Germany; of stimuli in virtual reality as proxies for intersection testing
2
CITEC, Bielefeld University, Germany between the calculated line of sight and the real objects Pfeiffer and
Abstract Renner (2014). The occurrence of pointing gestures to rooms, stairs,
Human-robot interaction in shared spaces might benefit from human elevators and markers for blocked areas were annotated semi-
skills of anticipating movements. We observed human-human inter- automatically.
actions in a route planning scenario to identify relevant Results
communication strategies with a focus on hand-eye coordination. The results of our experiments show that at the time of a pointing
gestures onset, it is indeed possible to predict its target area when
Keywords taking into consideration fixations which occurred in the last 200 ms
Shared-space interaction, Hand-eye coordination, 3D eye tracking before the onset. When allowing a maximum deviation of 20 cm the
target area was predicted for 75 % of the cases and with a maximum
Introduction deviation of 10 cm for 50 % of the cases. Figure 1 shows an example
A current challenge in human-robot interaction is to advance from of a fixation on the target of a pointing gesture, preceding the hand
using robots as tools to solving tasks cooperatively with them in close movement. In the same study, we also analyzed body movements, in
interaction. When humans and robots interact in shared space, by the
overlap of the peripersonal spaces of the interaction partners, an
interaction space is formed (Nguyen and Wachsmuth 2011). Here, the
actions of both partners have to be coordinated carefully in order to
ensure a save cooperation as well as a flawless, successful task
completion. This requires capabilities beyond collision avoidance,
because the robot needs to signal a mutual understanding of situations
where both interaction partners interfere. With a dynamic represen-
tation of its peripersonal space (Holthaus and Wachsmuth 2012), a
robot can be aware of its immediate surrounding and this way, e.g.,
avoid collisions before they are actually perceived as a potentially
harmful situation. However, shared-space interactions of humans and
robots are still far from being as efficient as those between humans.
Modeling human skills for anticipating movements could help
robots to increase robustness and smoothness of shared-space inter-
actions. Our eyes often rest on objects we want to use or to refer to. In
a specific pointing task, Prablanc et al. (1979) found that the first
saccade to the target occurs around 100 ms before the hand move-
ment is initiated. If the robot were able to follow human eye gaze and
to predict upcoming human gestures, several levels of interaction
could be improved: First, anticipated gesture trajectories could be
considered during action planning to avoid potentially occupied areas.
Second, action executions could be stopped if the robot estimates a
human movement conflicting with its current target. Third, the robot
could turn its sensors towards the estimated target to facilitate com- Fig. 1 An example for a fixation (highlighted by the ring) anticipat-
munication robustness and increase the humans confidence in the ing the pointing target. The three floor plans can be seen. The black
grounding of the current target (Breazeal et al. 2005). tokens serve for marking blocked areas

particular leaning forward: Participants almost exclusively used On the other hand, expert knowledge could also trigger top-down
leaning forward to point to targets more distant than 65 cm (from the mechanisms supporting object recognition despite of impaired basic
edge of the table). functions of object processing. Finally, a more efficient stimulus
Conclusion processing for expert objects might simply not require complete
Altogether, our findings provide quantitative data to develop a prediction resources of an intact ventral stream.
model considering both eye-hand coordination and leaning forward. This
could enable the robot to have a detailed concept of an upcoming human
pointing movement. For example, based on current gaze information of
its interlocutor, a robot could predict that a starting pointing gesture Visual salience in human landmark selection
would end within a 20 cm radius around the currently fixated point (with
a 75 % chance). This will allow the robot to decide whether the predi- Florian Roser, Kai Hamburger
cated target space is in conflict with its own planned actions and it might University of Giessen, Germany
react accordingly, e.g. by avoiding the area or pausing. Abstract
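A minimal sketch of this anticipation rule, assuming fixations have already been mapped onto table coordinates (e.g. via the EyeSee3D approach cited above); the data structures and the averaging step are illustrative assumptions.

```python
import numpy as np

def predict_pointing_target(fixations, gesture_onset, window=0.2):
    """fixations: iterable of (timestamp_s, x_cm, y_cm) on the table plane.
    Returns the mean fixated point within `window` seconds before gesture
    onset, or None if no fixation fell into that interval."""
    recent = [(x, y) for t, x, y in fixations
              if gesture_onset - window <= t <= gesture_onset]
    if not recent:
        return None
    return np.mean(np.array(recent), axis=0)          # predicted (x, y) in cm

def prediction_correct(predicted, actual, radius_cm=20.0):
    """True if the predicted point lies within `radius_cm` of the true target."""
    return (predicted is not None and
            np.linalg.norm(predicted - np.asarray(actual)) <= radius_cm)
```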
Visual aspects of landmarks are a main component in almost every
Acknowledgments
theory about landmark preference, selection and definition theory. But
This work has been partly funded by the DFG in the SFB 673
could this aspect be moderated by some other factors, for example the
Alignment in Communication.
objects position?
Keywords
References Spatial cognition, Landmarks, Visual salience
Breazeal C, Kidd CD, Thomaz AL, Hoffman G, Berlin M (2005)
Effects of nonverbal communication on efficiency and robustness Introduction
in human-robot teamwork. In: Intelligent Robots and Systems Visual aspects of objects play an elementary role in the landmark
2005. (IROS 2005). 2005 IEEE/RSJ International Conference on, selection process during a wayfinding task (Sorrows, Hirtle 1999; for
IEEE, pp 708713 an overview see Caduff, Timpf 2008). Thereby the contrast to the
Holthaus P, Wachsmuth S (2012) Active Peripersonal Space for More surrounding of the object is elementary, namely the contrast to other
Intuitive HRI. In: International Conference on Humanoid Robots, objects. Our assumption is that the visual aspect of an object in a
IEEE RAS, Osaka, Japan, pp 508513 wayfinding context is only necessary to recognize this object. The
Nguyen N, Wachsmuth I (2011) From body space to interaction decision in favor of or against an object will be based on other, more
space-modeling spatial cooperation for virtual humans. In: 10th cognitive aspects, for example an objects position.
International Conference on Autonomous Agents and Multi- Two contrary assumptions exist. On the one side preliminary
agent Systems, International Foundation for Autonomous experiments showed that the ideal position at an intersection (allo-
Agents and Multiagent Systems, Taipei, Taiwan, pp 10471054 centric perspective) is the position in front of the intersection in the
Pfeiffer T, Renner P (2014) EyeSee3D: A Low-cost Approach for Ana- direction of turn (Roser, Hamburger, Krumnack, Knauff 2012). On
lyzing Mobile 3D Eye Tracking Data Using Computer Vision and the other side it has been shown that in an arrangement of different
Augmented Reality Technology. In: Proceedings of the Symposium objects the pop-out object (single one or singleton) will be
on Eye Tracking Research and Applications, ACM, pp 195202 preferred (Roser, Krumnack, Hamburger 2013). Here we want to
Prablanc C, Echallier J, Komilis E, Jeannerod M (1979) Optimal discuss in how far these two contrasting assumptions go together and
response of eye and hand motor systems in pointing at a visual which influence different tasks or instructions can have.
target. Biological cybernetics 124 pp 113124 Experiment
Method
A total of 32 students (21 $; age: 27 years; range: 1956) partici-
pated. All participants provided informed written consent. All had
Preserved expert object recognition in a case normal or corrected-to-normal visual acuity and color vision (tested
of unilateral visual agnosia with Velhagen and Broschmann 2003). They received course credits
or money for participation.
Materials and Procedure
Johannes Rennig, Hans-Otto Karnath, Marc Himmelbach
The material existed of four grey (apartment) blocks with one white
Center of Neurology, Hertie-Institute for Clinical Brain Research,
square each at the medially oriented corner. These should represent
University of Tubingen, Tubingen, Germany
the facades of the building (Roser et al. 2012) at an intersection.
We examined a stroke patient (HWS) with a unilateral lesion of the Within these blocks the different objects are placed; we call them
right medial ventral visual stream. A high resolution MR scan showed landmarks. This is due to the fact that they could help to orientate in
a severe involvement of the fusiform and parahippocampal gyri such an environment.
sparing big parts of the lingual gyrus. In a number of object recog- All landmark objects consist of a cross and five thin lines in dif-
nition tests with lateralized presentations of target stimuli, HWS ferent arrangements so that they are generally distinct (Fig. 1). Three
showed remarkable deficits for contralesional presentations only. His of these landmarks had the same color, one was different (singleton).
performance on the ipsilesional side was unaffected. We further The color differences range from 0 to 180. The color gradient is
explored his residual capabilities in object recognition confronting visible in Fig. 1 (left top and bottom). The single one was presented
him with objects he was an expert for. These were items he knew once at each position at the intersection and once at each position for
from his job as a trained car mechanic that were occupationally and a left and right turn (2nd and 3rd experimental condition). This results
personally relevant for him. Surprisingly, HWS was able to identify in 64 different pictures/intersections which were presented in a ran-
these complex and specific objects on the contralesional side while he domized order.
failed in recognizing even highly familiar everyday objects. This Each participant was assigned to one of three experimental con-
observation of preserved expert object recognition in visual agnosia ditions. In the first condition (intersection) the intersections were
gives room for several explanations. At first, these results may be presented without a route direction arrow and the task was to choose
caused by enhanced information processing of the ventral system in the object that pops out most. In the second one (intersection and
the intact hemisphere that is exclusively available for expert objects. arrow) an arrow indicated the direction in which a change of route

N, Wachsmuth I (eds) Proceedings of the 35th annual conference of the cognitive science society. Cognitive Science Society, Austin, TX, pp 3315–3320
Sorrows ME, Hirtle SC (1999) The nature of landmarks for real and
electronic spaces. In: Freksa C, Mark DM (eds) Spatial infor-
mation theory: cognitive and computational foundations of
geographic information science, international conference COSIT.
Springer, Stade, pp 3750
Velhagen K, Broschmann D (2003) Tafeln zur Prüfung des Farbsinns. 33., unveränderte Auflage. Georg Thieme Verlag, Stuttgart

Left to right or back to front? The spatial flexibility of time
Fig. 1 Left (top and bottom) the used colors of the objects and the
color gradient. Top (middle and right) examples of the intersections
Susana Ruiz Fernandez1, Juan Jose Rahona2, Martin Lachmair1
with and without an arrow. Bottom (right) results. The x-axis 1
Leibniz Knowledge Media Research Center (KMRC), Tubingen,
represents the single experimental variation (low difference on the left
Germany; 2 Complutense University, Madrid, Spain
and high on the right). The y-axis represents the participants relative
object selection of the single object (percentage number) How is time represented in space? Strong evidence was found for a
spatial representation of time that goes from left-to-right with past
represented on the left and future on the right side (Santiago et al.
was about to happen (Fig. 1); the task still remained the same as in the 2007). There is also evidence for a back-to-front timeline with past
first condition. In the third condition the intersections looked the same represented behind and future ahead (Ulrich et al. 2012). Based on the
as in the second one, but now the task was to choose the object, which notion of a flexible representation of time onto space (Torralbo et al.
the participant would use in order to give a route description. All 2006) the present study compared both time representations directly.
experiments were run on the same computer (19 inches). Embodied theories suggest that internal representations of abstract
Results concepts include multimodal perceptual and motor experiences.
Figure 1 (bottom right) depicts the frequency for choosing the single Assuming richer back-to-front spatial experiences through our senses,
object. In the condition intersection it can be seen that the single we expect faster responses for the back-to-front than for the left-to-
object is clearly identifiable by a color difference of 11 on right response mapping.
(*100 %). 0 and 3 are on chance level and 6 something between. Method
A similar result is observable for the condition intersection and After the presentation of a future or past related word (e.g., yesterday,
arrow. The remaining condition, in which the participants had to tomorrow), forty-four participants (all right handed) had to classify
decide which object they would prefer to give a route description, the time word moving the slider of a response device along one of two
shows a different curve. First of all, it increases slower and secondly, axes (left-to-right axis or back-to-front axis) according to the tem-
it reaches its top at around 60 %. On the other hand participants chose poral content of the words. In the congruent condition, participants
the ideal position in 60 % of the cases This differs significantly from had to perform a right or forward movement in response to a future-
chance level (t(9) = 4.576, p = .001). related word and a left or backward movement if a past-related word
Discussion was presented. In the incongruent condition, a backward or left
Participants are capable to identify the single object, if the color movement was performed in response to a future-related word and a
difference exceeds 6. The instruction which one would you choose forward or right movement in response to a past-related word. For the
to give a route description led to different landmark selections. Here performance of the movement, a response device was used that
the position seems to play a major role. Thus, we may conclude that recorded continuous movements of the manual response in the left-to-
the perception of the color distribution at the intersection is moderated right and the back-to-front plane (see Ulrich et al. 2012). Touch-
by the task at hand. sensitive devices registered the onset of the response and the time
One interpretation could be that the contrast to the surrounding of when the slider of the device reached one of the two endpoints.
landmarks at an intersection is strongly moderated by the participants Reaction time (RT) required from the onset of the presentation of the
task. This will be examined in more detail in further experiments. word to the onset of the response (leaving the start position of the
slider) was measured. Additionally, the movement time (MT)
Acknowledgments required from response onset to one of the two endpoints was mea-
We thank Anna Bosch and Sarah Jane Abbott for help within data sured. Depending on the required response axis response device was
recording. rotated by 90 or 180.
The experiment consisted of four experimental blocks. The
References experiment combined the factors response axis (back-to-front vs. left-
Caduff D, Timpf S (2008) On the assessment of landmark salience for to-right) and response congruency (congruent: forward or right
human navigation Cog Pro 9:249267 movement to future-related words and backward or left movement to
Klippel A, Winter S (2005) Structural salience of landmarks for route past-related words vs. incongruent: forward or right movement to past
directions. In: AG Cohn, DM Mark (Eds) Spatial information theory. related words and backward or left movement to future-related
International Conference COSIT, Springer, Berlin pp 346362 words). Each combination resulted in one block that included 120
Roser F, Hamburger K, Krumnack A, Knauff M (2012) The structural trials (including 20 practice trials). Separate repeated measures ana-
salience of landmarks: Results from an online study and a virtual lyzes of variance (ANOVA) were conducted on mean RT and mean
environment experiment. J of Spatial Science 5, 3750 MT taking participants (F1) as well as items (F2) as random factors.
Roser F, Krumnack A, Hamburger K (2013) The influence of per- When necessary, p-values were adjusted for violations of the sphe-
ceptual and structural salience. In: Knauff M, Pauen M, Sebanz ricity assumption using the Greenhouse-Geisser correction.
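The F1/F2 analysis described above is standard in psycholinguistics. Purely for illustration (this is not the authors' analysis script, and the file and column names are assumptions), such a by-participants and by-items repeated-measures ANOVA could be run along the following lines:

# Hypothetical sketch: F1 (by participants) and F2 (by items) repeated-measures
# ANOVAs for the 2 (response axis) x 2 (congruency) within-subjects design.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

def rm_anova(trials: pd.DataFrame, random_factor: str):
    """random_factor is 'participant' for the F1 analysis, 'item' for F2."""
    cell_means = (trials
                  .groupby([random_factor, "axis", "congruency"], as_index=False)["rt"]
                  .mean())
    return AnovaRM(cell_means, depvar="rt", subject=random_factor,
                   within=["axis", "congruency"]).fit()

# usage sketch:
# trials = pd.read_csv("time_word_rts.csv")   # hypothetical file
# print(rm_anova(trials, "participant"))      # F1
# print(rm_anova(trials, "item"))             # F2
# AnovaRM reports uncorrected degrees of freedom; the Greenhouse-Geisser
# correction mentioned in the abstract would have to be applied separately.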


Results
RT results are shown in Fig. 1, which depicts mean RT as a function of response congruency and response axis. An ANOVA on RT showed shorter RT for the congruent (768.84 ms) compared to the incongruent condition (812.15 ms), F1(1, 43) = 29.42, p < .001; F2(1, 19) = 104.37, p < .001. Participants were faster initiating a right or forward movement for future-related words and a left or backward movement for past-related words than initiating a left or backward movement for future-related words and a right or forward movement for past-related words. RTs were marginally shorter for the left-to-right axis (783.07 ms) compared to the back-to-front axis (797.92 ms), F1(1, 43) = 3.73, p = .060; F2(1, 19) = 14.06, p = .001. The interaction between response congruency and response axis failed to reach significance, F1 and F2 < 1.
An ANOVA on MT did not reveal significant effects of response congruency [F1(1, 43) = 0.52, p = .476; F2(1, 19) = 2.59, p = .124], response axis [F1(1, 43) = 1.58, p = .216], or their interaction [F1 and F2 < 1]. Only the F2 analysis revealed an effect of response axis, F2(1, 19) = 23.04, p < .001. Accordingly, response congruency and response axis affected movement initiation but not movement execution.
Fig. 1 Mean RT depending on response congruency and response axis
Discussion
Results support a flexible projection of time onto space. Unexpectedly, a trend toward faster responses for the left-to-right mapping was found, suggesting an influence of reading direction on response axis. A possible explanation is that reading temporal words could activate the left-to-right response axis. This activation needs to be inhibited when a front-to-back response is performed. This explanation is supported by recent experiments that show a higher activation of the time–space congruency when visual (instead of auditory) stimuli were used (Rolke et al. 2013).
Acknowledgments
We thank R. Bahlinger, V. Engel, N. Feldmann, P. Huber, S. Kaiser, J. Kinzel, H. Kriening, S. Riedel, K. Wessolowski, E. Wiedemann and K. Zeeb for their assistance.
References
Rolke B, Ruiz Fernandez S, Schmid M, Walker M, Lachmair M, Rahona Lopez JJ, Hervas G, Vazquez C (2013) Priming the mental time-line: effects of modality and processing mode. Cogn Process 14:231–244
Santiago J, Lupiáñez J, Pérez E, Funes MJ (2007) Time (also) flies from left to right. Psychon B Rev 14:512–516
Torralbo A, Santiago J, Lupiáñez J (2006) Flexible conceptual projection of time onto spatial frames of reference. Cogn Sci 30:745–757
Ulrich R, Eikmeier V, de la Vega I, Ruiz Fernandez S, Alex-Ruf S, Maienborn C (2012) With the past behind and the future ahead: back-to-front representation of past and future sentences. Mem Cognit 40:483–495

Smart goals, slow habits? Individual differences in processing speed and working memory capacity moderate the balance between habitual and goal-directed choice behavior

Daniel Schad1, Elisabeth Junger2, Miriam Sebold1, Maria Garbusow2, Nadine Bernhart2, Amir Homayoun Javadi3, Ulrich S. Zimmermann2, Michael Smolka2, Andreas Heinz1, Michael A. Rapp4, Quentin Huys5
1 Charité, Universitätsmedizin Berlin, Germany; 2 Technische Universität Dresden, Germany; 3 University College London (UCL), UK; 4 Universität Potsdam, Germany; 5 TNU, ETH and University of Zürich, Switzerland

Choice behavior is shaped by cognitively demanding goal-directed and by more automatic habitual processes. External cognitive load manipulations alter the balance of these systems. However, it is unclear how individual differences in specific cognitive abilities contribute to the arbitration between habitual and goal-directed decision-making.
Twenty-nine adults performed a two-step decision task explicitly designed to capture the two systems' computational characteristics. We also collected measures of fluid and crystallized intelligence.
There was an inverted U-shaped relationship between processing speed and habitual choice, together with a linear relationship between processing speed and goal-directed behavior. Working memory capacity impacted this balance only amongst those subjects with high processing speed.
Different aspects of intelligence thus make specific contributions to complex human decision-making, and individual differences in these cognitive abilities moderate the balance between habitual and goal-directed choice behavior.
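The abstract does not spell out how habitual and goal-directed tendencies are read off the two-step task. As a purely illustrative sketch (not the authors' analysis; column names and codings are assumptions), the standard descriptive summary looks at the probability of repeating the previous first-stage choice as a function of previous reward and previous transition type:

# Hypothetical sketch: stay probabilities in a two-step decision task.
import pandas as pd

def stay_probabilities(trials: pd.DataFrame) -> pd.Series:
    """Assumed per-trial columns, in presentation order:
    'choice1'    first-stage choice (0 or 1)
    'transition' 'common' or 'rare'
    'reward'     1 if the trial was rewarded, else 0
    Returns P(repeat previous first-stage choice) for each combination of
    previous reward x previous transition type."""
    prev = trials.shift(1)
    summary = pd.DataFrame({
        "stay": (trials["choice1"] == prev["choice1"]).astype(float),
        "prev_reward": prev["reward"],
        "prev_transition": prev["transition"],
    }).dropna()
    return summary.groupby(["prev_reward", "prev_transition"])["stay"].mean()

# A habitual (model-free) pattern shows up as a main effect of previous reward
# on staying; a goal-directed (model-based) pattern shows up as its interaction
# with transition type (staying after rewarded common and unrewarded rare trials).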


Tracing the time course of n − 2 repetition costs

Juliane Scheil, Thomas Kleinsorge
Leibniz Research Centre for Working Environment and Human Factors, Dortmund, Germany

Introduction
In order to flexibly adapt to a permanently changing environment, it is necessary to inhibit previously activated but now irrelevant processing pathways. Empirically, this inhibition manifests itself only indirectly in terms of a cost of re-engaging a previously inhibited pathway, the so-called n − 2 repetition costs: when switching among three tasks A, B, and C, higher reaction times and error rates occur when the task in the current trial equals the task in trial n − 2 (i.e., sequences of type ABA) compared to two consecutive switches to another task (sequences CBA). Although n − 2 repetition costs have been reported in many studies, it remains an open question when and how inhibition is triggered and how it develops over time.
A possibility to capture the time course of inhibition lies in the variation of different time intervals in the cued task switching paradigm. The cue–stimulus interval (CSI) allows participants to prepare for the next task. On the other hand, no specific preparation for the next task is possible during the response–cue interval (RCI), in which usually a fixation mark is presented that contains no information about the next task. Therefore, effects of the RCI reflect passive processes, like decaying inhibition or activation.
The present study aimed at investigating the time course of inhibition in a fine-grained manner. For this purpose, the length of the RCI (the time between the response in trial n − 1 and the cue of trial n) was manipulated in five steps separated by 125 ms each. This also allowed us to capture non-linear trends in the size of n − 2 repetition costs that could be overlooked in designs using only two distinct RCIs.
Method
In two experiments, subjects (Exp I: 10 men, 21 women, mean age 23.8 years; Exp II: 6 men, 15 women, mean age 22.7 years) switched between three tasks in an explicitly cued task switching experiment. In Exp I, participants had to indicate via keypress whether the digit serving as imperative stimulus was smaller or larger than five, odd or even, or central or peripheral regarding its position along the number line relative to the digit five. In Exp II, participants had to judge shapes regarding their size (big or small), color (yellow or blue), or shape (x or +). Stimuli were presented centrally on a 17-inch monitor on a light-grey background; viewing distance approximated 60 cm. The experimental design resulted from a factorial combination of the within-subjects factors RCI, varied in five steps (50, 175, 300, 425, and 550 ms), and Task Sequence (ABA vs. CBA).
Results
Both experiments revealed significant n − 2 repetition costs that were modulated by the RCI. Costs were highest for RCIs of 300 ms and differed significantly from those of RCIs of length 50 and 175 ms (Experiments I and II), 425 ms (Experiment I), and 550 ms (Experiment II, cf. Fig. 1).
Discussion
In both experiments, the size of n − 2 repetition costs was modulated by the length of the RCI. The highest n − 2 repetition costs were observed for the RCI of 300 ms, while they were smaller for shorter RCIs (50 ms or 175 ms). Furthermore, the size of n − 2 repetition costs declined again when the RCI exceeded 300 ms, that is, with RCIs of 425 and 550 ms. This pattern can be interpreted in terms of an overlap of two different time courses involved in inhibition.
On the one hand, inhibition seems to need about 200–300 ms to reach its full extent, reflecting a process of building up a sufficient amount of inhibition in order to cope with interference from recently established task sets. Importantly, while there have been investigations focusing on how and when inhibitory processes decline, the present study is the first trying to identify the time needed for inhibition to build up. On the other hand, our results suggest that n − 2 repetition costs, after reaching their maximum at about 300 ms, start to decay. Therefore, the results are in line with the assumption of inhibition that, once exerted, decays during the RCI.
Fig. 1 Mean n − 2 repetition cost [ms] as a function of RCI [ms], separately for Experiment I and Experiment II (*p < .05; **p < .01). Error bars represent SEM
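For illustration only (this is not the authors' analysis code, and all column names are assumptions), the ABA/CBA classification and the resulting n − 2 repetition cost per RCI described above could be computed roughly as follows:

# Hypothetical sketch: n - 2 repetition costs (mean RT in ABA minus CBA
# sequences) separately for each response-cue interval (RCI).
import pandas as pd

def n2_repetition_costs(trials: pd.DataFrame) -> pd.Series:
    """Assumed columns, one row per trial in presentation order:
    'task' ('A'/'B'/'C'), 'rci' (ms), 'rt' (ms), 'correct' (bool)."""
    t = trials.copy()
    t["task_n1"] = t["task"].shift(1)
    t["task_n2"] = t["task"].shift(2)
    t = t.dropna()
    # keep correct trials that are switches from both preceding tasks
    switches = t[t["correct"]
                 & (t["task"] != t["task_n1"])
                 & (t["task_n1"] != t["task_n2"])]
    aba = switches[switches["task"] == switches["task_n2"]]   # ABA sequences
    cba = switches[switches["task"] != switches["task_n2"]]   # CBA sequences
    return aba.groupby("rci")["rt"].mean() - cba.groupby("rci")["rt"].mean()

# The abstract reports this cost peaking at an RCI of 300 ms and shrinking for
# both shorter and longer RCIs, consistent with inhibition that first builds
# up and then passively decays.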


Language cues in the formation of hierarchical representation of space

Wiebke Schick, Marc Halfmann, Gregor Hardiess, Hanspeter A. Mallot
Cognitive Neuroscience, Dept of Biology, University of Tübingen, Germany

Keywords
Region effect, Linguistic categories, Whole-part relations, Interaction language, spatial knowledge

The formation of a hierarchical representation of space can be induced by the spatial adjacency of landmark objects belonging to the same semantic category, as was demonstrated in a route planning experiment (Wiener, Mallot 2003). Using the same paradigm, we tested the efficiency of linguistic cues with various hierarchical categorization principles in regional structuring. In different conditions, the experimental environment was parceled (i) with landmarks of different semantic categories, (ii) with superordinate fictive proper names, (iii) with superordinate prototypical names, (iv) with names from different linguistic semantic categories, and (v) with holonym–meronym relations (semantic whole–part relations). A region effect comparable to the landmark-object condition was found only for the holonym–meronym condition, which combined spatial proximity with a shared context.
Wiener, Mallot (2003) investigated the influence of regions on human route planning behavior in a hexagonal, iterated y-maze in a virtual environment. All of the 12 decision places were marked by a landmark belonging to one of three different semantic categories (vehicles, animals and paintings), thus defining three regions composed of four adjacent places. When asked to navigate routes which allowed for two equidistant alternatives, subjects consistently preferred the one that crossed fewer regional borders (61.6 % against chance level). These routes also passed more places of the target region.
In the present investigation, we repeated the experiment and also modified it to test whether such a region perception can be evoked linguistically as well.
Procedure
The test phase consisted of 18 navigation trials, including 12 with equidistant but region-sensitive route alternatives and six distractors. The subjects were asked to choose the shortest route passing three places and had access to the place names on a second screen.
Participants
Only the data of those who performed at least 50 % of the test routes correctly were included in the analysis. This applied to 65 subjects (37 female, 28 male, all 19–43 years of age).
Variables of interest
Test trials allowed for two equidistant route alternatives to the goal, differing in the number of region boundaries that had to be crossed. We call the route choices with the smaller number of region crossings region-consistent and count the total number of region-consistent routes for each subject, expecting a chance level of 50 % if route choice was based solely on distance. A significant preference for one route type is regarded as evidence for a regionalized representation of the experimental environment. We also measured the navigational errors.
Results
The results of the landmark condition confirmed the findings of Wiener, Mallot (2003). For the linguistic conditions, higher error rates as well as strong differences in the prevalence of region-consistent route choices were found. A significant preference was found only for the holonym–meronym condition. We therefore suggest that language-based induction of hierarchies must be in itself of a spatial nature to induce a regionalized representation of space.
Reference
Wiener JM, Mallot HA (2003) 'Fine-to-coarse' route planning and navigation in regionalized environments. Spatial Cogn Comput 3(4):331–358

Processing of co-articulated place information in lexical access

Ulrike Schild1, Claudia Teickner2, Claudia K. Friedrich1
1 University of Tübingen, Germany; 2 University of Hamburg, Germany

Listeners do not have any trouble identifying assimilated word forms such as the spoken string "gardem bench" as an instance of "garden bench". Assimilation of place of articulation, such as the coronal place of articulation of the final speech sound of "garden" to the dorsal place of articulation of the initial speech sound of "bench", is common in continuous speech. It is a matter of debate how the recognition system handles systematic variation resulting from assimilation. Here we test the processing of place variation as soon as it appears in the signal. We used co-articulated information in speech sounds. For example, the /o/ in "jog" already encodes the dorsal place of articulation of the following /g/.
It is still a matter of debate whether subphonemic information is normalized at a pre-lexical level of representation or is maintained and used for lexical access. On the one hand, many traditional models of spoken word recognition such as Cohort (Marslen-Wilson 1987) or TRACE (McClelland 1986) favor abstract pre-lexical representations. Here, sub-phonemic variation is resolved at a pre-lexical level. On the other hand, full-listing exemplar approaches (Goldinger 1998) assume that phonetic detail is fully represented in lexical access, with no need for pre-lexical representations that normalize for variation. Variation in co-articulation information should be less disruptive in the former than in the latter account. Somewhere in between both types of models, the featurally underspecified lexicon (FUL) model (Lahiri and Reetz 2002) avoids pre-lexical representations by means of sparse abstract lexical representations storing only those features that do not frequently undergo variation in the signal. According to FUL, non-coronal features like the labial or dorsal place of articulation are stored in the lexicon. If the input contains another place of articulation, the respective candidate is not further considered. For example, "foan" would not be able to activate "foam". By contrast, coronal place features are not stored in the lexicon. Thus, utterances containing a coronal feature at a certain position should be activated by any input containing a non-coronal feature at that position. For example, "gardem" can activate "garden".
Here we investigate the processing of co-articulatory place information in cross-modal word onset priming. We presented 41 German target words with coronal place of the word-medial consonant (e.g., "Rinne", Engl. chute), and 41 German target words with non-coronal place of the word-medial consonant (e.g., "Dogge", Engl. mastiff). In addition, we presented 41 pseudowords that diverged from the coronal target words in medial place (e.g., "Rimme"), and 41 pseudowords that diverged from the non-coronal targets in medial place (e.g., "Dodde"). Spoken prime fragments preceded the visual target words and pseudowords, which were presented in capitals. In Experiment 1, the spoken primes were the onsets of the target words and of the pseudowords up to the first nucleus. Those CV primes differed only in the place feature co-articulated in the vowel, such as "ri[n]" and "ri[m]". In Experiment 2, the spoken primes were the onsets of the target words and of the pseudowords up to the consonant following the first nucleus. Those CVC primes differed in a complete phoneme, such as "rin" and "rim". In a Match condition, the primes were followed by their carrier words (e.g., "rin"-RINNE) or carrier pseudowords (e.g., "rim"-*RIMME); in a Variation condition, the primes were followed by their respective pseudoword pair member (e.g., "rin"-*RIMME) or their respective word pair member (e.g., "rim"-RINNE). Unrelated prime–target pairs were taken as controls ("dog"-RINNE). Taken together, we manipulated Condition (Match vs. Variation vs. Control), Lexicality (words vs. pseudowords) and word-medial place of the target (coronal vs. non-coronal) as within-subject factors, and Prime Length (CV primes in Experiment 1 vs. CVC primes in Experiment 2) as a between-subject factor. Parallel to classical psycholinguistic research, we analyzed only the first presentation of the target. Presentation order was counterbalanced across participants.
With respect to the role of features in lexical access, we tested whether word recognition cascades from features to the lexicon. If so, we should not find different results for CV primes vs. CVC primes. With respect to a pre-lexical level of representation, we tested whether subphonemic variation is maintained up to the lexical level. If so, the Match condition and the Variation condition should differ for words, but not for pseudowords in Experiment 1. With respect to the assumptions of the FUL model, we tested whether lexical representations are sparse for coronal place. If so, responses to the Match condition and to the Variation condition should only differ for non-coronal targets, but not for coronal targets.
The results of a four-way ANOVA with the factors Prime Length (Experiment 1 vs. Experiment 2), Lexicality (word targets vs. pseudoword targets), Place (targets with coronal segment vs. targets with non-coronal segment) and Condition (Match vs. Variation vs. Control) are informative for our hypotheses.
First, there was no significant interaction with the factor Prime Length. That is, behavioral results were comparable across both experiments. This is support for cascaded activation of lexical representations from features to word forms.
Second, there was an interaction of the factors Condition and Lexicality. For word targets and for pseudoword targets, responses were slowest in the Control condition. For pseudowords, the Match condition and the Variation condition did not differ. However, for words, responses in the Match condition were faster than responses in the Variation condition (Fig. 1, left panel). This is support for the assumption that the lexicon is involved in processing sub-phonemic variation.
Third, there was an interaction of the factors Condition and Place. Responses to coronal targets in the Match condition and in the Variation condition did not differ from each other, but both were faster than responses in the Control condition. Responses to non-coronal targets were fastest in the Match condition, intermediate in the Variation condition and slowest in the Control condition (Fig. 1, right panel). This is evidence for the assumption of unspecified coronal place. However, it does not appear that this effect is mediated by the lexicon, because it was not modulated by the factor Lexicality.


The results suggest that information from anticipatory co-articulation is maintained and used in lexical access. Completely matching information activates the target word's lexical representation more effectively than partially mismatching information. Even subtle sub-phonemic variation reduces lexical activation. Thus, subphonemic detail appears to be used for lexical access in a similar way as phonemic information. Furthermore, our results are evidence for the FUL model.
Fig. 1 Mean lexical decision latencies (RT in ms) collapsed across both experiments. The left panel (Lexicality x Condition) shows responses to words (black) and pseudowords (white); the right panel (Place x Condition) shows responses to coronal targets (black) and non-coronal targets (white), each in the Match, Variation and Control conditions. Error bars indicate standard errors
References
Goldinger SD (1998) Echoes of echoes? An episodic theory of lexical access. Psychol Rev 105(2):251–279
Lahiri A, Reetz H (2002) Underspecified recognition. In: Gussenhoven C, Warner N (eds). Mouton de Gruyter, Berlin, pp 638–675
Marslen-Wilson WD (1987) Functional parallelism in spoken word-recognition. Cognition 25(1–2):71–102
McClelland JL, Elman JL (1986) The TRACE model of speech perception. Cogn Psychol 18(1):1–86

Disentangling the role of inhibition and emotional coding on spatial stimulus devaluation

Christine Scholtes, Kerstin Dittrich, Karl Christoph Klauer
Universität Freiburg, Abteilung Sozialpsychologie und Methodenlehre, Germany

Keywords
Spatial position, Stimulus devaluation, Emotional coding, Edge aversion, Eye tracking

In a study investigating the influence of visual selective attention on affective evaluation, Raymond, Fenske, and Tavassoli (2003) observed a distractor devaluation effect: previously to-be-ignored stimuli were emotionally devaluated compared to to-be-selected stimuli and neutral stimuli not previously presented. According to Raymond et al. (2003), this stimulus devaluation can be explained by assuming that cognitive inhibition is applied to the to-be-ignored stimulus. This inhibition is assumed to be stored with the mental representation of this stimulus and carried over to the evaluation task, where the stimulus is presented again. Another explanation is provided by Dittrich and Klauer (2012). According to their account, the act of ignoring attaches a negative emotional code to the to-be-ignored stimulus. This negative code is assumed to be stored with the mental representation of the to-be-ignored stimulus, leading to devaluation when the stimulus is encountered again.
Aside from ignoring, the spatial position of a stimulus has also proven to influence the evaluation of (e.g., Valenzuela and Raghubir 2009, 2010) and the preference for (e.g., Christenfeld 1995; Rodway, Schepman and Lambert 2012) certain stimuli. Meier and Robinson (2004) showed that upper positions are associated with positive affect and lower positions with negative affect. In another study, products in a supermarket context were evaluated as more expensive when presented in upper shelves compared to lower positioned products (Valenzuela and Raghubir 2010). In horizontal arrangements, though, there is evidence for an advantage of the central stimulus position. In several studies, participants preferred the central stimulus to laterally presented stimuli (e.g., Christenfeld 1995; Rodway et al. 2012; Valenzuela and Raghubir 2009). This effect pattern was called the center-stage effect by Valenzuela and Raghubir (2009). Attali and Bar-Hillel (2003) suggested that this pattern is not based on a preference for the central position but might rather be explained by an aversion against the edges of the stimulus configuration.
The present research combines affective stimulus devaluation and the concept of spatial position effects and measures their influence on later stimulus evaluation. It is assumed that lateral stimuli will be devaluated either due to a negative code which is applied to them (via edge aversion; extending the emotional coding account to other possible stimulus-connoting factors such as spatial position) or due to a (passive) inhibition applied to lateral positions compared to central products and comparable baseline stimuli. Moreover, the present research aims to disentangle these possible explanatory accounts of the lateral devaluation effect by using a combination of stimulus evaluation patterns and eye-tracking measurements.
Experiment 1 (N = 20) was conducted to investigate the affective evaluations of centrally and laterally presented products compared to neutral baseline stimuli. In a presentation task, three cosmetics were presented simultaneously in a row. The subsequent evaluation task revealed a devaluation of lateral stimuli compared to central and, more importantly, compared to baseline stimuli. This lateral devaluation below baseline level is a new finding, which points to a bias against the edges and not to a center-stage effect when comparing central and lateral stimuli. However, the underlying mechanisms that might have led to this lateral devaluation are not resolved yet. A devaluation of lateral products might either be based on affective coding (a positively connoted center position contrasted with a negatively connoted lateral position; see Attali and Bar-Hillel 2003; Dittrich and Klauer 2012; Valenzuela and Raghubir 2009), or the effect might be based on an attentional focus on the center product (e.g., Tatler 2007) and a possible consequential neglect of the lateral stimuli. In Experiment 2 (planned N = 80), we are currently trying to disentangle these possible mechanisms. Again, three cosmetics are simultaneously presented, this time either in a horizontal row (Condition 1; replicating Experiment 1) or in a vertical column (Condition 2). Subsequently, one single product, either previously presented or not presented, has to be emotionally evaluated by the participants. During the experiment, the participants' eye gaze is tracked. Of interest is the dwell time in three previously defined areas of interest including the three cosmetic products. We expect that products in the vertical arrangement will be evaluated more positively the higher they are placed in the column (see Meier and Robinson 2004; Valenzuela and Raghubir 2010); they are also assumed to be evaluated more positively than novel products. Products in the horizontal arrangement will be devaluated when presented laterally compared to central or novel baseline products (see results of Experiment 1). However, the participants' attentional focus is assumed to rest on the central product in both arrangements (Tatler 2007). A respective result pattern would indicate emotional coding as the underlying mechanism, as the attentional focus on the central product would imply, following the inhibition account, that in both conditions the lateral products would be inhibited and thus devaluated.
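Purely as an illustration of the dwell-time measure mentioned for Experiment 2 (this is not the authors' pipeline; the sampling rate, data layout and AOI coordinates below are assumptions), dwell time per area of interest can be accumulated from gaze samples as follows:

# Hypothetical sketch: total dwell time per rectangular area of interest (AOI).
from dataclasses import dataclass

@dataclass
class AOI:
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

def dwell_times(gaze_samples, aois, sample_duration_ms=1000 / 60):
    """gaze_samples: iterable of (x, y) screen coordinates recorded at a fixed
    rate (assumed 60 Hz here). Returns summed dwell time in ms per AOI name."""
    totals = {aoi.name: 0.0 for aoi in aois}
    for x, y in gaze_samples:
        for aoi in aois:
            if aoi.contains(x, y):
                totals[aoi.name] += sample_duration_ms
                break
    return totals

# e.g. three product AOIs in the horizontal arrangement (coordinates invented):
# aois = [AOI("left", 100, 300, 500, 700), AOI("center", 760, 300, 1160, 700),
#         AOI("right", 1420, 300, 1820, 700)]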


Preliminary analyses of the eye-tracking data of 40 participants revealed the expected gaze pattern: participants in both conditions focused on the central product. Implications for the two competing explanatory accounts as well as for the transfer of the lateral devaluation effect to consumer psychology will be discussed.

References
Attali Y, Bar-Hillel M (2003) Guess where: the position of correct answers in multiple-choice test items as a psychometric variable. J Educ Meas 40(2):109–128
Christenfeld N (1995) Choices from identical options. Psychol Sci 6(1):50–55
Dittrich K, Klauer KC (2012) Does ignoring lead to worse evaluations? A new explanation of the stimulus devaluation effect. Cogn Emot 26:193–208
Meier B, Robinson M (2004) Why the sunny side is up. Psychol Sci 15:243–247
Raymond JE, Fenske MJ, Tavassoli NT (2003) Selective attention determines emotional responses to novel visual stimuli. Psychol Sci 14(6):537–542
Rodway P, Schepman A, Lambert J (2012) Preferring the one in the middle: further evidence for the centre-stage effect. Appl Cogn Psychol 26:215–222
Tatler B (2007) The central fixation bias in scene viewing: selecting an optimal viewing position independently of motor biases and image feature distributions. J Vis 7(14):1–17
Valenzuela A, Raghubir P (2009) Position-based beliefs: the center-stage effect. J Consum Psychol 19(2):185–196
Valenzuela A, Raghubir P (2010) Are consumers aware of top-bottom but not of left-right inferences? Implications for shelf space positions (working paper). Baruch College, City University of New York, Marketing Department

The role of working memory in prospective and retrospective motor planning

Christian Seegelke1,2, Dirk Koester1,2, Bettina Blaesing1,2, Marnie Ann Spiegel1,2, Thomas Schack1,2,3
1 Neurocognition and Action Research Group, Bielefeld University, Germany; 2 Center of Excellence Cognitive Interaction Technology, Bielefeld University, Germany; 3 CoR-Lab, Bielefeld University, Germany

A large corpus of work demonstrates that humans plan and represent actions in advance, taking into account future task demands (i.e., prospective planning). Empirical evidence exists that action plans are not always generated from scratch for each movement; rather, features of previously generated plans are recalled, modified appropriately, and then used for subsequent actions (e.g., van der Wel et al. 2007). This retrospective planning is likely to serve the purpose of reducing the cognitive costs associated with motor planning. In general, these findings support the notion that action planning is contingent on both future and past events (cf. Rosenbaum et al. 2012). In addition, there is considerable evidence to suggest that motor planning and working memory (WM) share common cognitive resources (e.g., Weigelt et al. 2009; Spiegel et al. 2012, 2013). In two experiments, we further explored the role of WM in prospective and retrospective motor planning using different dual-task paradigms.
Experiment 1 examined the mutual influence of reduced attentional resources on the implementation of a new action plan and of movement planning on the transfer of information into visuospatial WM. To approach these two questions, we used a dual-task design in which participants grasped a sphere and planned a placing movement toward a left or right target, according to a directional arrow. (Previous research using a single memory task suggested that visuospatial WM is more affected by a grasp-to-place task than verbal WM; Spiegel et al. 2012.) Subsequently, participants encoded visuospatial information, i.e., a centrally presented memory stimulus (a 4 × 4 symbol matrix). While maintaining the information in WM, a visual stay/change cue (presented on the left, center or right) either confirmed or reversed the direction of the planned movement (indicated by its color). That is, participants had to execute either the prepared or a re-planned movement before they reported the symbols of the matrix without time pressure. The results show that both movement re-planning and shifting spatial attention to the location of the incongruent stay/change cues constitute processing bottlenecks, presumably because both actions draw on visuospatial WM. Importantly, the spatial attention shifts and movement re-planning appeared to be independent of each other. Further, we found that the initial preparation of the placing movement influenced the report of the memorized items. Preparing a leftward movement resulted in better memory performance for the left matrix half, while the preparation of a rightward movement resulted in better memory performance for the right matrix half. Hence, movement planning influenced the transfer of information into WM. Therefore, Experiment 1 suggests that movement planning, spatial attention and visuospatial WM are functionally related but not linked in a mandatory fashion.
Experiment 2 examined the role of WM in action plan modification processes (retrospective motor planning) using a hand path priming paradigm. Participants performed a sequential manual tapping task comprising nine movements in time with a metronome. In a defined part of the trials, the tapping movement had to cross an obstacle between the two center targets. Participants executed this task alone (motor-only conditions) or while concurrently performing a WM task of varied difficulty (i.e., counting backwards in steps of one or three; motor-WM-1 and motor-WM-3 condition, respectively). In addition, participants performed the WM tasks without simultaneously executing the motor task (WM-1 and WM-3 conditions, respectively). As the generation of a new motor plan from scratch is thought to require more WM resources than recall of a previously generated plan, we expected the retrospective effect on motor planning (measured by means of peak movement height after clearing the obstacle) to increase with task difficulty (i.e., motor-WM-3 > motor-WM-1 > motor only). Corroborating findings from earlier studies (van der Wel et al. 2007), we found that after clearing an obstacle, peak heights of the manual tapping movements were only gradually reduced. This hand path priming effect has been interpreted as indicating that participants recalled the previously generated motor plan and only slightly modified it for the subsequent movements, thereby saving cognitive processing resources. Contrary to our expectation, the results showed that the magnitude of the hand path priming effect was similar regardless of whether participants performed the motor task alone or together with a WM task. This finding suggests that WM has no moderating influence on retrospective motor planning. However, peak heights of the tapping movements were, on average, higher during the dual-task conditions compared to the single-task condition, suggesting an influence of WM on movement execution in general. In addition, WM performance was not influenced by task condition (i.e., single vs. dual task). These two experiments point toward a tight functional interaction between action control, (spatial) WM processes and attentional load. However, retrospective and prospective planning may draw differentially on WM and attentional resources.


References
Rosenbaum DA, Chapman KM, Weigelt M, Weiss DJ, van der Wel R (2012) Cognition, action, and object manipulation. Psychol Bull 138:924–946
Spiegel MA, Koester D, Schack T (2013) The functional role of working memory in the (re-)planning and execution of grasping movements. J Exp Psychol Hum Percept Perform 39:1326–1339
Spiegel MA, Koester D, Weigelt M, Schack T (2012) The costs of changing an intended action: movement planning, but not execution, interferes with verbal working memory. Neurosci Lett 509:82–86
van der Wel R, Fleckenstein RM, Jax SA, Rosenbaum DA (2007) Hand path priming in manual obstacle avoidance: evidence for abstract spatiotemporal forms in human motor control. J Exp Psychol Hum Percept Perform 33:1117–1126
Weigelt M, Rosenbaum DA, Huelshorst S, Schack T (2009) Moving and memorizing: motor planning modulates the recency effect in serial and free recall. Acta Psychol 132:68–79

Temporal preparation increases response conflict by advancing direct response activation

Verena C. Seibold, Freya Festl, Bettina Rolke
Evolutionary Cognition, Department of Psychology, University of Tübingen, Germany

Temporal preparation refers to processes of selectively attending and preparing for specific moments in time. Various studies have shown that these preparatory processes allow for faster and more efficient stimulus processing, as reflected in shorter reaction time (RT) and higher accuracy in a variety of tasks (e.g. Rolke, Ulrich 2010). Recently, however, Correa et al. (2010) showed that temporal preparation impairs performance in tasks with conflicting response information. Specifically, these authors observed that temporal preparation magnified compatibility effects in a flanker task. The flanker compatibility effect refers to an increase in RT to a target that is flanked by response-incompatible stimuli. According to dual-route models (e.g. Eimer et al. 1995), this effect arises because stimuli activate responses at a cortical level along two parallel routes: a slower controlled route, which activates responses according to task instructions, and a fast direct route, which activates responses via direct response priming. In case of incompatible flankers the direct route thus activates the incorrect response, leading to conflict. Within this framework, temporal preparation may increase conflict effects by giving direct-route processing a head start. We investigated this idea by measuring the stimulus-locked lateralized readiness potential (LRP) of the event-related potential (ERP) in a flanker task. We picked the LRP because it reflects response-hand-specific ERP lateralization over motor areas and thus enabled us to separate controlled from direct response activation in incompatible flanker trials: whereas controlled (correct) response hand activation shows up in a negative-going LRP, direct activation of the incorrect response hand emerges as an early positive LRP dip. Accordingly, if temporal preparation advances direct-route response activation, we expected to observe an earlier positive LRP in incompatible trials. In addition, this latency shift may also affect response activation in the controlled route, as indexed by the negative LRP.
Method
Twelve participants performed an arrowhead version of the flanker task. In each trial participants had to indicate the orientation of a central arrowhead with either a left- or right-hand response. This target was flanked by two vertically aligned stimuli that were either response-compatible (arrowheads pointing in the same direction), incompatible (arrowheads pointing in the opposite direction) or neutral (rectangles). To maximize compatibility effects and disentangle the time course of incorrect and correct response activation, we included a constant flanker-to-target delay of 100 ms (see Kopp et al. 1996). A blocked foreperiod (FP) paradigm (FPs of 800 and 2,400 ms) served as the manipulation of temporal preparation, whereby the short FP leads to good temporal preparation.
The LRP was derived at electrode sites C4/C3 in the digitally filtered (0.05–10 Hz), artifact-free (horizontal EOG < 30 µV; all other electrodes < 80 µV) ERP as the average of contra- minus ipsilateral activity for left- and right-hand responses. The 100 ms pre-flanker interval served as baseline. Jackknife-based onset latency (50 % relative amplitude criterion) was calculated for positive and negative LRPs (time windows: 140–240 ms and 150–400 ms).
Results
Statistical analysis was performed via repeated-measures analysis of variance (rmANOVA) and pairwise t-tests for post hoc comparisons (with Bonferroni-corrected p-values). Mean RT in correct trials, mean percentage error (PE), and negative LRP onsets were submitted to separate rmANOVAs with the factors foreperiod (short, long) and compatibility (compatible, neutral, incompatible). Positive LRP onset was analyzed via an rmANOVA with the factor foreperiod (short, long).
Analysis of mean RT revealed a compatibility main effect, F(2,22) = 75.8, p < .001 (RTcompatible < RTneutral < RTincompatible; both ps < .001; Fig. 1). Furthermore, FP had a main effect on RT, F(1,11) = 37.1, p < .001, which was further qualified by a FP x Compatibility interaction, F(2,22) = 4.6, p = .02. RTs were shorter after the short FP, but only in compatible and neutral trials (both ps < .001), not in incompatible trials (p = .16). PE was affected by compatibility, F(2,22) = 9.1, p = .01, and FP, F(1,11) = 10.6, p = .008, as well as by their interaction, F(2,22) = 10.4, p = .006. PE was selectively higher in incompatible trials, t(11) = 3.0, p = .02, specifically after the short FP, t(11) = 3.4, p = .02.
Negative LRP onset (Fig. 2a) was affected by compatibility, FC(2,22) = 28.0, p < .001, with increasing latency from compatible to neutral to incompatible trials (both ps < .001). The FP main effect was not significant, FC(1,11) = 2.64, p = .13, nor was the FP x Compatibility interaction (FC < 1). The positive LRP in incompatible trials was clearly affected by FP, FC(1,11) = 18.8, p = .001, with shorter latency after the short FP (Fig. 2b).
Fig. 1 Mean RT (correct responses) and PE as a function of compatibility and FP
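The jackknife-based onset estimation used in the Method (50 % relative amplitude criterion) measures onsets on leave-one-out grand averages. The following is only an illustrative sketch under assumed data shapes, not the authors' code:

# Hypothetical sketch: jackknife-based LRP onset latencies (50 % relative
# amplitude criterion) computed on leave-one-out grand averages.
import numpy as np

def jackknife_onsets(lrp, times, criterion=0.5, window=(140, 400)):
    """lrp: array (n_participants, n_timepoints) of single-condition LRPs;
    times: time points in ms. Returns one onset per leave-one-out average."""
    mask = (times >= window[0]) & (times <= window[1])
    onsets = np.full(lrp.shape[0], np.nan)
    for i in range(lrp.shape[0]):
        avg = np.delete(lrp, i, axis=0).mean(axis=0)    # leave participant i out
        seg, t = np.abs(avg[mask]), times[mask]
        crossings = np.flatnonzero(seg >= criterion * seg.max())
        if crossings.size:
            onsets[i] = t[crossings[0]]
    return onsets

# Tests computed on such subsample onsets must be corrected for the reduced
# variance of jackknife estimates (e.g. dividing F by (n - 1) squared), which
# is presumably what the corrected statistics (FC) reported above reflect.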


Discussion
By means of ERPs, we examined how temporal preparation affects response activation in conflict tasks. Replicating previous studies (Kopp et al. 1996), we observed clear compatibility effects, as RT and (negative) LRP latency increased from compatible to incompatible trials. Furthermore, temporal preparation increased the size of the behavioral response conflict. Most importantly, temporal preparation reduced the latency of the positive LRP in incompatible trials, indexing direct response activation, but it did not affect negative LRP latency, indexing controlled response activation. This finding suggests that temporal preparation modulates response activation along the direct route and thereby increases response conflict.
Fig. 2 a Negative LRP as a function of compatibility. b Positive LRP in incompatible trials as a function of FP. Flanker (F) and target (T) onsets are marked on the x-axis
References
Correa A, Cappucci P, Nobre AC, Lupiáñez J (2010) The two sides of temporal orienting: facilitating perceptual selection, disrupting response selection. Exp Psychol 57:142–148. doi:10.1027/1618-3169/a000018
Eimer M, Hommel B, Prinz W (1995) S–R compatibility and response selection. Acta Psychol 90:301–313. doi:10.1016/0001-6918(95)00022-M
Kopp B, Rist F, Mattler U (1996) N200 in the flanker task as a neurobehavioral tool for investigating executive control. Psychophysiol 33:282–294. doi:10.1111/j.1469-8986.1996.tb00425.x
Rolke B, Ulrich R (2010) On the locus of temporal preparation: enhancement of pre-motor processes. In: Nobre AC, Coull JT (eds) Attention and time. Oxford University Press, Oxford, pp 228–241

The flexibility of finger-based magnitude representations

Elena Sixtus, Oliver Lindemann, Martin H. Fischer
Cognitive Science Division, University of Potsdam, Germany

Finger counting is a crucial step towards accomplishing counting and understanding number. Consistent with the theoretical stance of embodied cognition (see e.g., Glenberg, Witt, Metcalfe 2013), recent studies reported evidence suggesting that adults show an influence of finger counting on cognitive number processing in various tasks (e.g., Domahs, Moeller, Huber, Willmes, Nuerk 2010; Fischer 2008). Di Luca and Pesenti (2007) demonstrated in adults that pictures of finger counting postures prime numerical size in an Arabic number classification task. This suggests that finger representations become automatically activated during number processing. The present study reports further interactions between the execution of finger counting postures and the processing of numbers; it provides evidence for an activation of number representations through finger postures.
In Experiment 1, 25 right-handed adult participants were instructed to compare two successively presented digits while performing finger postures. Each trial comprised a reference number ranging from 2 to 4, followed by a target number that was either smaller or larger by 1 and thus ranged from 1 to 5. Responses were given verbally (i.e. saying "ta" for bigger and "to" for smaller). The postures were executed behind a visual occluder with the dominant hand, with 2 to 4 fingers stretched out in either a canonical (finger counting, starting from the thumb) or a non-canonical way. Crucially, the number of extended fingers sometimes corresponded with the presented target number (congruent trials). The current posture was instructed by the experimenter before each block of 15 trials. Each trial started with a button press before the finger posture was readopted to refresh participants' proprioceptive experience. Results showed a significant comparison time advantage for congruent trials, but only when canonical finger postures were adopted (RT advantage of 13 ms, SD = 27 ms, for congruent compared to incongruent trials; t(24) = 2.39, p < .03). These data suggest that, although most participants reported not being aware that they were occasionally adopting finger counting postures, these finger movements pre-activated the representation of specific numbers, which led to facilitated number processing.
In Experiment 1, almost all participants were right-starters in finger counting. It is possible that congruency effects only emerge for the hand that is usually used to represent these specific numbers. It also remains unclear whether the coding of numbers larger than 5 benefits from adopting finger postures. We therefore conducted a second experiment in which both hands were used and numbers between 2 and 9 served as stimuli.
In Experiment 2, 26 right-handed participants verbally classified numbers (2, 3, 4, 7, 8, 9) as odd or even, while again executing canonical or non-canonical finger postures with one hand. In contrast to Experiment 1, participants were required to perform two blocks, in which they adopted finger postures with the left and with the right hand. Responses were again given verbally, by saying "odd" or "even" (German: "ungerade" and "gerade", respectively). We subtracted from each vocal RT the subject's individual mean RT per response. In this design, at least four different congruencies can be distinguished. Again, the number of extended fingers could coincide with the classified number (exact congruency); the numerical size of the stimulus could correspond to the respective hand in finger counting (counting hand congruency); both the number of fingers and the digit could be either odd or even (parity congruency); and both the finger posture and the digit could be relatively small or large (with a range of 2–4 for finger postures and a range of 2–9 for presented digits; relative size congruency). While no significant exact or counting hand congruency effects were found, and only a trend for an RT advantage for parity-congruent trials (4 ms, SD = 12 ms; t(25) = 1.89, p = .07), there was a significant relative size congruency effect for canonical (but not for non-canonical) postures 2 and 4 (12 ms, SD = 23 ms, for congruent compared to incongruent trials; t(25) = 2.56, p < .02): executing a relatively small counting posture led to faster parity decisions for small than for large digits, and vice versa for a relatively big counting posture, while a medium counting posture had no such effect.
Together, these results clarify our understanding of embodied number processing. First, the presence of the exact congruency effect was limited to a situation in which the numbers did not exceed the


counting range of one hand, suggesting that finger counting postures only activate the corresponding mental number representations when embedded in an appropriate task. Second, the absence of a counting hand congruency effect shows that using the non-starting hand does not necessarily activate the respective mental representation for larger numbers. Third, the finding that finger postures and numbers interact based on their respective relative sizes demonstrates a more flexible size activation through finger postures than previously assumed. This is in line with the idea of a generalized magnitude system, which is assumed to encode 'information about the magnitudes in the external world that are used in action' (Walsh 2003, p 486). Specifically, showing almost all fingers of one hand is associated with large magnitudes and showing very few fingers with small magnitudes. The present study shows that only under certain task demands do subjects activate a one-to-one correspondence between fingers and numbers. In other situations, magnitudes might not have to be exactly the same, but only proportional, to become associated.
Acknowledgments
This research is supported by DFG grant FI 1915/2-1 'Manumerical cognition'.
References
Di Luca S, Pesenti M (2007) Masked priming effect with canonical finger numeral configurations. Exp Brain Res 185(1):27–39. doi:10.1007/s00221-007-1132-8
Domahs F, Moeller K, Huber S, Willmes K, Nuerk HC (2010) Embodied numerosity: implicit hand-based representations influence symbolic number processing across cultures. Cognition 116(2):251–266. doi:10.1016/j.cognition.2010.05.007
Glenberg AM, Witt JK, Metcalfe J (2013) From the revolution to embodiment: 25 years of cognitive psychology. Perspect Psychol Sci 8(5):573–585. doi:10.1177/1745691613498098
Fischer MH (2008) Finger counting habits modulate spatial-numerical associations. Cortex 44(4):386–392. doi:10.1016/j.cortex.2007.08.004
Walsh V (2003) A theory of magnitude: common cortical metrics of time, space and quantity. Trends Cogn Sci 7(11):483–488. doi:10.1016/j.tics.2003.09.002

Object names correspond to convex entities

Rahel Sutterlutti, Simon Christoph Stein, Minija Tamosiunaite, Florentin Wörgötter
Faculty of Physics: Biophysics and Bernstein Center for Computational Neuroscience, Göttingen, Germany

Commonly one assumes that object identification (and recognition) requires complex cognitive processes, innate as well as acquired (Carey 2011); however, it remains unclear how objects can be individuated, segregated into parts, and identified (named) given the high degree of variability of the sensory features which arise even from similar objects (Geisler 2008). Gestalt laws, relying on shape parameters and their relations (for example edge relations, compactness, or others), seem to play a role in this process (Spelke et al. 1993). Specifically, there exist several results from psychophysics (Hoffman and Richards 1984, Biederman 1987, Bertamini and Wagemans 2013) and machine vision (Siddiqi and Kimia 1995, Richtsfeld et al. 2012) which demonstrate that convex–concave surface transitions can be used for object partitioning.
Here we try to discern to what degree such a partitioning corresponds to our language-expressible object understanding. To this end, a total of 10 real scenes, consisting of 3D point cloud data and the corresponding RGB image, were analyzed. Scenes were recorded by RGB-D sensors (Kinect), which provide 3D point cloud data and matched 2D RGB images. Scenes were taken from openly available machine vision databases (Richtsfeld et al. 2012, Silberman et al. 2012). We segmented the scenes into 3D entities using convex–concave transitions in the point cloud by a model-free machine vision algorithm, the details of which are described elsewhere (LCCP algorithm, Stein et al. 2014). This is a purely data-driven segmentation algorithm, which does not use any additional features for segmentation and works reliably for indoor RGB-D scenes with a depth range of approx. 0.5 to 5 meters, using only 2 parameters to set the resolution. Note that, due to the limited spatial resolution of the RGB-D sensors, small objects cannot be consistently labeled. Thus, segments smaller than 3 % of the image size were manually blackened out by us, as they most often represent sensor noise. We received a total of 247 segments (i.e. about 20–30 per image). Segments are labeled on the 2D RGB image with different colors to make them distinguishable for the observer. To control for errors introduced by image acquisition and/or by the computer vision algorithm, we use the known distance error function of the Kinect sensor (Smisek et al. 2011) to calculate a reliability score for every segment.
We asked 20 subjects to compare the obtained 247 color-labeled segments with the corresponding original RGB image, asking: 'How precisely can you name it?', and recorded their utterances, obtaining 4,940 data points. Subsequently we analyzed the utterances and divided them into three groups: 1) precise naming of a segment (e.g. 'table leg'), where it does not play a role whether or not subjects use unique names (e.g. 'table leg', 'leg', and 'table support' are equally valid); 2) definite failure/impossibility to name a segment; and 3) unclear cases, where subjects stated that they are not sure about the identification.
One example scene is shown in Fig. 1a. Using color-based segmentation (BenSalah et al. 2011) the resulting image segments rarely correspond to objects in the scene (Fig. 1b), and this is also extremely dependent on illumination. Unwanted merging or splitting of objects will, regardless of the chosen segmentation parameters, generically happen (e.g. throat + face, fridge fragments, etc.; Fig. 1b).
Fig. 1 Humans can identify with high reliability image segments that result from splitting images along concave–convex surface transitions. a One example scene used for analysis. b Color-based segmentation of the scene. c Point cloud image of parts of the scene. d 3D-segmented scene and segment names used by our subjects to identify objects. Missing percentages are the non-named cases. Red lettering indicates segments with reliability less than 50. e Fraction of identified (red), not-identified (green) and unclear (blue) segments for the complete data set plotted against their reliability. Fat dots represent averages across reliability intervals [0,10]; [10,20]; ...; [150,160]. The ability to identify a segment increases with reliability. Grand averages (red 0.64, green 0.30, blue 0.06) for all data are shown, too
Instead of using 2D color information, here point clouds were 3D-segmented along concave/convex transitions. We observed (Fig. 1b) that subjects many times used different names (e.g. 'face' or 'head') to identify a segment, which are equally valid as both describe a valid conceptual entity (an object). There are, however, several cases where segments could not be identified. We find that on average 64 % of the segments could be identified, 30 % could not, and 6 % were unclear cases. Are these 30 % non-identified segments possibly (partially) due to machine vision errors? To assess this, we additionally considered the reliability of the individual segments. Due to the discretization error of the Kinect (stripy patterns in Fig. 1c), data at larger distances become quadratically more unreliable (Smisek et al. 2011), leading to merging of segments. When considering this error source, we find that subjects could more often identify reliable segments (Fig. 1e, red) and unrecognized cases dropped accordingly (green). The red lettering in Fig. 1d marks less reliable segments and, indeed, identification is lower or more ambivalent for those segments as compared to the more reliable ones.
The segmentation performed here generically renders identifiable object parts (e.g. 'head', 'arm', 'handle of fridge', etc.). Clearly, no purely data-driven method exists which would allow detecting complex, compound objects (e.g. 'woman'), as this requires additional conceptual knowledge. Furthermore, we note that we are not concerned here with higher cognitive aspects relating to context analysis, hierarchization, categorization, and other complex processes. Our main observation is that the purely geometrical (low-level) breaking up of a 3D scene most often leads to entities for which we have an internal object or object-part concept, which may reflect the low-level perceptual grounding of the 'bounded region' hypothesis formulated by Langacker (1990) as a possible foundation for grammatical entity construal.

reflect the low-level perceptual grounding of the bounded region Carey S (2011) Precis of The origin of concepts (and commentar-
hypothesis formulated by Langacker (1990) as a possible foundation ies), behav Brain Sci 34(3):113167
for grammatical entity construal. Geisler W (2008) Visual perception and the statistical properties of
It is known that color, texture and other such statistical image natural scenes. Ann Rev Psy 59:167192
features vary widely (Geisler 2008). Thus, object individuation cannot Hoffman D, Richards W (1984) Parts of recognition. Cognition
rely on them. By contrast, here we find that convex-concave transi- 18(13):6596
tions between 3D-surfaces might represent the required prior to which Langacker RW (1990) Concept, image, and symbol: the cognitive
a contiguous object concept can be unequivocally bound. These basis of grammar. Mouton de Gruyter, Berlin
transitions render object boundaries and, consequentially leads to the Richtsfeld A, Morwald T, Prankl J, Zillich M, Vincze M (2012)
situation that we can name them. Segmentation of unknown objects in indoor environments. In:
In addition, we note that this bottom-up segmentation can easily Proceedings of IEEE Conference on EEE/RSJ intelligent robots
be combined with other image features (edge, color, etc.) and alsoif and systems (IROS), pp 47914796
desiredwith object models where one now can go beyond object Siddiqi K, Kimia BB (1995) Parts of visual form: computational
individuation towards true object recognition. aspects. IEEE Trans Pattern Anal Mach Intel 17:239251
Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmenta-
tion and support inference from RGB-D images. In: Proceedings
References of European conference on computer vision (ECCV), pp 746760
Ben Salah, M, Mitiche, A, Ayed, IB (2011) Multiregion image seg- Smisek J, Jancosek M, Pajdla T (2011) 3D with Kinect. In: Pro-
mentation by parametric kernel graph cuts. IEEE Trans Image ceedings of international conference comp vision (ICCV),
Proc. 20(2):545557 pp 11541160
Bertamini, M, Wagemans, J (2013) Processing convexity and con- Spelke ES, Breinlinger K, Jacobson K, Phillips A (1993) Gestalt
cavity along a 2-D contour: figure-ground, structural shape, and relations and object perception: a developmental study. Percep-
attention. Psychon Bull Rev 20(2):191207 tion 22(12):14831501
Biederman I (1987) Recognition-by-components: a theory of human Stein S, Papon J, Schoeler M, Worgotter F (2014) Object partitioning
image understanding. Psychol Rev 94:115147 using local convexity. In: Proceedings of IEEE conference on

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S71

computer vision and pattern recognition (CVPR), 2014. http://www. of overall performance to estimated DHF performance without human
cv-foundation.org/openaccess/content_cvpr_2014/papers/Stein_ input. Figure 1b shows that the subjects contribution in both groups
Object_Partitioning_using_2014_CVPR_paper.pdf de- creased with increasing DHF up to the 50 % condition. The
contribution of experienced subjects plateaued between the 50 and
100 % DHF levels. Thus, the increase in performance for the 100 %
condition can mainly be attributed to the higher DHF forces alone. In
The role of direct haptic feedback in a compensatory contrast, the inexperienced subjects seemed to completely rely on the
tracking task DHF during the 50 % condition, since the operators contribution
approximated 1. However, this changed for the 100 % DHF level.
Evangelia-Regkina Symeonidou, Mario Olivari, Heinrich H. Bulthoff, Here, the participants started to actively contribute to the task
Lewis L. Chuang (operators contribution [1). This change in behavior resulted in
Max Planck Institute for Biological Cybernetics, Tubingen, Germany performance values similar to those of the experienced group Our
findings suggest that the increase of haptic support with our DHF
Haptic feedback systems can be designed to assist vehicular steering system does not necessarily result in over-reliance and can improve
by sharing manual control with the human operator. For example, performance for both experienced and inexperienced subjects.
direct haptic feedback (DHF) forces, that are applied over the control
device, can guide the operator towards an optimized trajectory, which
References
he can either augment, comply with or resist according to his pref-
Forsyth BAC, MacLean KE (2006) Predictive haptic guidance:
erences. DHF has been shown to improve performance (Olivari et al.
intelligent user assistance for the control of dynamic tasks. IEEE
submitted) and increase safety (Tsoi et al. 2010). Nonetheless, the
Trans Visual Comput Graph 12(1):10313
human operator may not always benefit from the haptic support
Olivari M, Nieuwenhuizen FM, Bulthoff HH, Pollini L (2014) An
system. Depending on the amount of the haptic feedback, the operator
experimental comparison of haptic and automated pilot support
might demonstrate an over- reliance or an opposition to this haptic
systems. In: AIAA modeling and simulation technologies con-
assistance (Forsyth and MacLean 2006). Thus, it is worthwhile to
ference, pp 111
investigate how different levels of haptic assistance influence shared
Olivari M, Nieuwenhuizen F, Bulthoff H, Pollini L (submitted) Pilot
control performance.
adaptation to different classes of haptic aids in tracking tasks.
The current study investigates how different gain levels of DHF
J Guidance Control Dyn
influence performance in a compensatory tracking task. For this
Tsoi KK, Mulder M, Abbink DA (2010) Balancing safety and sup-
purpose, 6 participants were evenly divided into two groups according
port: changing lanes with a haptic lane-keeping support system.
to their previous tracking experience. During the task, they had to
In: 2010 IEEE international conference on systems, man and
compensate for externally induced disturbances that were visualized
cybernetics, pp 12361243
as the difference between a moving line and a horizontal reference
standard. Briefly, participants observed how an unstable air- craft
symbol, located in the middle of the screen, deviated in the roll axis
from a stable artificial horizon. In order to compensate for the roll Comprehending negated action(s): embodiment
angle, participants were instructed to use the control joystick. perspective
Meanwhile, different DHF forces were presented over the control
joystick for gain levels of 0, 12.5, 25, 50 and 100 %. The maximal Nemanja Vaci1, Jelena Radanovic2, Fernando Marmolejo-Ramos3,
DHF level was chosen according to the procedure described in Petar Milin2,4
(Olivari et al. 2014) and represents the best stable performance of 1
Alpen-Adria University Klagenfurt, Austria; 2 University of Novi
skilled human operators. The participants performance was defined Sad, Serbia; 3 University of Adelaide, Australia; 4 Eberhard Karls
as the reciprocal of the median of the root mean square error (RMSE) Universitat Tubingen, Germany
in each condition.
Figure 1a shows that performance improved with in- creasing Keywords
DHF gain, regardless of experience levels. To evaluate the operators Embodied cognition, Negation, Mental simulation, Sentence
contribution, relative to the DHF contribution, we calculated the ratio comprehension
According to the embodied cognition framework, comprehension of
language involves activation of the same sensorimotor areas of the brain
that are activated when entities and events described by language
structures (e.g., words, sentences) are actually experienced (Barsalou
1999). Previous work on the comprehension of sentences showed
support for this proposal. For example, Glenberg and Kaschak (2002)
observed that judgment about sensibility of a sentence was facilitated
when there was congruence between the direction of an action implied
by the sentence and the direction of a movement required for making a
response, while incongruence led to slower responses. It was also shown
that linguistic markers (e.g., negation) could modulate mental simula-
tion of concepts (Kaup 2001). This finding was explained by the two-
step negation processing: (1) a reader simulates a sentence as if there is
no negation; (2) she negates the simulated content to reach full meaning.
However, when a negated action was announced in preceding text,
Fig. 1 a Performance of the experienced and in experienced negated clause was processed as fast as the affirmative one (Ludtke and
participants as well as the baseline of direct haptic feedback (DHF) Kaup 2006). The mentioned results suggest the mechanism of negation
assistance without human input for increasing haptic gain. b The ratio processing can be altered contextually.
of overall system performance to DHF performance without human In this study, we aimed at further investigating the effects of
input for increasing haptic gain linguistic markers, following the assumptions of embodied

123
S72 Cogn Process (2014) 15 (Suppl 1):S1S158

framework. To obtain manipulation of a sentence context that would might be processed in one step, as opposed to the two stages pro-
target mental simulations we made use of materials from De Vega cessing in the case of non-competing actions.
et al. (2004). These researchers created sentences by manipulating The present results support the claim about mental simulation as
whether or not two actions described in a sentence were competing influenced by linguistic markers. We showed, however, that such an
for the cognitive resources. They showed that sentences with two influence depends on more general contextual factors. Present results
actions being performed at the same time were easier to process suggest that negation might have regulatory purpose in sentence
when they aimed at different sensorimotor systems (whistling and comprehension. The negated content is comprehended in a two-step
painting a fence), than when described actions involved the same simulation only if actions do not compete for the cognitive resources.
sensorimotor system (chopping a wood and painting a fence). We Contrariwise, when actions within a sentence are in sensorimotor
hypothesized that given two competing actions negation could competition, negation can suppress the second action to facilitate the
provide suppression for one of them and, thus, change the global comprehension.
simulation time course.
Experiment 1 was a modified replication of De Vega et al. References
(2004) study in Serbian. We constructed sentences by manipulating Barsalou LW (1999) Perceptual symbol systems. Behav Brain Sci
whether or not two actions described in a sentence were performed 22:577660
using the same or different sensorimotor systems. We also manip- De Vega M et al. (2004) On doing two things at once: temporal
ulated temporal ratio of the two actions (simultaneous vs. constraints on actions in language comprehension. Mem Cogn
successive). Finally, actions within a sentence could be physically 33:10331043
executed or mentally planned (reading a book vs. imagining reading Glenberg AM, Kaschak MP (2002) Grounding language in action.
a book). This way we included both descriptions of real actions Psychon Bull Rev 9:558565
as well as the descriptions of mental states. Introduction of this Kaup B (2001) Negation and its impact on the accessibility of text
factor aimed at testing whether linguistic marker for mentally information. Mem Cogn 29:960967
planned actions would induce second order simulation, similar to Ludtke J, Kaup B (2006) Context effects when reading negative and
the two-step processing, or suppress the mental simulation, which affirmative sentences. In: Sun R (ed) Proceedings of the 28th
than would match the one-step processing. The participants task in annual conference of the cognitive science society. Lawrence
this experiment was to read the sentences and to press a button Erlbaum Associates, Mahwah, pp 17351740
when they finished. To ensure comprehension, in 25 % randomly
chosen trials participants were instructed to repeat the meaning of a
sentence to the experimenter.
In the following two experiments, we focused on the mechanism Effects of action signaling on interpersonal
of negation, using similar sentences as in Experiment 1. Here, we coordination
manipulated the form of the two parts (affirmative vs. negative). The
task used Experiment 2 and 3 was a modified self-paced reading task, Cordula Vesper1, Lou Safra2, Laura Schmitz1, Natalie Sebanz1,
allowing a by-clause reading rather than a by-word or by-sentence Gunther Knoblich1
reading. This way we obtained response times for each of the two 1
CEU, Budapest, Hungary; 2 Ecole Normale Superieure, Paris,
parts (clauses). We were also interested in measuring the time (and France
accuracy) required for judging sensibility of the whole sentence.
Therefore, we included the equal number of nonsensible filler How do people coordinate actions such as lifting heavy objects
sentences. together, clapping in synchrony or passing a basketball from one
The linear mixed-effect modeling was applied to the response person to another? In many joint action tasks (Knoblich et al. 2011),
times and logistic mixed-effect modeling to the accuracy rates. We talking is not needed or simply too slow to provide useful cues for
controlled for trial order, clause and sentence length and sensibility of coordination. Instead, two people who coordinate their actions
a sentence (the sensibility ratings were obtained in separate study towards a joint goal often adapt the way they perform their own
using different participants). Results from Experiment 1 confirmed actions to facilitate performance for a task partner.
the findings of De Vega et al. (2004): the sentences with two actions One way of supporting a co-actor is by providing relevant infor-
from the same sensorimotor system were comprehended slower mation about ones own action performance. This can be achieved
(t(489.90) = 4.21, p \ .001). In addition, we observed a stronger non-verbally by exaggerating specific movement aspects so that
inhibitory effect from a length in case of sentences with simulta- another person can more easily understand and predict the action.
neously executed actions, which indicates additional comprehension This communicative modulation of ones own actions is often refer-
load for this type of sentences (t(499.40) = - 2.00, p \ .05). Finally, red to as signaling (Pezzulo et al. 2013) and includes common action
processing time was longer when sentences described mentally exaggerations such as making a distinct, disambiguating step towards
planned actions as opposed to real ones (t(489.70) = 3.21, the right to avoid a collision with another person on the street.
p \ .01). The present study investigated signaling in a joint action task in
The analyzes of Experiment 2 and 3 showed consistent results which sixteen pairs of participants moved cursors on a computer screen
between the clause (local) and sentence (global) response times. The towards a common target with the goal of reaching the target syn-
interaction between sensorimotor systems (same vs. different) and a chronously. Short feedback tones at target arrival indicated the
form of a clause (affirmative vs. negative) was significant coordination accuracy of their actions. To investigate whether actors
(t(303.70) = 2.95, p \ .01): different sensorimotor actions/systems modulate their signaling depending on what is perceptually available to
combined with negation lead to slower processing time and lower their partners, we compared several movement parameters between two
accuracy; when the sensorimotor system was the same, affirmative conditions: In the visible condition, co-actors could see each others
and negated markers did not induce significant differences. However, movements towards the target (i.e. both computer screens were visible
in cases when actions addressed the same sensorimotor system, the to both co-actors); in the hidden condition an occluder between the co-
accuracy of the sensibility judgments was higher if the second action actors prevented them from receiving visual feedback about each other.
was negated (z(303.70) = 4.36, p \ .001). Taken together, this pat- Analyzes of participants movements showed that signaling in the form
tern of results suggest that in case of competing actions negation of exaggerating the trajectory towards the target (by increasing the

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S73

curvature of the movement) was specifically used in the visible con- tactile signals in expected sensory areas such as the primary
dition, whereas a temporal strategy of reducing the variability of target sensory cortex, supramarginal gyri, and Rolandic opercula. In
arrival times (Vesper et al. 2011) was used in the hidden condition. second-level analyses significant 2-way interactions between the
Furthermore, pairs who signaled more were overall better coordinated. belt on/off and pre/post training condition indicates an involvement
Together these findings suggest that signaling is specifically of Rolandic opercula, Insula, MST and PPC. Inspection of the
employed in cases where a task partner is able to use the information activation intensities shows a significant difference belt on [ off
(i.e. can actually see the action modulation) and that this can be only in the first measurement before the training period, but not
beneficial for successful joint action performance. Thus, co-actors after the training period.
take into account what their partners can perceive in their attempts to In summary, in fMRI we observe differential activations in areas
coordinate their actions with them. Moreover, our study demonstrates expected for path integration tasks and tactile stimulation. Additionally,
how, depending on the type and amount of perceptual information we also found activation differences for the belt signals well beyond the
available between co-actors, different mechanisms support interper- somatosensory system, indicating that processing is not limited to sen-
sonal coordination. sory areas but includes also higher level and motor regions as predicted by
the theory of sensorimotor contingencies. It is demonstrated that the
References belts signal is processed differently after the training period. Our fMRI
Knoblich G, Butterfill S, Sebanz N (2011) Psychological research on joint results are also in line with subjective reports indicating a qualitative
action: theory and data. In: Ross B (ed) The psychology of learning change in the perception of the belt signals.
and motivation 54. Academic Press, Burlington, pp 59101
Pezzulo G, Donnarumma F, Dindo H (2013) Human sensorimotor
communication: a theory of signaling in online social interac-
tions. PLoS ONE 8: e79876
Do you believe in Mozart? The influence of beliefs
Vesper C, van der Wel RPRD, Knoblich G, Sebanz N (2011) Making about composition on representing joint action
oneself predictable: Reduced temporal variability facilitates joint outcomes in music
action coordination. Exp Brain Res 211: 517530
Thomas Wolf, Cordula Vesper, Natalie Sebanz, Gunther Knoblich
CEU, Budapest, Hungary
Physiological changes through sensory augmentation Actors in joint action situations represent the outcomes of their joint
actions and use these to guide their actions (Vesper, Butterfill, Knoblich,
in path integration: an fMRI study
Sebanz 2010). However, it is not clear how conceptual and perceptual
information affect the representations of joint action outcomes. In the
Susan Wache1,*, Johannes Keyser1, Sabine U Konig1, present experiment, we investigated whether beliefs about the intended
Frank Schumann1, Thomas Wolbers2,3, Christian Buchel2, nature of joint action outcomes are sufficient to elicit changes in their
Peter Konig1,4 representation. As recent studies provide evidence that participants rep-
1
Institute of Cognitive Science, University Osnabruck; 2 Institute of resent joint action outcomes in musical paradigms (Loehr, Kourtis,
Systems Neuroscience, University Medical Center Hamburg Vesper, Sebanz, Knoblich 2013), we used a piano paradigm to investigate
Eppendorf; 3 German Center for Neurodegenerative Diseases, the hypothesis that beliefs about the composers intentions can influence
Magdeburg; 4 Department of Neurophysiology and Pathophysiology, representations of jointly produced tones.
University Medical Center Hamburg Eppendorf In our paradigm, we used a within-subjects 2 9 2 design with the
The theory of sensorimotor contingencies (SMCs) describes qualita- factors Belief (together, separate) and Key (same, different). Two
tive experience as based on the dependency between sensory input adult piano novices played 24 melody-sets with the help of templates.
and its preceding motor actions. To investigate sensory processing In the Belief condition together, the participants were told that the
and learning of new SMCs we used sensory augmentation in a virtual melodies they were going to play were intended to be played together
path integration task. Specifically, we built a belt that maps direc- as duets. In the condition separate, participants were told that their
tional information of a compass to a set of vibrating elements such as melodies were not intended to be played together. With the Key
that the element pointing north is always activated. The belt changes manipulation, we manipulated the cognitive costs of joint action
its tactile signals only by motor actions of the belt-wearing partici- outcome representations as follows. All 24 melody-sets were gener-
pants, i.e. when turning around. ated by a python script, and followed the same simple chord
Nine subjects wore the belt during all waking hours for seven progression (I-IV-V7-I). They differed only along the Key manipu-
weeks, 5 control subjects actively trained their navigation, but without lation: In 12 melody-sets, the aforementioned chord progression was
a belt (age 1932y, seven female). Before and after the training period implemented in the same musical key. When the two melodies are
we presented in the fMRI scanner a virtual path integration (PI) task realized following the same chord progression in the same key, the
and a corresponding control task with identical visual stimuli. In half cognitive cost of representing the joint action outcome should be
of the trials of both tasks the belt was switched on, coherently lower than in the other 12 melody-sets, where the same chord pro-
vibrating with the virtual movements of the subjects. gression was implemented in different keys. Representing the joint
We used ROI analysis to concentrate on regions relevant for action outcome of two melodies in different keys demands more
spatial navigation and for sensory processing. We used a mixed- resources, even though representing only ones own action outcome is
effects ANOVA to decompose the four factors belt on/off, belt/ equally costly in both key conditions. During the experiment, accu-
control subjects, PI/control task, and before/after training. The racy, tempo and synchrony were measured.
main effect PI [ control task shows large-scale differences in areas Following our hypothesis that beliefs about the composition
that have been found to be active in similar navigational tasks affects the representation of the joint action outcome, we predicted
such as medial superior temporal cortices (MST), posterior parietal that the differences between the same Key and the different Key
cortex (PPC), ventral intraparietal areas, and caudate nucleus. melody-sets would be significantly higher when participants believed
Additionally we found sensorimotor regions such as supplementary the melodies were meant to be played together, attesting that the
motor areas (SMA), insula, primary sensory cortex, and precentral participants beliefs had led to an increase of joint action represen-
gyrus. The main effect belt on [ off reveals processing of the tations. In other words, we predicted that an ANOVA with the

123
S74 Cogn Process (2014) 15 (Suppl 1):S1S158

independent variables Belief and Key would show a significant


interaction.

References
Vesper C, Butterfill S, Knoblich G, Sebanz N (2010) A minimal
architecture for joint action. Neural Netw 23:9981003
Loehr JD, Kourtis D, Vesper C, Sebanz N, Knoblich G (2013)
Monitoring individual and joint action outcomes in duet music
performance. J Cogn Neurosci 25(7):10491061

Processing sentences describing auditory events: Fig. 1 Mean response times for left/right-responses to sentences
only pianists show evidence for an automatic space implying high/low pitch for pianists (left panel) and non-musicians
pitch association (right panel). The error bars represent the 95% confidence interval
and are conducted according to Masson and Loftus (2003)
Sibylla Wolter, Carolin Dudschig, Irmgard de La Vega,
Barbara Kaup response position (sensible is right vs. left) was varied between
Universitat Tubingen, Germany blocks, starting position was balanced between participants. Each
Embodied language understanding models suggest that language com- sentence was presented only once to each participant. A by-partici-
prehension is grounded in experience. It is assumed that during reading of pant (F1) and a by-item (F2) ANOVA was conducted, one treating
words and sentences these experiences become reactivated and can be participants and one items as random factor. The results are displayed
used as mental simulation (Barsalou 1999; Zwaan, Madden 2005). in Fig. 1. The pianists (Exp 1) showed a significant interaction
Despite a growing body of evidence supporting the importance of sen- between implied pitch and response hand (F1(1,19) = 4.8, p \ .05,
sory-motor representations during language understanding (e.g., F2(1,56) = 6.77, p \ .05) with faster responses to sentences implying
Glenberg, Kaschak 2002) rather little is known regarding the represen- high pitch with a right compared to a left keypress response and faster
tation of sound during language processing. In the current study, we aim responses to sentences implying low pitch with a left compared to a
to close this gap by investigating whether processing sentences right keypress response. Sentence type (explicit vs. implicit) did not
describing auditory events results in similar action-compatibility effects modify this interaction (Fs \ 1). For the non-musicians, no interac-
as have been reported for physical tone perception. tion between implied pitch and response hand was found (Fs \ 1).
With regard to physical tone perception it is known that real tones Additionally, the data showed significant main effects of implied
of different pitch heights trigger specific spatial associations on a pitch and sentence type in the by- participants analysis for both
vertical as well as horizontal axis. The vertical association is typically participant groups (pianists: F1(1,19) = 21.42, p \ .001, F2(1,56) =
activated automatically for all participant groups (Lidji, Kolinsky, 1.4, p = .24; F1(1,19) = 29.87, p \ .001, F2(1,56) = 2.56, p = .12;
Lochy, Morais 2007; Rusconi, Kwan, Giordano, Umilta, Butterworth non-musicians: F1(1,23) = 20.01, p \ .001, F2(1,56) = 1.21,
2006). In contrast, the horizontal axis seems to be mediated by p = .28; F1(1,23) = 27.14, p \ .001, F2(1,56) = 1.17, p = .28).
musical expertise. Specifically, only pianists with a considerable Sentences implying high pitch yielded faster responses compared to
amount of experience with the piano keyboard and other musicians sentences implying low pitch and implicit sentences were responded
show an automatic association between low tones and the left side and to faster than explicit sentences.
high tones and the right side (Lidji et al. 2007; Trimarchi, Luzatti The results show that specific musical experiences can influence a
2011). This suggests that the experiences pianists make when playing linguistically implied space-pitch association. This is in line with the
the piano lead to a space-pitch association automatically elicited when mental simulation view of language comprehension suggesting that
processing high or low auditory sounds. language understanding involves multimodal knowledge representations
The aim of the present study was to investigate whether experi- that are based on experiences acquired during interactions with the world.
ence-specific space-pitch associations in the horizontal dimension can
also be observed during the processing of sentences referring to high References
or low auditory sounds. For pianists, we expected to find faster Barsalou LW (1999) Perceptual symbol systems. Behav Brain Sci
responses on the right compared to the left for sentences implying 22:577660
high pitch and faster responses on the left compared to the right for Glenberg AM, Kaschak MP (2002) Grounding language in action.
sentences implying low pitch. For non-musicians no such interaction Psychon Bull Rev 9(3):558565
was expected. Finding the respective differences between pianists and Lidji P, Kolinsky R, Lochy A, Morais J (2007) Spatial associations
non-musicians would strongly support the idea that during language for musical stimuli: a piano in the head? J Exp Psychol
processing specific experiential associations are being reactivated. 33(5):11891207
20 skilled pianists with an average training period of 14.85 years Masson MEJ, Loftus GR (2003) Using confidence intervals for
(Experiment 1) and 24 non-musicians with none or less than 2 years graphically based data interpretation. Can J Exp Psychol
of musical training that took place at least 10 years ago (Experiment 57(3):203220
2) were presented with sentences expressing high/low auditory Rusconi E, Kwan B, Giordano BL, Umilta C, Butterworth B (2006)
events, such as the bear growls deeply vs. the soprano singer sings a Spatial representation of pitch height: the SMARC effect. Cog-
high Aria. Half of the sentences contained the words high or low nition 99:113129
(explicit condition), the other half only implicitly expressed pitch Trimarchi PD, Luzatti C (2011) Implicit chord processing and motor
height (implicit condition). Nonsensical sentences served as filler representation in pianists. Psychol Res 75:122128
items. Participants judged whether the sentence was semantically Zwaan RA, Madden CJ (eds) (2005) Embodied sentence compre-
correct or incorrect by pressing either a left or right response key. The hension. CUP, Cambridge

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S75

A free energy approach to template matching in visual control mechanism, the knowledge network (KN) identifies the con-
attention: a connectionist model tent of the FOA in comparison with the template it entails. Moreover,
the location map complements the matching task by imposing another
top-down control that supervises the selection of the input image.
Keyvan Yahya1, Pouyan R. Fard2, Karl J. Friston3
1 In the FR-SAIM, every network is associated with a free energy
University of Birmingham, Edgbaston, Birmingham,UK;
2 function in a hierarchical fashion. Each lower-level network makes a
Graduate school of Neural Information Processing, University of
prediction and sends it up to the level above and in turn, each higher-
Tubingen, Germany; 3 The Wellcome Trust Centre for Neuroimaging, level network calculates the top-down prediction error signal and
Institute of Neurology, University College London, London, UK
returns to the level below.
Abstract The Generative Model: To model sensory information in a hierar-
In this work, we propose a free energy model for visual template chical structure, we define a nonlinear function, say f, to represent our
matching (FR-SAIM) based on the selective visual attention and state-space in terms of the sensory states (input data), in the way the
identification model (SAIM). following equation suggests:
Keywords  
Selective Visual Attention, Template Matching, Free Energy si f xi w : w  N0; Rm x; m; c 1
Principle
where the causal states m are mediated by hidden states x and thereby
Introduction the hierarchical states link together and bring about a memory for the
Visual search is a perceptual task that has been extensively studied in model and establish the local dynamics xi :xm ; xx are both random
the cognitive processing systems literature. It is widely known that fluctuations produced through observation.
this process rests on matching the input from visual field with a top- Concerning equation (1), the model dynamics can be written in a
down attentional set, namely, a search template. However, the way hierarchical fashion as follows:
this attentional set is formed and how it guides the visual search  
is still is not clear. The free energy principle is an emerging neuro- x0 f x1 2
cognitive framework, which tries to account for how interactions  
within a self-organizing system, like the brain, lead to represent, x1 f x2 Ui 3
perceive and interpret sensory data by minimizing a free energy  
that can be considered as prediction error (Friston 2009). By x2 f x1 bottomup 4
extending the SAIM (Heinke, Humphreys 2003), we demonstrate how  
connectionist models can shed a light on how free energy minimi- x2 f x3 topdown prediction 5
zation mediates template matching in a visual attention model.
The Overview of FR-SAIM model where U(i) is the action the networks takes to modify the selection
The architecture of the FR-SAIM model is illustrated in Fig. 1a. In process of sen-sory data and is denoted by Ui maxx2 ; x3 .
brief, visual input sampling is carried out by the content network (CN), The Energy Functions: The energy functions of the neural networks
while controlled by the selection network (SN), and mapped to the in the FR-SAIM are derived by combining the original SAIM network
focus of attention (FOA). When multiple objects appear in the retina, energy functions and the prediction errors computed using free energy
there is a property called inhibition of return to make the model select principle. The details of mathematical derivation of these functions
one and only one object to avoid them being overlapped in the FOA. are discussed in Yahya (2013). These energy functions can be written
At the same time, the content network rectifies the already selected as follows:
objects. Every neuron in the CN (sigma-pi nodes) holds a correspon-
!2
dence between the retina and the FOA. On the other hand, the SN   bCN X X CN X
SCN
determines which one of them is instantiated. By using a top-down E xCN SN
ij ; xkl xij  ySN
kl  xVF SN
kl yki;lj
2 ij kl kl
X 2
ykl
SN  1 6
A B kl
  a X 2
KN
EKN yKN CN
m ; xij yKN
l  1 bKN
2 l
C 0 !2 1
X X
 @yl  xCN KN
 wlij A
KN ij  yl 7
l ij
Top-down   aLM X  LM 2 X  SN 
Modulation
ELM yLM SN
kl ; xkl l ykl  1 bLM  yLM LM
kl  xkl  ykl
Focus of 2 l
Attention

8
D Finally, the gradient descent method, at time step t, will be imposed
Top-down on all of the network energy functions in order to have them
Modulation
minimized:
oExi
xi t 1 xi t  9
oxi
Simulation Results
Simulation results are shown in Fig. 1bd. Here, the model starts
Fig. 1 a Architecture of the FR-SAIM Model, b Visual field input to processing visual input and will put the result into the FOA. These
the model, c Activation patterns of the content network during results illustrate how the target template 2, won the competition
simulation, d Time course of activation of the content network over the distractor template +, by dominating the activation of the

123
S76 Cogn Process (2014) 15 (Suppl 1):S1S158

content network, as time passes. Furthermore, the time plot of the memory in the selective attention for identification model (SAIM).
content network shows how the obtained network energy functions Psychol Rev 110:2987
are minimized with regards to free energy principle. Yahya K (2013) A computational study of visual template identifi-
cation in the SAIM: a free energy approach, MPhil Thesis,
References University of Birmingham
Friston KJ (2009) The free-energy principle: a rough guide to the
brain? Trend Cogn Sci 13(7):293301
Heinke D, Humphreys GW (2003) Attention, spatial representation,
and visual neglect: simulating emergent attention and spatial

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S77

Oral Presentations certain number of rules must be executed whose sequential execution
takes more time than it takes humans to solve corresponding (sub-)
tasks in the experiments.
Analyzing psychological theories with F-ACT-R: In this work we propose a new method to investigate the validity
an example F-ACT-R application of a psycho- logical theory with ACT-R models. Based on a formal
semantics (Albrecht 2013; Albrecht and Westphal 2014) of ACT-R,
Rebecca Albrecht, Bernd Westphal we reduce the question whether global parameter settings exist such
Informatik, Universitat Freiburg, Germany that, e.g., a timely execution of a set of ACT-R rules is possible, to a
satisfiability problem, i.e. a formula in first order logic. In order to
Abstract analyze the resulting satisfiability problem we use a satisfiability
The problem to decide whether an ACT-R model predicts experi- modulo theories (SMT) (De Moura and Bjrner 2011) solver to
mental data is, today, solved by simulation. This, of course, needs a analyze it. If the SMT solver proves the given formula unsatisfiable,
complete ACT-R model and fixed global parameter settings. Such an we can conclude that there are no appropriate global parameter set-
ACT-R model may include implementation details, e.g. the use of tings, thus there is an issue with the given implementation of the
control variables as part of declarative knowledge, in order to yield psychological theory. If the SMT solver proves the given formula
expected results in simulation. Some of these implementation details satisfiable, we obtain valuable hints on global parameter settings and
are not part of a psychological theory but, nevertheless, may change can check them for plausibility. As our approach is not based on
the models behavior. On the other hand, the crucial parts of a psy- actual executions of an ACT-R model, it in particular applies to
chological theory modelled in ACT-R may only depend on very few partial ACT-R models, i.e., to small sets of rules essential for the
rules. Based on a formal semantics for the ACT-R architecture we psychological theory. This may save significant modelling effort.
present preliminary results on a method to formally analyze whether a Motivating Example
partial ACT-R model predicts experimental data, without the need for Experimental Setting. A typical task in the domain of relational
simulation. spatial reasoning with mental models is the following. Objects are
Keywords visually presented to participants either on the left or on the right of a
ACT-R, Formal Methods, Model Analysis, SMT, Model Checking computer screen (cf. Fig. 1). The position of objects on two subse-
Introduction quently shown screens implicitly encodes a relation between two
In cognitive modelling, computer models are used to describe human objects. For example, the two leftmost screens in Fig. 1 together
cognitive processes wrt. psychological assumptions. Unified theories encode the relation A is to the left of B.
of cognition and their implementations (called cognitive architec- The psychological experiment consists of n different tasks, where task
tures) provide means for cognitive modelling. A widely used unified i consists of showing six screens at times t0i ,,t5i . The two relations
theory of cognition and cognitive architecture is ACT- R (Anderson encoded by the first four screens are called premises, the relation
1983, 2007). ACT-R is a so-called hybrid architecture which consist encoded by the last two screens shown at t4i and t5i is called conclu-
of a symbolic and a subsymbolic layer. As part of the symbolic layer sion. After the sixth screen of a task has been shown, participants
declarative knowledge (chunks) and procedural knowledge (produc- should state whether the two premises and the conclusion are con-
tion rules) is defined. The interface between the symbolic and the tradictory. In the example shown in Fig. 1, they are not contradictory
subsymbolic layer in ACT-R is given by so- called modules. Modules because objects A, B, and C can be arranged in an order which
are requested by production rules to process declarative information satisfies both premises and the conclusion.
and make them accessible through associated buffers. The subsym- The Theory of Preferred Mental Models
bolic layer is defined by the behavior of modules, i.e. the responses of In the preferred mental model theory (Ragni, Knauff and Nebel
modules for given requests. For some modules, these responses 2005), it is assumed that participants construct a mental spatial array
depend on numerical parameters, e.g. the decay rate for the imple- of dynamic size which integrates information given by the premises.
mentation of base-level learning as part of the declarative module. Whether a conclusion contradicts the given premises is checked by
The process of cognitive modelling in ACT-R can be described as inspecting the spatial array. Furthermore, it is assumed that only one
defining a model which adequately predicts average human data preferred mental model is constructed immediately when the premises
collected in experiments. Today this process is typically performed as are presented. Only if the given conclusion does not hold in the
follows. There is a psychological theory, i.e., a hypothesis on how a preferred mental model an alternative mental model is constructed.
given task is principally solved by humans. In order to validate the For example, a possible model of the premises shown in Fig. 1 is to
psychological theory, an ACT-R model which implements the theory order the objects as A, C, B. This model does not imply the
is constructed and evaluated wrt. experimental data. Practically, fig- conclusion.
ures like average error rates or response times are derived from Modelling the Theory of Preferred Mental Models. When modelling
several executions of the ACT-R model and compared to average the theory of preferred mental models in ACT-R, a crucial aspect is
human data collected in experiments. If the figures obtained from the use of the declarative memory. In the ACT-R theory, the time and
executions of the ACT-R model deviate too far from experimental probability for retrieving a chunk from declarative memory depend on
data, there are two possible ways to adjust the models behavior. On the activation of chunks. Activation in turn depends on different
the one hand, numerical parameters can be adjusted, on the other
hand, a different implementation of the psychological theory can be
provided. If there is no implementation and parameter setting with
which the cognitive architecture yields adequate predictions, the
psychological theory needs to be rejected.
Today, the only available method for ACT-R model validation is
simulation, i.e. repeated model execution. Using this method for the
validation of psychological theories requires an ACT-R model which
is suitable for simulation. Creating such a sufficiently complete ACT- Fig. 1 Example relational reasoning task with id i. Premise 1 is A is
R model may take a significant effort even if issues of a theory may to the left of B, premise 2 is A is to the left of C, and the
depend on only few production rules of a model. For example, a conclusion is B is to the left of C. The time when the j-th stimulus
psychological theory may be invalid because according to the theory a is presented is denoted by tij

123
S78 Cogn Process (2014) 15 (Suppl 1):S1S158

assumptions on human memory processing, e.g. spreading activation, decay d and threshold s. Two cognitive states c = (s, t) and
where the content of the declarative memory is considered and base c0 = (s0 , t0 ) are in transition relation, denoted by c ? rc0 , if there is a
level learning, where the history is considered. In an ACT-R cognitive rule r = (p, a) such that precondition p is satisfied in s, s0 is obtained
architecture where only base level learning is considered, the acti- by applying a to s, and t0 - t is the time needed to execute action a.
vation is calculated based on two global parameters: the decay rate d Now the ACT-R model validity problem stated in Section 2
which determines how fast the activation of a chunk decays over time basically reduces to checking whether, given a start cognitive state
and the threshold s which defines a lower bound on activation values (c, t)) and a goal state (c0 , t0 ) there exist values for d and s such that
for successful chunk retrieval. there is a sequence of transitions
A fundamental assumption of the theory of preferred mental models is r1 r2 rn
that the preferred mental model for the two premises is constructed co ; t0 !c1 ; t1 ! . . . !cn ; tn 1
before the conclusion is presented. That is, the behavior of the
with c0 = c and cn = c0 .
environment imposes hard deadlines on the timing of the model: any
For an example, consider the phase of the preferred mental model
valid ACT-R model for the theory of preferred mental models must
theory shown in Fig. 2 as discussed in Section 2. More specifically,
complete the processing of all rules needed to construct the preferred
consider a rule r which requests the declarative module for a mental
mental model before the next stimulus is presented.
model representing premise 1 when the first screen of the second
Consider the top row of Fig. 2 for a more formal discussion.
premise is presented at time ti2.
During a task, stimuli are presented to the participants at fixed points
In the following, we consider for simplicity a model where r is the
in time. For example, let E1 denote the third screen (i.e. the onset of
only nondeterministic rule which is ever enabled between ti2 and ti4
the first element of premise 2) and E2 denote the fifth screen (i.e. the
and that the sequence of rules executed before and after rule r is
onset of the first element of the conclusion) shown at times ti2 and ti4,
deterministic. Then the time to execute the model only varies wrt. the
respectively, in the i-th task. This is the interval where the second
time for executing r. The model is definitely not valid if there are no
premise has to be processed. Then, according to the assumption stated
choices for decay d and threshold s such that there is a transition
above, processing of premise 2 has to be completed within tb := t2i 
c ? rc0 where c = (s, t) is a cognitive state associated with a realistic
t4i time units. An ACT-R model for this task in particular needs to
history and c0 = (s0 , t0 ) is a cognitive state where the mental model
model successful solutions of the task. That is, in an ACT-R model
chunk representing premise 1 has correctly been recalled.
which is valid given experimental data, the execution of all rules
This can be encoded as a satisfiability problem as follows. A
which are involved in constructing the mental model must complete
cognitive state can be characterized by a formula over variables V
in at most tb time units.
which model buffer contents, i.e. cognitive states. We can assume
In Fig. 2, we illustrate cognitive states by the circular nodes,
formula us0 over variables V0 to encode cognitive state s and us0 . over
arrows indicate the execution of one rule which transforms one
variables V0 to encode cognitive state s0 . The precondition p of rule
cognitive state into another. In addition to rules which request the
r can be seen as a formula over V, the action a relates s and s0 , so it is
declarative module, an ACT-R model of the theory of preferred
a formula over V and V0 .
mental models may comprise rules with deterministic timing and
Furthermore, we use A(c, t) to denote the activation of chunk c at
outcome, e.g., when modifying buffers of the imaginal module. In
time t. We use optimized base-level learning to calculate activation
Fig. 2, we assume that there is only one request to the declarative
values: A(c, t) = ln (2) - ln (1 - d) - d(t - tc) where tc is the first
module by rule r, i.e. a request for the already constructed mental
time chunk c was presented. For our experiments, we consider two
model comprising premise 1, which has two qualitatively different
outcomes: a correct reply, and a wrong reply or no reply at all. Now
given corresponding rules, if it is impossible to choose the decay rate
d and the threshold s such that ti2 - ti4 B tb, then the considered rules
definitely do not constitute a valid (partial) ACT-R model for the
preferred mental model theory.
A Valid ACT-R Model for the Theory of Preferred Mental Models.
The preferred mental model theory has been implemented in ACT-R
(Ragni, Fangmeier and Brussow 2010). In this model, each premise is
represented by a mental model chunk which is constructed in the
imaginal buffer. A mental model chunk specifies a number of posi-
tions pos1, pos2, and assigns objects presented on the computer
screen accordingly. When premise 2 is presented, the mental model
chunk representing the first premise has to be retrieved from declar-
ative memory in order to construct a new mental model chunk which
integrates both premises. In the ACT-R model for the preferred
mental model theory, only base-level learning is considered.
In the following, we use a part of this ACT-R model to illustrate our
approach. As the ACT-R model predicts the experimental data
appropriately for a certain choice of parameters, we expect our
approach to confirm this result.
Formal Analysis of ACT-R Models
Formally, an ACT-R production rule is a pair r = (p, a) which
comprises a precondition p and an action a. An ACT-R model is a set
of production rules. A cognitive state c = (s, t) consists of a mapping
s from buffers to chunks or to the symbol nil, and a time-stamp Fig. 2 Example sequence of cognitive states (circles) in between
t 2 R 0.
environment event E1 and E2 (rectangles). A cognitive state which
The F-ACT-R formal semantics (Albrecht, Westphal 2014) leads to a correct reply is denoted by V, and a state which leads to a
explains how a set of production rules induces a timed transition wrong reply or no reply at all as X. Label r indicates a state where a
system on cognitive states given a set of global parameters, including retrieval request is posed to the declarative module

123
Cogn Process (2014) 15 (Suppl 1):S1S158 S79

chunks c1, which correctly represents premise 1, and c2, which does Ragni M, Knauff M, Nebel B (2005) A computational model for
not. spatial reasoning with mental models. In: Proceedings of the 27th
The formula to be checked for satisfiability, then, is annual cognition science conference, pp 10641070

9d; s : us ^ p ^ Ac1 ; t [ s ^ Ac2 ; t\Ac1 ; t ^ a ^ u0s


^ t0  t\tb : 2
F-ACT-R: defining the ACT-R architectural space
As a proof-of-concept, we have used the SMT solver SMTInterpol
(Christ, Hoenicke and Nutz 2012) to check an instance of (2). With an Rebecca Albrecht, Bernd Westphal
appropriate start cognitive state, SMTInterpol reports satisfiability of Informatik, Universitat Freiburg, Germany
(2) and provides a satisfying valuation for d and s in about 1 s in total.
If we choose an initial cognitive state where the activation of c1 is too Abstract
low, SMTinterpol proves (2) unsatisfiable as expected. ACT-R is a unified theory of cognition and a cognitive architecture
By adding, e.g., constraints on s and d to (2), we can use the same which is widely used in cognitive modeling. However, the semantics
procedure to check whether the model is valid for particular values of of ACT-R is only given by the ACT-R interpreter. Therefore, an
d and s which lie within a range accepted by the community. application of formal methods from computer science in order to, e.g.,
Note that our approach is not limited to the analysis of single rules. Given an upper bound n on the number of rules possibly executed between two points in time, a formula similar to (2) can be constructed.

Conclusion
We propose a new method to check whether and under which conditions a psychological theory implemented in ACT-R predicts experimental data. Our method is based on stating the modelling problem as a satisfiability problem which can be analyzed by an SMT solver.
With this approach it is in particular no longer necessary to write a complete ACT-R model in order to evaluate a psychological theory. It is sufficient to provide those rules which are possibly enabled during the time considered for the analysis.
For example, in Albrecht and Ragni (2014) we propose a cognitive model for the Tower of London task, where an upper bound on the time to complete a retrieval request for the target position of a disk is defined as the time it takes the visual module to encode the start position. We expect the evaluation of such mechanisms to become much more efficient using our approach as compared to simulation-based approaches.
In general, we believe that by using our approach the overall process of cognitive modelling can be brought to a much more efficient level by analyzing crucial aspects of psychological theories before entering the often tedious phase of complete ACT-R modelling.
References
Albrecht R (2013) Towards a formal description of the ACT-R unified theory of cognition. Unpublished master's thesis, Albert-Ludwigs-Universität Freiburg
Albrecht R, Ragni M (2014) Spatial planning: an ACT-R model for the Tower of London task. In: Proceedings of spatial cognition conference 2014, to appear
Albrecht R, Westphal B (2014) F-ACT-R: defining the architectural space. In: Proceedings of KogWis 2014, to appear
Anderson JR (1983) The architecture of cognition, vol 5. Psychology Press
Anderson JR (2007) How can the human mind occur in the physical universe? Oxford University Press
Christ J, Hoenicke J, Nutz A (2012) SMTInterpol: an interpolating SMT solver. In: Donaldson AF, Parker D (eds) SPIN, vol 7385. Springer, pp 248-254
De Moura L, Bjørner N (2011) Satisfiability modulo theories: introduction and applications. Commun ACM 54(9):69-77. doi:10.1145/1995376.1995394
Ragni M, Fangmeier T, Brüssow S (2010) Deductive spatial reasoning: from neurological evidence to a cognitive model. In: Proceedings of the 10th international conference on cognitive modeling, pp 193-198
Ragni M, Knauff M, Nebel B (2005) A computational model for spatial reasoning with mental models. In: Proceedings of the 27th annual cognition science conference, pp 1064-1070


F-ACT-R: defining the ACT-R architectural space

Rebecca Albrecht, Bernd Westphal
Informatik, Universität Freiburg, Germany

Abstract
ACT-R is a unified theory of cognition and a cognitive architecture which is widely used in cognitive modeling. However, the semantics of ACT-R is only given by the ACT-R interpreter. Therefore, an application of formal methods from computer science in order to, e.g., analyze or compare cognitive models wrt. different global parameter settings is not possible. We present a formal abstract syntax and semantics for the ACT-R cognitive architecture as a cornerstone for applying formal methods to symbolic cognitive modeling.
Keywords
ACT-R, Cognitive Architectures, Formal Methods, Abstract Syntax, Formal Semantics
Introduction
In Cognitive Science researchers describe human cognitive processes in order to explain human behavioral patterns found in experiments. One approach is to use cognitive architectures which implement a set of basic assumptions about human cognitive processes and to create cognitive models with respect to these assumptions. ACT-R (Anderson 1983, 2007) is one such cognitive architecture, which provides a programming language to create a cognitive model and an interpreter to execute the model. The ACT-R architecture is a hybrid architecture which includes symbolic and subsymbolic mechanisms. Symbolic mechanisms consist of three concepts, namely assuming a modular structure of the human brain, using chunks as basic declarative information units, and using production rules to describe processing steps. Subsymbolic processes are associated with the modules' behavior. The modules' behavior is controlled by so-called global parameters, which enable the execution of a cognitive model with respect to different assumptions about human cognition.
In this work we introduce a formal abstract syntax and semantics for the ACT-R cognitive architecture. An ACT-R architecture is defined as a structure which interprets syntactic components of an ACT-R model with respect to psychological assumptions, e.g. global parameter settings. As a result, we construct a complete transition system which describes all possible computations of an ACT-R model with respect to one ACT-R architecture. The architectural space of ACT-R is defined as the set of all possible ACT-R architectures.
State of the Art
ACT-R. The functionality of ACT-R is based on three concepts. Firstly, there are basic information units (chunks) which describe objects and their relationship to each other. A chunk consists of a chunk type and a set of slots which reference other chunks.
Secondly, the human brain is organized in a modular structure, that is, information processing is localized differently wrt. how information is processed. There are different modules for perception, interaction with an environment, and internal mental processes. When requested, each module can process one chunk at a time, and the processing of chunks needs time to be completed. The processed chunk is made accessible by the module through an associated buffer. A state in cognitive processing (cognitive state) is the set of chunks made accessible by modules through associated buffers.

Thirdly, there are cognitive processing steps, i.e. changing a cognitive state by altering or deleting chunks which are made accessible by modules, or requesting new chunks from modules. This is accomplished by the execution of production rules. A production rule consists of a precondition, which characterizes cognitive states where the production rule is executable, and an action, which describes changes to cognitive states when the production rule is executed. Basically, actions request modules to process certain chunks. Which chunk is processed and how long this processing takes depends on the implementation of psychological assumptions within the modules and may be controlled by global parameters.
Formalization of Symbolic Cognitive Architectures. To the best of our knowledge, there is no formal abstract description of ACT-R. Other works try to make cognitive modelling more accessible by utilizing other modelling formalisms, like GOMS (Card, Moran and Newell 1983), as high-level languages for ACT-R (Salvucci, Lee 2003; St. Amant, Freed and Ritter 2005; St. Amant, McBride and Ritter 2006). In other approaches the authors propose high-level languages which can be compiled into different cognitive architectures, e.g. ACT-R and SOAR (Laird, Newell and Rosenbloom 1987). This includes HERBAL (Cohen, Ritter and Haynes 2005; Morgan, Haynes, Ritter and Cohen 2005) and HLSR (Jones, Crossman, Lebiere and Best 2006). None of these approaches reports a formal description for ACT-R; they only describe the high-level language and the compilation principles.
In Schultheis (2009), the author introduces a formal description for ACT-R in order to prove Turing completeness. However, this formal description is too abstract to be used as a complete formal semantics for ACT-R. In Stewart and West (2007) the authors analyze the architectural space of ACT-R. In general, this idea is similar to the idea presented in this work. However, the result of their analysis is a new implementation of ACT-R in the Python programming language. Therefore, it is not abstract and, e.g., not suitable for applying formal methods from software engineering.
A Formal Definition of ACT-R
In this section, we describe the basic building blocks of our formalization of ACT-R. The formalization complies with the ACT-R theory as defined by the ACT-R interpreter. Firstly, we provide an abstract syntax for ACT-R models which includes chunk instantiations, abstract modules, and production rules. In our sense, an ACT-R model is simply a syntactic representation of a cognitive process. Secondly, we formally introduce the notion of architecture as an interpretation of syntactic entities of an ACT-R model with respect to psychological assumptions, i.e. subsymbolic mechanisms. This yields a representation of cognitive states and finite sequences thereof. Thirdly, for a given model we introduce an infinite-state, timed transition system over cognitive states which is induced by an architecture.
Abstract Syntax. We consider a set of abstract modules as a generalization of the particular modules provided by the ACT-R tool. A module M consists of a finite set of buffers B, a finite set of module queries Q, and a finite set of action symbols A. Buffers are represented as variables which can be assigned to chunks. Module queries are represented as Boolean variables. As action symbols we consider the standard ACT-R action symbols +, =, and -.
In order to describe the ACT-R syntax we define the signature of ACT-R models, which is basically a set of syntactic elements of a model. The signature of an ACT-R model consists of a set of modules, a set of chunk types, a set of ACT-R variables, and a set of relation symbols. A chunk type consists of a type name and a finite set of slot names (or slots for short).
An abstract production rule consists of a precondition and an action. A precondition is basically a set of expressions over a model's signature, i.e. over the content of buffers of a module, parameterized by ACT-R variables and module queries. An action is also an expression over the model's signature which uses action symbols of modules.
An abstract ACT-R model consists of a finite set of production rules R, a finite set of initial buffer actions A0 in order to define the initial state, and a finite set of chunk type instantiations C0.
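A minimal data-structure rendering of this abstract syntax is sketched below. It is our own illustration in Python, not the authors' formalization: field names follow the text, but the example module, chunk type, and rule contents are invented.

# Illustrative Python rendering of the abstract syntax described above (not the
# authors' implementation): modules, chunk types, production rules, and a model.
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class Module:
    name: str
    buffers: FrozenSet[str]          # finite set of buffers B
    queries: FrozenSet[str]          # finite set of module queries Q (Boolean variables)
    actions: FrozenSet[str] = frozenset({"+", "=", "-"})  # standard ACT-R action symbols

@dataclass(frozen=True)
class ChunkType:
    name: str
    slots: Tuple[str, ...]           # finite set of slot names

@dataclass(frozen=True)
class ProductionRule:
    name: str
    precondition: Tuple[str, ...]    # expressions over the model's signature
    action: Tuple[str, ...]          # expressions using the modules' action symbols

@dataclass(frozen=True)
class ACTRModel:
    rules: FrozenSet[ProductionRule]         # R
    initial_buffer_actions: Tuple[str, ...]  # A0, defines the initial state
    chunk_instantiations: Tuple[str, ...]    # C0

# Schematic example (contents are made up for illustration).
declarative = Module("declarative", frozenset({"retrieval"}), frozenset({"state-busy"}))
count_fact = ChunkType("count-order", ("first", "second"))
rule = ProductionRule("start-count", ("=goal> state counting",), ("+retrieval> first =num",))
model = ACTRModel(frozenset({rule}), ("+goal> state counting",), ("(b ISA count-order first one second two)",))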
Architecture. In this section we describe the formal interpretation of an abstract ACT-R model with respect to psychological assumptions. We propose to denote by architecture a structure which provides all necessary means to interpret an abstract ACT-R model. To this end, an architecture defines chunks as the building blocks of declarative knowledge, i.e. instantiations of chunk types, an interpretation function for each action symbol of a module, and a production rule selection mechanism.
In order to describe all computations of an ACT-R model as a transition system, we introduce the notion of a cognitive state and finite sequences thereof which are induced by the execution of production rules. The introduction of finite sequences of cognitive states is necessary as the interpretation of actions depends on the history of a model. As the most prominent example consider base-level learning in ACT-R. A request to the declarative module yields one chunk as a result. In general, it is possible that more than one chunk fits the request. Which chunk is returned by the declarative module depends on how often and when all fitting chunks were processed by the model before.
We use D to denote the set of chunks, where a chunk c ∈ D is a unique entity which has a chunk type and maps each slot (as defined by the chunk type) to another chunk.
A cognitive state c is a function which maps each buffer b ∈ B to a chunk c ∈ D and a delay d ∈ R≥0. The delay corresponds to the timing behavior of modules. By mapping buffer b to a delay d > 0 we indicate that there is an action pending and that it will be completed in d time units. If there is no action pending, d is 0. This is a slightly different view than is common in ACT-R, where a chunk is accessible in a cognitive state only after it has been processed by the module, i.e. if the module's delay d is 0. Intuitively, in our representation, an interpreter is able to look ahead when scheduling actions. In the following, we use C to denote the set of all cognitive states and Cpart to denote the set of all partial cognitive states, i.e., functions which do not necessarily assign all buffers. A finite trace p is simply a finite sequence c0, c1, ..., cn ∈ C of cognitive states. In the following, we use P to denote the set of all finite traces.
Given an action symbol a ∈ A of a module M, an interpretation of a is a function I⟨a⟩ : P → 2^(Cpart × K × 2^D) which assigns to each finite trace p a set of possible effects of the action. An effect is a triple (cpart, k, C) consisting of a partial cognitive state cpart, a valuation k ∈ K of module queries, and a set C ∈ 2^D of chunks. The partial cognitive state cpart defines an update of the buffer contents, k provides new values for module queries, and C comprises the chunks which are removed from buffers and which have to be considered for an update of the declarative memory. Similarly, the production rule selection mechanism is formally a function S : P → 2^R which yields a set of production rules. The production selection mechanism decides whether a precondition is satisfied in a cognitive state given an interpretation of relation symbols and an assignment of ACT-R variables to chunks.
Note that our notion of architecture provides a clear interface between the symbolic layer of ACT-R, i.e. the world of rules and chunks, and the sub-symbolic layer, i.e. formal principles corresponding to human cognitive processing captured by the interpretation functions of action symbols associated with modules. Furthermore, each choice of global parameters, e.g. the decay rate in base-level learning, corresponds to exactly one architecture as defined above. Architectures differ in the definitions of the interpretation functions I, i.e. which effects are obtained for a given finite trace, and in the production rule selection function S.
Behavior of ACT-R Models. In this section we introduce the computational space of an ACT-R model given an ACT-R architecture. This is done by introducing a labelled, timed transition system as induced by a model and an architecture. To this end, we define the following transition relation.

Two time-stamped cognitive states (c, t) and (c', t') are in transition relation wrt. a production rule r ∈ R, an execution delay s ∈ R≥0 for production rule r, a set of chunks x ⊆ D, and a finite trace p ∈ P, i.e.

(c, t) →_p^{r, s, x} (c', t'),

if and only if production rule r is executable in cognitive state c wrt. the finite trace p, i.e. if r ∈ S(p, c), if the effect of the actions in r according to the interpretation functions of the action symbols yields c', and if time-stamp t' is t + s.
The introduced transition relation corresponds to a cognitive processing step in ACT-R, i.e. the execution of one production rule. The transition relation → induces an (infinite-state) timed transition system with the initial state defined by the cognitive state given by the initial buffer actions A0. Given an ACT-R model, there is a one-to-one correspondence between the set of simulation runs obtainable from the ACT-R tool (for a given set of parameters) and the computation paths in the timed transition system induced by the architecture corresponding to the chosen parameters. We validated the formal description by comparing a prototype implementation to the ACT-R interpreter for several models described in the ACT-R tutorial.
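To make the induced transition system concrete, the sketch below illustrates one timed step. It is our own illustration, not the authors' prototype: the function names select and interpret, and the way an effect is packaged, are assumptions made for this example, standing in for the architecture's S and I.

# One step of the timed transition relation sketched above. `select` plays the role
# of S (which rules are enabled for a trace and state) and `interpret` the role of
# the action interpretation I; both are placeholders for a concrete architecture.
from typing import Callable, Dict, List, Set, Tuple

CognitiveState = Dict[str, Tuple[str, float]]   # buffer -> (chunk, pending delay)
Trace = List[CognitiveState]

def step(state: CognitiveState, t: float, trace: Trace, rule: str,
         select: Callable, interpret: Callable) -> Tuple[CognitiveState, float, Set[str]]:
    """Return (c', t', x): successor state, new time-stamp, and removed chunks."""
    if rule not in select(trace, state):          # r must be in S(p, c)
        raise ValueError("production rule not enabled in this cognitive state")
    buffer_update, removed_chunks, delay = interpret(trace, rule)
    successor = dict(state)
    successor.update(buffer_update)               # the partial state overrides buffer contents
    return successor, t + delay, removed_chunks   # time-stamp t' = t + s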
Conclusion
In this work, we presented the first comprehensive, high-level formal semantics for the ACT-R programming language as defined by the ACT-R interpreter. By our notion of ACT-R architectures, we have precisely captured the architectural space of ACT-R.
Our formalization lays the ground for approaching a number of known issues with the ACT-R modelling language. Firstly, our notion of architecture can be used to explicitly state all assumptions under which cognitive models are created and evaluated. Then the architectures used for different cognitive models can be compared precisely due to the formal nature of our definition. We expect such comparisons to provide deeper insights into human cognition as such. Today, mechanisms and parameter settings employed for modelling and evaluation are often neither reported nor discussed, mainly due to the intransparent integration of these principles in the ACT-R interpreter.
Secondly, our formal semantics allows us to compare different ACT-R models. Whether two models with (syntactically) different rule sets describe the same behavior now amounts to proving that the induced timed transition systems are equivalent.
Thirdly, our formal view on ACT-R models allows us to go beyond today's quantitative evaluation of ACT-R models with the ACT-R interpreter towards a qualitative evaluation. Today, the ACT-R interpreter is typically used to compute abstract quantitative figures like the average time needed by the model to solve a certain task. Our formalization provides a stepping stone to, e.g., formal analysis techniques. With these techniques we can, for instance, analyze whether and under what conditions certain aspects of psychological theories (Albrecht, Westphal 2014) can possibly predict empirical data, or check whether and under what conditions a certain cognitive state which is crucial for a modelled psychological theory is reachable.
Last but not least, formal techniques can be applied to improve the software engineering aspect of ACT-R modelling, which is often perceived in the literature to be rather inefficient and error prone (Morgan et al. 2005; Jones et al. 2006).
Furthermore, the scope of our work is not limited to ACT-R but has a clear potential to affect the whole domain of rule-based cognitive architectures. Firstly, efforts to provide alternative ACT-R interpreters like (Stewart, West 2007) can refer to a common reference semantics. Secondly, we are able to formally establish connections between different cognitive architectures, ranging from general purpose architectures like SOAR to special purpose architectures like CasCas (Lüdtke et al. 2006).
In future work, our formalization needs to be extended to cover probabilistic aspects. Furthermore, we plan to extend the prototype implementation of our formal description (Albrecht, Giewein and Westphal 2014) to support more ACT-R models before we investigate options for improved high-level model description languages that are explicitly suitable for the ACT-R theory.

References
Albrecht R, Giewein M, Westphal B (2014) Towards formally founded ACT-R simulation and analysis. In: Proceedings of KogWis 2014, to appear
Albrecht R, Westphal B (2014) Analyzing psychological theories with F-ACT-R. In: Proceedings of KogWis 2014, to appear
Anderson JR (1983) The architecture of cognition, vol 5. Psychology Press
Anderson JR (2007) How can the human mind occur in the physical universe? Oxford University Press
Card SK, Moran TP, Newell A (1983) The psychology of human computer interaction. CRC
Cohen MA, Ritter FE, Haynes SR (2005) Herbal: a high-level language and development environment for developing cognitive models in Soar. In: Proceedings of the 14th conference on behavior representation in modeling and simulation, pp 177-182
Jones RM, Crossman JA, Lebiere C, Best BJ (2006) An abstract language for cognitive modeling. In: Proceedings of the 7th international conference on cognitive modeling. Lawrence Erlbaum, Mahwah
Laird JE, Newell A, Rosenbloom PS (1987) Soar: an architecture for general intelligence. Artif Intell 33(1):1-64
Lüdtke A, Cavallo A, Christophe L, Cifaldi M, Fabbri M, Javaux D (2006) Human error analysis based on cognitive architecture. In: HCI-Aero, pp 40-47
Morgan GP, Haynes SR, Ritter FE, Cohen MA (2005) Increasing efficiency of the development of user models. In: SIEDS, pp 82-89
Salvucci DD, Lee FJ (2003) Simple cognitive modeling in a complex cognitive architecture. In: CHI, pp 265-272
Schultheis H (2009) Computational and explanatory power of cognitive architectures: the case of ACT-R. In: Proceedings of the 9th international conference on cognitive modeling, pp 384-389
St. Amant R, Freed AR, Ritter FE (2005) Specifying ACT-R models of user interaction with a GOMS language. Cogn Syst Res 6(1):71-88
St. Amant R, McBride SP, Ritter FE (2006) An AI planning perspective on abstraction in ACT-R modeling: toward an HLBR language manifesto. In: Proceedings of the ACT-R Workshop
Stewart TC, West RL (2007) Deconstructing and reconstructing ACT-R: exploring the architectural space. Cogn Syst Res 8(3):227-236


Defining distance in language production: extraposition of relative clauses in German

Markus Bader
Goethe-Universität Frankfurt, Institut für Linguistik, Frankfurt am Main, Germany

Abstract
This paper presents results from a corpus study and two language production experiments that have investigated the position of relative clauses in German. A relative clause can appear either adjacent to its head noun or extraposed behind the clause-final verb. The corpus data show that the major factor deciding whether to extrapose or not is the distance that has to be crossed by extraposition. Relative clause length has an effect too, but a much smaller one. The experimental results show that distance is not defined as number of words but as new discourse referents in the sense of the Dependency Locality Theory of Gibson (2000).
Keywords
Sentence production, Extraposition, Dependency length, Dependency Locality Theory (DLT)
Introduction
A large body of research into word order variation has shown that constituent weight is a major factor determining the choice between competing syntactic alternatives (e.g., Hawkins 1994; Wasow 2002).

More recently, it has become common to define weight in terms of the length of syntactic dependencies, like the dependencies between verbs and their arguments (e.g., Hawkins 2004; Gildea, Temperley 2010). This raises the question of how dependency length is to be measured.
The syntactic alternation considered in this paper concerns the position of relative clauses in German. As shown in (1), a relative clause in German can appear either adjacent to its head noun (1-a) or extraposed behind the clause-final verb (1-b).
When deciding whether to keep the relative clause adjacent to its head noun or to extrapose it behind the clause-final verb, two dependencies have to be considered. One is the dependency between head noun and relative clause and the second one is the dependency between head noun and clause-final verb. As shown in (2) and (3), these two dependencies stand in a trade-off relation.
When the relative clause is adjacent to the head noun, the head noun-relative clause dependency (solid arrow) is optimal whereas the head noun-verb dependency (dashed arrow) is not, because the relative clause intervenes between head noun and verb. Extraposition of the relative clause shortens the head noun-verb dependency but lengthens the head noun-relative clause dependency, that is, while the former dependency improves the latter one becomes worse.
Corpus studies (Hawkins 1994; Uszkoreit et al. 1998) show that the decision to extrapose depends on both dependencies. First, the rate of extraposition increases with increasing length of the relative clause. Second, the rate of extraposition declines with increasing extraposition distance, that is, with an increasing amount of material that intervenes between head noun and relative clause in the extraposed variant. In (3), for example, extraposition has to cross two words (Geschenke geben).
In studies of language production (Stallings, MacDonald 2011) and corpus research (Gildea, Temperley 2010), distance is measured as number of words. The same is true for the efficiency theory proposed in (Hawkins 2004), which is neutral with regard to language production or language comprehension. This contrasts with the Dependency Locality Theory (DLT) of (Gibson 2000), which is a theory of processing load during language comprehension. According to the DLT, dependency length is not measured in number of words but in number of new discourse referents.
The aim of the present work is to test the hypothesis that dependency length for purposes of language production is defined in the same way as proposed by the DLT for language comprehension, namely in terms of new discourse referents, not in terms of words.
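The two competing metrics can be stated compactly as follows. The sketch is our own illustrative operationalization in Python; the test deciding which tokens introduce a new discourse referent is a simplification made up for the example and is not part of the paper's coding scheme.

# Illustrative operationalization of the two distance metrics compared in this paper.
NEW_REFERENT = {"Gedichte", "etwas"}     # bare noun and indefinite pronoun introduce a referent
                                         # (assumption for this toy example)

def word_distance(region):
    """Post head-noun region measured in words."""
    return len(region)

def dlt_distance(region):
    """Post head-noun region measured in new discourse referents (DLT-style)."""
    return sum(1 for tok in region if tok in NEW_REFERENT)

for region in ([], ["Gedichte"], ["einige", "Gedichte"], ["etwas"]):
    print(region, "words:", word_distance(region), "new referents:", dlt_distance(region))

# The word-based count separates the N and Det+N conditions (1 vs. 2 words),
# whereas the DLT-based count treats them alike (1 new referent each), which is
# the pattern the experiments reported below support.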
Corpus Analysis
About 2000 sentences containing a relative clause in either adjacent or extraposed position were drawn from the deWaC corpus (Baroni, Bernardini, Ferraresi and Zanchetta 2009) and analyzed. Preliminary results of the ongoing analysis are shown in Figs. 1 and 2. Figure 1 shows the effect of relative clause length. Figure 2 shows the effect of the post head-noun region, which is the region between head noun/relative clause and clause-final verb (Geschenke in (3)). The verb is not included because the verb always has to be crossed when extraposing, and additional analyses show that the length of the verbal complex has only very small effects on the rate of extraposition. When only the total extraposition distance is considered, as in the older corpus literature, one misses the point that it is the length of the post head-noun region which is crucially involved in determining extraposition.

Fig. 1 Proportion of extraposition depending on the length of the relative clause (in words)
Fig. 2 Proportion of extraposition depending on the length of the pre-verbal material (in words)

In accordance with earlier corpus studies of relative clause placement in German, the rate of extraposition increases with increasing length of the relative clause and decreases with increasing length of the post head-noun region. Furthermore, the length of the post head-noun region is a much more important predictor of relative clause placement than the length of the relative clause. When the post head-noun region is empty, extraposition is almost obligatory, but already a post head-noun region of four words drives the extraposition rate down to less than 10 %.
In the following experiments, the post head-noun region will range from 0 to 2 words. As shown in Fig. 2, this relatively small increase has strong effects on the decision to extrapose when averaged across all different kinds of intervening material. In this case, the extraposition rate goes down from ca. 90 % for 0 words to 60 % for one word and to 35 % for two words. The question addressed by the following two experiments is whether more fine-grained distinctions show up when looking at particular types of intervening material.
Experiment 1
In order to decide between defining dependency length in terms of number of words or number of new discourse referents, 32 students participated in an oral production experiment which was a variant of the production-from-memory task (Bock, Warren 1985). Participants first read a main clause as in (4). After a visual prompt like "Max said that", the initial main clause had to be repeated orally from memory in the form of an embedded clause. While the initial main clause fixed the lexical content of the to-be-produced embedded clause, participants were completely free with regard to the position of the relative clause.
The experiment varied the amount of material that had to be crossed by extraposition in addition to the verb: nothing (4-a), a bare NP object (4-b), or an NP object containing a determiner (4-c). The latter two conditions differ in number of words but are identical in number of new discourse referents. As shown by the corpus analysis, a difference of one word has a strong effect on the rate of extraposition in the length range under consideration.
The percentages of sentences with extraposed relative clauses are presented in Table 1. Table 1 shows that the rate of extraposition decreases substantially in the presence of an object but the difference between one- and two-word objects is quite small. The results were analyzed by means of mixed-effects logistic regression using the R package lme4 (Bates, Maechler 2010). The experimental factors were coded in such a way that all contrasts test whether differences between means are significant (so-called contrast coding). Table 2 shows the results of the statistical analysis. The difference between 0 words and 1 word was significant but the difference between 1 word and 2 words was not. In sum, the results of Experiment 1 suggest that distance is defined as number of new discourse referents, as in the DLT, and not as number of words.

Table 1 Percentages of extraposition in Experiments 1 and 2

Structure       % Extraposed in Exp 1   % Extraposed in Exp 2
–               38                      54
N               15                      40
Det + N         11                      31

Table 2 Results of mixed effect model for Experiment 1

Contrast        Estimate   Std. Error   z value   Pr(>|z|)
– vs. N         2.3233     0.6387       3.638     0.0002
N vs. Det + N   0.5776     0.8060       0.717     0.4735
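One standard way to obtain contrasts that directly test differences between neighbouring condition means is sliding-difference (repeated) contrast coding. The snippet below shows the corresponding contrast matrix for the three-level Structure factor; it is our own illustration in Python, whereas the original analysis was run in R with lme4, and the exact coding used there is an assumption on our part.

import numpy as np

# Sliding-difference ("repeated") contrast coding for the three-level factor
# Structure = {–, N, Det + N}: each column tests the difference between two
# adjacent levels, matching the contrasts reported in Tables 2 and 3.
levels = ["-", "N", "Det + N"]
contrasts = np.array([
    [-2/3, -1/3],   # – (no intervening object)
    [ 1/3, -1/3],   # N
    [ 1/3,  2/3],   # Det + N
])
# With this coding, the first regression coefficient estimates mean(N) - mean(–)
# and the second estimates mean(Det + N) - mean(N); the intercept is the grand mean.
for level, row in zip(levels, contrasts):
    print(f"{level:>8}: {row}")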
Experiment 2
To corroborate the results of Experiment 1, Experiment 2 uses the same experimental procedure for testing material that differs only in one respect from the material investigated in the first experiment. As shown in (5), the condition with one additional word before the verb now contains the indefinite pronoun etwas ('something') instead of a bare noun.
Both the indefinite pronoun and a bare noun introduce a new discourse referent and should thus block extraposition in the same way. However, because the indefinite pronoun lacks lexical content, it causes less semantic processing cost. Since the cost of semantic processing is the underlying reason why distance is measured in terms of new discourse referents in the DLT, it could be expected that it is easier to extrapose across an indefinite pronoun than across a bare noun.
27 students participated in Experiment 2. The results, which are also shown in Table 1, reveal a 14 % drop in extraposition rate in the presence of a one-word object and a further 9 % drop when going from one- to two-word objects. The results were analyzed as described for Experiment 1. The results of the logistic mixed-effects regression are shown in Table 3. The difference between 0 words and 1 word was significant but the difference between 1 word and 2 words failed to reach significance.

Table 3 Results of mixed effect model for Experiment 2

Contrast        Estimate   Std. Error   z value   Pr(>|z|)
– vs. N         1.0587     0.3855       2.747     0.0060
N vs. Det + N   0.5054     0.3313       1.526     0.1271

Discussion
The experimental results presented in this paper show that the decision between keeping a relative clause adjacent to its head noun and extraposing the relative clause behind the clause-final verb is strongly affected by the amount of material that intervenes between the head noun (including the relative clause) and the clause-final verb. When a new discourse referent intervenes, the rate of extraposition is substantially reduced. Whether the new discourse referent was introduced by a one-word NP or a two-word NP had no significant effect, although numerically there were some differences in the expected direction.
The results thus suggest that dependency length is defined in the same way for language production and language comprehension, namely in terms of new discourse referents. This in turn argues that the DLT has a broader coverage than just processing load during language comprehension.
An alternative to defining weight in terms of new discourse referents is the prosodic theory proposed by (Anttila, Adams and Speriosu 2010) in their analysis of the English dative alternation. In a nutshell, (Anttila et al. 2010) propose that dependency length should be measured as the number of intervening phonological phrases, where a phonological phrase consists of an accented lexical word possibly preceded by unaccented function words.
According to this definition, an NP consisting of a bare noun like Gedichte ('poems') and an NP consisting of a determiner and a noun like einige Gedichte ('some poems') both constitute a single phonological phrase. This would be compatible with the finding of Experiment 1 that the rate of extraposition did not differ significantly between these two types of NPs.
In contrast to a bare noun like Gedichte, an indefinite pronoun like etwas ('something') does not form a phonological phrase because etwas is an unaccented function word. This predicts that the intervening indefinite pronoun etwas should be invisible with regard to extraposition. However, as shown by the results for Experiment 2, the rate of extraposition decreased significantly when etwas was present. The rate of extraposition decreased even further when an NP consisting of a determiner and a noun intervened, but this further decrease was not significant. The results of Experiment 2 thus do not support the prosodic definition of distance proposed by (Anttila et al. 2010).
In sum, the results of the two experiments reported in this paper favor a definition of dependency length in terms of intervening new discourse referents. The two alternatives that were considered, distance measured as number of words or as number of phonological phrases, could not account for the complete pattern of results.

References
Anttila A, Adams M, Speriosu M (2010) The role of prosody in the English dative alternation. Lang Cogn Process 25(7-9):946-981
Baroni M, Bernardini S, Ferraresi A, Zanchetta E (2009) The WaCky wide web: a collection of very large linguistically processed web-crawled corpora. Lang Resour Eval J 23(3):209-226. doi:10.1007/s10579-009-9081-4
Bates DM, Maechler M (2010) lme4: linear mixed-effects models using S4 classes
Bock JK, Warren RK (1985) Conceptual accessibility and syntactic structure in sentence formulation. Cognition 21:47-67
Gibson E (2000) The dependency locality theory: a distance-based theory of linguistic complexity. In: Marantz A, Miyashita Y, O'Neil W (eds) Image, language, brain. Papers from the first mind articulation project symposium. MIT Press, Cambridge, pp 95-126
Gildea D, Temperley D (2010) Do grammars minimize dependency length? Cogn Sci 34:286-310
Hawkins JA (1994) A performance theory of order and constituency. Cambridge University Press, Cambridge
Hawkins JA (2004) Efficiency and complexity in grammars. Oxford University Press, Oxford
Stallings LM, MacDonald MC (2011) It's not just the heavy NP: relative phrase length modulates the production of heavy-NP shift. J Psycholing Res 40(3):177-187
Uszkoreit H, Brants T, Duchier D, Krenn B, Konieczny L, Oepen S, Skut W (1998) Studien zur performanzorientierten Linguistik: Aspekte der Relativsatzextraposition im Deutschen. Kognitionswissenschaft 7:129-133
Wasow T (2002) Postverbal behavior. CSLI Publications, Stanford


How is information distributed across speech and gesture? A cognitive modeling approach

Kirsten Bergmann, Sebastian Kahl, Stefan Kopp
Bielefeld University, Germany

Abstract
In naturally occurring speech and gesture, meaning is organized and distributed across the modalities in different ways. The underlying cognitive processes are largely unexplored. We propose a model based on activation spreading within dynamically shaped multimodal memories, in which coordination arises from the interplay of visuo-spatial and linguistically shaped representations under given cognitive resources. A sketch of this model is presented together with simulation results.
Keywords
Speech, Gesture, Conceptualization, Semantic coordination, Cognitive modeling
Introduction
Gestures are an integral part of human communication and they are inseparably intertwined with speech (McNeill, Duncan 2000). The detailed nature of this connection, however, is still a matter of considerable debate. The data that underlie this debate have for the most part come from studies on the coordination of overt speech and gestures showing that the two modalities are coordinated in their temporal arrangement and in meaning, but with considerable variations. When occurring in temporal proximity, the two modalities express the same underlying idea, however, not necessarily identical aspects of it: Iconic gestures can be found to be redundant with the information encoded verbally (e.g., "round cake" + gesture depicting a round shape), to supplement it (e.g., "cake" + gesture depicting a round shape), or even to complement it (e.g., "looks like this" + gesture depicting a round shape). These variations in meaning coordination, together with temporal synchrony, led to different hypotheses about how the two modalities encode aspects of meaning and what mutual influences between the two modalities could underlie this. However, a concrete picture of this and in particular of the underlying cognitive processes is still missing.
A couple of studies have investigated how the frequency and nature of gesturing, including its coordination with speech, is influenced by cognitive factors. There is evidence that speakers indeed produce more gestures at moments of relatively high load on the conceptualization process for speaking (Kita, Davies 2009; Melinger, Kita 2007). Moreover, supplementary gestures are more likely in cases of problems of speech production (e.g. disfluencies) or when the information conveyed is introduced into the dialogue (and thus conceptualized for the first time) (Bergmann, Kopp 2006). Likewise, speakers are more likely to produce non-redundant gestures in face-to-face dialogue as opposed to addressees who are not visible (Bavelas, Kenwood, Johnson and Philips 2002).
Chu et al. (Chu, Meyer, Foulkes and Kita 2013) provided data from an analysis of individual differences in gesture use demonstrating that poorer visual/spatial working memory is correlated with a higher frequency of representational gestures. However, despite this evidence, Hostetter and Alibali (Hostetter, Alibali 2007) report findings suggesting that speakers who have stronger visual-spatial skills than verbal skills produce higher rates of gestures than other speakers. A follow-up study demonstrated that speakers with high spatial skills also produced a higher proportion of non-redundant gestures than other speakers, whereas verbal-dominant speakers tended to produce such gestures more in case of speech disfluencies (Hostetter, Alibali 2011). Taken together this suggests that non-redundant gesture-speech combinations are the result of speakers having both strong spatial knowledge and weak verbal knowledge simultaneously, and avoiding the effort of transforming the one into the other.
In the literature, different models of speech and gesture production have been proposed. One major distinguishing feature is the point where in the production process cross-modal coordination can take place. The Growth Point Theory (McNeill, Duncan 2000) assumes that gestures arise from idea units combining imagery and categorical content. Assuming that gestures are generated pre-linguistically, Krauss, Chen and Gottesman (2000) hold that the readily planned and executed gesture facilitates lexical retrieval through cross-modal priming.

De Ruiter (2000) proposed that speech-gesture coordination arises from a multimodal conceptualization process that selects the information to be expressed in each modality and assigns a perspective for the expression. Kita, Özyürek (2003) agree that gesture and speech are two separate systems interacting during the conceptualization stage. Based on crosslinguistic evidence, their account holds that language shapes iconic gestures such that the content of a gesture is determined by bidirectional interactions between speech and gesture production processes at the level of conceptualization, i.e. the organization of meaning. Finally, Hostetter, Alibali (2008) proposed the Gestures as Simulated Action framework that emphasizes how gestures may arise from an interplay of mental imagery, embodied simulations, and language production. According to this view, language production evokes enactive mental representations which give rise to motor activation.
In spite of a consistent theoretical picture starting to emerge, many questions about the detailed mechanisms remain open. A promising approach to explicate and test hypotheses are cognitive models that allow for computational simulation. However, such modeling attempts for the production of speech and gestures are almost nonexistent. Only Breslow, Harrison and Trafton (2010) proposed an integrated production model based on the cognitive architecture ACT-R (Anderson, Bothell, Byrne, Lebiere and Qin 2004). This model, however, has difficulties explaining gestures that clearly complement or supplement verbally encoded meaning.
A Cognitive Model of Semantic Coordination
In recent and ongoing work we develop a model for multimodal conceptualization that accounts for the range of semantic coordination we see in real-life speech-gesture combinations. This account is embedded into a larger production model that comprises three stages: (1) conceptualization, where a message generator and an image generator work together to select and organize information to be encoded in speech and gesture, respectively; (2) formulation, where a speech formulator and a gesture formulator determine appropriate verbal and gestural forms for this; (3) motor control and articulation to finally execute the behaviors. Motor control, articulation, and formulation have been the subject of earlier work (Bergmann, Kopp 2009). In the following we provide a sketch of the model; details can be found in (Kopp, Bergmann and Kahl 2013; Bergmann, Kahl and Kopp 2013).
Multimodal Memory
The central component in our model is a multimodal memory which is accessible by modules of all processing stages. We assume that language production requires a preverbal message to be formulated in a symbolic-propositional representation that is linguistically shaped (Levelt 1989) (SPR, henceforth). During conceptualization the SPR, e.g., a function-argument structure denoting a spatial property of an object, needs to be extracted from visuo-spatial representations (VSR), i.e., the mental image of this object. We assume this process to involve the invocation and instantiation of memorized supramodal concepts (SMC, henceforth), e.g. the concept "round" which links the corresponding visuo-spatial properties to a corresponding propositional denotation. Figure 1 illustrates the overall relation of these tripartite multimodal memory structures.
To realize the VSR and part of the SMC, we employ a model of visuo-spatial imagery called Imagistic Description Trees (IDT) (Sowa, Kopp 2003). The IDT model unifies models from (Marr, Nishihara 1978; Biederman 1987; Lang 1989) and was designed, based on empirical data, to cover the meaningful visuo-spatial features in shape-depicting iconic gestures. Each node in an IDT contains an imagistic description which holds a schema representing the shape of an object or object part. Important aspects include (1) a tree structure for shape decomposition, with abstracted object schemas as nodes, (2) extents in different dimensions as an approximation of shape, and (3) the possibility of dimensional information to be underspecified.

Fig. 1 Overall production architecture

The latter occurs, e.g., when the axes of an object schema cover less than the three dimensions of space or when an exact dimensional extent is left open but only a coarse relation between axes like "dominates" is given. This allows us to represent the visuo-spatial properties of SMCs such as "round", "left-of" or "longish". Applying an SMC to the VSR is realized through graph unification and similarity matching between object schemas, yielding similarity values that assess how well a certain SMC applies to a particular visuo-spatially represented entity (cf. Fig. 1). SPRs are implemented straightforwardly as predicate-argument sentences.
Overall production process
Figure 1 shows an outline of the overall production architecture. Conceptualization consists of cognitive processes that operate upon the abovementioned memory structures to create a, more or less coherent, multimodal message. These processes are constrained by principles of memory retrieval, which we assume can be modeled by principles of activation spreading (Collins, Loftus 1975). As in cognitive architectures like ACT-R (Anderson et al. 2004), activations float dynamically, spread across linked entities (in particular via SMCs), and decay over time. Activation of more complex SMCs is assumed to decay more slowly than activation in lower VSR or SPR.
Production starts with the message generator and image generator inducing local activations of modal entries, evoked by a communicative goal. VSRs that are sufficiently activated invoke matching SMCs, leading to an instantiation of SPRs representing the corresponding visuo-spatial knowledge in linguistically shaped ways. The generators independently select modal entries and pass them on to the formulators. As in ACT-R, highly activated features or concepts are more likely to be retrieved and thus to be encoded. Note that, as activation is dynamic, feature selection depends on the time of retrieval and thus on the available resources. The message generator has to map activated concepts in the SPR onto grammatically determined categorical structures, anticipating what the speech formulator is able to process (cf. Levelt 1989). Importantly, interaction between generators and formulators in each modality can run top-down and bottom-up. For example, a proposition being encoded by the speech formulator results in reinforced activation of the concept in the SPR, and thus increased activation of associated concepts in the VSR.
As a result, semantic coordination emerges from the local choices generators and formulators take, based on the activation dynamics in multimodally linked memory representations. Redundant speech and gesture result from focused activation of supramodally linked mental representations, whereas non-redundant speech and gesture arise when activations scatter over entries not connected via SMCs.
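A minimal toy rendering of the retrieval principle assumed here is sketched below. It is our own illustration, not the authors' implementation; the decay rate, spreading weight, threshold, and entry names are invented for the example.

# Toy spreading-activation update over multimodally linked memory entries:
# activation decays each update cycle, spreads along SMC links, and entries
# above a threshold become available for encoding. All constants are made up.
DECAY, SPREAD, THRESHOLD = 0.85, 0.3, 0.5

activation = {"vsr:cake-shape": 0.9, "smc:round": 0.2, "spr:round(x)": 0.0}
links = [("vsr:cake-shape", "smc:round"), ("smc:round", "spr:round(x)")]

def update(activation, links):
    new = {k: v * DECAY for k, v in activation.items()}   # decay over time
    for a, b in links:                                     # bidirectional spreading via links
        new[b] += SPREAD * activation[a]
        new[a] += SPREAD * activation[b]
    return new

for cycle in range(3):                                     # a few memory update cycles
    activation = update(activation, links)
    available = [k for k, v in activation.items() if v >= THRESHOLD]
    print(cycle, {k: round(v, 2) for k, v in activation.items()}, available)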
Results and outlook
To quantify our modeling results we ran simulation experiments in which we manipulated the available time (in terms of memory update cycles) before the model had to come up with a sentence and a gesture (Kopp et al. 2013; Bergmann et al. 2013). We analyzed the resulting multimodal utterances with respect to semantic coordination: Supplementary (i.e., non-redundant) gestures were dominant in those runs with stricter temporal limitations, while redundant ones became more likely when the available time was increased. The model thus offers a natural account for the empirical finding that non-redundant gestures are more likely when conceptualization load is high, based on the assumption that memory-based cross-modal coordination consumes resources (memory, time), and is reduced or compromised when such resources are limited.
To enable a direct evaluation of our simulation results in comparison with empirical data, we currently conduct experiments to set up a reference data corpus. In this study, participants are engaged in a dyadic description task and we manipulate the preparation time available for utterance planning. The verbal output will subsequently be analyzed with respect to semantic coordination of speech and gestures based on a semantic feature coding approach as already applied in (Bergmann, Kopp 2006).
In ongoing work we extend the model to also account for complementary speech-gesture ensembles in which deictic expressions in speech refer to their co-speech gesture, as in "the window looks like this". To this end, we advance and refine the feedback signals provided by the behavior generators to allow for the fine-grained coordination that is necessary for the production of this kind of utterance. With this extension the model will allow us to further investigate predictions as postulated in the lexical retrieval hypothesis (Krauss, Chen and Chawla 1996; Rauscher, Krauss and Chen 1996; Krauss et al. 2000). Although that model was set up on the basis of empirical data, it was subject to much criticism based on psycholinguistic experiments and data. Data from detailed simulation experiments based on our cognitive model can provide further arguments in this debate.

References
Anderson J, Bothell D, Byrne M, Lebiere C, Qin Y (2004) An integrated theory of the mind. Psychol Rev 111(4):1036-1060
Bavelas J, Kenwood C, Johnson T, Philips B (2002) An experimental study of when and how speakers use gestures to communicate. Gesture 2(1):1-17
Bergmann K, Kahl S, Kopp S (2013) Modeling the semantic coordination of speech and gesture under cognitive and linguistic constraints. In: Aylett R, Krenn B, Pelachaud C, Shimodaira H (eds) Proceedings of the 13th international conference on intelligent virtual agents. Springer, Berlin, pp 203-216
Bergmann K, Kopp S (2006) Verbal or visual: how information is distributed across speech and gesture in spatial dialog. In: Proceedings of SemDial 2006, pp 90-97
Bergmann K, Kopp S (2009) GNetIc: using Bayesian decision networks for iconic gesture generation. In: Proceedings of IVA 2009. Springer, Berlin, pp 76-89
Biederman I (1987) Recognition-by-components: a theory of human image understanding. Psychol Rev 94:115-147
Breslow L, Harrison A, Trafton J (2010) Linguistic spatial gestures. In: Proceedings of cognitive modeling 2010, pp 13-18
Chu M, Meyer AS, Foulkes L, Kita S (2013) Individual differences in frequency and saliency of speech-accompanying gestures: the role of cognitive abilities and empathy. J Exp Psychol Gen 143(2):694-709
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychol Rev 82(6):407-428
de Ruiter J (2000) The production of gesture and speech. In: McNeill D (ed) Language and gesture. Cambridge University Press, Cambridge, pp 284-311
Hostetter A, Alibali M (2007) Raise your hand if you're spatial: relations between verbal and spatial skills and gesture production. Gesture 7:73-95
Hostetter A, Alibali M (2008) Visible embodiment: gestures as simulated action. Psychon Bull Rev 15(3):495-514
Hostetter A, Alibali M (2011) Cognitive skills and gesture-speech redundancy. Gesture 11(1):40-60
Kita S, Davies TS (2009) Competing conceptual representations trigger co-speech representational gestures. Lang Cogn Process 24(5):761-775
Kita S, Özyürek A (2003) What does cross-linguistic variation in semantic coordination of speech and gesture reveal? Evidence for an interface representation of spatial thinking and speaking. J Memory Lang 48:16-32
Kopp S, Bergmann K, Kahl S (2013) A spreading-activation model of the semantic coordination of speech and gesture. In: Proceedings of the 35th annual conference of the cognitive science society (CogSci 2013). Cognitive Science Society, Austin, pp 823-828
Krauss R, Chen Y, Chawla P (1996) Nonverbal behavior and nonverbal communication: what do conversational hand gestures tell us? Adv Exp Soc Psychol 28:389-450
Krauss R, Chen Y, Gottesman R (2000) Lexical gestures and lexical access: a process model. In: McNeill D (ed) Language and gesture. Cambridge University Press, Cambridge, pp 261-283

Lang E (1989) The semantics of dimensional designation of spatial objects. In: Bierwisch M, Lang E (eds) Dimensional adjectives: grammatical structure and conceptual interpretation. Springer, Berlin, pp 263-417
Levelt WJM (1989) Speaking: from intention to articulation. MIT Press
Marr D, Nishihara H (1978) Representation and recognition of the spatial organization of three-dimensional shapes. In: Proceedings of the Royal Society of London, vol 200, pp 269-294
McNeill D, Duncan S (2000) Growth points in thinking-for-speaking. In: Language and gesture. Cambridge University Press, Cambridge, pp 141-161
Melinger A, Kita S (2007) Conceptualisation load triggers gesture production. Lang Cogn Process 22(4):473-500
Rauscher F, Krauss R, Chen Y (1996) Gesture, speech, and lexical access: the role of lexical movements in speech production. Psychol Sci 7:226-231
Sowa T, Kopp S (2003) A cognitive model for the representation and processing of shape-related gestures. In: Proceedings of the European cognitive science conference


Towards formally well-founded heuristics in cognitive AI systems

Tarek R. Besold
Institute of Cognitive Science, University of Osnabrück, Germany

Abstract
We report on work towards the development of a framework for the application of formal methods of analysis to cognitive systems and computational models (putting special emphasis on aspects concerning the notion of heuristics in cognitive AI) and explain why this requires the development of novel theoretical methods and tools.
Keywords
Cognitive Systems, Heuristics, Complexity Theory, Approximation Theory
Heuristics in Cognitive Systems
An ever-growing number of researchers in cognitive science and cognitive psychology, starting in the 1970s with (Kahneman, Slovic and Tversky 1982)'s heuristics and biases program and today prominently heralded, for instance, by (Gigerenzer, Hertwig and Pachur 2011), argues that humans in their common-sense reasoning do not apply any full-fledged form of logical or probabilistic reasoning to possibly highly complex problems, but instead rely on heuristics as (mostly automatic and unconscious) mechanisms that allow them to circumvent the impending complexity explosion and nonetheless reach acceptable solutions to the original problems. All of these mechanisms are commonly subsumed under the general term "heuristics" and, following the paradigmatic example given by (Newell, Simon 1976)'s notion of heuristic search, under this label are also often (re)implemented in cognitive AI.[4]
Still, on theoretical grounds, from a computational point of view at least two quite different general types of approach can be imagined: Either the complexity of solving a problem can be reduced by reducing the problem instance under consideration to a simpler (but solution-equivalent) one, or the problem instance stays untouched but, instead of being perfectly (i.e., precisely) solved, is dealt with in a good enough (i.e., approximate) way. Against this background, two crucial questions arise: Which problems can actually be solved by applying heuristics? And how can the notion of heuristics be theoretically modeled on a sufficiently high level so as to allow for a general description?
In what follows we want to provide a sketch of work towards an approach to answering these questions using techniques originating from complexity theory and hardness of approximation analysis. This choice of formal methods is justified by the observation that, although computational in nature, systems as developed in cognitive AI and cognitive systems research can be considered as physical systems which need to perform their tasks in limited time and with a limited amount of space at their disposal, and thus formal computational properties (and restrictions on these) are relevant parameters.

[4] Whilst this type of work clearly has lost some of its popularity over the years, and has been replaced with efforts invested in finding answers to questions where an optimal solution can provably be achieved (although under possibly unrealistic or impractical time and/or space requirements), the study of heuristics-based approaches and techniques is still a lively field of active research, see, for example, (Bridewell & Langley, 2011; MacLellan, 2011).

Two and a Half Formal Perspectives on Heuristics in Cognitive Systems
Returning to the two different types of heuristics identified above and having a look at recent work in complexity and approximation theory, we find a natural correspondence between the outlined conceptual approaches and well-known concepts from the respective fields.
The Reduction Perspective:
Over the last years, complexity theory has turned its attention more and more towards examples of problems which have algorithms that have worst-case exponential behavior, but tend to work quite well in practice if certain parameters of the problem are restricted. This has led to the introduction of the class of fixed-parameter tractable problems FPT (see, e.g., (Downey, Fellows 1999)):
Definition 1 (FPT) A problem P is in FPT if P admits an O(f(j) n^c) algorithm, where n is the input size, j is a parameter of the input constrained to be small, c is an independent constant, and f is some computable function.
A non-trivial corollary can be derived from FPT-membership: Any instance of a problem in FPT can be reduced to a problem kernel.
Definition 2 (Kernelization) Let P be a parameterized problem. A kernelization of P is an algorithm which takes an instance x of P with parameter j and maps it in polynomial time to an instance y such that x ∈ P if and only if y ∈ P, and the size of y is bounded by f(j) (f a computable function).
Theorem 1 (Kernelizability (Downey, Fellows 1999)) A problem P is in FPT if and only if it is kernelizable.
This essentially entails that, if a positive FPT result can be obtained, then (and only then) there is a downward reduction for the underlying problem to some sort of smaller or less complex instance of the same problem, which can then be solved. Returning to the initial quest for finding a formal characterization of reduction-based heuristics, we notice that, by categorizing problems according to kernelizability, we can establish a distinction between problem classes which are solvable by the presented type of reduction and those which are not, and can thus already a priori decide whether a system implementing a mechanism based on a kernelization account generally is (un)able to solve a certain class. What remains to be shown is the connection between kernelization and the notion of reduction-based heuristics (or rather the suitability of kernelization as a conceptual characterization of the notion of reduction in the examined type of heuristics).
Either the complexity of solving a problem can be reduced by reducing of reduction in the examined type of heuristics).
the problem instance under consideration to a simpler (but solution The connection is explicated by the correspondence between FPT-
membership and kernelizability of a problem: If heuristics are to be as
4
Whilst this type of work clearly has lost some of its popularity over fast and frugal as commonly claimed, considering them anything but (at
the years, and has been replaced with efforts invested in finding worst) polynomial-time bounded processes seems questionable. But
answers to questions where an optimal solution can provably be now, if the reduced problem shall be solvable under resource-critical
achieved (although under possibly unrealistic or impractical time and/ conditions, using the line of argument introduced above, we can just
or space requirements), the study of heuristics-based approaches and hope for it to be in FPT. Finally, combining the FPT-membership of the
techniques still are a lively field of active research, see, for example, reduced problem with the polynomial-time complexity of the reduction
((Bridewell & Langley, 2011; MacLellan, 2011)). process (i.e., the presumed heuristics), already the original problem had

to be fixed-parameter tractable. This should not come as a surprise, as the contrary (i.e., a heuristics reducing the overall complexity of solving a superpolynomial problem to polynomial-time computation by means of a reduction of the original problem within the same class) would contradict the class membership of the original problem and thus break the class hierarchy (assuming P ≠ NP). Still, kernelization-based heuristics are not trivialized by these considerations: Although original and reduced problem are in FPT, the respective size of the parameters may still differ between instances (making an important difference in application scenarios for implemented systems).
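To make the reduction-based reading concrete, the following is a minimal, hypothetical sketch (my illustration, not taken from the abstract) of a kernelization in the spirit described above, using the classic Buss rule for k-Vertex Cover; the function name and data layout are assumptions.

# Hedged illustration: Buss-style kernelization for k-Vertex Cover.
# Any vertex with degree > k must be in every cover of size <= k; after
# forcing such vertices, a yes-instance has at most k^2 edges left,
# i.e., a kernel whose size depends on the parameter k alone.

def kernelize_vertex_cover(edges, k):
    """Return (reduced_edges, forced_vertices, k_remaining) or None if no cover of size <= k exists."""
    edges = {frozenset(e) for e in edges}
    forced = set()
    changed = True
    while changed and k >= 0:
        changed = False
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        for v, d in degree.items():
            if d > k:                       # v must be in every small cover
                forced.add(v)
                edges = {e for e in edges if v not in e}
                k -= 1
                changed = True
                break
    if k < 0 or len(edges) > k * k:         # kernel size bound violated: no solution
        return None
    return edges, forced, k

On a yes-instance the residual kernel has at most k^2 edges, so an exact search on the kernel stays bounded by a function of k alone, which is exactly the FPT pattern the argument above appeals to.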
The Approximation Perspective:
The second perspective on heuristics uses approximation algorithms: Instead of precisely solving a kernel as proposed by reduction-based heuristics, we try to compute an approximate solution to the original problem (i.e., the solution to a relaxed problem). The idea is not any more to perfectly solve the problem (or an equivalent instance of the same class), but to instead solve the problem to some satisfactory degree.
A possible analog to FPT in the Tractable AGI thesis is APX, the class of problems allowing polynomial-time approximation algorithms:
Definition 3 (APX) An optimization problem P is in APX if P admits a constant factor approximation algorithm, i.e., there is a constant factor ε > 0 and an algorithm which takes an instance of P of size n and, in time polynomial in n, produces a solution that is within a factor 1 + ε of being optimal (or 1 − ε for maximization problems).
This notion in practice crucially depends on the bounding constant for the approximation ratio: If the former is meaningfully chosen with respect to the problem, constant-factor approximation allows for quantifying the "good enough" aspect of the solution and, thus, might even offer a way of modeling the notion of satisficing introduced by (Simon 1956) (which in turn is central to many heuristics considered in cognitive science and psychology, providing additional empirical grounding for the computational systems in cognitive AI).
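As a concrete, illustrative instance of a constant-factor approximation in the sense of Definition 3 (my example, not from the abstract): the textbook maximal-matching algorithm for minimum Vertex Cover runs in polynomial time and never returns more than twice the optimum.

# Hedged illustration: 2-approximation for minimum Vertex Cover.
# Greedily pick an uncovered edge and take both endpoints; the chosen
# edges form a matching, and any optimal cover must contain at least
# one endpoint per matched edge, so |cover| <= 2 * OPT.

def vertex_cover_2approx(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Example: the path a-b-c-d. The approximation returns {a, b, c, d} (size 4),
# while an optimal cover is {b, c} (size 2), respecting the factor-2 bound.
print(vertex_cover_2approx([("a", "b"), ("b", "c"), ("c", "d")]))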
Joining Perspectives:
What if the system architect, instead of deciding whether to solve a certain type of task applying one of the two types of heuristic and then conducting the respective analysis, just wants to directly check whether the problem at hand might be solvable by any of the two paradigms? Luckily, FPT and APX can be integrated via the concept of fixed-parameter approximability and the corresponding problem class FPA:
Definition 4 (FPA) The fixed-parameter version P of a minimization problem is in FPA if, for a recursive function f, a constant c, and some fixed recursive function g, there exists an algorithm such that, for any given problem instance I with parameter k and question OPT(I) ≤ k, the algorithm runs in O(f(k)·n^c) (where n = |I|) and either outputs "no" or produces a solution of cost at most g(k).
As shown by (Cai, Huang 2006), both polynomial-time approximability and fixed-parameter tractability with witness (Cai, Chen 1997) independently imply the more general fixed-parameter approximability. And also on the interpretation level FPA artlessly combines both views of heuristics, at a time in its approximability character accommodating for the notion of satisficing and in its fixed-parameter character accounting for the possibility of complexity reduction by kernelizing whilst keeping key parameters of the problem fixed.
The Wrong Type of Approximation?
Approximation-based heuristics have been introduced as solution procedures for a problem producing solutions which are not optimal but (at least when using a standard like the proposed APX) fall within a certain defined neighborhood of the optimal one. Here, the degree of optimality of a solution is measured in terms of proximity of the solution's value to the optimal value for the optimization problem at hand. But this is not the only possible way of conceptualizing approximation: What if emphasis would be put on finding a solution which is structurally as similar as possible to the original one, so what if the quality of approximation would be measured in similarity of structure instead of proximity of values?
At first sight this seems to be either a trivial issue, or not an issue at all, depending on whether it is assumed that value similarity and structural similarity coincide, or it is decided that structure is not of interest. Still, we believe that dismissing the issue this easily would be ill-advised: Especially in the context of cognitive systems and high-level AI, in many cases the structure of a problem's solution can be of great relevance. As an example consider a cognitive system built for maintaining a network of maximally coherent beliefs about complex domains as, e.g., presented by (Thagard 2000). Whilst, for instance, (Millgram 2000) has shown that the required form of maximal coherence over this type of network in its full form is NP-hard, (Thagard, Verbeurght 1998) proposed several (value-based) approximation algorithms. Still, a mere value-based approximation scheme does not yield the desired results: As also demonstrated by (Millgram 2000), two belief assignments can be arbitrarily close in coherence value and at the same time still arbitrarily far from each other in terms of which beliefs are accepted and which are rejected.
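A toy illustration of that last point (my construction, not from the cited papers): with purely positive pairwise constraints, the all-accept and all-reject assignments satisfy exactly the same constraints, so their coherence values are identical while they disagree on every single belief.

# Hedged toy example: identical coherence value, maximal structural distance.
# Coherence here is simply the number of satisfied "p and q go together" constraints.

constraints = [("a", "b"), ("b", "c"), ("c", "d")]   # positive (agreement) constraints only

def coherence(assignment):
    return sum(assignment[p] == assignment[q] for p, q in constraints)

all_accept = {"a": True, "b": True, "c": True, "d": True}
all_reject = {"a": False, "b": False, "c": False, "d": False}

print(coherence(all_accept), coherence(all_reject))              # 3 3 -> same coherence value
print(sum(all_accept[x] != all_reject[x] for x in all_accept))   # 4   -> differ on every belief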
Unfortunately, whilst our knowledge and command of value-based approximation has greatly developed over the last decades, structure-based approximation has rarely been studied. (Hamilton, Muller, van Rooij and Wareham 2007) present initial ideas and define basic notions possibly forming the foundations of a formal framework for structure-based approximation. And although these are still only very first steps towards a complete and well-studied theory, the presented concepts already allow for several important observations. The most relevant for the introduced cognitive systems setting is the following: Value approximation and structural approximation are distinct in general, and whilst very careful use of the tools of value-based approximation might partially mitigate this divergence (the most naive ad-hoc remedy being the use of problem-specific and highly non-generalizable optimization functions which also take into account some basic form of structural similarity and not only outcome values of solutions), it cannot be assumed in general that both notions coincide in a meaningful way.
Future Work
In the long run we therefore want to develop the presented roots into an overall framework addressing empirically-inspired aspects of cognitive systems in general. Also, in parallel to the corresponding theoretical work, we want to put emphasis on showing the usefulness and applicability of the proposed methods in different prototypical examples from relevant fields (such as, for example, models of epistemic reasoning and interaction, cognitive systems in general problem-solving, or models for particular cognitive capacities), allowing for a mutually informed development process between foundational theoretical work and application studies.

Acknowledgments
I owe an ever-growing debt of gratitude to Robert Robere (University of Toronto) for introducing me to the fields of parameterized complexity theory and approximation theory, reliably providing me with theoretical/technical backup and serving as a willing partner for feedback and discussion.

References
Bridewell W, Langley P (2011) A computational account of everyday abductive inference. In: Proceedings of the 33rd annual meeting of the cognitive science society, pp 2289–2294
Cai L, Chen J (1997) On fixed-parameter tractability and approximability of NP optimization problems. J Comput Syst Sci 54(3):465–474
Cai L, Huang X (2006) Fixed-parameter approximation: conceptual framework and approximability results. In: Bodlaender H, Langston M (eds) Parameterized and exact computation. Springer, pp 96–108
Downey RG, Fellows MR (1999) Parameterized complexity. Springer
Gigerenzer G, Hertwig R, Pachur T (eds) (2011) Heuristics: the foundation of adaptive behavior. Oxford University Press
Hamilton M, Muller M, van Rooij I, Wareham T (2007) Approximating solution structure. In: Dagstuhl seminar proceedings Nr. 07281. IBFI, Schloss Dagstuhl
Kahneman D, Slovic P, Tversky A (1982) Judgment under uncertainty: heuristics and biases. Cambridge University Press
MacLellan C (2011) An elaboration account of insight. In: AAAI fall symposium: advances in cognitive systems
Millgram E (2000) Coherence: the price of the ticket. J Philos 97:82–93
Newell A, Simon HA (1976) Computer science as empirical inquiry: symbols and search. Commun ACM 19(3):113–126
Simon HA (1956) Rational choice and the structure of the environment. Psychol Rev 63:129–138
Thagard P (2000) Coherence in thought and action. The MIT Press
Thagard P, Verbeurght K (1998) Coherence as constraint satisfaction. Cogn Sci 22:1–24


Action planning is based on musical syntax in expert pianists: ERP evidence

Roberta Bianco1, Giacomo Novembre2, Peter Keller2, Angela Friederici1, Arno Villringer1, Daniela Sammler1
1 Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany; 2 MARCS Institute, University of Western Sydney, Australia

Action planning of temporally ordered elements within a coherent structure is a key element in communication. The specifically human ability of the brain to variably combine discrete meaningful units into rule-based hierarchical structures is what is referred to as syntactic processing and has been defined as a core aspect of language and communication (Friederici 2011; Hauser et al. 2002; Lashley 1952). While similarities in the syntactic organization of language and Western tonal music have been increasingly consolidated (Katz, Pesetsky 2011; Patel 2003; Rohrmeier, Koelsch 2012), analogies with the domain of action, in terms of hierarchical and combinatorial organization (Fitch, Martins 2014; Pastra, Aloimonos 2012; Pulvermuller 2014), remain conceptually controversial (Moro 2014). To investigate the syntax of actions, piano performance based on tonal music is an ideal substrate. First, playing chord progressions is the direct motoric translation of musical syntax, a theoretically established hierarchical system of rules governing music structure (Rohrmeier 2011). Second, it gives the possibility to investigate action planning at different levels of the action hierarchy, from lower immediate levels of movement selection to higher levels of distal goals (Grafton, Hamilton 2007; Haggard 2008; Uithol et al. 2012). Finally, it offers the perspective to investigate the influence of expertise on the relative weighting of different action features (i.e., goal and manner) in motor programming (Palmer, Meyer 2000; Wohlschlager et al. 2003).
Novembre, Keller (2011) and Sammler et al. (2013) have shown that expert pianists, during intense practice, might motorically learn syntactic regularities governing musical sequences and therefore generate motor predictions based on their acquired long-term syntactic knowledge. In a priming paradigm, pianists were asked to imitate on a mute piano silent videos of a hand playing chord sequences. The last chord was either syntactically congruent or incongruent with the preceding musical context. Despite the absence of sounds, the authors found slower imitation times for syntactically incongruent chords as well as motor facilitation (i.e. faster responses) for the syntactically congruent chords. In the ERPs (Sammler et al. 2013), the imitation of the incongruent chord elicited a late posterior negativity, an index of reprogramming of an anticipated motor act (Leuthold, Jentzsch 2002) primed by the syntactic structure of the musical sequence (i.e. the musical goal). In line with models of incremental planning of serial actions (Palmer, Pfordresher 2003), these findings suggest that the notion of syntax translates to a grammar of musical action in expert pianists.
According to the notion of goal priority over the means in the action hierarchy (Bekkering et al. 2000; Grafton 2009; Wohlschlager et al. 2003), in musical motor acts the musical goal determined by the context (Syntax) should take priority over the specific movement selection adopted for the execution (Manner), especially at advanced skill levels (Novembre, Keller 2011; Palmer, Meyer 2000). However, through intensive musical training, frequently occurring musical patterns (i.e., scales, chord progressions) may have become codified with some fixed matching fingering configuration (Gellrich, Parncutt 2008; Sloboda et al. 1998). Thus, from this perspective, it may also be that motor pattern familiarity has a role in motor predictions during the execution of common chord progressions. To what extent motor predictive mechanisms operate at the level of musical syntax or arise due to motor pattern familiarity will be addressed here. Whether a progressively more syntax-based motor control independent of the manner correlates with expertise will also be discussed.
To this end, we asked pianists to watch and simultaneously execute on a mute piano chord progressions played by a performing pianist's hand presented in a series of pictures on a screen. To negate exogenously driven auditory predictive processes, no sound was used. To explore the effect of expertise on syntax-based predictions, pianists ranging from 12 to 27 years of experience were tested behaviorally and with electroencephalography (EEG). To induce different strengths of predictions, we used 5-chord or 2-chord sequences (long/short Context) presenting the target chord in the last position. In a 2 x 2 factorial design, we manipulated the target chord of the sequences in terms of keys (Syntax congruent/incongruent), to violate the syntactic structure of the sequence, and in terms of fingering (Manner correct/incorrect), to violate the motor familiarity. Crucially, the manipulation of the manner, while keeping the syntax congruent, allowed us to dissociate behavioral and neural patterns elicited by the execution of either the violation of the syntactic structure of the sequence (Syntax) or a general violation of familiar movements (Manner). Additionally, the 2 x 2 factorial design permitted us to investigate syntax-related mechanisms on top of the concurrent manner violation in order to test whether in motor programming high levels of syntactic operations are prioritized over mechanisms of movement parameter specification.
We hypothesized that, if motor predictions during the execution of musical chord sequences are driven by musical syntax rather than motor pattern familiarity, then the violation of the Syntax should evoke specific behavioral and electrophysiological patterns, different from those related to the Manner. Also, we expected to observe syntax-based prediction effects irrespective of the fingering used to play, thus even in the presence of the concurrent manner violation. Finally, if at advanced skill levels the more abstract musical motor goals increase weighting in motor programming, we expected to observe a positive dependency between the strength of syntax-based prediction and expertise levels.
We found that the production of syntactically incongruent compared to congruent chords showed a response delay that was larger in the long compared to the short context and that was accompanied by the presence of a central posterior negativity (520–800 ms) in the long and not in the short context. Conversely, the execution of the unconventional manner was not delayed as a function of Context, and elicited an opposite electrophysiological pattern (a posterior positivity between 520 and 800 ms). Hence, while the effects associated with the Syntax might reflect a signal of movement reprogramming of a prepotent response in face of the incongruity to be executed (Leuthold, Jentzsch 2002; Sammler et al. 2013), the effects associated with the Manner were stimulus- rather than response-related and might reflect the perceptual surprise
(Polich 2007) of the salient fingering manipulation, recognized by the pianists as an obvious target manipulation. Finally, syntax-related effects held when only considering the manner-incorrect trials, and their context dependency was sharper with increasing expertise level (computed as cumulated training hours across all years of piano playing). This suggests that syntactic mechanisms take priority over movement specifications, especially in more expert pianists, who are more affected by the priming effect of the contextual syntactic structure.
Taken together, these findings indicate that, given a contextual musical structure, motor plans for the distal musical goal are generated coherently with the context and take precedence over those underlying specific, immediate movement selection. Moreover, the increase of syntax-based motor control with expertise might hint at action planning based on musical syntax as a slowly acquired skill built on top of the acquisition of motor flexibility. More generally, this finding indicates that, similarly to music perception, music production too relies on generative syntactic rules.

References
Bekkering H, Wohlschlager A, Gattis M (2000) Imitation of gestures in children is goal-directed. Quart J Exp Psychol Human Exp Psychol 53(1):153–64. doi:10.1080/713755872
Fitch WT, Martins MD (2014) Hierarchical processing in music, language, and action: Lashley revisited. Ann N Y Acad Sci 118. doi:10.1111/nyas.12406
Friederici AD (2011) The brain basis of language processing: from structure to function. Physiol Rev 91(4):1357–1392. doi:10.1152/physrev.00006.2011
Gellrich M, Parncutt R (2008) Piano technique and fingering in the eighteenth and nineteenth centuries: bringing a forgotten method back to life. Br J Music Educ 15(01):5–23. doi:10.1017/S0265051700003739
Grafton ST, Hamilton AFDC (2007) Evidence for a distributed hierarchy of action representation in the brain. Human Movement Sci 26(4):590–616. doi:10.1016/j.humov.2007.05.009
Haggard P (2008) Human volition: towards a neuroscience of will. Nature Rev Neurosci 9(12):934–46. doi:10.1038/nrn2497
Hauser MD, Chomsky N, Fitch WT (2002) The faculty of language: what is it, who has it, and how did it evolve? Science (New York, N.Y.) 298(5598):1569–1579. doi:10.1126/science.298.5598.1569
Katz J, Pesetsky D (2011) The Identity Thesis for Language and Music (January)
Lashley K (1952) The problem of serial order in behavior. In: Jeffress LA (ed) Cerebral mechanisms in behavior. Wiley, New York, pp 112–131
Leuthold H, Jentzsch I (2002) Spatiotemporal source localisation reveals involvement of medial premotor areas in movement reprogramming. Exp Brain Res 144(2):178–88. doi:10.1007/s00221-002-1043-7
Moro A (2014) On the similarity between syntax and actions. Trends Cogn Sci 18(3):109–10. doi:10.1016/j.tics.2013.11.006
Novembre G, Keller PE (2011) A grammar of action generates predictions in skilled musicians. Conscious Cogn 20(4):1232–1243. doi:10.1016/j.concog.2011.03.009
Palmer C, Meyer RK (2000) Conceptual and motor learning in music performance. Psychol Sci 11(1):63–68. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11228845
Palmer C, Pfordresher PQ (2003) Incremental planning in sequence production. Psychol Rev 110(4):683–712. doi:10.1037/0033-295X.110.4.683
Pastra K, Aloimonos Y (2012) The minimalist grammar of action. Philos Trans R Soc Lond Ser B Biol Sci 367(1585):103–117. doi:10.1098/rstb.2011.0123
Polich J (2007) Updating P300: an integrative theory of P3a and P3b. Clin Neurophysiol 118(10):2128–2148. doi:10.1016/j.clinph.2007.04.019
Pulvermuller F (2014) The syntax of action. Trends Cogn Sci 18(5):219–220. doi:10.1016/j.tics.2014.01.001
Rohrmeier M (2011) Towards a generative syntax of tonal harmony. J Math Music 5(1):35–53. doi:10.1080/17459737.2011.573676
Rohrmeier M, Koelsch S (2012) Predictive information processing in music cognition. A critical review. Int J Psychophysiol 83(2):164–175. doi:10.1016/j.ijpsycho.2011.12.010
Sammler D, Novembre G, Koelsch S, Keller PE (2013) Syntax in a pianist's hand: ERP signatures of embodied syntax processing in music. Cortex 49(5):1325–1339. doi:10.1016/j.cortex.2012.06.007
Sloboda JA, Clarke EF, Parncutt R, Raekallio M (1998) Determinants of finger choice in piano sight-reading. J Exp Psychol Human Percept Performance 24(1):185–203. doi:10.1037//0096-1523.24.1.185
Uithol S, van Rooij I, Bekkering H, Haselager P (2012) Hierarchies in action and motor control. J Cogn Neurosci 24(5):1077–1086. doi:10.1162/jocn_a_00204
Wohlschlager A, Gattis M, Bekkering H (2003) Action generation and action perception in imitation: an instance of the ideomotor principle. Philos Trans R Soc Lond Ser B Biol Sci 358(1431):501–515. doi:10.1098/rstb.2002.1257


Motor learning in dance using different modalities: visual vs. verbal models

Bettina Blasing1, Jenny Coogan2, Jose Biondi2, Liane Simmel3, Thomas Schack1
1 Neurocognition and Action Research Group & Center of Excellence Cognitive Interaction Technology (CITEC), Bielefeld University, Germany; 2 Palucca Hochschule fur Tanz Dresden, Germany; 3 tamed Tanzmedizin Deutschland e.V., Fit for Dance Praxis und Institut fur Tanzmedizin, Munchen, Germany

Keywords
Motor learning, Observation, Visual model, Verbal instruction, Dance

Introduction
Observational learning is viewed as the major mode of motor learning (Hodges et al. 2007). Empirical evidence shows that observational learning primarily takes place in an implicit way, by activating shared neural correlates of movement execution, observation and simulation (Jeannerod 2004; Cross et al. 2006, 2009). It has been shown that the use of language (in terms of verbal cues) can facilitate or enhance motor learning by guiding attention towards relevant features of the movement and making these aspects explicit (see Wulf and Prinz 2001). In dance training (and other movement disciplines), observational learning from a visual model is most commonly applied, and is often supported by verbal cue-giving. Evidence from practice suggests that explicit verbal instructions and movement descriptions play a major role in movement learning by supporting the understanding, internalizing and simulating of movement phrases. In modern and contemporary dance, however, choreographers often do not expect the dancers to simply reproduce movement phrases in adequate form, but to develop movement material on their own, in accordance with a given idea, description or instruction, aiming at a more personal expression and higher artistic quality of the developed movement material.
In this study, we investigate dancers' learning of movement phrases based on the exclusive and complementary use of visual model observation and verbal instruction (movement description).
Dance students learned comparable movement material via two different modes: via observation of a model and via listening to a verbal movement description (as an example, part of a model sequence is displayed in Fig. 1). In a second step, the complementary mode was added. After both learning steps, the students' performance of the learned movement phrases was recorded and rated by independent experts. A retention test was applied to evaluate long-term effects of the learning processes. We expected the dance students to learn successfully from the visual model, their most commonly practiced mode of movement learning. From the verbal instruction, we expected that performed movement phrases would vary more strongly, but could possibly be performed with more artistic quality. We also expected performance after the second learning step to be improved compared to the first learning step in both conditions.
Method
Learning task: Eighteen students (age: 18.4 ± 1.0 years, 11 female) from the BA Dance study program at the Palucca Hochschule fur Tanz Dresden learned two dance phrases of similar length (approx. 30 s) and complexity, one via visual observation of a demonstration video, the other one via a recorded verbal description (see Fig. 1). In a first learning step (Step 1), one of the dance phrases was presented five times either visually (video) or verbally (audio), and the participant was instructed to learn it by watching or listening, and by marking movements as required. After a short practice, the participant performed the learned dance phrase while being recorded on video. In a second learning step (Step 2), the participant was twice presented the same dance phrase in the complementary presentation mode (i.e., video for the verbally learned phrase and vice versa), and the performance was recorded again. The other dance phrase was then learned and performed using the same procedure, but was presented in the remaining learning mode (verbal or visual) in Step 1, complemented by the other mode in Step 2. The order of the dance phrases (Phrase 1, Phrase 2) and of the initial learning modes (visual, verbal) was balanced between the participants (the experimental design of the study is illustrated in Table 1). The experimental procedure took place in a biomechanics laboratory and lasted approximately one hour for each participant. In addition to the evaluation of the recorded performances, questionnaires and psychometric tests were applied to investigate the students' learning success and their personal impressions of the different learning processes.
Expert ratings of the reproduced material: Two independent experts rated the recorded performance trials from the recorded and cut video clips, one of each demonstration condition (visual, visual + verbal, verbal, verbal + visual). The experts rated each of the recorded performances by filling out a questionnaire consisting of six-point Likert-scale type questions assigned to two categories, accordance with the model (AM; 10 questions) and artistic performance quality (PQ; 5 questions). For each category of questions, ratings were averaged to achieve general measures for the main criteria AM and PQ. Each expert independently watched the recordings of the students' performances and marked one answer for each question, without knowing the learning condition of the recorded performance. Non-parametric tests (Wilcoxon signed-rank, Mann-Whitney U) were used to compare the averaged ratings of the two experts for the different conditions (visual, visual + verbal, verbal, verbal + visual) within each criterion (AM, PQ) and for the two criteria within each demonstration condition.
Retention test: Thirteen of the dance students (8 female) participated in a retention test that was carried out 10–13 days after the experimental learning task. The retention test included the video-recorded performance of the remembered movement material, psychometric tests and questionnaires. In the performance part of the test, each student was asked to perform both dance phrases as completely as possible. Students were allowed to practice for several minutes before being recorded, but were not given any assistance in reproducing the phrases. Each student was recorded individually and on his/her own in a separate dance studio. The video recordings of the students' performance in the retention test were annotated for the completeness of the phrases by two annotators. Each phrase was segmented into eleven partial phrases, or elements, of similar content (note that the phrases had been choreographed to resemble each other in complexity, duration and structure). The annotators independently watched the recordings and marked the completeness of each of the eleven elements as a value between 0 and 1 (0: the element was not danced at all, or was not recognizable; 1: the element was clearly recognizable and was performed without error); ratings of the two annotators were then averaged. Each student thereby received for each of the two phrases a value between 0 (no partial phrase was reproduced at all) and 11 (all partial phrases were reproduced perfectly). Non-parametric tests (Wilcoxon signed-rank, Mann-Whitney U) were used to compare averaged completeness scores between dance phrases (Phrase 1, Phrase 2) and learning modes (visual first, verbal first).
Results
Expert ratings: Ratings of the two experts were positively correlated for both criteria, AM (r = 0.528; p < .001) and PQ (r = 0.513; p < .001). After Step 1, ratings of PQ were significantly better than ratings for AM (visual: 3.82 vs. 3.33; Z = -2.987, p = .003; verbal: 3.73 vs. 2.69; Z = -3.529, p < .001), whereas ratings did not differ after Step 2. AM ratings after learning only from verbal description were lower (2.69) than after all other conditions (verbal + visual: 3.48, Z = -3.724, p < .001; visual: 3.33, Z = -3.624, p < .001; visual + verbal: 3.65, Z = -3.682, p < .001), and AM ratings after visual + verbal learning were higher than after visual learning (Z = -2.573, p = .01). PQ ratings did not differ for any of the learning conditions.

Fig. 1 Images illustrating approximately two-thirds of Phrase 1, choreographed by Jenny Coogan and performed by Robin Jung. The phrase was presented as a video of 26 s and as an audio recording of a verbal description (speaker: Alex Simkins). Phrase 2, choreographed by Jose Biondi, was of similar length and complexity and contained similar movement elements as Phrase 1, and was performed and spoken by the same dancer and speaker in the video and audio recording, respectively. The verbal description of the dance sequence shown in the pictures reads as follows: "Stand facing the front left diagonal of the room in first position. At the same time extend your left leg forward and your two arms sideways to the horizontal. Allow your right hand to continue moving until it arrives to a high diagonal. Gradually let the shape melt back into its beginning position as you shift your weight into the right hip, bending both knees, sinking your head to the left to make a big C-curve. Continue into falling, then catch the weight with a step of the left leg crossing to the right. Follow with two steps sideward, in the same direction, while throwing both arms in front of your shoulders. Keeping your arms close to you, spiral to the right diagonal, then kick your right leg, left arm and head forward as you throw your right arm behind you. Bring the energy back into you quickly, bending both elbows and the right knee close to the body, spine vertical. Drop your arms and take a step back onto your right leg, turning fully around while dragging your left leg behind you. Finish with the weight low, left leg behind, spine rounded forward, arms wrapped around the body, right arm front, left arm back. Stretch your legs and gradually lengthen your spine horizontally. Allow your arms to follow the succession of your spine, right front, left back."
Table 1 Experimental design of the learning task

Learning task               | Group 1a (N = 4)       | Group 2a (N = 4)       | Group 2b (N = 5)       | Group 1b (N = 5)
Pre-test                    | questionnaires (all groups)
Step 1                      | Phrase 1: Verbal (5x)  | Phrase 1: Visual (5x)  | Phrase 2: Verbal (5x)  | Phrase 2: Visual (5x)
Step 2                      | + Visual (2x)          | + Verbal (2x)          | + Visual (2x)          | + Verbal (2x)
Performance                 | Record 1–3x            | Record 1–3x            | Record 1–3x            | Record 1–3x
Step 1                      | Phrase 2: Visual (5x)  | Phrase 2: Verbal (5x)  | Phrase 1: Visual (5x)  | Phrase 1: Verbal (5x)
Step 2                      | + Verbal (2x)          | + Visual (2x)          | + Verbal (2x)          | + Visual (2x)
Performance                 | Record 1–3x            | Record 1–3x            | Record 1–3x            | Record 1–3x
Post-test                   | questionnaire, psychometric tests, interview (all groups)
Retention                   | N = 3                  | N = 4                  | N = 4                  | N = 2
Performance (Phrases 1, 2)  | Record 1x              | Record 1x              | Record 1x              | Record 1x
Retention                   | questionnaire, psychometric tests (all groups)

Step 1, 2: successive learning steps; Phrase 1, 2: movement material; visual, verbal: demonstration mode; Performance: video-recorded performance of the learned dance phrase
Retention test: Completeness scores given by the two annotators were highly correlated for both sequences (Phrase 1: r = 0.942, p < .001; Phrase 2: r = 0.930, p < .001). No differences were found between the groups (Group 1: Phrase 1 verbal first, N = 5; Group 2: Phrase 1 visual first, N = 8) in general, and no differences were found between the two sequences (Phrase 1: 7.64; Phrase 2: 6.90). Scores were better for the first visually learned phrase (8.32) than for the phrase first learned from verbal description (6.23) (Z = -1.992, p = .046). When the sequences were regarded separately, groups differed for Phrase 2 (Group 1: 9.17; Group 2: 5.48), but not for Phrase 1 (Group 1: 7.42; Group 2: 7.78), with Group 1 performing better than Group 2 (Z = -2.196, p = .028) (see Fig. 2). When comparing ratings for the individual elements (1 to 11), primacy effects were found for both dance phrases, in terms of higher scores for the first 3 and 2 parts in Phrase 1 and Phrase 2, respectively (Phrase 1: element 1 differed from 6, 7, 8, 9 and 11; 2 differed from 5, 6, 7, 8; 3 differed from 4, 5, 6, 7, 8, 9 and 10; Phrase 2: 1 differed from 3, 4, 5, 6, 7, 8, 9, 10 and 11; 2 differed from 4, 7, 9 and 10; all p ≤ .05).

Fig. 2 Left: Mean expert ratings of students' performance for accordance with the model (AM; dark grey columns) and performance quality (PQ; light grey columns) after learning from one (visual, verbal) and two (visual + verbal, verbal + visual) modalities (ratings for both dance phrases are pooled). Right: completeness scores for students' performance in the retention test for Phrases 1 and 2; dark grey columns: Group 1 (Phrase 1 verbal, verbal + visual; Phrase 2 visual, visual + verbal); light grey columns: Group 2 (Phrase 1 visual, visual + verbal; Phrase 2 verbal, verbal + visual)
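For illustration, a minimal sketch of the completeness scoring and non-parametric comparisons described above, under assumed data layouts; the array contents and group labels are invented placeholders, not the study's data or analysis code.

# Minimal sketch of the scoring and comparison pipeline (placeholder data only).
import numpy as np
from scipy.stats import wilcoxon, mannwhitneyu

# completeness ratings: two annotators x 13 students x 11 elements, each value in [0, 1]
rng = np.random.default_rng(0)
annotator_a = rng.uniform(0, 1, size=(13, 11))
annotator_b = rng.uniform(0, 1, size=(13, 11))

# average the two annotators, then sum over the 11 elements -> score in [0, 11] per student
scores_phrase1 = ((annotator_a + annotator_b) / 2).sum(axis=1)
scores_phrase2 = ((annotator_a[::-1] + annotator_b[::-1]) / 2).sum(axis=1)  # placeholder second phrase

# within-subject comparison of the two phrases (Wilcoxon signed-rank)
print(wilcoxon(scores_phrase1, scores_phrase2))

# between-group comparison (visual-first vs. verbal-first; Mann-Whitney U)
group = np.array([0] * 8 + [1] * 5)          # placeholder group labels
print(mannwhitneyu(scores_phrase1[group == 0], scores_phrase1[group == 1]))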
Discussion
Interdisciplinary projects linking dance and neurocognitive research have recently come to increasing awareness in artistic and scientific communities (see Blasing et al. 2012; Sevdalis, Keller 2011). The presented project on observational (implicit) and verbal (explicit) movement learning in dance has been developed within an interdisciplinary network (Dance engaging Science; The Forsythe Company | Motion Bank), motivated by scientific, artistic and (dance-)pedagogical questions. We compared expert ratings for the recorded performance of two different movement phrases in 18 dance students who had learned one phrase initially via verbal description and the other one via observation of a video model. After dancing the phrase and being recorded, students received the complementary modality to learn from, and were recorded performing again. Ratings for performance quality were better than ratings for model reproduction after the first learning step (one modality), but not after the second learning step (two modalities). After learning from only one modality, ratings for accordance with the model were better if the first learning modality was visual rather than verbal, whereas ratings for performance quality did not differ for visual vs. verbal learning. When the students had to reproduce the learned movement material in a retention test, the (initially) visually learned material was reproduced more completely than the verbally learned material; however, when the dance phrases were regarded separately, this result was only significant for one of the phrases. The results corroborate findings regarding observational learning of movements in dance and other disciplines or tasks, but also suggest a dissociation between the exact execution of a model phrase and the artistic quality of dance, even in the learning phase. As expected, accordance with the model phrases was stronger after visual learning and after two compared to one modalities (which might as well have been influenced by the additional practice, as this was always the second learning step). Regarding artistic quality of performance, the students danced the newly learned material after learning from verbal description as well as after learning from visual observation, but not better, as we had expected. Questionnaires and psychometric tests are currently being analyzed to complement the reported findings of this study. We expect the outcomes to contribute to our understanding of explicit and implicit motor learning on the basis of different modalities, and also to yield potential implications for teaching and training in dance-related disciplines. While explicit learning (via verbal instruction) and implicit learning (via observation and practice) have been found to work synergistically in skilled motor action (Taylor and Ivry 2013), the situation might be different for dance and potentially for dance-like movement in general (see Schachner and Carey 2013), in which skilful movement execution largely depends on kinesthetic awareness; further research is needed at this point. Further implications could be derived for
learning in general, specifically regarding the potential benefit of combining different modes (or modalities) for conveying information in order to shape and optimize learning success.

References
Blasing B, Calvo-Merino B, Cross ES, Jola C, Honisch J, Stevens CJ (2012) Neurocognitive control in dance perception and performance. Acta Psychol 139:300–308
Cross ES, Hamilton AF, Grafton ST (2006) Building a motor simulation de novo: observation of dance by dancers. NeuroImage 31:1257–1267
Cross ES, Kraemer DJ, Hamilton AF, Kelley WM, Grafton ST (2009) Sensitivity of the action observation network to physical and observational learning. Cereb Cortex 19:315–326
Hodges NJ, Williams AM, Hayes SJ, Breslin G (2007) What is modelled during observational learning? J Sport Sci 25:531–545
Jeannerod M (2004) Actions from within. Int J Sport Exercise Psychol 2:376–402
Schachner A, Carey S (2013) Reasoning about irrational actions: when intentional movements cannot be explained, the movements themselves are seen as the goal. Cognition 129:309–327
Sevdalis V, Keller PE (2011) Captured by motion: dance, action understanding, and social cognition. Brain Cogn 77:231–236
Taylor JA, Ivry RB (2013) Implicit and explicit processes in motor learning. Action Sci:63–87
Wulf G, Prinz W (2001) Directing attention to movement effects enhances learning: a review. Psychon B Rev 8:648–660


A frontotemporoparietal network common to initiating and responding to joint attention bids

Nathan Caruana, Jon Brock, Alexandra Woolgar
ARC Centre of Excellence in Cognition and its Disorders, Department of Cognitive Science, Macquarie University, Sydney, Australia

Joint attention is the ability to interactively coordinate attention with another person to objects of mutual interest, and is a fundamental component of daily interpersonal relationships and communication. According to the Parallel Distributed Processing model (PDPM; Mundy, Newell 2007), responding to joint attention bids (RJA) is supported by posterior-parietal cortical regions, while initiating joint attention (IJA) involves frontal regions. Although the model emphasizes their functional and developmental divergence, it also suggests that the integration of frontal and posterior-parietal networks is crucial for the emergence of complex joint attention behavior, allowing individuals to represent their own attentional perspective as well as the attentional focus of their social partner in parallel. However, little is known about the neural basis of these parallel joint attention processes, due to a lack of ecologically valid paradigms.
In the present study, we used functional magnetic resonance imaging to directly test the claims of the PDPM. Thirteen subjects (9 male, Mage = 24.85, SD = 5.65) were scanned as they engaged with an avatar whom they believed was operated by another person outside the scanner, but was in fact controlled by a gaze-contingent computer algorithm. The task involved catching a burglar who was hiding inside one of six houses displayed on the screen. Each trial began with a search phase, during which there was a division of labor between the subject and their virtual partner. Subjects were required to search a row of three houses located at either the top or bottom of the screen, whilst the avatar searched the other row. When the subject fixated one of their designated houses, the door opened to reveal an empty house or the burglar (see Fig. 1a). The location of the subject's designated houses was counterbalanced across acquisition runs. Subjects were instructed that whoever found the burglar on each trial had to guide their partner to that location by first establishing mutual gaze and then looking at the appropriate house.

Fig. 1 a An example of the stimuli used in the social conditions (i.e. RJA and IJA). b An example of the stimuli used in the control conditions (i.e. RJAc and IJAc). Note that for a and b, the eye-shaped symbol represents the subject's eye movement resulting in joint attention; this was not part of the stimulus visible to subjects

On RJA trials, subjects searched their designated houses, each of which would be empty. The avatar would then complete his search and guide the subject to the burglar's location. Once the subject responded and joint attention was achieved, positive feedback was provided with the burglar appearing behind bars to symbolize that he had been successfully captured. On IJA trials, the subject would find the burglar inside one of their designated houses. Once the avatar had completed his search and mutual gaze was established, the subject was then required to initiate joint attention by saccading towards the correct location. The avatar responded by gazing at the location fixated by the subject, regardless of whether it was correct or not. Again, positive feedback was provided when joint attention was achieved at the burglar's location. Negative feedback was also provided if the subject failed to make a responsive eye movement within three seconds, or if they responded or initiated by fixating an incorrect location.
During the search phase, the avatar's gaze behavior was controlled so that he only completed his search after the subject completed their search and fixated back on the avatar. This meant that subjects were required to monitor the avatar's attention during their interaction, before responding to, or initiating, a joint attention bid. In this paradigm, as in ecological interactions, establishing mutual gaze was therefore essential in determining whether the avatar was ready to guide the subject, or respond to the subject's initiation of joint attention. The onset latencies of the avatar's gaze behavior (i.e. alternating between search houses, establishing mutual gaze, and executing responding or initiating saccades) were also jittered with a uniform distribution between 500 and 1,000 ms. This served to enhance the avatar's ecological appearance.
The subject's social role as a responder or initiator only became apparent throughout the course of each trial. Our paradigm thereby created a social context that (1) elicited intentional, goal-driven joint attention, (2) naturally informed subjects of their social role without overt instruction, and (3) required subjects to engage in social attention monitoring.
In order to account for the effect of non-social task features, the neural correlates of RJA and IJA were investigated relative to non-social control conditions that were matched on attentional demands, number of eye movements elicited and task complexity. During these trials, the avatar remained on the screen with his eyes closed, and subjects were told that both partners were completing the task independently. In the IJA control condition (IJAc), subjects found the burglar, looked back to a central fixation point and, when this turned green, saccaded towards the burglar location. In the RJA control condition (RJAc), the fixation point became an arrow directing them to the burglar location (see Fig. 1b).
A synchronization pulse was used at the beginning of each acquisition run to allow for the BOLD and eye tracking data to be
temporally aligned. Our analyses of BOLD data focused on the joint attention phase of each trial. Accordingly, event onset times were defined as the time at which the participant opened the last empty house (RJA and RJAc) or found the burglar (IJA and IJAc). Events were modelled as box cars lasting until the time at which joint attention was achieved and the burglar captured. This assisted in accounting for variation in reaction times between trials. All second-level t-images were thresholded at t > 3.70, equivalent to p < 0.05 with a false discovery rate (FDR) correction for multiple comparisons in the comparison of RJA and RJAc (see Fig. 2a). This threshold was more conservative than p < 0.05 with FDR correction in any other contrast tested. The use of a single threshold for visualization allowed the results to be more easily compared.

Fig. 2 Threshold maps are displayed for a Responding to joint attention (RJA - RJAc), b Initiating joint attention (IJA - IJAc), c Initiating over and above Responding [(IJA - IJAc) - (RJA - RJAc)], and d Activation common to Responding and Initiating; t > 3.70, equivalent to p < 0.05 FDR correction in a, with extent threshold 10 voxels. The threshold for p < 0.05 FDR correction would have been 2.87, 3.18 and 3.10 in b, c and d respectively. No voxels survived FDR correction for the Responding over and above Initiating contrast [(RJA - RJAc) - (IJA - IJAc)]
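A minimal sketch of the kind of event specification described above (illustrative only; the trial times, variable names and event-table format are assumptions, not the authors' analysis code):

# Hedged sketch: build box-car events (onset, duration) for a GLM, with onset
# at the moment the last empty house was opened (RJA/RJAc) or the burglar was
# found (IJA/IJAc), and duration lasting until joint attention was achieved.
# Times below are placeholders in seconds.
trials = [
    {"condition": "RJA",  "onset_event": 12.4, "joint_attention": 14.1},
    {"condition": "IJA",  "onset_event": 31.0, "joint_attention": 33.6},
    {"condition": "RJAc", "onset_event": 52.2, "joint_attention": 53.5},
]

events = [
    {
        "condition": t["condition"],
        "onset": t["onset_event"],
        "duration": t["joint_attention"] - t["onset_event"],  # absorbs trial-by-trial RT variation
    }
    for t in trials
]

for e in events:
    print(e)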
Relative to their corresponding control conditions, both RJA (Fig. 2a) and IJA (Fig. 2b) activated a broad frontotemporoparietal network, largely consistent with previous findings (Redcay et al. 2010; Schilbach et al. 2010). Additionally, IJA resulted in more distributed activation across this network, relative to RJA, after controlling for non-social attention (Fig. 2c).
A conjunction analysis identified a right-lateralized subset of this network that was common to both RJA and IJA, over and above activation associated with the non-social control conditions (Fig. 2d). Regions included the dorsal portion of the middle frontal gyrus (MFG), inferior frontal gyrus (IFG), middle temporal gyrus (MTG), precentral gyrus, posterior superior temporal sulcus (pSTS), temporoparietal junction (TPJ) and precuneus. The existing literature associates many of these regions with tasks involving perspective-taking processes. Specifically, TPJ has been implicated in tasks where subjects form representations of others' mental states (Samson, Apperly, Chiavarino and Humphreys 2004). The precuneus has been recruited in tasks that involve representing first person (self) and third person (other) visual perspectives (Vogeley et al. 2004). Involvement of IFG has been reported in dyadic tasks where subjects make competitive profit-oriented decisions which intrinsically involve self-other comparisons (Halko, Hlushchuk, Hari and Schurmann 2009). Finally, modulation of pSTS activation has been reported during tasks where subjects determine the intentionality of another's behavior (Morris, Pelphrey and McCarthy 2008).
Together with previous findings, the frontotemporoparietal network identified in our study is consistent with the PDPM's claim that the neural mechanisms of RJA and IJA have a shared neural basis in adulthood. This may support the ability to simultaneously represent the attentional state of the self and others during interactions (Mundy, Newell 2007). These self-other representations are essential for the achievement of joint attention in ecological contexts, as one must represent the attentional focus of their partner to determine when they can respond to or initiate joint attention. One also must represent their own attentional focus so as to plan initiations of joint attention, and to shift their attentional focus when guided.
Furthermore, a portion of the frontoparietal network common to RJA and IJA, including IFG, TPJ and precuneus, revealed additional activation during IJA trials, compared to RJA trials (see Fig. 2c). This is again consistent with the role of this network in simultaneously representing self- and other-oriented attention perspectives, as IJA trials required subjects to represent an additional shift in their partner's attentional focus (avatar searches, then waits for guidance, then responds), relative to RJA trials (avatar searches, then guides).
Our data contribute to ongoing debates in the social neuroscience literature concerning the social specificity of many of the regions included in this network, such as TPJ (Kincade, Abrams, Astafiev, Shulman and Corbetta 2005). Due to the implementation of closely matched non-social conditions, the present study provides further evidence that these substrates may be particularly sensitive to social engagement.
This is the first imaging study to directly investigate the neural correlates common to RJA and IJA engagement, and it thus supports the PDPM's claim that a broad integrated network supports the parallel aspects of both initiating and responding to joint attention. These data inform a neural model of joint attention in adults, and may guide future clinical applications of our paradigm to investigate whether the developmental delay of joint attention in autism is associated with a differential organization of this integrated network.

References
Halko M-L, Hlushchuk Y, Hari R, Schurmann M (2009) Competing with peers: mentalizing-related brain activity reflects what is at stake. NeuroImage 46:542–548. doi:10.1016/j.neuroimage.2009.01.063
Kincade JM, Abrams RA, Astafiev SV, Shulman GL, Corbetta M (2005) An event-related functional magnetic resonance imaging study of voluntary and stimulus-driven orienting of attention. J Neurosci 25:4593–4604. doi:10.1523/JNEUROSCI.0236-05.2005
Morris JP, Pelphrey KA, McCarthy G (2008) Perceived causality influences brain activity evoked by biological motion. Soc Neurosci 3:16–25. doi:10.1080/17470910701476686
Mundy P, Newell L (2007) Attention, joint attention and social cognition. Curr Dir Psychol Sci 16:269–274
Redcay E, Dodell-Feder D, Pearrow MJ, Mavros PL, Kleiner M, Gabrieli JDE, Saxe R (2010) Live face-to-face interaction during fMRI: a new tool for social cognitive neuroscience. NeuroImage 50:1639–1647. doi:10.1016/j.neuroimage.2010.01.052
Samson D, Apperly IA, Chiavarino C, Humphreys GW (2004) Left temporoparietal junction is necessary for representing someone else's belief. Nat Neurosci 7:499–500. doi:10.1038/nn1223
Schilbach L, Wilms M, Eickhoff SB, Romanzetti S, Tepest R, Bente G, Vogeley K (2010) Minds made for sharing: initiating joint attention recruits reward-related neurocircuitry. J Cogn Neurosci 22:2702–2715. doi:10.1162/jocn.2009.21401
Vogeley K, May M, Ritzl A, Falkai P, Zilles K, Fink GR (2004) Neural correlates of first-person perspective as one constituent of human self-consciousness. J Cogn Neurosci 16:817–827. doi:10.1162/089892904970799
Action recognition and the semantic meaning of actions: how does the brain categorize different social actions?

Dong-Seon Chang1, Heinrich H. Bulthoff1, Stephan de la Rosa1
1 Max Planck Institute for Biological Cybernetics, Dept. of Human Perception, Cognition and Action, Tubingen, Germany

Introduction
The visual recognition of actions occurs at different levels (Jellema and Perrett 2006; Blake and Shiffrar 2007; Prinz 2013). At a kinematic level, an action can be described as the physical movement of a body part in space and time, whereas at a semantic level, an action can carry various social meanings, such as about the goals or intentions of an action. In the past decades, a substantial amount of neuroscientific research work has been devoted to various aspects of action recognition (Casile and Giese 2005; Blake and Shiffrar 2007; Prinz 2013). Still, the question at which level the representations of different social actions might be encoded and categorically ordered in the brain is largely left unanswered. Does the brain categorize different actions according to their kinematic similarities, or in terms of their semantic meanings? In the present study, we wanted to find out whether different actions were ordered according to their semantic meaning or kinematic motion by employing a visual action adaptation aftereffect paradigm as used in our previous studies (de la Rosa et al. 2014).
Materials and methods
We used motion capture technology (MVN Motion Capture Suit from XSense, Netherlands) to record different social actions often observed in everyday life. The four social actions chosen as our experimental stimuli were handshake, wave, punch, and yopunch (fistbump), and each of the actions was similar to or different from the other actions either in terms of their semantic meaning (e.g. handshake and wave both meant a greeting, whereas punch meant an attack and yopunch meant a greeting) or kinematic motion (e.g. the movements of a punch and a yopunch were similar, whereas the movements of a punch and a wave were very different). To quantify these similarities and differences between each action, a total of 24 participants rated the four different social actions pairwise in terms of their perceived differences in either semantic meaning or kinematic motion on a visual analogue scale ranging from 0 (exactly same) to 10 (completely different). All actions were processed into short movie clips (< 2 s) showing only the joint movements of an actor (point-light stimuli) from the side view to the participants. Then, the specific perceptual bias for each action was determined by measuring the size of the action adaptation aftereffect in each participant. Each of the four different social actions was shown as a visual adaptor in each block (30 s prolonged exposure at the start, 3 x repetitions each trial) while participants had to engage in a 2-Alternative-Forced-Choice (2AFC) task where they had to judge which action was shown. The test stimuli in the 2AFC task were action morphs in 7 different steps between two actions, which were presented repeatedly (18 repetitions each block) and randomized. Finally, the previously obtained meaning and motion ratings were used to predict the measured adaptation aftereffect for each action using linear regression.
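A compact sketch of the two analysis steps just described, estimating a point of subjective equality (PSE) from 2AFC responses and regressing aftereffect sizes on the ratings; all values, variable names and array shapes are illustrative placeholders, not the authors' data or code.

# Hedged sketch: PSE from a 2AFC psychometric curve, then a linear regression
# of adaptation aftereffects on rated meaning differences (placeholder numbers).
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

def logistic(x, pse, slope):
    return 1.0 / (1.0 + np.exp(-slope * (x - pse)))

morph_steps = np.arange(1, 8)                                # 7-step action morphs
p_action_b  = np.array([.05, .10, .25, .55, .80, .92, .97])  # proportion "action B" responses

(pse, slope), _ = curve_fit(logistic, morph_steps, p_action_b, p0=[4.0, 1.0])
print("PSE:", pse)   # the aftereffect is the PSE shift between adaptor conditions

# Regress aftereffect size on rated semantic-meaning differences (per action pair)
meaning_diff = np.array([2.1, 7.5, 6.8, 3.0])
aftereffect  = np.array([0.3, 1.2, 1.0, 0.5])
print(linregress(meaning_diff, aftereffect))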
Results they be unable to verbally express their capacity to recognize false
The perceived differences in the ratings of semantic meaning sig-
nificantly predicted the differences in the action adaptation 5
An extended version of this paper has been recently accepted
aftereffects (p \ 0.001). The rated differences in kinematic motion
for publication in the Review of Philosophy and Psychology,
alone was not able to significantly predict the differences in the action
under the title Under Pressure: Processing Representational Decou-
adaptation aftereffects, although the interaction of meaning and
pling in False-Belief Tasks.  Springer Science + Business Media
motion was also able to significantly predict the changes in the action
Dordrecht 2014.
adaptation aftereffect for each action (p \ 0.01). 6
Discussion The VOE task tests whether children look longer when agents act in
Previous results have demonstrated that the action adaptation aftereffect a manner that is inconsistent with their false beliefs and relies on the
paradigm could be a useful paradigm for determining the specific per- basic assumption that when an individuals expectations are violated,
ceptual bias for recognizing an action, since depending on the adaptor she is surprised and thus she looks longer at an unexpected event
rather than at an expected event.
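The regression step described in the Materials and methods above can be illustrated with a minimal sketch in Python/NumPy. This is not the authors' analysis code: the assumption that the model is fit on the six pairwise difference scores, the variable names, and all numbers are purely illustrative.

    import numpy as np

    # Illustrative (invented) values for the six action pairs: rated dissimilarity
    # in semantic meaning and in kinematic motion (0-10 scale), and the measured
    # difference in adaptation aftereffect for each pair.
    meaning_diff = np.array([1.5, 7.8, 6.9, 8.1, 7.2, 2.3])
    motion_diff = np.array([4.0, 6.5, 2.2, 7.0, 3.1, 5.4])
    aftereffect_diff = np.array([0.05, 0.31, 0.26, 0.35, 0.28, 0.09])

    # Linear regression with an intercept, both predictors, and their interaction
    X = np.column_stack([np.ones_like(meaning_diff),
                         meaning_diff,
                         motion_diff,
                         meaning_diff * motion_diff])
    coef, *_ = np.linalg.lstsq(X, aftereffect_diff, rcond=None)
    print(coef)  # intercept, meaning, motion, and interaction coefficients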

Understanding before language5

Anna Ciaunica
Institute of Philosophy, London, England & Institute of Philosophy, Porto, Portugal

5 An extended version of this paper has been recently accepted for publication in the Review of Philosophy and Psychology, under the title "Under Pressure: Processing Representational Decoupling in False-Belief Tasks". © Springer Science+Business Media Dordrecht 2014.
6 The VOE task tests whether children look longer when agents act in a manner that is inconsistent with their false beliefs; it relies on the basic assumption that when an individual's expectations are violated, she is surprised and thus looks longer at an unexpected event than at an expected event.

Abstract How can an infant unable to articulate meaning in verbal communication be an epistemic agent capable of attributing false beliefs? Onishi and Baillargeon (2005) demonstrated false-belief understanding in young children through completely nonverbal measures such as the violation-of-expectation (VOE)6 looking paradigm, and showed that children younger than 3 years of age, who consistently fail the standard verbal false-belief task (SFBT), can anticipate others' actions based on their attributed false beliefs. This gave rise to the so-called Developmental Paradox (DP): if preverbal human infants have the capacity to respond to others' false beliefs from at least 15 months, why should they be unable to verbally express their capacity to recognize false beliefs until they are 4 years old, a full 33 months later? The DP teaches us that visual perception plays a crucial role in processing the implicit false-belief condition as opposed to the explicit/verbal-report condition. But why is perception, in some cases, smarter than explicit and verbalized thinking? In this paper I briefly sketch the solution proposed by De Bruin and Kästner (2012), Dynamic Embodied Cognition, and I raise an objection regarding their use of the term "metarepresentation" in explaining the puzzle.
Recently, evidence has been mounting to suggest that infants have much more sophisticated social-cognitive skills than previously suspected. The issue at stake is crucial since, as Sommerville and Woodward (2010:84) pointed out, assessing infants' understanding of others' behavior provides "not only a snapshot of the developing mind of the child, but also a panorama of the very nature of cognition itself."
Consider this challenge:
P1. Empirical evidence strongly suggests that basic cognition is smart (since 15-month-olds understand false beliefs).
P2. Smart cognition necessarily involves computations and representations (of false beliefs).
P3. Hence, basic cognition necessarily involves computations and representations (of false beliefs).
De Bruin and Kästner (2012) recently proposed a reconciliatory middle-ground solution between representationalist and enactivist accounts, i.e. Dynamic Embodied Cognition (DEC). They claim that the Developmental Puzzle is best addressed in terms of the relation between coupled (online) and decoupled (offline) processes for basic and advanced forms of (social) cognition, as opposed to merely representing/not representing false beliefs. They argue that rephrasing the issue in terms of online/offline processing provides us with an explanation of the Developmental Puzzle. How exactly does this work? First, the authors take for granted the premise that infants are equipped with implicit abilities that start out as grounded in basic online processes, albeit partly decoupled. It is crucial for their project that these basic implicit abilities already involve decoupling. This is in line with the cognitivist distinction between (a) sub-doxastic mental states that do not possess truth-evaluable propositional content and (b) robust mental states (Spaulding 2010:123). In a second step, they hold that infants' implicit abilities develop gradually into more sophisticated explicit abilities that rely on offline processes to a much larger extent. The coupling and decoupling relations between agent and environment advocated by DEC are "dynamic in the sense that they are a matter of degree and never an end in itself. (...) The dynamic interplay of decoupled and coupled processes may be used for optimization of cognitive processing" (De Bruin and Kästner 2012:552, emphasis added). There is definitely much more to be said about DEC, but this gives us the basic flavor. Clearly, DEC borrows from weak-strategy theorists such as Apperly and Butterfill (2009) the idea that early mechanisms are cheap and efficient, while the late-emerging mechanisms are costly but flexible. But they also borrow from rich theorists (Baillargeon et al. 2010) the idea that preverbal human infants are already capable of decoupling, i.e. taking their own reality-congruent perspective offline, albeit in a very limited way.
An important concern regards the use of the term "metarepresentation". As S. Scott (2001) pointed out, there is danger of confusion (with serious consequences for the debate about the nature of higher-level cognition) between two distinct notions of metarepresentation, as defined by philosophers (Dennett 1998) and by psychologists dealing with the question of autistic disorders (Leslie 1991). According to Dennett (1998), representations are themselves objects in the world, and therefore potential objects of (second-order or meta-) representations. Call this metarepresentation1. For example, a drawing of a cat on a piece of paper is a type of non-mental representation, which is represented in the mind of the person viewing it. The mental representation is of the drawing, but since the drawing is itself a representation, the viewer has a (mental) metarepresentation of whatever it is that the drawing represents, namely a cat. By contrast, Leslie uses the term metarepresentation to mean (e.g., in the case of understanding pretence-in-others) "an internal representation of an epistemic relation (PRETEND) between a person, a real situation and an imaginary situation (represented opaquely)" (Leslie 1991:73). Call this metarepresentation2. This definition does not sound at all like the definition of metarepresentation1 as second-order representation pursued above. There is nothing metarepresentational in the sense of higher-order representation in Leslie's formulation of the semantics of psychological predicates. Building on this distinction, S. Scott insightfully argues that a representation can contain other representations without being a metarepresentation1.
Consider (P):
(P) The child BELIEVES that Sally BELIEVES that the marble is in the basket.
In what follows, I shall argue that although (P) is a straight-up second-order belief, this does not necessarily involve second-order representation, or metarepresentation1 in Dennett's sense. Against De Bruin and Kästner (2012), I hold that there are no additional second-order metarepresentational skills involved in the SFBT as compared with VOE trials. Much of what I have to say in this section parallels arguments from Scott (2001), with which I am in close agreement. Scott convincingly argued that second-order beliefs do not necessarily require metarepresentations1: "It is only necessary to have the ability to represent first order beliefs in order to have second-order beliefs" (Scott 2001:940).
Take the following example of a first-order belief:
(1) Melissa BELIEVES that her dog is dead.
The crucial point here is that to simply hold a belief, Melissa need not be aware of her belief or hold an explicit representation of it. In other words, she need not think to herself "I believe my dog is dead" or "It is I who believes that my dog is dead". At this level of interpretation, we can speak of animals having this kind of online implicit beliefs, although we may find uncomfortable the idea of dogs having implicit beliefs. Now, consider the following example of a second-order belief:
(2) Anne BELIEVES that Melissa BELIEVES that her dog is dead.
As Scott rightly points out, in order to get (2) Anne needs the representation of Melissa's dog, the predicate DEAD, and so on. What she does not need is a representation of Melissa's representation of her dog, the predicate DEAD, and so on. That is, she does not need a second-order representation of any of these things. She can get by with her own first-order representations. Given that neither Melissa nor Anne has any particular need of belief representation in order to be a believer, Anne's representation of Melissa's belief need not be second order. In addition, it would seem that what Anne also needs in order to get (2) is a representation of Melissa's BELIEF. That is to say, "she needs a representation of Melissa's mental state of believing in a way that Melissa does not" (Scott 2001:939, emphasis added).
The question is: is there any metarepresentation1 involved here? Indeed, one might object that Melissa's belief state already involves implicit or sub-personal representational processing. Now, the distinction between explicit and implicit or sub-personal mental representations is a complicated issue and need not concern us here. For present purposes, it is sufficient to insist on the idea that Anne's representation of Melissa's first-order belief (regardless of whether the latter involves subpersonal representational processing in Melissa's mind) does not amount to a second-order metarepresentation1 (in Anne's mind). But let us suppose for the sake of the argument that Anne holds a representation of Melissa's first-order implicit belief (B), which in turn involves a certain sub-personal representational processing (S) in Melissa's brain. Now, if (S) is an implicit, sub-personal representation (in Melissa's mind), then one consequence would be that in metarepresenting Melissa's belief (B) [which involves (S)], Anne is only half-aware of what she is metarepresenting. Indeed, given that one member of the double

representational layer, namely (S), remains opaque to her, Anne is aware only of what she is representing, namely (B). Note that this is not a problem per se. One could label this half-blind metarepresenting "metarepresentation3", say. If this is so, then it is difficult for me to see why metarepresenting3 in this sense is supposed to be cognitively more demanding (for Anne) than mere representing. In contrast, recall that in Dennett's drawing example, the viewer is fully aware of the double representational layer: he forms a mental representation of a drawing of a cat, and this makes his metarepresenting1 a genuine second-order cognitive achievement. Hence, it is not clear that metarepresenting1 is at work in the Sally/Anne scenario, and this casts doubt on the idea that the ERTs require that infants not only represent but metarepresent.
To sum up, according to De Bruin and Kästner, ERTs involve a stronger form of decoupling (precisely because it involves metarepresentational skills and language processing), hence explaining the Developmental Puzzle. Although I agree with De Bruin and Kästner in saying that (a) SFBTs require decoupling, and that (b) the verbal interaction with the experimenter during the SFBT plays a crucial role in 3-year-olds' failure to report false-belief understanding, there is still something missing in the picture. Indeed, I fail to see how (a) and (b) alone can solve the Developmental Puzzle, since, as the authors themselves have insisted, the decoupling is supposed to lead to an optimization of cognitive processing. Everybody agrees that strong decoupling is an important evolutionary advantage. But the mystery of the Developmental Puzzle stems from the opposite situation. In order to truly solve the DP, they need to answer the following question: why does stronger decoupling impair (at least in some cases) rather than improve the mental gymnastics of representational manipulation? In other words: why do weaker forms of decoupling do a better job in a complex task such as false-belief understanding?
Unlike De Bruin and Kästner, I reject the idea that basic forms of mentality are representational and that, during VOE scenarios, infants must rely on internal representations of visual information that is available to the other agent but not available to them. Rather, infants understand others' intentional attitudes as currently and readily available (i.e. directly observable) in the environment. To support this claim, I appeal to empirical findings illustrating that (i) infants' ability to understand other minds is rooted in their capacity to actively engage in interactive scenarios. Consistent with a burgeoning literature suggesting a common basis for both the production and perception of action, evidence has been mounting to illustrate that infants' understanding of others is more robust within interactive contexts. In other words, the more engaged the interactions of infants/agents are, the more robust the infants' understanding of others becomes. Children first learn to discern or establish reference in situations that are defined not by differences in how self and other perceive agents and objects visually, but by differences in their shared experiential backgrounds, i.e. in what they did, witnessed or heard. For example, Moll and Tomasello (2007) tested the child's ability to recall an adult's knowledge of what she has experienced in three conditions: (1) the child and adult together interacted with a toy; (2) the infant handled the toy with another experimenter, while the adult watched (and the infant was alerted to this several times); (3) the adult handled a toy alone, while the infant watched. As Wilby (2012) pointed out, one might describe the difference in evidence that is available to the infant as follows:
(1) X is aware that [I am aware that [X is aware that [p]]].
(2) X is aware that [I am aware that [p]].
(3) X is aware that p.
Now, if we apply De Bruin and Kästner's degrees-of-decoupling explanatory strategy in this specific case, then one could expect that infants would find the first condition (1) the hardest, since it involves several embedded layers of decoupling. Yet, the evidence suggests the complete opposite. Hence, it is not clear that crediting infants with an implicit representational decoupling ability is the best strategy here.

References
Apperly I, Butterfill S (2009) Do humans have two systems to track beliefs and belief-like states? Psychol Rev 116:953-970
Baillargeon R, Scott RM, Zijing H (2010) False-belief understanding in infants. Trends Cogn Sci 14(3):110-118
Ciaunica A (2014, in press) Under pressure: processing representational decoupling in false-belief tasks. Rev Philos Psychol. doi:10.1007/s13164-014-0195-2
De Bruin LC, Kästner L (2012) Dynamic embodied cognition. Phenomenol Cogn Sci 11(4):541-563
Dennett D (1998) Making tools for thinking. In: Sperber D (ed) (2000) Metarepresentation. Oxford University Press, New York
Leslie AM (1991) Precursors to a theory of mind. In: Whiten A (ed) Natural theories of mind: evolution, development, and simulation of everyday mindreading. Blackwell, Oxford, pp 63-78
Moll H, Tomasello M (2007) How 14- and 18-month-olds know what others have experienced. Dev Psychol 43(2):309-317
Onishi KH, Baillargeon R (2005) Do 15-month-old infants understand false beliefs? Science 308(8):255-258
Scott S (2001) Metarepresentations in philosophy and psychology. In: Moore J, Stenning K (eds) Proceedings of the twenty-third annual conference of the Cognitive Science Society, University of Edinburgh. LEA Publishers, London
Sommerville JA, Woodward A (2010) In: Grammont et al. (eds) Naturalizing intention in action. MIT Press, Cambridge
Wilby M (2012) Embodying the false-belief tasks. Phenomenology and the Cognitive Sciences, Special Issue on Debates on Embodied Mindreading (ed. S. Spaulding), Volume 11, pp 519-540


An embodied kinematic model for perspective taking

Stephan Ehrenfeld, Martin V. Butz
Cognitive Modeling, Department of Computer Science, University of Tübingen, Germany

Abstract Spatial perspective taking (PT) is an important part of many social capabilities, such as imitation or empathy. It enables an observer to experience the world from the perspective of another actor. Research results from several disciplines suggest that the capability for PT is partially grounded in the postural structure of the own body. We investigate an option for enabling PT by employing a potentially learned, own, kinematic body model. In particular, we investigate if the modular modality frame model (MMF), which is a computational model of the brain's postural representation of its own body, can be used for PT. Our results confirm that MMF is indeed capable of PT. In particular, we show that MMF can be used to infer a probabilistic estimate by recruiting the own, embodied kinematic knowledge for inferring the necessary spatial transformation for PT as well as for deducing object positions and orientations from the actor's egocentric perspective.

Keywords
Perspective Taking, Embodiment, Frame of Reference

Introduction
Perspective taking (PT) may be defined as the ability to put oneself into another person's spatial, bodily, social, emotional, or even logical reasoning perspective. Taking on an actor's perspective in one or several of these respects seems mandatory to be able to interact with the observed actor socially, to imitate the actor, to cooperate with the actor, to imagine situations, events, or episodes the actor has been in or may experience in the future, and to show and experience empathy (Buckner and Carroll, 2007). PT has often been associated with the mirror neuron system (Rizzolatti and Craighero, 2004). However, to date it is still under debate where mirror neurons come from, with

suggestions ranging from purely associative learning mechanisms, over adaptations for action understanding, to a consequence of epigenetic, evo-devo interactions (Heyes, 2010; Ferrari et al. 2013).
Although PT has been addressed in various disciplines, few explicit computational models exist. For example, PT is often simply attributed to the concept of mirror neurons, without specifying how exactly mirror neurons accomplish PT and which mechanisms are made use of in this process. Fleischer et al. (2012) simulated the development of mirror neurons based on relative object-interaction encodings, suggesting and partially confirming that many mirror neurons should be viewpoint dependent. However, their neural model was mainly hard-coded and it did not include any information-processing interaction mechanisms. Purely feed-forward processing allowed the inference of particular types of object interactions.
We believe that PT is an essential ingredient for learning an interactive mirror-neuron system during cognitive development. Essentially, for establishing view-independent mirror neuron activities, two requirements need to be met: First, in order to understand complex actions, the observer must encode an embodied representation of the actor and its surrounding environment. Second, seeing that the sensory information comes from the observer's egocentric perspective, a change in the frame of reference (FoR) becomes necessary for transferring the perspective into the actor's egocentric perspective. Biologically and computationally, both requirements may be met by employing attributes of a model of the own body during cognitive development. Doing so, the own body model can be used to continually filter and combine multiple (visual) cues into a rich estimate of an actor's position and orientation. Moreover, the own body model can be used to compute the necessary spatial translation and rotations, both of which are inevitably a part of the kinematics of the own body model. Therefore, we propose that a PT mechanism may recruit a modular body model of its own inherent, bodily kinematic mappings.
To the best of our knowledge, no explicit computational models have been applied to model an embodied PT mechanism. A non-embodied approach to PT may be found in Cabido-Lopes and Santos-Victor (2003). To fill this gap, we show here that the modular modality frame model (MMF) (Ehrenfeld and Butz, 2013; Ehrenfeld et al. 2013), which constitutes a biologically inspired model of body state estimation, can exhibit embodied PT.
The modular modality frame model (MMF)
At its core, MMF is a Bayesian model of modular body state estimation. MMF distributes the body state over a set M of local modules m_i, such that each module encodes a part p(x|m_i) of the whole probabilistic body state p(x|M). Bayesian filtering reduces noise over time, and kinematic mappings connect all m_i, leading to continuous information flow between the modules. The modules and their connections are visualized in Fig. 1.

Fig. 1 The body state is distributed over modules (depicted as circles, filled circles, rectangles and crossed-out rectangles). Along the horizontal axis, different body limbs are shown, and along the vertical axis, different modalities (positions, orientations and angles), which also use different FoRs: relative to a base (first two rows) or relative to the next proximal limb (third and fourth row). Arrows show the kinematic mappings, which connect the modules (dash-dotted yellow are the forward kinematics, dotted gray the inverse kinematics, and solid red are distal-to-proximal mappings)

Multiple frames of reference (FoRs) are shown in Fig. 1. The head-centered FoR is used to encode joint positions (first row) and limb orientations (second row), where limb orientations consist of three orthogonal vectors: one parallel to the limb, the other vectors specifying its intrinsic rotation. An additional FoR is centered on each body limb and is used to encode the relative orientation of the next distal limb (third row). Finally, Tait-Bryan angles between adjacent limbs are encoded (fourth row). An in-depth description can be found elsewhere (Ehrenfeld and Butz, 2013; Ehrenfeld et al. 2013). In summary, MMF executes transitions between FoRs and implements complex information fusion.
MMF's characteristics are ideal to model PT in an embodied way. When observing the body of an actor, MMF features two ways of accomplishing the necessary FoR transformation for PT.
First, any visual information arriving in the first two rows, i.e. position or orientation information relative to the observer's body, can be projected along the inverse kinematics (gray dotted arrows) to the third row (rectangles), thus inferring limb-relative representations. The result can be projected back along the forward kinematics (yellow dash-dotted arrows). When the actor's shoulder position and torso orientation (filled circles) are substituted with a base frame of reference during this process (e.g. position (0,0,0) and orientation (1,0,0), (0,1,0), (0,0,1)), the result represents the actor's limbs in the actor-relative FoR.
Second, any observer-relative visual information arriving in an observer-relative FoR (first two rows of MMF) may also be directly transformed into actor-relative FoRs. As before, the model can accomplish such a transformation by projecting the sensory information along the inverse kinematics (gray dotted) into limb-relative orientation FoRs, only in this case the next proximal input is substituted with the position and orientation of the actor's shoulder and torso, respectively. Due to the substitutions, and because no normalization to the limb length is done, the result in the relative orientation FoR is actually equal to the actor's limbs in the actor-relative FoR. Equally, the second method can be used to transform objects in the environment from an observer's egocentric perspective to an actor's egocentric perspective.
As both methods rely exclusively on interactions that are built initially for establishing a distributed representation of the observer's own body, the observer can simply recruit its own body model to infer the actor's perspective.
When the actor's shoulder position and orientation are not visible to the observer, this base perspective can also be inferred by MMF, given at least one shoulder- and torso-relative position and orientation signal. By transforming multiple cues, particularly along the distal-to-proximal kinematic mappings, and fusing them, MMF can build a robust estimate of the actor's shoulder and torso. In the following, we detail and evaluate these three capabilities of MMF.
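To make the frame-of-reference substitution concrete, here is a minimal geometric sketch in Python/NumPy. It is not the MMF implementation: it collapses the modular projection chain into a single rigid transformation, and the function name to_actor_frame as well as the example numbers are invented.

    import numpy as np

    def to_actor_frame(p_observer, shoulder_pos, torso_rot):
        # p_observer  : (3,) point given in the observer's egocentric FoR
        # shoulder_pos: (3,) actor shoulder position, observed in the observer's FoR
        # torso_rot   : (3,3) actor torso orientation (columns = actor axes in the observer's FoR)
        # Subtracting the actor's base position and rotating into the actor's axes
        # corresponds to substituting the actor's shoulder/torso for the base frame
        # (position (0,0,0), identity orientation).
        return torso_rot.T @ (np.asarray(p_observer) - np.asarray(shoulder_pos))

    # Example: an object one limb length in front of the actor's shoulder
    shoulder = np.array([0.4, 0.1, 1.2])
    torso = np.eye(3)  # here the actor happens to be oriented like the observer
    obj = np.array([0.4, 0.1, 2.2])
    print(to_actor_frame(obj, shoulder, torso))  # -> [0. 0. 1.]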

Simulations
An observer might not always be able to perceive an actor's torso, while still being able to perceive other parts of an actor's body. The torso might be occluded, or additional cues might be available for other body parts (e.g. the observer could touch the actor's hand, providing additional cues; the actor's hand could be placed on a well-established landmark, such as a door handle; or attention could be focused on the hand). In the following, we show how MMF can use cues from the actor's hand state and relative relations between adjacent limbs to build a probabilistic estimate of the actor's torso orientation and shoulder position.
In the following simulations, we assume that the actor's torso is static and its hand moves along an unknown trajectory. To this end, we model the hand's movement as Gaussian noise with mean zero and a standard deviation of 0.2 rad per angle. The arm has nine degrees of freedom (three on each joint). In each time step, noisy sensory input arrives in all modules depicted in Fig. 1 with a crossed-out rectangle (standard deviation of 0.02 per dimension, in units of limb length) or a non-crossed-out rectangle (standard deviation of 0.2). Thus, while the fingertip position and hand orientation are perceived rather accurately in the observer's egocentric perspective, relations between adjacent limbs are perceived rather inaccurately. In each time step, MMF projects the sensory information along the solid red arrows to the torso's position and orientation (filled circles), where Bayesian filtering reduces the sensor noise. The Euclidean distance of the resulting torso estimate from the real torso state is shown in Fig. 2. The results show that, despite the high sensory noise in the relative FoRs and the high movement noise, the orientation of the actor's torso can be inferred with a lower estimation error than the one inherent in most of the FoRs perceived. Results are averaged over 100 individual runs, where each run samples a different shoulder position, torso orientation, and arm trajectory.

Fig. 2 Error of the estimation of an actor's shoulder position and torso orientation in an observer's egocentric FoR. The shoulder and torso themselves are occluded and are inferred via the body model. The vertical axis is in units of limb lengths, and error bars are SEM

To infer the actor-relative orientation of objects, the second projection method is evaluated. For this purpose, in each run a new object with random position and orientation is created in a sphere of one limb length around the actor's torso. The error of the object's projection into the actor's egocentric FoR is shown in Fig. 3. It depends on both the shoulder position and torso orientation estimates (cf. Fig. 2). In accordance with the improvement of those estimates, the object's projection into the actor's FoR improves.

Fig. 3 The estimate of the actor's torso in the observer's FoR is used to project an object from the observer's FoR to the actor's FoR. As the torso estimate improves (cf. Fig. 2), the object projection improves as well. The vertical axis is in units of limb lengths, error bars are SEM

Last, we evaluate the effects of multimodal sensor fusion and Bayesian filtering on the representation of the actor's fingertip position in the actor's egocentric perspective. At first glance, sensory input of the relative relations between adjacent limbs (non-crossed-out rectangles in Fig. 1) is sufficient to infer the fingertip position. The resulting estimation error is shown in Fig. 4, red. It is, however, advantageous to also include the eye-relative measurements (crossed-out rectangles in Fig. 1). They are projected into the actor's egocentric perspective in the same way environmental objects are projected. However, this time the result is fused with the fingertip estimate inferred from the relative relations. The error of this fused estimate is shown in Fig. 4, green. The improvement of the green performance over the red performance is only possible because the torso estimate is filtered over time. The results show how continuous filtering and information fusion can improve the body state estimate.

Fig. 4 Position estimation error over time steps, without and with global measurement. The vertical axis is in units of limb lengths, and error bars are SEM
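The per-time-step fusion and filtering used in these simulations can be illustrated with a deliberately simplified sketch in Python/NumPy. It is not MMF itself: a single static quantity is estimated by precision-weighted Gaussian fusion of one noisy cue per time step, the names are invented, and only the 0.2 noise level mirrors the relative-relation cues described above.

    import numpy as np

    rng = np.random.default_rng(0)

    true_torso = np.array([0.3, -0.1, 0.8])  # static quantity to be inferred (e.g. torso position)
    obs_std = 0.2                            # noise of the indirect, limb-relative cues

    # Precision-weighted (Gaussian) fusion of one noisy cue per time step;
    # with a near-flat prior, the error shrinks roughly with 1/sqrt(t).
    est_mean = np.zeros(3)
    est_prec = 1e-6
    for t in range(20):
        obs = true_torso + rng.normal(0.0, obs_std, size=3)
        obs_prec = 1.0 / obs_std**2
        est_mean = (est_prec * est_mean + obs_prec * obs) / (est_prec + obs_prec)
        est_prec += obs_prec
        print(t, np.linalg.norm(est_mean - true_torso))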

Conclusion
Recently, we applied MMF to multimodal sensor fusion, Bayesian filtering, and sensor error detection and isolation. As shown in Butz et al. (2014), MMF is also well-suited to model the Rubber Hand Illusion. Two important characteristics of MMF are its modularly distributed state representation and its rigorous multimodal Bayesian fusion, making it highly suitable to model PT in an embodied way. Our results show that MMF is able to infer position and orientation estimates of an actor's body and of objects in the environment from the actor's egocentric perspective. We showed that this is even possible when the actor's head and torso are occluded. Moreover, we showed that Bayesian filtering is able to improve the process. All results are obtained by exclusively using the observer's own body model, i.e. no new abilities are required. Thus, the proposed PT approach is fully embodied.
The resulting PT capability sets the stage for many skills that at least partially rely on PT. As an estimate of the actor's whole body state is maintained over time, angular estimates and changes in these angular estimates, which allow the inference of current motor activities, are readily available. Because MMF represents these estimates body-relative, the inferred motor activities are the same no matter whether the observer or another actor acts. As a consequence, motor primitives can be activated and movements may be classified according to these activities. The observer could, for example, recognize a motion as a biological motion or even infer the desired effect an actor is trying to achieve. Even more so, the close integration of PT in the body model should allow for easy online imitation learning. Overall, MMF can be considered to either precede the mirror neuron system and provide it with input, or to be part of the mirror neuron system for simulating and understanding observed bodily motions of an observed actor.

References
Buckner RL, Carroll DC (2007) Self-projection and the brain. Trends Cogn Sci 11:49-57
Butz MV, Kutter EF, Lorenz C (2014) Rubber hand illusion affects joint angle perception. PLoS ONE 9(3):e92854
Cabido-Lopes M, Santos-Victor J (2003) Visual transformations in gesture imitation: what you see is what you do. In: IEEE International Conference on Robotics and Automation, vol 2, pp 2375-2381
Ehrenfeld S, Butz MV (2013) The modular modality frame model: continuous body state estimation and plausibility-weighted information fusion. Biol Cybern 107(1):61-82
Ehrenfeld S, Herbort O, Butz MV (2013) Modular neuron-based body estimation: maintaining consistency over different limbs, modalities, and frames of reference. Front Comput Neurosci 7
Ferrari PF, Tramacere A, Simpson EA, Iriki A (2013) Mirror neurons through the lens of epigenetics. Trends Cogn Sci 17(9):450-457
Fleischer F, Christensen A, Caggiano V, Thier P, Giese MA (2012) Neural theory for the perception of causal actions. Psychol Res 76(4):476-493
Heyes C (2010) Where do mirror neurons come from? Neurosci Biobehav Rev 34(4):575-583
Rizzolatti G, Craighero L (2004) The mirror-neuron system. Annu Rev Neurosci 27:169-192


The under-additive effect of multiple constraint violations

Emilia Ellsiepen, Markus Bader
Goethe-Universität, Institut für Linguistik, Frankfurt am Main, Germany

Keywords
Gradient grammaticality, Harmonic grammar, Quantitative linguistics

Introduction
The quest for quantitative evidence in syntactic and semantic research has led to the development of experimental methods for investigating linguistic intuitions with experimental rigor (see overview in Schütze and Sprouse, 2014). This in turn has inspired a renewed interest in grammar formalisms built on constraints with numerical weights (Smolensky and Legendre, 2006; see overview in Pater, 2009). These formalisms assign each sentence a numerical harmony value as defined in (1). The harmony H of a sentence S is the negative weighted sum of all grammatical constraints Ci that are violated by S:

Harmony of sentence S:  H(S) = - Σ_i w(C_i) * v(S, C_i)    (1)

As discussed in detail by Pater (2009), such formalisms have a great potential in bringing generative linguistics and cognitive science in close contact again.
One of the challenges brought about by these developments is how harmony relates to quantitative linguistic evidence. With regard to corpus frequencies, a large body of research has shown that this relationship is non-linear (e.g., Goldwater and Johnson, 2003). With regard to gradient linguistic judgments, the most transparent relationship between constraint weight and perceived grammaticality is postulated by Linear Optimality Theory (LOT) (Keller, 2006), which explicitly aims at providing a model of gradient grammaticality judgments, in particular as obtained by the method of magnitude estimation (ME). This method allows participants to judge sentences on an open-ended continuous numerical scale relative to a predefined reference sentence. Following Bard et al. (1996), magnitude estimation has become a kind of gold standard for assessing grammaticality, although more recently its validity has been questioned (e.g., Sprouse, 2011). LOT claims that the weight of a constraint reflects the decrease in acceptability that results from violating the constraint. A further claim of LOT is that multiple constraints combine additively. Thus, if a sentence contains two constraint violations, its decrease in acceptability should be the sum of the acceptability decreases of violating each constraint in isolation.
Some evidence against this assumption was found by Hofmeister et al. (2014). The combined effect of two syntactic constraint violations was under-additive, that is, less than the sum of the separate effects of the two constraints. However, Hofmeister et al. (2014) used a non-standard judgment procedure (the thermometer judgment methodology of Featherston, 2007), and the interaction between the two constraints was only marginally significant.
We ran two experiments using a standard magnitude estimation procedure in order to investigate the effect of multiple constraint violations. Experiment 1 investigates the effect of two severe (hard) constraint violations; Experiment 2 investigates the effect of a severe constraint violation coinciding with a mild (soft) constraint violation. In both cases, we find evidence for under-additivity. In the last part, we therefore develop the idea of mapping harmony values to acceptability judgments using a sigmoid linking function, to preserve linear cumulativity in harmony while allowing for under-additive effects in acceptability judgments.
Experiment 1
Experiment 1 tested German sentences that contained either no violation at all (2-a), a violation of the position of the finite auxiliary (2-b), an agreement violation (2-c), or both violations at once (2-d). While sentences (2b-d) are all ungrammatical in a binary system, LOT predicts that sentence (2-d) is even less acceptable than (2b-c). The corresponding constraints AuxFirst and Agree are both considered hard constraints in the sense of Sorace and Keller (2005); that is, both violations should cause severe decreases in acceptability.

(2) Ich finde, dass die Eltern im Winter an die See
    I think that the parents in winter at the sea
    a. hätten reisen sollen.
       have travel should
    b. *reisen sollen hätten.
    c. *hätte reisen sollen.
    d. *reisen sollen hätte.

Method
The ME procedure closely followed the description of the ME method in Bard et al. (1996) and consisted of a customization phase, in which participants were acquainted with the method by judging the length of lines and the acceptability of ten training sentences, and the experimental phase.
In each phase, participants first saw the reference stimulus (either a line or a sentence) and assigned it a numerical value. Afterwards, the experimental stimuli were displayed one by one, and participants judged each stimulus relative to the reference stimulus, which remained visible throughout the experiment. The reference sentence

(3), almost literally taken from Keller (2000, sentence (B.18), page 377), is a sentence with non-canonical word order.

(3) Ich glaube, dass den Bericht der Chef in seinem Büro gelesen hat.
    I believe that the.ACC report the.NOM boss in his office read has

32 sentences were created, all appearing in four versions according to the two violation types introduced above (2). The experimental sentences were distributed onto four lists according to a Latin square design and combined with 278 filler sentences. 36 students, all native speakers of German, took part in the study.
Results
The acceptability judgments as obtained by the ME procedure were first normalized with the judgment of the reference sentence and then log-transformed, as is standard practice with ME data. A repeated measures ANOVA revealed significant main effects for both factors (AuxFirst and Agree), as well as a significant interaction between the two (F(1,35) = 14.7, p < .001). As illustrated on the left-hand side of Fig. 1, the effects were not additive (the predicted mean for fully additive effects is indicated by the x). The difference between conditions (2-b) and (2-d), however, was still significant, as indicated by a paired t-test (t(35) = 3.55, p < .01).

Fig. 1 Results of Experiments 1 and 2 (acceptability in log ratios by verb cluster order, Aux first vs. Aux last)

Experiment 2
Experiment 2 had a similar design to Experiment 1, the difference being that instead of an agreement violation, a violation of subject-object order (S > O) was investigated, again in combination with a violation of AuxFirst. As shown in (4), the normal order between subject and object is SO. When the order constraint on subject and object is violated, sentence acceptability decreases. However, in contrast to Agree, this is not a hard but a soft constraint in the sense of Sorace and Keller (2005), resulting in a comparatively mild decrease in acceptability.

(4) Ich glaube, dass
    I think that
    a. der Doktor dem Patienten hätte helfen können.
       the doctor the patient have helped could
    b. *der Doktor dem Patienten helfen können hätte.
    c. ?dem Patienten der Doktor hätte helfen können.
    d. *dem Patienten der Doktor helfen können hätte.

Method
The procedure was the same as in Experiment 1. 32 sentences were created, all appearing in four versions according to the two violation types introduced above (4). The experimental sentences were distributed onto four lists according to a Latin square design and combined with 62 filler sentences. 36 students took part in the study.
Results
Similar to Experiment 1, a repeated measures ANOVA yielded two main effects, of AuxFirst and S > O, as well as an interaction (F(1,35) = 20.32, p < .001). As can be seen on the right-hand side of Fig. 1, the difference between conditions (4-b) and (4-d) is even smaller than in Experiment 1, but this difference was still significant (t(35) = 3.32, p < .01). We can conclude that both hard and soft constraints affect acceptability in a cumulative fashion, in that additional violations systematically lead to lower acceptability. The effects of the two constraint violations in isolation, however, do not add up in the case of sentences containing both constraint violations. Instead, they combine in an under-additive way.
Modelling under-additivity: a sigmoid function to link harmony with acceptability
While the results above suggest a cumulative effect of constraint violations, in that additional violations always lower acceptability, this cumulativity does not result in a linear decrease. There are at least two explanations for this under-additivity effect: either harmony is not a linear combination of constraint weights multiplied by the number of violations, or acceptability is not proportional to harmony. In this case, it is not unlikely that acceptability can still be derived from harmony, but by a linking function that accounts for the apparent floor effect we found in the experiments. In this section we will explore the possibility of using a sigmoid function to link harmony values to acceptability judgments.
If we assume that the constraint weights and thus the harmony values are given, the appropriate linking function should intuitively preserve the relative distance between harmony values in the upper range, while progressively reducing the difference between harmony values in the lower range, possibly leveling off at a horizontal asymptote which would correspond to the lower bound of acceptability. A sigmoid function with its inflection point at 0 and an asymptote which corresponds to the maximal difference in acceptability could serve this requirement, as the 0-point (a structure which does not violate any constraint) is mapped to zero, while increasingly lower values are mapped to values that are closer together. If we want to estimate the weights from the acceptability data itself, however, it gets more complicated. If we were to use the differences between acceptability judgments as the weights, we would subsequently predict higher acceptability than observed for structures which exhibit only one violation. This problem, however, can be avoided by first applying the inverse of the sigmoid linking function to the weights as derived from acceptability judgments. As for choosing the correct asymptote, this seems to be an empirical question. As an example, we chose the hyperbolic tangent function with a reduced asymptote of -0.75 instead of -1 and its corresponding inverse (Fig. 2):
a. linking(x_H) = tanh(x_H) * 0.75
b. linking_inv(x_H) = artanh(x_H / 0.75)

Fig. 2 A linking function between harmony and adjusted harmony (tanh(x) * 0.75)
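A minimal runnable sketch of this linking pair in Python/NumPy (the function names mirror (a) and (b) above, but the code and the example weight are illustrative only, not the authors' implementation):

    import numpy as np

    ASYMPTOTE = 0.75  # reduced asymptote chosen above

    def linking(x_h):
        # Map a harmony value to the (adjusted) acceptability scale
        return np.tanh(x_h) * ASYMPTOTE

    def linking_inv(x_a):
        # Inverse mapping: acceptability-scale value back to the harmony scale
        return np.arctanh(x_a / ASYMPTOTE)

    # Two equal violations: additive in harmony, under-additive after linking
    w = -0.6
    print(2 * linking(w))              # what direct proportionality (pure additivity) would predict
    print(linking(2 * w))              # the compressed prediction of the sigmoid linking function
    print(linking_inv(linking(w)))     # round-trip recovers the harmony value (-0.6)

Because tanh is approximately linear near zero, mild violations are left nearly untouched, while stacked violations are compressed toward the asymptote.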

To estimate the weights to be used, we first run a linear model on the acceptability judgment data of Experiment 1 and extract the coefficients of the two simple effects of AuxFirst and Agree, disregarding the interaction term. We apply the inverse of the linking function to these coefficients to obtain the weights to be used in our model, which allow us to calculate the harmony values for our four candidates:

Constraint   Coefficient   Weight
AuxFirst     0.46          0.72
Agree        0.40          0.59

(5)  H(a) = - (0 + 0*0.72 + 0*0.59) = 0
     H(b) = - (0 + 1*0.72 + 0*0.59) = -0.72
     H(c) = - (0 + 0*0.72 + 1*0.59) = -0.59
     H(d) = - (0 + 1*0.72 + 1*0.59) = -1.31

To compare the harmony values to the acceptability judgments, we now apply the linking function to the harmony values and plot the values in Fig. 3, after aligning the 0-point of the transformed harmony scale (axis on the right) to the zero-violation case.

Fig. 3 Measured vs predicted values in Experiment 1

Because of rescaling the weights with the inverse, the predicted values for the two single-violation cases match exactly. For the condition with two violations, the predicted value comes close to the measured value, and it is much closer than the value predicted under the assumption of a direct proportionality between harmony and acceptability, i.e., a linear decrease in acceptability.
While it is possible to determine a function that leads to an exact match by choosing a different multiplier in (5), we leave this step to further research, as ideally this function should be fitted to a variety of experiments and not only a single one. The existence of a single linking function for all acceptability judgment experiments presupposes a fixed lower bound of acceptability relative to the fully grammatical candidate. To test whether this assumption is reasonable, we apply the same linking function as above to the results of Experiment 2 and plot all values in Fig. 4. The transformed harmony value for the two-violation case here diverges slightly more from the measured mean, but is still much closer than the one predicted by a linear combination of weights.

Fig. 4 Measured vs predicted values in Experiment 2

Discussion
The current study makes two independent contributions to the area of gradient grammaticality research. Firstly, it provides strong evidence that multiple constraint violations combine in an under-additive fashion in acceptability judgments measured by magnitude estimation. This holds both for concurrent violations of two hard constraints, that is, constraints that are context independent and cause severe unacceptability, and for soft constraints, which can depend on context and only cause mild degradation. Secondly, we demonstrated that, using an appropriate linking function that maps harmony to ME acceptability judgments, we are able to model the under-additive effect in judgments while preserving full additive cumulativity in harmony values. It remains to be tested whether this linking function, or an alternative function based on more data, can account for the whole set of ME judgment data.
The under-additivity in ME judgments that we observed suggests the existence of a lower bound of perceived grammaticality. If such a lower bound exists, this questions the appropriateness of ME in two ways: if there is a cognitive lower bound, the motivation for using an open-ended scale rather than a bounded scale, like a Likert scale of suitable size, seems to disappear. Alternatively, it is possible that the method itself is not well-suited to capture differences below a certain threshold, as the perception of linear differences might not be linear in general, as is the case with loudness.

References
Bard EG, Robertson D, Sorace A (1996) Magnitude estimation of linguistic acceptability. Language 72(1):32-68
Featherston S (2007) Data in generative grammar: the stick and the carrot. Theor Ling 33(3):269-318
Goldwater S, Johnson M (2003) Learning OT constraint ranking using a maximum entropy model. In: Spenader J, Eriksson A, Dahl S (eds) Proceedings of the Stockholm workshop on variation within optimality theory. University of Stockholm, pp 111-120
Hofmeister P, Casasanto LS, Sag IA (2014) Processing effects in linguistic judgment data: (super-)additivity and reading span scores. Lang Cogn 6(01):111-145
Keller F (2000) Gradience in grammar: experimental and computational aspects of degrees of grammaticality. PhD thesis, University of Edinburgh
Keller F (2006) Linear optimality theory as a model of gradience in grammar. In: Fanselow G, Féry C, Vogel R, Schlesewsky M (eds) Gradience in grammar: generative perspectives. Oxford University Press, New York, pp 270-287
Pater J (2009) Weighted constraints in generative linguistics. Cogn Sci 33:999-1035
Schütze CT, Sprouse J (2014) Judgment data. In: Podesva RJ, Sharma D (eds) Research methods in linguistics. Cambridge University Press, Cambridge, pp 27-50
Smolensky P, Legendre G (2006) The harmonic mind: from neural computation to optimality-theoretic grammar (2 volumes). MIT Press, Cambridge
Sorace A, Keller F (2005) Gradience in linguistic data. Lingua 115(11):1497-1524
Sprouse J (2011) A test of the cognitive assumptions of magnitude estimation: commutativity does not hold for acceptability judgments. Language 87(2):274-288
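As a compact numerical sketch of the procedure described in the preceding abstract (inverse-link the coefficients, sum the resulting weights into harmony values, then link back), assuming Python/NumPy; the variable names and rounding are illustrative, not the authors' code:

    import numpy as np

    ASYMPTOTE = 0.75

    def linking(x):
        return np.tanh(x) * ASYMPTOTE

    def linking_inv(x):
        return np.arctanh(x / ASYMPTOTE)

    # Simple-effect coefficients (acceptability decrease per violation) from Experiment 1
    coef = {"AuxFirst": 0.46, "Agree": 0.40}
    w = {c: linking_inv(v) for c, v in coef.items()}  # harmony-scale weights (~0.72, ~0.59)

    # Violation profiles of candidates (2-a) to (2-d): (AuxFirst, Agree)
    violations = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (1, 1)}
    for cand, (aux, agr) in violations.items():
        harmony = -(aux * w["AuxFirst"] + agr * w["Agree"])
        predicted = linking(harmony)                                 # sigmoid-linked prediction
        additive = -(aux * coef["AuxFirst"] + agr * coef["Agree"])   # purely additive baseline
        print(cand, round(harmony, 2), round(predicted, 2), round(additive, 2))

For the two-violation candidate (d), the linked prediction (about -0.65 on this scale) is noticeably less extreme than the purely additive value (-0.86), which is the under-additive pattern discussed above.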

Strong spatial cognition

Christian Freksa
University of Bremen, Germany

Motivation
The ability to solve spatial tasks is crucial for everyday life and thus of great importance for cognitive agents. A common approach to modeling this ability in artificial intelligence has been to represent spatial configurations and spatial tasks in the form of knowledge about space and time. Augmented by appropriate algorithms, such representations allow the computation of knowledge-based solutions to spatial problems. In comparison, natural embodied and situated cognitive agents often solve spatial tasks without detailed knowledge about underlying geometric and mechanical laws and relationships; they can directly relate actions and their effects due to spatio-temporal affordances inherent in their bodies and their environments. Against this background, we argue that spatial and temporal structures in the body and the environment can substantially support (or even replace) reasoning effort in computational processes. While the principle underlying this approach is well known (for example, it is applied in descriptive geometry for geometric problem solving), it has not been investigated as a paradigm of cognitive processing. The relevance of this principle lies not only in overcoming the need for detailed knowledge required by a knowledge-based approach; it also lies in understanding the efficiency of natural problem-solving approaches.
Architecture of cognitive systems
Cognitive agents such as humans, animals, and autonomous robots comprise brains (or computers, respectively) connected to sensors and actuators. These are arranged in their (species-specific) bodies to interact with their (species-typical) environments. All of these components need to be well tuned to one another to function in a fully effective manner. For this reason, it is appropriate to view the entire aggregate (cognitive agent including body and environment) as a full cognitive system (Fig. 1).

Fig. 1 Structure of a full cognitive system

Our work aims at investigating the distribution, coordination, and execution of tasks among the system components of embodied and situated spatial cognitive agents. From a classical information processing/AI point of view, the relevant components outside the brain or computer would be formalized in some knowledge representation language or associated pattern in order to allow the computer to perform formal reasoning or other computational processing on this representation. In effect, physical, topological, and geometric relations are transformed into abstract information about these relations, and the tasks are then performed entirely on the information processing level, where true physical, topological, and geometric relations no longer persist.
This classical information-processing-oriented division between brain/computer on one hand and perception, action, body, and environment on the other hand is only one way of distributing the activities involved in cognitive processing [Wintermute and Laird, 2008]. Alternative ways would be (1) to maintain some of the spatial relations in their original form or (2) to use only mild abstraction for their representation. Maintaining relations in their original form corresponds to what Norman [1980] named "knowledge in the world". Use of knowledge in the world requires perception of the world to solve a problem. The best-known example of mild abstraction is the geographic paper map; here certain spatial relations can be represented by identical spatial relations (e.g. orientation relations), while others can be transformed (e.g. absolute distances can be scaled). As a result, physical operations such as perception, route-following with a finger, and manipulation may remain enabled similarly as in the original domain. Again, perception is required to use these mildly abstracted representations, but the perception task can be easier than the same task under real-world conditions, for example due to the modified scale.
A main research hypothesis for studying physical operations and processes in spatial and temporal form, in comparison to formal or computational structures, is that spatial and temporal structures in the body and the environment can substantially support reasoning effort in computational processes. One major observation we can make when comparing the use of such different forms of representation (formal, mild abstraction, original) is that the processing structures of problem solving processes differ [Marr 1982]. Different processing structures facilitate different ease of processing [Sloman 1985].
Our hypothesis can be plainly formulated as:
manipulation + perception simplify computation
While the principle underlying this hypothesis is well known (for example, it is applied in descriptive geometry for geometric problem solving), it has not been investigated as a principle of cognitive processing.
Reasoning about the world can be considered the most advanced level of cognitive ability; this ability requires a comprehensive understanding of the mechanisms responsible for the behavior of bodies and environments. But many natural cognitive agents (including adults, children, and animals) lack a detailed understanding of their environments and still are able to interact with them rather intelligently. For example, they may be able to open and close doors in a goal-directed fashion without understanding the mechanisms of the doors or locks on a functional level. This suggests that knowledge-based reasoning may not be the only way of implementing problem solving in cognitive systems.
In fact, alternative models of perceiving and moving goal-oriented autonomous systems have been proposed in biocybernetics and AI research to model aspects of cognitive agents [e.g. Braitenberg 1984; Brooks 1991; Pfeifer and Scheier, 2001]. These models physically implement perceptual and cognitive mechanisms rather than describing them formally and coding them in software. Such systems are capable of intelligently dealing with their environments without encoding knowledge about the mechanisms behind the actions.
The background of the present work has been discussed in detail in [Freksa 2013; Freksa and Schultheis, in press].
Approach
With our present work, we go an important step beyond previous embodied cognition approaches to spatial problem solving. We introduce a paradigm shift which not only aims at preserving spatial structure, but also will make use of identity preservation; in other words, we will represent spatial objects and configurations by themselves or by physical spatial models of themselves, rather than by abstract representations. This has a number of advantages: we can avoid loss of information due to early representational commitments, since we do not have to decide prematurely which aspects of the world to represent and which aspects to abstract from. This can be decided partly during the problem solving procedure. At this stage, additional contextual information may become available that can guide the choice of the specific representation to be used.

Perhaps more importantly, objects and configurations frequently are aggregated in a natural and meaningful way; for example, a chair may consist of a seat, several legs, and a back; if I move one component of a chair, I automatically (and simultaneously!) move the other components and the entire chair, and vice versa. This property is not intrinsically given in abstract representations of physical objects; but it may be a very useful property from a cognitive point of view, as no computational processing cycles are required for simulating the physical effects or for reasoning about them. Thus, manipulability of physical structures may become an important feature of cognitive processing, and not merely a property of physical objects.
Similarly, we aim at dealing with perception dynamically, for example allowing for on-the-fly creation of suitable spatial reference frames: by making direct use of spatial configurations, we can avoid deciding a priori for a specific spatial reference system in which to perceive a configuration. As we know from problem solving in geometry and from spatial cognition, certain reference frames may allow a spatial problem to collapse in dimensionality and difficulty. For example, determining the shortest route between two points on a map boils down to a 1-dimensional problem [Dewdney 1988]. However, it may be difficult or impossible to algorithmically determine a reference frame that reduces the task given on a 2- or 3-dimensional map to a 1-dimensional problem. A spatial reconfiguration approach that makes use of the physical affordance 'shortcut' easily reduces the problem from 3D or 2D to 1D. In other cases, it may be easier to identify suitable spatial perspectives empirically in the field than analytically by computation. Therefore we may be better off by allowing certain operations to be carried out situation-based in the physical spatial configuration as part of the overall problem solving process.
In other words, our project investigates an alternative architecture of artificial cognitive systems that may be more closely based on role models of natural cognitive systems than our purely knowledge-based AI approaches to cognitive processing. We focus on solving spatial and spatio-temporal tasks, i.e. tasks having physical aspects that are directly accessible by perception and can be manipulated by physical action. This will permit outsourcing some of the intelligence for problem solving into spatial configurations.
Our approach is to first isolate and simplify the specific spatial problem to be solved, for example by identifying an appropriate task-specific spatial reference system, by removing task-irrelevant entities from the spatial configuration, or by reconstructing the essence of the spatial configuration by minimal abstraction. In general, it may be difficult to prescribe the precise steps to preprocess the task; for the special case of spatial tasks it will be possible to provide rules or heuristics for useful preprocessing steps; these can serve as meta-knowledge necessary to control actions on the physical level. After successful preprocessing, it may be possible in some cases to read an answer to the problem through perception directly off the resulting configuration; in other cases the resulting spatial configuration may be a more suitable starting point for a knowledge-based approach to solving the problem.

Discussion
The main hypothesis of our approach is that the intelligence of cognitive systems is located not only in specific abstract problem-solving approaches, but also, and perhaps more importantly, in the capability of recognizing characteristic problem structures and of selecting particularly suitable problem-solving approaches for given tasks. Formal representations may not facilitate the recognition of such structures, due to a bias inherent in the abstraction. This is where mild abstraction can help: mild abstraction may abstract only from few aspects while preserving important structural properties.
The insight that spatial relations and physical operations are strongly connected to cognitive processing may lead to a different division of labor between the perceptual, the representational, the computational, and the locomotive parts of cognitive interaction than the one we currently pursue in AI systems: rather than putting all the intelligence of the system into the computer, the proposed approach aims at putting more intelligence into the interactions between components and structures of the full cognitive system. More specifically, we aim at exploiting intrinsic structures of space and time to simplify the tasks to be solved.
We hypothesize that this flexible assignment of physical and computational resources for cognitive problem solving may be closer to natural cognitive systems than the almost exclusively computational approach; for example, when we as cognitive agents search for certain objects in our environment, we have at least two different strategies at our disposal: we can represent the object in our mind and try to imagine and mentally reconstruct where it could or should be, which would correspond to the classical AI approach; or we can visually search for the object in our physical environment. Which approach is better (or more promising) depends on a variety of factors including memory and physical effort; frequently a clever combination of both approaches may be best.
Although the general principle outlined may apply to a variety of domains, we will constrain our work in the proposed project to the spatio-temporal domain. This is the domain we understand best in terms of computational structures; it has the advantage that we have well-established and universally accepted reference systems to describe and compute spatial and temporal relations.
Our research aims at identifying a bag of cognitive principles and ways of combining them to obtain cognitive performance in spatio-temporal domains. We bring together three different perspectives in this project: (1) the cognitive systems perspective, which addresses cognitive architecture and trade-offs between explicit and implicit representations; (2) the formal perspective, which characterizes and analyzes the resulting structures and operations; and (3) the implementation perspective, which constructs and explores varieties of cognitive system configurations. In the long term, we see potential technical applications of physically supported cognitive configurations, for example in the development of future intelligent materials (e.g. smart skin, where distributed spatio-temporal computation is required but needs to be minimized with respect to computation cycles and energy consumption).
Naturally, the proposed approach will not be as broadly applicable as some of the approaches we pursue in classical AI. But it might discover broadly applicable cognitive engineering principles, which will help the design of tomorrow's intelligent agents. Our philosophy is to understand and exploit pertinent features of space and time as the modality-specific properties of cognitive systems that enable powerful specialized approaches in the specific domain of space and time. However, space and time are most basic for perception and action and ubiquitous in cognitive processing; therefore we believe that understanding and use of their specific structures may be particularly beneficial.
In analogy to the notion of strong AI (implementing intelligence rather than simulating it [Searle 1980]) we call this approach strong spatial cognition, as we employ real space rather than simulating its structure.
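As a purely illustrative aside (not from the abstract), the trade-off between the two object-search strategies discussed above can be pictured as a simple cost comparison. The cost functions, thresholds, and parameter names below are invented for the sketch and only stand in for whatever factors (memory, physical effort) a real agent weighs.

def memory_cost(recall_confidence: float) -> float:
    """Estimated cost of mentally reconstructing the object's location."""
    return 1.0 / max(recall_confidence, 0.05)   # vague memories are expensive

def perceptual_cost(visible_fraction: float, clutter: float) -> float:
    """Estimated cost of physically/visually scanning the environment."""
    return clutter / max(visible_fraction, 0.05)

def choose_strategy(recall_confidence, visible_fraction, clutter):
    m = memory_cost(recall_confidence)
    p = perceptual_cost(visible_fraction, clutter)
    if abs(m - p) < 1.0:
        return "combine: imagine a likely region, then scan only that region"
    return "mental reconstruction" if m < p else "visual search"

print(choose_strategy(recall_confidence=0.9, visible_fraction=0.3, clutter=5.0))
print(choose_strategy(recall_confidence=0.1, visible_fraction=0.8, clutter=2.0))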

Acknowledgments
I acknowledge discussions with Holger Schultheis, Ana-Maria Olteteanu, and the R1-[ImageSpace] project team of the SFB/TR 8 Spatial Cognition. This work was generously supported by the German Research Foundation (DFG).

References
Braitenberg V (1984) Vehicles: experiments in synthetic psychology. MIT Press, Cambridge
Brooks RA (1991) Intelligence without representation. Artif Intell 47:139-159
Dewdney AK (1988) The armchair universe. W.H. Freeman & Company, San Francisco
Freksa C (2013) Spatial computing: how spatial structures replace computational effort. In: Raubal M, Mark D, Frank A (eds) Cognitive and linguistic aspects of geographic space. Springer, Heidelberg
Freksa C, Schultheis H (in press) Three ways of using space. In: Montello DR, Grossner KE, Janelle DG (eds) Space in mind: concepts for spatial education. MIT Press, Cambridge
Marr D (1982) Vision. MIT Press, Cambridge
Norman DA (1980) The psychology of everyday things. Basic Books, New York
Pfeifer R, Scheier C (2001) Understanding intelligence. MIT Press, Cambridge
Searle J (1980) Minds, brains and programs. Behav Brain Sci 3(3):417-457
Sloman A (1985) Why we need many knowledge representation formalisms. In: Bramer M (ed) Research and development in expert systems. Cambridge University Press, New York, pp 163-183
Wintermute S, Laird JE (2008) Bimodal spatial reasoning with continuous motion. In: Proceedings of AAAI, pp 1331-1337


Inferring 3D shape from texture: a biologically inspired model architecture

Olman Gomez, Heiko Neumann
Inst. of Neural Information Processing, Ulm University, Germany

Abstract
A biologically inspired model architecture for inferring 3D shape from textures is proposed. The model is hierarchically organized into modules roughly corresponding to visual cortical areas in the ventral stream. Initial orientation selective filtering decomposes the input into low-level orientation and spatial frequency representations. Grouping of spatially anisotropic orientation responses builds sketch-like representations of surface shape. Gradients in orientation fields and subsequent integration infer local surface geometry and globally consistent 3D depth.

Keywords
3D Shape, Texture, Gradient, Neural Surface Representation

Introduction
The representation of depth structure can be computed from various visual cues such as binocular disparity, kinetic motion and texture gradients. Based on findings from experimental investigations (Liu et al. (2004); Tsutsui et al. (2002)) we suggest that depth of textured surfaces is inferred from monocular images by a series of processing stages along the ventral stream in visual cortex. Each of these stages is related to individual cortical areas or a strongly clustered group of areas (Markov et al. 2013). Based on previous works that develop generic computational mechanisms of visual cortical network processing (Thielscher and Neumann (2003); Weidenbacher et al. (2006)) we propose a model that transforms initial texture gradient patterns into representations of the intrinsic structure of curved surfaces (lines of minimal curvature, local self-occlusions) and 3D depth (Li and Zaidi (2000); Todd (2004)).

Previous work
Visual texture can assume different component structure which suffers from compression along the direction of surface slant when the object appearance curves away from the viewer's sight. Texture gradients provide a potent cue to local relative depth (Gibson, 1950). Several studies have investigated how size, orientation or density of texture elements convey texture gradient information (Todd and Akerstrom, 1987). Evidence suggests that patterns of changing energy convey the basic information to infer shape from texture that needs to be integrated along characteristic intrinsic surface lines (Li and Zaidi, 2000). Previous computational models try to estimate surface orientation from distortions of the apparent optical texture in the image. The approaches can be subdivided according to their task specificity and the computational strategies involved. Geometric approaches are suggested to reconstruct the structure of the metric surface geometry (e.g., Aloimonos and Swain (1985); Bajcsy and Lieberman (1976); Super and Bovik (1995)). Neural models, on the other hand, infer the relative or even ordinal structure from initial spatial frequency selective filtering, subsequent grouping of the resulting output responses and a depth mapping step (Grossberg et al. 2007; Sakai and Finkel, 1997). The LIGHTSHAFT model of Grossberg et al. (2007) utilizes scale-selective initial orientation filtering and subsequent long-range grouping. Relative depth in this model is inferred by depth-to-scale mapping, associating coarse-to-fine filter scales to depth using orientation sensitive grouping cells which define scale-sensitive spatial compartments to fill in qualitative depth. Grouping mechanisms can be utilized to generate a raw surface sketch to establish lines of minimal surface curvature as a ridge-based qualitative geometry representation (Weidenbacher et al. 2006). Texture gradients can be integrated to derive local maps of relative surface orientation (as suggested in Li and Zaidi (2000); Sakai and Finkel (1997)). Such responses may be integrated to generate globally consistent relative depth maps from such local gradient responses (Liu et al. 2004).
The above-mentioned models are limited to simple objects, most dealing only with regular textures, and do not give an explanation as to how the visual system mechanistically produces a multiple depth order representation of complex objects.

Model description
Our model architecture consists of a multi-stage network of interacting areas that are coupled bidirectionally (extension of Weidenbacher et al. (2006); Fig. 1). The architecture is composed of four functional building blocks or modules, each of which consists of three stages corresponding to the compartment structure of cortical areas: feedforward input is initially filtered by a mechanism specific to the model area, then the resulting activity is modulated by multiplicative feedback signals to enhance their gain, and finally a normalization via surround competition utilizes a pool of cells in the space-feature domain.
The different stages can be formally denoted by the following steady-state equations, with the filter output modulated by feedback and inhibited by activities from a pool of cells (Eq. 1) and the inhibitory pool integration (Eq. 2):

r^{I}_{i,feat} = \frac{\beta \cdot f(F \ast r^{0}) \cdot (1 + net^{I,FB}_{i,feat}) - \eta \cdot q^{I,in}_{i,feat}}{\alpha + \gamma \cdot f(F \ast r^{0}) \cdot (1 + net^{I,FB}_{i,feat}) + q^{I,in}_{i,feat}}    (1)

q^{I,in}_{i,feat} = \delta \cdot \sum_{feat} r^{I}_{i,feat} + \epsilon \cdot \max_{feat}\Big(\sum_{j} r^{I,pool}_{j,feat} \cdot K_{ij}\Big)    (2)

where the feedback signal is defined by net^{I,FB}_{i,feat} = [k_{FB} \cdot r^{II}_{i,feat}] + \sum_{z \in \{feat,loc\}} r^{II}_{z}. Here r^{I} and r^{II} denote the output activations of the generic modules (I, II: two subsequent modules in the hierarchy).
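For readers who want to experiment with the kind of cascade summarized in Eqs. 1 and 2, a schematic NumPy sketch of one module stage is given below. It is an illustrative approximation only: the half-wave rectification, the Gaussian pool kernel, and all parameter names and values are assumptions and do not reproduce the authors' implementation.

import numpy as np

def module_stage(ff_input, feedback, alpha=0.2, beta=1.0, gamma=1.0,
                 eta=0.5, delta=1.0, epsilon=1.0, pool_sigma=2.0):
    """One schematic three-stage module: filter, modulate, normalize.
    ff_input, feedback: arrays of shape (n_locations, n_features)."""
    n_loc, _ = ff_input.shape

    # stages 1+2: feedforward drive, gain-enhanced by multiplicative feedback
    driven = ff_input * (1.0 + feedback)

    # inhibitory pool (Eq. 2, schematically): sum over features plus the
    # maximum over a Gaussian spatial neighbourhood of locations
    idx = np.arange(n_loc)
    kernel = np.exp(-((idx[:, None] - idx[None, :]) ** 2) / (2 * pool_sigma ** 2))
    pooled = (delta * ff_input.sum(axis=1, keepdims=True)
              + epsilon * (kernel @ ff_input).max(axis=1, keepdims=True))

    # stage 3: steady-state divisive normalization (Eq. 1, schematically),
    # half-wave rectified to keep responses non-negative
    num = np.maximum(beta * driven - eta * pooled, 0.0)
    return num / (alpha + gamma * driven + pooled)

rng = np.random.default_rng(0)
x = rng.random((64, 8))          # toy input: 64 locations x 8 orientations
print(module_stage(x, np.zeros_like(x)).shape)   # -> (64, 8)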

Fig. 1 General overview of the model's schematics. Texture inputs are decomposed into a space-orientation-frequency domain representation. The cascaded processing utilizes computational stages with cascades of filtering, top-down modulation via feedback, and competition with activity normalization

The different three-stage modules roughly correspond to different cortical areas with different feature dimensions represented neurally (compare Fig. 1): Cortical area V1 computes orientation selective responses using a spatial frequency decomposition of the input; area V2 accomplishes orientation sensitive grouping of initial items into boundaries in different frequency channels to generate representations of surface curvature properties. Different sub-populations of cells in V4/IT are proposed to detect different surface features from distributed responses: one is used to extract discontinuities in the orientation fields (indicative of self-occlusions), another extracts and analyzes anisotropies in the orientation fields of grouping responses to determine slanted surface regions, and one integrates patches of anisotropic orientation field representations in order to infer local 3D depth. The approach suggests that the generation of a 2D sketch representation of surface invariants seeks to enhance surface border lines, while integrating regions with high response anisotropies in the orientation domain (over spatial frequencies) allows the inference of qualitative depth from texture gradients. The proposed network architecture is composed of four blocks, or modules, each of which defines a cascade of processing stages as depicted in Fig. 1. Module I employs 2D Gabor filters resembling simple cells in area V1. In module II output responses from the previous module are grouped to form extended contour arrangements. Activations are integrated by pairs of 2D anisotropic Gaussian filters separated along the major axis, i.e. the target orientation axis of each orientation band (as in area V2). Grouping is computed separately in each frequency band. This is similar to the LIGHTSHAFT model (Grossberg et al. (2007)), which computes initial spatial frequency-selective responses and subsequently groups them into internal boundaries. Unlike LIGHTSHAFT we employ frequency-related response normalization such that the relative frequency energy in different channels provides direct input for gradient estimation. The sum of the responses here gives a measure of texture compression. In module III the grouping responses are in turn filtered by mechanisms that employ oriented dark-light-dark anisotropic Gaussian spatial weightings with subsequent normalization (as in Thielscher and Neumann (2003)). The output is fed back to module II to selectively enhance occlusion boundaries and edges of the apparent object surface shape. This recurrence helps to extract a sketch-like representation of the surface structure similar to (Weidenbacher et al. 2006). Module IV combines the output of the previous modules and serves as a gradient detector using coarse-grained oriented filters with on/off-subfields (like area V4). In addition, model area IT functions as a directed integrator of gradient responses using pairs of anisotropic Gaussian long-range grouping mechanisms truncated by a sigmoid function. These integrate the gradient cell responses to generate an activation that is related to the surface depth profile.
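Module I's decomposition into orientation- and frequency-selective responses can be sketched with a small 2D Gabor filter bank, as below. The filter sizes, frequencies, and number of orientations are illustrative choices, not the parameters used in the published model.

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(frequency, theta, sigma=4.0, size=21):
    """2D Gabor: Gaussian envelope times an oriented cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate into preferred axis
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * frequency * xr)

def decompose(image, frequencies=(0.10, 0.20), n_orientations=4):
    """Return response magnitudes of shape (n_freq, n_orient, H, W)."""
    thetas = np.linspace(0, np.pi, n_orientations, endpoint=False)
    responses = [[np.abs(fftconvolve(image, gabor_kernel(f, t), mode="same"))
                  for t in thetas] for f in frequencies]
    return np.array(responses)

image = np.random.default_rng(1).random((64, 64))   # stand-in texture patch
print(decompose(image).shape)                        # -> (2, 4, 64, 64)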

Results
We show a few results in order to demonstrate the functionality of the newly proposed model architecture. In Fig. 2 the result of computing surface representations from initial orientation sensitive filtering and subsequent grouping to create a sketch-like shape representation is shown. Then a map of strong anisotropy in the texture energy is shown. These anisotropies refer to locations of local slant in the surface orientation relative to the observer view point and operate independently of the particular texture pattern that appears on the surface.
In Fig. 3 the results of orientation sensitive integration of texture gradient responses are shown, which leads to a viewer-centric surface depth representation. These results are compared against the ground truth surface height map in order to demonstrate the invariance of the inferred shape independent of the texture pattern in the input.

Discussion and conclusion
A neural model is proposed that extracts 3D relative depth shape representations of complex textured objects. The architecture utilizes a hierarchical computational scheme of different stages referring to cortical areas V1, V2, V4 and IT along the ventral pathway to generate representations of shape and the recognition of objects. The model also generates a 2D surface sketch from texture images. Such a sketch contains depth cues such as T-junctions or occlusion boundaries as well as ridge-like structures depicting lines of minimum surface curvature. Unlike previous approaches the model goes beyond a simple detection of local energies of oriented filtering to explain how such localized responses are integrated into a coherent depth representation. Also it does not rely on a heuristic scale-to-depth mapping, like LIGHTSHAFT, to assign relative depth to texture gradients, and it does not require diffusive filling-in of depth (steered by a boundary web representation). Instead, responses distributed anisotropically in the orientation feature domain are selectively integrated for different orientations to generate qualitative surface depth.

Acknowledgments
O.G. is supported by a scholarship of the German DAAD, ref.no. A/10/90029.

Fig. 2 Result of grouping initial filter responses in the space-orientation domain (separately for individual frequency channels) for the input image (upper left). Texture gradient information is calculated over the normalized responses of cells in different frequency channels (upper right). Stronger response anisotropies are mapped to white. The short axis of the anisotropies (strongest compression) coheres with the slant direction (surface tilt). The maximum responses over frequency and orientation (white) create a sketch-like representation of the ridges of a surface corresponding with the orientation of local minimal curvature (bottom left). Also local junctions occur due to self-occlusions generated by concave surface geometry. The result of orientation contrast detection (bottom right) is fed back to enhance the sketch edges

Fig. 3 3D depth structure computed for different input textures for the same surface geometry (left). Results of the inferred depth structure are shown (right) for the given ground truth pattern (bottom). Relative error (RE) measures are calculated to determine the deviation of the depth estimation from the true shape

References
Aloimonos J, Swain MJ (1985) Shape from texture. In: Proceedings of the 9th IJCAI, Los Angeles, CA, pp 926-931
Bajcsy R, Lieberman L (1976) Texture gradient as a depth cue. Comput Graph Image Process 5(1):52-67
Gibson JJ (1950) The perception of the visual world. Houghton Mifflin
Grossberg S, Kuhlmann L, Mingolla E (2007) A neural model of 3D shape-from-texture: multiple-scale filtering, boundary grouping, and surface filling-in. Vision Res 47(5):634-672
Li A, Zaidi Q (2000) Perception of three-dimensional shape from texture is based on patterns of oriented energy. Vision Res 40(2):217-242
Liu Y, Vogels R, Orban GA (2004) Convergence of depth from texture and depth from disparity in macaque inferior temporal cortex. J Neurosci 24(15):3795-3800
Markov NT, Ercsey-Ravasz M, Van Essen DC, Knoblauch K, Toroczkai Z, Kennedy H (2013) Cortical high-density counterstream architectures. Science 342(6158):1238406
Sakai K, Finkel LH (1997) Spatial-frequency analysis in the perception of perspective depth. Netw Comput Neural Syst 8(3):335-352
Super BJ, Bovik AC (1995) Shape from texture using local spectral moments. IEEE Trans PAMI 17(4):333-343
Thielscher A, Neumann H (2003) Neural mechanisms of cortico-cortical interaction in texture boundary detection: a modeling approach. Neuroscience 122(4):921-939
Todd JT (2004) The visual perception of 3D shape. Trends Cogn Sci 8(3):115-121
Todd JT, Akerstrom RA (1987) Perception of three-dimensional form from patterns of optical texture. J Exp Psychol Hum Percept Perform 13(2):242
Tsutsui KI, Sakata H, Naganuma T, Taira M (2002) Neural correlates for perception of 3D surface orientation from texture gradient. Science 298(5592):409-412
Weidenbacher U, Bayerl P, Neumann H, Fleming R (2006) Sketching shiny surfaces: 3D shape extraction and depiction of specular surfaces. ACM Trans Appl Percept 3(3):262-285


An activation-based model of execution delays of specific task steps

Marc Halbrügge, Klaus-Peter Engelbrecht
Quality and Usability Lab, Telekom Innovation Laboratories, Technische Universität Berlin, Germany

Abstract
When humans use devices like ticket vending machines, their actions can be categorized into task-oriented (e.g. selecting a ticket) and device-oriented (e.g. removing the bank card after having paid). Device-oriented steps contribute only indirectly to the user's goal; they take longer than their task-oriented counterparts and are more likely to be forgotten. A promising explanation is provided by the activation-based memory for goals model (Altmann and Trafton 2002). The objectives of this paper are, first, to replicate the step prolongation effect of device-orientation in a kitchen assistance context, and secondly, to investigate whether the activation construct

can explain this effect using cognitive modeling. Finally, a necessity and sensitivity analysis provides more insights into the relationship between goal activation and device-orientation effects.

Keywords
Cognitive Modeling, Human-Computer Interaction, ACT-R, Memory, Human Error

Introduction and related work
While the research on task completion times in human-computer interaction (HCI) has brought many results of both theoretical and practical nature during the last decades (see John and Kieras 1996, for an overview), the relationship between interface design and user error is still unclear in many parts. Notable exceptions are post-completion errors, when users fail to perform an additional step in a procedure after they have already reached their main goal (Byrne and Davis 2006). This concept can be extended to any step that does not directly support the user's goals, independently of its position in the action sequence, and has been termed device-orientation in this context (Ament et al. 2009). The opposite (i.e. steps that do contribute to the goal) is analogously called task-orientation. Device-oriented steps take longer and are more prone to omission than task-oriented ones (Ament 2011).
A promising theoretical explanation for the effects of device-orientation is provided by the memory for goals model (MFG; Altmann and Trafton 2002). The main assumption of the MFG is that goals underlie effects that are usually connected to memory traces, namely time-dependent activation and associative priming. Within the theoretical framework of the MFG, post-completion errors and increased execution times for post-completion steps are caused by lack of activation of the respective sub-goal. A computational implementation of the MFG that can be used to predict sequence errors has been created by Trafton et al. (2009).
This paper aims at investigating the concept of device-orientation against the background of the MFG using cognitive modeling with ACT-R (Anderson et al. 2004). The basic research question is whether human memory constructs as formalized within ACT-R can explain the completion time differences between task- and device-oriented steps found in empirical data.

Experiment
As the empirical basis for our investigation, we decided not to rely on synthetic laboratory tasks like the Tower of Hanoi game, but instead to use an application that could be used by everyone in an everyday environment. Our choice fell on an HTML-based kitchen assistant that had been created for research on ambient assisted living. Among other things, the kitchen assistant allows searching for recipes depending on regional cuisine (French, Italian, German, Chinese) and type of dish (main dish, appetizer, dessert, pastry). Our experiment was built around this search feature.
12 subjects (17 % female, M_age = 28.8, SD_age = 2.4) were invited into the lab kitchen and performed 34 individual search tasks of varying difficulty in five blocks. The user interface (UI) of the kitchen assistant was presented on a personal computer with an integrated touch screen. Task instructions were given verbally and all user clicks were recorded by the computer system [7]. Individual trials consisted of five phases:
1. Listening to and memorizing the instructions for the given trial.
2. Entering the search criteria (e.g. German and Main dish) by clicking on respective buttons on the screen. This could also contain deselecting criteria from previous trials.
3. Initiating the search using a dedicated Search button. This also initiated switching to a new screen containing the search results list if this list was not present yet.
4. Selecting the target recipe (e.g. Lamb chops) in the search results list.
5. Answering a simple question about the recipe (e.g. What is the preparation time?) as displayed by the kitchen assistant after having selected the recipe.
We did not analyze the first and last phase as they do not create observable clicks on the touch screen. Of the remaining three phases, entering search criteria and recipe selection are task-oriented, while the intermediate Search-click is device-oriented.

Results
We recorded a total of 18 user errors. Four were intrusions, nine were omissions, five were selections of wrong recipes. The application logic of the kitchen assistant inhibits overt errors during the device-oriented step. We therefore focused on completion time as dependent variable and discarded all erroneous trials.
As our focus is on memory effects, we concentrated on steps that tax only the memory and motor system. We removed all subtasks that need visual search and encoding (phase 4: searching for the target recipe in the results list and clicking on it), and steps that incorporated substantial computer system response times (i.e. moving to another UI screen). 817 clicks remained for further analysis; 361 (44 %) of these were device-oriented. The average time to perform a click was 764 ms (SD = 381) for task-oriented and 977 ms (SD = 377) for device-oriented steps.
As the kitchen assistant has been created for research in an area different from HCI, it introduces interfering variables that need to be controlled. The motor time needed to perform a click on a target element (i.e. button) depends strongly on the size and distance of the target as formalized in Fitts' law (Fitts 1954). Because Fitts' index of difficulty (ID) cannot be held constant for the different types of clicks, we introduced it into the analysis. As the click speed (i.e. Fitts' law parameters) differs between subjects, we used linear mixed models (NLME; Pinheiro et al. 2013) with subject as grouping factor and Fitts' law intercept and slope within subject. We also observed a small but consistent speed-up during the course of the experiment that led us to introduce the trial block as an additional interfering variable. The analysis of variance was conducted using R (R Core Team 2014). All three factors yielded significant results; we obtained a prolongation effect for device-oriented steps of 104 ms. The results are summarized in Table 1.

Discussion
The first objective of this paper is met: we could identify a significant execution time delay for device-oriented steps. How does this effect relate to the existing literature? Ament et al. (2009) report an insignificant difference of 181.5 ms between task-oriented and device-oriented steps. This fits well with the empirical averages reported at the beginning of the results section, although the experimental procedure used there (flight simulation game) led to longer steps with completion times well above two seconds.
What remains open is whether the proposed cognitive mechanism behind the time difference, namely lack of activation, can account for this time difference. The next section addresses this question.

Table 1 Regression coefficients (coef.) with confidence intervals (CI) and analysis of variance results for the experiment

Factor            Coef.     95 % CI of coef.    F(1,802)   p
Fitts' ID         165 ms    126 to 204 ms       111.1      <.001
Trial block       -55 ms    -71 to -39 ms       45.9       <.001
Device-orient.    104 ms    53 to 154 ms        16.4       <.001

Note: Individual slopes for Fitts' difficulty (ID) ranged from 121 to 210 ms/bit.

[7] The experiment as described here was embedded in a larger usability study. See Quade et al. (2014) for more details. The instructions are available for download at http://www.tu-berlin.de/?id=135088.
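The reported analysis was run in R with nlme; a rough Python equivalent using statsmodels is sketched below on synthetic data whose coefficients merely mimic the reported values (about 165 ms/bit for Fitts' ID, -55 ms per block, 104 ms for device orientation). Column names, simulated distributions, and the noise level are assumptions for illustration, not the original data or code.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 800
df = pd.DataFrame({
    "subject": rng.integers(0, 12, n),
    "fitts_id": rng.uniform(0.5, 3.5, n),     # index of difficulty in bits
    "block": rng.integers(0, 5, n),
    "device_oriented": rng.integers(0, 2, n),
})
df["click_ms"] = (500 + 165 * df.fitts_id - 55 * df.block
                  + 104 * df.device_oriented + rng.normal(0, 150, n))

# Linear mixed model: by-subject random intercepts and Fitts slopes,
# mirroring the grouping structure described in the text.
model = smf.mixedlm("click_ms ~ fitts_id + block + device_oriented",
                    data=df, groups=df["subject"], re_formula="~fitts_id")
print(model.fit().summary())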

Table 2 Average click time (Mtime), average memory retrieval time (Mmem), determination coefficient (R2), root mean squared error (RMSE), maximum likely scaled difference (MLSD), and maximum relative difference (%diff) for different amounts of activation spreading (mas)

mas    Mtime      Mmem     R2     RMSE     MLSD    %diff
2      1785 ms    591 ms   .759   982 ms   16.5    66 %
4      1509 ms    315 ms   .738   687 ms   12.1    58 %
6      1291 ms    99 ms    .881   477 ms   8.5     50 %
8      1231 ms    37 ms    .912   422 ms   7.9     48 %
10     1210 ms    15 ms    .893   406 ms   7.8     48 %

The MFG model
We implemented the memory for goals theory based on the mechanisms provided by the cognitive architecture ACT-R (Anderson et al. 2004), as the MFG is originally based on the ACT-R theory (Altmann and Trafton 2002). Within ACT-R, memory decay is implemented based on a numerical activation property belonging to every chunk (i.e. piece of knowledge) in declarative memory. Associative priming is added by a mechanism called spreading activation.
This led to the translation of the tasks used in our experiment into chains of goal chunks. Every goal chunk represents one step towards the target state of the current task. One element of the goal chunk ('slot' in ACT-R speak) acts as a pointer to the next action to be taken. After completion of the current step, this pointer is used to retrieve the following goal chunk from declarative memory. The time required for this retrieval depends on the activation of the chunk to be retrieved. If the activation is too low, the retrieval may fail completely, resulting in an overt error.
The cognitive model receives the task instructions through the auditive system, just like the human participants did. For reasons of simplicity, we reduced the information as much as possible. The user instruction "Search for German dishes and select lamb chops", for example, translates to the model instruction "German on; search push; lamb-chops on". The model uses this information to create the necessary goal chunks in declarative memory. No structural information about the kitchen assistant is hard-coded into the model, only the distinction that some buttons need to be toggled on, while others need to be pushed.
While the model should in principle be able to complete the recipe search tasks of our experiment with the procedural knowledge described above, it actually breaks down due to lack of activation. Using unaltered ACT-R memory parameters, the activation of the goal chunks is too low to be able to reach the target state (i.e. recipe) of a given task. We therefore need to strengthen our goals, and spreading activation is the ACT-R mechanism that helps us do so. How we apply spreading activation in our context is inspired by close observation of one of our subjects who used self-vocalization for memorizing the current task information. The self-vocalization contained only the most relevant parts of the task, which happen to be identical to the task-oriented steps of the procedure. We analogously theorize that the goal states representing task-oriented steps receive more spreading activation than their device-oriented counterparts. This assumption is also in line with the discussion of post-completion errors on the basis of the memory for goals model in Altmann and Trafton (2002).
For the evaluation of the model, we used ACT-CV (Halbrügge 2013) to connect it directly to the HTML-based user interface of the kitchen assistant. In order to be able to study the effect of spreading activation in isolation, we disabled activation noise and manipulated the value of the ACT-R parameter that controls the maximum amount of spreading activation (mas). The higher this parameter, the more additional activation is possible [8].

Results
We evaluated the overall fit of the model by dividing the clicks into eight groups by the screen areas of the origin and target click position (e.g. from type of dish to search; from search to recipe selection) and compared the average click times per group between our human sample and the model. Besides the traditional goodness-of-fit measures R2 and root mean squared error (RMSE), we applied the maximum likely scaled difference (MLSD; Stewart and West 2010), which also takes the uncertainty in the human data into account. The relative difference between the empirical means and the model predictions is given in percent (%diff). The results for five different amounts of activation spreading are given in Table 2.
The model is overall slower than the human participants, resulting in moderately high values for RMSE, MLSD, and relative difference. The explained variance (R2), on the other hand, is very promising and hints at the model capturing the differences between different clicks quite well.

Sensitivity and necessity analysis
In order to test whether our model also displays the device-orientation effect, we conducted a statistical analysis identical to the one used on the human data and compared the resulting regression coefficients. While an acceptable fit of the model is necessary to support the activation spreading hypothesis, it is not sufficient to prove it. By manipulating the amount of activation spreading, we can perform a sensitivity and necessity analysis that provides additional insight about the consequences of our theoretical assumptions (Gluck et al. 2010). Average coefficients from a total of 400 model runs are displayed in Fig. 1. It shows an inverted U-shaped relationship between spreading activation and the device-orientation effect. For intermediate spreading activation values, the time delay predicted by the model falls within the confidence interval of the empirical coefficient, meaning perfect fit given the uncertainty in the data.

[8] The ACT-R code of the model is available for download at http://www.tu-berlin.de/?id=135088.

Fig. 1 Device orientation effect size (in ms) depending on the amount of spreading activation (ACT-R parameter mas, range 2-10). The shaded area between the dotted lines demarks the 95 % confidence interval of the effect in the human sample
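The compression of retrieval times at high mas values (the ceiling effect taken up in the Discussion below) can be illustrated with the standard ACT-R latency and spreading-activation equations. The sketch below uses invented base-level activations, a fan value, source weights, and a latency factor, so the absolute numbers are not the model's; it only shows how the task/device retrieval difference shrinks towards zero as mas grows.

import math

F = 0.6              # latency factor in seconds (assumed)
BASE_TASK = 0.0      # assumed base-level activation, task-oriented goal chunk
BASE_DEVICE = -0.3   # assumed lower base level, device-oriented goal chunk
FAN = 3              # assumed fan of the priming source

def retrieval_time(activation):
    """Standard ACT-R latency equation: time = F * exp(-activation)."""
    return F * math.exp(-activation)

def spreading(mas, weight):
    """Rough spreading activation: weight * (mas - ln(fan)), floored at 0."""
    return weight * max(mas - math.log(FAN), 0.0)

for mas in (2, 4, 6, 8, 10):
    t_task = retrieval_time(BASE_TASK + spreading(mas, 1.0))
    t_device = retrieval_time(BASE_DEVICE + spreading(mas, 0.8))
    delta = 1000 * (t_device - t_task)
    print(f"mas={mas:2d}  device-task delta = {delta:6.1f} ms")
# The delta shrinks towards zero as mas increases, mirroring the falling
# right half of Fig. 1 and the near-zero Mmem values in Table 2.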

Discussion
The MFG model is able to replicate the effects that we found in our initial experiment. The model being overall slower than the human participants could be caused by the rather low Fitts' law parameter used within ACT-R (100 ms/bit) compared to the 165 ms/bit that we observed.
Spreading activation is not only necessary for the model to be able to complete the tasks, but also to display the device-orientation effect (Fig. 1). We can infer that the activation assumption is a sound explanation of the disadvantage of device-oriented steps. Too much spreading activation reduces the effect again, though. This can be explained by a ceiling effect: the average retrieval time gets close to zero for high values of mas (Mmem in Table 2), thereby diminishing the possibility for timing differences.
How relevant is a 100 ms difference in real life? Probably not too much by itself. What makes it important is its connection to user errors. Errors themselves are hard to provoke in the lab without adding secondary tasks that interrupt the user or create strong working memory strain, thereby substantially lowering external validity.

Conclusions
The concept of device-orientation versus task-orientation is an important aspect of human-computer interaction. We could replicate that the device-oriented parts of simple goal-directed action sequences take approximately 100 ms longer than their task-oriented counterparts. With the help of cognitive modeling, associative priming could be identified as a possible explanation for this effect.

Acknowledgments
The authors gratefully acknowledge financial support from the German Research Foundation (DFG) for the project Automatische Usability-Evaluierung modellbasierter Interaktionssysteme für Ambient Assisted Living (AL-561/13-1).

References
Altmann EM, Trafton JG (2002) Memory for goals: an activation-based model. Cogn Sci 26(1):39-83
Ament MG, Blandford A, Cox AL (2009) Different cognitive mechanisms account for different types of procedural steps. In: Taatgen NA, van Rijn H (eds) Proceedings of the 31st annual conference of the cognitive science society, Amsterdam, NL, pp 2170-2175
Ament MGA (2011) The role of goal relevance in the occurrence of systematic slip errors in routine procedural tasks. Dissertation, University College London
Anderson JR, Bothell D, Byrne MD, Douglass S, Lebiere C, Qin Y (2004) An integrated theory of the mind. Psychol Rev 111(4):1036-1060
Byrne MD, Davis EM (2006) Task structure and postcompletion error in the execution of a routine procedure. Hum Factors 48(4):627-638
Fitts PM (1954) The information capacity of the human motor system in controlling the amplitude of movement. J Exp Psychol 47(6):381-391
Gluck KA, Stanley CT, Moore LR, Reitter D, Halbrügge M (2010) Exploration for understanding in cognitive modeling. J Artif Gen Intell 2(2):88-107
Halbrügge M (2013) ACT-CV: bridging the gap between cognitive models and the outer world. In: Brandenburg E, Doria L, Gross A, Günzler T, Smieszek H (eds) Grundlagen und Anwendungen der Mensch-Maschine-Interaktion. 10. Berliner Werkstatt Mensch-Maschine-Systeme, Universitätsverlag der TU Berlin, Berlin, pp 205-210
John BE, Kieras DE (1996) Using GOMS for user interface design and evaluation: which technique? ACM Trans Comput Hum Interact (TOCHI) 3(4):287-319
Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team (2013) nlme: linear and nonlinear mixed effects models. R package version 3.1-113
Quade M, Halbrügge M, Engelbrecht KP, Albayrak S, Möller S (2014) Predicting task execution times by deriving enhanced cognitive models from user interface development models. In: Proceedings of the 2014 ACM SIGCHI symposium on engineering interactive computing systems, ACM, New York, pp 139-148
R Core Team (2014) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org. Accessed 7 May 2014
Stewart TC, West RL (2010) Testing for equivalence: a methodology for computational cognitive modelling. J Artif Gen Intell 2(2):69-87
Trafton JG, Altmann EM, Ratwani RM (2009) A memory for goals model of sequence errors. In: Howes A, Peebles D, Cooper RP (eds) Proceedings of the 9th international conference on cognitive modeling, Manchester, UK


How action effects influence dual-task performance

Markus Janczyk, Wilfried Kunde
Department of Psychology III, University of Würzburg, Germany

Doing multiple tasks at once typically involves performance costs in at least one of these tasks. This unspecific dual-task interference occurs regardless of the exact nature of the tasks. On top of that, several task characteristics determine how well tasks fit with each other. For example, if two tasks require key press responses with the left and right hand, performance, even in the first performed Task 1, is better if both responses entail the same spatial characteristic (i.e., if two left or two right responses are required compared with when one left and one right response is required), the so-called backward-crosstalk effect (BCE; Hommel, 1998). Similarly, a mental rotation is faster when it is preceded by or performed simultaneously with a manual rotation into the same direction compared to when both rotations go into opposite directions (Wexler, Kosslyn, and Berthoz, 1998; Wohlschläger and Wohlschläger, 1998). These examples are cases of specific dual-task interference.
Given that the aforementioned tasks require some form of motor output, one may ask: how is this motor output selected? A simple solution to this question has been offered already by philosophers of the 19th century (e.g., Harleß, 1861) and has experienced a revival in psychology in recent decades (e.g., Hommel, Müsseler, Aschersleben, and Prinz, 2001): the ideomotor theory (IT). The basic idea of IT is that, first, bidirectional associations between motor output and its consequences (= action effects) are learned. Later on, this bidirectionality is exploited for action selection: motor output is accessed by mentally anticipating the action effects. Conceptually, action effects can be distinguished as being environment-related (e.g., a light that is switched on by pressing a key) or body-related (e.g., the proprioceptive feedback from bending the finger).
Against this background, consider again the case of mental and manual rotations. Turning a steering wheel clockwise gives rise to body-related proprioceptive feedback resembling a clockwise turn and even to obvious environment-related action effects, because one sees his/her hand and the wheel turning clockwise. According to IT, exactly these effects are anticipated to select the motor output. However, the rotation directions of (anticipated) effects and of the actual motor output are then confounded. Consequently, one may wonder whether the manual rotation or rather the (anticipated) effect rotation is what determines the specific interference with a mental rotation. The same argument applies to the BCE: pressing the left of two response keys requires anticipation of, for example, a left body-

related action effect, which is thus similarly confounded with the spatial component of the actual motor output. Given the importance IT attributes to action effects for action selection, we hypothesized that action effects determine the size and direction of specific interference in such cases. We here present results from two studies that aimed to disentangle the contributions of motor output and the respective action effects. Conceivably, it is rather difficult to manipulate body-related action effects; the approach was thus to couple the motor output with environment-related action effects.
In a first study we investigated the interplay of manual and mental rotations (Janczyk, Pfister, Crognale, and Kunde, 2012). To disentangle the directions of manual and effect rotations, we resorted to an instrument from aviation known as the attitude indicator or artificial horizon. This instrument provides the pilot with information about deviations from level flight (perfect horizontal flying). Notably, two versions of this instrument are available (Previc and Ercoline, 1999; see also Fig. 1). In a plane-moving display, the horizon remains fixed and turns of a steering wheel are visualized by the corresponding turns of the plane. Consequently, turning a steering wheel counter-clockwise results in an action effect rotating into the same direction. Obviously, manual and effect rotation are confounded, but this provides a benchmark against which the critical condition using a horizon-moving display can be compared. In this display, the plane remains fixed but the horizon rotates. Consequently, turning the steering wheel counter-clockwise gives rise to an action effect turning clockwise. In our experiments, a mental rotation task (Shepard and Metzler, 1971) was followed by a manual rotation task that required turning a steering wheel. The plane's curve due to this steering wheel turn was either visualized with the plane-moving or the horizon-moving display. First, with the plane-moving display the manual rotation was initiated faster when the preceding mental rotation went into the same direction (essentially replicating the Wohlschläger and Wohlschläger, 1998, and Wexler et al., 1998, results, but with the reversed task order). Second, with the horizon-moving display, the manual rotation was initiated faster when the preceding mental rotation was into the opposite direction (Exp 3). Here, however, the mental and the effect rotation were into the same direction. Thus, these results suggest that what is important for the specific interference between mental and manual rotations is not so much the motor output itself but rather what follows from this motor output as a consequence.

Fig. 1 Illustration of the tasks used by Janczyk et al. (2012)

In a second study we used a similar strategy to investigate the origin of the BCE (Janczyk, Pfister, Hommel, and Kunde, 2014). In Experiment 1 of this study, participants were presented with colored letters as stimuli (a green or red H or S). Task 1 required a response to the color with a key press of the left hand (index or middle finger) and Task 2 required a subsequent key press depending on the letter identity with the right hand (index or middle finger). The BCE in this case would be reflected in better performance in Task 1 if both tasks required a left or a right response (compatible R1-R2 relations) compared to when one task required a left response and the other a right response (incompatible R1-R2 relations). The critical addition was that pressing a response key with the right hand briefly flashed a left or right light (i.e., an environment-related action effect), and this was the participant's goal. One group of participants (the R2-E2 compatible group; see Fig. 2, left part) flashed the left light with a left key press (of the right hand) and the right light with a right key press (of the right hand). This group produced a BCE, that is, better Task 1 performance was observed with compatible R1-R2 relations (see Fig. 2, middle part). Again, though, the relative locations of motor output and action effects were confounded. Therefore, another group of participants (the R2-E2 incompatible group; see Fig. 2, right part) flashed the left light with a right key press (of the right hand) and the right light with a left key press (of the right hand). Now Task 1 performance was better with incompatible R1-R2 relations (see Fig. 2, middle part). This, however, means that the relative locations of (body-related) action effects of the Task 1 response and the environment-related action effects of Task 2 were compatible. This basic outcome was replicated with continuous movements and action effects (Exp 2) and also when both tasks resulted in environment-related action effects (Exp 3).

Fig. 2 Illustration of the tasks used by Janczyk et al. (2014) and the results of their Experiment 1. Error bars are within-subject standard errors (Pfister and Janczyk 2013), computed separately for each R2-E2 relation group (see also Janczyk et al. 2014)

The generative role of anticipated action effects for action selection, a pillar of IT, has been investigated in single-task settings in numerous studies. The studies summarized in this paper extend this basic idea to dual-task situations and tested our assertion that mainly the (anticipated) action effects determine the size and direction of specific interference phenomena. In sum, the results presented here provide evidence for this (see also Janczyk, Skirde, Weigelt, and Kunde, 2009, for converging evidence). In broader terms, action effects can be construed as action goals. Thus, it is not so much the compatibility of motor outputs and effectors but rather the compatibility/similarity of action goals that induces performance costs or facilitation. Such an interpretation also bears potential for improving dual-task performance and ergonomic aspects in, for example, working environments.

Acknowledgments
This research was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG; projects KU 1964/2-1, 2).

References
Harleß E (1861) Der Apparat des Willens. Z Philos philos Krit 38:50-73
Hommel B (1998) Automatic stimulus-response translation in dual-task performance. J Exp Psychol Hum Percept Perform 24:1368-1384
Hommel B, Müsseler J, Aschersleben G, Prinz W (2001) The theory of event coding: a framework for perception and action. Behav Brain Sci 24:849-878
Janczyk M, Pfister R, Crognale MA, Kunde W (2012) Effective rotations: action effects determine the interplay of mental and manual rotations. J Exp Psychol Gen 141:489-501

Janczyk M, Pfister R, Hommel B, Kunde W (2014) Who is talking in images. Besides this widely found linearity, this claim is supported by
backward crosstalk? Disentangling response- from goal-conflict psychophysiological findings that are summarized in Kosslyn (1996).
in dual-task performance. Cognition 132: 3043 The second assumption stems from findings about the influence of
Janczyk M, Skirde S, Weigelt M, Kunde, W (2009) Visual and tactile object complexity (e.g. Bethell-Fox and Shepard, 1988; Yuille and
action effects determine bimanual coordination performance. Steiger, 1982). It is assumed that objects can be rotated holistically if
Hum Movement Sci 28: 437449 they are sufficiently familiar. If an object is not, it will be broken down
Pfister R, Janczyk, M (2013) Confidence intervals for two sample into its components until these components are simple (i.e., familiar)
means: Calculation, interpretation, and a few simple rules. Adv enough to rotate. Then these components will be rotated subsequently.
Cogn Psychol 9: 7480 Third, the mental images that are maintained and transformed
Previc FH, Ercoline WR (1999) The outside-in attitude display throughout the rotation task are assumed to be subject to activation
concept revisited. Int J Aviat Psychol 9: 377401 processes. This means that they have to be reactivated during the
Shepard RN, Metzler J (1971) Mental rotations of three-dimensional process. This assumption is suggested by Kosslyn (1996) and fits
objects. Science 171: 701703 Cowans (1999) activation of working memory contents. It is fur-
Wexler M, Kosslyn SM, Berthoz A (1998) Motor processes in mental thermore supported by Just and Carpenters (1976) results. Analyzing
rotation. Cognition 68: 7794 eye movement during a mental rotation task, the authors found fre-
Wohlschlager A, Wohlschlager A (1998) Mental and manual rota- quent fixation changes between both object images.
tions. J Exp Psychol Human 24: 397412 Cognitive Model
A full description of the model is beyond the scope of this paper. This
section gives a short overview of the process steps that were derived from
the abovementioned assumptions. The described process applies to
Introduction of an ACT-R based modeling approach mental rotation tasks in which both stimuli are presented simultaneously.
to mental rotation 1. Stimulus encoding
The first image is encoded and a three-dimensional object repre-
Fabian Joeres, Nele Russwinkel
sentation (mental image) is created.
Technische Universität Berlin, Department of Cognitive Modeling in Dynamic Human-Machine Systems, Berlin, Germany

Introduction
The cognitive processes of mental rotation as postulated by Shepard and Metzler (1971) have been extensively studied throughout the last decades. With the introduction of numerous human-machine interface concepts that are integrated into the human's spatial environment (e.g. augmented-reality interfaces such as Google Glass or virtual-reality interfaces such as Oculus Rift), human spatial competence and its understanding have become more and more important. Mental rotation is seen as one of three main components of human spatial competence (Linn and Petersen 1985). A computational model of mental rotation was developed to help understand the involved cognitive processes. This model integrates a wide variety of empirical findings on mental rotation. It was validated in an experimental study and can be seen as a promising approach for further modeling of more complex, application-oriented tasks that include spatial cognitive processes.

Mental Rotation
In an experiment on object recognition, Shepard and Metzler (1971) displayed two abstract three-dimensional objects from different perspectives to their participants. Both images showed either the same object (same-trials) or mirrored versions of the same object (different-trials). The objects were rotated around either the vertical axis or within the picture plane. Subjects were asked to determine whether both images showed the same object. The authors found that the reaction time needed to match two objects forms a linear function of the angular disparity between those objects. The slope of that linear function is called the rotation rate. Following Shepard and Metzler's interpretation of an analogue rotation process, a high rotation rate represents slow rotation, whereas fast rotation is expressed by a low rotation rate.
Since Shepard and Metzler's (1971) experiment, numerous studies have been conducted on the influences that affect the rotation rate of mental rotation. Based on these findings and on process concepts suggested by various authors, a cognitive model has been developed. The following section summarizes the three main assumptions that the model is based on.
First, it is assumed that the linear dependence of angular displacement and reaction time is based on an analogue transformation of mental

2. Memory retrieval
Based on the three-dimensional representation, long-term memory is queried to check whether the encoded object is familiar enough to process its representation. If so, the representation is stored in working memory and the second image is encoded. The created representation is used as reference in the following process steps (reference image). If the object is not familiar enough, the same retrieval is conducted for an object component, and the information about the remaining component(s) is stored in working memory.

3. Initial Search
Several small transformations (i.e., rotations around different axes by only a few degrees) are applied to the mental image that was created first (target image). After each small rotation, the model evaluates whether the rotation reduced the angular disparity between both mental images. The most promising rotation axis is chosen. The decision (as well as the monitoring process in the following step) is based on previously identified corresponding elements of the object representations.

4. Transform and Compare
After defining the rotation axis in step 3, the target image is rotated around this axis. During this process, the orientation of the target representation is constantly monitored and compared to the reference representation. The rotation is stopped when both representations are aligned.

5. Confirmation
If the object was processed piecemeal, the previously defined rotation is applied to the remaining object components. After that, a propositional description of all object parts is created for both mental images. A comparison of these delivers a decision for "same object" or "different objects".

6. Reaction
Based on the decision, a motor response is triggered.
Steps 3, 4, and 5 are inspired by the equally named process steps suggested by Just and Carpenter (1976). However, although their purpose is similar to that of Just and Carpenter's steps, the details of these sub-processes are different.
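The search-transform-compare cycle of steps 3 to 5 can be illustrated with a minimal sketch. The point-set representation of the two mental images, the candidate axes, the 1-degree step size and the decision threshold below are assumptions made purely for illustration; the model itself operates on ACT-R representations rather than coordinate arrays.

```python
# Illustrative sketch of the initial-search / transform-and-compare cycle
# (steps 3-5). Point sets, candidate axes, step size and the "same" threshold
# are assumptions of this sketch, not part of the published model.
import numpy as np

def rotation_matrix(axis, angle_deg):
    """Rotation matrix for a rotation by angle_deg around a unit axis (Rodrigues)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    t = np.radians(angle_deg)
    K = np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])
    return np.eye(3) + np.sin(t) * K + (1 - np.cos(t)) * (K @ K)

def disparity(target, reference):
    """Mean distance between previously identified corresponding elements."""
    return np.mean(np.linalg.norm(target - reference, axis=1))

def mental_rotation(target, reference,
                    axes=((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)),
                    probe_deg=5.0, step_deg=1.0, max_steps=360):
    # Step 3: probe each candidate axis with a small rotation and keep the most promising one.
    gains = [disparity(target, reference)
             - disparity(target @ rotation_matrix(ax, probe_deg).T, reference)
             for ax in axes]
    best_axis = axes[int(np.argmax(gains))]
    # Step 4: rotate around that axis while monitoring the disparity between the images.
    current = target
    for _ in range(max_steps):
        candidate = current @ rotation_matrix(best_axis, step_deg).T
        if disparity(candidate, reference) >= disparity(current, reference):
            break  # no further improvement: representations are aligned
        current = candidate
    # Step 5: compare the aligned representations (0.1 is an arbitrary threshold).
    return "same" if disparity(current, reference) < 0.1 else "different"
```

Feeding the sketch a reference point set and the same set rotated by, say, 120 degrees about the vertical axis makes it step through roughly 120 one-degree rotations before answering "same", mirroring the linear relation between angular disparity and response time described above.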
Furthermore, due to the abovementioned activation processes, steps 3 to 5 can be interrupted if the activation of one or both mental images falls below a threshold. In that case, a reactivation sub-process is triggered that includes re-encoding of the corresponding stimulus.
The model was implemented within the cognitive architecture ACT-R (Anderson et al. 2004). Since ACT-R does not provide structures for modeling spatial cognitive processes, an architecture extension based on Gunzelmann and Lyon's (2007) concept was developed. Adding the structures for spatial processes to the architecture will enable ACT-R modelers to address a broad range of applied tasks that rely on spatial competence.
Experiment
Empirical validation is an integral part of the model development process. Therefore, an experimental study was conducted to test the assumptions about training effects and to predict these effects with the above-mentioned model.
Experimental approach
The experimental task was a classic mental rotation task with three-dimensional objects, as in Shepard and Metzler's (1971) study. In this task, two images were displayed simultaneously. Reaction times were measured for correctly answered same-trials. Different-trials and human errors involve cognitive processes that are not addressed by the discussed cognitive model.
To test the assumption of object-based learning, the objects occurred with different frequencies. The entire stimulus set consisted of nine objects, adopted from the stimulus collection of Peters and Battista (2008). One third of all trials included the same object, making this familiar object occur four times as often as each of the other eight unfamiliar objects. The object used as familiar was balanced over the participants.
To capture training effects, the change of rotation rates was monitored. Following the approach of Tarr and Pinker (1989), the experiment was divided into eight blocks. In each block, one rotation rate for the familiar object and one rotation rate for all unfamiliar objects were calculated, based on the measured reaction times.
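Since the rotation rate is defined above as the slope of the linear function relating reaction time to angular disparity, a per-block estimate can be obtained by a simple line fit. The abstract does not state the exact fitting procedure; ordinary least squares and the function name below are assumptions of this sketch.

```python
# Sketch: estimating one rotation rate per block as the slope of a least-squares
# line relating reaction time to angular disparity (ms per degree).
import numpy as np

def rotation_rate(angular_disparity_deg, reaction_time_ms):
    """Slope of RT over angular disparity for one block of trials."""
    slope, _intercept = np.polyfit(angular_disparity_deg, reaction_time_ms, deg=1)
    return slope

# Example with synthetic data: RTs that grow by about 2.5 ms per degree
# yield a rotation rate of roughly 2.5.
angles = np.array([0, 40, 80, 120, 160])
rts = 500 + 2.5 * angles + np.random.default_rng(0).normal(0, 20, size=angles.size)
print(round(rotation_rate(angles, rts), 2))
```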
The described model is designed to predict learning-induced changes in the rotation rate. As Schunn and Wallach (2005) suggest, two measures of the model's goodness of fit were used to evaluate the model. As the proportion of data variance that can be accounted for by the model, r² is a measure of how well the model explains trends in the experimental data. RMSSD (Root Mean Squared Scaled Deviation), in contrast, represents the data's absolute deviation, scaled to the experimental data's standard error.
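Under the usual reading of Schunn and Wallach (2005), the two measures can be computed as in the sketch below; treating r² as the squared correlation between model predictions and observed block means is an assumption of this sketch.

```python
# Sketch of the two fit measures: r^2 as squared Pearson correlation between
# model predictions and data, RMSSD as the root mean squared deviation scaled
# by the standard error of each data point.
import numpy as np

def r_squared(model, data):
    return np.corrcoef(model, data)[0, 1] ** 2

def rmssd(model, data, standard_error):
    scaled = (np.asarray(model) - np.asarray(data)) / np.asarray(standard_error)
    return float(np.sqrt(np.mean(scaled ** 2)))

# Example with dummy block means: rmssd([2.0, 1.8], [2.5, 2.1], [0.2, 0.15]) -> about 2.26
```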
Experimental Design
The study had a two-factorial within-subjects design with repeated measurements. The first independent variable was the experimental block, with eight levels; this variable represents the participants' state of practice. The second independent variable was object familiarity (two levels: familiar object and unfamiliar objects). As dependent variable, two rotation rates were calculated per block and subject (one for each class of objects).
Sample
27 subjects (18 female, 9 male) participated in the study. The participants' age ranged from 20 to 29 years (m = 26.1). Two persons received course credit; the others were paid €10 for participation.
Procedure
After receiving instructions, the participants were required to complete eight experimental blocks, each including 48 trials. Of these 48 trials, 16 displayed the familiar object. Half the trials were same-trials, the other half were different-trials.
Results
The experiment was repeated for two subjects because the number of correct same-trials was too low to calculate valid rotation rates in numerous blocks. Also, the last two experimental blocks were excluded from data analysis because fatigue interfered with the training effects. Generally, the expected training and object familiarity effects occurred, as reported in Joeres and Russwinkel (accepted).
The effects that were found in the experiment (Ex) and predicted by the model (M) are displayed in Fig. 1. It can be seen that the predicted rotation rates are considerably lower than the experimentally found ones. A possible explanation for this disparity can be found in the abovementioned reactivation process that includes re-encoding of the stimuli. The model, however, does not claim to address stimulus encoding validly. Therefore, duration differences in this process can cause the data deviation. Nevertheless, the trends, i.e. the shape of the learning curves, are validly predicted by the model. This is the case for the familiar object and for the unfamiliar objects, respectively. This first impression is confirmed by the goodness-of-fit measures listed in Table 1. Although no gold standard exists for these measures, it can be said that the absolute value deviation is rather high, with a mean RMSSD = 4.53. The data trends, however, were matched rather well, as indicated by the high r² values (Fig. 1).

Table 1 Goodness-of-fit measures
Condition            RMSSD   r²
Familiar object      5.03    .74
Unfamiliar objects   4.02    .80
Mean                 4.53    .77

Discussion
The presented study showed that the model can validly replicate certain training effects in mental rotation. It can therefore be seen as a promising approach for modeling mental rotation and, with further research, mental imagery.
As briefly discussed, the model assumptions are partially based on eye movement data. Therefore, further model validation data should be provided in a follow-up study in which eye movement during a mental rotation task is predicted by the model and evaluated experimentally.

Fig. 3 Experimental (Ex) and model (M) data
If that study is successful, the model can be extended to further types of stimuli and to more complex, application-oriented tasks including mental imagery.

References
Anderson JR, Bothell D, Byrne MD, Douglass S, Lebiere C, Qin Y (2004) An integrated theory of the mind. Psychol Rev 111(4):1036–1060. doi:10.1037/0033-295X.111.4.1036
Bethell-Fox CE, Shepard RN (1988) Mental rotation: effects of stimulus complexity and familiarity. J Exp Psychol Human Percept Performance 14(1):12–23
Cowan N (1999) An embedded-process model of working memory. In: Miyake A, Shah P (eds) Models of working memory. Mechanisms of active maintenance and executive control. Cambridge University Press, Cambridge, pp 62–101
Gunzelmann G, Lyon DR (2007) Mechanisms for human spatial competence. In: Barkowsky T, Knauff M, Ligozat G, Montello DR (eds) Spatial cognition V: reasoning, action, interaction, pp 288–308
Joeres F, Russwinkel N (accepted) Object-related learning effects in mental rotation. In: Proceedings of Spatial Cognition 2014, Bremen
Just MA, Carpenter PA (1976) Eye fixations and cognitive processes. Cogn Psychol 8(4):441–480. doi:10.1016/0010-0285(76)90015-3
Kosslyn SM (1996) Image and brain: the resolution of the imagery debate, 1st edn. A Bradford Book. MIT Press, Cambridge
Linn MC, Petersen AC (1985) Emergence and characterization of sex differences in spatial ability: a meta-analysis. Child Dev 1479–1498
Peters M, Battista C (2008) Applications of mental rotation figures of the Shepard and Metzler type and description of a mental rotation stimulus library. Brain and Cogn 66(3):260–264. doi:10.1016/j.bandc.2007.09.003
Schunn CD, Wallach D (2005) Evaluating goodness-of-fit in comparison of models to data. In: Psychologie der Kognition: Reden und Vorträge anlässlich der Emeritierung von Werner Tack
Shepard RN, Metzler J (1971) Mental rotation of three-dimensional objects. Science 171:701–703
Shepard S, Metzler D (1988) Mental rotation: effects of dimensionality of objects and type of task. J Exp Psychol Human Percept Performance 14(1):3–11
Tarr MJ, Pinker S (1989) Mental rotation and orientation-dependence in shape recognition. Cogn Psychol 21(2):233–282. doi:10.1016/0010-0285(89)90009-1
Yuille JC, Steiger JH (1982) Nonholistic processing in mental rotation: some suggestive evidence. Percept Psychophys 31(3):201–209

Processing linguistic rhythm in natural stories: an fMRI study

Katerina Kandylaki 1, Karen Bohn 1, Arne Nagels 1, Tilo Kircher 1, Ulrike Domahs 2, Richard Wiese 1
1 Philipps-Universität Marburg, Germany; 2 Universität zu Köln, Germany
Keywords
Rhythm Rule, Speech comprehension, Rhythmic irregularities, fMRI
Abstract
Language rhythm is assumed to involve an alternation of strong and weak beats within a certain linguistic domain, although the beats are not necessarily isochronously distributed in natural language. However, in certain contexts, as for example in compound words, rhythmically induced stress shifts occur in order to comply with the so-called Rhythm Rule (Liberman, Prince 1977). This rule operates when two stressed adjacent syllables create a stress clash or adjacent unstressed syllables (stress lapse) occur. Experimental studies on speech production, judgment of stress perception, and event-related potentials (ERPs) (Bohn, Knaus, Wiese, Domahs 2013) have found differences in production, ratings, and ERP components, respectively, between well-formed structures and rhythmic deviations. The present study builds on these findings by using functional magnetic resonance imaging (fMRI) in order to localize rhythmic processing (within the concept of the Rhythm Rule) in the brain. Other fMRI studies on linguistic stress found effects in the supplementary motor area, insula, precuneus, superior temporal gyrus, parahippocampal gyrus, calcarine gyrus and inferior frontal gyrus (Domahs, Klein, Huber, Domahs 2013; Geiser, Zaehle, Jäncke, Meyer 2008; Rothermich, Kotz 2013). However, what other studies have not yet investigated is rhythm processing in natural contexts, that is, in the course of a story which is not further controlled for a metrically isochronous speech rhythm. Here we examine the hypotheses that (a) well-formed structures are processed differently than rhythmic deviations in compound words for German, and (b) this happens in speech processing of stories in the absence of a phonologically related task (implicit rhythm processing).
Our compounds consisted of three parts (A(BC)) that build a premodifier-noun combination. The modifier was either a monosyllabic noun (Holz, 'wood') or a bisyllabic noun (Plastik, 'plastic') with lexical stress on the initial syllable. The premodifier was followed by a disyllabic noun bearing compound stress on the initial syllable in isolation (Spielzeug, 'toy'). When these two word structures are combined, the premodifier bears overall compound stress and the initial stress of the disyllabic noun should be shifted rightwards to its final syllable, in order to be in accordance with the Rhythm Rule: Holz-spiel-zeug ('wooden toy(s)'). On the other hand, if the disyllabic noun is combined with a preceding disyllabic noun bearing initial stress, a shift is unnecessary, allowing for the stress pattern Plas-tik-spiel-zeug ('plastic toy(s)'). The first condition we call SHIFT and the second NO SHIFT. In contrast to these well-formed conditions we induce rhythmically ill-formed conditions: CLASH for the case that Holz-spiel-zeug keeps the initial stress of its compounds, and LAPSE when we introduce the unnecessary shift in Plas-tik-spiel-zeug. We constructed 20 word pairs following the same stress patterns as Holz-/Plastikspielzeug and embedded them in 20 two-minute-long stories. Our focus when embedding the conditions was the naturalness of the stories. For example, the word pair Holzspielzeug vs. Plastikspielzeug would appear in the following contexts: "The clown made funny grimaces, reached into his red cloth bag and threw a small wooden toy to the lady in the front row." vs. "The toys, garden chairs and pillows remained however outside. The mother wanted to tidy up the plastic toys from the garden after dinner."
We obtained images (3T) of 20 healthy right-handed German monolinguals (9 male) employing a 2x2 design: well-formedness (rhythmically well-formed vs. ill-formed) x rhythm-trigger (monosyllabic vs. disyllabic premodifier). Subjects were instructed to listen carefully and were asked two comprehension questions after each story. On the group level we analyzed the data in the 2x2 design mentioned above. Our critical events were the whole compound words. We report clusters of p < .005 and volumes of at least 72 voxels (Monte Carlo corrected).
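The way the four condition labels arise from crossing the two factors can be written out as a small lookup table; the dictionary and its keys are illustrative, and only the factor levels and the labels SHIFT, NO SHIFT, CLASH and LAPSE come from the design described above.

```python
# The 2x2 factorial structure of the stimuli as a lookup table.
CONDITIONS = {
    # (rhythm-trigger, well-formedness): condition label
    ("monosyllabic premodifier", "well-formed"): "SHIFT",    # Holz-spiel-zeug, stress shifted
    ("disyllabic premodifier", "well-formed"): "NO SHIFT",   # Plas-tik-spiel-zeug, no shift needed
    ("monosyllabic premodifier", "ill-formed"): "CLASH",     # initial stress kept -> stress clash
    ("disyllabic premodifier", "ill-formed"): "LAPSE",       # unnecessary shift -> stress lapse
}

print(CONDITIONS[("disyllabic premodifier", "ill-formed")])  # LAPSE
```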
For the main effect of well-formedness we found effects in the left cuneus, precuneus and calcarine gyrus. For the main effect of rhythm-trigger we found no significant differences at this supra-threshold level, which was expected, since we did not hypothesize an effect of the length of the premodifier. Our main finding is the interaction of well-formedness and rhythm-trigger in the precentral gyrus bilaterally and in the right supplementary motor area (SMA). Since the interaction was significant, we calculated theoretically motivated pairwise contrasts within one rhythm-trigger level. For the monosyllabic premodifier, CLASH vs. SHIFT revealed no significant clusters, but, interestingly, the opposite contrast (SHIFT vs. CLASH) showed differences in the right superior frontal gyrus, right inferior frontal gyrus
(rIFG, BA 45), right lingual and calcarine gyrus, bilateral precentral gyrus (BA 6, BA 4), and left precentral gyrus (BA 3a). For the bisyllabic premodifier, LAPSE vs. NO SHIFT significantly activated the left inferior temporal gyrus, left parahippocampal gyrus, left insula, bilateral superior temporal gyrus (STG), and right pre- and postcentral gyrus. NO SHIFT vs. LAPSE significantly activated the right lingual gyrus and the calcarine gyrus bilaterally. We finally compared the two rhythmically ill-formed structures, LAPSE vs. CLASH, and found significant activation in the right supplementary motor area and premotor cortex.
Our findings are in line with previous fMRI findings on rhythmic processing. Firstly, the superior temporal gyrus is robustly involved in rhythmic processing irrespective of the task of the study: a semantic and a metric task (Rothermich, Kotz 2013), speech perception of violated vs. correctly stressed words (Domahs, Klein, Huber, Domahs 2013), and explicit and implicit isochronous speech rhythm tasks (Geiser, Zaehle, Jäncke, Meyer 2008). To these we can add our careful-listening task, which is comparable to the semantic task of Rothermich and Kotz (2013). Our contribution is that we found activations for the implicit task of careful listening which had previously only been found for explicit tasks: these include the left insula, the bilateral precentral gyrus, the precuneus and the parahippocampal gyrus. Lastly, the activation in the supplementary motor areas completes the picture of rhythm processing regions in the brain. This finding is of special interest since it was strong for the comparison within the rhythmically ill-formed conditions, LAPSE vs. CLASH. This might be due to the fact that stress lapse structures contain two violations, i.e. a deviation from word stress which is not rhythmically licensed, while the clash structures contain only a rhythmic deviation but keep the original word stress.
The differences in activations found for well-formedness show that even in implicit rhythmic processing the language parser is sensitive to subtle deviations in the alternation of strong and weak beats. This is particularly evident in the STG activation associated with the processing of linguistic prosody, the SMA activation which has been suggested to be involved in temporal aspects of the processing of sequences of strong and weak syllables, and the IFG activation associated with tasks requiring more demanding processing of suprasegmental cues.

References
Bohn K, Knaus J, Wiese R, Domahs U (2013) The influence of rhythmic (ir)regularities on speech processing: evidence from an ERP study on German phrases. Neuropsychologia 51(4):760–771
Domahs U, Klein E, Huber W, Domahs F (2013) Good, bad and ugly word stress: fMRI evidence for foot structure driven processing of prosodic violations. Brain Lang 125(3):272–282
Geiser E, Zaehle T, Jäncke L, Meyer M (2008) The neural correlate of speech rhythm as evidenced by metrical speech processing. J Cogn Neurosci 20(3):541–552
Liberman M, Prince A (1977) On stress and linguistic rhythm. Linguistic Inquiry 249–336
Rothermich K, Kotz SA (2013) Predictions in speech comprehension: fMRI evidence on the meter-semantic interface. Neuroimage 70:89–100

Numbers affect the processing of verbs denoting movements in vertical space

Martin Lachmair 1, Carolin Dudschig 2, Susana Ruiz Fernandez 1, Barbara Kaup 2
1 Leibniz Knowledge Media Research Center (KMRC), 2 Psychology, University of Tübingen, Germany
Recent studies have shown that nouns referring to objects that typically appear in the upper or lower visual field (e.g., roof vs. root) or verbs referring to movements in vertical space (e.g., rise vs. fall) facilitate upwards or downwards oriented sensorimotor processes, depending on the meaning of the word that is being processed (Lachmair, Dudschig, De Filippis, de la Vega and Kaup 2011; Dudschig, Lachmair, de la Vega, De Filippis and Kaup 2012). This finding presumably reflects an association of words with experiential traces in the brain that stem from the reader's interactions with the respective objects and events in the past. When the words are later processed in isolation, the respective experiential traces become reactivated, providing the possibility of interactions between language processing and the modal systems (cf. Zwaan and Madden 2005). Such interactions are also known from other cognitive domains, such as for instance number processing (Fischer, Castel, Dodd and Pratt 2003). Here, high numbers facilitate sensorimotor processes in upper vertical space and low numbers in lower vertical space (Schwarz and Keus 2004). The question arises whether the observed spatial-association effects in the two domains are related. A recent study conducted in our lab investigated this question. The reasoning was as follows: if number processing activates spatial dimensions that are also relevant for understanding words, then we can expect that processing numbers may influence subsequent lexical access to words. Specifically, if high numbers relate to upper space, then they can be expected to facilitate understanding of an up-word such as "bird". The opposite should hold for low numbers, which should facilitate the understanding of a down-word such as "root". This is exactly what we found in an experiment in which participants saw one of four digits (1, 2, 8, 9) prior to the processing of up- and down-nouns in a lexical decision task (Lachmair, Dudschig, de la Vega and Kaup 2014). In the present study we aimed at extending these findings by investigating whether priming effects can be observed for the processing of verbs referring to movements in the vertical dimension (e.g., rise vs. fall).
Method
Participants (N = 34) performed a lexical decision task with 40 verbs denoting an up- or downwards oriented movement (e.g., rise vs. fall) and 40 pseudowords. Verbs were controlled for frequency, length and denoted movement direction. The words were preceded by a number, one of the set {1, 2, 8, 9}. Correctly responding to the verbs required a key press on the left in half of the trials and on the right in the other half. The order of the response mapping was balanced across participants. Each trial started with a centered fixation cross (500 ms), followed by a number (300 ms). Afterwards the verb/pseudoword stimulus appeared immediately and stayed on screen until response. Response times (RTs) were measured as the time from stimulus onset to the key press response. Each stimulus was presented eight times, resulting in a total of 640 experimental trials (320 verb trials + 320 pseudoword trials), subdivided into 8 blocks separated by a self-paced break with error information. Each experimental half started with a short practice block. To ensure the processing of the digits, the participants were informed beforehand that they should report the numbers they had seen in a short questionnaire at the end of the experiment. The design of the experiment was a 2 (number magnitude: low vs. high) x 2 (verb direction: up vs. down) x 2 (response mapping) design with repeated measurements on all variables.
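The trial structure just described can be sketched as follows; only the counts (40 verbs, 40 pseudowords, digits 1, 2, 8, 9, eight presentations per stimulus, 640 trials in 8 blocks) and the timing constants come from the Method, while the variable names, the digit assignment and the shuffling are assumptions of this sketch.

```python
# Illustrative sketch of the trial list and timing of the lexical decision task.
import itertools
import random

FIXATION_MS, DIGIT_MS = 500, 300
DIGITS = [1, 2, 8, 9]

verbs = [f"verb_{i}" for i in range(40)]          # placeholders for the real stimuli
pseudowords = [f"pseudo_{i}" for i in range(40)]
verb_set = set(verbs)

# Eight presentations per stimulus; pairing each stimulus twice with each of the
# four digits is one possible assignment consistent with the reported totals.
trials = [{"digit": d, "word": w, "is_verb": w in verb_set}
          for w, d in itertools.product(verbs + pseudowords, DIGITS * 2)]
random.shuffle(trials)
blocks = [trials[i * 80:(i + 1) * 80] for i in range(8)]
print(len(trials), len(blocks))                   # 640 8
```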
Results
The data of six participants were excluded due to a high number of errors (>10 %) in all conditions. Responses to pseudowords, responses faster than 200 ms, and errors were excluded from further analyses. We found no main effect of number magnitude (Fs < 1.8), no effect of response mapping (Fs < 1), but a main effect of verb direction, with faster responses for down- compared to up-verbs (F1(1,26) = 5.61, p < .05; F2 < 1; 654 ms vs. 663 ms). Interestingly, we also found a significant interaction of number magnitude and verb direction, F1(1,26) = 5.23, p < .05; F2(1,38) = 3.46, p = .07, with slower responses in congruent compared to incongruent trials (up verbs: 668 ms vs. 658 ms; down verbs: 654 ms vs. 654 ms). To obtain
more information with regard to whether this effect depends on how deeply the number primes were being processed, we conducted post hoc analyses. Participants were subdivided into two groups, with Group 1 including all participants who had correctly reported the numbers at the end of the experiment (N = 14), and Group 2 including the remaining participants. Group 1 again showed an interaction between number magnitude and verb direction (F1(1,26) = 9.47, p < .01; F2(1,38) = 3.38, p = .07), whereas Group 2 did not (Fs < 1). Mean RTs of both groups are displayed in Fig. 1.

Fig. 1 Mean RT of correct responses as a function of verb direction (up vs. down) and number magnitude (high vs. low). Participants in Group 1 correctly recalled the number primes at the end of the experiment, participants in Group 2 did not. Error bars represent the 95 % confidence interval for within-subject designs (Masson and Loftus 2003)

Discussion
The present findings show that reading verbs denoting an up- or downwards oriented movement is affected by the preceding processing of high and low numbers. As such, the presented findings provide evidence for the view that spatial associations observed in number and word processing may share a common basis (Barsalou 2008). Interestingly, in contrast to Lachmair et al. (2014), the results show interference instead of facilitation in spatially congruent conditions. Possibly, these deviating findings reflect the fact that verbs referring to movements in vertical space (such as rise or fall) are rather complex and implicitly refer to two spatial locations, namely the starting point and the end point of the movement. Maybe participants dynamically simulated the described movements beginning with the starting point. Considering that verbs are assumed to trigger complex comprehension processes (see Vigliocco, Vinson, Druks, Barber and Cappa 2011), it seems plausible to assume that our experimental task may have tapped into early rather than late simulation processes. This in turn may explain why interference rather than facilitation was observed in the present experiments.
One could of course argue that this explanation is not very convincing, considering that the study by Dudschig et al. (2012) also presented participants with verbs referring to upwards or downwards directed movements (as in the current study) and nevertheless observed facilitation in spatially congruent conditions, not interference. However, we think that differences concerning temporal aspects of the experimental task may explain the different results. The study by Dudschig et al. (2012) investigated the speed with which upwards or downwards directed movements could be initiated by the participants after having processed the motion verbs, whereas the current study investigated the effects of number primes that were presented prior to the processing of the motion verbs. Thus, it seems well possible that the task in Dudschig et al. tapped into later simulation processes than the task in the current study.
Of course, future studies are needed to directly investigate this post hoc explanation of our results. One possibility would be to change the temporal aspects of the experimental task in the current paradigm such that participants spend more time processing the verbs, giving later simulation processes a chance to occur. Another possibility would be to present participants with verbs in the past perfect denoting a movement that has already taken place in the past (e.g., gestiegen [had risen]). Maybe participants then focus more on the end point of the denoted movement, leading to facilitation effects in spatially congruent conditions.
One further aspect of the results obtained in the present study calls for discussion. The interaction effect between number and word processing was only observed for those participants who could correctly recall the number primes at the end of the experiment. One possible explanation is that the number primes need to be processed at a certain level of processing in order for them to affect the subsequent processing of direction-associated words. This would suggest that the interaction effect between number and word processing is not fully automatic. Another possibility is that those participants who did not recall the number primes correctly at the end of the experiment simply did not adequately follow the instructions and strategically ignored the number primes because they were of no relevance to the experimental task. If so, it would be of no surprise that these participants did not experience any interference from the number primes, and no further conclusions could be drawn. One interesting manipulation for follow-up studies to the current experiment would be to present the number primes for a very short duration and/or framed by a visual mask (see Dudschig et al. 2014). An interaction effect between number and word processing under these conditions would provide strong evidence for the view that spatial associations in number and word processing indeed share a common basis, independent of any strategic behavior of inter-relating the two domains.

Acknowledgments
We thank Elena-Alexandra Plaetzer for her assistance in data collection. This work was supported by a grant from the German Research Foundation (SFB 833/B4 [Kaup/Leuthold]).

References
Barsalou LW (2008) Grounded cognition. Ann Rev Psychol 59:617–645
Dudschig C, de la Vega I, De Filippis M, Kaup B (2014) Language and vertical space: on the automaticity of language-action interconnections. Cortex 58:151–160
Dudschig C, Lachmair M, de la Vega I, De Filippis M, Kaup B (2012) Do task-irrelevant direction-associated motion verbs affect action planning? Evidence from a Stroop paradigm. Mem Cogn 40(7):1081–1094
Fischer MH, Castel AD, Dodd MD, Pratt J (2003) Perceiving numbers causes spatial shifts of attention. Nat Neurosci 6(6):555–556
Lachmair M, Dudschig C, de la Vega I, Kaup B (2014) Relating numeric cognition and language processing: do numbers and words share a common representational platform? Acta Psychol 148:107–114
Lachmair M, Dudschig C, De Filippis M, de la Vega I, Kaup B (2011) Root versus roof: automatic activation of location information during word processing. Psychon B Rev 18:1180–1188
Masson MEJ, Loftus GR (2003) Using confidence intervals for graphically based data interpretation. Can J Exp Psychol 57:203–220
Schwarz W, Keus IM (2004) Moving the eyes along the mental number line: comparing SNARC effects with saccadic and manual responses. Percept Psychophys 66:651–664
Vigliocco G, Vinson DP, Druks J, Barber H, Cappa SF (2011) Nouns and verbs in the brain: a review of behavioural, electrophysiological, neuropsychological and imaging studies. Neurosci Biobehav Rev 35(3):407–426
Zwaan RA, Madden CJ (2005) Embodied sentence comprehension. In: Pecher D, Zwaan RA (eds) Grounding cognition: the role of perception and action in memory, language, and thinking. Cambridge University Press, Cambridge, pp 224–245

Is joint action necessarily based on shared intentions?

Nicolas Lindner, Gottfried Vosgerau
Department of Philosophy, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
Abstract
Is joint action necessarily based on shared intentions? Regarding joint action, the majority of researchers in the field assume that underlying collective or joint intentions are the glue that holds the respective actions of the participants together (Searle 1990; Bratman 1993; Tuomela 1988). A major part of the debate thus focuses on the nature of these particular intentions. In this talk, we will describe one major account and argue that it cannot explain joint action as displayed by small children. Based on this critique, we will formulate an alternative view, which suggests that some non-demanding cases of (seemingly) joint action (including those displayed by small children) are rather effects of the lack of representing one's own intentions as one's own (the intention is just represented as an intention that is there). This account has the advantage of offering a way to specify the pivotal role that joint action is supposed to play in the acquisition of socio-cognitive abilities.
A prominent approach to joint intentions by Michael Bratman (1993, 2009) construes shared intentions, as he calls them, as being derived from singular intentions, a conception of which he developed in his book Intention, Plans, and Practical Reason from 1987. In a nutshell, Bratman characterizes intentions in this book as conduct-controlling pro-attitudes, a term by Davidson (1980) describing an agent's mental attitude directed toward an action under a certain description. For Bratman, intentions are typically parts of larger plans concerning future actions. He regards these plans as mental states, which often are only partial and involve a hierarchy of general and more specific intentions.
Bratman's account of shared intention (1993) relies on his conception of individual intentions, the attitudes of the participants in joint action, and their interrelations, and is thus constructivist in nature. In his account, a shared intention does not consist in a particular type of intention. He proposes a complex of intentions and attitudes that, if they have the appropriate content and function properly, does the job of a shared intention and can be identified with it. This complex is supposed to do three interrelated jobs: it should (1) coordinate the agents' intentional actions in such a way that the joint goal can be achieved by acting together, (2) help in coordinating the relevant planning of the participants, and (3) provide a framework that helps to structure relevant bargaining. According to Bratman, fulfilling this three-fold function is a necessary condition for any kind of shared intention.
With regard to a complex that does all of these jobs, Bratman suggests three sufficient conditions to describe a substantial account of shared intention. (1) There should be an individual intention of each participant of the joint action of the form "I intend that we J". (2) These individual intentions should be held in part because of and in accordance with the relevant intentions of the other partakers in the joint action. (3) The two aforementioned conditions have to be common knowledge between the participants.
It is Bratman's main argument that the described complex of interrelated intentions and attitudes functions together as one characteristic form of shared intention (2009). Due to the constructivist and functionalist nature of his approach, it may yet not be the only kind of shared intention. The author himself admits the possibility that there may be other kinds and that shared intention may thus be multiply realizable.
Bratman's conception of shared intention seems to be a convincing characterization of how cognitively mature agents act together. In contrast to this, some researchers doubt whether his approach is suited to account for joint action in young children. This issue is closely related to the developmental onset of socio-cognitive abilities. The common knowledge condition of Bratman's substantial account presupposes that the system of intentions in question is in the public domain. Furthermore, there has to be mutual knowledge of the others' intentions plus knowledge of the others' knowledge. The cognitive basis for common knowledge thus rests on a variety of capacities. The agents in joint action ought to have: (a) the ability to form beliefs and higher-order beliefs (beliefs about beliefs), (b) the ability to attribute mental states to themselves and others, and (c) the capacities needed for recursive mindreading. All in all, they must thus have a robust theory of mind. With respect to this, critics of Bratman's account state that he characterizes shared intention in a way that is too complex to accommodate joint action of young children (Tollefsen 2005; Pacherie 2011; Butterfill 2012).
Tollefsen's (2005) critique is based on evidence suggesting that young children lack a robust theory of mind, particularly a proper understanding of others' beliefs. This evidence comes from different false-belief tasks (Wellman et al. 2001). Without such a proper understanding of other agents' beliefs, so Tollefsen argues, the common knowledge condition in Bratman's conception could not be fulfilled. Hence, children could not take part in shared intentional activities of such a sort. Similarly, Pacherie (2011) claims that Bratman's shared intention requires cognitively sophisticated agents who have both concepts of mental states like intentions and attitudes and the ability to represent the mental states of others. From her point of view, small children lack such fully developed mentalizing and meta-representational capacities. Therefore, shared intention cannot account for joint action in young children. A problem for Bratman's account thus stems from the fact that there is evidence of children engaging in joint activities before they develop the putatively necessary socio-cognitive abilities. Findings from developmental psychology (Brownell 2011) suggest that children engage in different forms of joint action together with adults from around 18 months of age and, from the end of the 2nd year of life, also with peers.
We will show that said criticisms rest on rather shaky grounds. First, they both attack Bratman's substantial account, which only presents sufficient conditions for the presence of a shared intention. Thus, there might be other constructions in a Bratmanian sense that avoid these flaws. Furthermore, both critiques rely on controversial empirical claims about the onset of children's mindreading capacities, for example with respect to the starting point of false-belief understanding in children (Baillargeon et al. 2010; De Bruin and Newen 2012) and the development of an early understanding of common knowledge (Carpenter 2009). Thus, the critiques by Pacherie and Tollefsen do not present convincing arguments against Bratman's account per se. Still, they highlight an important issue by questioning the cognitive standards imposed on participating agents in joint action.
Butterfill (2012) takes a different route in criticizing Bratman's approach. His objection focuses on the necessary conditions for shared intention: the functional roles that shared intention is supposed to play. Butterfill claims that the coordinating and structuring of relevant bargaining, which shared intention is supposed to ensure, sometimes requires monitoring or manipulating other agents' intentions. With regard to accounts that stress the importance of joint action for cognitive and socio-cognitive development
in infants (Tomasello et al. 2005; Moll and Tomasello 2007), joint action would thus presuppose psychological concepts and capacities whose development it should explain in the first place. The contribution of joint activities to the development of our cognitive capacities is the core argument of Tomasello and colleagues' hypothesis on shared intentionality. As long as one stresses its role in cognitive and socio-cognitive development, Butterfill claims, early joint action of children can hence not involve shared intention in Bratman's sense.
Bratman's conception can thus not account for children's joint actions, at least if it is supposed to explain the development of their understanding of minds. Yet his approach is suited to explain joint action of adults as mature cognitive agents. This is especially the case for those kinds of joint action that involve planning, future-directed intentions and deliberation.
We will conclude our talk by offering an alternative account of children's ability for joint action, which turns, in a way, the circularity upside down: if joint action is indeed pivotal for the development of socio-cognitive abilities, these abilities cannot already be developed in small children. Thus, joint action as displayed by small children has to be grounded in other abilities. Our proposal is that it is the lack of the concept of a mental state (especially intentions) that produces behavior which looks like joint action (we will not discuss whether the term should be applied to these cases or not). If a child has not yet learned that a mental state is something that belongs to single persons, it cannot be said to have acquired the concept of a mental state. However, the child might be, at the same time, able to introspect the content of its own intentions, such that the child's introspection can be paraphrased as "there is the intention to J". In other words, the child has not yet learned to distinguish between its own intentions and those of others. The effect of this lack of abilities will be behavior that looks like joint action (at least in cases in which the intention of the adult and the child match). Such behavior might be initiated by different triggers in the surrounding world that establish a common goal in the first place. Candidates for this could be pointing gestures, affordances and alignment between the agents.
This account does not only offer new perspectives for the explanation of autism (Frith 1989; Vosgerau 2009), it also offers a way to specify the thesis that (seemingly) joint action is pivotal to the acquisition of socio-cognitive abilities: joint action sets up an environment in which children are able to gradually learn that intentions can differ between individuals. The result of this learning phase will ultimately be the acquisition of the concept of a mental state, which includes that mental states belong to persons and that thus mental states can differ between individuals (this knowledge is then tested in the false-belief task). In other words, the learning of a theory of mind starts with acquiring the concept of a mental state, and this concept can be best acquired in (seemingly) joint action scenarios, in which children directly experience the effects of differing mental states (intentions and beliefs). Accordingly, empirical research has already suggested that the acquisition of mental state concepts is dependent on the use of mental state terms (Rakoczy et al. 2006), which are presumably most often used in joint action scenarios.
Some empirical results have been interpreted to show that very young children already possess the socio-cognitive abilities needed for cooperative activities and act on a rather sophisticated understanding of the mental states of self and other (Carpenter 2009). Following this line of argument, researchers propose that infants already understand others' knowledge and ignorance (Liszkowski et al. 2008), that they can act on a shared goal (Warneken et al. 2006; Warneken and Tomasello 2007), and that they exploit the common ground they share with an adult (Liebal et al. 2009; Moll et al. 2008). While appreciating the importance of this research as such, we will present alternative interpretations of these findings that are cognitively less demanding and thus consistent with our proposal.
Our alternative account is primarily designed to explain the behavior of small children. However, we point to the possibility that non-demanding cases of cooperation (e.g. buying an article in a grocery store) can be explained by similar mechanisms in adults. In such cases, adults would not explicitly represent their own intentions as their own intentions, thereby generating actions that are structurally similar to those of small children. Nevertheless, other more complex cases of joint action certainly also exist in adults. In the light of our proposal, we thus also conclude that Bratman's account of shared intention should not be abandoned altogether. Although a uniform account of joint action for both children and mature agents would have the benefit of being parsimonious, candidates for such a comprehensive explanation (Tollefsen and Dale 2012; Vesper et al. 2010; Gold and Sugden 2007) do not seem to have the resources to explain the development of qualitatively differing stages of joint action.

References
Baillargeon R, Scott RM, He Z (2010) False-belief understanding in infants. Trend Cogn Sci 14(3):110–118. doi:10.1016/j.tics.2009.12.006
Bratman M (1987) Intention, plans, and practical reason. CSLI Publications/Center for the Study of Language & Information
Bratman M (1993) Shared intention. Ethics 104(1):97–113
Bratman M (2009) Shared agency. In: Philosophy of the social sciences: philosophical theory and scientific practice. Cambridge University Press
Brownell CA (2011) Early developments in joint action. Rev Philos Psychol 2(2):193–211. doi:10.1007/s13164-011-0056-1
Butterfill S (2012) Joint action and development. Philos Quart 62(246):23–47. doi:10.1111/j.1467-9213.2011.00005.x
Carpenter M (2009) Just how joint is joint action in infancy? Top Cogn Sci 1(2):380–392. doi:10.1111/j.1756-8765.2009.01026.x
Davidson D (1980/2001) Essays on actions and events, 2nd edn. Oxford University Press, USA
De Bruin LC, Newen A (2012) An association account of false belief understanding. Cognition 123(2):240–259. doi:10.1016/j.cognition.2011.12.016
Frith U (1989/2003) Autism: explaining the enigma, 2nd edn. Blackwell Publ, Malden
Gold N, Sugden R (2007) Collective intentions and team agency. J Philos 104(3):109–137
Liebal K, Behne T, Carpenter M, Tomasello M (2009) Infants use shared experience to interpret pointing gestures. Dev Sci 12(2):264–271. doi:10.1111/j.1467-7687.2008.00758.x
Liszkowski U, Carpenter M, Tomasello M (2008) Twelve-month-olds communicate helpfully and appropriately for knowledgeable and ignorant partners. Cognition 108(3):732–739. doi:10.1016/j.cognition.2008.06.013
Moll H, Carpenter M, Tomasello M (2007) Fourteen-month-olds know what others experience only in joint engagement. Dev Sci 10(6):826–835. doi:10.1111/j.1467-7687.2007.00615.x
Moll H, Tomasello M (2007) Cooperation and human cognition: the Vygotskian intelligence hypothesis. Philos Trans R Soc B Biol Sci 362(1480):639–648. doi:10.1098/rstb.2006.2000
Pacherie E (2011) Framing joint action. Rev Philos Psychol 2(2):173–192
Rakoczy H, Tomasello M, Striano T (2006) The role of experience and discourse in children's developing understanding of pretend play actions. Br J Dev Psychol 24(2):305–335. doi:10.1348/026151005X36001
Searle J (1990) Collective intentions and actions. In: Cohen P, Morgan J, Pollack ME (eds) Intentions in communication. Bradford Books, MIT Press, Cambridge
Tollefsen D (2005) Let's pretend! Children and joint action. Philos Soc Sci 35(1):75–97. doi:10.1177/0048393104271925
Tollefsen D, Dale R (2012) Naturalizing joint action: a process-based approach. Philos Psychol 25(3):385–407. doi:10.1080/09515089.2011.579418
Tomasello M, Carpenter M, Call J, Behne T, Moll H (2005) Understanding and sharing intentions: the origins of cultural cognition. Behav Brain Sci 28(5):675–691
Tuomela R, Miller K (1988) We-intentions. Philos Stud 53(3):367–389
Vesper C, Butterfill S, Knoblich G, Sebanz N (2010) A minimal architecture for joint action. Neural Netw 23(8–9):998–1003. doi:10.1016/j.neunet.2010.06.002
Vosgerau G (2009) Die Stufentheorie des Selbstbewusstseins und ihre Implikationen für das Verständnis psychiatrischer Störungen. J für Philos Psychiatrie 2
Warneken F, Chen F, Tomasello M (2006) Cooperative activities in young children and chimpanzees. Child Dev 77(3):640–663. doi:10.1111/j.1467-8624.2006.00895.x
Warneken F, Tomasello M (2007) Helping and cooperation at 14 months of age. Infancy 11(3):271–294. doi:10.1111/j.1532-7078.2007.tb00227.x
Wellman HM, Cross D, Watson J (2001) Meta-analysis of theory-of-mind development: the truth about false belief. Child Dev 72(3):655–684. doi:10.1111/1467-8624.00304

A general model of the multi-level architecture of mental phenomena. Integrating the functional paradigm and the mechanistic model of explanation

Mike Ludmann
University of Duisburg-Essen, Germany
The central aim of this contribution is to provide a conceptual foundation of psychology in terms of the formulation of a general model of an architecture of mental phenomena. It will be shown that the mechanistic model of explanation (Bechtel, Richardson 1993; Machamer, Darden and Craver 2000; Bechtel 2007, 2008, 2009; Craver 2007) offers an appropriate founding approach for psychology as well as for its integration within the framework of the cognitive and brain sciences. Although the computational model of mind provides important models of mental properties and abilities, it fails to provide an adequate multi-level model of mental properties. The mechanistic approach, however, can be regarded as a conceptually coherent and scientifically plausible extension of the functional paradigm (see Polger 2004; Eronen 2010). While a functionalist conception of the mind mostly focuses on the mysterious relationship of mental properties, as abstract or second-order properties, to their physical realizers (if such issues are not generally excluded), the mechanistic approach allows us to establish a multi-level architecture of mental properties and their unambiguous localization in the overall scientific system.
The functionalist models of the mind are usually based on the computer metaphor of man that construes human beings as information processing systems. They postulate relatively abstract theoretical models of mental processes that allow generally very reliable predictions of the subsequent behavior of the system under consideration, given known input variables. The models provide a way to put some cognitive (functionalist) operators, such as thinking, decision making or planning, into the "black box" of behaviorism. Taking into account the current interdisciplinary research on the mind, the functionalist conception of mind, which defines these operators as abstract information processing that can be described independently of neuroscientific constraints, is a problem. If the question is raised how the connection between functional models and the mind is established, Marr (1982) proposes that the computational processes of his model of visual information processing (e.g., the generation of a three-dimensional depth structure) are specified by particular formal algorithms, which are physically implemented in the human brain. It is therefore recognized that functional processes also have a physical reality, but functional models fail to provide a framework for the exact circumstances, conditions, constraints, etc. of such implementation relations. Admittedly, the connectionist approach has fulfilled this task better by generating models of neural networks that are more likely to describe the actual processes in our minds (see Rumelhart, McClelland 1986; Smolensky 1988), but it ultimately does not offer a clear multi-level model of the mind either.
It is important to note that physical implementation as described by Marr is usually understood in terms of physical realization. Therefore, the causal profile of an abstract functional property (behavioral inputs and outputs) must be determined by a conceptual analysis in order to identify those physical (neural) structures that have exactly that causal profile (cf. Levine 1993; Kim 1998, 2005). Maybe the realization theory is intended to provide an explanatory approach of how abstract, functionally characterized properties as postulated by the cognitive sciences can be a part of the physical world. An abstract, theoretical phenomenon is realized (quasi materialized) in this sense through concrete physical conditions, while different physical systems can bring the computational or connectionist formalism into the world (see Fodor 1974). The ontological status of an abstract functional description or a second-order property remains highly questionable.
In contrast, much is gained if the functionalist approach is extended and partially adjusted by the mechanistic rendition of mental properties. A mechanism can be understood as a set of activities organized such that they exhibit the phenomenon to be explained (Craver 2007, p 5). The mechanistic approach individuates a phenomenon by the tasks or causal roles it plays for the system concerned. So if the mechanism behind a phenomenon is explored, one has explained the phenomenon itself. As Bechtel (2008) says, a mechanism is "a structure performing a function in virtue of its component parts, component operations, and their organization" (p 13). Figure 1 shows the general formal structure of mechanistic levels.

Fig. 1 Formal structure of a mechanism (from Craver 2007, p 189)

At the top of the mechanism is the explanatory phenomenon: S's ψ-ing. The suffix "-ing" and the course of the arrows are meant to express the process-related nature of mechanisms. The phenomenon ψ can be decomposed into subcomponents. Craver uses X as a term for the entities functioning as components of ψ and φ as a name for their activity patterns. While functionalism and the realization theory focus on the relationship of abstract information processing to certain processes in the brain, the
mechanistic approach extends this concern to the question of embedding a given (mental) phenomenon in a structural hierarchy of natural levels of organization characterized by the part-whole relationship.
If we take a cognitive property like spatial orientation or spatial memory, the question is not simply which brain structure realizes this property; rather, it has to be shown which causally relevant mechanisms are installed at the various levels of a mereologically construed mechanistic hierarchy (see Craver 2007). Thus the functional structure, as described by cognitive science, is undoubtedly an explanatorily essential description of this mental property. We can, for example, explain the behavior of a person in a given situation in terms of the components and predictions of working memory theory (Baddeley 1986). But the same mental event can be described at different levels of organization. In this way the mental event has a neuronal structure which, among other things, consists of hippocampal activity. In addition, the mental property has a molecular reality which is primarily characterized by NMDA receptor activation, and so on. So a mental phenomenon has a (potentially infinite) sequence of microstructures, none of which can be understood as the actual reality of the target property.
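The part-whole level structure described here can be made concrete with a small sketch. The nesting and the class names below are illustrative only; just the level labels (working memory components, hippocampal activity, NMDA receptor activation) come from the spatial-memory example in the text.

```python
# Toy rendering of a mereological mechanism hierarchy for the spatial-memory example.
from dataclasses import dataclass, field

@dataclass
class Mechanism:
    phenomenon: str                                  # what this level as a whole is doing
    components: list["Mechanism"] = field(default_factory=list)

    def levels(self, depth=0):
        """Yield (depth, phenomenon) pairs, top level first."""
        yield depth, self.phenomenon
        for part in self.components:
            yield from part.levels(depth + 1)

spatial_memory = Mechanism(
    "spatial memory (person remembering a route)",
    [Mechanism("working-memory components maintaining spatial information",
               [Mechanism("hippocampal activity",
                          [Mechanism("NMDA receptor activation")])])])

for depth, phenomenon in spatial_memory.levels():
    print("  " * depth + phenomenon)
```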
From the fact that the installed part-whole relation implies a spatio-temporal coextensivity of the different microstructures, so I will argue, it can be deduced that we have a mereologically based form of psychophysical identity. Nevertheless, this identity thesis does not have the crude reductionist implications of the classical philosophical thesis of psychophysical identity (see Place 1956; Smart 1959). Likewise, it can be shown that the dictum that only functionalism guarantees the autonomy of psychology (Fodor 1974), and that this autonomy is jeopardized by every conception of psychophysical identity, is fundamentally wrong. Quite the opposite is true. If we strictly follow Fodor, then psychological concepts and theories that have a small inter-theoretical fit, or a low degree of correspondence to physical processes, are to be preferred. Especially under these conditions, psychology risks falling prey to crudely reductionist programs such as "new wave" reductionism. This means that an inferior theory, which does not have good inter-theoretical fit, should be replaced by lower-level theories (Bickle 1998, 2003). Even worse, because of the rejection of any conception of psychophysical identity, psychologists would have to accept a microphysicalism entailing that micro levels have an ontological and explanatory priority. On the basis of the mechanistic approach (and its identity-theoretical interpretation), both the integrity of psychology and the inter-theoretical fit of its concepts and theories can be justified. Mental properties form a higher level in the natural organization of a (human) organism, but at the same time they form a mutually inseparable unit with its physical microstructures.
It is the mental properties that characterize diffuse nexuses of neuronal events as certain functional units in the first place. In this sense the mind is the structure-forming or shaping principle at all levels of the natural organization of the brain. Not despite but because of its coextensivity with diverse natural organizational levels, the mental is both a real and a causally potent phenomenon. Despite the fact that, with recourse to these micro levels of e.g. neurobiology, some characteristics of mental phenomena can be well explained, there is neither an ontological nor an explanatory primacy of the micro levels or their explanations. The adoption of such primacy is merely the product of a cognitive bias, a misguided interpretation of scientific explanations and of the process of scientific knowledge discovery (Wimsatt 1976, 1980, 2006).

References
Baddeley AD (1986) Working memory. Oxford University Press, Oxford
Bechtel W (2007) Reducing psychology while maintaining its autonomy via mechanistic explanation. In: Schouten M, Looren de Jong H (eds) The matter of the mind: philosophical essays on psychology, neuroscience and reduction. Blackwell, Oxford, pp 172–198
Bechtel W (2008) Mental mechanisms: philosophical perspectives on cognitive neuroscience. Psychology Press, New York
Bechtel W (2009) Looking down, around, and up: mechanistic explanation in psychology. Philos Psychol 22:543–564
Bechtel W, Richardson RC (1993) Discovering complexity: decomposition and localization as strategies in scientific research. MIT Press, Cambridge
Bickle J (1998) Psychoneural reduction: the new wave. MIT Press, Cambridge
Bickle J (2003) Philosophy and neuroscience: a ruthlessly reductive account. Kluwer, Dordrecht
Craver CF (2007) Explaining the brain: mechanisms and the mosaic unity of neuroscience. Clarendon Press, Oxford
Eronen MI (2010) Replacing functional reduction with mechanistic explanation. Philosophia Naturalis 47/48:125–153
Fodor JA (1974) Special sciences (or the disunity of science as a working hypothesis). Synthese 28:97–115
Kim J (1998) Mind in a physical world. MIT Press, Cambridge
Kim J (2005) Physicalism, or something near enough. Princeton University Press, Princeton
Levine J (1993) On leaving out what it's like. In: Davies M, Humphreys GW (eds) Consciousness: psychological and philosophical essays. Blackwell, Oxford, pp 121–136
Machamer P, Darden L, Craver CF (2000) Thinking about mechanisms. Philos Sci 67:1–25
Marr D (1982) Vision. Freeman and Company, New York
Place UT (1956) Is consciousness a brain process? Br J Psychol 47:44–50
Polger TW (2004) Natural minds. MIT Press, Cambridge
Rumelhart DE, McClelland JL (1986) Parallel distributed processing: explorations in the microstructure of cognition. MIT Press, Cambridge
Smart JJC (1959) Sensations and brain processes. Philos Rev 68:148–156
Smolensky P (1988) On the proper treatment of connectionism. Behav Brain Sci 11:1–23
Wimsatt WC (1976) Reductionism, levels of organization, and the mind-body problem. In: Globus G, Maxwell G, Savodnik I (eds) Consciousness and the brain. Plenum, New York, pp 205–267
Wimsatt WC (1980) Reductionistic research strategies and their biases in the units of selection controversy. In: Nickles T (ed) Scientific discovery: case studies. D. Reidel, Dordrecht, pp 213–259
Wimsatt WC (2006) Reductionism and its heuristics: making methodological reductionism honest. Synthese 151:445–475

A view-based account of spatial working and long-term memories: Model and predictions

Hanspeter A. Mallot, Wolfgang G. Rohrich, Gregor Hardiess
Cognitive Neuroscience, Dept. of Biology, University of Tübingen, Germany
Abstract
Space perception provides egocentric, oriented views of the environment from which working and long-term memories are constructed. Allocentric (i.e. position-independent) long-term memories may be organized as graphs of recognized places or views, but the interaction of such cognitive graphs with egocentric working memories is unclear. Here, we present a simple coherent model of view-based working and long-term memories, and review supporting evidence

Here, we present a simple coherent model of view-based working and long-term memories, and review supporting evidence from behavioral experiments. The model predicts (i) that within a given place, memories for some views may be more salient than others, (ii) that imagery of a target place should depend on the location where the recall takes place, and (iii) that recall favors views of the target place which would be obtained when approaching it from the current recall location.

Keywords
Spatial cognition, Working memory, Imagery, View-based representation, Spatial updating

Introduction
Sixteen years before his famous paper on the cognitive map, Edward C. Tolman gave an account of rat spatial learning in terms of what he called the "means-ends-field" (Tolman 1932, 1948), of which a key diagram is reproduced in Fig. 1. The arrows indicate means-ends-relations, i.e. expectations that a rat has learned about which objects can be reached from which other ones, and how. In modern terms, the "means objects" (MO in the figure) are intermediate goals or representational states that the rat is in or expects to get into. This graph approach to spatial memory was later elaborated by Kuipers (1978) and is closely related to the route vs. map distinction discussed by O'Keefe and Nadel (1978). Behavioral evidence for a graph-like organization of human spatial memory has been reviewed e.g. by Wang, Spelke (2002) or Mallot, Basten (2009).
The graph-based approach to cognitive mapping, powerful as it may appear, leaves open a number of important questions, two of which will be addressed in this paper. First, what is the nature of the nodes of the graph? In Tolman's account, the means objects are intervening objects, the passage along each object being a means to reach the next one or the eventual goal. Kuipers (1978) thinks of the nodes as places defined by a set of sensory stimuli prevailing at each place. The resulting idea of a place-graph can be relaxed to a view-graph in which each node represents an observer pose (position plus orientation), again characterized by sensory input; the nodes are now egocentric, oriented views (Schölkopf, Mallot 1995).
The second question concerns the working memory stage needed, among other things, as an interface between the cognitive graph and perception and behavior, particularly in the processes of planning routes from long-term memory and of encoding new spatial information into long-term memory. For such working memory structures, local, metric maps are generally assumed, representing objects and landmarks at certain egocentric locations (Byrne et al. 2007, Tatler, Land 2011, Loomis et al. 2013). While these models offer plausible explanations for many effects in spatial behavior, they are hard to reconcile with a view-based rather than object-based organization of long-term memory, which will have to interact with the working memory. As a consequence, computationally costly transformations between non-egocentric long-term memories and egocentric working memories are often assumed.

Fig. 1 Tolman's notion of spatial long-term memory as a means-ends-field (from Tolman 1932). This seems to be the first account of the cognitive map as a graph of states (objects) and actions (means-ends-relations) in which alternative routes can be found by graph search

In this paper, we give a consistently view-based account of spatial working and long-term memories and discuss a recent experiment supporting the model.

View-based spatial memory
Places
By the term "view", we denote an image of an environment taken at a view-point x and oriented in a direction u. Both x and u may be specified with respect to a reference frame external to the observer, but this is not of great relevance for our argument. Rather, we assume that each view is stored in relation to other views taken at the same place x but with various viewing directions u. The views of one place combine to a graph with a simple ring topology where views taken with neighboring viewing directions are connected by a graph link (see Fig. 2a). This model of a place representation differs from the well-known snapshot model from insect navigation (Cartwright, Collett 1982; for the role of snapshots in human navigation, see Gillner et al. 2008) by replacing the equally sampled, panoramic snapshot with a set of views that may sample different viewing directions with different numbers of views. It is thus similar to view-based models of object recognition, where views may also be sampled inhomogeneously over the sides or aspects of an object (Bülthoff, Edelman 1992). As in object recognition, places may therefore have canonical views from which they are most easily recognized.
Long-term memory
The graph approach to spatial long-term memory has been extended from place-graphs to graphs of oriented views by Schölkopf, Mallot (1995). As compared to the simple rings sufficient to model place memory, we now also allow for view-to-view links representing movements with translatory components such as "turn left and move ahead" or "walk upstairs". The result is a graph of views with links labeled by egocentric movements. Schölkopf, Mallot (1995) provide a formal proof that this view-graph contains the same information as a graph of places with geocentric movement labels.

Fig. 2 Overview of view-based spatial memory. a Memory for places is organized as a collection of views obtained from a place, arranged in a circular graph. View multiplicity models salience of view orientation. b Spatial long-term memory organized as a graph of views and movements leading to the transition from one view to another. c View-based spatial working memory consisting of a subgraph of the complete view-graph, centered at the current view and including an outward neighborhood of the current view. For further explanation see text. (Tübingen Holzmarkt icons are sections of a panoramic image retrieved by permission from www.kubische-panoramen.de. Map source: Stadtgrundkarte der Universitätsstadt Tübingen, Stand: 17.3.2014.)
A sketch of the view-graph for an extended area is given in Fig. 2b. The place transitions are shown as directed links, whereas the turns within a place work either way. In principle, the view-graph works without metric information or a global, geocentric reference frame, but combinations with such data types are possible.
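To make the representation concrete, the following is a minimal Python sketch of the place and view-graph structures described above (a ring of views per place, plus movement-labeled links between views). The class and method names are our own illustration and are not taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """An egocentric, oriented view: taken at place x with viewing direction u."""
    place_id: str        # the place x at which the view was taken
    heading_deg: float   # viewing direction u, in degrees


class ViewGraph:
    """Long-term memory sketch: views are nodes, movements are labeled links."""
    def __init__(self):
        self.edges = {}  # View -> list of (movement_label, View)

    def _link(self, v_from, label, v_to):
        self.edges.setdefault(v_from, []).append((label, v_to))

    def add_turn(self, a, b):
        # turns within a place connect neighboring viewing directions (both ways)
        self._link(a, "turn", b)
        self._link(b, "turn", a)

    def add_transition(self, a, movement, b):
        # translatory movements ("move ahead", "walk upstairs") are directed
        self._link(a, movement, b)


def add_place(graph, place_id, headings_deg):
    """Add one place as a ring of views over the sampled headings (cf. Fig. 2a)."""
    views = [View(place_id, h) for h in headings_deg]
    for a, b in zip(views, views[1:] + views[:1]):
        graph.add_turn(a, b)
    return views
```

Sampling more headings around a salient direction (e.g. adding views at 80, 90 and 100 degrees while other directions get only one view) is one way to express the view multiplicity and canonical-view idea of Fig. 2a in this sketch.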
Working memory
Spatial working memory tasks may or may not involve interaction with long-term memory. Examples of stand-alone processes in working memory include path integration, perspective taking, and spatial updating, while spatial planning requires interaction with spatial long-term memory. Models of spatial working memory presented e.g. by Byrne et al. (2007), Tatler, Land (2011), or Loomis et al. (2013) assume a local egocentric map in which information about landmarks and the environment is inscribed. In contrast, Wiener, Mallot (2003) suggested a working-memory structure formed as a local graph of places in which more distant places are collapsed into regional nodes. In order to reconcile this approach with the view-graph model for long-term memory, we consider a local subgraph of the view-graph containing (i) the current view, (ii) all views connected to this view by a fixed number of movement steps, and (iii) some local metric information represented either by egocentric position-labeling of the included views or by some view transformation mechanism similar to the one suggested in object recognition by Ullman, Basri (1991), or both. This latter component is required to account for spatial updating, which is a basic function of spatial working memory.
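The working-memory stage just described, i.e. the local subgraph around the current view, can be sketched as a breadth-first traversal bounded by a fixed number of movement steps. The function below builds on the hypothetical ViewGraph sketch given earlier; it is an illustration of the idea, not the authors' model code.

```python
from collections import deque


def working_memory(view_graph, current_view, max_steps=2):
    """Select the local subgraph of the view-graph around the current view.

    Returns the set of views reachable from `current_view` by at most
    `max_steps` movement links; egocentric position labels or a view
    transformation mechanism (not modeled here) would be attached to them.
    """
    reached = {current_view: 0}
    queue = deque([current_view])
    while queue:
        v = queue.popleft()
        if reached[v] == max_steps:
            continue
        for _movement, nxt in view_graph.edges.get(v, []):
            if nxt not in reached:
                reached[nxt] = reached[v] + 1
                queue.append(nxt)
    return set(reached)
```

Spatial updating then amounts to re-running this selection whenever the current view changes, loading newly reachable views from long-term memory.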
Frames of reference
While perceptions and spatial working memory are largely organized in an egocentric way, long-term memory must be independent of the observer's current position and orientation and is therefore often called allo- or geocentric. These terms imply that a frame of reference is used, much as a mathematical coordinate system within which places are represented by their coordinates (e.g., Gallistel 1990). Clearly, the assumption of an actual coordinate frame in the mental map leads to severe problems, not the least of which are the representation of (coordinate) numbers by neurons and the choice of the global coordinate origin. The view-graph approach avoids these problems. Long-term memory is independent of ego's position and orientation since the views and their connections are carried around like a portfolio, i.e. as abstract knowledge that does not change upon ego's movements. Working memory may rightly be called egocentric since it collects views as they appear from the local or a close-by position. In the view-based model, the transform between the pose-independent long-term memory and the pose-dependent (egocentric) working memory reduces to a simple selection process: the views corresponding to the current pose are selected and transferred into working memory.
Predictions and experimental results
The sketched model of spatial memory interplay makes predictions about the recollections that subjects may make of distant places. The task of imagining a distant place is a working-memory task where an image of that place may be built by having an imagined ego move to the target place. Bisiach, Luzzatti (1978) show that hemilateral neglect in the recall of landmarks around the Piazza del Duomo in Milan, Italy, affects the landmarks appearing on the left when viewed from an imagined view-point, but not the landmarks on the respective right side. This result can be expressed by assuming that neglect entails a loss of the left side of spatial working memory, into which no long-term memory items can be loaded; the long-term memory items themselves are unaffected by the neglect condition.
For the imagery of distant places, two mechanisms can be assumed. In a mental-travel mechanism, an observer might imagine a travel from his or her current position to the requested target place, generate a working memory, and recall the image from this working memory. In a recall-from-index mechanism, place names might be recalled from long-term memory without mental travel, e.g. by some sort of indexing mechanism, which is then likely to recall a canonical view of the target place.
The mental-travel mechanism is illustrated in Fig. 3. Assume that the subject is currently located at position A in a familiar downtown environment. When asked to recall a view of the central square appearing in Fig. 3, mental travel will generate a southward view in spatial working memory, which is then recalled. In contrast, when asked at position B, the mental-travel mechanism will yield a westward view, and so on. We therefore predict that recall, or imagery, of a distant place will result in oriented views whose orientation depends on the interview location. Preliminary data (Rohrich et al. 2013) support this prediction: passers-by who were approached in downtown Tübingen and asked to sketch a map of the Holzmarkt (a landmark square in central Tübingen) produced maps whose orientation depended on the interview site. As predicted, orientations were preferred that coincided with the direction of approach from the current interview location. This effect was not found for additional interview locations some 2 km away from downtown, indicating that here a different recall mechanism might operate.

Fig. 3 Mental-travel mechanism of spatial recall. When located at a nearby place, but out of sight of the target (places A–D), recall by mental travel towards the target place will result in different views. Preliminary data suggest that this position-dependence of spatial recall exists. (For image sources see Fig. 2.)

Oriented recall can also be triggered by explicitly asking subjects to perform a mental travel before sketching a map. Basten et al. (2012) asked subjects to imagine walking one of two ways in downtown Tübingen, passing the Holzmarkt square in either westward or eastward direction. In this phase of the experiment, the Holzmarkt was not mentioned explicitly. When subjects were asked afterwards to draw sketches of the Holzmarkt, the produced view orientations were clearly biased towards the view orientation occurring in the respective direction of mental travel carried out by each subject. This indicates that oriented view-like memories are generated during mental travel and affect subsequent recall and imagery.
Conclusion
We suggest that spatial long-term memory consists of a graph of views linked together according to the movements effecting each view transition. Working memory contains local views as well as those nearby views which are connected to one of the local views. When walking onwards, views of approached places are added from long-term memory, thereby maintaining orientation continuity (spatial updating). In recall, views are selected from either working or
long-term memory. For places more than 2 km away, recall reflects the long-term memory contents only.

Acknowledgment
WGR was supported by the Deutsche Forschungsgemeinschaft within the Center for Integrative Neuroscience (CIN) Tübingen.

References
Basten K, Meilinger T, Mallot HA (2012) Mental travel primes place orientation in spatial recall. Lecture Notes Artif Intell 7463:378–385
Bisiach E, Luzzatti C (1978) Unilateral neglect of representational space. Cortex 14:129–133
Bülthoff HH, Edelman S (1992) Psychophysical support for a two-dimensional view interpolation theory of object recognition. Proc Natl Acad Sci 89:60–64
Byrne P, Becker S, Burgess N (2007) Remembering the past and imagining the future: a neural model of spatial memory and imagery. Psych Rev 114:340–375
Cartwright BA, Collett TS (1982) How honey bees use landmarks to guide their return to a food source. Nature 295:560–564
Gallistel CR (1990) The organization of learning. The MIT Press, Cambridge
Gillner S, Weiß AM, Mallot HA (2008) Visual place recognition and homing in the absence of feature-based landmark information. Cognition 109:105–122
Kuipers B (1978) Modeling spatial knowledge. Cogn Sci 2:129–153
Loomis JM, Klatzky RL, Giudice NA (2013) Representing 3D space in working memory: spatial images from vision, hearing, touch, and language. In: Lacey S, Lawson R (eds) Multisensory imagery: theory and applications. Springer, New York
Mallot HA, Basten K (2009) Embodied spatial cognition: biological and artificial systems. Image Vision Comput 27:1658–1670
O'Keefe J, Nadel L (1978) The hippocampus as a cognitive map, chapter 2: Spatial behaviour. Clarendon, Oxford
Rohrich WG, Binder N, Mallot HA (2013) Imagery of familiar places varies with interview location. In: Proceedings of the 10th Göttingen Meeting of the German Neuroscience Society, pp T242C. www.nwg-goettingen.de/2013/upload/file/Proceedings NWG2013.pdf
Schölkopf B, Mallot HA (1995) View-based cognitive mapping and path planning. Adapt Behav 3:311–348
Tatler BW, Land MF (2011) Vision and the representation of the surroundings in spatial memory. Phil Trans R Soc Lond B 366:596–610
Tolman EC (1932) Purposive behavior in animals and men, chapter XI. The Century Co., New York
Tolman EC (1948) Cognitive maps in rats and man. Psych Rev 55:189–208
Ullman S, Basri R (1991) Recognition by linear combinations of models. IEEE Trans Pattern Anal Mach Intell 13:992–1006
Wang RF, Spelke ES (2002) Human spatial representation: insights from animals. Trends Cogn Sci 6:376–382
Wiener JM, Mallot HA (2003) Fine-to-coarse route planning and navigation in regionalized environments. Spatial Cogn Comput 3:331–358


Systematicity and Compositionality in Computer Vision

Germán Martín García, Simone Frintrop, Armin B. Cremers
Institute of Computer Science III, Universität Bonn, Germany

Abstract
The systematicity of vision is a topic that has been discussed thoroughly in the cognitive science literature; however, few accounts of it exist in relation to computer vision (CV) algorithms. Here, we argue that the implications of the systematicity of vision, in terms of what behavior is expected from CV algorithms, are important for the development of such algorithms. In particular, the fact that systematicity is a strong argument for compositionality should be relevant when designing computer vision algorithms and the representations they work with. In this paper, we discuss compositionality and systematicity in CV applications and present a CV system that is based on compositional representations.

Keywords
Systematicity, Compositionality, Computer Vision

Systematicity and Compositionality
In their seminal paper (Fodor and Pylyshyn 1988), Fodor and Pylyshyn address the question of the systematicity of cognition. Systematicity is the property by which related thoughts or sentences are understood. Anyone able to understand the sentence "John loves the girl" should be able to understand the related sentence "The girl loves John". This can be explained because both sentences are syntactically related. It is because there is a structure on the sentences that language, and thought, exhibit systematic behavior. The compositionality principle states that the meaning, or the content, of a sentence is derived from the semantic contribution of its constituents and the relations between them (Szabó 2013). It is because "John", "the girl", and "loves" make the same semantic contribution to the sentence "John loves the girl" and to "The girl loves John" that we are able to systematically understand both of them. In the case of language, systematicity is achieved by a compositional structure of constituents. In general, systematicity is a strong argument for compositionality (Szabó 2013): we are able to understand an immense number of sentences which we have never seen before.
This can be extended to vision: we are able to make sense of scenes we have never seen before because they are composed of items we know. The systematicity of vision is defended by several authors. Already in (Fodor and Pylyshyn 1988), Fodor and Pylyshyn foresee that systematicity is probably a general property of cognition that is not limited to verbal capabilities. In the cognitive science literature, there are several arguments that support that vision is systematic (Aparicio 2012; Tacca 2010): "if a subject is capable of visually representing a red ball then he must be capable of representing: i) the very same red ball from a large number of different viewpoints (and the retinal inputs); ii) a number of similar red balls [...]; and iii) red objects and ball-shaped objects in general." (Aparicio 2012).
In this paper, we are concerned with the sort of systematic behavior that should be expected when a scene is observed from different points of view: a systematic CV algorithm should be able to determine the visual elements that compose the images and find the correspondences between them over time. Some authors claim that systematicity in vision can be achieved without having compositionality (Edelman and Intrator 2000, 2003). However, the models they provide have not been shown to be applicable to real-world CV problems. We argue that, from a computer scientist's point of view, resorting to compositionality is beneficial when designing CV algorithms.
Compositionality in Computer Vision Algorithms
The systematicity problem is rarely addressed in computational models of vision. In Edelman and Intrator (2000), the authors acknowledge that structural descriptions are the preferred theory about human vision that allows for view-point abstraction and novel shape recognition. In the structural approaches to vision, the visual information is explained in terms of atomic elements and the spatial relations that hold between them (Edelman 1997). One example is the Recognition-by-Components theory of Biederman (1987). In this theory, object primitives are represented by simple geometric 3D components called geons. However, extracting such primitive elements from images is by no means a trivial task in CV. Approaches that attempt to extract such primitives to explain the visual phenomena
are hard to realize in practice, and according to Andreopoulos and Tsotsos (2013) there is no method that works reliably with natural images. Here, we suggest generating such primitive elements by grouping mechanisms realized by segmentation methods, which are well investigated in CV. In the following section, we propose a computer vision system that is based on such perceptually coherent segments to represent scenes in a compositional way.
A Compositional Approach for Visual Scene Matching
Here, we present a compositional vision system that is able to represent a scene in terms of perceptually coherent components and the relations between them with the help of a graph representation. A graph matching algorithm makes it possible to match components between different viewpoints of a scene and thus enables a scene representation that is temporally consistent. In contrast to geons, our segments are easily extracted with standard segmentation algorithms; we use the well-known Mean Shift segmentation algorithm (Comaniciu and Meer 2002). Mean Shift produces a segmentation based on the proximity of pixels in spatial and color spaces. We construct a graph where the nodes represent segments and the edges the neighborhood of segments. We use labeled edges, where the labels correspond to the relations between segments. These are of two types, "part of" and "attached to", and can be obtained automatically from the image by simple procedures. To determine whether two segments share a common border ("attached to" relation), it is enough to perform two morphological operations: first dilate, and then intersect both segments. The remaining pixels constitute the shared contour and indicate that this relation is present. To find whether segment A is "part of" segment B, it is enough to check whether the outer contour of segment B is the same as the outer contour of the union of A and B.
A and B. utive frames are shown in Fig. 1: the matched segments are displayed
Once the graphs are built, we can apply a graph matching algo- with the same color, and those that were missed are displayed in
rithm to establish correspondences between nodes, and thus, between black. It can be seen that some missing matches originate from having
segments. Suppose we have two graphs G1 V1 ; E1 ; X1 and non-repeatable segmentations over frames, i.e., the boundaries of the
G2 V2 ; E2 ; X2 defined by a set of nodes V, edges E, and attributes segments are not always consistent when the viewpoint changes (see,
measured on the nodes X. We want to find a labelling function f that for example, the segmentation of the sponge in frames d) and e) in
assigns nodes from G1 to nodes in G2: f:G1 ? G2. We base our Fig. 1). This is a known problem of image segmentation algorithms
approach for matching on (Wilson and Hancock 1997). The authors (Hedau et al. 2008) that has two effects: a segment in frame 1 is
propose a relaxation algorithm for graph matching that locally segmented as two in frame 2, or the other way round. As a conse-
updates the label of each node based on an energy functional F quence, the graphs that are built on top of these segmentations are
defined on the labelling function f. By defining F f as the maximum structurally different.

Fig. 1 First row: original non-consecutive images. Rows 2 & 3: results of the matching between the corresponding pair of frames. Matches are
displayed with the same colors. Segments for which no match was found are shown in black
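The iterative assignment of Equation (2) can be sketched as follows. The Gaussian appearance model and the neighborhood-overlap stand-in for the structural term P(f) are our own simplifications; the actual system follows Wilson and Hancock (1997) and additionally handles edge labels and directions, which this sketch omits.

```python
import numpy as np

# Illustrative sketch of a relaxation-style graph matcher, not the system code.

def relaxation_match(attrs1, attrs2, nbrs1, nbrs2, n_iter=10, sigma=1.0):
    """attrs1, attrs2: dict node -> feature vector (e.g. mean colour + box size)
    nbrs1, nbrs2:   dict node -> set of neighbouring nodes (entry for every node)
    Returns a labelling f: nodes of G1 -> nodes of G2."""

    def appearance(u, v):
        # crude Gaussian appearance term p(x_u, x_v | u, v)
        d = np.asarray(attrs1[u], float) - np.asarray(attrs2[v], float)
        return np.exp(-np.dot(d, d) / (2 * sigma ** 2))

    # start from the best purely appearance-based assignment
    f = {u: max(attrs2, key=lambda v: appearance(u, v)) for u in attrs1}

    def structural(u, v):
        # fraction of u's neighbours currently mapped into v's neighbourhood,
        # a simple stand-in for the structure-preservation term P(f)
        if not nbrs1[u]:
            return 1.0
        ok = sum(1 for n in nbrs1[u] if f.get(n) in nbrs2[v])
        return (ok + 1e-3) / len(nbrs1[u])

    for _ in range(n_iter):
        for u in attrs1:
            f[u] = max(attrs2, key=lambda v: appearance(u, v) * structural(u, v))
    return f
```

Each sweep re-assigns every node of G1 to the node of G2 that maximises the product of the two terms, so structurally consistent matches reinforce each other over the iterations.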

In future work, we will extend the matching algorithm so that merging of segments is performed. With the presented system, we show in an exemplary way how the concept of compositionality can be integrated into CV algorithms and how, by making use of well-approved segmentation and graph-matching methods, a simple visual representation can be achieved that is coherent over time.

References
Andreopoulos A, Tsotsos JK (2013) 50 years of object recognition: directions forward. Comput Vis Image Understand
Aparicio VMV (2012) The visual language of thought: Fodor vs. Pylyshyn. Teorema: Revista Internacional de Filosofía 31(1):59–74
Biederman I (1987) Recognition-by-components: a theory of human image understanding. Psychol Rev 94(2):115–147
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
Edelman S (1997) Computational theories of object recognition. Trends Cogn Sci, pp 296–304
Edelman S, Intrator N (2000) (Coarse coding of shape fragments) + (retinotopy) ≈ representation of structure. Spatial Vision 13(2–3):255–264
Edelman S, Intrator N (2003) Towards structural systematicity in distributed, statically bound visual representations. Cogn Sci 27(1):73–109
Fodor JA, Pylyshyn ZW (1988) Connectionism and cognitive architecture: a critical analysis. Cognition 28(1):3–71
Garcia GM (2014) Towards a graph-based method for image matching and point cloud alignment. Tech. rep., University of Bonn, Institute of Computer Science III
Hedau V, Arora H, Ahuja N (2008) Matching images under unstable segmentations. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE
Szabó ZG (2013) Compositionality. In: Zalta EN (ed) The Stanford Encyclopedia of Philosophy, fall 2013 edn
Tacca MC (2010) Seeing objects: the structure of visual representation. Mentis
Wilson RC, Hancock ER (1997) Structural matching by discrete relaxation. IEEE Trans Pattern Anal Mach Intell 19:634–648


Control and flexibility of interactive alignment: Möbius syndrome as a case study

John Michael1,2,3, Kathleen Bogart4, Kristian Tylén3,5, Joel Krueger6, Morten Bech3, John Rosendahl Østergaard7, Riccardo Fusaroli3,5
1 Department of Cognitive Science, Central European University, Budapest, Hungary; 2 Center for Subjectivity Research, Copenhagen University, Copenhagen, Denmark; 3 Interacting Minds Centre, Aarhus University, Aarhus, Denmark; 4 School of Psychological Science, Oregon State University, Corvallis, USA; 5 Center for Semiotics, Aarhus University, Aarhus, Denmark; 6 Department of Sociology, Philosophy, and Anthropology, University of Exeter, Amory, Exeter, UK; 7 Aarhus University Hospital, Aarhus, Denmark

Keywords
Möbius Syndrome, social interaction, social cognition, alignment

When we interact with others, there are many concurrent layers of implicit bodily communication and mutual responsiveness at work: from the spontaneous temporal synchronization of movements (Richardson et al. 2007), to gestural and postural mimicry (Chartrand and Bargh 1999; Bernieri and Rosenthal 1991), and to multiple dimensions of linguistic coordination (Garrod and Pickering 2009; Clark 1996; Fusaroli and Tylén 2012). These diverse processes may serve various important social functions. For example, one individual's facial expressions, gestures, bodily postures, and tone and tempo of voice can provide others with information about her emotions, intentions and other mental states, and thereby help to sustain interpersonal understanding and support joint actions. And when such information flows back and forth among two or more mutually responsive participants in an interaction, the ensuing alignment can promote social cohesion, enhancing feelings of connectedness and rapport (Lakin and Chartrand 2003; Bernieri 1988; Valdesolo et al. 2010). Indeed, by enhancing rapport, interactive alignment may also increase participants' willingness to cooperate with each other (van Baaren et al. 2004; Wiltermuth and Heath 2009) and, equally importantly, their mutual expectations of cooperativeness even when interests are imperfectly aligned, as in scenarios such as the prisoner's dilemma (Rusch et al. 2013). Moreover, interactive alignment may even enhance interactants' ability to understand each other's utterances (Pickering and Garrod 2009) and to communicate their level of confidence in their judgments about situations (Fusaroli et al. 2012), thereby enhancing performance on some joint actions. Finally, interactive alignment may also increase interactants' ability to coordinate their contributions to joint actions (Valdesolo et al. 2010), because synchronization increases interactants' attention to one another's movements, and because it may be easier to predict and adapt to the movements of another person moving at a similar tempo and initiating movements of a similar size, duration, and force as oneself.
It is no surprise, then, that recent decades have seen a dramatic increase in the amount of attention paid to various kinds of interactive alignment in the cognitive sciences. However, although there is a broad consensus about the importance of interactive alignment processes for social interaction and social cognition, there are still many open questions. How do these diverse processes influence each other? Which ones contribute, and in what ways, to interpersonal understanding, cooperativeness and/or performance in joint actions? Is alignment sometimes counterproductive? To what extent can alignment processes be deliberately controlled and flexibly combined, replaced, tweaked or enhanced? This latter question may be especially relevant for individuals who have impairments in some form of bodily expressiveness, and who therefore may benefit by compensating with some other form of expressiveness. In the present study, we investigated social interactions involving just such individuals, namely a population of teenagers with Möbius Syndrome (MS), a form of congenital, bilateral facial paralysis resulting from maldevelopment of the sixth and seventh cranial nerves (Briegel et al. 2006).
Since people with MS are unable to produce facial expressions, it is unsurprising that they often experience difficulties in their social interactions and in terms of general social well-being. We therefore implemented a social skills intervention designed to train individuals with facial paralysis owing to MS to adopt alternative strategies to compensate for the unavailability of facial expression in social interactions (e.g. expressive gesturing and prosody). In order to evaluate the effectiveness of this intervention, each of the 5 participants with MS (MS-participants) engaged in interactions before and after the intervention with partners who did not have MS (non-MS-participants). These social interactions consisted of two separate tasks: a casual getting-to-know-you task and a task designed to tap interpersonal understanding. Participants filled out rapport questionnaires after each interaction. In addition, the interactions were videotaped and analyzed by independent coders, and we extracted two kinds of linguistic data relating to the temporal organization of the conversational behavior: prosody (fundamental frequency) and speech rate. We used this latter data to calculate indices of individual behavioral complexity and of alignment using cross-recurrence quantification analysis (CRQA).
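As an illustration of the alignment measure, a minimal cross-recurrence computation between two behavioral time series might look as follows. Full CRQA additionally uses delay embedding and diagonal-line statistics, and the radius parameter here is a hypothetical choice, not the value used in the study.

```python
import numpy as np

def cross_recurrence_rate(x, y, radius=0.1):
    """Core of a cross-recurrence analysis between two 1-D behavioral time
    series (e.g. z-scored pitch tracks of the two interlocutors): build the
    cross-recurrence matrix CR[i, j] = 1 iff |x[i] - y[j]| <= radius and
    return the fraction of recurrent points (the recurrence rate).

    Illustrative sketch only; full CRQA adds delay embedding and
    diagonal-line measures such as determinism and maximum line length.
    """
    x = np.asarray(x, float)[:, None]   # column vector, broadcasts to a matrix
    y = np.asarray(y, float)[None, :]   # row vector
    cr = np.abs(x - y) <= radius
    return float(cr.mean())
```

Comparing the recurrence rate of real pairs against that of surrogate (randomly re-paired) dyads, as mentioned below, is the usual way to test whether the observed alignment exceeds chance.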
We found several interesting results. First, the intervention increased observer-coded rapport. Secondly, observer-coded gesture and expressivity increased in participants with and without MS after the intervention. Thirdly, fidgeting and repetitiveness of verbal behavior decreased in both groups after the intervention. Fourthly, while we did in general observe alignment (compared to surrogate pairs), overall
linguistic alignment actually decreased after the intervention, and pitch alignment was negatively correlated with rapport.
These results suggest that the intervention had an impact on MS interlocutors, which in turn affected non-MS interlocutors, making them less nervous and more engaged. Behavioral dynamics can statistically predict observer-coded rapport, thus suggesting a direct link between them and the experience of the interaction.
This pattern of findings provides initial support for the conjecture that a social skills workshop like the one employed here can not only affect the participants with MS but also, and perhaps even more importantly, affect the interaction as a whole as well as the participants without MS. One reason why this is important is that some of the difficulties experienced by individuals with MS in social interactions may arise from other people's discomfort or uncertainty about how to behave. In other words, individuals without MS who interact with individuals with MS may interrupt the smooth flow of interaction through their uncertainty about how to interact in what is for them a new and sensitive situation. Moreover, this may also be true in other instances in which people interact with others who appear different or foreign to them (because of other forms of facial difference, skin color, etc.). Thus, this issue points to a possible direction in which further research may be conducted that would extend the findings far beyond the population of individuals with MS. More concretely, one obvious comparison would be to individuals with expressive impoverishment due to Parkinson's disease. Do these individuals also employ some of the same kinds of compensatory strategies as individuals with MS? If so, what effects does that have upon interactive alignment within social interactions? What differences does it make that their condition is an acquired rather than a congenital one?
Finally, one additional question for further research is whether some compensatory strategies are more easily automated than others. For example, it is possible that increasing hand gesturing or eye contact can be quickly learned and routinized, but that modulating one's prosody cannot. If there are such differences among the degrees to which different processes can be automated, it would be important to understand just what underlies them. On a theoretical level, this could provide useful input to help us understand the relationship between automatic and controlled processes. On a more practical level, this could be important for three concrete reasons. First of all, it may be taxing and distracting to employ deliberate strategies for expressing oneself in social interactions, and people may therefore find it tiring and be less likely to continue doing it. Secondly, it may be important that some interactive alignment processes occur without people's awareness. Thus, attempting to bring them about deliberately may actually interfere with the implicit processes that otherwise generate alignment. Indeed, there is evidence that behavioral mimicry actually undermines rapport if people become aware that it is being enacted deliberately (Bailenson et al. 2008). Thirdly, it would be important for future social skills workshops to examine whether some compensatory strategies are more effectively taught indirectly; e.g., rather than telling people to use more gestures, it may be advantageous to employ some other means which does not require them to deliberately attend to their gestures or prosody, for example by using more gestures and prosody when interacting with children with MS, by asking them to watch videos in which actors are highly expressive in their gestures and prosody, or by engaging them in role-playing games in which a high level of gesture and/or prosody is appropriate.

References
Bailenson JN, Yee N, Patel K, Beall AC (2008) Detecting digital chameleons. Comput Hum Behav 24:66–87
Bernieri FJ, Rosenthal R (1991) Interpersonal coordination: behavior matching and interactional synchrony. In: Feldman RS, Rimé B (eds) Fundamentals of nonverbal behavior. Cambridge University Press, Cambridge, pp 401–432
Bogart KR, Tickle-Degnen L, Ambady N (2012) Compensatory expressive behavior for facial paralysis: adaptation to congenital or acquired disability. Rehabilit Psychol 57(1):43–51
Bogart KR, Tickle-Degnen L, Joffe M (2012) Social interaction experiences of adults with Moebius syndrome: a focus group. J Health Psychol. Advance online publication
Bogart KR, Matsumoto D (2010) Facial mimicry is not necessary to recognize emotion: facial expression recognition by people with Moebius syndrome. Soc Neurosci 5(2):241–251
Bogart KR, Matsumoto D (2010) Living with Moebius syndrome: adjustment, social competence, and satisfaction with life. Cleft Palate-Craniofacial J 47(2):134–142
Briegel W (2007) Psychopathology and personality aspects of adults with Moebius sequence. Clin Genet 71:376–377
Chartrand TT, Bargh JA (1999) The chameleon effect: the perception-behavior link and social interaction. J Person Soc Psychol 76:893–910
Clark HH (1996) Using language. Cambridge University Press, Cambridge
Derogatis LR (1977) SCL-90-R: administration, scoring and procedures manual-I for the revised version. Johns Hopkins University School of Medicine, Baltimore
Fahrenberg J, Hampel R, Selg H (2001) FPI-R. Das Freiburger Persönlichkeitsinventar, 7th edn. Hogrefe, Göttingen
Garrod S, Pickering MJ (2009) Joint action, interactive alignment, and dialog. Top Cogn Sci 1(2):292–304
Helmreich R, Stapp J (1974) Short forms of the Texas Social Behavior Inventory (TSBI), an objective measure of self-esteem. Bull Psychon Soc
Kahn JB, Gliklich RE, Boyev KP, Stewart MG, Metson RB, McKenna MJ (2001) Validation of a patient-graded instrument for facial nerve paralysis: the FaCE scale. Laryngoscope 111(3):387–398
Lakin J, Chartrand T (2003) Using nonconscious behavioral mimicry to create affiliation and rapport. Psychol Sci 14:334–339
Mattick RP, Clarke JC (1998) Development and validation of measures of social phobia scrutiny fear and social interaction anxiety. Behav Res Therapy 36(4):455–470
Meyerson MD (2001) Resiliency and success in adults with Moebius syndrome. Cleft Palate Craniofac J 38:231–235
Oberman LM, Winkielman P, Ramachandran VS (2007) Face to face: blocking facial mimicry can selectively impair recognition of emotional expressions. Soc Neurosci 2(3):167–178
Richardson MJ, Marsh KL, Isenhower RW, Goodman JR, Schmidt RC (2007) Rocking together: dynamics of intentional and unintentional interpersonal coordination. Hum Mov Sci 26:867–891
Robinson E, Rumsey N, Partridge J (1996) An evaluation of social interaction skills training for facially disfigured people. Br J Plast Surg 49:281–289
Rosenberg M (1965) Rosenberg self-esteem scale (RSE). Acceptance and Commitment Therapy. Measures Package, 61
Tickle-Degnen L, Lyons KD (2004) Practitioners' impressions of patients with Parkinson's disease: the social ecology of the expressive mask. Soc Sci Med 58:603–614
Valdesolo P, Ouyang J, DeSteno D (2010) The rhythm of joint action: synchrony promotes cooperative ability. J Exp Soc Psychol 46:693–695
van Baaren RB, Holland RW, Kawakami K, van Knippenberg A (2004) Mimicry and pro-social behavior. Psychol Sci 15:71–74
Zigmond AS, Snaith R (1983) The hospital anxiety and depression scale. Acta Psychiatrica Scand 67(6):361–370
Efficient analysis of gaze-behavior in 3D environments

Thies Pfeiffer1, Patrick Renner, Nadine Pfeiffer-Lessmann
1 Center of Excellence Cognitive Interaction Technology, Bielefeld University, Germany; 2 SFB 673: Alignment in Communication, Bielefeld University, Germany

Abstract
We present an approach coined EyeSee3D to identify the 3D point of regard and the fixated object in real-time based on 2D gaze videos without the need for manual annotation. The approach does not require additional hardware except for the mobile eye tracker. It is currently applicable for scenarios with static target objects and requires fiducial markers to be placed in the target environment. The system has already been tested in two different studies. Possible applications are visual world paradigms in complex 3D environments, research on visual attention, or human-human/human-agent interaction studies.

Keywords
3D eye tracking, Natural environments

Introduction
Humans have evolved to live in a 3D spatial world. This affects our perception, our cognition and our action. If human behavior, and in particular visual attention, is analyzed in scientific studies, however, practical reasons often force us to reduce the three-dimensional world to two dimensions within a small field of view presented on a computer screen. In many situations, such as spatial perspective taking, situated language production, or the understanding of spatial references, just to name a few, a restriction to 2D experimental stimuli can render it impossible to transfer findings to our natural everyday environments.
One of the reasons for this methodological compromise is the effort required to analyze gaze data in scenarios where the participant is allowed to move around and inspect the environment freely. Current mobile eye-tracking systems use a scene camera to record a video from the perspective of the user. Based on one or two further cameras directed at the participant's eyes, the gaze fixation of the participant is then mapped onto the video of the scene camera. While binocular systems are already able to compensate for parallax by estimating the distance of the fixation from the observer, they have no representation of the 3D world but still only work on the 2D projection of the world visible in the scene camera video. The most important part then is identifying in the video stream that particular object the participant has been fixating. This currently requires manual annotations, which take several times as long as the recorded time. Depending on the complexity of the annotation (target object count and density), we had cases where the annotation of one minute of recorded video required fifteen minutes of annotation or more.
With our EyeSee3D approach, we provide a software tool that is able to identify the fixated objects automatically, provided that the environment can be covered with some visible markers that do not affect the visual behavior and that the target objects remain static.
Related Work
There are approaches to semi-automatic gaze annotation based on 2D computer vision, such as the SemantiCode approach by Pontillo et al. (2010), which still requires manual annotation but achieves a speed-up by incrementally learning the labeling of the targets using machine learning and computer vision techniques. Still, the experimenter has to at least validate every label. Approaches that also use 3D models are Toyama et al. (2012), who target human-computer interaction rather than scientific studies, and Paletta et al. (2013), who use a 3D scan of the target environment to later identify the target position. Their approach requires much more effort during preparation but then does not require an instrumentation of the environment with markers.

Application Areas
The presented EyeSee3D approach can be applied as a method to accurately annotate fixations in 3D environments as required for scientific studies. We have already tested this approach in two studies. Both studies involve settings with two interacting interlocutors (no confederates) sitting face-to-face at a table.
In the first study, we were interested in gaze patterns of joint attention (Pfeiffer-Lessmann, Pfeiffer, Wachsmuth 2013). We placed 23 figures of a LEGO Duplo set on a table, each of which faced one of the interlocutors. The experimenter then describes a certain figure and the interlocutors have to team up to identify the figure. The task, however, is not as simple as it sounds: the information given might only be helpful for one of the interlocutors, as it might refer to features of the figure only visible from a certain perspective. Even more, the interlocutors are instructed to neither speak nor gesture to communicate. This way we force the participants to use their gaze to guide their partner's attention towards the correct figure. The set-up used in this experiment will be used later in this paper to illustrate the EyeSee3D method.
In the second study, we were interested in creating computational models for predicting the targets of pointing gestures and, more generally, areas which in the near future will be occupied by a human interlocutor during interaction (Renner, Pfeiffer, Wachsmuth 2014). This research is motivated by human-robot interaction, in which we want to enable robots to anticipate human movements in order to be more responsive, i.e., in collision-avoidance behavior.
Besides eye tracking, in this study we also combined the EyeSee3D approach with an external motion-tracking system to track the hands and the faces of the interlocutors. Using the same principles as presented in the next section, the targets of pointing gestures as well as gazes towards the body of the interlocutor can also be identified computationally without the need for manual annotations.
EyeSee3D
The EyeSee3D approach is easy to set up. Fig. 1 (left) shows a snapshot from one of our own studies on joint attention between two human interlocutors (Pfeiffer-Lessmann, Pfeiffer, Wachsmuth 2013). In this study we had 12 pairs of interaction partners and a total of about 160 min of gaze video recordings. It would have taken about 40 h to manually annotate the gaze videos, excluding any additional second annotations to test for annotation reliability.
The process followed by EyeSee3D is presented in Fig. 2. In a preparation phase, we covered the environment with so-called fiducial markers, highly visible printable structures that are easy to detect using computer-vision methods (see Fig. 1, mid upper half). We verified that these markers did not attract significant attention from the participants. As a second step, we created proxy geometries for the relevant stimuli, in this example small toy figures (see Fig. 3). For our set-up, a simple approximation using bounding boxes is sufficient, but any complex approximation of the target may be used. When aiming for maximum precision, it is possible to use 3D scans with exact replications of the hull of the target structure. The whole process of setting up such a table takes about 30 min. These preparations have to be made once, as the created model can be used for all study recordings.
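Since bounding boxes suffice as proxy geometries in this set-up, a proxy can be represented as a labeled axis-aligned box together with a standard ray-box ("slab") intersection test, which the gaze-ray casting described next relies on. The class and function names below are our own sketch, not the EyeSee3D API.

```python
import numpy as np

class ProxyBox:
    """Axis-aligned bounding box standing in for one target object
    (a sketch, not the EyeSee3D data model)."""
    def __init__(self, label, lo, hi):
        self.label = label
        self.lo = np.asarray(lo, float)   # minimum corner (x, y, z)
        self.hi = np.asarray(hi, float)   # maximum corner (x, y, z)

    def hit(self, origin, direction):
        """Slab test: distance along the ray to the box, or None if missed."""
        origin = np.asarray(origin, float)
        direction = np.asarray(direction, float)
        direction = np.where(np.abs(direction) < 1e-12, 1e-12, direction)
        t1 = (self.lo - origin) / direction
        t2 = (self.hi - origin) / direction
        t_near = np.minimum(t1, t2).max()
        t_far = np.maximum(t1, t2).min()
        if t_near <= t_far and t_far >= 0:
            return max(t_near, 0.0)
        return None

def fixated_object(proxies, origin, direction):
    """Label of the closest proxy hit by the gaze ray, or None."""
    hits = [(box.hit(origin, direction), box.label) for box in proxies]
    hits = [(t, label) for t, label in hits if t is not None]
    return min(hits)[1] if hits else None
```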
Based on these preparations, we are now able to conduct the study and record the eye-tracking data (gaze videos and gaze data). EyeSee3D then automatically annotates the recorded gaze videos. For each frame of the video, the algorithms detect fiducial markers in the image and estimate the position and orientation of the scene camera in 3D space. For this process to succeed, at least one fiducial marker has to be fully visible in each frame. The camera position and orientation are then used together with the gaze information provided by the eye tracker itself to cast a gaze ray into the 3D proxy geometries. This gaze ray intersects the 3D proxy geometries exactly at the point (see Fig. 1, right) that is visualized by the gaze cursor in the scene camera video provided by the standard eye-tracking software (see Fig. 1, left). As each of the proxy geometries is labeled, we can identify the target object automatically.

Fig. 1 The left snapshot is taken from a 2D mobile eye-tracking video recorded from the egocentric perspective of the scene camera. The point of regard is visualized using a green circle, and a human annotator would have to manually identify the fixated object, here the figure of a girl. With EyeSee3D, gaze rays can be computed and cast into a 3D abstract model of the environment (simple white boxes around the figures); the intersection with the fixation target (the box corresponding to the figure of the girl) is computed automatically and in real-time

Fig. 2 The EyeSee3D method requires a one-time preparation phase. During study recording there are two alternatives: either (a) use the standard tools and run EyeSee3D offline to annotate the data, or (b) use EyeSee3D online during the study

Fig. 3 The 3D proxy geometries that had to be created to determine the fixated objects. The different figures are textured with original pictures, which is not needed for the process but useful for controlling the orientation of the figures when setting up the experiment

This annotation process can either be used online during the study, so that the annotation results are already available when the study session is completed, or, alternatively, EyeSee3D can be used in offline mode to analyze the previously recorded gaze videos and data files. This offline mode has the advantage that it can be repeatedly applied to the same data. This is useful in cases where the number and placement of the proxy geometries is not known beforehand and is incrementally refined as understanding of the problem domain progresses. For example, at the moment we are only interested in locating the target figure. Later on we might be working together with psycholinguists on language processing following a visual-world paradigm. We might then also be interested in whether the participants have looked at the headdress, the head, the upper body or the lower body of the figures during sentence processing. After updating the 3D proxy models, we could use EyeSee3D to re-annotate all videos and have the more fine-grained annotation ready within minutes.
In our example study, we were able to cover about 130 min of the 160 min of total recordings using this technique. In the remaining 30 min, participants were either moving their head so quickly that the scene camera only provided a motion-blurred image, or they turned towards the interaction partner or the experimenter for questions, so that no marker was visible in the image (but also no target stimuli). Thus, the remaining 30 min were not relevant for the evaluation of the study.
More technical details about the EyeSee3D approach have been presented at ETRA 2014 (Pfeiffer and Renner 2014).
Discussion and Future Work
The presented initial version of our EyeSee3D approach can already significantly speed up the annotation of mobile eye-tracking studies. There are no longer economic reasons to restrict studies to short sessions and low numbers of participants. The accuracy of the system depends on the one hand on the accuracy of the eye-tracking system; in this respect the accuracy of EyeSee3D does not differ from the normal 2D video-based analysis. On the other hand, the accuracy depends on the quality with which the fiducial markers are detected: the larger the detected marker and the better the contrast, the higher the accuracy of the estimated camera position and orientation.
EyeSee3D is not only applicable to small setups, as the selected example of two interaction partners sitting at a table might suggest at first glance. The size of the environment is not restricted as long as at least one fiducial marker is in the field of view for every relevant target object. The markers might, for example, be sparsely distributed in a museum just around the relevant exhibits.
We are currently working on further improving the speed and the accuracy of the system. In addition to that, we are planning to integrate other methods for tracking the scene camera's position and orientation in 3D space, based, e.g., on tracking arbitrary but significant images. In certain settings, such as a museum or a shelf in a shopping center, this would allow for automatic tracking without any dedicated markers.
In future work, we are planning to compare the results obtained by human annotators with those calculated by EyeSee3D. In a pilot evaluation we were able to identify situations of disagreement, i.e. situations in which EyeSee3D comes to slightly different results than a human annotator, when two target objects overlap in space (which is more likely to happen with a freely moving participant than in traditional screen-based experiments) and the fixation falls somewhere in between. Such situations are likewise difficult to annotate consistently between human annotators, because of their ambiguity. Investigating the differences between the systematic and repeatable annotations provided by EyeSee3D and the interpretations of human annotators, which might depend on different aspects, such as personal preferences or the history of preceding fixations, could be very informative. Besides the described speed-up achieved by EyeSee3D, it might also provide more objective and consistent annotations.
In summary, using EyeSee3D the analysis of mobile gaze-tracking studies has become as easy as desktop-computer-based studies using remote eye-tracking systems.

Acknowledgments
This work has been partly funded by the DFG in the SFB 673 "Alignment in Communication".

References
Acknowledgments
This work has been partly funded by the DFG in the SFB 673 Alignment in Communication.

References
Paletta L, Santner K, Fritz G, Mayer H, Schrammel J (2013) 3D attention: measurement of visual saliency using eye tracking glasses. CHI '13 Extended Abstracts on Human Factors in Computing Systems, 199-204, ACM, Paris, France
Pfeiffer T, Renner P (2014) EyeSee3D: A Low-Cost Approach for Analysing Mobile 3D Eye Tracking Data Using Augmented Reality Technology. Proceedings of the Symposium on Eye Tracking Research and Applications, ACM
Pfeiffer-Lessmann N, Pfeiffer T, Wachsmuth I (2013) A Model of Joint Attention for Humans and Machines. Book of Abstracts of the 17th European Conference on Eye Movements (Bd. 6), 152, Lund, Sweden
Pontillo DF, Kinsman TB, Pelz JB (2010) SemantiCode: using content similarity and database-driven matching to code wearable eyetracker gaze data. ACM ETRA 2010, 267-270, ACM
Renner P, Pfeiffer T, Wachsmuth I (2014) Spatial references with gaze and pointing in shared space of humans and robots. In: Proceedings of the Spatial Cognition 2014
Toyama T, Kieninger T, Shafait F, Dengel A (2012) Gaze guided object recognition using a head-mounted eye tracker. ACM ETRA 2012, 91-98, ACM


The role of the posterior parietal cortex in relational reasoning

Marco Ragni, Imke Franzmeier, Flora Wenczel, Simon Maier
Center for Cognitive Science, Freiburg, Germany

Abstract
Inferring information from given relational assertions is at the core of human reasoning ability. The cognitive processes involved include the understanding and integration of relational information into a mental model and drawing conclusions. In this study we are interested in identifying the role of the associated brain regions. Hence, (i) we reanalyzed 23 studies on relational reasoning from Pubmed, Science Direct, and Google Scholar with healthy participants and focused on peak-voxel analysis of single subregions of the posterior parietal cortex, allowing a more fine-grained analysis than before, and (ii) the identified regions are interpreted in light of findings on reasoning phases from our own transcranial magnetic stimulation (TMS) and fMRI studies. The results indicate a relevant role of the parietal cortex, especially the lateral superior parietal cortex (SPL), for the construction and manipulation of mental models.
Keywords
Relational Reasoning, Brain Regions, Posterior Parietal Cortex

Introduction
Consider a relational reasoning problem of the following form:
The red car is to the left of the blue car.
The yellow car is to the right of the blue car.
What follows?
The assertions formed in reasoning about (binary) relations consist of two premises connecting three terms (the cars above). Participants process each piece of information and integrate it into a mental model (Ragni, Knauff 2013). A mental model (Johnson-Laird 2006) is an analogue representation of the given information. For the problem above we could construct a mental model of the following form:
red car    blue car    yellow car
From this analogical representation (for a complete discussion of how much information might be represented please refer to Knauff 2013) the missing relational information, namely that the red car is to the left of the yellow car, can easily be inferred. The premise description above is determinate, i.e., it elicits only one mental model. There are, however, indeterminate descriptions, i.e., descriptions with which multiple models are consistent, and sometimes alternative models have to be constructed. The associated mental processes in reasoning are the model construction, mental inspection, and model variation phase. The neural activation patterns can also help unravel the cognitive processes underlying reading and processing of the premise information. First experiments utilizing recorded neural activation with PET (and later with fMRI) were conducted by Goel et al. in 1998. The initial motivation was to get an answer about which of several then popular psychological theories was correct (cp. Goel 2001), simply by examining the involvement of the respective brain areas that are connected to specific processing functions. Such an easy answer has not yet been found. However, previous analyses (e.g., Knauff 2006; Knauff 2009) across multiple studies showed the involvement of the frontal and the posterior parietal cortex (PPC), especially for relational reasoning. Roughly speaking, the role of the PPC is to integrate information across modalities (Fig. 1), and its general involvement has been shown consistently across studies (Knauff 2006).
In this article we briefly introduce state-of-the-art neural findings for relational reasoning. We present an overview of the current studies and report two studies from our lab. Subregions within the PPC, e.g., the SPC, are differentiated to allow a more fine-grained description of their role in the mental model construction and manipulation process.

Fig. 1 The posterior parietal cortex and central subregions
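To make the model construction and inspection described above concrete, here is a minimal illustrative sketch (ours, not the computational model of Ragni and Knauff 2013): premises are inserted into a one-dimensional spatial array following a simple preferred-placement strategy, and the unstated relation is read off by inspecting the array.

```python
def build_model(premises):
    """Insert each (A, relation, B) premise into a left-to-right list of terms."""
    model = []
    for a, rel, b in premises:
        if not model:
            model = [a, b] if rel == "left-of" else [b, a]
            continue
        new, anchor = (a, b) if a not in model else (b, a)
        i = model.index(anchor)
        # "left-of" with a new first term places it directly left of the anchor;
        # otherwise the new term goes directly to the right of the anchor.
        if (rel == "left-of") == (new == a):
            model.insert(i, new)
        else:
            model.insert(i + 1, new)
    return model

def inspect(model, a, b):
    """Read the implicit relation between two terms off the constructed model."""
    return "left-of" if model.index(a) < model.index(b) else "right-of"

premises = [("red car", "left-of", "blue car"),
            ("yellow car", "right-of", "blue car")]
model = build_model(premises)                     # ['red car', 'blue car', 'yellow car']
print(inspect(model, "red car", "yellow car"))    # left-of
```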
The Associated Activation in Relational Reasoning
Main activations during reasoning about relations found in a meta-study conducted by Prado et al. (2011) identified the role of the PPC and the middle frontal gyrus (MFG). Although we know that these regions are involved in the reasoning process about relations, the exact functions of these areas, the role of the subregions, and problem-specific differences remain unclear. Studies by Fangmeier, Knauff and colleagues (Fangmeier et al. 2006; Fangmeier, Knauff 2009; Knauff et al. 2003) additionally compared activations across the reasoning process. They analyzed the function of the PPC during the processing and integration of premise information and the subsequent model validation phase. The PPC is mainly active in the last phase, model validation.
We included all studies mentioned in Prado et al. (2012) and Knauff (2006) and additionally searched the three databases Pubmed, Google Scholar, and Science Direct with the keywords relational reasoning or deductive reasoning in combination with the terms neuroimaging or fMRI, and searched for studies that were cited in the respective articles. Of these 26 studies we included 23 in our analysis: all those which (i) report coordinates (either Talairach or MNI), (ii) had a reasoning vs. non-reasoning contrast, and (iii) used healthy participants, i.e., excluding patient studies. We transformed all coordinates to the MNI coordinate system for the peak voxel analysis.
Only few studies report temporal activation. Mainly activation in the middle temporal gyrus was found, possibly related to language processes. Activation in the occipital cortex is probably due to the visual presentation of the stimuli. Key activations were found in the frontal and parietal lobes (Table 1). Across all studies only the lateral SPL was consistently involved, while in the frontal regions the activation was more heterogeneous. Hence, we focused on the PPC and its subregions. In Table 1 we report anatomical probabilities for the peak coordinates located within the lateral and medial (incl. the precuneus as a subregion) SPL according to the SPM anatomy toolbox (Eickhoff et al. 2007). Reports of SPL activation in the original publications which showed an anatomical probability of less than 30 % for the region are depicted in brackets.

Table 1 Key studies and frontal and parietal activation
Anatomical probabilities for the peak coordinates located within the lateral (with the SPL as a subregion) and the medial SPC (with the precuneus as a subregion) according to the SPM anatomy toolbox (Eickhoff et al. 2007) are reported. Reports of SPC activation in the original publications which showed an anatomical probability of less than 30 % for the SPC are depicted in brackets. MC = motor cortex, PMC = premotor cortex, dlPFC = dorsolateral prefrontal cortex, AG = angular gyrus, TPJ = temporoparietal junction, SMG = supramarginal gyrus; left half-circle = left lateral, right half-circle = right lateral, circle = bilateral
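A small sketch of the reporting rule used in Table 1 (ours; the region names and probability values below are placeholders): a peak is attributed to a subregion, and the label is set in brackets when the anatomical probability for that subregion is below 30 %.

```python
def format_peak(peak_label, probabilities, threshold=0.30):
    """probabilities: dict mapping subregion name -> anatomical probability (0..1).
    Returns the reported label, bracketed if the assignment is below threshold."""
    p = probabilities.get(peak_label, 0.0)
    return peak_label if p >= threshold else f"({peak_label})"

# Hypothetical peak: 22 % probability for the lateral SPL -> reported in brackets.
print(format_peak("lateral SPL", {"lateral SPL": 0.22, "precuneus": 0.55}))  # (lateral SPL)
```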
General Discussion
Table 1 shows that in almost all experimental studies of relational reasoning the PPC is active. However, our goal, a more detailed analysis, shows the bilateral involvement of the lateral SPL and the inferior parietal lobe in the reasoning process. Additionally, and in accordance with findings from Fangmeier et al. (2006), it shows their importance in the core reasoning phase, the model validation phase. To investigate the role of these regions we conducted an fMRI study (Maier et al. 2014) and presented participants with indeterminate problems in which they could construct and vary the mental models. These processes elicited lateral SPL activation. We argue that in this region the mental model of the premise information is constructed and varied (cp. Goel et al. 2004), a result supported by our study. Thus, the lateral SPL is likely to be involved in the reasoning process. A causal connection can be established if a malfunctioning SPL leads to a decrease in reasoning performance. A method to induce virtual lesions is transcranial magnetic stimulation (TMS; Walsh, Pascual-Leone 2003). Hence, in a recent study we investigated the role of the SPL in the construction and alteration of mental models (Franzmeier et al. 2014). TMS over the SPL modulated the performance in deductive reasoning tasks, i.e., participants needed longer if the SPL was stimulated during the model validation phase. A performance modulation was achieved by unilateral right and by bilateral stimulation. The modulation direction, i.e., whether the performance was enhanced or disrupted, depended on stimulation timing. Real lesions can shed additional light on this. A recent study by Waechter et al. (2013) compared patients with lesions in the rostrolateral prefrontal cortex to patients with PPC lesions, and controls, on transitive inference problems. These results further support the role of the lateral SPL in drawing inferences and its crucial involvement in mental model construction. All studies show the eminent role of the lateral SPL (and hence of the PPC) for the reasoning process. Premise information is integrated and manipulated in a mental model which is at least partially kept in the lateral SPL and, to a lesser degree, the inferior parietal cortex.

Acknowledgments
The work has been partially supported by a grant to MR from the DFG within the SFB/TR 8 in project R8-[CSPACE]. The authors are grateful to Barbara Kuhnert for drawing the brain picture and Stephanie Schwenke for proof-reading.

References
Acuna BD, Eliassen JC, Donoghue JP, Sanes JN (2002) Frontal and parietal lobe activation during transitive inference in humans. Cerebral Cortex (New York, N.Y.: 1991), 12(12):1312-1321
Brzezicka A, Sedek G, Marchewka A, Gola M, Jednoróg K, Królicki L, Wróbel A (2011) A role for the right prefrontal and bilateral parietal cortex in four-term transitive reasoning: an fMRI study
with abstract linear syllogism tasks. Acta Neurobiologiae Experimentalis, 71(4):479-495
Eickhoff SB, Paus T, Caspers S, Grosbras MH, Evans A, Zilles K, Amunts K (2007) Assignment of functional activations to probabilistic cytoarchitectonic areas revisited. NeuroImage 36(3):511-521
Fangmeier T, Knauff M (2009) Neural correlates of acoustic reasoning. Brain Res 1249:181-190. doi:10.1016/j.brainres.2008.10.025
Fangmeier T, Knauff M, Ruff CC, Sloutsky V (2006) fMRI evidence for a three-stage model of deductive reasoning. J Cogn Neurosci 18(3):320-334
Franzmeier I, Maier SJ, Ferstl EC, Ragni M (2014) The role of the posterior parietal cortex in deductive reasoning: a TMS study. In: OHBM 2014. Human Brain Mapping Conference, Hamburg
Goel V, Gold B, Kapur S, Houle S (1998) Neuroanatomical correlates of human reasoning. J Cogn Neurosci 10(3):293-302
Goel V, Dolan RJ (2001) Functional neuroanatomy of three-term relational reasoning. Neuropsychologia 39(9):901-909
Goel V, Makale M, Grafman J (2004) The hippocampal system mediates logical reasoning about familiar spatial environments. J Cogn Neurosci 16:654-664
Goel V, Stollstorff M, Nakic M, Knutson K, Grafman J (2009) A role for right ventrolateral prefrontal cortex in reasoning about indeterminate relations. Neuropsychologia 47(13):2790-2797
Hinton EC, Dymond S, von Hecker U, Evans CJ (2010) Neural correlates of relational reasoning and the symbolic distance effect: involvement of parietal cortex. Neuroscience 168(1):138-148
Johnson-Laird PN (2006) How we reason. Oxford University Press, New York
Knauff M (2006) Deduktion und logisches Denken. In: Denken und Problemlösen. Enzyklopädie der Psychologie, 8, Hogrefe, Göttingen
Knauff M (2009) A neuro-cognitive theory of deductive relational reasoning with mental models and visual images. Spatial Cogn Comput 9(2):109-137
Knauff M, Fangmeier T, Ruff CC, Johnson-Laird PN (2003) Reasoning, models, and images: Behavioral measures and cortical activity. J Cogn Neurosci 15(4):559-573
Knauff M, Johnson-Laird PN (2002) Visual imagery can impede reasoning. Memory Cogn 30(3):363-371
Knauff M, Mulack T, Kassubek J, Salih HR, Greenlee MW (2002) Spatial imagery in deductive reasoning: a functional MRI study. Brain Res Cogn Brain Res 13(2):203-212
Knauff M (2013) Space to Reason: A Spatial Theory of Human Thought. MIT Press
Prado J, Chadha A, Booth JR (2011) The brain network for deductive reasoning: a quantitative meta-analysis of 28 neuroimaging studies. J Cogn Neurosci 23(11):3483-3497
Prado J, Mutreja R, Booth JR (2013) Fractionating the neural substrates of transitive reasoning: task-dependent contributions of spatial and verbal representations. Cerebral Cortex (New York, N.Y.: 1991), 23(3):499-507
Prado J, Noveck IA, Van Der Henst J-B (2010a) Overlapping and distinct neural representations of numbers and verbal transitive series. Cerebral Cortex (New York, N.Y.: 1991), 20(3):720-729
Prado J, Van Der Henst JB, Noveck IA (2010b) Recomposing a fragmented literature: How conditional and relational arguments engage different neural systems for deductive reasoning. Neuroimage 51(3):1213-1221
Ragni M, Knauff M (2013) A theory and a computational model of spatial reasoning with preferred mental models. Psychol Rev 120(3):561-588
Ruff CC, Knauff M, Fangmeier T, Spreer J (2003) Reasoning and working memory: common and distinct neuronal processes. Neuropsychologia 41(9):1241-1253
Shokri-Kojori E, Motes MA, Rypma B, Krawczyk DC (2012) The Network Architecture of Cortical Processing in Visuo-spatial Reasoning. Scientific Reports, 2. doi:10.1038/srep00411
Waechter RL, Goel V, Raymont V, Kruger F, Grafman J (2013) Transitive inference reasoning is impaired by focal lesions in parietal cortex rather than rostrolateral prefrontal cortex. Neuropsychologia 51(3):464-471
Walsh V, Pascual-Leone A (2003) Transcranial magnetic stimulation: A neurochronometrics of mind. MIT Press, Cambridge
Wendelken C, Bunge SA (2010) Transitive inference: distinct contributions of rostrolateral prefrontal cortex and the hippocampus. J Cogn Neurosci 22(5):837-847


How to build an inexpensive cognitive robot: Mind-R

Enrico Rizzardi1, Stefano Bennati2, Marco Ragni1
1 University of Freiburg, Germany; 2 ETH Zurich, Switzerland

Abstract
Research in Cognitive Robotics is dependent on standard robotic platforms that are designed to provide the high precision required by classical robotics; such platforms are generally expensive. In most cases the features provided by the robot are more than are needed to perform the task, and this complexity is not worth the price. In this article we propose a new reference platform for Cognitive Robotics that, thanks to its low price and full-featured set of capabilities, will make research much more affordable and pave the way for more contributions in the field. The article describes the requirements and procedure to start using the platform and presents some use examples.
Keywords
Mind-R, Cognitive Robotics, ACT-R, Mindstorms

Introduction
Cognitive Robotics aims to bring human-level intelligence to robotic agents by equipping them with cognitive-based control algorithms. This can be accomplished by extending the capabilities of robots with concepts from Cognitive Science, e.g. learning, reasoning and planning abilities. The main difference to classical robotics is in the requirements: cognitive robots must show robust and adaptable behavior, while precision and efficiency are not mandatory. The standard robotic platforms are designed to comply with the demanding requirements of classical robotics; therefore the entry price is high enough to become an obstacle for most researchers.
To address this issue we present a new robotic platform targeted at Cognitive Robotics research that we call Mind-R. The advantages of Mind-R over other robotic hardware are its low price and customization capabilities. Its greatest disadvantage is that its sensors and actuators are not nearly as precise as other commercial hardware, but this is not a big issue in Cognitive Robotics, which does not aim at solving tasks efficiently and precisely; instead, flexibility and adaptability are the focus.
The article is structured as follows: Section 2 briefly describes the ACT-R theory and how the Mind-R modules fit into its framework, Section 3 provides details of the characteristics of the hardware platform, and Section 4 details a step-by-step guide on how to install the software and run some examples.
ACT-R
ACT-R (Bothell 2005) is a very well known and widely tested cognitive architecture. It is the implementation of a theory of the mind developed by Anderson et al. (2004) and validated by many experiments over the years. The ACT-R framework has a modular structure that can be easily expanded with new modules that allow researchers to add new features to the architecture, such as controlling a robotic platform.
ACT-R models the structure and behavior of the human brain. Each module has a specific function (i.e. visual, motor, etc.) that reflects the functional organization of the cortex. The modules can exchange information through their buffers; each module can read all
the buffers, but can write only to its own (i.e. to answer queries). Communication is coordinated by the procedural module, which is a serial bottleneck. Extending ACT-R to read from sensors and control actuators required writing new modules that give cognitive models the capability to communicate with the robot as if it were the standard ACT-R virtual device (Fig. 1).

Fig. 1 ACT-R structure with Mind-R modules
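The buffer discipline just described can be illustrated with a toy sketch (ours; this is not ACT-R code and greatly simplifies the architecture): every module may read any buffer but writes only to its own, and a serial procedural step fires at most one matching production per cycle. The buffer names echo the Mind-R modules from Fig. 1.

```python
class Module:
    """Toy module: may read every buffer but writes only to its own."""
    def __init__(self, name, buffers):
        self.name = name
        self.buffers = buffers          # shared dict: buffer name -> content
        self.buffers[name] = None       # create the module's own buffer

    def write(self, content):
        self.buffers[self.name] = content

    def read(self, buffer_name):
        return self.buffers[buffer_name]

def procedural_cycle(buffers, productions):
    """Serial bottleneck: at most one matching production fires per cycle."""
    for condition, action in productions:
        if condition(buffers):
            action()
            return True
    return False

buffers = {}
vision = Module("nxt-visual", buffers)
motor = Module("nxt-move", buffers)
vision.write("obstacle-ahead")
productions = [(lambda b: b.get("nxt-visual") == "obstacle-ahead",
                lambda: motor.write("turn-right"))]
procedural_cycle(buffers, productions)
print(buffers["nxt-move"])  # turn-right
```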
Mind-R robot
The robot used to build Mind-R is the LEGO Mindstorms set (LEGO 2009, Fig. 2). Its core relies on the central brick that includes the CPU, batteries and communication interfaces. Mind-R's design is fully customizable using LEGO Mindstorms bricks. The robot can be programmed through the USB interface using many different languages and tools. To keep the interface with ACT-R straightforward, the chosen programming language is Lisp. The Lisp interpreter can command the robot through the NXT-LSP libraries (Hiraishi 2007), which are at the core of the ACT-R interface.

Fig. 2 Mind-R robot

The LEGO Mindstorms in the Mind-R configuration is composed of one ultrasonic sensor, two bumpers, one color sensor and two engines. The ultrasonic sensor, shown in Fig. 3a, provides an approximate distance to the next obstacle inside a solid angle. A bumper, shown in Fig. 3b, provides binary state information, that is, it can distinguish between pressed and released states. The color sensor, shown in Fig. 3c, can distinguish between basic colors, for example blue, red and green, over a short distance. The engines are stepper motors, able to turn in both directions. Two engines together make up the driving system in Fig. 3d. Each engine drives a wheel, shown in the upper left and upper right-hand corners of Fig. 3d; they can be controlled to navigate in the environment. The third wheel, in the bottom part of Fig. 3d, has no tire and serves only for balance.
Mind-R has already shown its effectiveness in spatial planning problems (Bennati and Ragni 2012). As a future development, the effect of complex visual perception, coming from image processing software, on robotic navigation will be investigated.
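As a rough illustration of how the sensors and the differential drive described above can be combined, here is a toy obstacle-avoidance loop (ours; the actual Mind-R interface is the Lisp NXT-LSP library, and the robot methods below are placeholders, not a real API).

```python
import random
import time

def explore(robot, min_distance_cm=20.0, step_s=0.5):
    """Toy sense-act loop: drive forward, back off and turn when an obstacle is near
    or a bumper is pressed. 'robot' is assumed to expose the Mind-R channels
    (ultrasonic distance, bumpers, two drive engines) through the methods used here."""
    while True:
        if robot.bumper_pressed() or robot.distance_cm() < min_distance_cm:
            robot.drive(left=-1.0, right=-1.0)           # back off
            time.sleep(step_s)
            left, right = random.choice([(-1.0, 1.0), (1.0, -1.0)])
            robot.drive(left=left, right=right)          # turn on the spot
            time.sleep(step_s)
        else:
            robot.drive(left=1.0, right=1.0)             # go straight
        time.sleep(0.05)
```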
Setup
This section provides a step-by-step guide on how to install and configure the software needed to run the Mind-R platform. A more detailed guide, containing examples and a troubleshooting section, can be found on the Mind-R website (http://webexperiment.iig.uni-freiburg.de/mind-r/index.html).

Fig. 3 The Mindstorms robot and its sensors

The LEGO NXT Fantom driver can be obtained from the LEGO Mindstorms website or from the Mindstorms installation CD and then installed (for Intel-based Mac users: make sure to install the correct driver for your platform; for GNU/Linux no support is provided). A reboot of the computer may be required. The recommended interpreter is SBCL x86. The SBCL installer can be downloaded from its website (http://www.sbcl.org/). ACT-R 6.0 can be downloaded from its website and unpacked into a local folder. The Mind-R software, containing the NXT communication library, the peripheral modules and a demo model, can be downloaded from the Mind-R website. The
source code of the NXT communication library is provided together with a guide on how to compile it. The content of the Mind-R archive has to be unpacked into a local folder.
The robot has to be connected to the computer through a USB port before loading the modules with the interpreter. After the robot is recognized by the OS, the interpreter can be launched. Make sure to start the interpreter from the folder where the Mind-R software has been unpacked. The NXT communication library can be loaded from the Lisp interpreter with the command (load "nxt.lsp"). If the SBCL current working directory is not the one into which Mind-R has been unpacked, loading the NXT communication library will fail. This will prevent the demo or any other Mind-R models from running. If the loading was successful, SBCL returns T; if anything else is returned as final output, see the troubleshooting section of the Mind-R website.
The next step is to load ACT-R with the command (load "/path/to/act-r/load-act-r-6.lisp"), replacing the path with the appropriate one. The Mind-R modules can now be loaded. Within the Mind-R archive four modules are provided: nxt-distance.lsp, nxt-touch.lsp, nxt-motor.lsp and nxt-vision.lsp. The first allows ACT-R to communicate with the ultrasonic sensor, the second one with the bumpers, the third commands the engines and the last the color sensor.
When the modules have been loaded by the interpreter with a load command, the robot setup phase can be concluded by executing the function nxt-setup.
Once all this software has been successfully installed and loaded, a demo model, consisting of a series of productions that let Mind-R explore the environment, can be started. The demo model is called demo.lsp and can be loaded with a load command or using the graphical interface. This demo model is very simple and is designed to be a starting point for building more complex models.
To run the demo use the ACT-R run function. The engines of the robot might continue running after a model has terminated; this depends both on the model structure and on its final state. To stop the engines the function motor-reset has to be called. The demo model contains some productions used to let the robot interact with the environment. Figures 4 and 5 show productions similar to those in the model. Figure 4 shows two simple productions that are used to read the distance measured by the ultrasonic sensor and to print the read distance to the console by invoking !output!.
Figure 5 shows two productions that send commands to the engines. The left one makes the robot move straight forward. The value next to the duration field indicates how long the engines have to rotate before stopping. The right one makes the robot turn right. Again, the higher the value assigned to duration, the longer the robot will turn in that direction.

Fig. 4 Productions that read the distance from the ultrasonic sensor
Fig. 5 Production rules that control the engines: left moves forward, right turns right

Conclusions
The described platform is a low-priced and flexible cognitive robot that gives researchers in the field of Cognitive Robotics an affordable alternative to the most common and expensive robotic platforms.
The platform is based on the widely accepted architecture ACT-R 6.0, which controls the robot and interacts with the environment through LEGO Mindstorms sensors and actuators. The Mind-R robot has already proven effective in spatial navigation tasks (Bennati and Ragni 2012), where it was able to replicate human navigation results and errors.
The flexibility of the platform allows advanced perception capabilities such as computer vision and natural speech to be added. A first step in this direction is Rizzardi (2013), in which a communication module and digital image processing software were used to acquire information from a retail webcam and improve robotic navigation with the use of visual landmarks.
By integrating ACT-R on a LEGO Mindstorms platform it is possible to use other ACT-R models, from driver models (Haring et al. 2012) to reasoning (Ragni and Brussow 2011) and planning (Best and Lebiere 2006), towards a true unified cognition approach. However, equipping robots with cognitive knowledge is not only important for learning about human and embodied cognition (Clark 1999); it is becoming increasingly important for Human-Robot Interaction, where a successful interaction depends on an understanding of the other agent's behavior (Trafton et al. 2013).
Our hope is that an affordable robot and the bridging function towards ACT-R may be fruitful for research and education purposes.

Acknowledgments
This work has been supported by the SFB/TR 8 Spatial Cognition within project R8-[CSPACE] funded by the DFG. A special thanks to Tasuku Hiraishi, the developer of the NXT-LSP communication library.

References
Anderson J, Bothell D, Byrne M, Douglass S, Lebiere C, Qin Y (2004) An integrated theory of the mind. Psychological Review 111(4):1036-1060
Bennati S, Ragni M (2012) Cognitive Robotics: Analysis of Preconditions and Implementation of a Cognitive Robotic System for Navigation Tasks. In: Proceedings of the 11th International Conference on Cognitive Modeling, Universitaetsverlag der TU Berlin
Best BJ, Lebiere C (2006) Cognitive agents interacting in real and virtual worlds. Cognition and multi-agent interaction: From cognitive modeling to social simulation, pp 186-218
Bothell D (2005) ACT-R. http://act-r.psy.cmu.edu
Clark A (1999) An embodied cognitive science? Trend Cogn Sci 3(9):345-351
Haring K, Ragni M, Konieczny L (2012) Cognitive Model of Drivers Attention. In: Russwinkel N, Drewitz U, van Rijn H (eds) Proceedings of the 11th International Conference on Cognitive Modeling, Universitaetsverlag der TU Berlin, pp 275-280
Hiraishi T (2007) NXT controller in Lisp. http://super.para.media.kyoto-u.ac.jp/~tasuku/index-e.html
LEGO (2009) LEGO Mindstorms. http://mindstorms.lego.com
Ragni M, Brussow S (2011) Human spatial relational reasoning: Processing demands, representations, and cognitive model. In:
Burgard W, Roth D (eds) Proceedings of the 25th AAAI Conference on Artificial Intelligence, AAAI Press, San Francisco, CA
Rizzardi E (2013) Cognitive robotics: Cognitive and perceptive aspects in navigation with landmarks. Master's thesis, Università degli Studi di Brescia, Brescia, Italy
Trafton G, Hiatt L, Harrison A, Tamborello F, Khemlani S, Schultz A (2013) ACT-R/E: An embodied cognitive architecture for human-robot interaction. Journal of Human-Robot Interaction 2(1):30-55


Crossed hands stay on the time-line

Bettina Rolke1, Susana Ruiz Fernandez2, Juan Jose Rahona Lopez3, Verena C. Seibold1
1 Evolutionary Cognition, University of Tübingen, Germany; 2 Leibniz Knowledge Media Research Center (KMRC), Tübingen, Germany; 3 Complutense University, Madrid, Spain

How are objects and concepts represented in our memories? This question has been addressed in the past by two controversial positions. Whereas some theories suppose that representations are coded in amodal concept nodes (e.g., Kintsch 1998), more recent embodied theories assume that internal representations include multimodal perceptual and motor experiences. One example of the latter conception is the conceptual metaphor view, which assumes that abstract concepts like time are grounded in cognitively more accessible concepts like space (Boroditsky et al. 2010). This view is empirically supported by time-space congruency effects, showing faster left-hand responses to past-related words and faster right-hand responses to future-related words compared to responses with the reversed stimulus-response mapping (e.g., Santiago et al. 2007). This congruency effect implies that time is mentally represented along a line, extending horizontally from left to right. Whereas the existence of this mental time-line has been supported by several empirical findings (see Bonato et al. 2012), the specific properties of the spatial reference frame are still unclear. The aim of the present study was to shed further light on the specific relationship between temporal and spatial codes. Precisely, we examined whether the frame of reference for the association between temporal and spatial codes is based on the structural embodied side of the motor effectors, meaning that left (right) refers to the left (right) hand, independent of the actual hand position, or, alternatively, whether the frame of reference is organized along an egocentric spatial frame which represents things and effectors occurring at the left (right) body side as left (right)-sided. In other words, according to the embodied frame of reference, the left hand represents left irrespective of whether it is placed at the left or right body side, whereas according to an egocentric spatial frame, the left hand represents the left side when it is placed at the left body side, but represents the right side when it is placed at the right body side.
Method
We employed a spatial discrimination task. Participants (N = 20) had to respond with their right or left hand depending on the presentation side of a rectangle. Specifically, when the rectangle was presented at the left (right) side of fixation, participants had to press a response key on their left (right) side. In the uncrossed-hands condition, participants placed their left (right) index finger at the left (right) response key; in the crossed-hands condition they crossed hands and thus responded with their right (left) index finger on the left (right) key to left (right) sided targets. To activate spatial codes by time-related words, we combined the spatial discrimination task with a priming paradigm and presented future- and past-related time words (e.g., yesterday, tomorrow) before the rectangle appeared (see Rolke et al. 2013). To monitor the time-course of the time-space congruency effect, we manipulated the SOA between the prime word and the rectangle (SOA = 300, 600, or 1,200 ms). Whereas the uncrossed condition served as baseline to establish the time-space congruency effect, the crossed-hands condition allowed us to investigate whether the time-space congruency is based on egocentric spatial codes or whether it depends on body-referenced effector sides. Specifically, if the time-space congruency effect is based on an egocentric frame of reference, we expect faster RT for left (right) key responses following past (future) words regardless of response condition. If, on the other hand, the time-space congruency effect depends on effector side, we expect that the pattern reverses across conditions, i.e., faster RT should result for left (right) key responses following past (future) words in the uncrossed-hands condition, but faster RT for left (right) key responses following future (past) words in the crossed-hands condition.
The experiment factorially combined response condition (uncrossed hands, crossed hands), temporal reference (past, future), response key position (left, right), and SOA (300, 600, or 1,200 ms). Repeated measures analyses of variance (ANOVA) were conducted on mean RT of correct responses and percent correct (PC) taking participants (F1) and items (F2) as random factors. P-values were, whenever appropriate, adjusted for violations of the sphericity assumption using the Greenhouse-Geisser correction.

Fig. 1 Mean RT depending on response key position, temporal reference of prime words, and SOA. Solid lines represent data of the uncrossed response condition; dotted lines represent data of the crossed response condition. For the sake of visual clarity, no error bars were included in the figure

Results
RT results are summarized in Fig. 1, which depicts mean RT as a function of temporal reference, response key position, SOA, and response hands condition. An ANOVA on RT revealed shorter RT for the uncrossed compared to the crossed condition, F1(1,19) = 48.9, p < .001; F2(1,11) = 9198.2, p < .001. SOA exerted an influence on RT, F1(2,38) = 28.3, p < .001; F2(2,22) = 243.4, p < .001. Shorter RTs were observed at shorter SOAs (all contrasts between SOAs p < .05). As
one should expect, response condition interacted with response key position, F1(1,19) = 8.3, p = .01; F2(1,11) = 93.5, p < .001, indicating a right-hand benefit for right-hand responses at the left side in the crossed condition and at the right side in the uncrossed condition. Theoretically most important, temporal reference and response key position interacted, F1(1,19) = 9.7, p = .01; F2(1,11) = 17.6, p = .002. This time-space congruency effect was neither modulated by response condition, F1(1,19) = 1.1, p = .31; F2(1,11) = 1.4, p = .27, nor by SOA, F1(2,38) = 1.2, p = .30; F2(2,22) = 1.0, p = .37. All other effects were not significant, all ps > .31. Participants made more errors in the crossed than in the uncrossed response condition, F1(1,19) = 24.3, p < .001; F2(1,11) = 264.3, p < .001. The F2-analysis further revealed an interaction between response key position, SOA, and response condition, F2(2,22) = 3.9, p = .04. There were no other significant effects for PC, all ps > .07.
Discussion
By requiring responses on keys placed on the left or right by crossed and uncrossed hands, we disentangled the egocentric spatial space and the effector-related embodied space. The presentation of a time word before a lateralized visual target evoked a space-time congruency effect, that is, responses were faster for spatially left (right) responses when a past (future) word preceded the rectangle. Theoretically most important, this space-time congruency effect was not modulated when hands were crossed. This result indicates that temporal codes activate abstract spatial codes rather than effector-related spatial codes.
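A minimal sketch of how the reported congruency effect can be computed from trial-level data (ours; the column names are placeholders): a trial counts as congruent when a past word precedes a left key response or a future word precedes a right key response, and the effect is the incongruent minus congruent mean RT per hands condition.

```python
from statistics import mean

def congruency_effect(trials):
    """trials: iterable of dicts with keys 'rt', 'correct', 'word' ('past'/'future'),
    'key' ('left'/'right'), and 'hands' ('uncrossed'/'crossed')."""
    effects = {}
    for hands in ("uncrossed", "crossed"):
        congruent, incongruent = [], []
        for t in trials:
            if not t["correct"] or t["hands"] != hands:
                continue
            is_congruent = (t["word"], t["key"]) in {("past", "left"), ("future", "right")}
            (congruent if is_congruent else incongruent).append(t["rt"])
        effects[hands] = mean(incongruent) - mean(congruent)
    return effects
```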
References
Bonato M, Zorzi M, Umiltà C (2012) When time is space: Evidence for a mental time line. Neurosci Biobehav Rev 36:2257-2273. doi:10.1016/j.neubiorev.2012.08.007
Boroditsky L, Fuhrman O, McCormick K (2010) Do English and Mandarin speakers think about time differently? Cognition 118:123-129. doi:10.1016/j.cognition.2010.09.010
Kintsch W (1998) Comprehension: A paradigm for cognition. Cambridge University Press, New York
Rolke B, Ruiz Fernandez S, Schmid M, Walker M, Lachmair M, Rahona Lopez JJ, Hervas G, Vazquez C (2013) Priming the mental time-line: Effects of modality and processing mode. Cogn Process 14:231-244. doi:10.1007/s10339-013-0537-5
Santiago J, Lupianez J, Perez E, Funes MJ (2007) Time (also) flies from left to right. Psychon Bull Rev 14:512-516


Is the novelty-P3 suitable for indexing mental workload in steering tasks?

Menja Scheer, Heinrich H. Bülthoff, Lewis L. Chuang
Max Planck Institute for Biological Cybernetics, Tübingen, Germany

Difficulties experienced in steering a vehicle can be expected to place a demand on one's mental resources (O'Donnell, Eggemeier 1986). While the extent of this mental workload (MWL) can be estimated by self-reports (e.g., NASA-TLX; Hart, Staveland 1988), it can also be physiologically evaluated in terms of how a primary task taxes a common and limited pool of mental resources, to the extent that it reduces the electroencephalographic (EEG) responses to a secondary task (e.g. an auditory oddball task). For example, the participant could be primarily required to control a cursor to track a target while attending to a series of auditory stimuli, which would infrequently present target tones that should be responded to with a button-press (e.g., Wickens, Kramer, Vanasse and Donchin 1983). Infrequently presented targets, termed oddballs, are known to elicit a large positive potential after approximately 300 ms of their presentation (i.e., P3). Indeed, increasing tracking difficulty either by decreasing the predictability of the tracked target or by changing the complexity of the controller dynamics has been shown to attenuate P3 responses in the secondary auditory monitoring task (Wickens et al. 1983; Wickens, Kramer and Donchin 1984).
In contrast, increasing tracking difficulty by introducing more frequent direction changes of the tracked target (i.e. including higher frequencies in the function that describes the motion trajectory of the target) has been shown to bear little influence on the secondary task's P3 response (Wickens, Israel and Donchin 1977; Isreal, Chesney, Wickens and Donchin 1980). Overall, the added requirement of a steering task consistently results in a lower P3 amplitude, relative to performing auditory monitoring alone (Wickens et al. 1983; Wickens et al. 1977; Isreal et al. 1980).
Using a dual-task paradigm for indexing workload is not ideal. First, it requires participants to perform a secondary task. This prevents it from being applied in real-world scenarios; users cannot be expected to perform an unnecessary task that could compromise their critical work performance. Second, it can only be expected to work if the performance of the secondary task relies on the same mental resources as those of the primary task (Wickens, Yeh 1983), requiring a deliberate choice of the secondary task. Thus, it is fortunate that more recent studies have demonstrated that P3 amplitudes can be sensitive to MWL even if the auditory oddball is ignored (Ullsperger, Freude and Erdmann 2001; Allison, Polich 2008). This effect is said to induce a momentary and involuntary shift in general attention, especially if recognizable sounds (e.g. a dog's bark, as opposed to a pure tone) are used (Miller, Rietschel, McDonald and Hatfield 2011).
The current work, containing two experiments, investigates the conditions that would allow the novelty-P3, the P3 elicited by the ignored, recognizable oddball, to be an effective index for the MWL of compensatory tracking. Compensatory tracking is a basic steering task that can be generalized to most implementations of vehicular control. In both experiments participants were required to use a joystick to counteract disturbances of a horizontal plane. To evaluate the generalizability of this paradigm, we depicted this horizontal plane either as a line in a simplified visualization or as the horizon in a real-world environment. In the latter, participants experienced a large field-of-view perspective of the outside world from the cockpit of an aircraft that rotated erratically about its heading axis. The task was the same regardless of the visualization. In both experiments, we employed a full factorial design for the visualization (instrument, world) and 3 oddball paradigms (in experiment 1) or 4 levels of task difficulty (in experiment 2), respectively. Two sessions were conducted on separate days for the different visualizations, which were counter-balanced for order. Three trials were presented per oddball paradigm (experiment 1) or level of task difficulty (experiment 2) in blocks, which were randomized for order. Overall, we found that steering performance was worse when the visualization was provided by a realistic world environment in experiments 1 (F(1, 11) = 42.8, p < 0.01) and 2 (F(1, 13) = 35.0, p < 0.01). Nonetheless, this manipulation of visualization had no consequence on our participants' MWL as evaluated by a post-experimental questionnaire (i.e., NASA-TLX) and EEG responses. This suggests that MWL was unaffected by our choice of visualization.
The first experiment, with 12 participants, was designed to identify the optimal presentation paradigm of the auditory oddball. For the EEG analysis, two participants had to be excluded due to noisy electrophysiological recordings (more than 50 % of rejected epochs). Whilst performing the tracking task, participants were presented with a sequence of auditory stimuli that they were instructed to ignore. This sequence would, in the 1-stimulus paradigm, only contain the infrequent oddball stimulus (i.e., the familiar sound of a dog's bark (Fabiani, Kazmerski, Cycowicz and Friedmann 1996)). In the 2-stimulus paradigm this infrequently presented oddball (0.1) is accompanied by a more frequently presented pure tone (0.9) and in
the 3-stimulus paradigm the infrequently presented oddball (0.1) is accompanied by a more frequently presented pure tone (0.8) and an infrequently presented pure tone (0.1). These three paradigms are widely used in P3 research (Katayama, Polich 1996). It should be noted, however, that the target-to-target interval is 20 s regardless of the paradigm. To obtain the ERPs, the epochs from 100 ms before to 900 ms after the onset of the recognizable oddball stimulus were averaged. Mean amplitude measurements were obtained in a 60 ms window, centered at the group-mean peak latency for the largest positive maximum component between 250 and 400 ms for the oddball P3, for each of the three mid-line electrode channels of interest (i.e., Fz, Cz, Pz). In agreement with previous work, the novelty-P3 response is smaller when participants had to perform the tracking task compared to when they were only presented with the task-irrelevant auditory stimuli, without the tracking task (F(1, 9) = 10.9, p < 0.01). However, the amplitude of the novelty-P3 differed significantly across the presentation paradigms (F(2, 18) = 5.3, p < 0.05), whereby the largest response to our task-irrelevant stimuli was elicited by the 1-stimulus oddball paradigm. This suggests that the 1-stimulus oddball paradigm is most likely to elicit novelty-P3s that are sensitive to changes in MWL. Finally, the attenuation of novelty-P3 amplitudes by the tracking task varied across the three mid-line electrodes (F(2, 18) = 28.0, p < 0.001). Pairwise comparisons, Bonferroni corrected for multiple comparisons, revealed P3 amplitude to be largest at Cz, followed by Fz, and smallest at Pz (all p < 0.05). This stands in contrast with previous work that found control difficulty to attenuate P3 responses in parietal electrodes (cf., Isreal et al. 1980; Wickens et al. 1983). Thus, the current paradigm that uses a recognizable, ignored sound is likely to reflect an underlying process that is different from previous studies, one that could be more sensitive to the MWL demands of a tracking task.
Given the result of experiment 1, the second experiment, with 14 participants, investigated whether the 1-stimulus oddball paradigm would be sufficiently sensitive in indexing tracking difficulty as defined by the bandwidth of frequencies that contributed to the disturbance of the horizontal plane (cf., Isreal et al. 1980). Three different bandwidth profiles (easy, medium, hard) defined the linear increase in the amount of disturbance that had to be compensated for. This manipulation was effective in increasing subjective MWL, according to the results of a post-experimental NASA-TLX questionnaire (F(2, 26) = 14.9, p < 0.001), and demonstrated the expected linear trend (F(1, 13) = 23.2, p < 0.001). This increase in control effort was also reflected in the amount of joystick activity, which grew linearly across the difficulty conditions (F(1, 13) = 42.2, p < 0.001). For the EEG analysis two participants had to be excluded due to noisy electrophysiological recordings (more than 50 % of rejected epochs). A planned contrast revealed that the novelty-P3 was significantly lower in the most difficult condition compared to the baseline viewing condition, where no tracking was done (F(1, 11) = 5.2, p < 0.05; see Fig. 1a). Nonetheless, novelty-P3 did not differ significantly between the difficulty conditions (F(2, 22) = 0.13, p = 0.88), nor did it show the expected linear trend (F(1, 11) = 0.02, p = 0.91). Like Isreal et al. (1980), we find that EEG responses do not discriminate for MWL that is associated with controlling increased disturbances. It remains to be investigated whether the novelty-P3 is sensitive to the complexity of controller dynamics, as has been shown for the P3.
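A minimal numpy sketch of the amplitude measure described above (ours; the array shapes and sampling rate are assumptions): epochs locked to oddball onset are averaged, the largest positive peak between 250 and 400 ms is found on the channel-averaged ERP, and the mean amplitude in a 60 ms window around that latency is returned per channel.

```python
import numpy as np

def mean_p3_amplitude(epochs, srate=500.0, tmin=-0.1,
                      search=(0.25, 0.40), window=0.06):
    """epochs: array (n_trials, n_channels, n_samples) starting tmin seconds before
    stimulus onset. Returns the peak latency (s) and the mean amplitude in a 60 ms
    window around it, one value per channel."""
    erp = epochs.mean(axis=0)                       # average over trials -> (channels, samples)
    times = tmin + np.arange(erp.shape[1]) / srate
    sel = (times >= search[0]) & (times <= search[1])
    # group-mean peak: largest positive deflection of the channel-averaged ERP
    peak_idx = np.flatnonzero(sel)[np.argmax(erp[:, sel].mean(axis=0))]
    peak_latency = times[peak_idx]
    win = (times >= peak_latency - window / 2) & (times <= peak_latency + window / 2)
    return peak_latency, erp[:, win].mean(axis=1)   # one mean amplitude per channel
```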
The power spectral density of the EEG data around 10 Hz (i.e., alpha) has been suggested by Smith and Gevins (2005) to index MWL. A post hoc analysis of our current data, at electrode Pz, revealed that alpha power was significantly lower for the medium and hard conditions, relative to the view-only condition (F(1, 11) = 6.081, p < 0.05; F(1, 11) = 6.282, p < 0.05). Nonetheless, the expected linear trend across tracking difficulty was not significant (Fig. 1b).
To conclude, the current results suggest that a 1-stimulus oddball task ought to be preferred when measuring general MWL with the novelty-P3. Although changes in the novelty-P3 can identify the control effort required in our compensatory tracking task, it is not sufficiently sensitive to provide a graded response across different levels of disturbances. In this regard, it may not be as effective as self-reports and joystick activity in denoting control effort. Nonetheless, further research can improve upon the sensitivity of EEG metrics to MWL by investigating other aspects that better correlate to the specific demands of a steering task.

Fig. 1 a left Grand average ERP data of Experiment 2 averaged over Fz, Cz, Pz; right averaged amplitude of P3 as function of tracking difficulty. b left Averaged power spectral density (PSD) at Pz; right averaged PSD as a function of tracking difficulty

Acknowledgments
The work in this paper was supported by the myCopter project, funded by the European Commission under the 7th Framework Program.

References
Allison BZ, Polich J (2008) Workload assessment of computer gaming using a single-stimulus event-related potential paradigm. Biol Psychol 77(3):277-283
Fabiani M, Kazmerski V, Cycowicz Y, Friedmann D (1996) Naming norms for brief environmental sounds. Psychol Rev 33:462-475
Hart SG, Staveland LE (1988) Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research
Isreal JB, Chesney GL, Wickens CD, Donchin E (1980) P300 and tracking difficulty: evidence for multiple resources in dual-task performance. Psychophysiology 17(3):259-273
Katayama J, Polich J (1996) P300 from one-, two-, and three-stimulus auditory paradigms. Int J Psychophysiol 23:33-40
Miller MW, Rietschel JC, McDonald CG, Hatfield BD (2011) A novel approach to the physiological measurement of mental workload. Int J Psychophysiol 80(1):75-78
O'Donnell RC, Eggemeier TF (1986) Workload assessment methodology. Handbook of Perception and Human Performance, 2:1-49
Smith ME, Gevins A (2005) Neurophysiologic monitoring of mental workload and fatigue during operation of a flight simulator. Defense and Security (International Society for Optics and Photonics), 116-126
Ullsperger P, Freude G, Erdmann U (2001) Auditory probe sensitivity to mental workload changes: an event-related potential study. Int J Psychophysiol 40(3):201-209
Wickens CD, Kramer AF, Vanasse L, Donchin E (1983) Performance of concurrent tasks: a psychophysiological analysis of the reciprocity of information-processing resources. Science 221(4615):1080-1082
Wickens CD, Israel J, Donchin E (1977) The event related potential as an index of task workload. Proceedings of the Human Factors Society Annual Meeting 21:282-286
Wickens CD, Kramer AF, Donchin E (1984) The event-related potential as an index of the processing demands of a complex target acquisition task. Annals of the New York Academy of Sciences 425:295-299
Wickens CD, Yeh Y-Y (1983) The dissociation between subjective workload and performance: A multiple resource approach. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 27(3):244-248


Modeling perspective-taking by forecasting 3D biological motion sequences

Fabian Schrodt, Martin V. Butz
Cognitive Modeling, Computer Science Department, University of Tübingen, Germany

Abstract
The mirror neuron system (MNS) is believed to be involved in social abilities like empathy and imitation. While several brain regions have been linked to the MNS, it remains unclear how the mirror neuron property itself develops. Previously, we have introduced a recurrent neural network, which enables mirror-neuron capabilities by learning an embodied, scale- and translation-invariant model of biological motion (BM). The model allows the derivation of the orientation of observed BM by (i) segmenting BM in a common positional and angular space and (ii) generating short-term, top-down predictions of subsequent motion. While our previous model generated short-term motion predictions, here we introduce a novel forecasting algorithm, which explicitly predicts sequences of BM segments. We show that the model scales to a 3D simulation of humanoid walking and is robust against variations in body morphology and postural control.
Keywords
Perspective Taking; Embodiment; Biological Motion; Self-Supervised Learning; Sequence Forecasting; Mirror-Neurons; Recurrent Neural Networks

Introduction
This paper investigates how we may be able to recognize BM sequences and mentally transform them to the egocentric frame of reference to bootstrap mirror neuron properties. Our adaptive, self-supervised, recurrent neural network model (Schrodt et al. 2014) might contribute to the understanding of the MNS and its implied capabilities. With the previous model, we were able to generate continuous mental rotations to learned canonical views of observed 2D BM, essentially taking on the perspective of an observed person. This self-supervised perspective taking was accomplished by back-propagating errors stemming from top-down, short-term predictions of the BM progression.
In this work, we introduce an alternative or complementary, time-independent forecasting mechanism of motion segment sequences to the model. In the brain, prediction and forecasting mechanisms may be realized by the cerebellum, which is involved in the processing of BM (Grossman et al. 2000). In addition, it has been suggested that the cerebellum may also support the segmentation of motion patterns via the basal ganglia, thereby influencing the learning of motor sequences in parietal and (pre-)motor cortical areas (Penhune and Steele 2012). Along these lines, the proposed model learns to predict segments of motion patterns given embodied, sensorimotor motion signals. Due to the resulting perspective-taking capabilities, the model essentially offers a mechanism to activate mirror neuron capabilities.
Neural Network Model
The model consists of three successive stages illustrated in the overview given in Fig. 1. The first stage processes relative positional and angular values into mentally rotated, motion-direction sensitive population codes. The second stage performs a modulatory normalization and pooling of those. Stage III is a self-supervised pattern segmentation network with sequence forecasting, which enables the back-propagation of forecast errors. We detail the three stages and the involved techniques in the following sections.
Stage I: Feature Preprocessing
The input of the network is driven by a number of (not necessarily all) relative joint positions and joint angles of a person. Initially, the network can be driven by self-perception to establish an egocentric perspective on self-motion. In this case, the relative joint positions may be perceived visually, while the perception of the joint angles may be supported by proprioception in addition to vision. When actions of others are observed, joint angles may be solely identified visually.
In each single interstage Ia in the relative position pathway, a single, positional body landmark relation is transformed into a directional velocity by time-delayed inhibition, whereby the model becomes translation-invariant. Interstage Ib implements a mental rotation of the resulting directional velocity signals using a neural rotation module Rl. It is driven by auto-adaptive mental rotation angles (Euler angles in a 3D space), which are implemented by bias neurons. The rotational module and its influence on the directional velocity signals are realized by gain field-like modulations of neural populations (Andersen et al. 1985). All positional processing stages apply the same mental rotation Rl, by which multiple error signals can be merged at the module. This enables orientation invariance on adequate adaptation of the module's biases. In interstage Ic, each (rotated) D-dimensional directional motion feature is convolved into a population of 3^D - 1 direction-responsive neurons.

Fig. 1 Overview of the three-stage neural modeling approach in a 3D example with 12 joint positions and 8 joint angles, resulting in n = 20 features. Boxes numbered with m indicate layers consisting of m neurons. Black arrows describe weighted forward connections, while circled arrowheads indicate modulations. Dashed lines denote recurrent connections. Red arrows indicate the flow of the error signals
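A small sketch of the direction-responsive population code just described (ours; the exact tuning used in the model may differ): a D-dimensional directional motion feature is mapped onto the 3^D - 1 non-zero sign patterns of its components, so a 3D direction activates a population of 26 neurons.

```python
import itertools
import numpy as np

def direction_population(v, eps=1e-9):
    """Encode a D-dimensional direction as activities of the 3**D - 1 neurons whose
    preferred directions are the non-zero sign patterns in {-1, 0, 1}**D."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    v = v / norm if norm > eps else v
    preferred = [np.array(p, dtype=float)
                 for p in itertools.product((-1, 0, 1), repeat=len(v)) if any(p)]
    # rectified cosine tuning to each normalized preferred direction (an assumption)
    activity = np.array([max(0.0, float(v @ (p / np.linalg.norm(p)))) for p in preferred])
    return preferred, activity

# A 3D motion direction activates a 26-neuron population (3**3 - 1 = 26).
prefs, act = direction_population([0.0, 0.0, 1.0])
print(len(prefs))  # 26; the most active neuron is the one preferring (0, 0, 1)
```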
The processing of each one-dimensional angular information is done analogously, resulting in 2-dimensional population codes. A rotation mechanism (inter-stage Ib) is not necessary for angles and thus not applied. In summary, stage I provides a population of neurons for each feature of sensory processing, which is either sensitive to directional changes in a body-relative limb position (26 neurons for each 3D position) or sensitive to directional changes in angles between limbs (2 neurons for each angle).

Stage II: Normalization and Pooling
Stage II first implements individual activity normalizations in the direction-sensitive populations. Consequently, the magnitude of activity is generalized over, by which the model becomes scale- and velocity-invariant. Normalization of a layer's activity-vector can be achieved by axo-axonic modulations, using a single, layer-specific normalizing neuron (shown as circles in Fig. 1). Next, all normalized direction-sensitive fields are merged by one-to-one connections to a pooling layer, which serves as the input to stage III. To also normalize the activity of the pooling layer, the connections are weighted by 1/√n, where n denotes the number of features being processed.

Stage III: Correlation Learning
Stage III realizes a clustering of the normalized and pooled information from stage II (indexed by i) over time by instar weights fully connected to a number of pattern-responsive neurons (indexed j). Thus, each pattern neuron represents a unique constellation of positional and angular directional movements. For pattern learning, we use the Hebbian inspired instar learning rule (Grossberg 1976). To avoid a catastrophic forgetting of patterns, we use winner-takes-all competitive learning in the sense that only the weights to the most active pattern neuron are adapted. We bootstrap the weights from scratch by adding neural noise to the input of each pattern neuron, which consequently activates Hebbian learning of novel input patterns. The relative influence of neural noise decreases while a pattern-sensitive neuron is learned (cf. Schrodt et al. 2014).
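The normalization, pooling, and winner-take-all instar learning just described can be condensed into a minimal sketch. This is an illustration under assumptions, not the authors' implementation: the layer sizes, the noise amplitude, the learning rate, and the reduction to a single direction-sensitive field are choices made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 86      # assumed number of pooled direction-sensitive features
n_patterns = 20      # assumed number of pattern-responsive neurons
eta = 0.05           # assumed instar learning rate
W_in = rng.random((n_patterns, n_features))   # instar weights of stage III

def normalize(v, eps=1e-9):
    # Stage II (simplified to a single field): divisive normalization of an activity
    # vector, which discards overall magnitude and thereby scale and velocity.
    return v / (np.linalg.norm(v) + eps)

def stage_iii_step(x, noise_scale=0.1):
    # Pool the normalized input with weights 1/sqrt(n), add neural noise to bootstrap
    # learning, and adapt only the most active (winner) pattern neuron (instar rule).
    pooled = normalize(x) / np.sqrt(len(x))
    drive = W_in @ pooled + noise_scale * rng.random(n_patterns)
    j = int(np.argmax(drive))
    W_in[j] += eta * (pooled - W_in[j])
    return j, pooled
```

As the noise term becomes small relative to the learned instar drive, the same winner keeps responding to the same movement constellation, which is the intended clustering behavior.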
In contrast to our previous, short-term prediction approach, here we apply a time-independent forecasting algorithm (replacing the attentional gain control mechanism). This is realized by feedback connections w_ji from the pattern layer to the pooling layer, which are trained to approximate the input net_i of the pooling layer neurons:

\frac{1}{\eta}\,\frac{\partial w_{ji}(t)}{\partial t} = \Delta w_{ji}(t) = \mathrm{net}_i(t) - w_{ji}(t) \qquad (1)

where neuron j is the last winner neuron that differed from the current winner in the pattern layer. In consequence, the outgoing weight vector of a pattern neuron forecasts the input to the pooling layer while the next pattern neuron is active. The forecasting error can be backpropagated through the network to adapt the mental transformation for error minimization (cf. red arrows in Fig. 1). Thus, perspective adaptation is driven by the difference between the forecasted and actually perceived motion. The difference d_i is directly fed into the pooling layer by the outstar weights:

d_i(t) = \Delta w_{ji}(t) \qquad (2)

where j again refers to the preceding winner.
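Read as an update rule, Eqs. (1) and (2) can be sketched as follows. This is an illustrative reading, not the original code; the learning rate and the layer sizes are assumed.

```python
import numpy as np

eta = 0.05                                  # assumed learning rate (eta in Eq. 1)
n_pool, n_patterns = 86, 20                 # assumed layer sizes
W_out = np.zeros((n_patterns, n_pool))      # outstar/feedback weights w_ji

def forecast_update(j_prev, net):
    # j_prev: last winner that differed from the current winner; net: current pooling input.
    d = net - W_out[j_prev]        # d_i(t) = net_i(t) - w_ji(t), the forecast error (Eqs. 1, 2)
    W_out[j_prev] += eta * d       # the actual weight change is eta times that difference (Eq. 1)
    return d                       # fed back to adapt the mental transformation for error minimization
```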


Experiments
In this section, we first introduce the 3D simulation we implemented to evaluate our model. We then show that after training on the simulated movement, the learned angular and positional correlations can be exploited to take on the perspective of another person that currently executes a similar motion pattern. The reported results are averaged over 100 independent runs (training and evaluating the network starting with different random number generator seeds).

Simulation and Setup
We implemented a 3D simulation of a humanoid walking with 10 angular DOF. The movement is cyclic with a period of 200 time steps (corresponding to one left and one right walking step). The simulation provides the 3D positions of all 12 limb endpoints relative to the body's center x_1 … x_12 as well as 8 angles a_1 … a_8 between limbs (inner rotations of limbs are not considered). The view of the walker can be rotated arbitrarily before serving as visual input to the model. Furthermore, the simulation allows the definition of the appearance and postural control of the walker. Each of the implied parameters (body scale, torso height, width of shoulders/hips and length of arms/legs, as well as minimum/maximum amplitude of joint angles on movement) can be varied to log-normally distributed variants of an average walker, which exhibits either female or male proportions. Randomly sampled resulting walkers are shown in Fig. 2.

Fig. 2 Variants of the simulated walker

Perspective-Taking on Action Observation with Morphological Variance
We first trained the model on the egocentric perspective of the average male walker for 40 k time steps. The rotation biases were kept fixed since no mental rotation has to be applied during self-perception. In consequence, a cyclic series of 4 to 11 winner patterns evolved from noise in the pattern layer. Each represents i) a sufficiently linear part of the walking via its instar vector and ii) the next forecasted, sequential part of the movement via its outstar vector. After training, we fed the model with an arbitrarily rotated (uniform distribution in orientation space) view of a novel walker, which was either female or male with 50 % probability. Each default morphology parameter was varied by a log-normal distribution LN(0, σ²) with variance σ² = 0.1, postural control parameters were not varied. Instar/outstar learning was disabled from then on, but the mental rotation biases were allowed to adapt according to the backpropagated forecast error to derive the orientation of the shown walker.
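The log-normal variation of the walker can be mimicked in a few lines. This is a hedged illustration: the parameter names and default values below are placeholders, only the multiplicative LN(0, σ² = 0.1) factor follows the text.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = np.sqrt(0.1)   # standard deviation of the underlying normal, so that sigma^2 = 0.1

# Hypothetical default parameters of the average walker (placeholder values)
defaults = {"body_scale": 1.0, "torso_height": 0.55, "shoulder_width": 0.40, "leg_length": 0.90}

def sample_walker():
    # Multiply every default parameter by a log-normally distributed factor LN(0, sigma^2).
    return {name: value * rng.lognormal(mean=0.0, sigma=sigma) for name, value in defaults.items()}
```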
Fig. 3 The model aligns its perspective to the orientation of observed walkers with different morphological parameters (starting at t = 200). Blue quartiles, black median

Figure 3 shows the mismatch of the model's derived walker orientation, which we term orientation difference (OD), over time. We define the OD by the minimal amount of rotation needed to rotate the derived orientation into the egocentric orientation about the optimal axis of rotation. In result, all trials converged to a negligible OD, which means that the given view of the walker was internally rotated to the previously learned, egocentric orientation. The median remaining OD converged to ≈ 0.15 with quartiles of ≈ 0.03. The time for the median OD to


fall short of 1 was 120 time steps. These results show that morphological differences between the self-perceived and observed walkers could be generalized over. This is because the model's scale-invariance applies to every positional relation perceived by the model.
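For concreteness, the orientation difference can be computed as the rotation angle of the relative rotation between the derived and the egocentric view. The sketch below assumes both orientations are available as 3 x 3 rotation matrices, which is an assumption made here for illustration rather than a detail taken from the abstract.

```python
import numpy as np

def orientation_difference(R_derived, R_ego):
    # Angle (radians) of the minimal rotation, about its optimal axis, that maps the
    # derived orientation onto the egocentric one.
    R_rel = R_ego @ R_derived.T
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.arccos(cos_theta))
```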
Perspective-Taking on Action Observation with Postural Control Variance
In this experiment, we varied the postural control parameters of the simulation on action observation by a log-normal distribution with variance σ² = 0.1, instead of the morphological parameters. Again, female as well as male walkers were presented. The perspective of all shown walkers could be derived reliably, but with a higher remaining OD of ≈ 0.67 and more distal quartiles of ≈ 0.32. The median OD took longer to fall short of 1, namely 154 time steps. This is because the directions of joint motion are influenced by angular parameters. Still, variations in postural control could largely be generalized over (Fig. 4).

Fig. 4 The model aligns its perspective to the orientation of observed walkers with different postural control parameters

Conclusions and Future Work
The results have shown that the developed model is able to recognize novel perspectives on BM independent from morphological and largely independent from posture control variations. With the previous model, motion segments are also recognized if their input sequence is reordered, such that additional, implicitly learned attractors may exist for the perspective derivation. The introduced, explicit learning of pattern sequences forces the model to deduce the correct perspective by predicting the patterns of the next motion segment rather than the current one. It may well be the case, however, that the combination of both predictive mechanisms may generate even more robust results. Future work needs to evaluate the current model capabilities and limitations as well as possible combinations of the prediction mechanisms further. Currently, we are investigating how missing or incomplete data could be derived by our model during action observation.

We believe that the introduced model may help to infer the current goals of an actor during action observation somewhat independent of the current perspective. Experimental psychological and further cognitive modeling studies may examine the influence of motor sequence learning on the recognition of BM and the inference of goals. Also, an additional, dynamics-based modulatory module could be incorporated, which could be used to deduce emotional properties of the derived motion, and could thus bootstrap capabilities related to empathy. These advancements could pave the way for the creation of a model on the development of a mirror neuron system that supports learning by imitation and is capable of inferring goals, intentions, and even emotions from observed BM patterns.

References
Andersen RA, Essick GK, Siegel RM (1985) Encoding of spatial location by posterior parietal neurons. Science 230(4724):456–458
Grossberg S (1976) On the development of feature detectors in the visual cortex with applications to learning and reaction-diffusion systems. Biological Cybernetics 21(3):145–159
Grossman E, Donnelly M, Price R, Pickens D, Morgan V, Neighbor G, Blake R (2000) Brain areas involved in perception of biological motion. Journal of Cognitive Neuroscience 12(5):711–720
Penhune VB, Steele CJ (2012) Parallel contributions of cerebellar, striatal and M1 mechanisms to motor sequence learning. Behavioural Brain Research 226(2):579–591
Schrodt F, Layher G, Neumann H, Butz MV (2014) Modeling perspective-taking by correlating visual and proprioceptive dynamics. In: 36th Annual Conference of the Cognitive Science Society, Conference Proceedings


Matching quantifiers or building models? Syllogistic reasoning with generalized quantifiers

Eva-Maria Steinlein, Marco Ragni
Center for Cognitive Science, University of Freiburg, Germany

Abstract
Assertions in the thoroughly investigated domain of classical syllogistic reasoning are formed using one of the four quantifiers: all, some, some not, or none. In everyday communication, meanwhile, set-based quantifiers like most and frequency-based quantifiers such as normally are more often used. However, little progress has been made in finding a psychological theory that considers such quantifiers. This article adapts two theories for reasoning with these quantifiers: the Matching-Hypothesis and a variant of the Mental Model Theory. Both theories are evaluated experimentally in a syllogistic reasoning task. Results indicate a superiority of the model-based approach. Semantic differences between the quantifiers most and normally are discussed.
Keywords
Reasoning, Syllogisms, Matching-Hypothesis, Mental Models, Minimal Models

Introduction
Consider the following example:
All trains to Bayreuth are local trains.
Normally local trains are on time.
What follows?
You might infer that, normally, trains to Bayreuth are on time – at least this is what participants in our experiments tend to do. And, in absence of other information, it might even be sensible to do so. However, if you understand the second assertion as "Normally local trains in Germany are on time", then the local trains to Bayreuth could be an exception. So while different conclusions are possible, none of them is necessarily true. Hence no valid conclusion (NV) follows, but participants rarely give this logically correct answer.

Problems like the example above consisting of two quantified premises are called syllogisms. The classical quantifiers all, some, some not, and none have been criticized for being too strict or uninformative, respectively (Pfeifer 2006), and thus being infrequently used in natural language. Hence so-called generalized quantifiers like most and few have been introduced and investigated in this field (Chater, Oaksford 1999). In our study we additionally included the frequency-based term normally that is used in non-monotonic reasoning. Non-monotonic reasoning (Brewka, Niemela and Truszczynski 2007) deals with rules that describe what is usually the case, but do not necessarily hold without exception.

The terms of a syllogistic problem can be in one of four possible figures (Khemlani, Johnson-Laird 2012). We focus on two: Figure I,


the order A–B and B–C (the example above is of this type with A = trains to Bayreuth, B = local trains, and C = are on time), and Figure IV, the term order B–A and B–C. While Figure I allows for a transitive rule to be applied, Figure IV does not. Additionally, conclusions can be drawn in two directions, relating A to C (A–C conclusion) or C to A (C–A conclusion). Several theories of classical syllogistic reasoning have been postulated based on formal rules (e.g. Rips 1994), mental models (e.g. Bucciarelli, Johnson-Laird 1999), or heuristics (e.g. Chater, Oaksford 1999). However, none of them provides a satisfying account of naive human participants' syllogistic reasoning behavior (Khemlani, Johnson-Laird 2012).

While most theories only provide predictions for reasoning with the classical quantifiers, some theories apply equally to generalized quantifiers. One of the most important approaches in this field is the Probability Heuristics Model (PHM) introduced by Chater and Oaksford (1999). It states that reasoners solve syllogisms by simple heuristics, approximating a probabilistic procedure. Within this framework, generalized quantifiers like most are treated as probabilities of certain events or features. Another theory to explain human syllogistic reasoning is the Matching Hypothesis (Wetherick, Gilhooly 1995), which states that the choice of the quantifier for the conclusion matches the most conservative quantifier contained in the premises. Extending this approach with most and normally could result in the order:

All < Normally < Most < Some = Some not < None

Considering the example above from this perspective, normally is preferred over all; hence a reasoner would, incorrectly, respond that normally trains to Bayreuth are on time.
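A minimal sketch of this extended Matching-Hypothesis is given below; it is an illustration, not the authors' implementation. The predicted conclusion quantifier is simply the most conservative quantifier occurring in the premises under the ordering above.

```python
# Conservativeness ranking read off the ordering above (larger = more conservative);
# "some" and "some not" are treated as equally conservative.
CONSERVATIVENESS = {"all": 0, "normally": 1, "most": 2, "some": 3, "some not": 3, "none": 4}

def matching_prediction(q1: str, q2: str) -> str:
    """Quantifier that the Matching-Hypothesis predicts for the conclusion."""
    return max(q1, q2, key=lambda q: CONSERVATIVENESS[q])

# Example from the text: "All ... / Normally ..." -> normally is preferred over all.
assert matching_prediction("all", "normally") == "normally"
```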
Do people actually reason when confronted with syllogisms, or are responses the result of a superficial automatic process, as suggested by the Matching-Hypothesis? Mental Models are an approach that assumes individuals engage in a reasoning process, thus allowing for more sophisticated responses. Yet individual differences exist. Therefore, we suggest Minimal Models as a hybrid approach, combining mental models and heuristics. It is assumed that a deductive process based on an initial model is guided by the most conservative quantifier of the premises. Reasoners will try to verify this quantifier in the initial model, which is minimal with respect to the number of individuals represented, and tend to formulate a conclusion containing this quantifier. For example, for the syllogism "Most A are B, Some B are C", some is more conservative and tested in the following initial (minimal) model (left):

[initial (minimal) model and alternative model not reproduced here]

Some holds in this model, thus the preferred conclusion is "Some A are C". While this answer is dominated by heuristics (corresponding to what is known as System 1), some reasoners may engage in a more sophisticated reasoning process (System 2) consisting of the construction of alternative models in order to falsify the initial conclusion. An example for an alternative model of the syllogism is given above. With such an alternative model in mind, the reasoner will arrive at the valid response, i.e. in this case NV.

Empirical Investigation
Hypothesis. We tested whether our extension of the Matching-Hypothesis or Minimal Models provide a more accurate account of syllogistic reasoning. The PHM was not included in our analysis, as it does not provide any predictions for reasoning with the quantifier normally. For our experiment, we assume that Minimal Models make better predictions for the response behavior of naive participants, because 1) they allow for effects of figure, i.e. responses may vary depending on the order of terms in the premises, and 2) they not only predict heuristic System 1 responses, but also System 2 responses which are logically valid and often do not conform to System 1 responses. Therefore, we hypothesize that Minimal Models show a higher hit rate and more correct rejections than the Matching-Hypothesis. Furthermore, System 2 responses should occur as predicted by Minimal Models. In addition to this comparison of theories, we explored the semantics of the quantifiers most and normally empirically.
Participants. Fifty-eight native English speakers (21 m, 37 f; mean(age) = 35.5) participated in this online experiment. They were recruited via Amazon Mechanical Turk and come from a variety of educational and occupational backgrounds.
Design & Procedure. In this online experiment participants were asked to generate conclusions to 31 syllogisms in Figures I and IV reflecting all combinations of the quantifiers all, some, most, and normally (the simple syllogism AA in Figure I was omitted, as it was used as an explanatory example in the instruction). Both premises were presented simultaneously, together with the question "What follows?" Participants could either fill in a quantifier and the terms (X, N, V), or write nothing (meaning NV) in the response field. After completing this production task, participants were asked about their understanding of the four quantifiers. For each quantifier they had to complete a statement of the following form: "If someone says [quantifier] it refers to a minimum of ___ out of 100 objects." Note that we asked for the minimum, as the lower bounds of the quantifiers are of greater importance to understanding the semantics of these specific quantifiers.
Results
Overall calculations show only a small, but significant difference (Wilcoxon test, z = 2.11, p = .018) between Minimal Models (67.9 % of responses predicted) and the Matching-Hypothesis (64.5 %). This trend was confirmed by the more fine-grained analysis of hits and correct rejections following a method introduced by Khemlani, Johnson-Laird (2012): Theoretical predictions are compared to significant choices (as shown in Tables 1 and 2), and hits (i.e. choices that are predicted by the respective theory) and correct rejections are counted. In this analysis, Minimal Models perform better in both categories, hits (90.1 vs. 78.1 %; Wilcoxon test, z = 2.99, p = .001) and correct rejections (92.1 vs. 87.5 %; Wilcoxon test, z = 2.65, p = .004).

Table 1 Significant choices for Figure I and the percentage of participants who drew these conclusions

First premise   Second premise
                All [A]     Some [I]    Most [M]                         Normally [N]
All [A]         –           I (78 %)    M (72 %)                         N (60 %)
Some [I]        I (78 %)    I (79 %)    I (67 %)                         I (69 %)
Most [M]        M (74 %)    I (74 %)    M (67 %)                         M (33 %), I (28 %), N (26 %)
Normally [N]    N (69 %)    I (79 %)    I (33 %), M (26 %), N (19 %)     N (69 %)

– : the AA syllogism in Figure I was omitted (see Design & Procedure)


Table 2 Significant choices for Figure IV and the percentage of participants who drew these conclusions

First premise   Second premise
                All [A]               Some [I]              Most [M]                                   Normally [N]
All [A]         A (74 %)              I (66 %)              M (50 %)                                   N (50 %)
Some [I]        I (55 %), I* (21 %)   I (57 %), NV (36 %)   I (48 %), NV (31 %)                        I (43 %), NV (31 %)
Most [M]        M (45 %), M* (17 %)   I (50 %), NV (24 %)   M (50 %), NV (22 %), I (21 %)              N (24 %), NV (24 %), M (22 %), I (21 %)
Normally [N]    N (41 %), N* (17 %)   I (57 %), NV (31 %)   M (26 %), I (22 %), N (17 %), NV (17 %)    N (59 %), NV (24 %)

Conclusions marked with * are conclusions in C–A direction, all others are in A–C direction. NV = no valid conclusion

According to our Minimal Model approach, for 26 tasks System 2 leads to responses differing from the heuristic ones. In eight cases, this prediction was confirmed by the data, i.e., in those cases a significant proportion of participants drew the respective System 2 conclusion.
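The hit/correct-rejection scoring can be made concrete with a small sketch. The response sets below are invented placeholders; only the counting logic mirrors the method sketched above (a hit is a significantly chosen conclusion that the theory predicts, a correct rejection is a candidate conclusion that is neither predicted nor chosen).

```python
def hits_and_correct_rejections(predicted, chosen, candidates):
    # predicted:  conclusions the theory predicts for one syllogism
    # chosen:     conclusions drawn by a significant proportion of participants
    # candidates: all conclusions that were available as responses
    hits = len(predicted & chosen)
    correct_rejections = len((candidates - predicted) & (candidates - chosen))
    return hits, correct_rejections

# Hypothetical example with conclusion labels as in Tables 1 and 2 (placeholder sets):
candidates = {"A", "I", "M", "N", "NV"}
print(hits_and_correct_rejections(predicted={"I"}, chosen={"I", "NV"}, candidates=candidates))
```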
The quantitative interpretation of the quantifiers is depicted in Fig. 1 for the quantifiers some, most, and normally. Note that the values for the quantifier all are not illustrated, as with one exception all participants assigned a minimal value of 100 to it. For several participants normally is equivalent to all, i.e., no exceptions are possible – in contrast to most. The direct comparison of the quantifiers most and normally revealed that, as expected, normally (mean = 75.5) is attributed a significantly (Wilcoxon test, z = 2.39, p = .008) higher value than most (mean = 69.0).

Fig. 1 Individual values (points) and quartiles (lines) of participants' understanding of the minimum of the quantifiers some, most, and normally

Discussion
Our distinction between frequency-based quantifiers (e.g., normally) and set-based quantifiers (e.g., most) in reasoning is – to the best of our knowledge – new. Although both, in principle, allow for exceptions, depending on the underlying semantics, four reasoners gave the same semantics for normally as for all. For most all reasoners assumed the possibility for exceptions – possibly applying a principle similar to the Gricean Implicature (Newstead 1995). This principle assumes that whenever we use expressions that allow for exceptions, these can be typically assumed.

So far the PHM (Chater, Oaksford 1999) does not provide any predictions for reasoning with the quantifier normally; however, given our quantitative evaluation of this quantifier, the PHM could be extended to address this issue in further research. Furthermore, for the presented experimental results the theory predictions for the quantifier normally could also be examined using the more fine-grained method of Multinomial Processing Trees (cf. Ragni, Singmann and Steinlein 2014).

Non-monotonic reasoning, i.e., reasoning about default assumptions, often uses quantifiers like normally to express knowledge. For instance, Schlechta (1995) characterizes such default assumptions in his formal investigation as generalized quantifiers.

Although there are many theories for syllogistic reasoning (Khemlani, Johnson-Laird 2012), there are only few that can be generically extended; among these we focused on the Matching-Hypothesis and the Mental Model Theory. In contrast to the Matching-Hypothesis, Mental Model Theory relies on mental representations that can be changed to search for counter-examples (as in relational reasoning, cf. Ragni, Knauff 2013) and generate additional predictions by the variation of the models. The findings indicate that an extension of the Matching-Hypothesis to include the set-based quantifier most and the frequency-based quantifier normally leads to an acceptable prediction of the experimental data. There are, however, some empirical findings it cannot explain, e.g., System 2 responses and figural effects in reasoning with generalized quantifiers. It seems that in this case reasoners construct models as representations instead of merely relying on superficial heuristics.

Acknowledgments
The work has been partially supported by a grant to MR from the DFG within the SPP in project Non-monotonic Reasoning. The authors are grateful to Stephanie Schwenke for proof-reading.

References
Brewka G, Niemela I, Truszczynski M (2007) Nonmonotonic reasoning. Handbook of Knowledge Representation, pp 239–284
Bucciarelli M, Johnson-Laird PN (1999) Strategies in syllogistic reasoning. Cogn Sci 23:247–303. doi:10.1016/S0364-0213(99)00008-7
Chater N, Oaksford M (1999) The probability heuristics model of syllogistic reasoning. Cogn Psychol 38:191–258. doi:10.1006/cogp.1998.0696
Khemlani S, Johnson-Laird PN (2012) Theories of the syllogism: a meta-analysis. Psychol Bull 138:427–457. doi:10.1037/a0026841
Newstead S (1995) Gricean implicatures and syllogistic reasoning. J Memory Lang 34(5):644–664
Pfeifer N (2006) Contemporary syllogistics: Comparative and quantitative syllogisms. In: Kreuzbauer G, Dorn GJW (eds) Argumentation in Theorie und Praxis: Philosophie und Didaktik des Argumentierens. LIT, Wien, pp 57–71
Ragni M, Knauff M (2013) A theory and a computational model of spatial reasoning with preferred mental models. Psychological Review 120(3):561–588


Ragni M, Singmann H, Steinlein E-M (2014) Theory comparison for generalized quantifiers. In: Bello P, Guarini M, McShane M, Scassellati B (eds) Proceedings of the 36th annual conference of the cognitive science society. Cognitive Science Society, Austin, pp 1984–1990
Rips LJ (1994) The psychology of proof: Deductive reasoning in human thinking. The MIT Press, Cambridge
Schlechta K (1995) Defaults as generalized quantifiers. J Logic Comput 5(4):473–494
Wetherick NE, Gilhooly KJ (1995) Atmosphere, matching, and logic in syllogistic reasoning. Curr Psychol 14:169–178. doi:10.1007/BF02686906


What if you could build your own landmark? The influence of color, shape, and position on landmark salience

Marianne Strickrodt, Thomas Hinterecker, Florian Röser, Kai Hamburger
Experimental Psychology and Cognitive Science, Justus Liebig University Giessen, Germany

Abstract
This study focused on participants' preferences for building a landmark from eight colors, eight shapes, and four possible landmark positions for aiding the wayfinding of a nonlocal person. It can be suggested that participants did not only select features according to their personal and aesthetic preference (e.g. blue, circle), but also according to a sense of common cognitive availability and utility for learning a route (e.g. red, triangle). Strong preferences for the position of a landmark – namely, before the intersection and in direction of the turn – are in line with other studies investigating position preference from an allocentric view.
Keywords
Salience, Landmarks, Position, Color, Shape, Feature preference

Introduction
When travelling in an unknown environment people can use objects such as buildings or trees for memorizing the paths they walk. These objects, also called landmarks, are considered to be important reference points for learning and finding one's way (Lynch 1960). Essentially, almost everything can become a landmark as long as it is salient (Presson, Montello 1988). Sorrows, Hirtle (1999) defined three different landmark saliencies, whose precise definitions change slightly in the literature (e.g. Klippel, Winter 2005; Röser et al. 2012) but, nevertheless, include the following aspects:
– visual/perceptual salience: physical aspects of an object (e.g. color, shape, size);
– semantic/cognitive salience: aspects of knowledge and experiences, refers to mental accessibility of an object (i.e. easiness to label an object);
– structural salience: easiness one can cognitively conceptualize the position of an object.

Speaking of perceptual salience, which is not absolute but contrast-defined, a high contrast of the landmark to the surrounding will lead to easy and fast identification and recognition (Presson, Montello 1988). Nevertheless, given a high contrast, also color preference itself might influence the visual salience of an object. In a non-spatial context, blue to purple colors were found to be preferred whereas yellowish-green colors were most disliked (Hurlbert, Ling 2007). The cause of these preferences is discussed in light of different theories and explanations ranging from evolutionary adaption of our visual system (finding the red, ripe fruit) to aspects of ecological valence (experiences we associate with colored objects cause corresponding color preference). In a spatial context, colored environments have been found to enhance wayfinding behavior of children and adults when compared to non-colored environments (Jansen-Osmann, Wiedenbauer 2004). Also, on the level of single colors applied to landmarks along a virtual route, green led to the worst performance in a subsequent recognition task, while yellow, cyan, and red were recognized best (Wahl et al. 2008). Interestingly, even though not being a preferred color per se, yellow seems to be easy to recognize, therefore, probably helpful for remembering a path and learning the surrounding. Thus, it seems to be important to differentiate between a personal preference for color and the memorability and utility of colors in a spatial context.

A shape, such as a square or an ellipse, comprises both, visual and semantic salience – an appearance which is more or less easy to perceive and to reproduce, and a mental conceptualization, a label, a meaning. Shapes compared to colors revealed a significantly higher recognition performance (Wahl et al. 2008). Nevertheless, no differences in selecting or recognizing differently shaped landmarks could be found (Röser et al. 2012; Wahl et al. 2008). Outside the spatial context, Bar, Neta (2006) found that angular objects are less preferred than curved objects. They speculated: sharp contours elicit a sense of threat and lead to a negative bias towards the object. Taken together, these findings again suggest that both, preference and utility of shapes, are to be differentiated, whereby utility might play a more important role when selecting a color in a wayfinding context.

Besides these low-level features color and shape, this research concentrates on the position of a landmark at an intersection, covering an important aspect of structural salience, namely, how different positions are conceptualized. When instructed to select one out of four landmarks for a route description, each attached to one of the four corners of an intersection, participants show a clear position preference (Röser et al. 2012). From an egocentric perspective the positions in direction of turn either before or after the intersection are chosen significantly more often than positions opposite to the direction of turn. With allocentric (map-like) material the position before the intersection and in direction of turn is most preferred. Therefore, what most accounted for an object to be chosen was its location dependent of the direction in which the route continued, not whether it was presented on the left or right. The two types of defining the position of an object at an intersection are visualized in Fig. 3.

This study addresses all saliencies with the help of a simple selection task. Participants should choose from color, shape and position to create a landmark, which should be aiding another person's way through the same environment. We assume that the landmarks produced by the participants mirror their own implicit sense of a good, salient landmark, which everyone should be able to use. Results might in turn be an indicator for diverging scores of salience within the examined features. By combining the findings of the aforementioned preference and navigation research, we hypothesize that red and blue and the position in front of the intersection, in direction of turn (D) are most frequently chosen. Since shapes seem to induce distinctive preference this might also be reflected in the construction of a landmark, but no clear suggestions can be made at this point.

Material and Method
Participants
The sample consisted of 56 students (46 females) from Giessen University (mean age 24 yrs, SD = 4.5), who received course credits for participation. Normal or corrected to normal vision was required.
Materials
On an A4 sized paper an allocentric view of a small schematic city area was printed. The region of interest in this area consisted of four orthogonal intersections (Fig. 1). The route (two right and left turns) through this region was displayed by a magenta line. On the four corners of each intersection a quadratic white field indicated an


optional location for a landmark. Participants could choose from eight colors (violet, blue, green, yellow, orange, red) or luminances (white, black), respectively, and eight shapes (diamond, hexagon, square, rhomboid, ellipse, circle, cross, triangle).

Fig. 1 Schematic city area and range of colors and shapes participants could choose from as presented to the participants. The route from start (Start) to destination (Ziel) is indicated by a dashed line. White quadratic fields are the optional locations for the created landmarks

Procedure
Instructions were given on a separate paper. Only one landmark should be built for every intersection and each color and shape could only be used once. The landmark was to be positioned at one of the four corners of an intersection. The shape, therefore, had to be drawn with the selected color in one of the four white corners of an intersection. The task was to use the subjectively preferred combinations to build the landmarks in order to facilitate wayfinding for a notional, nonlocal person. Participants were instructed to imagine giving a verbal route description to this nonlocal person, including their built landmarks.

Results
Overall 224 decisions for shapes, colors, and positions, respectively (56 participants × 4 landmarks to build), were analyzed with non-parametric Chi Square tests. Frequencies for selection of shapes and colors can be seen in Fig. 2.

When analyzing single colors (Bonferroni correction α = .006), red was significantly above uniform distribution (χ²(1) = 21.592, p < .001), black (χ²(1) = 16.327, p < .001) and white (χ²(1) = 27.592, p < .001) were below. Regarding the shapes, results show that participants have a significant preference for the triangle (χ²(1) = 18, p < .001) and the circle (χ²(1) = 16.327, p < .001). On the other hand, ellipse (χ²(1) = 13.224, p = .001), hexagon (χ²(1) = 14.735, p < .001), and rhomboid (χ²(1) = 19.755, p < .001) were rarely chosen at all. Green, blue, yellow, violet, orange, and square, diamond and cross did not deviate from average frequencies.

Fig. 2 Relative frequency of selected colors and shapes and their deviation from uniform distribution (dashed line). Upper panel: color assignments [%] for red, green, blue, yellow, violet, orange, black, and white; lower panel: shape assignments [%]
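The color and shape analyses amount to χ²(1) tests of each option's selection count against the uniform expectation of 224/8 = 28 choices. The sketch below is an illustrative reconstruction that assumes SciPy; the observed count is a placeholder, although a count of 51 of 224 decisions reproduces a value close to the χ²(1) = 21.592 reported for red.

```python
from scipy.stats import chisquare

N_DECISIONS, N_OPTIONS = 224, 8                          # 56 participants x 4 landmarks, 8 options
expected = [N_DECISIONS / N_OPTIONS,                     # expected count for the tested option
            N_DECISIONS * (N_OPTIONS - 1) / N_OPTIONS]   # expected count for all other options

def test_single_option(observed_count):
    # chi^2(1) test of one option (chosen vs. not chosen) against the uniform distribution.
    return chisquare(f_obs=[observed_count, N_DECISIONS - observed_count], f_exp=expected)

print(test_single_option(51))   # placeholder count; yields a chi-square statistic near 21.6
```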
Figure 3 and Table 1 comprise findings comparing landmark positions. When focusing on landmark positions dependent of the direction of turn, it could be shown that position D – in front of the intersection, in direction of turn – is by far most frequently selected (71.88 %), followed by the other, associated position lying in direction of turn but behind the intersection, position B (25 %). Positions opposite to the direction of turn (A and C) lag far behind, suggesting that the significant difference between direction-independent positions in front of and behind the intersection (1 and 2 against 3 and 4) is solely driven by the popularity of D.

Fig. 3 Relative frequency of selected positions. a Independent of direction of turn: 1 behind, left (12.5 %); 2 behind, right (14.29 %); 3 in front, left (37.5 %); 4 in front, right (35.71 %). b Dependent of direction of turn: A behind, opposite (1.34 %); B behind, in (25 %); C in front, opposite (1.79 %); D in front, in (71.88 %). Note that the right figure includes both, right and left (transposed to right) direction of turns
Discussion
This study examined the selection of three different landmark features, namely color, shape, and location. Participants were instructed to select according to their own persuasion what kind of landmark is most qualified to aid a nonlocal person to find her way following a route description. Most favored by the participants was the color red (followed by green and blue, which, due to α-error correction, did not differ from chance). The least preferred colors were the luminances black and white. As for shapes, triangle and circle were most frequently selected (ensued by square, although without significant difference from chance). Least preferred were ellipse, hexagon, and rhomboid. A significant prominence of the position D was found.


Table 1 Multiple Chi Square comparisons for the two types of definition for landmark location

Independ.   χ²(1)     p         Depend.   χ²(1)      p
1–2         0.267     .699      A–B       47.610     <.001*
1–3         28        <.001*    A–C       0.143      1.000
1–4         25.037    <.001*    A–D       152.220    <.001*
2–3         23.310    <.001*    B–C       45.067     <.001*
2–4         20.571    <.001*    B–D       50.806     <.001*
3–4         0.098     .815      C–D       149.388    <.001*

χ² value and significance are shown (Bonferroni correction α = .008)

The neglect of the luminances black and white is in line with the assumptions concerning visual salience, namely, that a low contrast to the grey and white background of the experimental material is not preferable in a wayfinding context. Results suggest that participants were aware of the positive impact of contrast. Interestingly, neither former results of color preferences (Hurlbert, Ling 2007) nor benefit in recognition (Wahl et al. 2008) are perfectly mirrored in our data, suggesting that the selection process was not based on either of these levels. Instead, it seems to be plausible to suggest a selection strategy preferring landmark features according to familiarity. As red, blue, and green constitute the three primary colors every western pupil gets taught in school, and as they are probably the most used colors in street signs, they might also be best established and conceptualized in the knowledge of an average person selecting these colors. For the visual and semantic salience of shapes a similar explanation may be consulted. Shapes are preferred which are highly common and easy to identify by everyone: triangles and circles. Furthermore, the low complexity of these shapes compared to rhomboid or hexagon might have affected the selection as well. It seems that the sharpness of the contour of an object was immaterial in this task. The clearest and most reliable result is the preference for position D (allocentric), the position before the intersection and in direction of the turn. Also Waller, Lippa (2007) pointed out the advantages of landmarks in directions of turn as they serve as beacons (as compared to associative cues). Merely recognizing these landmarks is sufficient to know where to go, since their position reveals the correct direction response at an intersection.

Overall, it seems that participants did not choose object properties according to a mere personal feature preference. Their selection process probably involved preference with respect to perceptibility, easiness of memorization, and usability in terms of wayfinding ("this works fine as a landmark"). To what extent the selection was based on a conscious or unconscious process can't be determined here. Also, if the fact of guiding another person (compared to oneself) played an important role in creating a landmark can't be sufficiently answered at this point. Furthermore, if these preferences really help people to learn a route faster or easier yet is another question. For the task of building a landmark, which shall aid other people to find the same way, we found evidence that people show clear preference for best-known and most common colors and shapes. Moreover, the high frequency of the selection of the position before the turn and in the direction of turn is striking. This study, the task of creating a landmark, is a small contribution to the expanding research on visual as well as semantic, and structural salience of landmarks.

References
Bar M, Neta M (2006) Humans prefer curved visual objects. Psychol Sci 17:645–648
Hurlbert AC, Ling Y (2007) Biological components of sex differences in color preference. Curr Biol 17:R623–R625
Jansen-Osmann P, Wiedenbauer G (2004) The representation of landmarks and routes in children and adults: A study in a virtual environment. J Environ Psychol 24:347–357
Klippel A, Winter S (2005) Structural salience of landmarks for route discrimination. In: Cohn AG, Mark D (eds) Spatial information theory. International Conference COSIT. Springer, Berlin
Lynch K (1960) The image of the city. MIT Press, Cambridge
Presson CC, Montello DR (1988) Points of reference in spatial cognition: Stalking the elusive landmark. Br J Dev Psychol 6:378–381
Röser F, Krumnack A, Hamburger K, Knauff M (2012) A four factor model of landmark salience – A new approach. In: Russwinkel N, Drewitz U, van Rijn H (eds) Proceedings of the 11th International Conference on Cognitive Modeling (ICCM). Universitätsverlag TU Berlin, Berlin
Sorrows ME, Hirtle SC (1999) The nature of landmarks for real and electronic spaces. In: Freksa C, Mark DM (eds) Spatial information theory: cognitive and computational foundations of geographic information science, International Conference COSIT 1999. Springer, Stade
Wahl N, Hamburger K, Knauff M (2008) Which properties define the salience of landmarks for navigation? An empirical investigation of shape, color and intention. International Conference Spatial Cognition 2008, Freiburg
Waller D, Lippa Y (2007) Landmarks as beacons and associative cues: their role in route learning. Mem Cognit 35:910–924


Does language shape cognition?

Alex Tillas
Institut für Philosophie, Heinrich-Heine-Universität, Düsseldorf, Germany

Introduction
In this paper, I investigate the relation between language and thinking and offer an associationistic view of cognition. There are two main strands in the debate about the relation between language and cognition. On the one hand there are those that ascribe a minimal role to language and argue that language merely communicates thoughts from the Language of Thought-level to the conscious-level (e.g. Grice 1957; Davidson 1975; Fodor 1978). On the other hand, there are those who argue for a constitution relation holding between the two (Carruthers 1998; Brandom 1994). Somewhere in the middle of these two extremes lie the supra-communicative views of language that go back to James (1890/1999), Vygotsky (trans. 1962) and more recently to the work of Berk and Garvin (1984). Furthermore, Gauker (1990) argues that language is a tool for affecting changes in the subject's environment, while Jackendoff (1996) argues that linguistic formulation allows us a handle for attention. Finally, Clark (1998), and Clark and Chalmers (1998) argue for the causal potencies of language and suggest that language complements our thoughts (see also Rumelhart et al. 1986).

Building upon associationism, the view suggested here ascribes a significant role to language in terms of cognition. This role is not limited to interfacing between unconscious and conscious level, but the relation between the two is not one of constitution. More specifically, in the suggested view linguistic labels (or words) play a crucial role in thinking. Call this position the Labels and Associations in Thinking hypothesis (henceforth LASSO). LASSO is similar to Clark's view in that utilization of linguistic symbols plays a significant role. However, for Clark, language is important in reducing cognitive loads, while in LASSO utilization of linguistic labels is responsible for acquisition of endogenous control over thoughts. In particular, I start from the ability that human agents have to manipulate external objects in relationships of agency towards them, and


argue that we can piggyback on that ability to manipulate and direct our own thinking. Despite sharing with supra-communicative views that language does not merely serve to communicate thoughts to consciousness, my focus here is on a more general level. In particular, I focus on how language influences thinking, rather than on how specific cognitive tasks might be propped by language. Finally, LASSO resembles Lupyan's (2007) Label Feedback Hypothesis, even though my agenda is more general than Lupyan's (non-linguistic aspects of cognition such as perceptual processing).

The LASSO Hypothesis
LASSO is based on a view of concepts as structured entities, comprising a set of representations. Members of this set are mainly perceptual representations from experiences with instances of a given kind, as well as perceptual representations of the appropriate word. These representations become associated on the basis of co-occurrence. Crucially they become reactivated when thinking about this object; to this extent thinking is analogous to perceiving (Barsalou 1999).

To endogenously control the tokening of a given concept is to activate this concept in the absence of its referents. In turn, to endogenously control thinking is to token a thought on the basis of processes of thinking rather than of processes of perceiving the appropriate stimulus. Endogenously controlled thinking is merely associative thinking, i.e., current thinking caused by earlier thinking. The key claim here is that we have endogenous control over our production of linguistic items given that we are able to produce linguistic utterances at will. It is this executive control over linguistic utterances that gives us endogenous control over our thoughts. Admittedly, there are alternative ways to acquire endogenous control over our thoughts, e.g. via associations with a goal-directed state over which we already have endogenous control. Once a certain degree of linguistic sophistication is acquired, the process of activating a concept in a top-down manner is achieved in virtue of activating associated words.

Language is not constitutive to (conscious) thinking
According to Carruthers (1998; 2005), accounting for our non-inferential access to our thoughts requires inner speech to be constitutively involved in propositional thinking. Contra Carruthers, I argue that this is not the only way in which non-inferential thinking can occur. One alternative is associative thinking. It might be that the transition from the word to the concept that has the very same content that a given word expresses is an associationistic link. In the suggested view, perceptual representations and words are associated in memory. Note that this is not a case of language being constitutive to thoughts, but a case of co-activation of a concept's different subparts: perceptual representations of the appropriate word (A) and representations formed during perceptual experiences with instances of a given kind (B). This occurs in virtue of an instance of a word activating A, which in turn activates B, resulting in the concept's activation as a whole. Nevertheless, and importantly, this kind of thinking is not interpretative, as Carruthers argues. It is not that an agent hears a word, say "Cat", and then tries to guess or infer what the word means. Instead, on hearing the word "Cat" the concept cat becomes activated. Access to thinking is neither interpretative nor constitutive.

Perceptual representations of objects and words are distinct from each other and are brought together during the process of concept formation. It is just that we only have conscious access at the level where representations of words and objects converge – consider this in terms of Damasio's (1989) well-known convergence zones hypothesis. In this sense, an agent can only access representations of objects and words simultaneously and treat them as if they were constitutive parts of a concept/thought.

The relationship between a thought and its representation in self-knowledge is brute causation. The particular transition between a first order thought and a second order thought is causally and not constitutively related. Contra Carruthers, the relationship between a first order and a second order thought is not a constitutive but a causal associative one. Thought and language are not constitutively connected.

Evidence for LASSO: Language & perceptual categorization
The suggested view enjoys significant empirical support, e.g. from evidence showing that perceptual categorization depends on language. This evidence could in turn be used against the communicative conception of language. For instance, in a series of experiments, Davidoff, Roberson (2004) examined LEW's – a patient with language impairments and close to the profile of high-level Wernicke's aphasia – abilities to categorize visually presented color stimuli, and found that color categories did not pop out for LEW. Instead, he retreated to a comparison between pairs, which in turn resulted in his poor performance in the categorization tasks. From this, Davidoff, Roberson argue that color categorization is essentially a rule-governed process. And even though colors are assigned to a given category on the basis of similarity, it is similarity to a conventionally named color that underlies this assignment. LEW's inability to categorize simple perceptual stimuli is because names are simply not available to him.

With regards to his performance in the color and shape categorization tasks, they argue that it is not the case that LEW has simply lost color or shape names. He is rather unable to consciously allocate items to perceptual categories. To this extent, they argue that LEW's impairment is not related to a type-of-knowledge but rather to a type-of-thought story. Furthermore, they argue that there is a type of classification, independent of feature classification, which is unavailable to aphasics with naming disorders. This evidence does not suggest a constitutive relation between language and thinking. Instead it suggests a strong relation between naming and categorization impairments, which could be explained by appealing to a strong association between a linguistic label and a concept. This in turn lends support to LASSO.

Evidence against a constitutive relation between language & cognition
Evidence in favor of LASSO and against a constitutive relation between language and cognition can be found in results showing that grammar – a constitutive part of language – is neither necessary nor sufficient for thinking and more specifically in Theory of Mind (ToM) reasoning. For instance, Siegal, Varley, Want (2001) show a double dissociation between grammar and ToM reasoning, which in turn indicates that reasoning can occur largely independently from grammatical language. Even though ToM understanding and categorization is not all there is to cognition, had it been the case that there was a constitutive relation between language and (conscious) cognition – in the way Carruthers argues for instance – then a double dissociation between grammar and ToM reasoning would have never occurred.

Focusing on the relation between grammar and cognition in aphasia, Varley and Siegal (2000) show that subjects with severe agrammatic aphasia and minimal access to propositional language performed well in different ToM tests and were capable of simple causal reasoning. On these grounds, Siegal, Varley, Want (2001) argue that reasoning about beliefs as well as other forms of sophisticated cognitive processes involve processes that are not dependent on grammar. By contrast to the previous evidence, Siegal et al. report that non-aphasic subjects with right-hemisphere (non-language dominant) lesions exhibited impaired ToM reasoning and had difficulties understanding sarcasm, jokes and the conversational implications of questions (Siegal et al. 1996; Happe et al. 1999). This double dissociation between grammar on the one hand and causal reasoning and ToM on the other suggests a non-constitutive relation between language and cognition, and in turn favors LASSO.


Objections to LASSO
Qua inherently associationistic, LASSO might be subject to the objection that it cannot account for propositional thinking or for compositionality of thought. For it might be that LASSO at best describes how inter-connected concepts become activated without explaining the propositional-syntactic properties that thoughts in the form of inner speech have. In reply, a single thought becomes propositional in structure and content by piggybacking on language. The conventional grammatical unity and structure of the sentence unifies these concepts and orders them in a certain way.

Another challenge facing associationistic accounts of thinking is that it is unclear how they can account for the characteristic of concepts to combine compositionally. In reply, I appeal to Prinz's semantic account (2002), according to which, in order for c to refer to x, the following two conditions have to be fulfilled:
a) x's nomologically covary with tokens of c
b) An x was the (actual) incipient cause of c

In the suggested view the concept petfish, like all concepts, is a folder that contains perceptual representations. The incipient causes of petfish can either be instances of petfish or representations of pets and representations of fish. Crucially, in terms of semantics, petfish has to nomologically covary with petfish rather than a disjunction of pet and fish. The reason why petfish nomologically covaries with petfish is that the concept's functional role is constrained by the constraints on the uses of the word that are set by the agent's locking into the conventions about conjunction formation. In this sense, agents participate in a convention and it is via the association between the word and the concept that the functional role of the conjunctive concept is constrained. In terms of the constitutive representations of petfish, these can be representations of pets like cats and dogs as well as representations of fish. Crucially, these representations are idle in the functional role of the concept; the latter is more constrained by its link to the words.

Acknowledgments
I am grateful to Finn Spicer, Anthony Everett and Jesse Prinz for comments on earlier drafts of this paper. Research for this paper has been partly funded by the Alexander S. Onassis Public Benefit Foundation (ZF 075) and partly by the Deutsche Forschungsgemeinschaft (DFG) (SFB 991, Project A03).

References
Barsalou LW (1999) Perceptual symbol systems. Behav Brain Sci 22:577–609. doi:10.1017/s0140525x99002149
Berk L and Garvin R (1984) Development of private speech among low-income Appalachian children. Dev Psychol 20(2):271–286. doi:10.1037/0012-1649.20.2.271
Brandom R (1994) Making it explicit: Reasoning, representing, and discursive commitment. Harvard University Press, Cambridge MA
Carruthers P (1998) Conscious thinking: Language or elimination? Mind Lang 13(4):457–476. doi:10.1111/1468-0017.00087
Carruthers P (2005) Consciousness: Essays from a higher order perspective. Clarendon Press, Oxford
Clark A (1998) Magic words: How language augments human computation. In: Carruthers P and Boucher J (eds) Language and thought: Interdisciplinary themes, pp 162–183. Cambridge University Press, Cambridge
Clark A and Chalmers DJ (1998) The extended mind. Analysis 58(1):7–19. doi:10.1111/1467-8284.00096
Damasio AR (1989) Time-locked multiregional retroactivation: A systems-level proposal for the neural substrates of recall and recognition. Cognition 33:25–62. doi:10.1016/0010-0277(89)90005-X
Davidoff J and Roberson D (2004) Preserved thematic and impaired taxonomic categorization: A case study. Lang Cognitive Proc 19(1):137–174. doi:10.1080/01690960344000125
Davidson D (1975) Thought and talk. In his Inquiries into truth and interpretation, pp 155–170. Oxford University Press, Oxford
Dummett M (1975) Wang's Paradox. Synthese 30:301–24
Elman JL, Bates EA, Johnson MH, Karmiloff-Smith A, Parisi D, Plunkett K (1996) Rethinking innateness: A connectionist perspective on development. MIT Press, Cambridge MA
Fodor J (1978) Representations: Philosophical essays on the foundations of cognitive science. MIT Press, Cambridge MA
Gauker C (1990) How to Learn a Language like a Chimpanzee. Phil Psych 3(1):31–53. doi:10.1080/09515089008572988
Grice P (1957) Meaning. Phil Review 66:377–88
Happe F et al. (1999) Acquired theory of mind impairments following stroke. Cognition 70:211–40. doi:10.1016/S0010-0277(99)00005-0
Jackendoff R (1996) How language helps us think. P&C 4(1). doi:10.1075/pc.4.1.03jac
James W (1890/1999) The Principles of Psychology (2 vols.). Henry Holt, New York (Reprinted Thoemmes Press, Bristol)
Lupyan G (2012) Linguistically modulated perception and cognition: the label feedback hypothesis. Front Psychol 3:54. doi:10.3389/fpsyg.2012.00054
Prinz J (2002) Furnishing the mind: Concepts and their perceptual basis. MIT Press, Cambridge
Rumelhart DE, Smolensky P, McClelland JL, Hinton GE (1986) Parallel distributed models of schemata and sequential thought processes. In: McClelland JL and Rumelhart DE (eds) Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 2: Psychological and Biological Models, pp 7–57
Siegal M, Carrington J, Radel M (1996) Theory of mind and pragmatic understanding following right hemisphere damage. Brain Lang 53:40–50. doi:10.1006/brln.1996.0035
Siegal M, Varley M, Want SC (2001) Mind over grammar: reasoning in aphasia and development. Trends Cogn Sci 5(7). doi:10.1016/S1364-6613(00)01667-3
Varley R, Siegal M (2000) Evidence for cognition without grammar from causal reasoning and theory of mind in an agrammatic aphasic patient. Curr Biol 10:723–26. doi:10.1016/S0960-9822(00)00538-8
Vygotsky LS (1962) Thought and Language. MIT Press, Cambridge


Ten years of adaptive rewiring networks in cortical connectivity modeling. Progress and perspectives

Cees van Leeuwen
KU Leuven, Belgium; University of Kaiserslautern, Germany

Activity in cortical networks is generally considered to be governed by oscillatory dynamics, enabling the network components to synchronize their phase. Dynamics on networks are determined to a large extent by the network topology (Barahona and Pecora 2002; Steur et al. 2014). Cortical network topology, however, is subject to change as a result of development and plasticity. Adaptive network models enable the dynamics on networks to shape the dynamics of networks, i.e. the evolution of the network topology (Gross and Blasius 2008). Adaptive networks show a strong propensity to evolve complex topologies. In adaptive networks, the connections are selectively reinforced (Skyrms and Pemantle 2000) or rewired (Gong and van Leeuwen 2003, 2004; Zimmerman et al. 2004), in adaptation to the dynamical properties of the nodes. The latter are called adaptively rewiring networks.


Gong and van Leeuwen (2003, 2004) started using adaptive rewiring networks in order to understand the relation between large scale brain structure and function. They applied a Hebbian-like algorithm, in which synchrony between pairs of network components (nodes) is the criterion for rewiring. The nodes exhibit oscillatory activity and, just like the brain does, where dynamic synchronization in spontaneous activity shows traveling and standing waves, and transitions between them (Ito et al. 2005, 2007), the network nodes collectively move spontaneously in and out of patterns of partial synchrony. Meanwhile, adaptive rewiring takes place. When a pair of nodes is momentarily synchronized but not connected, from time to time a link from elsewhere is relayed, in order to connect these nodes. This is the key principle of adaptive rewiring.
Adaptively rewiring a network according to synchrony in spontane- Albert 1999). Kwok et al. (2007) have shown, that the behavior of
ous activity gave rise to the robust evolution of a certain class of complex these networks is not limited to coupled maps, but could also be
network structures (Fig. 1). These share important characteristics with obtained with more realistic, i.e. spiking model neurons. Other than
the large-scale connectivity structure of the brain. Adaptive rewiring the coupled maps, these have directed connections. As the system
models, therefore, became an integral part of the research program of the proceeds its evolution, the activity in the nodes changes. Initial
Laboratory for Perceptual Dynamics, which takes a complex systems bursting activity (as observed in immature neurons, see e.g. Leine-
view to perceptual processes (For a sketch of the Laboratory while at the kugel et al. 2002, an activity assumed to be random but in fact, like
RIKEN Brain Science Institute, see van Leeuwen 2005; for its current that of the model, shows deterministic structure, see Nakatani et al.
incarnation as an FWO-funded laboratory at the KU Leuven, see its 2003), gives way to a mixture of regular and irregular activity char-
webpage at http://perceptualdynamics.be/). The original adaptive acteristic of mature neurons.
rewiring model (Gong and van Leeuwen 2003, 2004) was developed over Van den Berg et al. (2012) lesioned the model and showed that there
the years in a number of studies (Jarman et al. 2014; Kwok et al. 2007 is a critical level of connectivity, at which the growth of small-world
Rubinov et al. 2009a; van den Berg et al. 2012; van den Berg and van structure can no longer be robustly sustained. Somewhat surprisingly,
Leeuwen 2004). Here I review these developments and sketch some this results in a break-down, not primarily in the connections between
further perspectives. the clusters, but in the local clustering. In other words, the network shifts
In the original algorithm (Gong and van Leeuwen 2004; van den towards randomness. This corresponds to observations in patients
Berg and van Leeuwen 2004), the network initially consists of ran- diagnosed with schizophrenia (Rubinov et al. 2009b). The model,
domly coupled maps. Coupled maps are continuously valued maps therefore, could suggest an explanation of the anomalies in large-scale
connected by a diffusive coupling scheme (Kaneko 1993). We used connectivity structures found in schizophrenic patients.
coupled logistic maps; the return plots of these maps are generic and Despite these promising results, a major obstacle towards realistic
can be regarded as coarsely approximating that of a chaotic neural application of the model has been the absence of any geometry. A
mass model (Rubinov et al. 2009a). Adaptively rewiring the couplings spatial embedding for the model would allow us to consider the effect of
of the maps showed the following robust tendency: From the initially biological constraints such as metabolic costs and wiring length. In a
random architecture and random initial conditions, a small-world recent study, Jarman et al. (2014) studied networks endowed with
network gradually emerges as the effect of rewiring. Small worlds are metrics, i.e. a definition of distance between nodes, and observed its
complex networks that combine the advantages of a high degree of effects on adaptive rewiring. A cost function that penalizes rewiring
local clustering from a regular network with the high degree of global more distant nodes, leads to a modular small world structure with
connectedness observed in a random-network (Watts and Strogatz greater efficiency and robustness, compared to rewiring based on syn-
1998). They are, in other words, an optimal compromise for local and chrony alone. The resulting network, moreover, consists of spatially
global signal transfer. Small-world networks have repeatedly been segregated modules (Fig. 2, left part), in which within-module con-
observed in the anatomical and functional connectivity of the human nections are predominantly of short range and their inter-connections
brain (He et al. 2007; Sporns 2011; Bullmore and Bassett 2011; are of long range (Fig. 2, right part). This implies that the topological
Gallos et al. 2012). principle of adaptive rewiring and the spatial principle of rewiring costs
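To make the rewiring principle concrete, here is a minimal sketch of synchrony-based adaptive rewiring on a lattice of diffusively coupled logistic maps. It is not the authors' original code: the network size, coupling strength, map parameter and rewiring interval are illustrative assumptions, and the rule simply moves one link per rewiring event from a node's least synchronized neighbour to its most synchronized non-neighbour, keeping the number of edges constant.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 100, 300        # number of nodes and of undirected links (assumed)
a, eps = 1.7, 0.5      # logistic-map parameter and coupling strength (assumed)

# random initial topology: symmetric adjacency matrix with K links
A = np.zeros((N, N), dtype=bool)
iu = np.triu_indices(N, 1)
chosen = rng.choice(len(iu[0]), size=K, replace=False)
A[iu[0][chosen], iu[1][chosen]] = True
A = A | A.T

x = rng.uniform(-1.0, 1.0, N)      # states of the coupled logistic maps

def f(x):
    # logistic map in the form commonly used for coupled map lattices
    return 1.0 - a * x ** 2

for step in range(20000):
    # diffusive coupling: mix each node's own map output with the
    # mean output of its current neighbours (Kaneko 1993)
    fx = f(x)
    deg = A.sum(1)
    neigh_mean = np.where(deg > 0, (A.astype(float) @ fx) / np.maximum(deg, 1), fx)
    x = (1.0 - eps) * fx + eps * neigh_mean

    if step % 10 == 0:             # adaptive rewiring every few iterations
        i = rng.integers(N)
        d = np.abs(x - x[i])       # momentary (de)synchronization with node i
        nonneigh = np.where(~A[i])[0]
        nonneigh = nonneigh[nonneigh != i]
        neigh = np.where(A[i])[0]
        if len(nonneigh) and len(neigh):
            j = nonneigh[np.argmin(d[nonneigh])]   # most synchronized, not yet connected
            k = neigh[np.argmax(d[neigh])]         # least synchronized existing link
            A[i, j] = A[j, i] = True               # relay the link: connect i-j ...
            A[i, k] = A[k, i] = False              # ... and cut i-k
```

In such a sketch, running the loop long enough and then measuring clustering and characteristic path length would be the way to check for the emergence of the small-world structure described above.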
The products of rewiring have an additional characteristic that is relevant to the brain: they are modular networks (Rubinov et al. 2009a). This means that they form community structures that interact via hubs. The hubs are specialized nodes that network evolution has given the role of mediating connections between communities. They synchronize sometimes with one community and sometimes with another, and can be considered as agents of change in the behavior of the regions to which they are connected.

Several studies have explored, and helped extend, the notion that adaptive rewiring leads to modular small worlds. It was already shown early on (Gong and van Leeuwen 2003) that combining rewiring with network growth results in a modular network that is also scale-free in the distribution of its connectivity (Barabási and Albert 1999). Kwok et al. (2007) have shown that the behavior of these networks is not limited to coupled maps, but could also be obtained with more realistic, i.e. spiking, model neurons. Other than the coupled maps, these have directed connections. As the system proceeds in its evolution, the activity in the nodes changes. Initial bursting activity (as observed in immature neurons, see e.g. Leinekugel et al. 2002; an activity assumed to be random but which in fact, like that of the model, shows deterministic structure, see Nakatani et al. 2003) gives way to a mixture of regular and irregular activity characteristic of mature neurons.

Van den Berg et al. (2012) lesioned the model and showed that there is a critical level of connectivity at which the growth of small-world structure can no longer be robustly sustained. Somewhat surprisingly, this results in a breakdown, not primarily in the connections between the clusters, but in the local clustering. In other words, the network shifts towards randomness. This corresponds to observations in patients diagnosed with schizophrenia (Rubinov et al. 2009b). The model, therefore, could suggest an explanation of the anomalies in large-scale connectivity structures found in schizophrenic patients.

Despite these promising results, a major obstacle towards realistic application of the model has been the absence of any geometry. A spatial embedding of the model would allow us to consider the effect of biological constraints such as metabolic costs and wiring length. In a recent study, Jarman et al. (2014) studied networks endowed with a metric, i.e. a definition of distance between nodes, and observed its effects on adaptive rewiring. A cost function that penalizes rewiring to more distant nodes leads to a modular small-world structure with greater efficiency and robustness, compared to rewiring based on synchrony alone. The resulting network, moreover, consists of spatially segregated modules (Fig. 2, left part), in which within-module connections are predominantly of short range and the inter-module connections are of long range (Fig. 2, right part). This implies that the topological principle of adaptive rewiring and the spatial principle of rewiring costs operate in synergy to achieve a brain-like architecture. Both principles are biologically plausible. The spatially biased rewiring process, therefore, may be considered as a basic mechanism for how the large-scale architecture of the cortex is formed.

The models developed so far have been no more (and no less) than a proof of principle. To some extent, this is how it should be. Efforts at biological realism can sometimes obscure the cognitive, neurodynamical principles on which a model is based. Some predictions, such as what happens when lesioning the model, could already be made with a purely topological version, with its extreme simplification of the neural dynamics. Yet, in order to be relevant, future model development will have to engage more with neurobiology. We are doing this step by step. Jarman et al. (2014) have overcome an important hurdle in applying the model by showing how spatial considerations could be taken into account. Yet, more is needed. First, we need to resume our work on realistic (spiking) neurons (Kwok et al. 2007): we will consider distinct (inhibitory and excitatory) neural populations, realistic neural transmission delays, spike-timing dependent plasticity, and a more differentiated description of the mechanisms that guide synaptogenesis in the transition from immature to mature systems. Second, and only after that, should we start preparing the system for information processing functions.
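As a companion sketch for the spatially constrained variant, the snippet below shows one way a wiring-cost penalty could be combined with the synchrony criterion when choosing a rewiring target, in the spirit of Jarman et al. (2014). The planar embedding, the linear combination of synchrony and distance, and the weight lam are assumptions of this illustration, not the published cost function.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
pos = rng.uniform(0.0, 1.0, size=(N, 2))   # assumed 2-D positions of the nodes
dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
x = rng.uniform(-1.0, 1.0, N)              # current states of the coupled maps
lam = 2.0                                  # weight of the wiring cost (assumed)

def rewiring_target(i, is_neighbour):
    """Pick a new partner for node i that is both synchronized and nearby.

    The score adds a distance penalty to the synchrony criterion, so that
    rewiring favours short-range links; with lam = 0 this reduces to the
    purely synchrony-based rule.
    """
    score = np.abs(x - x[i]) + lam * dist[i]
    score[i] = np.inf
    score[is_neighbour] = np.inf           # only nodes i is not yet connected to
    return int(np.argmin(score))

# usage: node 0 is currently linked to nodes 1-4
mask = np.zeros(N, dtype=bool)
mask[1:5] = True
print(rewiring_target(0, mask))
```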


Fig. 1  A random network prior to (left) and after (right) several iterations of adaptive rewiring (from van Leeuwen 2008). Note that this version of the model considers topology only; geographical proximity of nodes was heuristically optimized in order to provide a visualization of the propensity of the system to evolve a modular small-world network

Fig. 2  From Jarman et al. (2014): adaptive rewiring on a sphere. Left: differently colored units reveal the community structure (modularity) resulting from adaptive rewiring with a wiring cost function. Right: correlation between the spatial distance of connections (x-axis) and their topological betweenness centrality (y-axis), from top to bottom for the initial state and subsequent states during the evolution of the small-world network. The correlation as it emerges with the network evolution shows that links between modules tend to be of long range

References
Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512
Barahona M, Pecora LM (2002) Synchronization in small-world systems. Phys Rev Lett 89:054101
Bullmore ET, Bassett DS (2011) Brain graphs: graphical models of the human connectome. Annu Rev Clin Psychol 7:113–140
Gallos LA, Makse HA, Sigman M (2012) A small world of weak ties provides optimal global integration of self-similar modules in functional brain networks. PNAS 109:2825–2830
Gong P, van Leeuwen C (2004) Evolution to a small-world network with chaotic units. Europhys Lett 67:328–333
Gong P, van Leeuwen C (2003) Emergence of scale-free network with chaotic units. Physica A Stat Mech Appl 321:679–688
Gross T, Blasius B (2008) Adaptive coevolutionary networks: a review. J Roy Soc Interf 5:259–271
He Y, Chen ZJ, Evans AC (2007) Small-world anatomical networks in the human brain revealed by cortical thickness from MRI. Oxf J 17:2407–2419
Ito J, Nikolaev AR, van Leeuwen C (2005) Spatial and temporal structure of phase synchronization of spontaneous EEG alpha activity. Biol Cybern 92:54–60
Ito J, Nikolaev AR, van Leeuwen C (2007) Dynamics of spontaneous transitions between global brain states. Hum Brain Mapp 28:904–913
Jarman N, Trengove C, Steur E, Tyukin I, van Leeuwen C (2014) Spatially constrained adaptive rewiring in cortical networks creates spatially modular small world architectures. Cogn Neurodyn. doi:10.1007/s11571-014-9288-y
Kaneko K (ed) (1993) Theory and applications of coupled map lattices. Wiley, Chichester
Kwok HF, Jurica P, Raffone A, van Leeuwen C (2007) Robust emergence of small-world structure in networks of spiking neurons. Cogn Neurodyn 1:39–51
Leinekugel X, Khazipov R, Cannon R, Hirase H, Ben-Ari Y, Buzsáki G (2002) Correlated bursts of activity in the neonatal hippocampus in vivo. Science 296(5575):2049–2052
Nakatani H, Khalilov I, Gong P, van Leeuwen C (2003) Nonlinearity in giant depolarizing potentials. Phys Lett A 319:167–172
Rubinov M, Sporns O, van Leeuwen C, Breakspear M (2009a) Symbiotic relationship between brain structure and dynamics. BMC Neuroscience 10:55. doi:10.1186/1471-2202-10-55
Rubinov M, Knock S, Stam C, Micheloyannis S, Harris A, Williams L, Breakspear M (2009b) Small-world properties of nonlinear brain activity in schizophrenia. Hum Brain Mapp 58(2):403–416
Skyrms B, Pemantle R (2000) A dynamic model of social network formation. Proc Natl Acad Sci USA 97:9340–9346
Sporns O (2011) The human connectome: a complex network. Ann N Y Acad Sci 1224(1):109–125
Steur E, Michiels W, Huijberts HJC, Nijmeijer H (2014) Networks of diffusively time-delay coupled systems: conditions for synchronization and its relation to the network topology. Physica D 277:22–39
van den Berg D, Gong P, Breakspear M, van Leeuwen C (2012) Fragmentation: loss of global coherence or breakdown of modularity in functional brain architecture? Front Syst Neurosci 6:20. doi:10.3389/fnsys.2012.00020
van den Berg D, van Leeuwen C (2004) Adaptive rewiring in chaotic networks renders small-world connectivity with consistent clusters. Europhys Lett 65:459–464
van Leeuwen C (2005) The Laboratory for Perceptual Dynamics at RIKEN BSI. Cogn Proc 6:208–215
van Leeuwen C (2008) Chaos breeds autonomy: connectionist design between bias and babysitting. Cogn Proc 9:83–92
Watts D, Strogatz S (1998) Collective dynamics of small-world networks. Nature 393:440–442
Zimmermann MG, Eguíluz VM, San Miguel M (2004) Phys Rev E 69:065102


Bayesian mental models of conditionals

Momme von Sydow
Department of Psychology, University of Heidelberg, Germany

Conditionals play a crucial role in the psychology of thinking, whether one is concerned with truth table tasks, the Wason selection task, or syllogistic reasoning tasks. Likewise, there has been detailed discussion of normative models of conditionals in philosophy, in logic (including non-standard logics), in epistemology, as well as in philosophy of science. Here a probabilistic Bayesian account of the induction of conditionals based on categorical data is proposed that draws on different traditions and suggests a synthesis of several aspects of some earlier approaches.

Three Main Accounts of Conditionals
There is much controversy in philosophy and psychology over how indicative conditionals should be understood, and whether this relates to the material implication, to conditional probabilities, or to some other formalization (e.g. Anderson, Belnap 1975; Ali, Chater, Oaksford 2011; Byrne, Johnson-Laird 2009; Edgington 2003; Beller 2003; Evans, Over 2004; Kern-Isberner 2001; Krynski, Tenenbaum 2007; Pfeifer 2013; Johnson-Laird 2006; Leitgeb 2007; Oaksford, Chater 2007, cf. 2010; Oberauer 2006; Oberauer, Weidenfeld, Fischer 2007; Over, Hadjichristidis, Evans, Handley, Sloman 2007). Three main influential approaches, on which we will build, may be distinguished:


One class of approaches is based on the material implication. A psychological variant replaces this interpretation (with a T F T T truth table) by mental models akin either to complete truth tables or to only the first two cases of such a truth table (Johnson-Laird 2006; cf. Byrne, Johnson-Laird 2009). The present approach adopts the idea that a conditional "if p then q" may be represented with reference either to a full 2 × 2 contingency table or simply with reference to the cells relating to the antecedent p (i.e., p & q, p & non-q).
Another class uses a conditional probability interpretation, thus referring only to the first two cells of a contingency table (Stalnaker 1968, cf. Edgington 2003; Evans, Over 2004; Oberauer et al. 2007; Pfeifer 2013). This is often linked to assuming the hypothetical or counterfactual occurrence of the antecedent p (cf. Ramsey test). Here we take conditional probabilities as a starting point for a probabilistic understanding of conditionals, while adding advantages of the mental model approach. Moreover, an extended Bayesian version of this approach is advocated here, concerned not with a hypothetical frequentist (observed or imagined) relative frequency of q given p, but rather with an inference about an underlying generative probability of q given p that now depends on priors and sample size.
A subclass of the conditional probability approach additionally assumes a high probability criterion for the predication of logical propositions (cf. Foley 2009). This is essential to important classes of non-monotonic logic (e.g., System P) demanding a high probability threshold (a ratio of exceptions ε) for the predication of a normic conditional (Adams 1986; Schurz 2001, cf. 2005): P(q|p) > 1 − ε. We here reformulate a high probability criterion in a Bayesian way using second-order probability distributions (cf. von Sydow 2014).
Third, conditionals sometimes involve causal readings (cf. Hagmayer, Waldmann 2006; Oberauer et al. 2007) and methods of causal induction (Delta P, Power, and Causal Support; Cheng 1997; Griffiths, Tenenbaum 2005; cf. Ali et al. 2011) that make use of all four cells of a contingency table. Although conditionals have to be distinguished from causality ("if effect then cause"; "if effect E1 then effect E2"; "if cause C1 then cause C2"), conditional probabilities may not only form the basis for causality, but conditionals may also be estimated based on causality. Moreover, determining the probability of conditionals may sometimes involve calculations similar to causal judgments. In any case, approaches linking conditionals and causality have not been fully developed for non-causal conditionals in situations without causal model information.

Bayesian Mental Model Approach of Conditionals (BMMC)
The Bayesian Mental Model Approach of Conditionals allows for complete and incomplete models of conditionals (here symbolized as p &→ q vs. p *→ q). It nonetheless models conditionals in a probabilistic way. It is claimed that the probability of fully represented conditionals (P(p &→ q)) need not be equated with a single conditional probability (P(q|p)). In contrast, the probability of conditionals concerned with the antecedent p only, P(p *→ q), is taken to be closely related to the relative frequency of the consequent given the antecedent (its extension). However, the model does not merely refer to the extensional probability P_E(q|p), but is concerned with subjective generative probabilities affected by priors and sample size.
The postulates of the approach and the modelling steps will be sketched here (cf. von Sydow 2014, for a related model):
(1) Although BMMC relates to the truth values of conditionals and biconditionals, etc. (Step 6), it assigns probabilities to these propositions as a whole (cf. Foley 2009; von Sydow 2011).
(2) BMMC distinguishes complete vs. incomplete conditionals. This idea is adopted from mental model theory (Johnson-Laird, Byrne 1991; cf. Byrne, Johnson-Laird 2002). It is likewise assumed that standard conditionals are incomplete. However, whereas mental model theory has focused on cognitive elaboration as the cause for fleshing out incomplete conditionals, the use of complete vs. incomplete conditionals is primarily linked here to the homogeneity or inhomogeneity of the occurrence of q in the negated subclasses of the antecedent p (i.e. non-p) (cf. Beller 2003, closed-world principle). Imagine homogeneity of non-p with P(q|p) = P(q|non-p) = .82 (e.g., if one does p then one gets chocolate q, but for non-p cases one gets chocolate with the same probability as well). Here it seems inappropriate to assign the high probability of P(q|p) to P(p &→ q) as well, since the antecedent does not make a difference. However, consider a similar case where non-p is heterogeneous. Take nine subclasses in which P(q|non-p) = .9 and one in which P(q|non-p) = .1 (this yields the same average of P(q|non-p) = .82). For such a heterogeneous contrast class, the conditional is indeed taken to single out only the specific subclass p (similar to the conditional probability approach), since there is at least one potential contrast in one subclass of non-p. For the homogeneous case, however, the probability of the conditional is claimed to reflect the overall situation, and a high probability here would involve a difference between P(q|non-p) and P(q|p).
(3) BMMC represents the simpler, antecedent-only models of conditionals not as extensional probabilities, or relative frequencies of (observed or imagined) conditionals, but as subjective estimates of the generative probabilities that have produced them. Although similar to a conditional probability approach, i.e. P_E(q|p), this measure depends on priors and sample size. For flat priors, observing a [4; 1] input (f(p&q), f(p&non-q)) yields a lower P(p *→ q) than a larger sample size, e.g. [40; 10]. Particularly for low sample sizes, priors may overrule likelihoods, reversing high and low conditional probability judgments.
Formally, the model uses cases of q or non-q, conditional on p, as input (taken as Bernoulli trials with an unchanging generative probability θ). Given a value of θ, the Binomial distribution provides us with the likelihood of the data, P(D|θ), with input k = f(q|p) in n = f(q|p) + f(non-q|p) trials:

B(k | θ, n) = (n choose k) · θ^k · (1 − θ)^(n − k)

We obtain a likelihood density function for all θ (cf. Fig. 1, middle), resulting in a Beta distribution, now with the generative probability θ as an unknown parameter (with a − 1 = f(x = q|p) and b − 1 = f(x = non-q|p)):

Beta(a, b): P(θ | a, b) = const. · θ^(a−1) · (1 − θ)^(b−1)

As prior for θ we take the conjugate Beta distribution (e.g., Beta(1, 1) as flat prior) to easily calculate a Beta posterior probability distribution for θ (Fig. 1) that depends on sample size and priors. Its mean is a rational point estimate for the subjective probability of q given p.

Fig. 1  Example of the prior for θ, the Binomial likelihood, and the Beta posterior distribution over θ
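To illustrate Step 3, the following sketch computes the Beta posterior over the generative probability θ = P(q|p) from the observed counts f(q|p) and f(non-q|p), using the flat Beta(1, 1) prior and the [4; 1] versus [40; 10] example from the text; the use of SciPy here is an implementation choice of this illustration, not part of the model description.

```python
from scipy import stats

def posterior_over_theta(f_q, f_nonq, a0=1.0, b0=1.0):
    """Beta posterior over theta = P(q|p), given f(q|p), f(non-q|p) and a Beta(a0, b0) prior."""
    return stats.beta(a0 + f_q, b0 + f_nonq)

small = posterior_over_theta(4, 1)      # input [4; 1]
large = posterior_over_theta(40, 10)    # input [40; 10], same relative frequency

# the small sample is pulled more strongly towards the flat prior and is less certain
print(small.mean(), large.mean())       # approx. 0.71 vs. 0.79
print(small.std(), large.std())         # the posterior for [4; 1] is much wider
```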

123
S150 Cogn Process (2014) 15 (Suppl 1):S1S158

(4) In contrast, given fully represented conditionals (no heterogeneous contrast class), the probability of a conditional differs even more clearly from (extensional) conditional probabilities (cf. Leitgeb 2007). One option would be to apply a general probabilistic pattern logic (von Sydow 2011) to conditionals. In this case, however, conditionals would yield the same results as inclusive disjunctions, P(p &→ q) = P(¬p ∨ q). Albeit here concerned with all four cells of a logical truth table, another option is that conditionals have a direction even in non-causal settings. This assumption will be pursued here. A hypothetical causal-sampling assumption asserts hypothetical antecedent-sampling for conditionals (Fiedler 2000), as if assuming that the antecedent would have caused the data (cf. Stalnaker 1968; Evans, Over 2004). (In the presence of additional causal knowledge, one may correct for this, but this is not modelled here.) Based on the generative models of conditional probabilities (Step 3), generative versions of Delta P (Allan, Jenkins 1980) or causal power (Cheng 1997) are here suggested as another possible formalization of a full conditional.
Formally, the two conditional probability distributions (for q|p and q|non-p) are determined based on Step 3. To proceed from the two Beta posterior distributions on the interval [0, 1] to a distribution for Delta P, relating to P(q|p) − P(q|non-p) on the interval [−1, 1], one can use standard sampling techniques (e.g. the inversion or rejection method, Lynch 2007). For the sequential learning measure of causal power one proceeds analogously. The means of the resulting probability distributions may be taken as point estimates. However, Delta P and causal power may not be flexible enough (see Step 6).
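A minimal sketch of the sampling idea in Step 4: draw θ values from the two Beta posteriors (for q given p and q given non-p) and form the induced distribution of Delta P = P(q|p) − P(q|non-p). Drawing directly with NumPy's Beta sampler stands in for the inversion or rejection methods mentioned above, and the counts are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

f_q_p, f_nonq_p = 16, 4        # assumed counts of q / non-q given p
f_q_np, f_nonq_np = 6, 14      # assumed counts of q / non-q given non-p

# samples from the two Beta posteriors (flat Beta(1, 1) priors)
theta_p = rng.beta(1 + f_q_p, 1 + f_nonq_p, size=100_000)
theta_np = rng.beta(1 + f_q_np, 1 + f_nonq_np, size=100_000)

delta_p = theta_p - theta_np                   # distribution over Delta P in [-1, 1]
power = delta_p / (1.0 - theta_np)             # generative causal power (Cheng 1997),
                                               # meaningful when Delta P >= 0

print(delta_p.mean(), np.quantile(delta_p, [0.025, 0.975]))  # point estimate and interval
print(power.mean())
```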
(5) Let us first return to incomplete conditionals (Step 3). Even here the probability of a conditional P(p *→ q) may have to be distinguished from the conditional probability, even if modelled as a generative conditional probability (Step 3). To me there are two plausible options. One option would be to model probabilities of conditionals along similar lines as other connectives have been modelled in von Sydow (2011). Here I propose another option, closely related to another proposal (von Sydow 2014). This builds on the general idea of high probability accounts (Adams 1986; Schurz 2001, cf. 2005; Foley 2009), here specifying acceptance intervals over θ. This seems particularly suitable when concerned with the alternative testing of the hypotheses p *→ q, p *→ non-q, and p *→ (q ∨ non-q) (e.g., if one does p then one either gets chocolate q or does not). This links to the debate concerning conjunction fallacies and other inclusion fallacies (given p, q ∨ non-q refers to the tautology and includes the affirmation q; cf. von Sydow 2011, 2014).
Formally, we start with ideal generative probabilities on the θ scale (θ_q = 1; θ_non-q = 0; and θ_q∨non-q = .5) (cf. von Sydow 2011). We then vary, for each of the three hypotheses H, the acceptance threshold ε (over all, or all plausible, values). For ε = .2, the closed acceptance interval for the consequent q would be [.8, 1]; for non-q, [0, .2]; and for q ∨ non-q, [.4, .6]. Based on Step 3 we calculate, for all tested hypotheses, the integral over θ in the specified interval of the posterior probability distribution:

∫_{θ1}^{θ2} Posterior distribution(θ; H) dθ

This specifies the subjective probability that, for given observed data, the posterior probability of H is within the acceptance interval (cf. von Sydow 2011). The probability of each hypothesis is determined by adding up the outcomes for H over different levels of ε and normalizing the results over the alternative hypotheses (e.g., alternative conditionals). This provides us with a kind of pattern probability Pp of the hypotheses, predicting systematic (conditional) inclusion fallacies (e.g., allowing for Pp(q ∨ non-q | p) < Pp(q | p)). (Additionally, such intervals over θ may help to model quantifiers: "If x are p then most x are q", cf. Bocklisch 2011.)
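A sketch of the acceptance-interval computation in Step 5, under assumptions spelled out in the comments: the Beta posterior from Step 3 is integrated over the interval implied by each hypothesis and a threshold ε, the outcomes are summed over a grid of ε values, and the results are normalized over the competing hypotheses. The ε grid and the example counts are illustrative choices, not prescribed by the abstract.

```python
import numpy as np
from scipy import stats

def pattern_probabilities(f_q, f_nonq, eps_grid=np.arange(0.05, 0.5, 0.05)):
    """Pattern probabilities Pp for p *-> q, p *-> non-q, and p *-> (q or non-q).

    For each acceptance threshold eps, each hypothesis H receives the posterior
    mass of theta = P(q|p) inside its acceptance interval; the masses are summed
    over eps and normalized over the three hypotheses.
    """
    post = stats.beta(1 + f_q, 1 + f_nonq)           # flat Beta(1, 1) prior
    totals = np.zeros(3)
    for eps in eps_grid:
        intervals = [(1 - eps, 1.0),                 # consequent q         (ideal theta = 1)
                     (0.0, eps),                     # consequent non-q     (ideal theta = 0)
                     (0.5 - eps / 2, 0.5 + eps / 2)] # q or non-q           (ideal theta = .5)
        totals += [post.cdf(hi) - post.cdf(lo) for lo, hi in intervals]
    return totals / totals.sum()

print(pattern_probabilities(8, 2))    # most mass on p *-> q
print(pattern_probabilities(5, 5))    # most mass on the tautology p *-> (q or non-q)
```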
(6) In continuation of Step 4, and analogous to Step 5, we detail the alternative testing of p &→ q, p &→ non-q, and p &→ (q ∨ non-q) for complete conditionals. Since this includes a representation of non-p as well, we can also model the converse conditionals (←&, probabilistic necessary conditions) and biconditionals (←&→, probabilistic necessary and sufficient conditions) as alternatives to conditionals (&→, probabilistic sufficient conditions). First, to determine the homogeneity of the non-p subclasses (cf. Step 2), Step 5 is to be applied repeatedly, revealing whether each subclass is rather q, non-q, or q ∨ non-q. If the dominant results for all subclasses do not differ, we can determine the probability of a fully represented conditional. We make use of the results for the incomplete conditionals (for p or non-p; cf. Step 5). Related to conditionals, converse conditionals or biconditionals (or their full mental models), we interpret ideal conditionals p &→ q, at least in the presence of alternative biconditionals, as the combination of p *→ q and non-p *→ (q ∨ non-q); ideal biconditionals p ←&→ q as combinations of p *→ q and non-p *→ non-q; and ideal converse conditionals p ←& q as the combination of p *→ (q ∨ non-q) and non-p *→ q. Sometimes a connective may refer to more than one truth table: in the absence of biconditionals, P(p &→ q) is taken to be the mixture of a conditional and a biconditional. Likewise, the approach allows one to model, for instance, "if p then q or non-q" (p &→ (q ∨ non-q)) as the average of two truth table instantiations (with non-p either being q or, in another model, non-q).
Technically, it is suggested that one can obtain the pattern probabilities of the combination of the incomplete models by assuming their independence and by multiplying their outcomes; e.g.: Pp(p &→ q) = Pp(p *→ q) · Pp(non-p *→ (q ∨ non-q)). If the hypothesis space is incomplete or if other logical hypotheses are added (von Sydow 2011, 2014), the results need to be normalized to obtain probabilities for the alternative logical hypotheses.

Conclusion
Overall, the sketched model is suggested to provide an improved rational model for assessing generative probabilities for conditionals, biconditionals, etc. The model predicts differences for complete and incomplete mental models of conditionals, influences of priors, influences of sample size, probabilistic interpretations of converse conditionals and biconditionals, hypothesis-space dependence, and conditional inclusion fallacies. Although all these phenomena seem plausible in some situations, none of the previous models, each with their specific advantages, seems to cover all predictions. Throughout its steps the present computational model may contribute to predicting a class of conditional probability judgments (perhaps complementing extensional conditionals) by potentially integrating some divergent findings and intuitions from other accounts into a Bayesian framework of generative probabilities of conditionals.

Acknowledgments
This work was supported by the grant Sy 111/2-1 from the DFG as part of the priority program New Frameworks of Rationality (SPP 1516). I am grateful to Dennis Hebbelmann for an interesting discussion about modelling causal power in sequential learning scenarios (cf. Step 4). Parts of this manuscript build on von Sydow (2014), suggesting a similar model for other logical connectives.

References


Adams EW (1986) On the logic of high probability. J Philos Logic 15:255–279
Ali N, Chater N, Oaksford M (2011) The mental representation of causal conditional reasoning: mental models or causal models. Cognition 119:403–418
Allan LG, Jenkins HM (1980) The judgment of contingency and the nature of the response alternative. Can J Psychol 34:1–11
Anderson AR, Belnap N (1975) Entailment: the logic of relevance and necessity, vol I. Princeton University Press, Princeton
Beller S (2003) The flexible use of deontic mental models. In: Alterman R, Kirsh D (eds) Proceedings of the Twenty-Fifth Annual Conference of the Cognitive Science Society. Lawrence Erlbaum, Mahwah, pp 127–132
Bocklisch F (2011) The vagueness of verbal probability and frequency expressions. Int J Adv Comput Sci 1(2):52–57
Byrne RMJ, Johnson-Laird PN (2009) 'If' and the problems of conditional reasoning. Trends Cogn Sci 13:282–286
Cheng PW (1997) From covariation to causation: a causal power theory. Psychol Rev 104:367–405
Edgington D (2003) What if? Questions about conditionals. Mind Lang 18:380–401
Evans JSBT, Over DE (2004) If. Oxford University Press, Oxford
Fiedler K (2000) Beware of samples! A cognitive-ecological sampling approach to judgment biases. Psychol Rev 107:659–676
Foley R (2009) Beliefs, degrees of belief, and the Lockean thesis. In: Huber F, Schmidt-Petri C (eds) Degrees of belief (Synthese Library 342). Springer, Heidelberg
Griffiths TL, Tenenbaum JB (2005) Structure and strength in causal induction. Cogn Psychol 51:334–384
Hagmayer Y, Waldmann MR (2006) Kausales Denken. In: Funke J (ed) Enzyklopädie der Psychologie: Denken und Problemlösen, Band C/II/8, S. 87–166. Hogrefe, Göttingen
Johnson-Laird PN, Byrne RMJ (2002) Conditionals: a theory of meaning, pragmatics, and inference. Psychol Rev 109:646–678
Johnson-Laird PN (2006) How we reason. Oxford University Press, Oxford
Kern-Isberner G (2001) Conditionals in nonmonotonic reasoning and belief revision. Springer, Heidelberg
Krynski TR, Tenenbaum JB (2007) The role of causality in judgment under uncertainty. J Exp Psychol Gen 3:430–450
Leitgeb H (2007) Belief in conditionals vs. conditional beliefs. Topoi 26(1):115–132
Lynch SM (2007) Introduction to applied Bayesian statistics and estimation for social scientists. Springer, Berlin
Oaksford M, Chater N (eds) (2010) Cognition and conditionals: probability and logic in human reasoning. Oxford University Press, Oxford
Oberauer K (2006) Reasoning with conditionals: a test of formal models of four theories. Cogn Psychol 53:238–283
Oberauer K, Weidenfeld A, Fischer K (2007) What makes us believe a conditional? The roles of covariation and causality. Think Reason 13:340–369
Over DE, Hadjichristidis C, Evans JSBT, Handley SJ, Sloman SA (2007) The probability of causal conditionals. Cogn Psychol 54:62–97
Pfeifer N (2013) The new psychology of reasoning: a mental probability logical perspective. Think Reason 19:329–345
Schurz G (2005) Non-monotonic reasoning from an evolutionary viewpoint. Synthese 146:37–51
von Sydow M (2011) The Bayesian logic of frequency-based conjunction fallacies. J Math Psychol 55(2):119–139
von Sydow M (2014) Is there a monadic as well as a dyadic Bayesian logic? Two logics explaining conjunction fallacies. In: Proceedings of the 36th Annual Conference of the Cognitive Science Society. Cognitive Science Society, Austin


Visualizer verbalizer questionnaire: evaluation and revision of the German translation

Florian Wedell, Florian Röser, Kai Hamburger
Giessen, Germany

Abstract
Many everyday abilities depend on various cognitive styles. With the Visualizer-Verbalizer Questionnaire (VVQ) we here translated a well-established inventory to distinguish between verbalizers and visualizers into German and evaluated it. In our experiment, 476 participants answered the VVQ in an online study. The results suggest that only eight items measure what they are supposed to. To find out whether these eight items are usable as a future screening tool, we are currently running further studies. The VVQ translation will be discussed with respect to the original VVQ.

Keywords
Cognitive styles, Evaluation, Translation, Visualizer, Verbalizer, VVQ

Introduction
"When I learn or think about things, I imagine them very pictorially." People often describe their ability of learning or thinking in one of two possible directions. Either they state that they are the "vivid type", whose thoughts are full of colors and images, or they describe themselves as the "word-based person", which often seems a bit cold and more rational.
In the nineteen-seventies, Baddeley and Hitch (1974) demonstrated how important working memory is for everyday life. It seems as if the way we learn and describe things is more or less unconscious, but this fundamental ability is determined by individual preferences. Individual preferences and individual abilities are very important for various human skills, e.g. wayfinding and decision making. Therefore, they have to be taken into account throughout the whole domain of spatial cognition (e.g., Pazzaglia, Moè 2013).
One way of dealing with the necessary interindividual differentiation in wayfinding performance is to distinguish between people's cognitive style (Klein 1951) or, more precisely, the preferred components of their working memory. In their model, Baddeley and Hitch (1974) assumed that the central executive is a kind of attentive coordinator of verbal and visuo-spatial information in certain ways. Riding (2001) stated that one of the main dimensions of cognitive styles is the visualizer-verbalizer dimension. Therefore it is common in cognitive research to differentiate between preferring visual (visualizer) and/or verbal (verbalizer) information (e.g. Richardson 1977; Pazzaglia, Moè 2013). Considering this classification, it can be assumed that visualizers are people with high-imagery preferences, whereas verbalizers tend to have low-imagery preferences. These two styles are generally assessed with self-report instruments.


As Jonasson and Grabowski (1993) concluded, the primary tool used to distinguish between visualizers and verbalizers is the Visualizer-Verbalizer Questionnaire (VVQ; Richardson 1977). The VVQ contains 15 items. Participants have to answer each of the given items by judging whether it applies to their style of thinking (dichotomous; yes/no). Still, there is an unsolved problem concerning the VVQ. The verbal subscale indeed surveys verbal abilities (e.g., Kirby et al. 1988), whereas the items of the visual subscale are only partly connected to visuo-spatial abilities (e.g., Edwards, Wilkins 1981; Kirby et al. 1988). Another problem concerning the VVQ is that it is rather hard to find people that can clearly be assigned to one of the "extremes" of the visualizer-verbalizer dimension, since most participants are located somewhere in between and may not be assigned to one of the two dimension poles. Preliminary studies in our research group revealed that in some cases an estimated 50 participants had to be investigated for cognitive style with the VVQ in order to clearly assign 2–3 people to one of the two groups, which is not very useful and also not very economical for further research.
In the present study, our aim is to translate the VVQ into German. It seems necessary to translate and evaluate this questionnaire, since such a translation has not yet been evaluated and because of the lack of an equivalent tool freely available for research on the visualizer-verbalizer dimension in the German-speaking area.

Experiment
Method
Participants
A total of 476 participants (377 female/99 male), ranging from 18 to 50 years (M = 24.14 years), were examined anonymously in an online study during the period from 12/16/2013 to 01/14/2014. Most participants' highest educational attainment was claimed to be a high-school diploma (n = 278), followed by a university degree (n = 195) and other school graduation (n = 3). All participants were told that the study served to evaluate several translated forms of questionnaires, which included the VVQ. Participation was voluntary and was not compensated for in any way.

Materials
The material used was the VVQ in its translated form. Table 1 shows the translation of the whole inventory. The questionnaire was translated in three steps. In the first step, the VVQ was translated by the first author of this study. Negatively formulated items were formulated negatively in German as well. In step two, the translation was corrected by the two co-authors. In the third step, a bilingual member (native English- and German-speaking) of the research group of Experimental Psychology and Cognitive Science corrected the translated items for colloquial subtleties. After the translation process, the online study was set up with LimeSurvey, a tool for creating and conducting online studies.

Table 1  VVQ items (Richardson 1977) and the German translation
VVQ_01  I enjoy doing work that requires the use of words
        Mir machen Aufgaben Spaß, bei denen man mit Wörtern umgehen muss
VVQ_02  My daydreams are sometimes so vivid I feel as though I actually experience the scene
        Meine Tagträume fühlen sich manchmal so lebendig an, dass ich meine, sie wirklich zu erleben
VVQ_03  I enjoy learning new words
        Das Lernen neuer Wörter macht mir Spaß
VVQ_04  I easily think of synonyms for words
        Es fällt mir leicht, Synonyme von Wörtern zu finden
VVQ_05  My powers of imagination are higher than average
        Ich besitze eine überdurchschnittliche Vorstellungskraft
VVQ_06  I seldom dream
        Ich träume selten
VVQ_07  I read rather slowly
        Ich lese eher langsam
VVQ_08  I can't generate a mental picture of a friend's face when I close my eyes
        Wenn ich meine Augen schließe, kann ich mir das Gesicht eines Freundes nicht bildhaft vorstellen
VVQ_09  I don't believe that anyone can think in terms of mental pictures
        Ich glaube nicht, dass jemand in Form mentaler Bilder denken kann
VVQ_10  I prefer to read instructions about how to do something rather than have someone show me
        Ich lese lieber eine Anleitung, als mir von jemand anderem ihren Inhalt vorführen zu lassen
VVQ_11  My dreams are extremely vivid
        Meine Tagträume sind extrem lebhaft
VVQ_12  I have better than average fluency in using words
        Meine Wortgewandtheit ist überdurchschnittlich
VVQ_13  My daydreams are rather indistinct and hazy
        Meine Tagträume sind eher undeutlich und verschwommen
VVQ_14  I spend very little time attempting to increase my vocabulary
        Ich verbringe wenig Zeit damit, meinen Wortschatz zu erweitern
VVQ_15  My thinking often consists of mental pictures or images
        Ich denke sehr häufig in Form von Bildern

Procedure
Participants were recruited with an e-mail containing basic information and a hyperlink to the study webpage. Forwarded to the webpage via the hyperlink, participants first received a short introduction about the aim of the study, followed by three standard demographic questions (gender, age, and level of education; Fig. 1). A specific instruction marked the start of the VVQ. Participants were asked to answer each item with either yes or no and, if they were not able to answer an item with either yes or no, they were asked to choose the answer that most likely applied to them. The translated items of the VVQ were presented in the same order as in the original version of the questionnaire.

Fig. 1  Screenshot of the demographic items presented in LimeSurvey: first the dichotomous question for the participant's gender, second a free-text field for age, and third a drop-down box where participants choose their level of education

Results
Before reporting the results of the VVQ, it should be noted that we were unable to compare our findings with the original data, due to the lack of statistical data in the original study by Richardson (1977). After reversing the coding of negatively formulated items, we analyzed the VVQ with a factor analysis and Varimax rotation. The assumed two factors were preset. Each of the two factors had an eigenvalue above two (2.32 and 2.42) and, taken together, these factors explained 31.59 % of the variance. Table 2 shows the results of the factor analysis in detail. We only found eight items matching their predicted scale, with each scale containing four items. The other seven items could not clearly be assigned to one of these scales. Figure 2 shows a diagram of the items in the rotated space to illustrate the assignment of each item to the respective underlying factor. Cronbach's alpha of the translated version is very weak when considering the whole inventory (α = .04), but reaches at least a moderate level (α = .57) when items 06, 07, 08, 09, 10, 13 and 14 are eliminated.

Table 2  Underlying factors of the translated version of the VVQ items after Varimax rotation
Item  Verbalizer  Visualizer
01      .754       .037
02      .006       .618
03      .684      −.034
04      .655       .068
05      .216       .423
06      .032      −.619
07     −.300      −.030
08     −.097      −.280
09     −.058      −.214
10      .127      −.151
11     −.087       .657
12      .587       .073
13     −.089      −.692
14     −.655      −.131
15      .011       .533

Fig. 2  Diagram of the components in the rotated space; the cluster with items VVQ_02, 05, 11, 15 represents the visualizer-based items; the cluster with items VVQ_01, 03, 04, 12 represents the verbalizer-based items
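For readers who want to reproduce this kind of item analysis on their own data, the following is a small sketch of the Cronbach's alpha computation reported above; the simulated yes/no response matrix is only a stand-in for the actual data set, which is not available here.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_participants x n_items) matrix of 0/1 answers."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# stand-in data: 476 simulated participants answering 15 dichotomous items
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(476, 15))
print(cronbach_alpha(responses))                    # near 0 for unrelated items
print(cronbach_alpha(responses[:, [0, 2, 3, 11]]))  # subscale of items 01, 03, 04, 12
```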
Discussion
The investigation of the VVQ reveals a large deviation between the original VVQ and the translated version. The data suggest that the translated VVQ contains the two predicted main factors (visualizer and verbalizer). These two factors, or in other words the two extreme poles of the visualizer-verbalizer dimension, are covered with four items each. These are items 02, 05, 11 and 15 for the visualizer pole and items 01, 03, 04 and 12 for the verbalizer pole. The remaining seven items cannot clearly be attributed to one of the poles.


Work in progress: revising the VVQ
When analyzing the data of the VVQ, the results show that nearly half of the questionnaire does not measure in detail whether a person is a visualizer or a verbalizer. This finding matches data of our research group showing that the two styles are not separable from each other and that only a small number of people can clearly be assigned to one of the two groups. This shows that a translated form of the VVQ is not able to exactly distinguish between visualizers and verbalizers, which can also be assumed for the original version of the VVQ.
The results could be explained by the translation process changing the intended item content. An aspect that supports this assumption is that in some cases the participants answered in the "wrong" direction, as item 14 illustrates: "I spend very little time attempting to increase my vocabulary" was translated into German as "Ich verbringe wenig Zeit damit, meinen Wortschatz zu erweitern". The problem is that the German translation could induce two possible readings that lead a participant to answer this item with yes, marking the participant as either a visualizer or a verbalizer. The first reading, which marks the participant as a verbalizer, is that the participant wants to say "yes, I spend little time, because there is no need for me to spend more time on learning that stuff, as I already am very good". The second possible reading would clearly mark the participant as a visualizer, when he or she answers in the intended way with "yes, I spend very little time on it, because I do not care much about that stuff". To solve this problem, it seems necessary to change the phrasing of several items. But when doing so, it is inevitable that most parts of the inventory, or even the whole inventory, will change. We assume that one possible way to work with the translated form of the VVQ is to reduce it to the eight items that are clearly definable as being part of the visualizer or verbalizer pole and to use the inventory as a screening test only. We are currently doing research on this possibility with a second online study, in which our participants are asked to answer this VVQ screening version. Among other things, we want to investigate whether a strict distinction between visualizers and verbalizers is possible or whether there is only one cognitive style as a result of the combination of both visual and verbal abilities.
Our research group also plans to use the translated VVQ as a pretest in further investigations on the visual-impedance effect (Knauff, Johnson-Laird 2002). The visual-impedance effect refers to relations that elicit visual images containing details that are irrelevant to an inference problem and in turn (should) impede the reasoning process (Knauff, Johnson-Laird 2002). The VVQ might help to discover whether visualizers or verbalizers are more affected by the visual-impedance effect. We assume that (extreme) verbalizers might not be as much affected as (extreme) visualizers, because their preferred way of imagining is more word-based (or propositional) and therefore their reasoning process might not be as disrupted as it might be for visualizers.

Further research and conclusion
The VVQ is a widely used tool in the German research area. One reason for this is that it is freely available (in contrast to some other questionnaires like the OSIQ; Blajenkova et al. 2006). Therefore, we here consider creating a completely new inventory that fits the German language better, with the eight definable items of the VVQ as a basis. There are two ways to fill the inventory with items. In the first way, we suggest translating and evaluating the revised version of the VVQ by Kirby et al. (1988) and putting the eight items from the original VVQ together with the best fitting items from the revised version by Kirby into a new inventory. Another way is to create completely new items. We think that the pioneering work of Richardson (1977) is neither lost nor unusable, but we conclude that his work and the VVQ need to be revised for further use.


Acknowledgment
We thank Sarah Jane Abbott for help within the translation process and for proof-reading the manuscript.

References
Baddeley AD, Hitch G (1974) Working memory. In: Bower GH (ed) The psychology of learning and motivation: advances in research and theory. Academic Press, New York, pp 47–89
Blajenkova O, Kozhevnikov M, Motes MA (2006) Object-spatial imagery: a new self-report imagery questionnaire. Appl Cogn Psychol 20:239–263
Edwards JE, Wilkins W (1981) Verbalizer-visualizer questionnaire: relationship with imagery and verbal-visual ability. J Mental Imagery 5:137–142
Jonasson DH, Grabowski BL (1993) Handbook of individual differences, learning, and instruction. Erlbaum, Hillsdale
Kirby JR, Moore PJ, Schofield NJ (1988) Verbal and visual learning styles. Contemp Educ Psychol 13:169–184
Klein GS (1951) A personal world through perception. In: Blake RR, Ramsey GV (eds) Perception: an approach to personality. The Ronald Press Company, New York, pp 328–355
Knauff M, Johnson-Laird PN (2002) Visual imagery can impede reasoning. Mem Cogn 30(3):363–371
Kosslyn SM, Koenig O (1992) Wet mind: the new cognitive neuroscience. Free Press, New York
Pazzaglia F, Moè A (2013) Cognitive styles and mental rotation ability in map learning. Cogn Process 14:391–399
Richardson A (1977) Verbalizer-visualizer: a cognitive style dimension. J Mental Imagery 1(1):109–125
Riding RJ (2001) The nature and effects of cognitive style. In: Sternberg RJ, Zhang L (eds) Perspectives on thinking, learning, and cognitive styles. Erlbaum, Mahwah, pp 47–72


Author Index

For each author, references are given to the type of contribution, if (s)he is the first author, or to the first author, if (s)he is a
co-author. Within each type, contributions are ordered alphabetically.

Afsari Z. ? POSTER PRESENTATION Chuang L. L. ? POSTER PRESENTATIONS (2);


Albrecht R. ? POSTER PRESENTATIONS (2); ORAL PRESENTATION; Glatz, C.;
ORAL PRESENTATIONS (2) Symeonidou, E.; Scheer, M.
Alex-Ruf S. ? POSTER PRESENTATION Ciaunica A. ? ORAL PRESENTATION
Aschersleben G. ? SYMPOSIUM (Koester) Colombo M. ? SYMPOSIUM (Morgan)
Augurzky P. ? SYMPOSIUM (Brauner, Jager, Rolke) Coogan J. ? ORAL PRESENTATION; Blasing, B.
Bader M. ? ORAL PRESENTATIONS (2); Coyle D. ? POSTER PRESENTATION;
Ellsiepen, E. Limerick, H.
Bahnmueller J. ? SYMPOSIUM (Nuerk) Cremers A. B. ? ORAL PRESENTATION; Garcia, G.M.
Baier F. ? POSTER PRESENTATION; Damaskinos M. ? POSTER PRESENTATION
Hamburger, K. Daroczy G. ? SYMPOSIUM (Nuerk)
Baumann M. ? SYMPOSIUM (Baumann) de la Rosa S. ? SYMPOSIUM (de la Rosa); POSTER
Bech M. ? POSTER PRESENTATION; Michael, J. PRESENTATIONS (2); ORAL
Bekkering H. ? KEYNOTE LECTURE; SYMPOSIUM PRESENTATION Chang, D.;
(Koester) Hohmann, M.R.; Chang, D.
Bennati S. ? ORAL PRESENTATION; Rizzardi, E. de la Vega I. ? POSTER PRESENTATIONS (2);
Bergmann K. ? ORAL PRESENTATION Wolter, S.
Bernhart N. ? POSTER PRESENTATION; Schad, D. de Lange F. P. ? SYMPOSIUM (Koester)
Besold T. R. ? POSTER PRESENTATION, Demarchi G. ? POSTER PRESENTATION; Braun, S.
ORAL PRESENTATION Demberg V. ? SYMPOSIUM (Knoeferle, Burigo)
Bianco R. ? ORAL PRESENTATION Dittrich K. ? POSTER PRESENTATION;
Biondi J. ? ORAL PRESENTATION; Blasing, B. Scholtes, C.
Blascheck T. ? TUTORIAL (Raschke) Domahs U. ? ORAL PRESENTATION;
Blasing B. ? POSTER PRESENTATION; Kandylaki, K.
ORAL PRESENTATION; Seegelke, C. Dorner D. ? POSTER PRESENTATION;
Bogart K. ? ORAL PRESENTATION; Michael, J. Damaskinos, M.
Bohn K. ? ORAL PRESENTATION; Kandylaki, K. Dowker A. ? SYMPOSIUM (Nuerk)
Bott O. ? SYMPOSIUM (Brauner, Jager, Rolke) Dshemuchadse M. ? POSTER PRESENTATION; Frisch, S.
Brandenburg S. ? SYMPOSIUM (Baumann) Dudschig C. ? POSTER PRESENTATIONS (2);
Brandi M. ? SYMPOSIUM (Himmelbach) ORAL PRESENTATION;
Brauer R. R. ? POSTER PRESENTATION; de la Vega, I.; Wolter, S.; Lachmair, M.
Fischer, N.M. Egan F. ? SYMPOSIUM (Morgan)
Braun C. ? POSTER PRESENTATION Ehrenfeld S. ? ORAL PRESENTATION
Braun D. A. ? SYMPOSIUM (de la Rosa); POSTER Ehrsson H. H. ? KEYNOTE LECTURE
PRESENTATION; Leibfried, F. Ellsiepen E. ? ORAL PRESENTATION
Brauner C. ? SYMPOSIUM (Brauner, Jager, Rolke) Engelbrecht K. ? ORAL PRESENTATION;
Brock J. ? ORAL PRESENTATION; Caruana, N. Halbrugge, M.
Buchel C. ? POSTER PRESENTATION; Wache, S. Engelhardt P. E. ? SYMPOSIUM (Knoeferle, Burigo)
Bulthoff H. H. ? POSTER PRESENTATIONS (5); Fard P. R. ? POSTER PRESENTATION; Yahya, K.
ORAL PRESENTATION (2); Fengler A. ? POSTER PRESENTATION; Krause, C.
Chang, D; Glatz, C; Hohmann, M.R.; Fernandez L. B. ? SYMPOSIUM (Knoeferle, Burigo)
Meilinger, T.; Symeonidou, E.; Fernandez S.R. ? POSTER PRESENTATION;
Chang, D.; Scheer, M. ORAL PRESENTATIONS (2);
Burch M. ? TUTORIAL (Raschke) Lachmair, M.; Rolke, B.
Burigo M. ? SYMPOSIUM (Knoeferle, Burigo) Festl F. ? POSTER PRESENTATION;
Buschmeier H. ? POSTER PRESENTATION Seibold, V.C.
Butz M. V. ? POSTER PRESENTATION; ORAL Fischer M. H. ? SYMPOSIUM (Nuerk); POSTER
PRESENTATIONS (2); Lohmann, J.; PRESENTATION; Sixtus, E.
Ehrenfeld, S.; Schrodt, F. Fischer N. M. ? POSTER PRESENTATION
Caruana N. ? ORAL PRESENTATION Frankenstein J. ? POSTER PRESENTATION;
Chang D. ? POSTER PRESENTATION; Meilinger, T.
ORAL PRESENTATION Franzmeier I. ? ORAL PRESENTATION; Ragni, M.


Freksa C. ? ORAL PRESENTATION Holle H. ? SYMPOSIUM (Koester)


Frey J. ? POSTER PRESENTATION; Braun, C. Huber S. ? SYMPOSIUM (Nuerk); POSTER
Friederici A. D ? POSTER PRESENTATION; PRESENTATION; Radler, P.A.
ORAL PRESENTATION; Krause, C.; Huys Q. ? POSTER PRESENTATION; Schad, D.
Bianco, R. Jager G. ? SYMPOSIUM (Brauner, Jager, Rolke)
Friedrich C. K. ? POSTER PRESENTATION; Schild, U. Jakel F. ? TUTORIAL (Jakel)
Frintrop S. ? ORAL PRESENTATION; Garcia, G.M. Janczyk M. ? POSTER PRESENTATIONS;
Frisch S. ? POSTER PRESENTATION ORAL PRESENTATION; Groer, J.
Friston K. ? KEYNOTE LECTURE; SYMPOSIUM Javadi A. H. ? POSTER PRESENTATION; Schad, D.
(Morgan); POSTER PRESENTATION; Joeres F. ? TUTORIAL (Russwinkel, Prezenski,
Yahya, K. Joeres, Lindner, Halbrugge); POSTER
Fusaroli R. ? ORAL PRESENTATION; Michael, J. PRESENTATION
Garbusow M. ? POSTER PRESENTATION; Schad, D. Junger E. ? POSTER PRESENTATION; Schad, D.
Garcia G. M. ? ORAL PRESENTATION Kahl S. ? ORAL PRESENTATION;
Giese M. A. ? SYMPOSIUM (de la Rosa) Bergmann, K.
Giewein M. ? POSTER PRESENTATION; Kandylaki K. ? ORAL PRESENTATION
Albrecht, R. Karnath H. ? POSTER PRESENTATION; Rennig, J.
Glatz C. ? POSTER PRESENTATION Kathner D. ? SYMPOSIUM (Baumann)
Godde B. ? SYMPOSIUM (Koester) Kaul R. ? SYMPOSIUM (Baumann)
Goebel S. ? SYMPOSIUM (Nuerk) Kaup B. ? POSTER PRESENTATIONS (2);
Goldenberg G. ? SYMPOSIUM (Himmelbach) ORAL PRESENTATION;
Goltenboth N. ? POSTER PRESENTATION de la Vega, I.; Wolter, S.; Lachmair, M.
Gomez O. ? ORAL PRESENTATION Keller P. ? ORAL PRESENTATION; Bianco, R.
Goschke T. ? POSTER PRESENTATION; Frisch, S. Keyser J. ? POSTER PRESENTATION; Wache, S.
Grau-Moya J. ? POSTER PRESENTATION; Kircher T. ? ORAL PRESENTATION;
Leibfried, F. Kandylaki, K.
Gray W. D. ? KEYNOTE LECTURE Klauer K. C. ? POSTER PRESENTATION;
Grishkova I. ? POSTER PRESENTATION Scholtes, C.
Grosjean M. ? POSTER PRESENTATION Knoblich G. ? POSTER PRESENTATIONS (2);
Groer J. ? POSTER PRESENTATION Vesper, C.; Wolf, T.
Grosz P. ? SYMPOSIUM (Brauner, Jager, Rolke) Knoeferle P. ? SYMPOSIUM (Knoeferle, Burigo)
Gunter T. ? SYMPOSIUM (Koester) Koester D. ? SYMPOSIUM (Koester); POSTER
Guss C. D. ? POSTER PRESENTATIONS (2); PRESENTATION; Seegelke, C.
Damaskinos, M.; Goltenboth, N. Konig P. ? POSTER PRESENTATIONS (2);
Halbrugge M. ? SYMPOSIUM (Russwinkel, Prezenski, Afsari, Z.; Wache, S.
Lindner); TUTORIAL (Russwinkel, Konig S. U. ? POSTER PRESENTATION; Wache, S.
Prezenski, Joeres, Lindner, Halbrugge); Kopp S. ? POSTER PRESENTATIONS (2);
ORAL PRESENTATION ORAL PRESENTATION; Buschmeier,
Halfmann M. ? POSTER PRESENTATION; H.; Grishkova, I.; Bergmann, K.
Hardiess, G. Kotowski S. ? POSTER PRESENTATION
Hamburger K. ? POSTER PRESENTATIONS (2); Krause C. ? POSTER PRESENTATION
ORAL PRESENTATION (2); Roser, F.; Kroczek L. ? SYMPOSIUM (Koester)
Strickrodt, M.; Wedell, F. Kruegger J. ? ORAL PRESENTATION; Michael, J.
Hardiess G. ? POSTER PRESENTATIONS (2); Kuhl D. ? SYMPOSIUM (Baumann)
ORAL PRESENTATION; Schick, W.; Kunde W. ? ORAL PRESENTATION; Janczyk, M.
Mallot, H.A. Kurzhals K. ? TUTORIAL (Raschke)
Hartl H. ? POSTER PRESENTATION; Kutscheidt K. ? POSTER PRESENTATION
Kotowski, S. Lachmair M. ? POSTER PRESENTATIONS; ORAL
Heege L. ? POSTER PRESENTATION PRESENTATION; Fernandez, S.R.
Hein E. ? POSTER PRESENTATION; Lancier S. ? POSTER PRESENTATION
Kutscheidt, K. Lappe M. ? POSTER PRESENTATION;
Heinz A. ? POSTER PRESENTATION; Schad, D. Masselink, J.
Hellbernd N. ? POSTER PRESENTATION Le Bigot M. ? POSTER PRESENTATION;
Henning A. ? SYMPOSIUM (Koester) Grosjean, M.
Herbort O. ? SYMPOSIUM (Koester) Leibfried F. ? POSTER PRESENTATION
Hermsdorfter J. ? SYMPOSIUM (Himmelbach) Limerick H. ? POSTER PRESENTATION
Hesse C. ? SYMPOSIUM (Himmelbach) Lindemann O. ? POSTER PRESENTATION; Sixtus, E.
Himmelbach M. ? SYMPOSIUM (Himmelbach); POSTER Lindner A. ? SYMPOSIUM (Morgan); POSTER
PRESENTATION; Rennig, J. PRESENTATION; Kutscheidt, K.
Hinterecker T. ? ORAL PRESENTATION; Lindner N. ? ORAL PRESENTATION
Strickrodt, M. Lindner S. ? SYMPOSIUM (Russwinkel, Prezenski,
Hofmeister J. ? POSTER PRESENTATION; Lancier, S. Lindner); TUTORIAL (Russwinkel,
Hohmann M. R. ? POSTER PRESENTATION Prezenski, Joeres, Lindner, Halbrugge)


Lingnau A. → SYMPOSIUM (Himmelbach)
Lloyd D. → SYMPOSIUM (Nuerk)
Lohmann J. → POSTER PRESENTATION
Lopez J. J. R. → ORAL PRESENTATION; Rolke, B.
Ludmann M. → ORAL PRESENTATION
Lutsevich A. → POSTER PRESENTATION; Damaskinos, M.
Maier S. → ORAL PRESENTATION; Ragni, M.
Mallot H. A. → POSTER PRESENTATIONS (3); ORAL PRESENTATION; Hardiess, G.; Lancier, S.; Schick, W.
Marmolejo-Ramos F. → POSTER PRESENTATION; Vaci, N.
Masselink J. → POSTER PRESENTATION
Matthews R. → SYMPOSIUM (Morgan)
McRae K. → POSTER PRESENTATION; Rabovsky, M.
Meilinger T. → POSTER PRESENTATION
Meurers D. → SYMPOSIUM (Nuerk)
Michael J. → ORAL PRESENTATION
Milin P. → POSTER PRESENTATION; Vaci, N.
Moeller K. → SYMPOSIUM (Nuerk); POSTER PRESENTATION; Radler, P.A.
Mohler B. J. → POSTER PRESENTATION; Meilinger, T.
Monittola G. → POSTER PRESENTATION; Braun, C.
Moore J. → POSTER PRESENTATION; Limerick, H.
Morgan A. → SYMPOSIUM (Morgan)
Muckli L. → SYMPOSIUM (Morgan)
Muller R. → POSTER PRESENTATION
Myachykov A. → SYMPOSIUM (Knoeferle, Burigo)
Nagels A. → ORAL PRESENTATION; Kandylaki, K.
Neumann H. → TUTORIAL (Neumann); ORAL PRESENTATION; Gomez, O.
Newen A. → POSTER PRESENTATION; Heege, L.
Novembre G. → ORAL PRESENTATION; Bianco, R.
Nuerk H. → SYMPOSIUM (Nuerk)
Obrig H. → POSTER PRESENTATION; Krause, C.
Olivari M. → POSTER PRESENTATION; Symeonidou, E.
Ondobaka S. → SYMPOSIUM (Koester)
Ossandon J. → POSTER PRESENTATION; Afsari, Z.
Ostergaard J. R. → ORAL PRESENTATION; Michael, J.
Patel-Grosz P. → SYMPOSIUM (Brauner, Jager, Rolke)
Pfeiffer T. → POSTER PRESENTATIONS; ORAL PRESENTATION; Renner, P.
Pfeiffer-Lessmann N. → ORAL PRESENTATION; Pfeiffer, T.
Pfister R. → SYMPOSIUM (Koester)
Pfluger H. → TUTORIAL (Raschke)
Pixner S. → POSTER PRESENTATION; Radler, P.A.
Pliushch I. → POSTER PRESENTATION
Popov T. → POSTER PRESENTATION; Braun, C.
Prezenski S. → SYMPOSIUM (Russwinkel, Prezenski, Lindner); TUTORIAL (Russwinkel, Prezenski, Joeres, Lindner, Halbrugge)
Rabovsky M. → POSTER PRESENTATION
Radanovic J. → POSTER PRESENTATION; Vaci, N.
Radler P. A. → POSTER PRESENTATION
Ragni M. → POSTER PRESENTATIONS; ORAL PRESENTATION (3); Albrecht, R.; Rizzardi, E.; Steinlein, E.
Rahona J. J. → ORAL PRESENTATION; Fernandez, S.R.
Rapp M. A. → POSTER PRESENTATION; Schad, D.
Raschke M. → TUTORIAL (Raschke)
Rebuschat P. → POSTER PRESENTATION
Renner P. → POSTER PRESENTATIONS; ORAL PRESENTATION; Pfeiffer, T.
Rennig J. → POSTER PRESENTATION
Rizzardi E. → ORAL PRESENTATION
Roberts M. → SYMPOSIUM (Nuerk)
Rohrich W. G. → ORAL PRESENTATION; Mallot, H.A.
Rolke B. → SYMPOSIUM (Brauner, Jager, Rolke); POSTER PRESENTATION; ORAL PRESENTATION; Seibold, V.C.
Romoli J. → SYMPOSIUM (Brauner, Jager, Rolke)
Roser F. → POSTER PRESENTATIONS (2); ORAL PRESENTATIONS (2); Hamburger, K.; Strickrodt, M.; Wedell, F.
Roth M. J. → POSTER PRESENTATION; Kutscheidt, K.
Ruiz S. → POSTER PRESENTATION; Rebuschat, P.
Russwinkel N. → SYMPOSIUM (Russwinkel, Prezenski, Lindner); TUTORIAL (Russwinkel, Prezenski, Joeres, Lindner, Halbrugge); ORAL PRESENTATION; Joeres, F.
Safra L. → POSTER PRESENTATION; Vesper, C.
Sammler D. → POSTER PRESENTATION; ORAL PRESENTATION; Bianco, R.; Hellbernd, N.
Sandamirskaya Y. → TUTORIAL (Sandamirskaya, Schneegans)
Schack T. → POSTER PRESENTATION; ORAL PRESENTATION; Blasing, B.; Seegelke, C.
Schad D. → POSTER PRESENTATIONS (2); Rabovsky, M.
Scheer M. → ORAL PRESENTATION
Schenk T. → SYMPOSIUM (Himmelbach)
Scherbaum S. → POSTER PRESENTATION; Frisch, S.
Schick W. → POSTER PRESENTATION
Schiltz C. → SYMPOSIUM (Nuerk)
Schmid U. → POSTER PRESENTATION; Damaskinos, M.
Schmitz L. → POSTER PRESENTATION; Vesper, C.
Schneegans S. → TUTORIAL (Sandamirskaya, Schneegans)
Scholtes C. → POSTER PRESENTATION
Schoner G. → KEYNOTE LECTURE
Schrodt F. → ORAL PRESENTATION
Schulz M. → SYMPOSIUM (Russwinkel, Prezenski, Lindner)
Schumacher P. → SYMPOSIUM (Brauner, Jager, Rolke)
Schumann F. → POSTER PRESENTATION; Wache, S.
Sebanz N. → KEYNOTE LECTURE; POSTER PRESENTATIONS (2); Vesper, C.; Wolf, T.
Sebold M. → POSTER PRESENTATION; Schad, D.
Seegelke C. → POSTER PRESENTATION
Sehm B. → POSTER PRESENTATION; Krause, C.
Seibold V. C. → POSTER PRESENTATION; ORAL PRESENTATION; Rolke, B.
Shaki S. → SYMPOSIUM (Nuerk)
Simmel L. → ORAL PRESENTATION; Blasing, B.
Sixtus E. → POSTER PRESENTATION
Smolka M. → POSTER PRESENTATION; Schad, D.
Soltanlou M. → SYMPOSIUM (Nuerk)
Sorg C. → SYMPOSIUM (Himmelbach)
Spiegel M. A. → POSTER PRESENTATION; Seegelke, C.
Steffenhagen F. → POSTER PRESENTATION; Albrecht, R.
Stein S. C. → POSTER PRESENTATION; Sutterlutti, R.
Steinlein E. → ORAL PRESENTATION
Sternefeld W. → SYMPOSIUM (Brauner, Jager, Rolke)
Strickrodt M. → ORAL PRESENTATION
Sutterlutti R. → POSTER PRESENTATION
Symeonidou E. → POSTER PRESENTATION
Szucs D. → SYMPOSIUM (Nuerk)
Tamosiunaite M. → POSTER PRESENTATION; Sutterlutti, R.
Teickner C. → POSTER PRESENTATION; Schick, W.
Thuring M. → SYMPOSIUM (Baumann)
Tillas A. → ORAL PRESENTATION
Trillmich C. M. → POSTER PRESENTATION; Hamburger, K.
Tuason M. T. → POSTER PRESENTATION; Goltenboth, N.
Tylen K. → ORAL PRESENTATION; Michael, J.
Tzelgov J. → SYMPOSIUM (Nuerk)
Ugen S. → SYMPOSIUM (Nuerk)
Ulrich R. → SYMPOSIUM (Brauner, Jager, Rolke)
Unger M. → POSTER PRESENTATION; Fischer, N.M.
Vaci N. → POSTER PRESENTATION
van Leeuwen C. → ORAL PRESENTATION
Van Rinsveld A. → SYMPOSIUM (Nuerk)
Vesper C. → POSTER PRESENTATIONS (2); Wolf, T.
Villringer A. → ORAL PRESENTATION; Bianco, R.
Voelcker-Rehage C. → SYMPOSIUM (Koester)
Vogeley K. → SYMPOSIUM (de la Rosa)
von Sydow M. → ORAL PRESENTATION
Vorwerg C. → POSTER PRESENTATION; Grishkova, I.
Vosgerau G. → ORAL PRESENTATION; Lindner, N.
Wache S. → POSTER PRESENTATION
Wachsmuth S. → POSTER PRESENTATION; Renner, P.
Weber L. → SYMPOSIUM (Baumann)
Wedell F. → ORAL PRESENTATION
Weigelt M. → SYMPOSIUM (Koester)
Weiss-Blankenhorn P. H. → SYMPOSIUM (Himmelbach)
Weisz N. → POSTER PRESENTATION; Braun, C.
Wenczel F. → ORAL PRESENTATION; Ragni, M.
Westphal B. → POSTER PRESENTATIONS (1); ORAL PRESENTATION (2); Albrecht, R.; Albrecht, R.; Albrecht, R.
Wiese R. → ORAL PRESENTATION; Kandylaki, K.
Wiese W. → POSTER PRESENTATION; Pliushch, I.
Wirzberger M. → SYMPOSIUM (Russwinkel, Prezenski, Lindner)
Wittmann M. → SYMPOSIUM (Koester)
Wohlschlager A. → SYMPOSIUM (Himmelbach)
Wolbers T. → POSTER PRESENTATION; Wache, S.
Wolf C. → POSTER PRESENTATION; Hamburger, K.
Wolf T. → POSTER PRESENTATION
Wolska M. → SYMPOSIUM (Nuerk)
Wolter S. → POSTER PRESENTATION
Wong H. Y. → SYMPOSIUM (de la Rosa)
Woolgar A. → ORAL PRESENTATION; Caruana, N.
Worgotter F. → POSTER PRESENTATION; Sutterlutti, R.
Wortelen B. → SYMPOSIUM (Baumann)
Wuhle A. → POSTER PRESENTATION; Braun, C.
Wunsch K. → SYMPOSIUM (Koester)
Yaghoubzadeh R. → POSTER PRESENTATION; Grishkova, I.
Yahya K. → POSTER PRESENTATION
Zimmermann U. S. → POSTER PRESENTATION; Schad, D.
Zohar-Shai B. → SYMPOSIUM (Nuerk)