Crossmodal information for visual and haptic discrimination
Flip Phillips (a) and Eric J. L. Egan (b)
(a) Skidmore College, Saratoga Springs, NY, USA;
(b) The Ohio State University, Columbus, OH, USA
ABSTRACT
Both our visual and haptic systems contribute to the perception of the three-dimensional world, especially the
proximal perception of objects. The interaction of these systems has been the subject of some debate over the
years, ranging from the philosophically posed Molyneux problem to the more pragmatic examination of their
psychophysical relationship. To better understand the nature of this interaction we have performed a variety of
experiments characterizing the detection, discrimination, and production of 3D shape. A stimulus set of 25 complex,
natural appearing, noisy 3D target objects was statistically specified in the Fourier domain and manufactured
using a 3D printer. A series of paired-comparison experiments examined subjects' unimodal (visual-visual and
haptic-haptic) and crossmodal (visual-haptic) perceptual abilities. Additionally, subjects sculpted objects using
uni- or crossmodal source information. In all experiments, performance in the unimodal conditions was
similar to one another and unimodal presentation fared better than crossmodal. Also, the spatial frequency of
object features affected performance differentially across the range used in this experiment. The sculpted objects
were scanned in 3D and the resulting geometry was compared metrically and statistically to the original stimuli.
Objects with higher spatial frequency were harder to sculpt when limited to haptic input compared to only visual
input. The opposite was found for objects with low spatial frequency. The psychophysical discrimination and
comparison experiments yielded similar findings. There is a marked performance difference between the visual
and haptic systems and these differences were systematically distributed along the range of feature details. The
existence of non-universal (i.e., modality-specific) representations explains the poor crossmodal performance. Our
current findings suggest that haptic and visual information is either integrated into a multi-modal form, or each is
independent and somewhat efficient translation is possible. Vision shows a distinct advantage when dealing with
higher frequency objects but both modalities are effective when comparing objects that differ by a large amount.
Keywords: vision, haptics, touch, crossmodal, perception, shape, 3D shape, psychophysics
1. INTRODUCTION
The problem of visual-haptic sensory integration is posed in the notorious philosophical puzzle known as the
Molyneux problem. In 1690, William Molyneux wrote to John Locke, asking:
Suppose a Man born blind, and now adult, and taught by his touch to distinguish between a Cube,
and a Sphere of the same metal, and nighly of the same bigness, so as to tell, when he felt one and
t'other; which is the Cube, which the Sphere. Suppose then the Cube and Sphere placed on a Table,
and the Blind Man to be made to see. Quaere, Whether by his sight, before he touch'd them, he could
now distinguish, and tell, which is the Globe, which the Cube.1,2
Perhaps the earliest empirical data probing this question came from anatomist William Cheselden, who
published a patient's visual observations following surgery to remove cataracts that had been present since
birth.3 After surgery the young man was able to discriminate certain degrees of 'lightness' and color
but was unable to distinguish shapes whose characteristics had presumably been learned via touch. To
many, this put the lid on the question, but several criticisms of the method employed and of the generalizability of
case study results have kept this question relevant even today. Recently, similar case studies by Gregory4 and Fine
et al.5 examined other restored-vision patients and found that, contrary to the previous conclusions by Cheselden
and others, there was a non-trivial ability to understand the visual world and the objects contained therein.
Further author information: Send correspondence to FP or EJLE
FP: Email: flip@skidmore.edu, Internet: http://www.flipphillips.com, Telephone: 1 518 580 5313
EJLE: Email: egan.51@osu.edu, Telephone: 1 614 292 9535
The manner in which multiple sensory systems integrate perceptual information is still debated.6 Evidence
exists for sensory-specific and multisensory areas used to process 3D shape information. Early experiments
using single-cell recording discovered both kinds of areas in macaque monkeys, proximal to both visual and
somatosensory cortex.7,8 Specific multisensory cortical regions include the ventral intraparietal, ventral premotor,
and posterior temporal areas as well as subcortical regions. The adjacent location of these regions in relation to
the sensory-specific regions supports the hypothesis that perceptual information moves in a hierarchically linear
fashion. Recently, imaging techniques have shown multisensory areas consistent with data collected from single-cell
recording.9
A patient who suffered visual agnosia due to a lesion to the left occipito-temporal cortex was later found to
have tactile agnosia as well.10 The patient still had a fully intact somatosensory cortex and sensation. Another
patient with a lesion to the lateral occipital cortex was unable to learn the shapes of unfamiliar objects.11 These
two patients' pathological perception supports the concept of a multisensory integration of 3D shape information.
However, a third patient with a lesion to the ventrolateral somatosensory cortex suffered from only tactile agnosia.
This suggests that haptic representations take form before the information reaches the higher level visual or
multisensory areas.
Several studies have shown that regions traditionally labeled as sensory-specific may in fact be involved in
multisensory perception. For example, the visual cortex appears to be involved in object perception regardless of
perceptual modality. Sathian12 used positron emission tomography to examine visual cortex activation during
haptic perception. The stimuli consisted of raised 2D grids with varying spacings and orientations. It was found
that visual cortex activation occurred but was task dependent, since it only appeared for the orientation task. It
was later found that transcranial magnetic stimulation to the parieto-occipital cortex can significantly affect the
ability to perform the haptic orientation task.13 Similar imaging studies have found visual area activation for
motion14 and object form.15
It has been widely shown that the two systems do not necessarily perceive objects congruently. Unimodal
(visual-visual, haptic-haptic) and crossmodal (visual-haptic) paired-comparison discrimination tasks using natural
3D shapes have demonstrated the visual system's overall superior ability. Norman et al.16,17 used a stimulus
set that encompassed only a single object category. Their natural bell pepper stimuli could be considered more
'global' in the sense that high-frequency information is of less importance to the overall characterization of the 3D
shape. Other studies have shown the haptic system has a propensity to perform better on local compared to
global shape perception, while the opposite is true for vision.18-20 This suggests that the information content of
the stimuli used in a particular experiment can matter greatly.
Furthermore, it can be shown that it is relatively easy to convince the two systems that they are being presented
with the same stimulus when, in fact, they are not. Helbig21 examined the perception of elliptical shapes of varying
dimensions, presented either uni- or bimodally. Participants were deceived into thinking they were perceiving a
single stimulus when, in fact, they were looking at and touching two different stimuli. Bimodal perception was
more accurate than unimodal perception, which suggests our mental representations are summations of unimodal
perceptions, combined in some statistically optimal manner.
In the following experiments, we set out to better understand the nature of the information necessary to
accurately discriminate and reproduce 3D shape uni- and crossmodally. We investigate whether information is
'transferable' between senses and, if so, what kind of information can be shared and to what
extent. We begin by examining the utility of various amounts of visual information by way of a simple
psychophysical discrimination task and a more complex object production task.
2. EXPERIMENT ONE: VISION
To investigate the relationships between haptic and visual information it is important to establish the baseline
performance of each modality by considering the senses separately.
2.1 The Method
In this experiment, subjects performed a simple discrimination using a two-alternative forced choice (2AFC)
paradigm. Two stimuli were presented visually and the subjects were instructed to judge whether the presented objects'
shapes were geometrically the same or different.
Figure 1. The entire stimulus set consisting of 25 globally convex, natural appearing, noisy objects that are ordered by
spatial frequency. A video of stimulus 13 is available here: http://dx.doi.org/doi.number.goes.here
2.2 The Stimuli
The full set of stimuli used in this and subsequent experiments is shown in Figure 1. The set consists of 25 globally
convex, natural appearing, noisy objects created using the techniques outlined in our previous work.22 Basically,
these objects are spheres subjected to a series of pseudo-random transformations at various scales. The objects'
global shapes differ systematically along two dimensions: spatial frequency and amplitude. Variations of
these two shape characteristics subsequently give rise to objects that differ in overall visual 'complexity'. The
final set is ordered by spatial frequency, i.e., roughly the number of bumps and dimples per object, labeled from
one to twenty-five starting with the lowest spatial frequency. Objects' amplitudes, i.e., the height and depth of
the bumps and dimples, decrease as frequency increases in order to normalize their overall diameters.
The underlying noise used to generate the objects was of a constant 'shape' for all stimuli: as frequency was
increased the sampling range of the noise signal was increased. This means that the objects share an underlying
similarity due to the noise constancy but that similarity occurs at different scales across the object range. For
example, note the similarity of objects 11 & 12, 14 & 15, and others in Figure 1.
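The idea of one fixed noise 'shape' sampled over a wider range as frequency rises can be illustrated with a minimal sketch. This is not the authors' generation procedure (which is described in their previous work); the sum-of-sinusoids noise, the function names, and the specific frequency and amplitude ranges are all assumptions chosen only so that the result stays inside a 55 mm bounding box.

```python
import math
import random


def make_noise(n_waves=16, seed=0):
    """Fixed random noise 'shape' (hypothetical): directions and phases shared by all stimuli."""
    rng = random.Random(seed)
    waves = []
    for _ in range(n_waves):
        # Random unit direction from normalised Gaussian samples.
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        waves.append(([c / norm for c in d], rng.uniform(0.0, 2.0 * math.pi)))
    return waves


def radius(direction, waves, frequency, amplitude, base_radius=25.0):
    """Radius (mm) of the perturbed sphere along a unit direction."""
    total = 0.0
    for d, phase in waves:
        dot = sum(a * b for a, b in zip(d, direction))
        # A larger frequency samples the same noise over a wider range.
        total += math.sin(frequency * dot + phase)
    # Averaging keeps the perturbation within [-amplitude, +amplitude].
    return base_radius + amplitude * total / len(waves)


def stimulus_parameters(index, n_objects=25):
    """Assumed mapping from object number (1..25) to frequency and amplitude."""
    t = (index - 1) / (n_objects - 1)
    frequency = 1.0 + 9.0 * t          # rises with object number
    amplitude = 2.5 / (1.0 + 2.0 * t)  # falls as frequency rises
    return frequency, amplitude
```

With a 25 mm base radius and a maximum amplitude of 2.5 mm, no radius exceeds 27.5 mm, so every object in this sketch fits a 55 mm box; evaluating `radius` over a dense set of directions would give the vertices of a triangle mesh.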
The resulting objects have an average diameter of ~50 mm, all fit within a 55 × 55 × 55 mm bounding box, and
range from relatively smooth and free of notable inflections to a maximum of ca. 10-12 bumps per object.
These objects share some degree of similarity with the objects used by Norman et al.16,17 with our objects
extending the range of structural information available. Furthermore, our objects have real metric steps between
them whereas the bell peppers used by Norman et al. are not representative of any continuum.
The stimuli were densely sampled at ca. 20,000 triangles/object, painted a uniform, saturated blue and
rendered with OpenGL using smooth shading. Illumination was provided by a single source located at the front,
upper right quadrant with respect to the viewer. Traditional, simple specular shading was employed with the
surfaces' shading coefficients $K_a = 0.01$, $K_d = 0.75$, $K_s = 0.24$ for the ambient, diffuse, and specular components,
respectively.
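To make the role of the three coefficients concrete, the following sketch evaluates a Phong-style reflection model at a single surface point. The text says only that 'traditional, simple specular shading' was used, so the choice of the Phong model, the function name `phong_intensity`, and the `shininess` exponent are assumptions; only the values 0.01, 0.75, and 0.24 come from the text.

```python
import math

# Ambient, diffuse, and specular coefficients as given in the text.
K_A, K_D, K_S = 0.01, 0.75, 0.24


def _normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_intensity(normal, to_light, to_viewer, shininess=20.0):
    """Relative intensity at a surface point (shininess is an assumed value)."""
    n = _normalize(normal)
    l = _normalize(to_light)
    v = _normalize(to_viewer)
    n_dot_l = _dot(n, l)
    diffuse = max(0.0, n_dot_l)
    # Mirror reflection of the light direction about the normal.
    r = [2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l)]
    # No highlight on surfaces facing away from the light.
    specular = max(0.0, _dot(r, v)) ** shininess if n_dot_l > 0.0 else 0.0
    return K_A + K_D * diffuse + K_S * specular
```

Because the three coefficients sum to 1.0, a surface patch facing both the light and the viewer head-on reaches an intensity of 1.0 in this sketch, while a patch facing away from the light receives only the ambient term.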
2.3 The Procedure
Subjects were seated at a viewing distance of 57 cm (1° = 1 cm) from a standard, color- and luminance-calibrated
LCD monitor. The stimuli were scaled such that their projection to the screen was consistent with that of a
5.5 × 5.5 × 5.5 cm object at that viewing distance. In order to enhance the perceived depth of the presented stimuli an eye
patch was used to ensure monocular viewing. Subjects were free to pick their preferred eye. Finally, no chinrest
was used, allowing relatively free viewing, but subjects were instructed to remain relatively still throughout the
experiment.
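The 57 cm viewing distance is convenient because 1 cm on the screen then subtends very nearly 1 degree of visual angle. A short check of that relationship, using the standard visual-angle formula (the function name is ours):

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


one_cm = visual_angle_deg(1.0, 57.0)       # very close to 1 degree
bounding_box = visual_angle_deg(5.5, 57.0)  # extent of a 55 mm bounding box
```

By the same formula, an object spanning the full 55 mm bounding box subtends roughly 5.5 degrees at this distance.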
On each experimental trial, two stimuli were presented on the monitor, each rotating at a speed of 66 ± 20°/sec
(Gaussian distributed) about an arbitrary axis that was constrained to pass through the object's center of gravity.
Subjects used a keypad to indicate whether the stimuli were geometrically the 'same' or 'different'. There was no time
limit placed on the comparison and subjects were instructed to work with the emphasis on precision rather than
speed. On average, subjects took approximately 5 ± 2 sec per comparison. On each trial there was an equal
probability of receiving a 'same' pairing as a 'different' one. Since each stimulus was rotated about an arbitrary axis
and had a different rotation rate, subjects could not simply compare the stimuli's image information to make
their judgements.
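One way to draw the per-stimulus rotation on each trial is sketched below. It reads the rotation speed as Gaussian with a mean of 66°/sec and a standard deviation of 20°/sec, and it assumes the 'arbitrary' axis is uniformly distributed over directions; neither the interpretation of the spread nor the function name `sample_rotation` is confirmed by the text.

```python
import math
import random


def sample_rotation(rng, mean_speed=66.0, sd_speed=20.0):
    """Return (unit axis, speed in deg/sec) for one stimulus on one trial."""
    # Normalised Gaussian samples give a direction uniform on the sphere.
    d = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    axis = tuple(c / norm for c in d)
    # Assumed: the quoted spread is the standard deviation of the speed.
    speed = rng.gauss(mean_speed, sd_speed)
    return axis, speed


rng = random.Random(1)
left_axis, left_speed = sample_rotation(rng)
right_axis, right_speed = sample_rotation(rng)
```

Drawing the two stimuli independently means that even a 'same' pair is seen from different, continuously changing viewpoints, so the judgement has to rest on 3D shape rather than on matching images.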
Each subject performed two sessions of 500 trials for a total of 13,000 comparisons across all subjects. Due
to the large number of possible pairings and a desire to cover the entire range with multiple trials per pairing,
a random subset of pairings was chosen for each subject at each session. This method resulted in about 30
judgements per possible pairing condition.
2.4 The Participants
There were a total of thirteen subjects. The majority (9 subjects) were undergraduate students at Skidmore
College and the remainder consisted of the authors and colleagues. With the exception of the authors, all had
never seen the objects before and were naive with regard to the purposes of the experiment. All of the subjects
possessed normal, or corrected-to-normal, visual acuity.
2.5 The Results
Figure 2 presents the results of Experiment 1. Our first analysis computes the $d'$ for each object. As is shown, the $d'$ values
for all stimuli are quite good, ranging from $d' = 2.1$ to 3.1. The mean $d'$ across all stimuli is $\bar{x} = 2.53$ with a
standard error of $s_{\bar{x}} = 0.1$. This level of performance clearly demonstrates that the objects are easily discriminable
from each other. The pattern of $d'$ as a function of frequency, however, does not show any clear linear relationship
($r^2 \approx 0.0$) though there are local maxima of discriminability around the lowest, highest, and mid-frequency
objects. (Recall that objects 1-25 increase in frequency as the object number increases.) For the low frequency
case it might be assumed that subtle changes are easily discriminable due to the relative smoothness at that level.
Similarly, for the high frequency objects there may be individual diagnostic locations whose presence or absence is
easy to detect. The middle scale objects are slightly more difficult to interpret and this may be based on
the absence of both diagnostic and smooth locations.
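For readers unfamiliar with the sensitivity measure, a minimal sketch of a d-prime calculation from response counts follows. The text does not state which detection model was used; this sketch assumes the simple equal-variance form (z of the hit rate minus z of the false-alarm rate), treats a 'different' response to a different pair as a hit, and applies a log-linear correction of our choosing. Same-different designs are sometimes analysed with other models.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Equal-variance d' from response counts (an assumed analysis, not the authors')."""
    # Log-linear correction keeps both rates strictly between 0 and 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

Chance performance (equal hit and false-alarm rates) gives a value of zero, and the value grows as 'different' pairs are detected more often than 'same' pairs are mistaken for different.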
[Figure 2 plots: $d'$ versus object (left); object versus object (right).]
Figure 2. Results from the unimodal visual comparison condition. The graph on the left side shows the discriminability of
each stimulus. Recall that the spatial frequency of surface features increases as the object number increases. There is no
meaningful linear relationship between $d'$ and frequency but there are local maxima at the extremes and in the middle
of the range. The right-hand graph shows the relative confusability between two stimuli. Darker squares indicate more
frequent confusion between the two indicated objects.
To illustrate the relative discriminability between two given stimuli, a second analysis based on the relative
confusability of two given objects was performed. The right side of Figure 2 shows the result of this analysis.
Each location of the matrix indicates the frequency with which two given stimuli were confused with each other, i.e.,
the frequency with which two 'different' stimuli elicited 'same' responses. Darker locations indicate higher frequency.
Since the two stimuli were presented simultaneously in the same interval, only the lower triangle of the matrix is shown.
For example, the comparison between stimuli 7 & 8 is the same as that between 8 & 7.
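The confusability analysis described above can be sketched as a simple tally. The trial format (a tuple of two object numbers and a response string) and the function name `confusion_matrix` are hypothetical; the logic follows the description: only 'different' pairs count, order within a pair is ignored, and each cell holds the proportion of 'same' responses.

```python
def confusion_matrix(trials, n_objects=25):
    """Lower-triangular matrix of P('same' response | two different objects).

    trials: iterable of (i, j, response), with object numbers 1..n_objects
    and response either 'same' or 'different' (assumed format).
    """
    same = [[0] * n_objects for _ in range(n_objects)]
    total = [[0] * n_objects for _ in range(n_objects)]
    for i, j, response in trials:
        if i == j:
            continue  # identical objects cannot be 'confused'
        # Fold (i, j) and (j, i) into the same lower-triangle cell.
        row, col = max(i, j) - 1, min(i, j) - 1
        total[row][col] += 1
        if response == "same":
            same[row][col] += 1
    return [
        [same[r][c] / total[r][c] if total[r][c] else None
         for c in range(n_objects)]
        for r in range(n_objects)
    ]
```

Cells with no trials, including the whole upper triangle and the diagonal, are left as `None` rather than zero so that 'never presented' is not mistaken for 'never confused'.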
There are several things to notice. First, there is a narrow band where the stimuli are confused. For example,
stimulus 5 is more frequently confused with stimuli nearby (like 6 and 7) than with stimuli further away (like 24
and 25). This is consistent with the overall high $d'$ findings. Second, there are three main clusters of confusability:
a large group between stimuli 1-10, a second large group from about 18-25 and a third, smaller group from
about 12-16. Finally, within these groups there are some stimulus pairs that are clearly more frequently confused.
For example, 5 & 6, 10 & 11, and 23 & 25 are often mistaken for each other.
3. EXPERIMENT TWO: HAPTIC
We proceed by performing an experiment similar to Experiment 1 using touch instead of vision. Using the same
set of objects, realized as sculptures, we determine discriminability and confusion as in the previous experiment.
3.1 The Method
This experiment was similar to Experiment 1 in that subjects performed a simple discrimination using a
two-alternative forced choice (2AFC) paradigm. In this experiment, two stimuli were presented haptically, without
visual information, and the subjects were instructed to judge whether the presented stimuli's shapes were geometrically
the same or different.
3.2 The Stimuli
The stimuli consisted of the same 25 objects used in Experiment 1. Each object was printed in plastic using a
Dimension rapid prototyping three-dimensional printer (Stratasys, Inc.). Molds were taken from each and plaster
casts were made to facilitate rapid reproduction of the stimulus set. Each object fit within a 55 × 55 × 55 mm
bounding box. This size was chosen to allow the objects to be easily manipulated by hand, without requiring the
traversal of large surface areas. Thus, through active touch, aspects of the global shape could be derived all at
once, if desired, rather than in a piecewise fashion.
The resulting cast objects were painted a saturated blue, similar to that used on the computer-presented
stimuli from the previous experiment. Of course, in this experiment the color and shading characteristics were of
no importance since the task was completed totally by touch, but the paint provided a smooth even surface on
the stimuli to prevent intentional or inadvertent use of non-global features in the discrimination task.
3.3 The Procedure
Subjects were seated at a table and instructed to adjust their chair such that they were situated at a comfortable
height. A wooden shelf and black cloth served to block the view of the stimuli from the subject. The subject
placed their hands under the shelf, through the black cloth for the duration of the experiment. The experimenter
sat across from the subject with two sets of objects laid out in trays. Each object was marked with a UV sensitive
ink, indicating the stimulus number. A UV light, available only to the experimenter, was used to ensure the
correct stimuli were presented on each trial.
A computer program prepared a pseudo-random set of trial pairings. On each experimental trial, two stimuli
were presented simultaneously to the subject by the experimenter. Subjects used a keypad to indicate whether the
stimuli were geometrically the 'same' or 'different'. A 'same' pairing and a 'different' pairing were equally
probable on any given trial. There was no time limit placed on the comparison and subjects were instructed
to work with the emphasis on precision rather than speed. Subjects were free to manipulate the stimuli in any
way with either hand, serially or simultaneously.
Pilot experiments showed that each trial was somewhat lengthy, ca. 10-15 sec each. Furthermore, results
from Experiment 1 demonstrated that comparisons between two stimuli that differ by even a moderate amount
were trivial. Therefore, we reduced both the number of comparisons per experimental block and the range
of frequencies used when comparing. On each 'different' trial, the stimuli differed by a maximum of 5 frequency
steps. Each subject performed two sessions of 200 trials for a total of 2,400 comparisons across all subjects.
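A trial list with the constraints described above (200 trials per session, 'same' and 'different' equally likely, 'different' pairs no more than 5 frequency steps apart) could be generated as follows. This is a sketch under stated assumptions: the original program is not described in detail, so the function name `make_trials`, the exact 100/100 balancing, and the uniform choice of stimuli are illustrative choices rather than the authors' procedure.

```python
import random

def make_trials(n_trials=200, n_stimuli=25, max_step=5, seed=None):
    """Return a shuffled list of (stim_a, stim_b, kind) trial pairings."""
    rng = random.Random(seed)
    half = n_trials // 2
    # Assumption: balance 'same' and 'different' exactly, then shuffle the order
    kinds = ["same"] * half + ["different"] * (n_trials - half)
    rng.shuffle(kinds)
    trials = []
    for kind in kinds:
        a = rng.randint(1, n_stimuli)
        if kind == "same":
            b = a
        else:
            # Neighbours within max_step frequency steps, clipped to the set
            low, high = max(1, a - max_step), min(n_stimuli, a + max_step)
            b = rng.choice([s for s in range(low, high + 1) if s != a])
        trials.append((a, b, kind))
    return trials

session = make_trials(seed=1)
print(len(session), session[:3])
```

With six subjects and two sessions each, twelve such lists give the 2,400 comparisons reported above.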
3.4 The Participants
There were a total of six subjects, all undergraduate students at Skidmore College. None had seen the objects
before and all were naive with regard to the purposes of the experiment. Finally, all were free of any neurological
or physical problems that would interfere with their haptic exploration of the stimuli.
3.5 The Results
Figure 3 presents the results of Experiment 2. As with Experiment 1 we performed two analyses: one on the
individual stimulus discriminability and another on their relative confusability.
The mean d′ across all stimuli is x̄ = 2.20 with a standard error of s_x̄ = 0.18, indicating that, as a whole,
the stimuli are discriminable. However, unlike the unimodal vision condition, a clear inverse relationship is seen
between the complexity of the stimuli and their d′, with r² ≈ 0.6.
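For readers who want to reproduce summary statistics of this kind, the following sketch computes d′, its standard error across stimuli, and r² against object number. Several assumptions are made here that the text does not confirm: d′ is taken from the simple yes/no model, z(hit rate) minus z(false-alarm rate), with 'different' responses to different pairs counted as hits; a log-linear correction is applied so that extreme rates stay finite; and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def d_prime(hits, n_signal, false_alarms, n_noise):
    """d' = z(H) - z(F), with a log-linear correction (an assumption)."""
    h = (hits + 0.5) / (n_signal + 1)
    f = (false_alarms + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(h) - z(f)

def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def r_squared(xs, ys):
    """Squared Pearson correlation between xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Invented counts for three objects, for illustration only
objects = [1, 2, 3]
dps = [d_prime(95, 100, 5, 100), d_prime(85, 100, 10, 100), d_prime(70, 100, 20, 100)]
print(mean(dps), standard_error(dps), r_squared(objects, dps))
```

Without the correction, a perfect hit rate yields an infinite d′, which is one way the near-infinite values flagged in Figure 3 can arise.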
In our subsequent analysis we see that the majority of confusion occurs among the high frequency objects. In
the lower frequencies, those below object 13, there is little inter-stimulus confusion with one exception: objects
5 and 7. The general trends across both analyses show that, as object complexity increases, the haptic system is
unable to keep up with the information presented to it. Recall that the highest frequency objects have
as many as 12 'bumps' around their circumference while the middle-frequency objects are closer to 4-5. The
results show two main groupings: a small, primarily low frequency group and a larger high frequency group.
As with the visual condition these groupings make some intuitive sense. Smooth objects don't provide much in
the way of 'landmarks' or features that can be used to diagnose relative differences, whereas the high frequency
objects possess too much information.
Figure 3. Results from the unimodal haptic comparison condition. The graph on the left side shows the discriminability of
each stimulus. Stars indicate near-infinite (in the case of stimulus 1) and near-zero (stimulus 25) d′ values. Unlike the visual
condition, there is an inverse relationship between d′ and frequency. The right-hand graph shows the relative confusability
between two stimuli. Darker squares indicate more frequent confusion between the two indicated stimuli. Here we see that
most of the confusability is between the high frequency stimuli.
4. EXPERIMENT THREE: CROSSMODAL
This experiment combines Experiments 1 and 2 into a crossmodal (visual-haptic) task. The design is the same
simple discrimination using a single interval two-alternative forced choice (2AFC) paradigm used in Experiments 1
and 2. In this experiment, one object was presented haptically and a second presented visually. Subjects were
instructed to judge whether the shapes of the presented stimuli were geometrically the same or different.
4.1 The Method
The experiment setup was similar to that of Experiment 2 with the addition of a computer monitor set up as in
Experiment 1.
4.2 The Stimuli
The stimuli consisted of the same 25 objects used in Experiments 1 and 2.
4.3 The Procedure
As in Experiment 2, a computer program prepared a pseudo-random set of trial pairings. On each experimental
trial, one object was presented for haptic exploration and a second was presented visually. Subjects used a keypad
to indicate whether the objects were geometrically the 'same' or 'different'. A 'same' pairing and a 'different'
pairing were equally probable on any given trial. There was no time limit placed on the comparison and subjects
were instructed to work with the emphasis on precision rather than speed. Subjects were free to manipulate the
haptic stimuli in any way with either hand.
In keeping with the constraints used in Experiment 2, we reduced both the number of comparisons per experimental
block and the range of frequencies used when comparing. On each 'different' trial, the stimuli
differed by a maximum of 5 frequency steps. Each subject performed two sessions of 200 trials for a total of
2,400 comparisons across all subjects.
4.4 The Participants
There were a total of six subjects, all undergraduate students at Skidmore College. None had served in previous
experiments or had seen the objects before and all were naive with regard to the purposes of the experiment.
Finally, all were free of any neurological or physical problems that would interfere with their haptic exploration of
the stimuli.
4.5 The Results
As with Experiments 1 and 2 we performed two analyses: one on the individual stimulus discriminability and
another on their relative confusability.
Figure 4. Results from the crossmodal visual-haptic comparison condition. The graph on the left side shows the
discriminability of each object. Stars indicate near-zero d′ values. As with the visual-only condition, there is no linear relationship
between d′ and frequency. However, there seems to be a bimodality in the results with d′ dropping off precipitously with
the higher frequency stimuli. The right-hand graph shows the relative confusability between two objects. Darker squares
indicate more frequent confusion between the two indicated objects. Here we see a more distributed confusion compared to
the unimodal conditions.
Results from the crossmodal visual-haptic comparison condition are shown in Figure 4. The graph on the left
side shows the discriminability of each object. Stars indicate near-zero d′ values. As with the visual-only condition,
there is no linear relationship between d′ and frequency (r² ≈ 0.0). However, there seems to be a bimodality in
the results with d′ dropping off precipitously with the higher frequency objects, starting with object 17 and
continuing to the end of the stimulus range. This condition had the lowest overall mean d′, x̄ = 1.79 with
s_x̄ = 0.01.
Figure 5. Results from Experiments 1 through 3 (d′ by condition: vision, haptic-vision, haptic). The d′ values for the
unimodal conditions show that they outperform the crossmodal condition. (Error bars indicate one s_x̄.)
The right-hand graph shows the relative confusability between two objects. Darker squares indicate more
frequent confusion between the two indicated objects. Here we see a rather distributed confusion compared to
the unimodal conditions. There appear to be three main clusters, as with the vision-only condition, located in
the high-, medium-, and low-frequencies. Unlike the vision-only condition, there is significantly more confusion
within each group. These results suggest that the information can be integrated across sensory modalities
but not without significant error and, even then, there is an upper limit to the frequencies of the objects that are
matchable.
Figure 5 shows a summary of the results across Experiments 1 to 3. With respect to overall discriminability
(d′), it is clear that the crossmodal condition underperforms the unimodal conditions, with vision performing
best of all. If information is being shared efficiently, we would expect crossmodal performance to match that of
the worst unimodal condition (haptic comparison). Rather, there appears to be difficulty using one modality's
information to make judgements in another. This suggests that there is some degree of loss of information in the
transfer between modalities.
5. EXPERIMENT FOUR: STIMULUS REPRODUCTION
In our previous work investigating visual 3D shape perception, subjects produced two-dimensional line drawings of
objects in various presentation conditions: a combination of physical or computer-rendered stimuli, static or
rotating, and memory- or image-based drawing.23 The stimuli used in these experiments were similar to those
in the present experiment but realized at a much larger size (approx. 250 × 250 × 250 mm). This experiment
showed that subjects' performance varied based on the memory component and the spatial complexity of the
objects. As with the previous studies we have seen here, more complex objects were interpreted less successfully
than their simpler counterparts. However, these experiments also showed some minor performance problems
when rendering the lower frequency objects: some objects didn't possess enough structural information to make
accurate renderings.
For this, our final experiment, we adapt this methodology to a bimodal production task. Unlike the two-dimensional
reproductions created in the previous experiment, this task results in physical models that share
the dimensionality of the stimulus. This makes a direct comparison somewhat more straightforward, but not
without its own set of challenges.
5.1 The Method
In this experiment, subjects produced sculptural reproductions of our stimuli given three different conditions
of stimulus perception. The reproduction is essentially a uni- and bimodal magnitude estimation task. This
experiment asks: given haptic, visual, and haptic-visual input information, how accurately can the subjects
reproduce the target stimuli?
5.2 The Stimuli
Cast 3D models of objects 4, 10, 13, and 16 from our stimulus set were used as target objects in this task. Pilot
testing concluded that this set of target objects captures a reasonable range of spatial frequencies that were not
too difficult or trivial to sculpt. The four stimuli are also representative of the clusterings found in the previous
experiments.
5.3 The Procedure
In each session, subjects were asked to sculpt a target object, i.e., reproduce its global shape, using an identical
volume of No. 1, gray-green Plasticine. Plasticine was chosen due to its stability in shape over time. It does not
dry on exposure to air and retains its shape over a wide range of temperatures.
For each sculpture there were three possible perceptual conditions: visual presentation of the target, haptic
presentation, or both visual and haptic acquisition. There was no time limit placed on the exploration and
subjects were told to "sculpt until you've done the best reproduction that you think you can."
In the visual condition the target object was illuminated using overhead lighting similar to the simulated
lighting used in the computer graphics displays. The stimulus was placed on a turntable whose position could
be adjusted by the subject. Subjects could also request that the experimenter reposition the stimulus to any
orientation that would facilitate an accurate sculpture.
In the haptic condition the target object was placed in an opaque cloth bag so it could only be touched by the
subject, not seen. The subjects were free to manipulate the stimulus with either hand for as long as necessary.
Finally, in the bimodal visual-haptic condition the subject was free to visually and haptically examine the
target object, without constraint.
Each subject produced three sculpted objects. Three of the four target objects (4, 10, 13, and 16) were
sculpted once, each in a different perceptual condition, in a random order. The condition × object combinations
(12 in all) were randomly distributed across all subjects such that each combination was sculpted an equal number
of times. This resulted in a total of 36 sculpted objects, 3 in each of the 12 conditions.
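One way to satisfy these constraints (12 subjects, three different objects per subject, each in a different perceptual condition, every condition × object combination sculpted exactly 3 times) is sketched below. The actual assignment procedure is not specified beyond being random, so this balanced construction followed by shuffling is only an assumption, and the names `OBJECTS`, `CONDITIONS`, and `assign` are hypothetical.

```python
import random
from collections import Counter

OBJECTS = [4, 10, 13, 16]
CONDITIONS = ["visual", "haptic", "visual-haptic"]

def assign(n_subjects=12, seed=None):
    """Return one plan per subject: a list of three (condition, object) pairs."""
    rng = random.Random(seed)
    plans = []
    for s in range(n_subjects):
        omitted = OBJECTS[s % 4]   # each object is skipped by 3 of the 12 subjects
        shift = (s // 4) % 3       # rotate conditions across those 3 subjects
        kept = [obj for obj in OBJECTS if obj != omitted]
        plan = [(CONDITIONS[(i + shift) % 3], obj) for i, obj in enumerate(kept)]
        rng.shuffle(plan)          # random sculpting order within a subject
        plans.append(plan)
    rng.shuffle(plans)             # random mapping of plans to subjects
    return plans

plans = assign(seed=0)
tally = Counter(pair for plan in plans for pair in plan)
print(len(tally), set(tally.values()))  # 12 combinations, each sculpted 3 times
```

Rotating the condition order within each group of three subjects who skip the same object is what guarantees the equal counts; the shuffles only randomize order and subject assignment.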
5.4 The Participants
Twelve participants were recruited from the undergraduate population at Skidmore College. None had seen the
objects before and all were naive with regard to the purposes of the experiment. All had normal or
corrected-to-normal visual acuity and were free of any neurological or physical problems that would interfere with their haptic
exploration and production of the stimuli.
5.5 The Results
Our previous research on drawing production suggests that the veridicality of the sculpted objects will likely
depend upon the spatial frequency of the target object. That is, target objects with higher spatial frequencies
should be more difficult to reproduce. Just as with two-dimensional shapes, there are no widely accepted methods
for quantitatively comparing two three-dimensional shapes. Herein we present two methods of comparison designed
to uncover two types of differences relevant to our investigation.
Each sculpted object was scanned using a three-dimensional scanner (NextEngine, Inc.), sampled at a density
of approximately 60,000 vertices and 60,000 polygons per object. Since there may have been some distortion in
the target objects due to the inaccuracies of the 3D printing process and subsequent casting, the target stimuli
were also scanned at the same density. It should be noted that similar target and sculpted objects should have
equal volumes since the amount of Plasticine used to sculpt was approximately equal to the volume of the target object.
After scanning the sculpture, any minor discrepancies in volume, most likely due to measurement error when
portioning the clay or due to changes in temperature, were removed by globally scaling the sculpture to the
proper total volume.
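The volume normalization described above amounts to a uniform scale by the cube root of the volume ratio. The sketch below assumes a closed triangle mesh with consistently oriented faces and scales about the coordinate origin; the function names and the tiny tetrahedron used as test data are illustrative, not taken from the scanning software.

```python
def mesh_volume(vertices, faces):
    """Volume of a closed, consistently oriented triangle mesh.

    Sums signed tetrahedra formed by each face and the origin.
    """
    total = 0.0
    for i, j, k in faces:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        # Scalar triple product a . (b x c)
        total += (ax * (by * cz - bz * cy)
                  - ay * (bx * cz - bz * cx)
                  + az * (bx * cy - by * cx))
    return abs(total) / 6.0

def scale_to_volume(vertices, faces, target_volume):
    """Uniformly scale the mesh (about the origin) to the target volume."""
    factor = (target_volume / mesh_volume(vertices, faces)) ** (1.0 / 3.0)
    return [(x * factor, y * factor, z * factor) for x, y, z in vertices]

# Unit right tetrahedron as stand-in data; its volume is 1/6
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
scaled = scale_to_volume(verts, tris, target_volume=1.0)
print(mesh_volume(verts, tris), mesh_volume(scaled, tris))
```

Because volume grows with the cube of a uniform scale factor, a discrepancy of a few percent in volume corresponds to a much smaller change in linear dimensions.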
As an initial rough measurement we compared the surface areas of the sculptures to those of the target objects, reasoning
that, as one was more 'bumpy' than the other, the surface area difference between the two would increase. The
total surface areas of the sculpted objects are, for the most part, slightly greater than those of the target objects,
ranging from 99 to 104% of the target surface area. Alas, there are no significant differences in surface
areas across the objects and conditions used in this experiment. This finding led us to consider a class of methods
that utilize volumetric as well as surface information.
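The surface area comparison can be illustrated with a short sketch that sums triangle areas and reports the sculpture's area as a percentage of the target's. The function names and the stand-in meshes are assumptions for the example; the scanned meshes themselves are not reproduced here.

```python
from math import sqrt

def mesh_area(vertices, faces):
    """Total surface area of a triangle mesh (sum of triangle areas)."""
    total = 0.0
    for i, j, k in faces:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        ux, uy, uz = bx - ax, by - ay, bz - az   # edge a -> b
        vx, vy, vz = cx - ax, cy - ay, cz - az   # edge a -> c
        # Half the length of the cross product u x v
        nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
        total += 0.5 * sqrt(nx * nx + ny * ny + nz * nz)
    return total

def area_percent(sculpt_v, sculpt_f, target_v, target_f):
    """Sculpture surface area as a percentage of the target surface area."""
    return 100.0 * mesh_area(sculpt_v, sculpt_f) / mesh_area(target_v, target_f)

# Stand-in data: a tetrahedron and the same shape scaled by 2
target_v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
sculpt_v = [(2 * x, 2 * y, 2 * z) for x, y, z in target_v]
print(area_percent(sculpt_v, faces, target_v, faces))
```

A single area ratio of this kind cannot say where the surfaces differ, which is consistent with the move to volumetric measures.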
5.5.1 Boolean Method
Our first measure is based on Boolean combinations of the sculpted and target objects (see Figure 6). Two
Boolean subtractions are performed: the target from the sculpture and vice-versa. These residues represent
under- and over-sculpted regions, respectively, and their sum is considered the total error. Finally, the objects are
subjected to rotational alignment such that this error is minimized. The minimum is taken as the final total error
between the sculpted and target object.
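The idea behind this measure can be sketched on voxelized shapes, where Boolean subtraction becomes set difference. This is a deliberately simplified illustration and not the procedure used on the scanned meshes: the shapes are assumed to be sets of integer voxel coordinates, and the alignment search is restricted to quarter turns about the z axis, whereas a real alignment would search over finer 3D rotations. All names are hypothetical.

```python
def rotate_z(voxels, quarter_turns):
    """Rotate a voxel set about the z axis in 90-degree steps."""
    out = set(voxels)
    for _ in range(quarter_turns % 4):
        out = {(-y, x, z) for x, y, z in out}
    return out

def boolean_error(sculpture, target):
    """Sum of the two Boolean residues, measured in voxels."""
    sculpture_minus_target = sculpture - target  # target subtracted from sculpture
    target_minus_sculpture = target - sculpture  # sculpture subtracted from target
    return len(sculpture_minus_target) + len(target_minus_sculpture)

def aligned_error(sculpture, target):
    """Minimum total error over the candidate rotations of the sculpture."""
    return min(boolean_error(rotate_z(sculpture, q), target) for q in range(4))

# Toy shapes: the 'sculpture' is the target turned a quarter turn plus one stray voxel
target = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 1, 0)}
sculpture = rotate_z(target, 1) | {(5, 5, 0)}
print(boolean_error(sculpture, target), aligned_error(sculpture, target))
```

In the toy case the unaligned error is dominated by the misorientation, while the aligned error reflects only the stray voxel, which is the reason for minimizing over rotations before reporting the error.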
For all objects, the error in the haptic-only case is greatest, with the bimodal condition being equal to the
vision condition (see Figure 6). This suggests that the haptic information adds very little to the visual
information in this reproduction task. Also note that the error increases steadily as the object complexity
increases, with the haptic-only error increasing at the highest rate. The addition of visual information acts to
lower this error and its rate of increase as complexity increases.
It should be noted that, using this calculation, the error cannot be attributed to a specific type of geometric
distortion. Since a given sculpture and its target have similar surface areas and volumes, they should share
a similar degree of bumpiness at a broad scale. However, the location of the bump and dimple
Figure 7 illustrates the results of this decomposition for stimulus object 4 and a collection of five sculpted
objects, labeled 4a-e. For this analysis we decomposed the objects into their first seven harmonics above the
DC component. The graph on the left shows the relative energy at each of these harmonics with the target
object at the top. To further quantify the differences, each harmonic description was mapped to a vector in a
seven-dimensional space and compared to the target. The angle between the vectors represented the error. This
error is illustrated in Figure 7 by the length of the lines connecting the sculptures with the target object. Visual
examination suggests that 4a is the most likely candidate for a 'match', and this is confirmed by the SHD
analysis, while 4e is the worst, with the other sculpted objects lying between them. Visual inspection thus bears
this result out, at least at a subjective level.
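The angular error described above can be sketched as follows, assuming each object has already been reduced to a
list of seven harmonic energies. The function name angular_error is hypothetical, and the decomposition itself is
not shown.

```python
import math


def angular_error(sculpture_energy, target_energy):
    """Angle in radians between two harmonic energy vectors."""
    if len(sculpture_energy) != len(target_energy):
        raise ValueError("vectors must have the same number of harmonics")
    dot = sum(s * t for s, t in zip(sculpture_energy, target_energy))
    norm_s = math.sqrt(sum(s * s for s in sculpture_energy))
    norm_t = math.sqrt(sum(t * t for t in target_energy))
    if norm_s == 0.0 or norm_t == 0.0:
        raise ValueError("the angle is undefined for a zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_s * norm_t)))
    return math.acos(cosine)
```

Since only the angle is used, two objects whose energies differ by a common scale factor are treated as
identical under this measure; what matters is the relative distribution of energy across the harmonics.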
[Figure 8 plot. Axis labels: distance from target, object. Legend: Haptic, Visual, Bimodal.]
Figure 8. Comparison of mean distance from the target in SHD space for the four objects (4, 10, 13, and 16) in
each of the three sculpting conditions: haptic, visual, and bimodal. Unlike the boolean analysis, the SHD analysis
does not reflect a difference between the perceptual modalities, with the exception of the bimodal condition of
the most complex object. (Error bars indicate one sx.)
Figure 8 shows the results of comparing the sculptures in SHD space for the four objects across the three
perceptual conditions. Comparing these results to those of the boolean metric shown in Figure 6, we again see the
trend of increasing error as object complexity increases. However, using this analysis we do not see a significant
difference between the three sensory input conditions, with one exception. In the case of the most complex object
(16) the bimodal condition outperforms the unimodal conditions and is more consistent with the middle-frequency
sculptures across all modalities. We hypothesize that, in this high-frequency case, the haptic system is able
to help determine the nature of the local high-frequency information. By treating bumps and dimples as local
entities, subjects might be able to depict them more accurately in the sculpture, just not accurately in relation
to each other. We take this up in more detail below, but it suffices to say that the two analysis methods we
propose may be getting at different parts of the same phenomenon: that of object similarity.
6. DISCUSSION
Experiments 1 through 3 investigated our ability to discriminate objects based on visual
and haptic perceptual modalities. We found that, for the range of stimuli used in our experiments, unimodal
comparisons were always better than crossmodal comparisons. This suggests that the representational strategies
of the two senses are not entirely isomorphic. There is certainly at least some overlap, since the task can
be done, albeit in a noisier fashion, up to a limiting amount of structural information.
Norman [16] found precisely the same result using a class of natural stimuli (bell peppers) whose predominant
features' scale was roughly the same across their collection. We have extended this approach by using objects
whose complexity, as defined by the scale of features, varied in a systematic manner. This allows us to further
investigate the effect of complexity on performance. We found that, as the complexity increased, the tasks
became more difficult in general, but the tasks involving haptic perception suffered precipitously. This suggests
that humans are not very good at keeping track of the configuration of large amounts of structural detail for
comparison within or between modalities.
We further discovered that the stimuli appear to be clustered into two or three groups based on the modality
employed. Vision and the crossmodal condition yielded three groups (high, mid, and low frequency) whereas the
haptic condition yielded two (low and mid-high). In the crossmodal condition the results were much noisier. This
reflects the impact of the haptic system's inability to discriminate high-frequency objects as well as its greater
inaccuracy overall.
Finally, we observed that in the unimodal visual condition the d′ values of individual objects do not depend on
frequency (r² ≈ 0.0), but in the unimodal haptic case they are highly correlated with it (r² ≈ 0.6). In the
crossmodal condition there was, again, no correlation with frequency (r² ≈ 0.0). This is further evidence that the
haptic system is constrained to collecting lower-frequency structural information. Furthermore, the
visual system dominates when haptic information is also available.
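For reference, the strength of such a relationship can be computed as the squared Pearson correlation between
each object's frequency and its sensitivity score. The sketch below assumes a simple linear relationship, which
the text does not state explicitly; the function name r_squared is hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)
```

A value near 0 means that frequency explains almost none of the variation in sensitivity, and a value near 1
means that it explains almost all of it.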
In Experiment 4 we moved to a production task, directly estimating the shape of the objects in three
dimensions. Because of the vagaries of quantitative shape comparison in both two and three dimensions, we
used two strategies to identify the differences. The boolean method reinforced the evidence from the previous
experiment that the haptic system is impoverished at high frequencies and that the visual system performs best
and dominates touch when the two are combined. However, this method cannot localize the sources of the
error since it is taken over the entire object.
[Figure 9 panels: SHD Model (left) and Unimodal Visual Condition (right); axes labeled by object.]
Figure 9. Confusion as predicted by the SHD model (left) and as observed in Experiment 1 (right). There is
significant overlap here, suggesting that, if nothing else, the SHD model may be a good predictor of discrimination.
Our spherical harmonic based measures tell a slightly different story. This method computed three-dimensional
spatial frequency magnitudes, essentially extracting the amount of 'bumpiness' and the size of those bumps.
Unlike the boolean analysis, the perceptual modality used did not affect performance, with the exception of the
oddball finding in the bimodal condition where the highest frequency object was sculpted more accurately than
in the unimodal conditions. Taken as a whole, this suggests that a benefit of bimodal perception only exists for
more demanding tasks.
To gauge the effectiveness of this analysis it is informative to compare the predictions of the SHD
model to human performance in the discrimination tasks. The left of Figure 9 shows an object confusion matrix
as determined by the SHD model. The right shows the human performance data from Experiment 1's visual
discrimination task. The model makes predictions that are largely consistent with the subjects'
performance in that task. Applying the same analysis to the data and scanned objects used in Norman's [16]
study shows a similar degree of fit. Therefore we claim that the SHD model explains a good amount of the
human performance in visual discrimination. It remains to be seen if this is true of other classes of shapes and
configurations of features.
The boolean analysis, which found differences across perceptual conditions, does not separate phase and frequency
error. The lack of a difference found in the SHD analysis rules out frequency error as the main contributor to the
poor performance in the haptic conditions. Therefore, not only is the haptic system susceptible to confusion at
higher spatial frequencies, it cannot process phase information to the same degree as the visual system.
Norman suggested haptic perception may be more sensitive to local, phase-independent information. In
those experiments, subjects were often able to correctly haptically discriminate between two bell peppers when
diagnostic local features were present. For example, subjects often credited their correct identifications to a
specific fold or stem that made a bell pepper unique. Our stimuli were largely devoid of such landmark features,
but our results do not rule out this possibility either.
In a recent review, Lacey and colleagues summarize three models for the convergence of visual and haptic
information [6]. Convergence of information could rely on A) the more dominant visual modality borrowing from
haptic information, B) the integration of haptic and visual information into a modality-independent form, or C)
each modality processing information independently, with somewhat efficient translation between them. Our
findings agree most closely with B and C above. We found evidence for the ability to compare with a limited
amount of haptic information. However, since our evidence is merely behavioral there always exists the possibility
that there are separate representational strategies and humans simply fail to integrate them properly.
To return to Molyneux's question, we would propose that it is indeed possible to visually identify things only
learned by touch, but objects defined by higher spatial frequency information will certainly be much more
difficult to identify than their lower frequency cousins.
ACKNOWLEDGMENTS
The authors wish to thank Luc Barthelet, J. Farley Norman, Joshua Lesparance, and Kubra Komek for their
valuable assistance and insight.
REFERENCES
[1] Davis, J. W., "The Molyneux problem," J Hist Ideas 21(3), 392-408 (1960).
[2] Jacomuzzi, A., Kobau, P., and Bruno, N., "Molyneux' question redux," Phenomen Cog Sci 2(4), 255-280
(2003).
[3] Chesselden, W., "An account of some observations made by a young gentleman, who was born blind, or
lost his sight so early, that he had no remembrance of ever having seen, and was couch'd between 13 and 14
years of age," Phil Trans 402, 447-450 (1728).
[4] Gregory, R. L. and Wallace, J. G., "Recovery from early blindness - a case study," Exp Psych Soc Monogr 2
(1963).
[5] Fine, I., Wade, A. R., Brewer, A. A., May, M. G., Goodman, D. F., Boynton, G. M., Wandell, B. A., and
MacLeod, D. I. A., "Long-term deprivation affects visual perception and cortex," Nat Neurosci 6(9), 915-916
(2003).
[6] Lacey, S., Campbell, C., and Sathian, K., "Vision and touch: Multiple or multisensory representations of
objects?," Perception 36(10), 1513-1521 (2007).
[7] Bruce, C., Desimone, R., and Gross, C., "Visual properties of neurons in a polysensory area in superior
temporal sulcus of the macaque," J Neurophysiol 46(2), 369-384 (1981).
[8] Duhamel, J., Colby, C., and Goldberg, M., "Ventral intraparietal area of the macaque: Congruent visual and
somatic response properties," J Neurophysiol 79, 126-136 (1998).
[9] Macaluso, E. and Driver, J., "Spatial attention and crossmodal interactions between vision and touch,"
Neuropsychologia 39, 1304-1316 (2001).
[10] Feinberg, T., "Multimodal agnosia after unilateral left hemisphere lesion," Neurology 36(6), 864 (1986).
[11] James, T. W., "Ventral occipital lesions impair object recognition but not object-directed grasping: an fMRI
study," Brain 126(11), 2463-2475 (2003).
[12] Sathian, K., Zangaladze, A., Hoffman, J., and Grafton, S., "Feeling with the mind's eye," NeuroReport 8(18),
3877-3881 (1997).
[13] Zangaladze, A., Epstein, C., Grafton, S., and Sathian, K., "Involvement of visual cortex in tactile
discrimination of orientation," Nature 401, 587-590 (1999).
[14] Blake, R., Sobel, K., and James, T., "Neural synergy between kinetic vision and touch," Psych Sci 15(6),
397-402 (2004).
[15] Amedi, A., Malach, R., Hendler, T., Peled, S., and Zohary, E., "Visuo-haptic object-related activation in the
ventral visual pathway," Nat Neurosci 4(3), 324-330 (2001).
[16] Norman, J. F., Norman, H. F., Clayton, A. M., Lianekhammy, J., and Zielke, G., "The visual and haptic
perception of natural object shape," Percept & Psychophys 66(2), 342-351 (2004).
[17] Norman, J. F., Clayton, A. M., Norman, H. F., and Crabtree, C. E., "Learning to perceive differences in
solid shape through vision and touch," Perception 37(2), 185-196 (2008).
[18] Navon, D., "Forest before trees: The precedence of global features in visual perception," Cognit Psychol 9,
353-383 (1977).
[19] Heller, M. and Clyburn, S., "Global versus local processing in haptic perception of form," Bull Psychonomic
Soc 31(6), 574-576 (1993).
[20] Lakatos, S. and Marks, L., "Haptic form perception: relative salience of local and global features," Percept
Psychophys 61(5), 895-908 (1999).
[21] Helbig, H. B. and Ernst, M. O., "Optimal integration of shape information from vision and touch," Exp
Brain Res 179(4), 595-606 (2007).
[22] Phillips, F., "Creating noisy stimuli," Perception 33(7), 837-54 (2004).
[23] Phillips, F., Casella, M. W., and Gaudino, B. M., "What can drawing tell us about our mental representation
of shape?," J Vis 5(8), 522 (2005).
[24] Kazhdan, M., Funkhouser, T., and Rusinkiewicz, S., "Rotation invariant spherical harmonic representation
of 3d shape descriptors," ACM Intl Conf Proc Ser 43, 156-164 (2003).