Attending to Space or Intensity Modulates Spatial Release from Informational Masking
Antje Ihlefeld1, Sarah Chu2, Barbara G. Shinn-Cunningham3
Hearing Research Center, Boston University, Boston MA 02215, USA,
Email: 1antje1@gmail.com, 2schu@mit.edu, 3shinn@cns.bu.edu
Introduction
When trying to understand target speech in a background of
perceptually similar speech maskers or when listening for a
speech target with unknown acoustic features, informational
masking (IM) can impair performance. IM causes target
detection thresholds to be elevated relative to what
traditional models of peripheral masking would predict (e.g.,
see [1]). Previous studies show that when target and masker
are perceived at different locations, thresholds can improve,
an effect known as spatial release from masking (SRM).
Furthermore, even when target and masker are coming from
the same location, release from IM can occur if the overall
speaking level differs between the competing utterances.
Previous studies show that the amount of SRM can depend
on how similar the target and masker are in level. Here, we
examined whether the amount of SRM depends on listeners
attending to location versus intensity of the target.
Probabilistic Model
Previously we showed that a probabilistic model of response
patterns can help tease apart the contributions of two distinct
processes underlying IM ([2], Fig. 1). Our model posits that
intelligibility in a selective speech identification task
depends on low-level spectrotemporal continuity (short-term
segmentation), on correctly joining short-term segments
across spectrotemporal discontinuities (across-time linkage),
and on the ability to properly select short-term segments
and/or streams (selective attention). When IM is the primary
form of interference, errors consist of the subject reporting
either the masker message (henceforth, masker errors) or a
combination of words from both the target and masker
messages (henceforth, mix errors). In contrast, random
guesses, in which the listener reports keywords that were not
part of either the target or masker message (henceforth, drop
errors), rarely occur. This is consistent with the idea that the
spectrotemporal structure of speech stimuli used in these
studies is rich enough for listeners to properly segment
syllables from the acoustic mixture, and that errors arise
from difficulties in selecting the correct syllables and/or
streams of syllables.
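The three error categories described above can be made concrete with a short Python sketch. It assumes that each response is a (color, number) pair and that target and masker differ in both keywords; the function name and the category labels are illustrative and do not come from the original analysis.

```python
from typing import Tuple

Phrase = Tuple[str, int]  # (color, number); this representation is an assumption


def classify_response(target: Phrase, masker: Phrase, response: Phrase) -> str:
    """Label a response as 'correct', 'masker', 'mix', or 'drop'.

    Assumes target and masker differ in both color and number.
    """
    if response == target:
        return "correct"
    if response == masker:
        return "masker"  # the whole masker message was reported
    color, number = response
    color_heard = color in (target[0], masker[0])
    number_heard = number in (target[1], masker[1])
    if color_heard and number_heard:
        return "mix"  # one keyword from each talker
    return "drop"  # at least one keyword came from neither talker


print(classify_response(("red", 2), ("blue", 5), ("red", 5)))
```

With a red/2 target and a blue/5 masker, the response red/5 combines one keyword from each talker and is therefore counted as a mix error.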
For each combination of cue condition, angular separation (AS), and level
difference (LD) of the sources, the fixed probability of
properly selecting a keyword is PSEL, while the fixed
probability of properly linking the words across time is PSTR.
Conditioned on there being no drop errors, PSEL and PSTR can
be computed from the relative likelihoods of correct
responses, masker errors, and mix errors (see Fig. 1).
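As an illustration of how PSEL and PSTR might be recovered from response counts, the Python sketch below assumes one simple decision tree, which may differ in detail from the tree in Fig. 1. In this assumed tree, the first keyword is taken from the target with probability PSEL; the second keyword then follows the same talker with probability PSTR, and is otherwise selected afresh with probability PSEL. The function names are hypothetical. Under this assumption, P(correct) - P(masker) = 2 PSEL - 1 and P(mix) = 2 PSEL (1 - PSEL)(1 - PSTR), which gives a closed-form inverse.

```python
import math
from typing import Tuple


def predict_proportions(p_sel: float, p_str: float) -> Tuple[float, float, float]:
    """Return (correct, masker, mix) proportions under the ASSUMED tree."""
    correct = p_sel * (p_str + (1.0 - p_str) * p_sel)
    masker = (1.0 - p_sel) * (p_str + (1.0 - p_str) * (1.0 - p_sel))
    mix = 2.0 * p_sel * (1.0 - p_sel) * (1.0 - p_str)
    return correct, masker, mix


def estimate_parameters(n_correct: float, n_masker: float,
                        n_mix: float) -> Tuple[float, float]:
    """Estimate (p_sel, p_str) from counts, conditioned on no drop errors."""
    total = n_correct + n_masker + n_mix
    if total <= 0:
        raise ValueError("need at least one non-drop response")
    c, m, x = n_correct / total, n_masker / total, n_mix / total
    p_sel = 0.5 * (1.0 + c - m)
    denom = 2.0 * p_sel * (1.0 - p_sel)
    if denom == 0.0:
        # With p_sel of 0 or 1 no mix errors are predicted,
        # so p_str cannot be identified from the data.
        return p_sel, math.nan
    p_str = 1.0 - x / denom
    # Clip sampling noise into the valid range (a choice made for this sketch).
    return p_sel, min(1.0, max(0.0, p_str))


print(estimate_parameters(720, 120, 160))
```

The counts in the usage line are made up for illustration and are not data from the study.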
Methods
To emphasize the effects of IM, the current study employs
spectrally interleaved bandpass filtered target and masker
speech that was derived from the Coordinate Response
Measure corpus (see methods in the full-cue condition in
[3]). Target and masker [<color> <number>] phrases were
extracted from the original utterances by time windowing.
<Color> was one of [white, red, blue, green].
<Number> was one of the digits one through eight,
excluding the two-syllable digit seven. In each trial, two
different [<color> <number>] phrases were used as sources.
The numbers and colors in the competing utterances were
randomly chosen, but constrained to differ from each other
in each trial. Subjects were instructed to report the target
color and number based on intensity, location, or both
intensity and location. Feedback was provided. Each session
consisted of nine blocks of 60 trials: three consecutive
blocks for each of the three cue conditions, with the
instructions held fixed within each set of three blocks.
Seven normal-hearing subjects were paid for their
participation in the experiment; each completed four
sessions of the experiment. The data from the first session
were discarded as practice.
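The keyword constraints and block structure described above can be sketched in Python as follows. The function and constant names are hypothetical, and the order in which the three cue conditions appear within a session is not stated in the text, so the random ordering below is an assumption.

```python
import random
from typing import List, Tuple

COLORS = ["white", "red", "blue", "green"]
NUMBERS = [1, 2, 3, 4, 5, 6, 8]  # one through eight, excluding seven
CUE_CONDITIONS = ["intensity", "location", "both"]
BLOCKS_PER_CONDITION = 3
TRIALS_PER_BLOCK = 60

Phrase = Tuple[str, int]


def draw_phrases(rng: random.Random) -> Tuple[Phrase, Phrase]:
    """Draw target and masker phrases that differ in color and in number."""
    target_color, masker_color = rng.sample(COLORS, 2)
    target_number, masker_number = rng.sample(NUMBERS, 2)
    return (target_color, target_number), (masker_color, masker_number)


def session_schedule(rng: random.Random) -> List[str]:
    """Return the cue condition of each of the nine blocks in a session."""
    order = CUE_CONDITIONS[:]
    rng.shuffle(order)  # ASSUMPTION: the condition order is not given in the text
    return [cond for cond in order for _ in range(BLOCKS_PER_CONDITION)]


rng = random.Random(0)
schedule = session_schedule(rng)
target, masker = draw_phrases(rng)
print(schedule, target, masker, len(schedule) * TRIALS_PER_BLOCK)
```

Sampling two items without replacement from each keyword set guarantees that the competing utterances differ in both color and number, and nine blocks of 60 trials give 540 trials per session.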
Figure 1: Decision-theory model to quantify the roles of
selective attention (PSEL) and across-time linkage (PSTR).
Target location was chosen randomly from trial to trial, and
was equally likely to be any of 11 locations (±90˚, ±80˚,
±50˚, ±40˚, ±10˚, 0˚). On each trial, the angular separation
between target and masker was randomly chosen (either 0˚,
10˚, or 90˚). Target and masker levels were between 50 dB
and 80 dB SPL, roving randomly from trial to trial. On each
trial, the level difference between target and masker was


