nerve which are of interest. Notched noise was added to the stimulus, and the chirps were calculated to start and end exactly at zero, in order to avoid stimulating undesired regions of the cochlea through an abrupt onset or offset of the stimulus. Wegner and Dau (2002) obtained ABRs with a method similar to the one used in this work, but they tested only a single low-frequency chirp rather than a series covering most of the human auditory range. It could be argued that an alternative approach using a broadband chirp combined with noise could limit the response to the bands of interest. Nevertheless, there are no results or comparisons for the approach presented here, in which, first, the stimuli are band-limited and cover a large proportion of the auditory range, and second, a notched-noise masking condition is added. The masking level used in this approach was 20 dB below the peSPL of the stimuli, as recommended in Stapells (1994) for low-frequency-specific brief-tone-evoked ABRs. Which masking level gives the best results for low-, medium-, and high-frequency-specific chirp stimulation remains to be investigated, as does the effect on threshold estimation of protocols with different masking noise, e.g., pink instead of white.
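The requirement that each chirp start and end exactly at zero can be met by generating the sweep with zero initial phase and trimming it at its last zero crossing. The following Python sketch illustrates the idea; the frequency range, duration, and sampling rate are illustrative values, not the parameters used in the study:

```python
import numpy as np

def zero_ended_chirp(f0, f1, dur, fs):
    """Linear-sweep chirp that starts at zero (zero initial phase)
    and is trimmed at the last zero crossing, so the waveform has
    no abrupt onset or offset step (illustrative parameters)."""
    t = np.arange(int(dur * fs)) / fs
    # instantaneous phase of a linear sweep; sin(0) = 0 at onset
    phase = 2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / dur * t**2)
    x = np.sin(phase)
    # indices where the waveform changes sign between samples
    crossings = np.nonzero(x[:-1] * x[1:] < 0)[0]
    # keep everything up to the sample just before the last crossing
    return x[: crossings[-1] + 1]

chirp = zero_ended_chirp(f0=500.0, f1=1000.0, dur=0.01, fs=48000)
```

In practice the residual offset sample is bounded by the per-sample phase step, so it is small relative to the unit amplitude.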
Animal vocalizations and human speech are often emitted in series of sounds. Basic perceptual distinctions in the time domain are made at boundaries near 20–30 ms, 100 ms, and 400 ms inter-sound intervals (ISIs) and/or sound durations. Examples for boundaries near 20–30 ms and 100 ms are given by the transitions from hearing the pitch of a sound series (ISIs shorter than about 20 ms) to hearing rough sounds (ISIs between about 20–100 ms) to hearing a sound rhythm (ISIs longer than about 100 ms) [1–4]. The 20–30 ms boundary has also been found in the discrimination of stop consonants via voice-onset time [5–7], in the perception of temporal order [8] and of gaps in sounds [9,10], in the categorization of mouse pup ultrasounds by their mothers [11], and in the spectral integration of frequency components starting and ending together within 20–30 ms so that they are perceived as a single stream of auditory objects by humans [12,13] and mice [14]. The 100 ms boundary has also been found in the perception of acoustic streams in humans [15] and in wriggling-call perception in mice [16]. Reports about the 400 ms boundary are found in loudness summation and forward masking in humans [17–19] and in wriggling-call perception in mice [16]. Neural correlates of these perceptual boundaries in the time domain are unknown, because they have either not been studied systematically or not been studied at all [2,20,21]. Interestingly, however, hearing the differences in sound quality at these perceptual boundaries does not require explicit learning and seems to be a rather general ability of mammals. Our hypothesis is therefore that these boundaries may be based on features of sound processing in the auditory periphery up to the midbrain inferior colliculus (IC). In order to test this hypothesis in the mouse, we used the same sounds as stimuli for the auditory system as in the tests for perception of communication calls.
As a method of testing, we used the auditory brainstem response (ABR), which allows information to be recorded from the whole auditory periphery in one approach.
To what extent auditory brainstem responses can inform on other types of speech-in-noise deficits remains, however, debated. Animal studies have shown that cochlear synaptopathy results in a reduction of wave I of the click-evoked brainstem response (6–8), as well as in changes in other brainstem measures, such as a latency shift of wave V with the level of background noise and the amplitude of envelope-following responses (11–13). To what extent these findings translate to humans is, however, unclear: studies on human volunteers have produced conflicting results regarding the correlation of such brainstem measures with a putative cochlear synaptopathy as well as with speech-in-noise comprehension (11–17).
This study examines auditory brainstem responses (ABRs) elicited by rising frequency chirps. Two new chirp stimuli were designed to compensate for basilar-membrane (BM) group-delay differences across frequency, in order to maximize neural synchrony at the cochlear level. One chirp, referred to as the O-chirp in the present study, was based on estimates of human BM group delays derived from stimulus-frequency otoacoustic emissions (SFOAEs) at a level of 40 dB SPL [Shera and Guinan, in Recent Developments in Auditory Mechanics (2000)]. The other chirp, referred to here as the A-chirp, was derived from latency functions fitted to tone-burst-evoked ABR wave-V data over a wide range of stimulus levels and frequencies [Neely et al., J. Acoust. Soc. Am. 83(2), 652–656 (1988)]. In this case, a set of level-dependent chirps was generated. The chirp-evoked responses, particularly wave-V amplitude and latency, were compared with click responses and with responses obtained with the “original” chirp as defined in Dau et al. [J. Acoust. Soc. Am. 107(3), 1530–1540], referred to as the M-chirp, which was developed on the basis of a (linear) cochlea model. The main hypothesis was that, at low and medium stimulation levels, the new chirps might produce a larger response than the original one, whose parameters were essentially based on high-level BM data. The main results of the present study are as follows: (i) all chirps evoked a higher wave-V amplitude than the click; (ii) surprisingly small differences occurred between the O-chirp and the original M-chirp at low and medium levels, indicating that SFOAEs may only provide a relatively rough estimate of BM group delay; (iii) the A-chirp produced the largest responses, particularly at the lowest stimulation levels. This chirp might therefore be particularly interesting for clinical applications.
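The logic behind such delay-compensating chirps can be sketched as follows: given a latency function τ(f), each frequency component is emitted at time t(f) = τ(f_low) − τ(f), so that all components arrive at their cochlear places roughly simultaneously. A rough Python illustration, assuming a simple power-law latency τ(f) = k·f^(−α) with made-up constants k and α (not the fitted, level-dependent Neely et al. parameters):

```python
import numpy as np

FS = 48000.0  # sampling rate in Hz (illustrative)

def delay_compensating_chirp(f0, f1, k=0.1, alpha=0.5):
    """Rising chirp compensating a power-law cochlear travel-time
    model tau(f) = k * f**(-alpha). Lower frequencies, which travel
    longer along the BM, are emitted earlier. k and alpha are
    illustrative values only."""
    tau = lambda f: k * f**(-alpha)
    dur = tau(f0) - tau(f1)                  # total sweep duration
    t = np.arange(int(dur * FS)) / FS
    # invert t = tau(f0) - tau(f) to get the instantaneous frequency
    f_inst = ((tau(f0) - t) / k) ** (-1.0 / alpha)
    # integrate instantaneous frequency to obtain the phase
    phase = 2 * np.pi * np.cumsum(f_inst) / FS
    return np.sin(phase)

chirp = delay_compensating_chirp(100.0, 10000.0)  # ~9 ms sweep
```

With these toy constants the 100 Hz component leads the 10 kHz component by τ(100) − τ(10000) = 9 ms, which sets the chirp duration.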
76 Chapter 3: Spectral and temporal integration in modulation detection

integration and the resolution time constants is not a real problem (Viemeister and Wakefield, 1991). They pointed out that the observation of a 3-dB decrease in threshold for each doubling of duration, as seen in typical test-tone integration data, means that the auditory system behaves as if perfect power integration occurs, but that the system is not necessarily performing the operation of mathematical integration. Therefore, it might be important to distinguish between the phenomenon of temporal integration and the process that accounts for the phenomenon. Viemeister and Wakefield (1991) provided evidence that integration with a long time constant, such as proposed by the classical models, does not occur in auditory processing. They showed that the threshold for a pair of short pulses yields classic power integration only for pulse separations of less than 5–10 ms. For longer separations, the thresholds did not change with separation and the pulses appeared to be processed independently (cf. Zwislocki et al., 1962). In a second experiment, Viemeister and Wakefield (1991) showed that the threshold for a pair of tone pulses was lower than for a single pulse, indicating some type of integration, but was not affected by level changes of the noise which was presented between the two pulses. The experimental results from that study are plotted in the left panel of Fig. 3.17. It shows the average thresholds for the first pulse alone (squares), the second pulse alone (circles), and for the pulse pair (triangles) as a function of the relative level of the intervening noise. The thresholds for the first pulse alone do not depend on the noise level. There is a slight increase in threshold for the second pulse reflecting forward masking. The thresholds for the pulse pair are about 2.5 dB lower than those for either pulse alone and do not depend on the level of the intervening noise (for details see Viemeister and Wakefield, 1991). These data cannot be explained by long-term integration.
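The "3 dB per doubling of duration" slope follows directly from the assumption of constant energy at threshold: if intensity × duration is constant, doubling the duration allows the intensity to be halved, i.e. a 10·log10(2) ≈ 3 dB lower threshold. A one-line check in Python:

```python
import math

def integration_threshold_shift(duration_ratio):
    """Threshold change in dB predicted by perfect power integration:
    constant energy at threshold means intensity ~ 1/duration."""
    return -10.0 * math.log10(duration_ratio)

shift = integration_threshold_shift(2.0)  # doubling the duration
# shift is about -3.01 dB, the classic 3-dB-per-doubling slope
```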
Another fast rhythm is the gamma-band (γ) rhythm, with a frequency range from 30 to 100 Hz. It is typical of high attention and conscious perception, when vast nets of widely distributed neurones are functionally connected. Different types of neurones in different regions of the cortex can be bound into a group, i.e. an ensemble, by means of activity synchronisation in the 30 to 100 Hz frequency range. The role of GBO has mostly been examined in relation to the visual system, and therefore less is known about its role in the auditory modality (Pape 2014, Steinmann et al. 2014a). Several studies have shown that functional connectivity in the gamma band, i.e. synchronisation of GBO between local and/or distant neurones, is associated with cognitive or perceptual processing: an EEG study conducted in 2008 using gamma-band phase synchronisation showed that synchronisation of GBO between inferior and medial temporal regions was crucial for working memory (Axmacher et al. 2008). Moreover, it appears that anatomical connectivity, i.e. WM connections, is more important than topographic proximity for the strength of oscillation synchrony in the gamma-band range (Csicsvari et al. 2003, Mulert et al. 2011).
In general, there are two major ERP techniques to estimate mental workload. The first technique uses a primary task (the subject's actual experimental task). In this approach, the state of attention is assessed directly by measuring the P3 amplitude to certain discrete events embedded in the primary task (Kramer, Wickens, & Donchin, 1985; Nittono, Hamada, & Hori, 2003; Novak, Ritter, & Vaughan, 1992). The second technique is the "probe" technique, which characterizes the neural implementation of a task or cognitive process indirectly. With the aid of the probe technique (presenting additional task-irrelevant stimuli that share certain features with the actual task stimuli), a participant's level of attention, or rather the mental workload imposed by auditory materials, can be assessed (Papanicolaou & Johnstone, 1984). This latter technique is used especially when stimuli occur at a high rate, as is the case for continuous speech. A valid analysis of those primary-task stimuli would not be possible because the ERP effects of one stimulus overlap with those of the following one. Probe stimuli, by contrast, occur less frequently than the task-relevant stimuli. Thus, with probes, the patterns of regional cerebral activation can be assessed more accurately because overlap effects are avoided. Furthermore, task-irrelevant probes and task-relevant primary stimuli share the same relevant stimulus features. Therefore, probe stimuli are assumed to be processed like task-relevant stimuli. This allows indirect conclusions from the processing of probes to the processing of task-relevant stimuli. Moreover, this method assesses cerebral engagement without confounding by stimulus- and response-specific activity (Papanicolaou & Johnstone, 1984). The probe technique can be further divided into two subtypes, the relevant and the irrelevant probe technique, in which subjects have to pay attention to or ignore the probe stimuli, respectively.
(average number of trials retained in the analysis: 69.84; range across participants: 63–74). Bad channels were excluded from the whole data set (average number of rejected sensors: 6.84; range across participants: 4–10). Sensor-space trials were projected into source space using linearly constrained minimum variance beamformer filters (Van Veen et al., 1997), and further analysis was performed on the obtained time series of each brain voxel. Since structural MRIs were available only for a few participants, we decided to use a template MRI from the Fieldtrip toolbox, which was morphed to the individual head shape in MEG space using an affine transformation. The aligned brain volumes were also used to create single-sphere head models and leadfield matrices (Nolte, 2003). Using a grid in MNI space warped to the individual anatomy allowed us to average and compute statistics, since each grid point in the warped grid, despite different space coordinates, belongs to the same brain region across participants. For the creation of this template grid we chose a resolution of 2 mm in MNI space. Furthermore, using the Brainnetome atlas (http://atlas.brainnetome.org), we restricted the placement of grid points to primary and secondary auditory regions, in particular along the superior temporal gyrus (STG) for the left and right hemisphere, respectively (7304 grid points per hemisphere, 14,608 in total). These areas are depicted in Fig. 1A. The
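The source projection step rests on the standard LCMV weight formula from Van Veen et al. (1997): for a sensor covariance C and a leadfield l of one grid point, the spatial filter is w = C⁻¹l / (lᵀC⁻¹l), which passes activity from that location with unit gain while minimizing output variance. A NumPy sketch with a toy covariance and leadfield (random stand-ins, not data from this study; real pipelines also regularize the covariance):

```python
import numpy as np

def lcmv_weights(cov, leadfield):
    """LCMV spatial filter for a single source orientation:
        w = C^-1 l / (l^T C^-1 l)
    cov: (n_sensors, n_sensors) sensor covariance
    leadfield: (n_sensors,) forward model of one grid point"""
    cinv_l = np.linalg.solve(cov, leadfield)
    return cinv_l / (leadfield @ cinv_l)

# toy example: well-conditioned random covariance and leadfield
rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
cov = A @ A.T + n * np.eye(n)
l = rng.standard_normal(n)
w = lcmv_weights(cov, l)
# unit-gain constraint: w passes the modeled source with gain 1
```

Applying `w` to the sensor time series yields the voxel time series on which the further analysis is performed.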
Understanding what is said and recognising the identity of the talker are two important tasks that the brain faces in human communication. For a long time, neuroscientific models of speech and voice processing have focused mostly on auditory language- and voice-sensitive cerebral cortex regions to explain speech and voice recognition. However, our research has shown that the brain uses even more complex processing strategies for recognising auditory communication signals, such as the recruitment of dedicated visual face areas for auditory processing. In my talk I will give a brief introduction to this work and show how the multisensory influences on auditory processing can be harnessed to improve auditory learning.
Accurate speech intelligibility predictions are of great importance for various practical applications, such as the objective evaluation of different signal processing strategies in hearing assistive devices, or room acoustic design. If appropriately adjusted, they can also be used as a tool to assess the effect of different aspects of hearing impairment on speech intelligibility in an objective way. Several models have been proposed to predict speech intelligibility in noise, including static measures (articulation index / speech intelligibility index (SII); Kryter, 1962; ANSI, 1970, 1997), temporal measures (speech transmission index; Steeneken and Houtgast, 1980), measures based on automatic speech recognition (Cooke, 2006), and measures taking psychoacoustical or physiological aspects of the auditory periphery into account (Holube and Kollmeier, 1996; Stadler et al., 2007; Jürgens and Brand, 2009; Jørgensen et al., 2013). These models usually make strong assumptions about the relation between the (time-dependent) signal-to-noise ratio (SNR) in frequency bands and the resulting speech intelligibility, or about the processing steps in the auditory system and the auditory pattern recognition process. Any deviation between predictions and real human data can usually be attributed to the failure of one or several of these assumptions. Since the accuracy of predicting speech intelligibility even for listeners with normal hearing in different listening conditions (such as different types of background noise) is very limited (e.g., Hochmuth et al., 2015), the validity and relevance of these assumptions are unclear.
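The kind of strong assumption these models make can be illustrated with a toy SII-style computation: the SNR in each band is clipped to ±15 dB, mapped linearly to an audibility value in [0, 1], and combined with band-importance weights. The three bands and the weights below are illustrative, not the ANSI band-importance values:

```python
def sii_like_index(band_snr_db, band_weights):
    """Toy SII-style intelligibility index: clip each band SNR to
    [-15, +15] dB, map linearly to [0, 1] audibility, and average
    with band-importance weights that sum to 1 (illustrative only)."""
    assert abs(sum(band_weights) - 1.0) < 1e-9
    total = 0.0
    for snr, w in zip(band_snr_db, band_weights):
        snr = max(-15.0, min(15.0, snr))      # clip to the usable range
        total += w * (snr + 15.0) / 30.0      # audibility in [0, 1]
    return total

# three bands: speech at noise level, well above, and fully masked
index = sii_like_index([0.0, 15.0, -15.0], [0.25, 0.5, 0.25])
```

The fixed clipping range and the linear SNR-to-audibility mapping are exactly the sort of assumptions whose failure can explain deviations from measured intelligibility.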
many years. Notwithstanding these findings, to date no animal model reflecting the brainstem changes that are observed in AD has been described. Only sparse information about neurofibrillary lesions in the brainstem of tau transgenic mice is available [22,28,73,78,89]. Preliminary pilot studies revealed early stages of tau pathology in selected brainstem areas in P301L tau transgenic pR5 mice. Immunohistochemical examinations of 18-month-old animals showed both the expression of human tau protein and the existence of pretangles in the spinal trigeminal nucleus, the vestibular nuclei, and the gigantocellular, parvocellular and intermediate nuclei of the reticular formation (dissertation Bauer, 2011; unpublished data). Very few neurons containing soluble hyperphosphorylated tau were observed in the cholinergic parabrachial nucleus, whereas the locus coeruleus and the dorsal raphe nucleus were devoid of human tau protein and pretangle material. Therefore, the objective of the present study was to investigate the degree and localization of neurofibrillary lesions throughout the brainstem of senescent P301L tau transgenic pR5 mice. The stages of tau hyperphosphorylation were assessed by immunohistochemistry using three different phosphorylation-dependent antibodies as well as silver impregnation methods. Furthermore, the distribution of cholinergic, catecholaminergic and serotonergic neurons throughout the brainstem of these animals was explored, in order to address the following questions:
Functional imaging of the brainstem may open new avenues for clinical diagnostics. However, for reliable assessment of brainstem activation, further efforts to improve signal quality are needed. Six healthy subjects performed four repeated functional magnetic resonance imaging (fMRI) sessions on different days, with jaw clenching as a motor task to elicit activation in the trigeminal motor nucleus. Functional images were acquired with a 7 T MR scanner using an optimized multiband EPI sequence. Activation measures in the trigeminal nucleus and a control region were assessed using different physiological noise correction methods (aCompCor and RETROICOR-based approaches with variable numbers of regressors) combined with cerebrospinal fluid or brainstem masking. Receiver-operating-characteristic analyses accounting for sensitivity and specificity, activation-overlap analyses to estimate the reproducibility between sessions, and intraclass correlation (ICC) analyses testing reliability between subjects and sessions were used to systematically compare the physiological noise correction approaches. Masking the brainstem led to increased activation in the target ROI and resulted in higher values for the area under the curve (AUC) as a combined measure of sensitivity and specificity. With the highest values for AUC, activation overlap, and ICC, the most favorable physiological noise correction method was to control for the cerebrospinal fluid time series (aCompCor with one regressor). Brainstem motor nuclei activation can be reliably identified using high-field fMRI with optimized acquisition and processing strategies, even at the single-subject level. Applying specific physiological noise correction methods improves the reproducibility and reliability of brainstem activation, encouraging future clinical applications.
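The AUC used here as a combined measure of sensitivity and specificity is equivalent to the Mann-Whitney probability that an activation score drawn from the target region exceeds one drawn from the control region (ties counting half). A small sketch with made-up scores, not data from the study:

```python
def auc_from_scores(target_scores, control_scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a random target score exceeds a random
    control score, with ties counting 0.5."""
    wins = 0.0
    for t in target_scores:
        for c in control_scores:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(target_scores) * len(control_scores))

# hypothetical activation scores for one correction method
auc = auc_from_scores([3.0, 2.5, 4.1], [1.0, 2.6, 0.5])
```

An AUC of 0.5 means the method separates target and control regions no better than chance; 1.0 means perfect separation.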
Participants were able to handle the procedure, and, in particular, the high number of report alternatives, as evidenced by their reports being systematic and plausible. Specifically, their reports were in line with findings on two-tone sequences: spatial as well as frequency separation raised the proportion of stream segregation and reduced the proportion of the integrated percept (1,3). In addition, we found that tones of intermediate feature values (the B tones) were rarely perceived in the foreground, which is in line with findings from objective measurements showing difficulties in focusing on inner subsequences (11). Surprisingly, it was quite common for participants to report perceptual organization alternatives in which two streams were in the foreground at once, e.g., A tones and C tones segregated into two streams but both equally in the foreground. One of the most frequently reported organization alternatives had two streams in the foreground, namely the percept with B and C tones integrated into one auditory stream and the A tones segregated into another stream, but with both streams perceived in the foreground. While studies using task-based procedures indicate that it is possible but difficult to pay attention to more than one stream (12,13), our results show that there are circumstances or tasks in which the auditory system tracks multiple foreground streams in parallel without being forced to do so.
partial processes in the organism and their coordination. Especially in times of system adaptation, they are characterized by transients, phase transitions, or persistent nonstationarities. These properties can be investigated by analytical methods that either do not require stationarity or permit nonstationarities to be tracked. Time-resolving procedures are the appropriate tools, since they characterize time series not by static quantities but by time-dependent functions. In this paper, several algorithms for multiple time series analysis based on the theory of dynamic systems, statistics, and time-frequency analysis are applied to biosignals representing the complex couplings of physiological subsystems. In physiological experiments, neuronal activities of the reticular formation (RF) of the lower brainstem and of the nucleus tractus solitarii (NTS) were recorded together with related parameters of the EEG, respiration, and the cardiovascular system.
Two subjects did not show any contingency effects in reaction times and/or amplitudes, but they did show a significant interaction effect of ISC and catch recency at SOA = −210 ms. SRTs in this SOA condition were generally almost as long as when the auditory stimulus was presented simultaneously with or after the visual target. We therefore conclude that the longer SRTs at SOA = −210 ms represent a specific effect of the auditory catch trials, as subjects had to wait for the occurrence of the visual target to decide whether they were actually allowed to respond in that trial. According to their informal self-report given at the end of the sessions, these two subjects tried to ignore the auditory stimulus as much as possible in order to make the task easier. This strategy might explain why they did not show any contingency effects except in trials with SOA = −210 ms following a catch trial. Only in a situation in which they had just been reminded to take notice of the auditory stimulus in order to correctly withhold the response in a catch trial, and in which they had to wait for the visual stimulus to be presented, did these subjects actually make use of the contingency information contained in the auditory stimulus. In other words, it seems that our task left the subjects a certain margin in considering the auditory stimulus. Only when they had almost no choice but to act on it did they use the contingency information the auditory stimulus contained.
From cognitive psychology, within the framework of Auditory Scene Analysis, it is now known that there are distinct levels of auditory perception, which depend on the spatial information that is available about a sound source (Sonnenschein 2011, Bregman 1990, et al.). If, for example, a signal is recognized as separate but the listener is unable to determine its position or origin, this is mere discrimination. If the sound can at least be assigned to a particular type or class of sound, the level of classification is reached. If the sound can be recognized or assigned to a known acoustic event, this already constitutes recognition. If the listener is able to assign the signal to a restricted set of potential sources in the environment, this is referred to as matching. The two highest levels of perception are identification and correspondence. Identification occurs as soon as the listener can reliably assign a signal to a source. If, in addition, this source is in the listener's visible surroundings and can be fully located, the level of correspondence is reached. The higher the level of perception attained, the more confidently the listener can deal with the corresponding acoustic event. Sonnenschein describes this relationship with an experience of his own:
The results of the auditory localization experiment reveal a wide range of strategies and abilities to handle the task, which can be observed in the eye-movement trajectories as well as in their latencies. Some participants allow themselves quite a long time to perform an eye movement but are very accurate in their judgments, while others respond very fast and nevertheless correctly (if successive movements are also taken into account). There are also some listeners who, despite their comparatively slow reactions, frequently fail to localize correctly. Although most location estimation errors are made with respect to elevation, no significant effects of vertical stimulus eccentricity on saccadic reaction times can be found. Effects of laterality, which are very common, appear only in some cases. All in all, inter- and intra-subject variabilities of SRTs are quite large, and SRTs are not related to localization performance. Apparently, auditory guided saccades are not as stereotyped as their visual counterparts: independent of a participant's localization ability, they even differ within one listener depending on what kind of auditory environment was chosen (see PN). A qualitative analysis of trajectory and position time traces reveals that, unlike the straight visually evoked saccades, auditory guided eye movements often consist of several successive "miniature saccades". Auditory eye movements seem to be divisible into a first (quite fast) response, which includes the judged azimuth eccentricity and a rough guess of the elevation component, and one or more "corrective steps" starting some tens of milliseconds later, with one or more additional vertical movements completing the saccadic response. The initial vertical trend, however, seems to be strategy-dependent. In most participants, there is only a weak elevation component in the first movement, as if elevation estimation were nearly left out at that time.
In some other participants, the initial movement to elevated stimuli is dominated by a pronounced vertical drift, which would lead to a strong over-estimation of vertical eccentricity if continued. This might be interpreted as a primary, general judgement of whether the stimulus was elevated or not. Although the pattern of auditory guided eye movements is very individual, it turns out to be quite replicable within each participant and condition. Even if there were significant errors in location estimation (e.g., the free-field stimuli from the horizontal plane were frequently judged to come from below,
Objectives: We compared the spatial concepts given to sound directions by blind football players with those of both blind non-athletes and sighted individuals under two response actions. Method: While standing blindfolded in the middle of a circle of 16 loudspeakers, participants were presented with acoustic stimuli from the loudspeakers in randomized order and asked to verbally describe their directions using the concept labels "front", "back", "left", "right", "front-right", "front-left", "back-right" and "back-left". Response actions were varied in two blocked conditions: 1) facing front; 2) turning the head and upper body to face the stimulus, plus pointing with the hand and outstretched arm towards the stimulus. Results: Blind football players categorized the directions more precisely than the other groups, and their categorization was less sensitive to the response conditions than that of blind non-athletes. Sighted participants' categorization was similar to that in previous studies, in which the front and back regions were generally described more precisely. Conclusions: The differences in the conceptual categorization of sound directions found in this study are a) in sighted individuals, influenced by the representation of visual space, and b) in blind individuals, influenced by the level of expertise in action and locomotion based on non-visual information, as it can be acquired through the increased auditory stimulation provided by blind football training.

Keywords: blind football, blind, auditory space, spatial cognition, spatial concepts
In O’Sullivan and Guilford’s (1965) tests, auditory abilities are measured with three subtests (for details beyond the information provided here, see O’Sullivan & Guilford, 1965). In the subtest “inflections” (behavioral units), one of four drawn facial expressions has to be chosen that expresses the same feeling as a tape-recorded vocal inflection. For test construction, six neutral phrases were selected (yes, mother, I did it, well, really, that’s good). Three actors produced a variety of inflections for each of the six phrases. Five inflections of each phrase were selected (5 x 6 = 30 items). Half of the inflections and pictures are male, half female. In the subtest “sound meaning” (behavioral classes), three classes of emotive sounds were formed, produced by a man and a woman (non-actors). For each emotive sound, the appropriate class has to be chosen. For example, for the three sounds heavy breathing, startled laugh, and whimper, the appropriate class to select would be “fear”. In the subtest “reflections” (behavioral implications), subjects have to choose the one of four alternative interpretations that correctly reflects the feeling of a tape-recorded statement. The material (statements) for this test stems from published reports of therapy cases. An example: Which alternative statement expresses the attitude or feeling underlying the given statement? “I’m just wondering how I’ll act – I mean how things will turn out.” Alternatives: a) She’s looking forward to it. b) She’s worried about it. c) She’s interested in how things will work out. With respect to psychometrics, the auditory tests “inflections” and “sound meaning” have the lowest reliabilities within the battery (see also Table 2-2). O’Sullivan and Guilford expected the reason to lie in low agreement among subjects about a single auditory stimulus (e.g., Kramer, 1963), but agreement was in fact good. Test intercorrelations were between .14 and .25