Frequency compression (or frequency lowering) is a general term applied to attempts to lower the spectrum of the acoustic speech signal to better match the residual hearing of listeners with severe to profound high-frequency sensorineural impairment accompanied by better hearing at the low frequencies. This pattern of hearing loss is common to a number of different etiologies of hearing loss (including presbycusis, noise exposure, ototoxicity, and various genetic syndromes) and arises from greater damage to the basal region relative to the apical region of the cochlea (see noise-induced hearing loss; ototoxic medications; presbyacusis). The major effect of high-frequency hearing loss on speech reception is a degraded ability to perceive sounds whose spectral energy is dominated by high frequencies, in some cases extending to 10 kHz or beyond. Perceptual studies have documented the difficulties of listeners with high-frequency loss in the reception of high-frequency sounds (including plosive, fricative, and affricate consonants) and have demonstrated that this pattern of confusion is similar to that observed in normal-hearing listeners deprived of high-frequency cues through the use of low-pass filtering (Wang, Reed, and Bilger, 1978). Traditional hearing aids attempt to treat this pattern of hearing loss by delivering frequency-dependent amplification to overcome the loss at high frequencies. Such amplification, however, may not lead to improved performance and has even been shown to degrade the speech reception ability of some listeners with severe to profound high-frequency loss (Hogan and Turner, 1998).
The goal of frequency lowering is to recode the high-frequency components of speech into a lower frequency range that is matched to the residual capacity of a listener's hearing. Frequency lowering has been accomplished through a variety of different techniques. These methods have arisen primarily from attempts at bandwidth reduction in the telecommunications industry, rather than being driven by the perceptual needs of hearing-impaired listeners. This article summarizes and updates the review of the literature on frequency lowering published by Braida et al. (1979). For each of seven different categories of signal processing, the major characteristics of each method are described and a brief summary of results obtained with it is provided.
The earliest approach to frequency lowering was the playback of recorded speech at a slower speed than that used in recording. Each spectral component is scaled lower in frequency by a multiplicative factor equal to the slowdown factor. Although this method is not suitable for real-time applications (because of the inherent time dilation of the resulting waveform) and leads to severe alterations in temporal relations between speech sounds, it is nonetheless important to understand its effects because it is a component of many other frequency-lowering schemes. An important characteristic of this method is its preservation of the proportional relationship between spectral components, including the relation between the short-term spectral envelope and the fundamental frequency (F0) of voiced speech. A negative consequence of proportional lowering, however, is the shifting of F0 into an undesirably low frequency range (particularly for male voices). Results obtained in studies of the effects of slow playback on speech reception conducted with normal-hearing listeners (Tiffany and Bennett, 1961; Daniloff, Shriner, and Zemlin, 1968) indicate that reductions in bandwidth up to roughly 25% produce only small losses in intelligibility, bandwidth reductions of 50% cause moderate losses in intelligibility, and bandwidth reductions of 66% or greater lead to severe loss in intelligibility. These studies have shown that the voices of females and children are more resistant to lowering than male voices (presumably because the fundamental and formant frequencies are higher for women than for men), that the effects of lowering are similar for the reception of consonants and vowels, and that performance with lowered speech materials improves with practice.
In a study of slow playback in listeners with high-frequency sensorineural hearing loss, Bennett and Byers (1967) found beneficial effects for modest degrees of frequency lowering (up to a 20% reduction in bandwidth) but that greater degrees of lowering led to a substantial reduction in performance.
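The proportional scaling produced by slow playback can be illustrated with a short numerical sketch. It assumes that a fractional bandwidth reduction r corresponds to a slowdown factor k = 1/(1 - r), so that every component at frequency f moves to f/k while the duration grows by k. The function names and the component values (a 120-Hz F0 with three formant-like peaks) are hypothetical and are not taken from the studies cited above.

```python
def slowdown_factor(bandwidth_reduction):
    """Slowdown factor k that yields a given fractional bandwidth reduction.

    A reduction of 0.5 (50%) means every frequency is halved, so k = 2.
    """
    if not 0.0 <= bandwidth_reduction < 1.0:
        raise ValueError("bandwidth_reduction must be in [0, 1)")
    return 1.0 / (1.0 - bandwidth_reduction)


def slow_playback(components_hz, duration_s, k):
    """Return (scaled components, dilated duration) for slowdown factor k."""
    return [f / k for f in components_hz], duration_s * k


# Illustrative values only: F0 and three formant-like peaks of a male vowel.
components = [120.0, 500.0, 1500.0, 2500.0]
k = slowdown_factor(0.5)
lowered, new_duration = slow_playback(components, duration_s=1.0, k=k)

print(lowered)       # [60.0, 250.0, 750.0, 1250.0]
print(new_duration)  # 2.0
```

The ratios between components are unchanged, but F0 falls from 120 Hz to 60 Hz and the utterance takes twice as long, which are the two drawbacks noted in the text.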
A solution to the time dilation inherent to slow playback was introduced by techniques that compress speech in time (Fairbanks, Everitt, and Jaeger, 1954) prior to the application of slow playback. Time compression can be accomplished in different ways, including the elimination of a fixed duration of speech at a given rate of interruption or the elimination of pitch periods from voiced speech. When the time-compression and slow-playback factors are chosen to be equal, the long-term duration of the speech signal can be preserved while at the same time frequencies are lowered proportionally. Fundamental frequency can be affected differently, depending on the particular characteristics of the time-compression scheme, including being lowered, remaining unchanged, or being severely distorted (see Braida et al., 1979). The intelligibility of speech processed by this technique for normal-hearing listeners is similar to that described above for slow playback; that is, bandwidth reductions by factors greater than 20% lead to a severe decrease in performance (Daniloff, Shriner, and Zemlin, 1968; Nagafuchi, 1976). Results of studies in hearing-impaired listeners (Mazor et al., 1977; Turner and Hurtig, 1999) indicate that improvements for time-compressed slow-playback speech compared to conventional linear or high-pass amplification may be observed under certain conditions. Small benefits, on the order of 10-20 percentage points, are most likely to be observed for small amounts of frequency lowering (bandwidth reduction factors in the range of 10%-30%), for female rather than male voices, and for individuals who receive little aid from conventional high-pass amplification.
A wearable aid that operates on the basic principles of time-compressed slow playback (the AVR Transonic device) has been evaluated in children with profound deafness (Davis-Penn and Ross, 1993; MacArdle et al., 2001) and in adults with high-frequency impairment (Parent, Chmiel, and Jerger, 1997; McDermott et al., 1999). A high degree of variability is observed across studies and across subjects within a given study, with substantial improvements noted for certain subjects and negligible effects or degradations for others.
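A minimal sketch of the principle behind time-compressed slow playback is given below, using a pure tone in place of speech. It assumes the simplest compression scheme mentioned above (discarding a fixed portion of every frame) with equal compression and slowdown factors of 2. The sampling rate, tone frequency, frame length, and function names are all hypothetical choices, and the frame length is deliberately a whole number of tone periods so that the splices are seamless; real speech would require pitch-synchronous or smoothed splicing.

```python
import math


def time_compress(samples, frame_len, keep):
    """Keep the first `keep` samples of every `frame_len`-sample frame."""
    out = []
    for start in range(0, len(samples), frame_len):
        out.extend(samples[start:start + keep])
    return out


def zero_crossing_frequency(samples, sample_rate):
    """Rough frequency estimate: half the number of sign changes per second."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0.0) != (b < 0.0)
    )
    return crossings / (2.0 * len(samples) / sample_rate)


fs = 8000                      # recording rate in Hz (assumed)
tone_hz = 1000.0               # test tone standing in for a speech component
original = [math.sin(2 * math.pi * tone_hz * n / fs + 0.3) for n in range(fs)]

# Discard half of every 20-ms frame, then play back at half the rate.
compressed = time_compress(original, frame_len=160, keep=80)
playback_rate = fs * 80 // 160

duration_in = len(original) / fs
duration_out = len(compressed) / playback_rate
lowered_hz = zero_crossing_frequency(compressed, playback_rate)

print(duration_in, duration_out)  # both 1.0: overall duration is preserved
print(round(lowered_hz))          # close to 500: the tone is lowered by half
```

Because the two factors are equal, the output lasts as long as the input while the tone falls from 1000 Hz to about 500 Hz.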
Another technique for frequency lowering employs heterodyne processing, which uses amplitude modulation to shift all frequency components in a given band downward by a fixed displacement. This process leads to the overlaying, or aliasing, of high-frequency and low-frequency components. Aliasing is generally avoided by the removal of low-frequency components through filtering before modulation. Systems that employ shifting of the entire spectrum have a number of disadvantages: although temporal and rhythmic patterns of speech remain normal, the harmonic relationships of voiced sounds are greatly altered, fundamental frequency is severely modified, and low-frequency components important to speech recognition are removed to prevent aliasing. Even mild degrees of frequency shifting (e.g., a 400-Hz shift for male voices) have been found to interfere substantially with speech reception ability (Raymond and Proud, 1962).
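The loss of harmonic structure under a fixed shift can be seen from simple arithmetic. The sketch below treats voiced speech as a list of harmonic frequencies, removes the components below the shift (standing in for the filtering that precedes modulation), and subtracts a fixed 400 Hz from the rest. The 120-Hz F0, the ten-harmonic spectrum, and the function name are illustrative assumptions.

```python
def heterodyne_shift(components_hz, shift_hz):
    """Shift components down by a fixed amount, dropping those below the shift.

    Components below `shift_hz` are removed first, mimicking the pre-filtering
    used to avoid overlaying them on the shifted band.
    """
    return [f - shift_hz for f in components_hz if f >= shift_hz]


f0 = 120.0
harmonics = [f0 * k for k in range(1, 11)]      # 120, 240, ..., 1200 Hz
shifted = heterodyne_shift(harmonics, 400.0)    # the 400-Hz example shift

spacing = [b - a for a, b in zip(shifted, shifted[1:])]
is_harmonic = all(f % f0 == 0.0 for f in shifted)

print(shifted)      # [80.0, 200.0, 320.0, 440.0, 560.0, 680.0, 800.0]
print(spacing)      # still 120 Hz apart
print(is_harmonic)  # False: no longer integer multiples of the original F0
```

The shifted components remain 120 Hz apart but are no longer multiples of 120 Hz, and the three lowest harmonics have been discarded, which corresponds to the disadvantages listed above.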
When frequency shifting is restricted to the high-frequency components of speech (rather than to the entire speech spectrum), the process is referred to as frequency transposition. This approach has been incorporated into several different wearable or desktop aids (Johansson, 1966; Velmans, 1973) whose basic operation involves shifting speech frequencies in the region above 3 or 4 kHz into a lower frequency region, and adding these processed components to the original unprocessed speech signal. Generally, the most careful and controlled studies of frequency transposition indicate that benefits are quite modest. Transposition can render high-frequency speech cues audible to listeners with severe to profound high-frequency hearing loss (Rees and Velmans, 1993); however, these cues may interfere with information normally associated with the reception of low-frequency speech components (Ling, 1968). There is evidence to suggest that transposition aids may be more useful in training in speech production than in improving speech reception in deaf children (Ling, 1968).
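The basic operation of a transposition aid can be sketched at the level of spectral components. The example below shifts only the components above a cutoff and adds them to the unprocessed signal. The 4-kHz cutoff follows the range given in the text, while the 3800-Hz shift, the component amplitudes, and the function name are hypothetical values chosen for illustration.

```python
def transpose(components, cutoff_hz, shift_hz):
    """Add downward-shifted copies of components above `cutoff_hz`.

    `components` is a list of (frequency_hz, amplitude) pairs. The original
    components are passed through unchanged, as in the aids described above.
    """
    shifted = [(f - shift_hz, a) for f, a in components if f > cutoff_hz]
    return sorted(components + shifted)


# Illustrative spectrum: a vowel-like low band plus fricative-like energy.
speech = [(250.0, 1.0), (900.0, 0.6), (4500.0, 0.3), (6000.0, 0.2)]
output = transpose(speech, cutoff_hz=4000.0, shift_hz=3800.0)

for frequency_hz, amplitude in output:
    print(frequency_hz, amplitude)
```

In this example the transposed 4500-Hz component lands at 700 Hz, close to an existing 900-Hz component, which suggests how transposed cues might interfere with low-frequency information.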
Another approach to frequency lowering lies in an attempt to reduce the zero-crossing rate of the speech signal. In these schemes, bands of speech are extracted by filtering and the filter outputs are converted to lower frequency sounds having reduced zero-crossing rates. Evaluations of a system in which processing was applied only to high-frequency components and inhibited during voiced speech (Guttman and Nelson, 1968) indicated no benefits for processed materials on a large-set word recognition task for normal-hearing listeners with simulated hearing loss. Use of this system as a speech-production aid for hearing-impaired children indicates that, following extensive training, the ability to produce selected high-frequency sounds was improved, while at the same time the ability to discriminate these same words auditorily showed no such improvements (Guttman, Levitt, and Bellefleur, 1970).
An important class of frequency-lowering systems for the hearing-impaired is based on the channel vocoder (Dudley, 1939), which was originally developed to achieve bandwidth reduction in telecommunications systems. Vocoding systems analyze speech into contiguous bandpass filters whose output envelopes are detected and low-pass-filtered for transmission. These signals are then used to control the amplitudes of corresponding channels. For frequency lowering, the set of synthesis filters corresponds to lower frequencies than the associated analysis filters. Vocoding systems appear to have a number of advantages, including operation in real time and flexibility in terms of the choice of analysis and synthesis filters (which can allow for different degrees of lowering in different regions of the spectrum as well as for independent manipulation of F0 and the spectral envelope). The effect of degree of lowering in vocoder-based systems appears to be comparable to that described above for slow playback and time-compressed slow playback (Fu and Shannon, 1999). A number of studies conducted with vocoder-based lowering systems have demonstrated improved speech reception with training, both for normal-hearing (Takefuta and Swigart, 1968; Posen, Reed, and Braida, 1993) and for hearing-impaired listeners (Ling and Druz, 1967; McDermott and Dean, 2000). When performance with vocoding systems is compared to baseline systems employing low-pass filtering to an equivalent bandwidth for normal listeners or conventional amplification for impaired listeners, however, the benefits of lowered speech appear to be quite modest. One possible reason for the lack of success of some of these systems (despite the apparent promise of this approach) may have been the failure to distinguish between voiced and unvoiced sounds.
Systems in which processing is suppressed when the input signal is dominated by low-frequency energy (Posen, Reed, and Braida, 1993) lead to better performance (compared to systems with no inhibitions in processing for voiced sounds) based on their ability to enhance the reception of high-frequency sounds while not degrading the reception of low-frequency sounds.
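The channel mapping and the suppression of processing for frames dominated by low-frequency energy can be sketched for a single analysis frame. The example below assumes that envelope levels for the high-frequency analysis bands have already been measured, and it assigns each level to a synthesis channel at a lower center frequency. The gating rule (comparing the low-band level with the summed high-band levels), the channel frequencies, and the function name are hypothetical; the cited systems may use different criteria.

```python
def lower_frame(band_levels, synthesis_centers_hz, low_band_level):
    """Map analysis-band envelope levels onto lower synthesis channels.

    band_levels          : envelope level of each high-frequency analysis band
    synthesis_centers_hz : center frequency of the matching synthesis channel
    low_band_level       : level of the unprocessed low-frequency band

    Returns a list of (frequency_hz, level) pairs to add to the low band,
    or an empty list when low-frequency energy dominates (processing gated).
    """
    if len(band_levels) != len(synthesis_centers_hz):
        raise ValueError("one synthesis channel is needed per analysis band")
    if low_band_level > sum(band_levels):
        return []  # voiced-like frame: leave the low band untouched
    return list(zip(synthesis_centers_hz, band_levels))


# Hypothetical synthesis channels for four high-frequency analysis bands.
centers = [500.0, 700.0, 900.0, 1100.0]

fricative = lower_frame([0.1, 0.3, 0.8, 0.6], centers, low_band_level=0.2)
vowel = lower_frame([0.2, 0.1, 0.05, 0.02], centers, low_band_level=1.0)

print(fricative)  # lowered channels carry the high-band envelope levels
print(vowel)      # []: processing is suppressed
```

Gating of this kind leaves vowel-like frames unaltered while still recoding fricative-like frames into the low-frequency region.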
A more recent approach to frequency lowering incorporates digital signal-processing techniques developed for correcting "helium speech." The speech signal is segmented pitch synchronously, processed to achieve nonuniform spectral warping, dilated in time to achieve frequency lowering, and resynthesized with the original periodicity. Both the overall bandwidth reduction and the relative compression of high- and low-frequency components can be specified. These methods roughly extrapolate the variance associated with increased length of the vocal tract and include the following characteristics: they preserve the temporal and rhythmic properties of speech, they leave F0 of voiced sounds unaltered, they allow for independent manipulation of F0 and spectral envelope, and they compress the short-term spectrum in a continuous and monotonic fashion. Studies of speech reception with frequency-warped speech indicate that spectral transformations that lead to greater lowering of the high frequencies relative to the low frequencies are superior to those with linear lowering or with greater lowering of low relative to high frequencies (Allen, Strong, and Palmer, 1981; Reed et al., 1983). Improvements in the ability to identify frequency-warped speech with training have been noted for normal and hearing-impaired listeners (Reed et al., 1985). Improved ability to discriminate and identify high-frequency consonants has been demonstrated with such warping transformations compared to low-pass filtering for substantial reductions in bandwidth (up to a factor of 4 or 5). Overall performance, however, is similar for lowering and low-pass filtering, owing to reduced performance for the lowering schemes on sounds with substantial low-frequency energy.
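A continuous, monotonic warping that lowers high frequencies more than low frequencies can be illustrated with a simple frequency map. The piecewise-linear function below is a hypothetical stand-in, not the transformation used in the cited studies: it leaves frequencies below an assumed 1-kHz knee unchanged and compresses the region above it by an assumed ratio of 7, which maps 8 kHz to 2 kHz (an overall bandwidth reduction by a factor of 4).

```python
def warp(frequency_hz, knee_hz=1000.0, ratio=7.0):
    """Piecewise-linear warp: identity below the knee, compressed above it."""
    if frequency_hz <= knee_hz:
        return frequency_hz
    return knee_hz + (frequency_hz - knee_hz) / ratio


inputs = [250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0]
outputs = [warp(f) for f in inputs]

# The map must preserve the ordering of spectral components.
monotonic = all(a < b for a, b in zip(outputs, outputs[1:]))
overall_factor = inputs[-1] / outputs[-1]

for f_in, f_out in zip(inputs, outputs):
    print(f_in, round(f_out, 1))
print(monotonic, overall_factor)  # True 4.0
```

Changing the knee and the ratio independently corresponds to specifying the overall bandwidth reduction and the relative compression of high- and low-frequency regions separately.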
Attempts at frequency lowering through a variety of different methods have met with only limited success. Frequency lowering leads to a reduction in bandwidth of the original speech signal and to the creation of new speech codes which may sound unnatural to the untrained ear. Evidence from a number of different studies indicates that performance on frequency-lowered speech can improve with familiarization and training in the use of frequency-lowered speech. Many of these same studies, however, also indicate that even after extended practice, performance with the coded speech signals does not exceed that achieved with appropriate baseline conditions (e.g., speech filtered to an equivalent bandwidth in normal-hearing listeners or conventional amplification with appropriate frequency gain characteristics in hearing-impaired listeners). Although frequency-lowering techniques can lead to large improvements in the reception of high-frequency sounds, they may at the same time lead to detrimental effects on the reception of vowels and consonants whose spectral energy is concentrated at low frequencies. Because of the need to use the region of low-frequency residual hearing for recoding high-frequency sounds, the low-frequency components of speech may be altered as well through the overlaying of coded signals onto the original unprocessed speech or through wholesale lowering of the entire speech signal. In listeners with high-frequency impairment accompanied by good residual hearing in the low frequencies, benefits for frequency lowering have been observed for listeners with severe to profound high-frequency loss using mild degrees of lowering (no greater than 30% reduction in bandwidth). For children with profound deafness (whose residual low-frequency hearing may be quite limited), frequency lowering appears to be more effective as a speech production training aid for specific groups of phonemes rather than as a speech perception aid.
—Charlotte M. Reed and Louis D. Braida

References
Allen, D. R., Strong, W. J., and Palmer, E. P. (1981). Experiments on the intelligibility of low-frequency speech codes. Journal of the Acoustical Society of America, 70, 1248-1255.
Bennett, D. N., and Byers, V. W. (1967). Increased intelligibility in the hypacusic by slow-play frequency transposition. Journal of Auditory Research, 7, 107-118.
Braida, L. D., Durlach, N. I., Lippmann, R. P., Hicks, B. L., Rabinowitz, W. M., and Reed, C. M. (1979). Hearing aids: A review of past research on linear amplification, amplitude compression, and frequency lowering (ASHA Monograph No. 19). Rockville, MD: American Speech and Hearing Association.
Daniloff, R. G., Shriner, T. H., and Zemlin, W. R. (1968). Intelligibility of vowels altered in duration and frequency. Journal of the Acoustical Society of America, 44, 700-707.
Davis-Penn, W., and Ross, M. (1993). Pediatric experiences with frequency transposing. Hearing Instruments, 44, 26-32.
Dudley, H. (1939). Remaking speech. Journal of the Acoustical Society of America, 11, 169-177.
Fairbanks, G., Everitt, W. L., and Jaeger, R. P. (1954). Method for time or frequency compression-expansion of speech. IRE Transactions on Audio, AU-2, 7-12.
Fu, Q. J., and Shannon, R. V. (1999). Recognition of spectrally degraded and frequency-shifted vowels in acoustic and electric hearing. Journal of the Acoustical Society of America, 105, 1889-1900.
Guttman, N., Levitt, H., and Bellefleur, P. (1970). Articulation training of the deaf using low-frequency surrogate fricatives. Journal of Speech and Hearing Research, 13, 19-29.
Guttman, N., and Nelson, J. R. (1968). An instrument that creates some artificial speech spectra for the severely hard of hearing. American Annals of the Deaf, 113, 295-302.
Hogan, C. A., and Turner, C. W. (1998). High-frequency audibility: Benefits for hearing-impaired listeners. Journal of the Acoustical Society of America, 104, 432-441.
Johansson, B. (1966). The use of the transposer for the management of the deaf child. International Audiology, 5, 362-373.
Ling, D. (1968). Three experiments on frequency transposition. American Annals of the Deaf, 113, 283-294.
Ling, D., and Druz, W. S. (1967). Transposition of high-frequency sounds by partial vocoding of the speech spectrum: Its use by deaf children. Journal of Auditory Research, 7, 133-144.
MacArdle, B. M., West, C., Bradley, J., Worth, S., Mackenzie, J., and Bellman, S. C. (2001). A study of the application of a frequency transposition hearing system in children. British Journal of Audiology, 35, 17-29.
Mazor, M., Simon, H., Scheinberg, J., and Levitt, H. (1977). Moderate frequency compression for the moderately hearing impaired. Journal of the Acoustical Society of America, 62, 1273-1278.
McDermott, H. J., and Dean, M. R. (2000). Speech perception with steeply sloping hearing loss: Effects of frequency transposition. British Journal of Audiology, 34, 353-361.
McDermott, H. J., Dorkos, V. P., Dean, M. R., and Ching, T. Y. C. (1999). Improvements in speech perception with use of the AVR TranSonic frequency-transposing hearing aid. Journal of Speech, Language, and Hearing Research, 42, 1323-1335.
Nagafuchi, M. (1976). Intelligibility of distorted speech sounds shifted in frequency and time in normal children. Audiology, 15, 326-337.
Parent, T. C., Chmiel, R., and Jerger, J. (1997). Comparison of performance with frequency transposition hearing aids and conventional hearing aids. Journal of the American Academy of Audiology, 8, 355-365.
Posen, M. P., Reed, C. M., and Braida, L. D. (1993). Intelligibility of frequency-lowered speech produced by a channel vocoder. Journal of Rehabilitation Research and Development, 30, 26-38.
Raymond, T. H., and Proud, G. O. (1962). Audiofrequency conversion. Archives of Otolaryngology, 76, 60-70.
Reed, C. M., Hicks, B. L., Braida, L. D., and Durlach, N. I. (1983). Discrimination of speech processed by low-pass filtering and pitch-invariant frequency lowering. Journal of the Acoustical Society of America, 74, 409-419.
Reed, C. M., Schultz, K. I., Braida, L. D., and Durlach, N. I. (1985). Discrimination and identification of frequency-lowered speech in listeners with high-frequency hearing impairment. Journal of the Acoustical Society of America, 78, 2139-2141.
Rees, R., and Velmans, M. (1993). The effect of frequency transposition on the untrained auditory discrimination of congenitally deaf children. British Journal of Audiology, 27, 53-60.
Takefuta, Y., and Swigart, E. (1968). Intelligibility of speech signals spectrally compressed by a sampling-synthesizer technique. IEEE Transactions on Audio and Electroacoustics, AU-16, 271-274.
Tiffany, W. R., and Bennett, D. N. (1961). Intelligibility of slow played speech. Journal of Speech and Hearing Research, 4, 248-258.
Turner, C. W., and Hurtig, R. R. (1999). Proportional frequency compression of speech for listeners with sensorineural hearing loss. Journal of the Acoustical Society of America, 106, 877-886.
Velmans, M. L. (1973). Speech imitation in simulated deafness, using visual cues and "recoded" auditory information. Language and Speech, 16, 224-236.
Wang, M. D., Reed, C. M., and Bilger, R. C. (1978). A comparison of the effects of filtering and sensorineural hearing loss on patterns of consonant confusions. Journal of Speech and Hearing Research, 21, 5-36.

Further Readings
Beasley, D. S., Mosher, N. L., and Orchik, D. J. (1976). Use of frequency shifted/time compressed speech with hearing-impaired children. Audiology, 15, 395-406.
Block, von R., and Boerger, G. (1980). Hörverbessernde Verfahren mit Bandbreitenkompression. Acustica, 45, 294-303.
Boothroyd, A., and Medwetsky, L. (1992). Spectral distribution of /s/ and the frequency-response of hearing aids. Ear and Hearing, 13, 150-157.
Ching, T. Y. C., Dillon, H., and Byrne, D. (1998). Speech recognition of hearing-impaired listeners: Predictions from audibility and the limited role of high-frequency amplification. Journal of the Acoustical Society of America, 103, 1128-1140.
David, E. E., and McDonald, H. S. (1956). Note on pitch-synchronous processing of speech. Journal of the Acoustical Society of America, 28, 1261-1266.
Denes, P. B. (1967). On the motor theory of speech perception. In W. W. Dunn (Ed.), Models for the perception of speech and visual form (pp. 309-314). Cambridge, MA: MIT Press.
Foust, K. O., and Gengel, R. W. (1973). Speech discrimination by sensorineural hearing impaired persons using a transposer hearing aid. Scandinavian Audiology, 2, 161-170.
Hicks, B. L., Braida, L. D., and Durlach, N. I. (1981). Pitch-invariant frequency lowering with nonuniform spectral compression. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 121-124). New York: IEEE.
Ling, D. (1972). Auditory discrimination of speech transposed by a sample-and-hold process. In G. Fant (Ed.), Proceedings of the International Symposium on Speech Communication and Profound Deafness, Stockholm (1970) (pp. 323-333). Washington, DC: A.G. Bell Association.
Oppenheim, A. V., and Johnson, D. H. (1972). Discrete representation of signals. Proceedings of the IEEE, 60, 681-691.
Reed, C. M., Power, M. H., Durlach, N. I., Braida, L. D., Foss, K. K., Reid, J. A., et al. (1991). Development and testing of low-frequency speech codes. Journal of Rehabilitation Research and Development, 28, 67-82.
Risberg, A. (1965). The transposer and a model for speech perception. STL-QPSR, 4, 26-30. Stockholm: KTH.
Rosenhouse, J. (1990). A new transposition device for the deaf. Hearing Journal, 43, 20-25.
Sher, A. E., and Owens, E. (1974). Consonant confusions associated with hearing loss above 2000 Hz. Journal of Speech and Hearing Research, 17, 669-681.
Zue, V. W. (1970). Translation of diver's speech using digital frequency warping. MIT Research Laboratory of Electronics Quarterly Progress Report, 101, 175-182.