Description
The narrowband frequency range of telephone speech signals, a legacy of former analog transmission techniques, still limits the acoustic quality of today's digital telephony systems. It makes phone calls sound muffled and reduces speech intelligibility and quality. Artificial speech bandwidth extension approaches estimate and reconstruct the missing frequency components. However, the artificially extended speech typically suffers from annoying artifacts. The fricatives /s/ and /z/ are particularly susceptible: they can hardly be estimated from the narrowband spectrum and are therefore easily confused with other phonemes and with speech pauses. This work exploits phonetic a priori knowledge to improve the performance of artificial bandwidth extension. Both the offline training stage, conducted in advance, and the main processing stage, performed later, are to be supplied with phoneme information. Because the training stage does not require online processing, phonetic a priori knowledge can be made available to it. Its availability during the later processing stage, however, depends on the online requirements of the particular application. This work addresses the two main application areas of artificial bandwidth extension. First, existing telephone speech databases are extended in bandwidth so that they can be used to train telephony-based wideband interactive voice response systems. Second, narrowband telephone speech services are artificially extended in bandwidth to enhance their intelligibility and quality. In both application areas, the developed artificial bandwidth extension approach demonstrates its effectiveness in comparison with the state of the art.