Prosodic tools for language learning

International Journal of Speech Technology

Abstract

This paper examines the role of prosody in language learning and the speech technology, available either as commercial products or as prototypes, that can help language learners improve their command of a second language from a prosodic point of view. The paper is divided into two sections: Section One deals with rhythm and related topics; Section Two deals with intonation. In the Introduction we argue that ASR (Automatic Speech Recognition) as a teaching aid should be used sparingly and targeted at narrowly focused spoken exercises, excluding open-ended dialogues, in order to ensure consistent evaluation. Finally, we argue for the combined use of ASR technology and prosodic tools to produce GOP (Goodness of Pronunciation) scores that can support linguistically consistent and adequate feedback to the student. We illustrate this by presenting the state of the art for both sections, drawing on systems that are well documented in the scientific literature of the respective fields.

To discuss the scientific foundations of prosodic analysis, we present data on English and Italian and compare the two languages to clarify the issues at hand. In this context, we also present the Prosodic Module of SLIM, a courseware package for computer-assisted foreign language learning developed at the University of Venice; the name is an acronym for Multimedia Interactive Linguistic Software (Delmonte et al. in Convegno GFS-AIA, pp. 47–58, 1996a; Ed-Media 96, AACE, pp. 326–333, 1996b). The Prosodic Module was created to improve a student’s performance in both the perception and the production of prosodic aspects of spoken language. It comprises two sets of Learning Activities: the first deals with phonetic and prosodic problems at the word and syllable levels; the second deals with prosodic aspects at the phonological phrase and utterance (suprasegmental) levels. The main goal of the Prosodic Activities is to give consistent and pedagogically sound feedback to students who want to improve their pronunciation in a foreign language.

References

  • Avesani, C. (1995). ToBIT: un sistema di trascrizione per l’intonazione italiana. In Atti delle 5 Giornate di Studio GFS, Povo (TN) (pp. 85–98).

  • Bacalu, C., & Delmonte, R. (1999). Prosodic modeling for syllable structures from the VESD—Venice English syllable database. In Atti 9° Convegno GFS-AIA, Venezia.

  • Bagshaw, P. (1994). Automatic prosodic analysis for computer aided pronunciation teaching. Unpublished PhD Dissertation, Univ. of Edinburgh, UK.

  • Bagshaw, P., Hiller, S., & Jack, M. (1993a). Computer aided intonation teaching. In Proceedings of Eurospeech ’93 (pp. 1003–1006).

  • Bagshaw, P. C., Hiller, S. M., & Jack, M. A. (1993b). Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching. In Proc. Eurospeech ’93, Berlin (pp. 1003–1006).

  • Bannert, R. (1987). From prominent syllables to a skeleton of meaning: a model of a prosodically guided speech recognition. In Proceedings of the XIth ICPhS (Vol. 2, p. 22.4).

  • Batliner, A., Kompe, R., Kiessling, A., Mast, M., Niemann, H., & Noeth, E. (1998). M = Syntax + Prosody: a syntactic-prosodic labelling scheme for large spontaneous speech databases. Speech Communication, 25(4), 193–222.

  • Bernstein, J., & Franco, H. (1995). Speech recognition by computer. In N. Lass (Ed.), Principles of experimental phonetics (pp. 408–434). New York: Mosby.

  • Bertinetto, P. M. (1980). The perception of stress by Italian speakers. Journal of Phonetics, 8, 385–395.

  • Bowen, J. D. (1975). Patterns of English pronunciation. New York: Newbury House.

  • Breen, A. P. (1995). A simple method of predicting the duration of syllables. In Eurospeech ’95 (pp. 595–598).

  • Campbell, W. (1993). Predicting segmental durations for accommodation within a syllable-level timing framework. In Eurospeech ’93 (pp. 1081–1085).

  • Campbell, W., & Isard, S. (1991). Segment durations in a syllable frame. Journal of Phonetics, 19, 37–47.

  • Chun, D. M. (1998). Signal analysis software for teaching discourse intonation. Language Learning & Technology, 2(1), 61–77.

  • Delmonte, R. (1981). L’accento di parola nella prosodia dell’enunciato dell’Italiano standard. In Studi di Grammatica Italiana, Accademia della Crusca, Firenze (pp. 69–81).

  • Delmonte, R. (1983a). A phonological processor for Italian. In Proceedings of the 1st conference of the European chapter of ACL, Pisa (pp. 26–34).

  • Delmonte, R. (1983b). Regole di Assegnazione del Fuoco o Centro Intonativo in Italiano Standard. CLESP, Padova.

  • Delmonte, R. (1984). On certain differences between English and Italian in phonological processing and syntactic processing. Manuscript, Università di Trieste.

  • Delmonte, R. (1985a). Parsing Difficulties & Phonological Processing in Italian. In Proceedings of the 2nd conference of the European chapter of ACL, Geneva (pp. 136–145).

  • Delmonte, R. (1985b). Sintassi, semantica, fonologia e regole di assegnazione del fuoco. In Atti del XVII congresso SLI, Bulzoni, Urbino (pp. 437–455).

  • Delmonte, R. (1987a). The realization of semantic focus and language modeling. In Proceedings of the XIth ICPhS (Vol. 2, 24.1, pp. 101–104).

  • Delmonte, R. (1987b). The realization of semantic focus and language modeling. In Proceedings of the international congress of phonetic sciences, Tallinn, USSR (pp. 100–104).

  • Delmonte, R. (1988a). Analisi Automatica delle Strutture Prosodiche. In R. Delmonte, G. Ferrari, & I. Prodanoff (Eds.), Studi di linguistica computazionale (pp. 109–162). Padova: Unipress, Cap. IV.

  • Delmonte, R. (1988b). Focus and the semantic component. In Rivista di Grammatica Generativa (pp. 81–121).

  • Delmonte, R. (1991). Linguistic tools for speech understanding and recognition. In P. Laface & R. De Mori (Eds.), NATO ASI Series: Vol. F 75. Speech recognition and understanding: recent advances (pp. 481–485). Berlin: Springer.

  • Delmonte, R. (1998). Prosodic modeling for automatic language tutors. In Proc. STiLL’98, ESCA, Sweden (pp. 57–60).

  • Delmonte, R. (1999). Prosodic variability: from syllables to syntax through phonology. In Atti IX Convegno GFS-AIA, Venezia (pp. 133–146).

  • Delmonte, R. (2000). SLIM prosodic automatic tools for self-learning instruction. Speech Communication, 30, 145–166.

  • Delmonte, R., & Dolci, R. (1991). Computing linguistic knowledge for text-to-speech systems with PROSO. In Proc. EUROSPEECH’91, Genova (pp. 1291–1294).

  • Delmonte, R., Mian, G. A., & Tisato, G. (1986). A grammatical component for a text-to-speech system. In Proceedings ICASSP ’86, IEEE, Tokyo (Vol. 4, pp. 2407–2410).

  • Delmonte, R., Cristea, D., Petrea, M., Bacalu, C., & Stiffoni, F. (1996a). Modelli fonetici e prosodici per SLIM. In Convegno GFS-AIA, Roma (pp. 47–58).

  • Delmonte, R., Cacco, A., Romeo, L., Dan, M., Mangilli-Climpson, M., & Stiffoni, F. (1996b). SLIM—a model for automatic tutoring of language skills. In Ed-Media 96, AACE, Boston (pp. 326–333).

  • Delmonte, R., Petrea, M., & Bacalu, C. (1997). SLIM prosodic module for learning activities in a foreign language. In Proc. ESCA, Eurospeech’97, Rhodes (Vol. 2, pp. 669–672).

  • Eskénazi, M. (1999). Using automatic speech processing for foreign language pronunciation tutoring: some issues and a prototype. Language Learning & Technology, 2(2), 62–76.

  • Grover, C., Fackrell, J., Vereecken, H., Martens, J.-P., & Van Coile, B. (1998). Designing prosodic databases for automatic modelling in 6 languages. In Proceedings of ESCA/COCOSDA workshop on speech synthesis, Australia (pp. 93–98).

  • Hiller, S., Rooney, E., Laver, J., & Jack, M. (1993). SPELL: an automated system for computer-aided pronunciation teaching. Speech Communication, 13, 463–473.

  • Hurley, D. S. (1992). Issues in teaching pragmatics, prosody, and non-verbal communication. Applied Linguistics, 13(3), 259–281.

  • Jilka, M. (2000). The contribution of intonation to the perception of foreign accent. Doctoral Dissertation, Arbeiten des Instituts für Maschinelle Sprachverarbeitung (AIMS) (Vol. 6(3)). University of Stuttgart.

  • Jilka, M. (2009). Ph.D. Dissertation. Available at http://www.ims.uni-stuttgart/phonetik/matthias/.

  • Jilka, M., & Möhler, G. (1998). Intonational foreign accent: speech technology and foreign language teaching. In Proceedings of the ESCA workshop on speech technology in language learning, Marholmen (pp. 115–118).

  • Kahn, D. (1976). Syllable-based generalizations in English phonology. MIT doctoral dissertation, distributed by IULC.

  • Kawai, G., & Hirose, K. (1997). A CALL system using speech recognition to train the pronunciation of Japanese long vowels, the mora nasal and mora obstruent. In Proc. Eurospeech ’97 (Vol. 2, pp. 657–660).

  • Kelm, O. R. (1987). An acoustic study on the differences of contrastive emphasis between native and non-native Spanish speakers. Hispania, 70, 627–633.

  • Kim, Y., Franco, H., & Neumeyer, L. (1997). Automatic pronunciation scoring of specific phone segments for language instruction. In Proc. Eurospeech ’97 (Vol. 2, pp. 645–648).

  • Klatt, D. (1987). Review of text-to-speech conversion for English. Journal of the Acoustical Society of America, 82, 737–797.

  • Lehiste, I. (1977). Isochrony reconsidered. Journal of Phonetics, 5, 253–263.

  • Loveday, L. (1981). Pitch, politeness and sexual role: an exploratory investigation into the pitch correlates of English and Japanese politeness formulae. Language and Speech, 24, 71–89.

  • Luthy, M. J. (1983). Nonnative speakers’ perceptions of English “nonlexical” intonation signals. Language Learning, 33(1), 19–36.

  • Meador, J., Ehsani, F., Egan, K., & Stokowski, S. (1998). An interactive dialog system for learning Japanese. In Proc. STiLL’98, ESCA, Sweden (pp. 65–69).

  • Medan, Y., Yair, E., & Chazan, D. (1991). Super resolution pitch determination of speech signals. New York: IEEE Press.

  • Pisoni, D. (1977). Identification and discrimination of the relative onset times of two component tones: implications for voicing perception in stops. Journal of the Acoustical Society of America, 61, 1352–1361.

  • Price, P. (1998). How can speech technology replicate and complement good language teachers to help people learn language? In Proc. STiLL’98, ESCA, Sweden (pp. 103–106).

  • Ramus, F., & Mehler, J. (1999). Language identification with suprasegmental cues: a study based on speech resynthesis. Journal of the Acoustical Society of America, 105(1), 512–521.

  • Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73(3), 265–292.

  • Roach, P. (2000). Studying rhythm and timing in English speech: scientific curiosity, or a classroom necessity?

  • Ronen, O., Neumeyer, L., & Franco, H. (1997). Automatic detection of mispronunciation for language instruction. In Proc. Eurospeech ’97 (Vol. 2, pp. 649–652).

  • Rooney, E., Hiller, S., Laver, J., & Jack, M. (1992). Prosodic features for automated pronunciation improvement in the SPELL system. In Proceedings of the International Conference on Spoken Language Processing, Banff, Canada (pp. 413–416).

  • Shriberg, E., Bates, R., Stolcke, A., Taylor, P., Jurafsky, D., Ries, K., Coccaro, N., Martin, R., Meteer, M., & Van Ess-Dykema, C. (1998). Can prosody aid the automatic classification of dialog acts in conversational speech? Language and Speech, 41(3–4), 439–487. Special Issue on Prosody and Conversation.

  • Umeda, N. (1977). Consonant duration in American English. Journal of the Acoustical Society of America, 61, 846–858.

  • van Santen, J. (1997). Prosodic modeling in text-to-speech synthesis. In Proc. Eurospeech ’97 (Vol. 1, pp. 19–28).

  • van Santen, J., Shih, C., Möbius, B., Tzoukermann, E., & Tanenblatt, M. (1997). Multi-lingual durational modeling. In Proc. Eurospeech ’97 (Vol. 5, pp. 2651–2654).

  • van Son, R., & van Santen, J. (1997). Strong interaction between factors influencing consonant duration. In Proc. Eurospeech ’97 (Vol. 1, pp. 319–322).

  • Willems, N. (1983). English intonation from a Dutch point of view. Doctoral Dissertation, University of Utrecht.

Corresponding author

Correspondence to Rodolfo Delmonte.

Cite this article

Delmonte, R. Prosodic tools for language learning. Int J Speech Technol 12, 161–184 (2009). https://doi.org/10.1007/s10772-010-9065-1
