Phonemic Tagging
Depending on your speech data, there are several ways to obtain phonemic transcriptions for words:
- Lexical tagging
  - CELEX - for British English, German, or Dutch, using one of the CELEX layer managers.
  - CMU Pronouncing Dictionary - for US English, using the CMU Pronouncing Dictionary layer manager.
  - Unisyn - for various English varieties, using the Unisyn layer manager.
  - Define your own lexicon, and use the Flat File Dictionary layer manager to integrate it into LaBB-CAT.
- Inferring pronunciation from orthography
  - Spanish - using the Spanish Phonological Transcriber layer manager.
  - BAS Web Service: G2P - for various languages.
  - Define your own simple mapping rules from orthography to phonology, using the Character Mapper layer manager.
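
The difference between the two approaches can be illustrated with a minimal Python sketch. This is not how LaBB-CAT or its layer managers are implemented; the lexicon entries, the mapping rules, and the function names `lookup` and `map_characters` are all hypothetical and exist only to show the idea. Lexical tagging looks each word up in a dictionary, while orthography-based tagging derives a pronunciation from the spelling by applying rules.

```python
# Toy lexicon: word -> pronunciation (illustrative ARPAbet-style entries only).
LEXICON = {
    "cat": "K AE1 T",
    "dog": "D AO1 G",
}

# Toy orthography-to-phonology rules (hypothetical, not a complete system).
# Multi-letter graphemes are tried before single letters.
RULES = {
    "ch": "tʃ",
    "ll": "ʎ",
    "qu": "k",
    "c": "k",
    "a": "a",
    "e": "e",
    "o": "o",
    "s": "s",
}


def lookup(word, lexicon=LEXICON):
    """Lexical tagging: return the dictionary pronunciation, or None if absent."""
    return lexicon.get(word.lower())


def map_characters(word, rules=RULES):
    """Rule-based tagging: greedy longest-match over the spelling."""
    word = word.lower()
    longest = max(len(grapheme) for grapheme in rules)
    phonemes = []
    i = 0
    while i < len(word):
        for size in range(longest, 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                phonemes.append(rules[chunk])
                i += len(chunk)
                break
        else:
            # No rule covers this character, so the word stays untagged.
            return None
    return "".join(phonemes)
```

A dictionary gives reliable results for irregular spellings but leaves unknown words untagged; mapping rules cover any word but only suit languages with a regular spelling system.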
If the speech corpus includes data in more than one language, language layers and attributes, together with auxiliary layer managers, can be used to ensure that each utterance is phonemically tagged according to its own language.
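
Conceptually, language-sensitive tagging amounts to choosing a tagger per utterance. In LaBB-CAT this is achieved through layer configuration rather than code, so the following Python sketch is only an illustration under stated assumptions: the language codes `en` and `es`, the toy data, and the names `tag_english`, `tag_spanish` and `tag_utterance` are all hypothetical.

```python
# Toy resources, for illustration only.
EN_LEXICON = {"cat": "K AE1 T"}
ES_RULES = {"g": "g", "a": "a", "t": "t", "o": "o"}


def tag_english(word):
    """Dictionary lookup; None if the word is not in the toy lexicon."""
    return EN_LEXICON.get(word.lower())


def tag_spanish(word):
    """Letter-by-letter mapping; None if any letter has no rule."""
    try:
        return "".join(ES_RULES[ch] for ch in word.lower())
    except KeyError:
        return None


# One tagger per language code (the codes are assumptions).
TAGGERS = {"en": tag_english, "es": tag_spanish}


def tag_utterance(words, language, corpus_language="en"):
    """Tag each word with the tagger for the utterance's language.

    If the utterance has no language label, fall back to the corpus language.
    If no tagger exists for the language, every word is left untagged.
    """
    tagger = TAGGERS.get(language or corpus_language)
    if tagger is None:
        return [None] * len(words)
    return [tagger(word) for word in words]
```

For example, `tag_utterance(["gato"], "es")` applies the Spanish rules, whereas the same word in an utterance labelled `en` is looked up in the English lexicon and left untagged.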
Copyright
© 2023-2024 NZILBB