Phonemic Tagging

Depending on your speech data, there are several ways to obtain phonemic transcriptions for words:

  • Lexical tagging
    • CELEX - for British English, German, Dutch, using one of the CELEX layer managers.
    • CMU Pronouncing Dictionary - for US English, using the CMU Pronouncing Dictionary layer manager.
    • Unisyn - for various English varieties, using the Unisyn layer manager.
    • Define your own lexicon, and use the Flat File Dictionary layer manager to integrate it into LaBB-CAT.
  • Inferring pronunciation from orthography
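
To make the idea of lexical tagging concrete, here is a minimal sketch of a lexicon lookup. It is not LaBB-CAT code and does not reflect the file format any layer manager expects: the "word,pronunciation" layout, the function names, and the transcriptions themselves are assumptions made up for illustration. It shows only the general principle that each word token is looked up in a lexicon and tagged with every pronunciation listed for it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical lexicon content: one "word,pronunciation" entry per line.
# The transcriptions are illustrative placeholders, not taken from a real lexicon.
LEXICON_TEXT = """\
the,D@
the,Di:
cat,k{t
sat,s{t
"""

def load_lexicon(text):
    """Map each lower-cased word to the list of pronunciations listed for it."""
    lexicon = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        word, pronunciation = row[0], row[1]
        lexicon[word.strip().lower()].append(pronunciation.strip())
    return dict(lexicon)

def tag_words(words, lexicon):
    """Pair each word with its pronunciations (an empty list if it is not in the lexicon)."""
    return [(word, lexicon.get(word.lower(), [])) for word in words]

lexicon = load_lexicon(LEXICON_TEXT)
tags = tag_words(["The", "cat", "sat", "quietly"], lexicon)
for word, pronunciations in tags:
    print(word, pronunciations)
```

In this sketch, words missing from the lexicon simply receive no tag, which is the kind of gap that inferring pronunciation from orthography is meant to fill.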

If the speech corpus includes data in more than one language, utterances can be phonemically tagged according to the language of each utterance, using the language layers and attributes together with auxiliary layer managers.
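
The following sketch illustrates the principle of language-sensitive tagging. It is only an assumption-laden illustration, not how LaBB-CAT or its layer managers are implemented: the language codes, the tiny lexicons, their placeholder transcriptions, and the `tag_utterance` function are all hypothetical. Each utterance carries a language label, and that label decides which lexicon is consulted.

```python
# Hypothetical per-language lexicons with illustrative placeholder transcriptions.
LEXICONS = {
    "en": {"water": "wO:t@"},
    "de": {"wasser": "vas@r"},
}

def tag_utterance(words, language, default_language="en"):
    """Tag each word using the lexicon for the utterance's language.

    If the utterance has no language label, fall back to the default language.
    Words not found in the chosen lexicon are tagged None.
    """
    lexicon = LEXICONS.get(language or default_language, {})
    return [lexicon.get(word.lower()) for word in words]

# The same token is tagged only when the matching language's lexicon is used.
german = tag_utterance(["Wasser"], "de")
english = tag_utterance(["Wasser"], "en")
unlabelled = tag_utterance(["water"], None)
print(german, english, unlabelled)
```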
