Class CMU2DISC

  • All Implemented Interfaces:
    Function<String,String>, UnaryOperator<String>

    public class CMU2DISC
    extends PhonemeTranslator
    Translates CMU-encoded phonemic transcriptions like "T R AE2 N S K R IH1 P SH AH0 N" to CELEX-DISC-encoded transcriptions like "tr{nskrIpSVn".

    The CMU encoding is assumed to use only the phonemes used by the CMU Pronouncing Dictionary: http://www.speech.cs.cmu.edu/cgi-bin/cmudict.

    Thanks to Stefanie Jannedy for this mapping.

    This translation differs from DISC2ARPAbet chiefly in being strict: phonemes that are not explicitly present in the phone set are dropped. DISC2ARPAbet, by contrast, includes extra phonemes and some extensions to ARPAbet and DISC, and passes unknown phonemes through unchanged.
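    The two policies for unknown phonemes can be illustrated with a minimal sketch. This is not the library's code: the class name, the method names, and the three-entry mapping are hypothetical, and the sketch only contrasts dropping a label with copying it through (DISC2ARPAbet itself translates in the opposite direction).

```java
import java.util.Map;

// Illustrative sketch only; not the implementation of CMU2DISC or DISC2ARPAbet.
final class StrictVersusPassThrough {
    // A deliberately tiny, hypothetical subset of a phoneme mapping.
    static final Map<String, String> SUBSET = Map.of("K", "k", "AE", "{", "T", "t");

    // Strict policy: labels missing from the mapping are dropped.
    static String strict(String source) {
        StringBuilder out = new StringBuilder();
        for (String label : source.trim().split("\\s+")) {
            String symbol = SUBSET.get(label);
            if (symbol != null) {
                out.append(symbol);
            }
        }
        return out.toString();
    }

    // Pass-through policy: labels missing from the mapping are copied unchanged.
    static String passThrough(String source) {
        StringBuilder out = new StringBuilder();
        for (String label : source.trim().split("\\s+")) {
            out.append(SUBSET.getOrDefault(label, label));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strict("K AE XX T"));      // the unknown label XX is dropped
        System.out.println(passThrough("K AE XX T")); // the unknown label XX is kept
    }
}
```

    With the strict policy the hypothetical label XX disappears from the output, whereas the pass-through policy leaves it in place.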

    Mapping
    ARPAbet DISC Example
    Vowels
    AA # START odd/father
    AE { TRAP at/fast
    AH V STRUT hut/but
    AO $ THOUGHT ought/fall (one-to-two mapping: AO could also be Q)
    AO Q LOT ought/off
    AW 6 MOUTH cow/how
    AY 2 PRICE hide/my
    EH E DRESS Ed/red
    ER 3 NURSE hurt/her
    EY 1 FACE ate/say
    IH I KIT it/big
    IY i FLEECE eat/bee
    OW 5 GOAT oat/show
    OY 4 CHOICE toy/boy
    UH U FOOT hood/should
    UW u GOOSE two/you
    Consonants
    B b
    CH J
    D d
    DH D
    F f
    G g
    HH h
    JH _
    K k
    L l
    M m
    N n
    NG N
    P p
    R r
    S s
    SH S
    T t
    TH T
    V v
    W w
    Y j
    Z z
    ZH Z
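    The mapping table above can be read as a simple lookup. The following is a minimal sketch under stated assumptions, not the class's actual implementation: the class and method names are hypothetical, stress digits are simply stripped (the zeroStressToSchwa option is ignored), and AO is always mapped to $ even though the table notes it could also be Q.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the implementation of CMU2DISC.
final class CmuToDiscTableSketch {
    static final Map<String, String> TABLE = new HashMap<>();
    static {
        String[][] pairs = {
            // Vowels (assumption: AO is always mapped to $, never Q)
            {"AA", "#"}, {"AE", "{"}, {"AH", "V"}, {"AO", "$"}, {"AW", "6"},
            {"AY", "2"}, {"EH", "E"}, {"ER", "3"}, {"EY", "1"}, {"IH", "I"},
            {"IY", "i"}, {"OW", "5"}, {"OY", "4"}, {"UH", "U"}, {"UW", "u"},
            // Consonants
            {"B", "b"}, {"CH", "J"}, {"D", "d"}, {"DH", "D"}, {"F", "f"},
            {"G", "g"}, {"HH", "h"}, {"JH", "_"}, {"K", "k"}, {"L", "l"},
            {"M", "m"}, {"N", "n"}, {"NG", "N"}, {"P", "p"}, {"R", "r"},
            {"S", "s"}, {"SH", "S"}, {"T", "t"}, {"TH", "T"}, {"V", "v"},
            {"W", "w"}, {"Y", "j"}, {"Z", "z"}, {"ZH", "Z"}
        };
        for (String[] pair : pairs) {
            TABLE.put(pair[0], pair[1]);
        }
    }

    // Translates space-separated CMU labels to a DISC string.
    // Labels that are not in the table are dropped (strict behaviour).
    static String translate(String cmu) {
        StringBuilder disc = new StringBuilder();
        for (String label : cmu.trim().split("\\s+")) {
            // Assumption: a trailing 0, 1 or 2 is a stress marker and is discarded.
            String phoneme = label.replaceAll("[0-2]$", "");
            String symbol = TABLE.get(phoneme);
            if (symbol != null) {
                disc.append(symbol);
            }
        }
        return disc.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("T R AE2 N S K R IH1 P SH AH0 N"));
    }
}
```

    Under these assumptions, the example transcription from the class description comes out as tr{nskrIpSVn.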
    Author:
    Robert Fromont robert@fromont.net.nz
    See Also:
    DISC2CMU, DISC2ARPAbet
    • Constructor Detail

      • CMU2DISC

        public CMU2DISC()
        Default constructor.
    • Method Detail

      • getZeroStressToSchwa

        public boolean getZeroStressToSchwa()
        Getter for zeroStressToSchwa: whether zero-stress vowels are translated as schwa. Default is true.
        Returns:
        Whether zero-stress vowels are translated as schwa.
      • setZeroStressToSchwa

        public CMU2DISC setZeroStressToSchwa(boolean newZeroStressToSchwa)
        Setter for zeroStressToSchwa: whether zero-stress vowels are translated as schwa. Default is true.
        Parameters:
        newZeroStressToSchwa - Whether to translate zero-stress vowels as schwa.
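        The effect of this option on a single vowel label can be sketched as follows. This is a hedged illustration rather than the class's own logic: the class name, the helper method, and the three-vowel subset are hypothetical, and the sketch assumes that the stress digit 0 marks a zero-stress vowel and that the target symbol is the DISC schwa, @.

```java
import java.util.Map;

// Illustrative sketch only; not the implementation of CMU2DISC.
final class ZeroStressToSchwaSketch {
    // Hypothetical subset of the vowel mapping.
    static final Map<String, String> VOWELS = Map.of("AH", "V", "IH", "I", "AE", "{");

    // Assumption: zero-stress vowels are rendered as the DISC schwa symbol.
    static final String SCHWA = "@";

    // Translates one CMU vowel label such as "AH0" or "IH1".
    static String translateVowel(String label, boolean zeroStressToSchwa) {
        boolean zeroStress = label.endsWith("0");
        String phoneme = label.replaceAll("[0-2]$", "");
        if (!VOWELS.containsKey(phoneme)) {
            return ""; // unknown labels are dropped
        }
        if (zeroStressToSchwa && zeroStress) {
            return SCHWA;
        }
        return VOWELS.get(phoneme);
    }

    public static void main(String[] args) {
        System.out.println(translateVowel("AH0", true));  // option on: schwa
        System.out.println(translateVowel("AH0", false)); // option off: table value
    }
}
```

        In this sketch, vowels carrying stress 1 or 2 always take their value from the table, whatever the setting.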
      • apply

        public String apply(String source)
        Translates a phonemic transcription from the source encoding to the destination encoding.
        Specified by:
        apply in interface Function<String,String>
        Overrides:
        apply in class PhonemeTranslator
        Parameters:
        source - Phonemic transcription in the source encoding.
        Returns:
        Phonemic transcription in the destination encoding.