Package nzilbb.encoding
Class DISC2CMU
- java.lang.Object
-
- nzilbb.encoding.PhonemeTranslator
-
- nzilbb.encoding.DISC2CMU
-
- All Implemented Interfaces:
Function<String,String>, UnaryOperator<String>
public class DISC2CMU extends PhonemeTranslator
Translates CELEX-DISC-encoded transcriptions like tr{nskrIpSVn to CMU-encoded phonemic transcriptions like T R AE N S K R IH P SH AH N. The CMU encoding is assumed to use only the phonemes used by the CMU Pronouncing Dictionary: http://www.speech.cs.cmu.edu/cgi-bin/cmudict.
Thanks to Stefanie Jannedy for this mapping.
There are differences between the ARPAbet2DISC translation and this one, primarily that this translation is strict; phonemes that are not explicitly present in the phone set are dropped, where ARPAbet2DISC includes extra phonemes, includes some extensions to ARPAbet and DISC, and passes through unknown phonemes unchanged.

Mapping

Source → Destination   Example

Vowels
#  →  AA      START     odd/father
{  →  AE      TRAP      at/fast
V  →  AH      STRUT     hut/but
$  →  AO      THOUGHT   ought/fall - two-to-one
Q  →  AO      LOT       ought/off - two-to-one
6  →  AW      MOUTH     cow/how
@  →  IH      schwa     discuss (doesn't exist in CMU)
2  →  AY      PRICE     hide/my
E  →  EH      DRESS     Ed/red
3  →  ER      NURSE     hurt/her
1  →  EY      FACE      ate/say
I  →  IH      KIT       it/big
i  →  IY      FLEECE    eat/bee
5  →  OW      GOAT      oat/show
4  →  OY      CHOICE    toy/boy
U  →  UH      FOOT      hood/should
u  →  UW      GOOSE     two/you

Consonants
b  →  B
J  →  CH
d  →  D
D  →  DH
f  →  F
g  →  G
h  →  HH
_  →  JH
k  →  K
l  →  L
m  →  M
n  →  N
N  →  NG
p  →  P
r  →  R
R  →  R       Possible linking R is pretty definitely R
s  →  S
S  →  SH
t  →  T
T  →  TH
v  →  V
w  →  W
j  →  Y
z  →  Z
Z  →  ZH

Not in the CMU set but exist in Buckeye corpus
L  →  D       flap           this is an extension to DISC
?  →  K       glottal stop   this is an extension to DISC

Not in CMU set but exist in DISC
7  →  IY R    NEAR
8  →  EH R    SQUARE
9  →  UH R    CURE
F  →  IH M    idealism
H  →  IH N    burden
P  →  IH L    dangle
C  →  IH NG   bacon
0  →  AO N    lingerie
~  →  AO N    bouillon
c  →  AO M    timbre
q  →  AO N    detente

- Author:
- Robert Fromont robert@fromont.net.nz
- See Also:
CMU2DISC, DISC2ARPAbet
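The strict behaviour described above (unmapped symbols are dropped rather than passed through) can be illustrated with a small lookup-based sketch. This is not the nzilbb implementation: the class name StrictDiscToCmuSketch is hypothetical, only a subset of the mapping table is included, the space-separated output format is inferred from the example in the class description, and stress handling is left out entirely.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for illustration only; not the nzilbb DISC2CMU class. */
class StrictDiscToCmuSketch implements UnaryOperator<String> {

  // Subset of the documented mapping: one DISC character to one or more CMU phonemes.
  private static final Map<Character, String> MAP = Map.ofEntries(
      Map.entry('{', "AE"),
      Map.entry('I', "IH"),
      Map.entry('V', "AH"),
      Map.entry('@', "IH"),   // schwa doesn't exist in CMU
      Map.entry('7', "IY R"), // NEAR: one DISC symbol becomes two CMU phonemes
      Map.entry('t', "T"),
      Map.entry('r', "R"),
      Map.entry('n', "N"),
      Map.entry('s', "S"),
      Map.entry('k', "K"),
      Map.entry('p', "P"),
      Map.entry('S', "SH"));

  @Override
  public String apply(String source) {
    StringJoiner out = new StringJoiner(" ");
    for (char c : source.toCharArray()) {
      String cmu = MAP.get(c);
      if (cmu != null) {
        out.add(cmu); // strict: symbols without a mapping are silently dropped
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    UnaryOperator<String> translator = new StrictDiscToCmuSketch();
    System.out.println(translator.apply("tr{nskrIpSVn"));
  }
}
```

With the real class, the equivalent call shape according to the summaries on this page would be new DISC2CMU().apply(source).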
-
-
Constructor Summary
Constructors
Constructor    Description
DISC2CMU()     Default constructor.
-
Method Summary
Modifier and Type   Method                                      Description
String              apply(String source)                        Translates a phonemic transcription from the source encoding to the destination encoding.
String              getDefaultStress()                          Getter for defaultStress: Default stress value to append to vowels.
DISC2CMU            setDefaultStress(String newDefaultStress)   Setter for defaultStress: Default stress value to append to vowels.
- Methods inherited from class nzilbb.encoding.PhonemeTranslator
getDestinationEncoding, getSourceEncoding
Method Detail
-
getDefaultStress
public String getDefaultStress()
Getter for defaultStress: Default stress value to append to vowels.
- Returns:
- Default stress value to append to vowels.
-
setDefaultStress
public DISC2CMU setDefaultStress(String newDefaultStress)
Setter for defaultStress: Default stress value to append to vowels.
- Parameters:
newDefaultStress - Default stress value to append to vowels.
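As a rough illustration of how a default stress value might be appended to vowels, and of how a setter that returns the translator can be chained, here is a minimal sketch. It rests on several assumptions that this page does not confirm: the class StressSketch and its tiny mapping are hypothetical, the stress marker is assumed to be appended directly to the vowel label, the initial value is assumed to be an empty string, and the setter is assumed to return the same instance.

```java
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

/** Hypothetical sketch for illustration only; not the nzilbb implementation. */
class StressSketch {

  // Tiny subset of the DISC-to-CMU mapping, enough for the examples below.
  private static final Map<Character, String> MAP = Map.of(
      'k', "K", '{', "AE", 't', "T", 'i', "IY", 'b', "B");

  // CMU vowel labels present in the subset above.
  private static final Set<String> VOWELS = Set.of("AE", "IY");

  // Assumption: no stress marker is appended unless one is set.
  private String defaultStress = "";

  public String getDefaultStress() {
    return defaultStress;
  }

  /** Fluent setter: returns this so calls can be chained (assumed behaviour). */
  public StressSketch setDefaultStress(String newDefaultStress) {
    defaultStress = newDefaultStress;
    return this;
  }

  public String apply(String source) {
    StringJoiner out = new StringJoiner(" ");
    for (char c : source.toCharArray()) {
      String cmu = MAP.get(c);
      if (cmu == null) {
        continue; // strict: drop unmapped symbols
      }
      // Append the stress marker to vowels only; consonants are left as they are.
      out.add(VOWELS.contains(cmu) ? cmu + defaultStress : cmu);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(new StressSketch().setDefaultStress("1").apply("k{t"));
  }
}
```

Because the documented setter's return type is DISC2CMU, a chained call of the form new DISC2CMU().setDefaultStress(...).apply(...) should compile against the real class; the exact output format with stress applied is not specified on this page.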
-
-