Phoneme Transcoder
The Phoneme Transcoder translates word pronunciations from one phoneme encoding system to another.
Phonemic and phonetic transcriptions may be expressed using a number of systems, for example:
- Unicode IPA
  - One or more Unicode characters per phoneme, possibly including diacritics, e.g. there'll → ðɛəɹl̩
- CELEX DISC
  - Exactly one ASCII character per phoneme, e.g. there'll → D8r@l
- ARPAbet
  - Phonemes are one or two uppercase ASCII characters, possibly suffixed with a digit indicating stress, e.g. there'll → DH EH1 R AX0 L
- CMU
  - A subset of ARPAbet that excludes certain phonemes, including AX (schwa), e.g. there'll → DH EH1 R AH0 L
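The difference between the ARPAbet and CMU examples above can be sketched in a few lines of Python. This is only an illustration, not the annotator's own code: the function names `parse_arpabet` and `arpabet_to_cmu` are hypothetical, and the only substitution handled is the AX → AH replacement shown in the examples; other phonemes that CMU excludes are assumed to need their own rules.

```python
import re

# An ARPAbet label: uppercase letters followed by an optional stress digit.
ARPABET_LABEL = re.compile(r"([A-Z]+)(\d?)")


def parse_arpabet(transcription):
    """Split a space-delimited ARPAbet string into (phoneme, stress) pairs."""
    pairs = []
    for label in transcription.split():
        match = ARPABET_LABEL.fullmatch(label)
        if match is None:
            raise ValueError(f"not an ARPAbet label: {label!r}")
        phoneme, digit = match.groups()
        # Stress is None when the label carries no digit.
        pairs.append((phoneme, int(digit) if digit else None))
    return pairs


def arpabet_to_cmu(transcription):
    """Replace AX (schwa) with AH, keeping any stress digit (assumed rule)."""
    labels = []
    for phoneme, stress in parse_arpabet(transcription):
        if phoneme == "AX":
            phoneme = "AH"
        labels.append(phoneme if stress is None else f"{phoneme}{stress}")
    return " ".join(labels)


print(arpabet_to_cmu("DH EH1 R AX0 L"))  # DH EH1 R AH0 L
```

Keeping the stress digit separate from the phoneme label means the substitution rule only has to mention the phoneme itself.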
The annotator offers a number of predetermined conversions between known encodings. Alternatively, a custom mapping between label characters can be specified, e.g. for languages whose orthography maps transparently to the phonology.
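As a rough illustration of what a custom mapping between label characters might look like, the following Python sketch rewrites a label by trying the longest source sequence first and copying unmapped characters through unchanged. The function `make_transcoder` and the rules in `toy_rules` are hypothetical and for illustration only; they are not the annotator's configuration format, and whether unmapped characters should pass through or be rejected is an assumption.

```python
def make_transcoder(mapping):
    """Build a function that rewrites labels using `mapping`.

    Longer source sequences are tried first, so a two-character rule
    takes precedence over any one-character rule it overlaps with.
    """
    if "" in mapping:
        raise ValueError("mapping keys must not be empty")
    sources = sorted(mapping, key=len, reverse=True)

    def transcode(label):
        out = []
        i = 0
        while i < len(label):
            for source in sources:
                if label.startswith(source, i):
                    out.append(mapping[source])
                    i += len(source)
                    break
            else:
                # No rule applies: copy the character through unchanged.
                out.append(label[i])
                i += 1
        return "".join(out)

    return transcode


# Hypothetical orthography-to-phoneme rules, for illustration only.
toy_rules = {"ng": "ŋ", "wh": "f"}
to_phonemes = make_transcoder(toy_rules)
print(to_phonemes("whanga"))  # faŋa
```

With a transparent orthography, most letters need no rule at all, so the mapping only has to list the digraphs and other exceptions.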
The following table presents some common encodings and equivalences or near-equivalences between phonemes.¹
Example | IPA | SAM-PA | DISC² | CPA³ | Kirshenbaum⁴ | ARPAbet | CMU Dict |
---|---|---|---|---|---|---|---|
Vowels | | | | | | | |
kit | ɪ | I | I | I | I | IH | IH |
dress | ɛ | E | E | E | E | EH | EH |
trap | æ | { | { | ^/ | & | AE | AE |
strut | ʌ | V | V | ^ | V | AH | AH |
foot | ʊ | U | U | U | U | UH | UH |
another | ə | @ | @ | @ | @ | AX | |
fleece | iː | i: | i | i: | i: | IY | IY |
bath | ɑː | A: | # | A: | A: | AA | AA |
lot | ɒ | Q | Q | Q | A. | AO | AO |
thought | ɔː | O: | $ | O: | O: | | |
goose | uː | u: | u | u: | u: | UW | UW |
nurse | ɜː | 3: | 3 | @: | V" | ER | ER |
face | eɪ | eI | 1 | e/ | eI | EY | EY |
price | aɪ | aI | 2 | a/ | aI | AY | AY |
choice | ɔɪ | OI | 4 | o/ | OI | OY | OY |
goat | əʊ | @U | 5 | O/ | @U | OW | OW |
mouth | aʊ | aU | 6 | A/ | aU | AW | AW |
near | ɪə | I@ | 7 | I/ | I@ | IY R | IY R |
square | ɛə | E@ | 8 | E/ | E@ | EH R | EH R |
cure | ʊə | U@ | 9 | U/ | U@ | UH R | UH R |
timbre | æ̃ | {~ | c | ^/~ | &~ | | |
détente | ɑ̃ː | A~: | q | A~: | A~: | | |
lingerie | æ̃ː | {~: | 0 | ^/~: | &~: | | |
bouillon | ɒ̃ː | O~: | ~ | O~: | A.~: | | |
Consonants | | | | | | | |
pat | p | p | p | p | p | P | P |
bad | b | b | b | b | b | B | B |
tack | t | t | t | t | t | T | T |
dad | d | d | d | d | d | D | D |
cad | k | k | k | k | k | K | K |
game | g | g | g | g | g | G | G |
bang | ŋ | N | N | N | N | NG | NG |
mad | m | m | m | m | m | M | M |
nat | n | n | n | n | n | N | N |
lad | l | l | l | l | l | L | L |
rat | r | r | r | r | r | R | R |
fat | f | f | f | f | f | F | F |
vat | v | v | v | v | v | V | V |
thin | θ | T | T | T | T | TH | TH |
then | ð | D | D | D | D | DH | DH |
sap | s | s | s | s | s | S | S |
zap | z | z | z | z | z | Z | Z |
sheep | ʃ | S | S | S | S | SH | SH |
measure | ʒ | Z | Z | Z | Z | ZH | ZH |
yank | j | j | j | j | j | Y | Y |
had | h | h | h | h | h | HH | HH |
wet | w | w | w | w | w | W | W |
cheap | ʧ | tS | J | T/ | tS | CH | CH |
jeep | ʤ | dZ | _ | J/ | dZ | JH | JH |
loch | x | x | x | x | x | | |
bacon | ŋ̩ | N, | C | N, | N- | | |
idealism | m̩ | m, | F | m, | m- | | |
burden | n̩ | n, | H | n, | n- | | |
dangle | l̩ | l, | P | l, | l- | | |
car alarm | * | r* | R | r* | | | |
uh-oh | ʔ | ? | | | ? | Q | |
father | ɚ | | | | | AXR | |
wetter | ɾ | | | | | DX | |
1 In the table, some phoneme representations are highlighted in bold. These are unpredictable in some way: they differ substantially from IPA or from English orthographic convention, or they differ from the corresponding representation in an otherwise similar set of representations. Others are highlighted in italics; these use a combination of two phonemes where other sets use only one.
2 SAM-PA and DISC phonemes are taken from the CELEX English Guide (1995) § 2.4.1, pp. 31-32, Tables 3 & 4.
3 The Computer Phonetic Alphabet (CPA) was developed for seven European languages, based on the IPA (Kugler-Kruse, 1987).
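Because DISC uses exactly one character per phoneme, a conversion from DISC can be sketched as a character-by-character lookup against rows of the table above. The Python below is a minimal sketch under that assumption, not the annotator's implementation: `DISC_ROWS` holds only the handful of rows needed for the there'll example, and the function names are hypothetical.

```python
# A few rows copied from the table: DISC character -> (IPA, ARPAbet).
DISC_ROWS = {
    "D": ("ð", "DH"),
    "8": ("ɛə", "EH R"),  # one DISC character, two ARPAbet phonemes
    "r": ("r", "R"),
    "@": ("ə", "AX"),
    "l": ("l", "L"),
}


def disc_to_ipa(disc):
    """Concatenate the IPA equivalent of each DISC character."""
    return "".join(DISC_ROWS[ch][0] for ch in disc)


def disc_to_arpabet(disc):
    """Join the ARPAbet equivalents with spaces (no stress digits are added)."""
    return " ".join(DISC_ROWS[ch][1] for ch in disc)


print(disc_to_ipa("D8r@l"))      # ðɛərəl
print(disc_to_arpabet("D8r@l"))  # DH EH R R AX L
```

A DISC character missing from `DISC_ROWS` raises `KeyError` here; a fuller version would need every row of the table and a policy for unknown symbols. The `8` row also shows why some conversions change the phoneme count: the centring diphthong is a single DISC symbol but two ARPAbet labels.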