TrsToEaf
Converts Transcriber .trs transcripts to ELAN .eaf files
ELAN does not support all of the meta-data that Transcriber does, so the following meta-data is lost during conversion:
- version
- version date
- air date
- scribe
- language
- participant gender
- participant dialect
- participant accent
- participant scope
The following Transcriber annotations are not supported by ELAN, and are lost:
- phrase language annotations
- named entity annotations
The following Transcriber annotations are not directly supported by ELAN, and are converted using bracketed, inline conventions within annotation labels:
- comments
- noises
- lexical tags
- pronounce tags
To disable these conventions (and thus lose these annotations during conversion), use the --useConventions=false command line switch.
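As an illustration of how the switch might be passed, here is a minimal Python sketch that assembles an argument list for the converter. Only --useConventions comes from this document; the jar name trs-to-eaf.jar, the `java -jar` invocation style, and the helper name build_args are assumptions made for the example, so check them against your installation before use.

```python
from typing import List


def build_args(jar: str, transcripts: List[str], use_conventions: bool = True) -> List[str]:
    """Assemble a command line for a converter invocation.

    Assumption: the converter is packaged as a runnable jar and takes
    switches before the input files. Only --useConventions is taken
    from the documentation above.
    """
    args = ["java", "-jar", jar]
    # Boolean switches are written as --name=true / --name=false.
    value = "true" if use_conventions else "false"
    args.append(f"--useConventions={value}")
    # Input transcripts follow the switches.
    args.extend(transcripts)
    return args


if __name__ == "__main__":
    # Hypothetical jar and file names, for illustration only.
    print(" ".join(build_args("trs-to-eaf.jar", ["interview.trs"], use_conventions=False)))
```

The resulting list can be handed to subprocess.run if the invocation style matches your setup; building it as a list avoids shell-quoting problems with file names that contain spaces.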
If the Transcriber transcript includes topic tags, these are included in the ELAN file on their own tier.
Deserializing from “Transcriber transcript” text/xml-transcriber
Command-line configuration parameters for deserialization:
| Parameter | Description |
|:--|:--|
| --topicLayer=Layer | Topic tags |
| --commentLayer=Layer | Commentary |
| --noiseLayer=Layer | Noise annotations |
| --languageLayer=Layer | Inline language tags |
| --lexicalLayer=Layer | Lexical tags |
| --pronounceLayer=Layer | Manual pronunciation tags |
| --entityLayer=Layer | Named entities |
| --scribeLayer=Layer | Name of transcriber |
| --versionLayer=Layer | Version of transcriber |
| --versionDateLayer=Layer | Version date of transcriber |
| --programLayer=Layer | Name of the program recorded |
| --airDateLayer=Layer | Date the program aired |
| --transcriptLanguageLayer=Layer | The language of the whole transcript |
| --participantCheckLayer=Layer | Participant checked |
| --genderLayer=Layer | Gender - participant ‘type’ |
| --dialectLayer=Layer | Participant's dialect |
| --accentLayer=Layer | Participant's accent |
| --scopeLayer=Layer | Participant's ‘scope’ |
Serializing to “ELAN EAF Transcript” text/x-eaf+xml
Command-line configuration parameters for serialization:
| Parameter | Description |
|:--|:--|
| --commentLayer=Layer | Commentary |
| --noiseLayer=Layer | Noise annotations |
| --lexicalLayer=Layer | Lexical tags |
| --pronounceLayer=Layer | Manual pronunciation tags |
| --authorLayer=Layer | Name of transcriber |
| --dateLayer=Layer | Document date |
| --languageLayer=Layer | The language of the whole transcript |
| --phraseLanguageLayer=Layer | For tagging individual phrases with a language |
| --useConventions=Boolean | Whether to use text conventions for comment, noise, lexical, and pronounce annotations |
| --ignoreBlankAnnotations=Boolean | Whether to skip annotations with no label, or process them |
| --minimumTurnPauseLength=Double | Minimum amount of time between two turns by the same speaker, with no intervening speaker, for which the inter-turn pause counts as a turn change boundary. If the pause is shorter than this, the turns are merged into one. |
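The merging rule described for --minimumTurnPauseLength can be sketched in Python as follows. This is an illustration of the rule as stated, not the converter's actual implementation: the Turn class, the merge_turns function, and the assumption that times are in seconds and that turns do not overlap are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str
    start: float  # assumed to be seconds
    end: float


def merge_turns(turns: List[Turn], minimum_turn_pause_length: float) -> List[Turn]:
    """Merge consecutive same-speaker turns separated by a short pause."""
    merged: List[Turn] = []
    for turn in sorted(turns, key=lambda t: t.start):
        previous = merged[-1] if merged else None
        # Comparing only with the immediately preceding turn guarantees
        # that no other speaker intervenes between the two turns.
        if (
            previous is not None
            and previous.speaker == turn.speaker
            and turn.start - previous.end < minimum_turn_pause_length
        ):
            # Pause is shorter than the threshold: extend the previous turn.
            previous.end = max(previous.end, turn.end)
        else:
            # Pause counts as a turn boundary (or the speaker changed).
            merged.append(Turn(turn.speaker, turn.start, turn.end))
    return merged


if __name__ == "__main__":
    example = [
        Turn("A", 0.0, 1.0),
        Turn("A", 1.2, 2.0),   # 0.2 s after the previous turn by A
        Turn("B", 2.1, 3.0),
        Turn("A", 3.05, 4.0),  # B intervenes, so this is never merged with A's earlier turn
    ]
    for t in merge_turns(example, 0.5):
        print(t)
```

With a threshold of 0.5 in this sketch, the first two turns by A are joined into one, while the final turn by A stays separate because B speaks in between.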