TrsToEaf

Converts Transcriber .trs transcripts to ELAN .eaf files.

ELAN does not support all of the meta-data that Transcriber does, so the following meta-data is lost during conversion:

  • version
  • version date
  • air date
  • scribe
  • language
  • participant gender
  • participant dialect
  • participant accent
  • participant scope

The following Transcriber annotations are not supported by ELAN, and are lost:

  • phrase language annotations
  • named entity annotations

The following Transcriber annotations are not directly supported by ELAN, and are converted using bracketed, inline conventions within annotation labels:

  • comments
  • noises
  • lexical tags
  • pronounce tags

To disable these conventions (and thus lose these annotations during conversion), use the --useConventions=false command line switch.

If the Transcriber transcript includes topic tags, these are included in the ELAN file on their own tier.

Deserializing from “Transcriber transcript” text/xml-transcriber

Command-line configuration parameters for deserialization:

--topicLayer=Layer Topic tags
--commentLayer=Layer Commentary
--noiseLayer=Layer Noise annotations
--languageLayer=Layer Inline language tags
--lexicalLayer=Layer Lexical tags
--pronounceLayer=Layer Manual pronunciation tags
--entityLayer=Layer Named entities
--scribeLayer=Layer Name of transcriber
--versionLayer=Layer Version of transcriber
--versionDateLayer=Layer Version date of transcriber
--programLayer=Layer Name of the program recorded
--airDateLayer=Layer Date the program aired
--transcriptLanguageLayer=Layer The language of the whole transcript
--participantCheckLayer=Layer Participant checked
--genderLayer=Layer Gender - participant ‘type’
--dialectLayer=Layer Participant's dialect
--accentLayer=Layer Participant's accent
--scopeLayer=Layer Participant's ‘scope’

Serializing to “ELAN EAF Transcript” text/x-eaf+xml

Command-line configuration parameters for serialization:

--commentLayer=Layer Commentary
--noiseLayer=Layer Noise annotations
--lexicalLayer=Layer Lexical tags
--pronounceLayer=Layer Manual pronunciation tags
--authorLayer=Layer Name of transcriber
--dateLayer=Layer Document date
--languageLayer=Layer The language of the whole transcript
--phraseLanguageLayer=Layer For tagging individual phrases with a language
--useConventions=Boolean Whether to use text conventions for comment, noise, lexical, and pronounce annotations
--ignoreBlankAnnotations=Boolean Whether to skip annotations with no label, or process them
--minimumTurnPauseLength=Double Minimum amount of time between two turns by the same speaker, with no intervening speaker, for which the inter-turn pause counts as a turn change boundary. If the pause is shorter than this, the turns are merged into one.
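
The merging rule described for --minimumTurnPauseLength can be illustrated with a minimal sketch. This is not the converter's own implementation: the Turn class and merge_turns function are hypothetical names introduced here for illustration, times are assumed to be in seconds, and "no intervening speaker" is approximated by treating turns as adjacent once they are sorted by start time, which ignores overlapping speech.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """Hypothetical stand-in for a speaker turn (not a class from the converter)."""
    speaker: str
    start: float  # assumed to be seconds
    end: float    # assumed to be seconds


def merge_turns(turns: List[Turn], minimum_turn_pause_length: float) -> List[Turn]:
    """Merge adjacent turns by the same speaker when the pause between them
    is shorter than minimum_turn_pause_length."""
    merged: List[Turn] = []
    for turn in sorted(turns, key=lambda t: t.start):
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and previous.speaker == turn.speaker
            and turn.start - previous.end < minimum_turn_pause_length
        ):
            # Pause is too short to count as a turn boundary: extend the previous turn.
            previous.end = max(previous.end, turn.end)
        else:
            # Different speaker, or a long enough pause: keep as a separate turn.
            merged.append(Turn(turn.speaker, turn.start, turn.end))
    return merged


turns = [
    Turn("A", 0.0, 1.0),
    Turn("A", 1.2, 2.0),  # 0.2 pause after the previous A turn: merged
    Turn("B", 2.1, 3.0),
    Turn("A", 3.1, 4.0),  # B intervenes, so this is not merged with the first A turn
    Turn("A", 5.0, 6.0),  # 1.0 pause: kept as a separate turn
]
result = merge_turns(turns, minimum_turn_pause_length=0.5)
```

With a threshold of 0.5, the first two turns by speaker A are merged into one turn, while the final two A turns stay separate because the pause between them is not shorter than the threshold.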