TrsToText
Converts Transcriber .trs files to plain text files
The following participant meta-data is lost during conversion:
- gender
- dialect
- accent
- scope
If the --metaData command-line switch is used, then the following Transcriber meta-data is included as a header at the top of the output file:
- version
- version date
- air date
- scribe
- language
Otherwise, this meta-data is lost during conversion.
The following Transcriber annotations are lost during conversion:
- phrase language annotations
- named entity annotations
The following Transcriber annotations are converted using bracketed, inline text conventions:
- comments
- noises
- lexical tags
- pronounce tags
To disable these conventions (and thus lose these annotations during conversion), use the --useConventions=false command-line switch.
If the --textOnly command-line switch is used, then the output text includes only the transcribed speech, and all annotations and meta-data are lost.
Deserializing from “Transcriber transcript” text/xml-transcriber
Command-line configuration parameters for deserialization:
--topicLayer=Layer | Topic tags
--commentLayer=Layer | Commentary
--noiseLayer=Layer | Noise annotations
--languageLayer=Layer | Inline language tags
--lexicalLayer=Layer | Lexical tags
--pronounceLayer=Layer | Manual pronunciation tags
--entityLayer=Layer | Named entities
--scribeLayer=Layer | Name of transcriber
--versionLayer=Layer | Version of transcriber
--versionDateLayer=Layer | Version date of transcriber
--programLayer=Layer | Name of the program recorded
--airDateLayer=Layer | Date the program aired
--transcriptLanguageLayer=Layer | The language of the whole transcript
--participantCheckLayer=Layer | Participant checked
--genderLayer=Layer | Gender - participant ‘type’
--dialectLayer=Layer | Participant's dialect
--accentLayer=Layer | Participant's accent
--scopeLayer=Layer | Participant's ‘scope’
Serializing to “Plain Text Document” text/plain
Command-line configuration parameters for serialization:
--commentLayer=Layer | Commentary
--noiseLayer=Layer | Background noises
--lexicalLayer=Layer | Lexical tags
--pronounceLayer=Layer | Non-standard pronunciation tags
--orthographyLayer=Layer | Orthography
--useConventions=Boolean | Whether to use text conventions for comment, noise, lexical, and pronounce annotations
--maxParticipantLength=Integer | The maximum length of a participant name
--maxHeaderLines=Integer | The maximum number of lines in a meta-data header
--participantFormat=String | Format for marking a change of turn within the transcript body - e.g. {0}:, where {0} is a place-holder for the participant ID/name
--metaDataFormat=String | Format for a meta-data line in the header - e.g. {0}={1}, where {0} is a place-holder for the attribute name or key, and {1} is a place-holder for the attribute value
--timestampFormat=String | Format for a time stamp - e.g. HH:mm:ss.SSS
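
The three format parameters above can be illustrated with a short Java sketch. It assumes, without confirmation from this document, that the {0}/{1} place-holders follow java.text.MessageFormat syntax and that the time stamp pattern follows java.text.SimpleDateFormat syntax. The class FormatSketch, its method names, and the sample values (scribe, jane, mary) are hypothetical and are not part of TrsToText.

```java
import java.text.MessageFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration only; not the converter's actual implementation.
class FormatSketch {

  // Renders a change-of-turn marker; {0} is the participant ID/name.
  static String turnMarker(String participantFormat, String participant) {
    return MessageFormat.format(participantFormat, participant);
  }

  // Renders one header line; {0} is the attribute key, {1} is its value.
  static String metaDataLine(String metaDataFormat, String key, String value) {
    return MessageFormat.format(metaDataFormat, key, value);
  }

  // Renders an offset (seconds from the start of the recording) as a time stamp.
  // UTC is used so that the offset is not shifted by the local time zone.
  static String timestamp(String timestampFormat, double offsetSeconds) {
    SimpleDateFormat format = new SimpleDateFormat(timestampFormat, Locale.ROOT);
    format.setTimeZone(TimeZone.getTimeZone("UTC"));
    return format.format(new Date(Math.round(offsetSeconds * 1000)));
  }

  public static void main(String[] args) {
    System.out.println(metaDataLine("{0}={1}", "scribe", "jane")); // scribe=jane
    System.out.println(turnMarker("{0}:", "mary"));                // mary:
    System.out.println(timestamp("HH:mm:ss.SSS", 3723.45));        // 01:02:03.450
  }
}
```

Under these assumptions, --participantFormat={0}: would mark a turn by the hypothetical participant mary as "mary:", and --metaDataFormat={0}={1} would write a header line such as "scribe=jane".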