Class Transcriber
- java.lang.Object
  - nzilbb.ag.automation.Annotator
    - nzilbb.ag.automation.Transcriber
- All Implemented Interfaces: Function<Graph,Graph>, UnaryOperator<Graph>, GraphTransformer, MonitorableTask
public abstract class Transcriber extends Annotator
Base class for an automated transcriber module. When transcribe(File, Graph) is invoked, it should transcribe the given audio file and insert the corresponding transcription data into the given annotation graph.
The typical lifecycle of a transcriber is:
- The module is installed:
  - Annotator.setSchema(Schema) is invoked.
  - Annotator.setWorkingDirectory(File) is invoked.
  - Annotator.getConfig() is invoked, in case the transcriber has a default configuration.
  - The user is presented with the config web-app, if any, and
  - Annotator.setConfig(String) is invoked. (If there's no config web-app, the config string passed is the result of the earlier getConfig() invocation.)
- The transcriber may then be run one or more times:
  - Annotator.setSchema(Schema) is invoked to provide the current schema.
  - Annotator.setWorkingDirectory(File) is invoked.
  - getDiarizationRequired() is called to determine whether the audio needs chunking before calling transcribe(File, Graph).
  - transcribe(audio, transcript) is invoked with the speech file and a graph that should contain the transcript.
- The module is uninstalled, in which case Annotator.uninstall() is invoked, which should remove all persistent data on the system.
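The call order in the lifecycle above can be sketched as follows. This is a minimal illustration only: `RecordingTranscriber`, `install` and `run` are hypothetical stand-ins written for this example, not the real nzilbb.ag classes, and `Object` is used in place of the real Schema and Graph types. The stand-in simply records which lifecycle methods a host application would call, and in what order.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Sketch of the lifecycle call order, using stand-in types (NOT the real nzilbb.ag API). */
class LifecycleSketch {

  /** Hypothetical stand-in that only records which lifecycle methods were called. */
  static class RecordingTranscriber {
    final List<String> calls = new ArrayList<>();
    void setSchema(Object schema) { calls.add("setSchema"); }
    void setWorkingDirectory(File dir) { calls.add("setWorkingDirectory"); }
    String getConfig() { calls.add("getConfig"); return "language=en"; }
    void setConfig(String config) { calls.add("setConfig"); }
    boolean getDiarizationRequired() { calls.add("getDiarizationRequired"); return false; }
    Object transcribe(File speech, Object transcript) { calls.add("transcribe"); return transcript; }
    void uninstall() { calls.add("uninstall"); }
  }

  /** Installation phase: schema, working directory, default config, then final config. */
  static void install(RecordingTranscriber t, File workingDir) {
    t.setSchema(new Object());
    t.setWorkingDirectory(workingDir);
    String config = t.getConfig();
    // Assumption: there is no config web-app, so the default config is passed straight back.
    t.setConfig(config);
  }

  /** Run phase: schema and working directory are provided again before each transcription. */
  static Object run(RecordingTranscriber t, File workingDir, File audio) {
    t.setSchema(new Object());
    t.setWorkingDirectory(workingDir);
    if (t.getDiarizationRequired()) {
      // A host would split the audio into utterance chunks here (omitted in this sketch).
    }
    return t.transcribe(audio, new Object());
  }

  public static void main(String[] args) {
    RecordingTranscriber t = new RecordingTranscriber();
    File dir = new File(System.getProperty("java.io.tmpdir"));
    install(t, dir);
    run(t, dir, new File(dir, "speech.wav"));
    t.uninstall();
    System.out.println(String.join(" -> ", t.calls));
  }
}
```

The run phase may be repeated any number of times between installation and uninstallation, which is why the schema and working directory are set again on each run.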
The methods below marked in bold are those that a Transcriber subclass should implement, in addition to transform(graph).
- Author:
  Robert Fromont robert@fromont.net.nz
-
-
Constructor Summary
Constructor - Description
- Transcriber()
-
Method Summary
Modifier and Type, Method - Description
- abstract boolean getDiarizationRequired() - Specify whether the transcriber needs the audio to be split into utterance chunks before transcribe(File,Graph) is called.
- String[] getOutputLayers() - Determines which layers the annotator will create/update/delete annotations on.
- String[] getRequiredLayers() - Requires participant and turn layers, and also utterance layer if getDiarizationRequired() returns true.
- void setTaskParameters(String parameters) - Normally, a transcriber has no specific task configuration, so this implementation does nothing.
- abstract Graph transcribe(File speech, Graph transcript) - Transcribes the given audio file, saving the resulting transcript in the given graph.
- void transcribeFragments(Stream<File> speech, Consumer<Graph> consumer) - Transcribes all audio files in the given stream.
- Graph transform(Graph transcript) - Transforms the graph by calling transcribe(File,Graph) on it, if audio is accessible and the graph has no words.
-
Methods inherited from class nzilbb.ag.automation.Annotator
cancel, getAnnotatorId, getCancellationObservers, getConfig, getMinimumApiVersion, getPercentComplete, getPercentCompleteObservers, getRunning, getRunningObservers, getSchema, getStatus, getStatusObservers, getStore, getVersion, getWorkingDirectory, newConnection, setCancellationObservers, setConfig, setRdbConnectionFactory, setSchema, setStatus, setStore, setWorkingDirectory, transformFragments, transformTranscripts, uninstall
-
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface nzilbb.ag.GraphTransformer
apply
-
Methods inherited from interface nzilbb.util.MonitorableTask
getTaskId
-
-
-
-
Method Detail
-
getDiarizationRequired
public abstract boolean getDiarizationRequired()
Specify whether the transcriber needs the audio to be split into utterance chunks before transcribe(File,Graph) is called.
If the transcriber returns true when this method is called, it should assume that the participant, turn and utterance layers are populated when transcribe(File,Graph) is called, and that the utterance annotations define the start and end times of individual speaker utterances for transcription.
If the transcriber returns false when this method is called, it should assume that the turn and utterance layers are empty when transcribe(File,Graph) is called.
-
transcribe
public abstract Graph transcribe(File speech, Graph transcript) throws Exception
Transcribes the given audio file, saving the resulting transcript in the given graph.
- Parameters:
  speech - An audio file containing the speech to transcribe.
  transcript - The annotation graph that should contain the transcription.
  If the transcriber's getDiarizationRequired() returns false, the annotation graph may or may not have any annotations on the turn, utterance, and word layers. If there are existing annotations, they should be re-used if possible, or Annotation.destroy() should be called on each to ensure they're removed from the graph.
  If the transcriber's getDiarizationRequired() returns true, it should be assumed that the annotation graph has annotations on the participant, turn, and utterance layers, and that the utterance annotations define the start and end times of individual speaker utterances for transcription. In this case, the transcriber should fill in the labels of the given utterance annotations.
- Returns:
  The given graph. This should have annotations structured as follows:
  - Annotations on the participant layer, if the given transcript had no pre-existing participants.
  - Annotations on the turn layer (even if it's one big turn encompassing the whole transcript), with the parent(s) set to the corresponding participant annotations. The turn labels should match the participant labels.
  - Annotations on the utterance layer, with the parent(s) set to the corresponding turn annotations. The labels should be the transcript of the utterance.
  - Optionally, new annotations on the word layer, representing individual word tokens with alignment information, if available.
- Throws:
  Exception
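The returned structure described above (participant, then turn, then utterance, each pointing at its parent) can be illustrated with a small sketch. The `Node` class and `buildSingleSpeaker` method are hypothetical stand-ins invented for this example; they are not the nzilbb.ag Annotation or Graph classes, and they ignore anchors and alignment entirely. The sketch covers only the simplest case: one speaker and one big turn spanning the whole transcript.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the participant -> turn -> utterance hierarchy, using stand-in types. */
class TranscriptStructureSketch {

  /** Hypothetical stand-in for an annotation: a layer ID, a label, and an optional parent. */
  static class Node {
    final String layer;
    final String label;
    final Node parent;
    Node(String layer, String label, Node parent) {
      this.layer = layer;
      this.label = label;
      this.parent = parent;
    }
  }

  /** Builds the annotations for a single speaker whose speech forms one turn. */
  static List<Node> buildSingleSpeaker(String speaker, List<String> utteranceTexts) {
    List<Node> nodes = new ArrayList<>();
    Node participant = new Node("participant", speaker, null);
    nodes.add(participant);
    // One turn covering the whole transcript; its label matches the participant label.
    Node turn = new Node("turn", participant.label, participant);
    nodes.add(turn);
    // Each utterance is a child of the turn, labelled with its transcript text.
    for (String text : utteranceTexts) {
      nodes.add(new Node("utterance", text, turn));
    }
    return nodes;
  }

  public static void main(String[] args) {
    for (Node n : buildSingleSpeaker("speaker-1", List.of("hello there", "how are you"))) {
      String parent = n.parent == null ? "(none)" : n.parent.layer + ":" + n.parent.label;
      System.out.println(n.layer + " \"" + n.label + "\" parent=" + parent);
    }
  }
}
```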
-
transcribeFragments
public void transcribeFragments(Stream<File> speech, Consumer<Graph> consumer) throws Exception
Transcribes all audio files in the given stream.
Implementors may override this to provide more efficient processing in cases where overhead can be saved by invoking a recogniser only once for a collection of recordings, instead of one invocation per recording.
The default implementation simply creates an empty graph and calls transcribe(File,Graph) for each speech file.
- Parameters:
  speech - A stream of speech files to transcribe.
  consumer - A consumer for receiving the graphs once they're transcribed.
- Throws:
  Exception
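The one-graph-per-file loop described for the default implementation might look roughly like the sketch below. It is not the library's source: `FakeGraph` is a hypothetical stand-in for the annotation graph, the transcription step is passed in as a plain `BiFunction`, naming each graph after its file is an assumption of this example, and the checked exception handling of the real method is left out for brevity.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.stream.Stream;

/** Sketch of a per-file transcription loop, using stand-in types (NOT the real nzilbb.ag API). */
class FragmentLoopSketch {

  /** Hypothetical stand-in for an annotation graph: an ID plus the transcribed text. */
  static class FakeGraph {
    final String id;
    String text = "";
    FakeGraph(String id) { this.id = id; }
  }

  /** Creates one empty graph per file, transcribes into it, and hands it to the consumer. */
  static void transcribeEach(
      Stream<File> speech,
      Consumer<FakeGraph> consumer,
      BiFunction<File, FakeGraph, FakeGraph> transcribe) {
    speech.forEach(file -> {
      // Assumption for this sketch: the empty graph is named after the audio file.
      FakeGraph graph = new FakeGraph(file.getName());
      consumer.accept(transcribe.apply(file, graph));
    });
  }

  public static void main(String[] args) {
    List<FakeGraph> done = new ArrayList<>();
    transcribeEach(
        Stream.of(new File("a.wav"), new File("b.wav")),
        done::add,
        (file, graph) -> {
          graph.text = "transcript of " + file.getName();
          return graph;
        });
    for (FakeGraph g : done) {
      System.out.println(g.id + ": " + g.text);
    }
  }
}
```

An override that batches recordings would replace the per-file call with a single recogniser invocation, then pass each resulting graph to the consumer in the same way.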
-
setTaskParameters
public void setTaskParameters(String parameters) throws InvalidConfigurationException
Normally, a transcriber has no specific task configuration, so this implementation does nothing.
- Specified by:
  setTaskParameters in class Annotator
- Parameters:
  parameters - The configuration of the annotator, encoded in a String using whatever mechanism is preferred (serialization of Properties object, JSON, etc.)
- Throws:
  InvalidConfigurationException
-
getRequiredLayers
public String[] getRequiredLayers() throws InvalidConfigurationException
Requires participant and turn layers, and also utterance layer if getDiarizationRequired() returns true.
- Specified by:
  getRequiredLayers in class Annotator
- Returns:
  A list of layer IDs.
- Throws:
  InvalidConfigurationException - If setTaskParameters(String) or Annotator.setSchema(Schema) have not yet been called.
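The dependency between the required layers and getDiarizationRequired() can be sketched as below. This is an illustration only: the `requiredLayers` helper is hypothetical, and the layer IDs are hard-coded strings taken from the layer names used in this page, whereas a real implementation would presumably obtain them from the schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of how the required layers depend on whether diarization is required. */
class RequiredLayersSketch {

  /** Returns participant and turn, plus utterance when diarization is required. */
  static String[] requiredLayers(boolean diarizationRequired) {
    // Assumption: literal layer IDs stand in for IDs looked up from the schema.
    List<String> layers = new ArrayList<>(List.of("participant", "turn"));
    if (diarizationRequired) {
      layers.add("utterance");
    }
    return layers.toArray(new String[0]);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(requiredLayers(false)));
    System.out.println(Arrays.toString(requiredLayers(true)));
  }
}
```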
-
getOutputLayers
public String[] getOutputLayers() throws InvalidConfigurationException
Determines which layers the annotator will create/update/delete annotations on.
- Specified by:
  getOutputLayers in class Annotator
- Returns:
  A list of layer IDs.
- Throws:
  InvalidConfigurationException - If setTaskParameters(String) or Annotator.setSchema(Schema) have not yet been called.
-
transform
public Graph transform(Graph transcript) throws TransformationException
Transforms the graph by calling transcribe(File,Graph) on it, if audio is accessible and the graph has no words.
- Parameters:
  transcript - The graph to transform.
- Returns:
  The given graph, transformed.
- Throws:
  TransformationException - If the transformation cannot be completed.
-
-