This is a client library for communicating with LaBB-CAT web application servers.
LaBB-CAT is a web-based linguistic annotation store that stores audio or video recordings, text transcripts, and other annotations.
Annotations of various types can be automatically generated or manually added.
LaBB-CAT servers usually host password-protected linguistic corpora, which can be accessed manually via a web browser, or programmatically using a client library like this one.
The library mirrors the functions of nzilbb.ag.IGraphStoreQuery and related Java interfaces, so that API calls are standardized across client languages.
nzilbb-labbcat is available in the Python Package Index here.
Detailed Python documentation is available here.
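The following example shows how to upload a transcript, wait for automatic annotation to finish, extract annotation labels, search for segments and measure their formants, and delete the transcript: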
import labbcat
# Connect to the LaBB-CAT annotation store
corpus = labbcat.LabbcatEdit("http://localhost:8080/labbcat", "labbcat", "labbcat")
# List the corpora on the server
corpora = corpus.getCorpusIds()
# List the transcript types
transcript_type_layer = corpus.getLayer("transcript_type")
transcript_types = transcript_type_layer["validLabels"]
# Upload a transcript
corpus_id = corpora[0]
transcript_type = next(iter(transcript_types))
taskId = corpus.newTranscript(
    "test/labbcat-py.test.txt", None, None, transcript_type, corpus_id, "test")
# wait for the annotation generation to finish
corpus.waitForTask(taskId)
corpus.releaseTask(taskId)
# get the "POS" layer annotations
annotations = corpus.getAnnotations("labbcat-py.test.txt", "pos")
labels = [annotation["label"] for annotation in annotations]
# find all /a/ segments (phones) in the whole corpus
results = corpus.getMatches({ "segment" : "a" })
# get the start/end times of the segments
segments = corpus.getMatchAnnotations(results, "segment", offsetThreshold=50)
# get F1/F2 at the midpoint of each /a/ vowel
formantsAtMidpoint = corpus.processWithPraat(
    labbcat.praatScriptFormants(), 0.025, results, segments)
# delete the transcript from the corpus
corpus.deleteTranscript("labbcat-py.test.txt")
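In the example above, an error partway through would leave the test transcript on the server and the task unreleased. One possible way to guard against that is to wrap the upload, wait, and extraction steps in try/finally blocks. The sketch below is illustrative only: the helper name annotate_and_collect is hypothetical, the parameter name episode is a guess at what the final "test" argument means, and the transcript ID is assumed to be the base name of the uploaded file, as it is in the example above. It uses only the method calls already shown, passed the same positional arguments.

```python
import os


def annotate_and_collect(store, transcript_path, transcript_type, corpus_id,
                         layer_id, episode="test"):
    """Upload a transcript, return the labels of one layer, then delete it.

    `store` is any object offering the methods used in the example above
    (for instance a labbcat.LabbcatEdit instance). This helper is a sketch,
    not part of the library.
    """
    # Assumption: the transcript ID is the uploaded file's base name.
    transcript_id = os.path.basename(transcript_path)
    task_id = store.newTranscript(
        transcript_path, None, None, transcript_type, corpus_id, episode)
    try:
        try:
            # Wait for the annotation generation to finish.
            store.waitForTask(task_id)
        finally:
            # Release the task even if waiting failed.
            store.releaseTask(task_id)
        annotations = store.getAnnotations(transcript_id, layer_id)
        return [annotation["label"] for annotation in annotations]
    finally:
        # Remove the transcript whether or not the steps above succeeded.
        store.deleteTranscript(transcript_id)
```

With the corpus object from the example, this could be called as annotate_and_collect(corpus, "test/labbcat-py.test.txt", transcript_type, corpus_id, "pos"). Always deleting the transcript suits throwaway test uploads like the one shown here; for data that should stay on the server, the outer try/finally would be dropped.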