By default LaBB-CAT includes a layer manager called the Flat Lexicon Tagger, which can be configured to annotate words with data from a dictionary loaded from a plain text file (e.g. a CSV file). The file must have a 'flat' structure in the sense that it's a simple list of dictionary entries with a fixed number of columns/fields, rather than having a complex structure.

loadLexicon(
  labbcat.url,
  file,
  lexicon,
  field.delimiter,
  field.names,
  quote = "",
  comment = "",
  skip.first.line = FALSE,
  no.progress = FALSE
)

Arguments

labbcat.url

URL to the LaBB-CAT instance.

file

The full path name of the lexicon file.

lexicon

The name for the resulting lexicon, e.g. 'cmudict'. If the named lexicon already exists, it will be completely replaced with the contents of the file (i.e. all existing entries will be deleted before adding new entries from the file).

field.delimiter

The character used to delimit fields in the file, e.g. ',' for Comma Separated Values (CSV) files. If this is " - ", rows are split on only the first space, in line with common dictionary formats.

field.names

A list of field names, delimited by field.delimiter, e.g. 'Word,Pronunciation'.

quote

The character used to quote field values (if any), e.g. '"'.

comment

The character used to indicate that a line is a comment rather than an entry (if any), e.g. '#'.

skip.first.line

Whether to ignore the first line of the file (because it contains field names).

no.progress

TRUE to suppress the visual progress bar. Otherwise, a progress bar will be shown when interactive().

Value

An error message, or NULL if the upload was successful.
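
Because the value is NULL on success and an error message otherwise, a caller can test it with is.null(). The following is a minimal sketch only: reportUpload is a hypothetical helper that is not part of this package, and it assumes the error message is returned as a character string.

```r
# Hypothetical helper (not part of the package): interpret the value
# returned by loadLexicon(), assumed to be NULL or a character string.
reportUpload <- function(result) {
  if (is.null(result)) {
    message("Lexicon uploaded.")
    return(invisible(TRUE))
  }
  # Anything other than NULL is treated as an error message.
  warning("Lexicon upload failed: ", result, call. = FALSE)
  invisible(FALSE)
}

# Usage, assuming labbcat.url is already defined and cmudict.txt exists:
if (FALSE) {
  reportUpload(
    loadLexicon(labbcat.url, "cmudict.txt", "cmudict", " - ", "Word - Pron",
                comment = ";"))
}
```

The helper returns its result invisibly, so it can be used in a script to decide whether to carry on with later steps.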

Details

This function uploads a flat lexicon file of the kind described above to LaBB-CAT, for use in tagging tokens.

You must have editing privileges in LaBB-CAT in order to be able to use this function.

Examples

if (FALSE) {
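## Upload the CMU Pronouncing Dictionary
## (arguments follow the order given in the usage: file, lexicon, field.delimiter, field.names)
loadLexicon(labbcat.url, "cmudict.txt", "cmudict", " - ", "Word - Pron",
            quote = "", comment = ";", skip.first.line = FALSE)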
}
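
As a further illustration of how the arguments relate to the file layout, the sketch below writes a small CSV lexicon with a header line and a comment line, checks locally that it has a flat two-column structure, and then shows the corresponding upload call. The file contents and the lexicon name 'mini-lexicon' are illustrative assumptions, and the upload is not run because it assumes labbcat.url is defined and that you have editing privileges.

```r
# Write a tiny two-column lexicon to a temporary CSV file.
# The entries are illustrative only.
lexicon.file <- tempfile(fileext = ".csv")
writeLines(c(
  "Word,Pronunciation",          # header line: skip.first.line = TRUE
  "# illustrative entries only", # comment line: comment = "#"
  "cat,K AE1 T",
  "dog,D AO1 G"
), lexicon.file)

# Local sanity check: every entry should have the same number of fields.
parsed <- read.csv(lexicon.file, comment.char = "#", quote = "\"")
stopifnot(ncol(parsed) == 2)

# The matching upload call (not run).
if (FALSE) {
  loadLexicon(labbcat.url, lexicon.file, "mini-lexicon", ",",
              "Word,Pronunciation",
              quote = "\"", comment = "#", skip.first.line = TRUE)
}
```

Here field.names repeats the header line because the header itself is skipped rather than parsed.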