GALENA

Tutorial for GALENA tagger with text interface

This is a simple tutorial for getting started with the GALENA tagger and morphological analyzer.

The tagger is usually distributed as a binary file named gld (from "galena lexer driver"). Consult the download page to obtain the binary file for your machine and operating system. Alternatively, you can try the GALENA tagger text demo page. Currently, the tagger is able to recognize and correctly tag 13,667 lemmas of common Spanish, at an average speed of about 1,360 words per second on a SPARCstation 20.


Help

Once you have the binary file for your system, you can invoke it with the -h option in order to obtain help about its syntax:
 $ gld -h
 Usage: gld [-e | -s] [-i <file>] [-o <file>] [-d <file>] [-h]

 -e etiquetas en inglés y en formato largo
 -s etiquetas en español y en formato largo
 -i <file> entrada de texto desde el fichero <file>
 -o <file> escribir el resultado en el fichero <file>
 -d <file> desambigüar usando la matriz situada en el fichero <file>
 -h mostrar esta ayuda, cancelar otras opciones.

 $

The help text is printed in Spanish. In English, the options mean:

 -e English tags in long format
 -s Spanish tags in long format
 -i <file> read the input text from <file>
 -o <file> write the result to <file>
 -d <file> disambiguate using the matrix stored in <file>
 -h show this help and cancel the other options


Input format

If you invoke gld without options, you must type the text you want to analyze on the keyboard. Special characters are written with a plain-text notation: in the example at the end of this tutorial, for instance, dártelo is typed as d'artelo.
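
The apostrophe notation can be produced mechanically before text is sent to gld. The following Python sketch is only an illustration: it assumes that an acute accent is written as an apostrophe placed before the vowel, which is inferred solely from the forms d'artelo, t'u and 'el in this tutorial's example. The function name escape_acute is hypothetical, and other special characters (such as ñ or ü) are left untouched because their notation is not documented here.

```python
import unicodedata

# Assumption: only acute accents are handled, following the three forms
# visible in this tutorial (d'artelo, t'u, 'el). Everything else is kept.
ACUTE = "\u0301"  # combining acute accent


def escape_acute(text: str) -> str:
    """Rewrite acute-accented letters as an apostrophe plus the base letter."""
    # Compose first so that each accented letter is a single character.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] == ACUTE:
            out.append("'" + decomposed[0])
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, escape_acute("dártelo") gives d'artelo, matching the form typed in the example.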


Output format

If you do not pass any option to gld, the output will appear on your screen in the following format:

["The word analyzed", (The label in short format), "lemma"]

for each possible label of the word. Here is an example of a word with three possible taggings (verb, noun and preposition):

$ gld
sobre <Enter>
=> ["sobre", (Vysps0), "sobrar"]
["sobre", (Scms), "sobre"]
["sobre", (P), "sobre"]

$
You can end the input by pressing CTRL-D.
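
If you want to process this output in a program, the short format is regular enough to be read with a small parser. The following Python sketch is an illustration, not part of GALENA: the names Reading and parse_short are hypothetical, and it assumes, as the example suggests, that "=>" marks the first reading of each word and that further readings of the same word follow without it. It handles the short format only.

```python
import re
from typing import List, NamedTuple


class Reading(NamedTuple):
    word: str
    tag: str
    lemma: str


# One short-format line, e.g. ["sobre", (Scms), "sobre"], optionally
# preceded by "=>" (assumed to mark the first reading of a word).
LINE_RE = re.compile(r'^\s*(=>)?\s*\["(.*)", \((.*)\), "(.*)"\]\s*$')


def parse_short(output: str) -> List[List[Reading]]:
    """Group readings by word: a new group starts at each "=>" line."""
    groups: List[List[Reading]] = []
    for line in output.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # prompts, echoed input, blank lines
        arrow, word, tag, lemma = m.groups()
        if arrow or not groups:
            groups.append([])
        groups[-1].append(Reading(word, tag, lemma))
    return groups
```

Applied to the output shown above, parse_short returns a single group for sobre containing the three readings Vysps0, Scms and P.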


Useful options

If you do not know the tag set, use the option -e or -s to obtain long-format output.

If you want to analyze the text contained in a file, use the option -i file. If you want to write the result to a file, use the option -o file.

You can also use the option -d, which allows you to specify a disambiguation matrix.
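
When gld is called from a script, these options can be assembled programmatically. The following Python sketch builds the argument list from the flags listed in the help text (-e, -s, -i, -o, -d). The function build_gld_command and the file names in the usage note are hypothetical, and the sketch assumes that the gld binary is on your PATH.

```python
from typing import List, Optional


def build_gld_command(
    long_tags: Optional[str] = None,    # "e" or "s": selects the -e or -s flag
    input_file: Optional[str] = None,   # passed with -i
    output_file: Optional[str] = None,  # passed with -o
    matrix_file: Optional[str] = None,  # passed with -d
    binary: str = "gld",                # assumption: gld is on the PATH
) -> List[str]:
    """Return an argument list for gld; -e and -s are mutually exclusive."""
    if long_tags not in (None, "e", "s"):
        raise ValueError("long_tags must be None, 'e' or 's'")
    cmd = [binary]
    if long_tags:
        cmd.append("-" + long_tags)
    if input_file:
        cmd += ["-i", input_file]
    if output_file:
        cmd += ["-o", output_file]
    if matrix_file:
        cmd += ["-d", matrix_file]
    return cmd
```

For example, build_gld_command("e", "texto.txt", "salida.txt") yields the list gld, -e, -i, texto.txt, -o, salida.txt, which can then be passed to subprocess.run(cmd, check=True).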


Example

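We show the tagging of the sentence El hombre ha dejado el paquete sobre el suelo antes de dártelo, a causa del cansancio, which exhibits several interesting characteristics: a compound verb form tagged as a single unit (ha dejado), enclitic pronouns split from the verb (dártelo), the contraction del expanded into de and el, and several ambiguous words. Here is the output: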
 $ gld -e
 El hombre ha dejado el paquete sobre el suelo antes de d'artelo,
 a causa del cansancio.
 => ["El", (Dms), Articulo, masc, sing, "El"]
 => ["hombre", (Scms), Sustantivo comun, masc, sing, "hombre"]
 => ["ha dejado", (V3sPi0), Verbo, tercera, sing, antepresente, indicativo,
 genero no aplicable, "dejar"]
 => ["el", (Dms), Articulo, masc, sing, "el"]
 => ["paquete", (Scms), Sustantivo comun, masc, sing, "paquete"]
 => ["en", (P), Preposicion, "en"]
 => ["el", (Dms), Articulo, masc, sing, "el"]
 => ["suelo", (V1spi0), Verbo, primera, sing, presente, indicativo, genero no
 aplicable, "soler"]
 ["suelo", (Scms), Sustantivo comun, masc, sing, "suelo"]
 => ["antes", (Scmp), Sustantivo comun, masc, plur, "ante"]
 ["antes", (Wn), Adverbio nuclear, "antes"]
 => ["de", (P), Preposicion, "de"]
 => ["d'ar", (V000f02), Verbo, persona no aplicable, numero persona no aplicable,
 tiempo verbal no aplicable, infinitivo, genero no aplicable, 2 pronombres
 cliticos, "dar"]
 => ["te", (Re2syy), Pronombre Personal enclitico atono, segunda, sing, acusativo
 y dativo, masc y fem, "t'u"]
 => ["lo", (Re3sam), Pronombre Personal enclitico atono, tercera, sing, acusativo,
 masc, "'el"]
 => [",", (Q,), Marca de Puntuacion coma, ","]
 => ["a", (P), Preposicion, "a"]
 ["a", (Scfs), Sustantivo comun, fem, sing, "a"]
 => ["causa", (V2spm0), Verbo, segunda, sing, presente, imperativo, genero no
 aplicable, "causar"]
 ["causa", (V3spi0), Verbo, tercera, sing, presente, indicativo, genero no
 aplicable, "causar"]
 ["causa", (Scfs), Sustantivo comun, fem, sing, "causa"]
 => ["de", (P), Preposicion, "de"]
 => ["el", (Dms), Articulo, masc, sing, "el"]
 => ["cansancio", (Scms), Sustantivo comun, masc, sing, "cansancio"]
 => [".", (Q.), Marca de Puntuacion punto, "."]
 
 $
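
Long-format output like the one above can also be read by a program, for instance to list the words that remain ambiguous. The following Python sketch is an illustration under stated assumptions: the names parse_long and ambiguous are hypothetical, "=>" is assumed to mark the first reading of each word, and an entry that wraps over several lines is assumed to end at the first line finishing with "].

```python
import re
from typing import List, Tuple

Token = Tuple[str, List[Tuple[str, str]]]  # (word, [(tag, lemma), ...])

# One complete long-format entry: ["word", (TAG), description, "lemma"]
ENTRY_RE = re.compile(r'^(=> )?\["(.*?)", \((.*?)\), (.*), "(.*)"\]$')


def parse_long(output: str) -> List[Token]:
    """Parse long-format output, joining entries wrapped over several lines."""
    tokens: List[Token] = []
    buffer = ""
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("=> ["):
            buffer = ""  # a new word always starts a fresh entry
        elif not buffer and not line.startswith("["):
            continue  # prompt or echoed input text
        buffer = buffer + " " + line if buffer else line
        if not buffer.endswith('"]'):
            continue  # the entry continues on the next line
        m = ENTRY_RE.match(buffer)
        buffer = ""
        if m is None:
            continue
        arrow, word, tag, _description, lemma = m.groups()
        if arrow or not tokens:
            tokens.append((word, []))
        tokens[-1][1].append((tag, lemma))
    return tokens


def ambiguous(tokens: List[Token]) -> List[str]:
    """Return the words that received more than one reading."""
    return [word for word, readings in tokens if len(readings) > 1]
```

In the example above, suelo is one of the words that ambiguous would report, since it is tagged both as a form of the verb soler and as a common noun.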

Send comments and suggestions to webmaster@coleweb.dc.fi.udc.es