
arxiv: cmp-lg/9505035 · v1 · pith:HX7SVSQNnew · submitted 1995-05-19 · cmp-lg · cs.CL

Development of a Spanish Version of the Xerox Tagger

classification: cmp-lg, cs.CL
keywords: spanish, tagger, corpus, model, order, performed, presented, some

This paper describes work performed within the CRATER (Corpus Resources And Terminology ExtRaction, MLAP-93/20) project, funded by the Commission of the European Communities. In particular, it addresses the adaptation of the Xerox Tagger to Spanish in order to tag the Spanish version of the ITU (International Telecommunications Union) corpus. The model implemented by this tagger is briefly presented, along with modifications made to it so that it can use some parameters that are not probabilistically estimated. Initial decisions, such as the choice of tagset, lexicon, and training corpus, are also discussed. Finally, results are presented and the benefits of the "mixed model" are justified.
