uima-user mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: difficulty using Dictionary Annotator and Hmm Tagger
Date Tue, 16 Jun 2009 07:43:59 GMT
It seems to me that the Hmm Tagger is working properly in the CVD. When you run
the Hmm Tagger, it does not create a separate annotation for each part of
speech; it fills a property on the Token annotations created by the Whitespace
Tokenizer (I can't recall the name of the property, something like 'pos'). So
compare the Token annotations before and after the Tagger runs.
The Document Analyzer log shows only that the Whitespace Tokenizer has
started, but have a look at the Token properties, as described above.
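
On the log point: a component missing from the log does not prove it was
skipped. In the CVD run quoted below, the process trace shows the tagger taking
162ms although no log entry carries its name. To see at a glance which
components did write entries, the log can be tallied per component. This is a
minimal sketch in plain Java, assuming the entry header format seen in the
quoted logs (date, time, thread number, Component.method, level); LogTally and
countEntries are hypothetical names, not part of UIMA.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of UIMA): counts log entries per component method. */
final class LogTally {

    // Assumed entry header, as in the quoted logs:
    // "16/06/09 09:32:10 - 13: Component.method: LEVEL:"
    private static final Pattern ENTRY = Pattern.compile(
            "^\\d{2}/\\d{2}/\\d{2} \\d{2}:\\d{2}:\\d{2} - \\d+: ([\\w.]+)\\.(\\w+): [A-Z]+:");

    /** Returns "Component.method" mapped to its number of entries, in first-seen order. */
    static Map<String, Integer> countEntries(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String raw : lines) {
            // Drop mail quote markers ("> ") so a quoted log can be pasted directly.
            String line = raw.replaceFirst("^(>\\s?)+", "");
            Matcher m = ENTRY.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1) + "." + m.group(2), 1, Integer::sum);
            }
            // Lines without an entry header are wrapped continuations and are skipped.
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "> 16/06/09 09:32:09 - 12: WhitespaceTokenizer.initialize: INFO: \"Whitespace",
                "> tokenizer successfully initialized\"",
                "> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: \"Whitespace",
                "> tokenizer",
                "> starts processing\"",
                "> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: \"Whitespace",
                "> tokenizer",
                "> finished processing\"");
        // Prints {WhitespaceTokenizer.initialize=1, WhitespaceTokenizer.process=2}
        System.out.println(countEntries(sample));
    }
}
```

If only WhitespaceTokenizer entries show up, that is consistent with both logs
in this thread, so the Token properties remain the thing to check.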
Bye,
Tommaso


2009/6/16 estelle <ed@similis.org>

> Tommaso Teofili <tommaso.teofili@...> writes:
>
> >
> > Hi, try the CAS Visual Debugger; I think it's very useful when you start
> > developing with UIMA. The HMM tagger needs the Whitespace Tokenizer to
> > process the document first in order to annotate POSs.
> > The flow order is significant, so beware.
> > For the Dictionary, is there any entry inside the dictionary? Is it
> > pointed at the right place?
> > Check the log at runtime too.
> > Provide more info.
> > Regards,
> > Tommaso
> >
> > 2009/6/15 estelle <ed@...>
> >
> > > Hello,
> > > I'm new to UIMA and I am currently testing the sandbox add-ons.
> > > I'm testing them with the help of the Document Analyzer utility.
> > > The Dictionary Annotator and the Hmm Tagger seem to work fine (there
> > > are no error messages), but once the text is processed, I can't see any
> > > annotations in the Annotation results panel.
> > >
> > > Can someone help me please?
> > >
> > >
> >
>
> Hello and thank you for your answer.
>
> HmmTagger and DictionaryAnnotator work fine with the CAS Visual Debugger.
>
> I do use the aggregateAnnotator "Tokenizer > HmmTagger" for Tagging and the
> "Tokenizer > DictionaryAnnotator" aggregateAnnotator for dictionary
> annotation.
>
> The entries in the dictionary are the default entries + an entry for the
> word "UIMA" that I've added to make sure it would match on the sample texts.
>
> I have checked the logfiles and it seems that only the WhiteSpaceTokenizer
> works when launching the Tokenizer + HmmTagger aggregation.
>
>
> Log file from running "Tokenizer + Hmm" with Document Analyzer :
>
> 16/06/09 09:32:09 - 12: WhitespaceTokenizer.initialize: INFO: "Whitespace tokenizer successfully initialized"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace tokenizer typesystem initialized"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
>
>
>
> Log file from running "Tokenizer + Hmm" with CAS Visual Debugger :
>
> 16/06/09 09:23:52 - 10: WhitespaceTokenizer.initialize: INFO: "Whitespace tokenizer successfully initialized"
> 16/06/09 09:24:05 - 10: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace tokenizer typesystem initialized"
> 16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
> 16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
> 16/06/09 09:24:05 - 10: org.apache.uima.tools.cvd.MainFrame.internalRunAE(1570): INFO: Process trace of AE run:
> Component Name: HmmTaggerTAE
> Event Type: Analysis
> Duration: 179ms (100%)
> Sub-events:
>        Component Name: WhitespaceTokenizer
>        Event Type: Analysis
>        Duration: 7ms (3,91%)
>
>        Component Name: Hidden Markov Model - Part of Speech Tagger
>        Event Type: Analysis
>        Duration: 162ms (90,5%)
>
>        Component Name: Fixed Flow Controller
>        Event Type: Analysis
>        Duration: 5ms (2,79%)
>
>
>
>
>
>
