uima-user mailing list archives

From estelle ...@similis.org>
Subject Re: difficulty using Dictionary Annotator and Hmm Tagger
Date Tue, 16 Jun 2009 07:35:09 GMT
Tommaso Teofili <tommaso.teofili@...> writes:

> 
> Hi, try the CAS Visual Debugger; I think it's very useful when you start
> developing with UIMA. The HMM tagger needs the Whitespace Tokenizer to
> process the document first in order to annotate parts of speech.
> The flow order is significant, so beware.
> For the Dictionary, is there any entry inside the dictionary? Does its path
> point to the right place?
> Check the log at runtime too.
> Please provide more info.
> Regards,
> Tommaso
> 
> 2009/6/15 estelle <ed@...>
> 
> > Hello,
> > I'm new to UIMA and I am currently testing the sandbox addons.
> > I'm testing them with the Document Analyzer utility.
> > The Dictionary Annotator and the Hmm Tagger seem to work fine (there are
> > no error messages), but once the text is processed, I can't see any
> > annotations in the Annotation results panel.
> >
> > Can someone help me, please?
> >
> >
> 

Hello, and thank you for your answer.

The HmmTagger and the DictionaryAnnotator both work fine with the CAS Visual
Debugger.

For tagging I use the aggregate annotator "Tokenizer > HmmTagger", and for
dictionary annotation the aggregate annotator "Tokenizer > DictionaryAnnotator".

The dictionary contains the default entries plus an entry for the word "UIMA",
which I added to make sure it would match in the sample texts.

I have checked the log files, and it seems that only the WhitespaceTokenizer
runs when I launch the Tokenizer + HmmTagger aggregate.


Log file from running "Tokenizer + Hmm" with the Document Analyzer:

16/06/09 09:32:09 - 12: WhitespaceTokenizer.initialize: INFO: "Whitespace tokenizer successfully initialized"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace tokenizer typesystem initialized"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
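
To check which components actually wrote to a log like the one above, the entries can be counted per component. The following is only a rough sketch: the regular expression is an assumption derived from the timestamp and "Component.method:" layout visible in these lines, and the names components_seen and SAMPLE_LOG are made up for illustration.

```python
import re
from collections import Counter

# Assumed entry layout, taken from the log lines above:
# 'dd/mm/yy hh:mm:ss - <thread>: <Component>.<method>[(<line>)]: ...'
ENTRY = re.compile(
    r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} - \d+: ([\w.]+)\.\w+(?:\(\d+\))?:"
)

def components_seen(log_text):
    """Count log entries per component; wrapped continuation lines are ignored."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Short excerpt of the Document Analyzer log, wrapped as in the mail.
SAMPLE_LOG = '''
16/06/09 09:32:09 - 12: WhitespaceTokenizer.initialize: INFO: "Whitespace
tokenizer successfully initialized"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace
tokenizer typesystem initialized"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
starts processing"
16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
finished processing"
'''

for component, count in components_seen(SAMPLE_LOG).items():
    print(component, count)
```

On this excerpt, the only component that appears is WhitespaceTokenizer, which matches what I see in the full log.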



Log file from running "Tokenizer + Hmm" with the CAS Visual Debugger:

16/06/09 09:23:52 - 10: WhitespaceTokenizer.initialize: INFO: "Whitespace tokenizer successfully initialized"
16/06/09 09:24:05 - 10: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace tokenizer typesystem initialized"
16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:24:05 - 10: org.apache.uima.tools.cvd.MainFrame.internalRunAE(1570): INFO: Process trace of AE run:
Component Name: HmmTaggerTAE
Event Type: Analysis
Duration: 179ms (100%)
Sub-events:
        Component Name: WhitespaceTokenizer
        Event Type: Analysis
        Duration: 7ms (3,91%)

        Component Name: Hidden Markov Model - Part of Speech Tagger
        Event Type: Analysis
        Duration: 162ms (90,5%)

        Component Name: Fixed Flow Controller
        Event Type: Analysis
        Duration: 5ms (2,79%)
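
The process trace above can be turned into a small table of components and durations, which makes it easier to see which delegates of the aggregate ran. This is a minimal sketch that assumes the "Component Name:" / "Duration:" layout shown in the trace, including the comma used as decimal separator; parse_trace and SAMPLE_TRACE are hypothetical names.

```python
import re

DURATION = re.compile(r"Duration:\s*(\d+)ms\s*\(([\d.,]+)%\)")

def parse_trace(trace_text):
    """Return (component name, duration in ms, share in percent) tuples."""
    events = []
    name = None
    for raw in trace_text.splitlines():
        line = raw.strip()
        if line.startswith("Component Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Duration:") and name is not None:
            match = DURATION.match(line)
            if match:
                # The trace writes percentages with a decimal comma.
                share = float(match.group(2).replace(",", "."))
                events.append((name, int(match.group(1)), share))
            name = None
    return events

# The trace as printed by the CAS Visual Debugger run.
SAMPLE_TRACE = '''
Component Name: HmmTaggerTAE
Event Type: Analysis
Duration: 179ms (100%)
Sub-events:
        Component Name: WhitespaceTokenizer
        Event Type: Analysis
        Duration: 7ms (3,91%)

        Component Name: Hidden Markov Model - Part of Speech Tagger
        Event Type: Analysis
        Duration: 162ms (90,5%)

        Component Name: Fixed Flow Controller
        Event Type: Analysis
        Duration: 5ms (2,79%)
'''

for name, millis, share in parse_trace(SAMPLE_TRACE):
    print(f"{name}: {millis} ms ({share}%)")
```

In this trace the tagger accounts for 162 ms even though it writes no INFO line of its own, so a missing log line may not by itself mean that the tagger was skipped.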





