uima-user mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: difficulty using Dictionary Annotator and Hmm Tagger
Date Tue, 16 Jun 2009 08:26:47 GMT
Estelle,

this may just be a usability issue with the
DocumentAnalyzer.  If your analysis chain
works with CVD, there is no reason to believe
it wouldn't work with the DocumentAnalyzer.
Did you follow the instructions described here?
http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/tools/tools.html#ugr.tools.doc_analyzer.viewing_results

The difference in the log file is no cause
for alarm.  The process trace is logged by
CVD itself, not the annotators.  Looks to
me like the POS tagger is not logging anything,
neither in CVD nor in DocumentAnalyzer.
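
To check for yourself which components actually write to the log, a
minimal sketch along these lines may help. It assumes the line layout
seen in the excerpts quoted below (date, time, a thread number, then
"Source.method: LEVEL:"), and that each entry sits on one line in the
real file (the mail client wrapped them here). The function name and
the sample text are made up for illustration; they are not part of UIMA.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpts quoted in this thread:
#   DD/MM/YY HH:MM:SS - <thread>: <Source.method>: <LEVEL>: <message>
LOG_LINE = re.compile(
    r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} - \d+: (?P<source>\S+): (?P<level>[A-Z]+):"
)


def count_log_sources(log_text):
    """Count log entries per emitting component (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            # Drop the trailing method name, e.g. ".process" or ".internalRunAE(1570)".
            component = match.group("source").rsplit(".", 1)[0]
            counts[component] += 1
    return counts


# Sample built from the CVD excerpt in this thread, with wrapped lines rejoined.
SAMPLE = """\
16/06/09 09:23:52 - 10: WhitespaceTokenizer.initialize: INFO: "Whitespace tokenizer successfully initialized"
16/06/09 09:24:05 - 10: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace tokenizer typesystem initialized"
16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer starts processing"
16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer finished processing"
16/06/09 09:24:05 - 10: org.apache.uima.tools.cvd.MainFrame.internalRunAE(1570): INFO: Process trace of AE run:
Component Name: HmmTaggerTAE
"""

if __name__ == "__main__":
    for component, n in count_log_sources(SAMPLE).most_common():
        print(component, n)
```

On that sample, only WhitespaceTokenizer and the CVD MainFrame class
show up as log sources. The tagger's name appears only inside the
process trace that CVD prints, so it is not counted as a source.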

HTH,
Thilo

estelle wrote:
> Tommaso Teofili <tommaso.teofili@...> writes:
> 
>> Hi, try using the CAS Visual Debugger; I think it's very useful when you
>> start developing with UIMA. The HMM tagger needs the Whitespace Tokenizer
>> to process the document first in order to annotate POSs.
>> The flow order is significant, so beware.
>> For the Dictionary, does the dictionary contain any entries? Is the
>> annotator pointed at the right file?
>> Check the log at runtime too.
>> Please provide more info.
>> Regards,
>> Tommaso
>>
>> 2009/6/15 estelle <ed@...>
>>
>>> Hello,
>>> I'm new to UIMA and I am currently testing the sandbox addons.
>>> I'm testing them with the help of the Document Analyzer utility.
>>> The Dictionary Annotator and the Hmm Tagger seem to work fine (there
>>> are no error messages), but once the text is processed, I can't see
>>> any annotations in the Annotation results panel.
>>>
>>> Can someone help me, please?
>>>
>>>
> 
> Hello and thank you for your answer. 
> 
> HmmTagger and DictionaryAnnotator work fine with the CAS Visual Debugger. 
> 
> I do use the aggregateAnnotator "Tokenizer > HmmTagger" for Tagging and the
> "Tokenizer > DictionaryAnnotator" aggregateAnnotator for dictionary annotation.
> 
> The entries in the dictionary are the default entries + an entry for the word
> "UIMA" that I've added to make sure it would match on the sample texts.
> 
> I have checked the logfiles and it seems that only the WhiteSpaceTokenizer works
> when launching the Tokenizer + HmmTagger aggregation.
> 
> 
> Log file from running "Tokenizer + Hmm" with Document Analyzer : 
> 
> 16/06/09 09:32:09 - 12: WhitespaceTokenizer.initialize: INFO: "Whitespace
> tokenizer successfully initialized"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace
> tokenizer typesystem initialized"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:32:10 - 13: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 
> 
> 
> Log file from running "Tokenizer + Hmm" with CAS Visual Debugger : 
> 
> 16/06/09 09:23:52 - 10: WhitespaceTokenizer.initialize: INFO: "Whitespace
> tokenizer successfully initialized"
> 16/06/09 09:24:05 - 10: WhitespaceTokenizer.typeSystemInit: INFO: "Whitespace
> tokenizer typesystem initialized"
> 16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> starts processing"
> 16/06/09 09:24:05 - 10: WhitespaceTokenizer.process: INFO: "Whitespace tokenizer
> finished processing"
> 16/06/09 09:24:05 - 10: org.apache.uima.tools.cvd.MainFrame.internalRunAE(1570):
> INFO: Process trace of AE run:
> Component Name: HmmTaggerTAE
> Event Type: Analysis
> Duration: 179ms (100%)
> Sub-events:
>         Component Name: WhitespaceTokenizer
>         Event Type: Analysis
>         Duration: 7ms (3,91%)
> 
>         Component Name: Hidden Markov Model - Part of Speech Tagger
>         Event Type: Analysis
>         Duration: 162ms (90,5%)
> 
>         Component Name: Fixed Flow Controller
>         Event Type: Analysis
>         Duration: 5ms (2,79%)
> 
> 
> 
> 
