uima-user mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: Dictionary annotator - added parameter to make analysis on different SOFAs
Date Thu, 06 Feb 2014 13:18:42 GMT
On 06.02.2014, at 12:02, Luca Foppiano <luca@foppiano.org> wrote:

> On Tue, Feb 4, 2014 at 10:40 PM, Richard Eckart de Castilho
> <rec@apache.org> wrote:
> 
> [...]
> 
> 
>> The correct mapping for the dictionaryEngine should be
>> 
>> builder.add(dictionaryEngine,
>>  CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);
>> 
>> so the SOFA_NAME_TEXT_ONLY is supplied as the default view to the
>> dictionaryEngine.
>> 
>> Similarly, it should be possible to remove the view parameter from
>> whitespaceEngine and the getView call from the consumer and use these
>> mappings:
>> 
>> builder.add(preparationEngine);
>> builder.add(whitespaceEngine,
>>  CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);
>> builder.add(dictionaryEngine,
>>  CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);
>> builder.add(casConsumer,
>>  CAS.NAME_DEFAULT_SOFA, SimpleParserAE.SOFA_NAME_TEXT_ONLY);
>> 
>> 
> Thanks! With your help I managed to make it work.
> 
> So, to see if I've understood:
>    If I have an annotator A that reads from the "initial view" and writes to
> the view 'BAO' (hardcoded inside the annotator), I could remap the sofas in
> this way:
> 
> builder.add(myAnnotator,
>   CAS.NAME_DEFAULT_SOFA, "MyNewFancySofaInput",
>   "BAO", "MyNewFancySofaOutput");

Yep - sounds right. I'd probably name things a bit differently though.
What counts as input and output changes from component to component, so
I'd use other names at the pipeline level, e.g.

 RAW_DATA
 EXTRACTED_TEXT
 TRANSLATED_TEXT

I'd hardcode simple names like "INPUT" and "OUTPUT" in the components
and then map these to the pipeline-level names:

builder.add(TextExtractor,
  "INPUT",  "RAW_DATA",
  "OUTPUT", "EXTRACTED_TEXT");
builder.add(Parser,
  CAS.NAME_DEFAULT_SOFA, "EXTRACTED_TEXT");
builder.add(Translator,
  "INPUT",  "EXTRACTED_TEXT",
  "OUTPUT", "TRANSLATED_TEXT");

etc. Since it is currently not possible in uimaFIT to use view mappings
with readers, I'd probably substitute RAW_DATA with CAS.NAME_DEFAULT_SOFA,
so that the reader can just dump its data into the default view.

builder.add(TextExtractor,
  "INPUT",  CAS.NAME_DEFAULT_SOFA,
  "OUTPUT", "EXTRACTED_TEXT");
builder.add(Parser,
  CAS.NAME_DEFAULT_SOFA, "EXTRACTED_TEXT");
builder.add(Translator,
  "INPUT",  "EXTRACTED_TEXT",
  "OUTPUT", "TRANSLATED_TEXT");
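
To make the pairing in the calls above concrete, here is a minimal sketch in plain Java with no UIMA dependency. It only models the idea: each pair is (name hardcoded in the component, name used at the pipeline level). The class ViewMappingSketch and its methods toMapping and resolve are hypothetical and are not part of uimaFIT; the pass-through for unmapped names is a simplifying assumption of this model. The constant mirrors the value of CAS.NAME_DEFAULT_SOFA.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of view-name mapping; not the uimaFIT implementation.
class ViewMappingSketch {

    // Same string as CAS.NAME_DEFAULT_SOFA in UIMA.
    static final String DEFAULT_VIEW = "_InitialView";

    // Reads (componentName, pipelineName, componentName, pipelineName, ...)
    // into a lookup table, the way the varargs in the calls above are paired.
    static Map<String, String> toMapping(String... viewNames) {
        if (viewNames.length % 2 != 0) {
            throw new IllegalArgumentException(
                "View names must come in (component, pipeline) pairs");
        }
        Map<String, String> mapping = new LinkedHashMap<>();
        for (int i = 0; i < viewNames.length; i += 2) {
            mapping.put(viewNames[i], viewNames[i + 1]);
        }
        return mapping;
    }

    // Resolves the name a component asks for to the pipeline-level view.
    // Assumption of this sketch: unmapped names pass through unchanged.
    static String resolve(Map<String, String> mapping, String componentName) {
        return mapping.getOrDefault(componentName, componentName);
    }

    public static void main(String[] args) {
        Map<String, String> extractor =
            toMapping("INPUT", DEFAULT_VIEW, "OUTPUT", "EXTRACTED_TEXT");
        Map<String, String> parser =
            toMapping(DEFAULT_VIEW, "EXTRACTED_TEXT");
        Map<String, String> translator =
            toMapping("INPUT", "EXTRACTED_TEXT", "OUTPUT", "TRANSLATED_TEXT");

        // The extractor's output, the parser's default view and the
        // translator's input all resolve to the same pipeline-level view.
        System.out.println(resolve(extractor, "OUTPUT"));
        System.out.println(resolve(parser, DEFAULT_VIEW));
        System.out.println(resolve(translator, "INPUT"));
    }
}
```

In this model the three printed names are identical, which is the point of the scheme: components keep their generic local names, and only the pipeline decides which component's output feeds which component's input.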

> I have to say this feature is quite interesting but in fact the type
> systems are the components generating the real pain… :)

What's the pain with the type systems?

Cheers,

-- Richard

