uima-user mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: New dictionary annotator
Date Mon, 05 Dec 2016 08:52:50 GMT

A very nice annotator, thank you.

Do you have figures on how the annotator compares to the others with
respect to speed and memory usage?

Storing the complete tokens may pose challenges in parallelized
scenarios if the dictionary is not shared between annotators.

Would you be interested in setting up a benchmark?
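
To illustrate the concern: if every annotator instance builds its own token index, memory grows with the number of parallel instances. A minimal, language-neutral sketch of sharing one index is shown here in Python. The names `load_dictionary` and `DictionaryAnnotator` are hypothetical and are not taken from any of the annotators discussed; in a real UIMA pipeline the external resource mechanism would be the usual place for such sharing.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_dictionary(entries: tuple) -> dict:
    """Build the token index once per distinct set of entries.

    Hypothetical helper: maps the first token of each phrase to the
    full token sequences that start with it.
    """
    index = {}
    for phrase in entries:
        tokens = tuple(phrase.split())  # whitespace tokenization
        if not tokens:
            continue  # skip empty entries
        index.setdefault(tokens[0], []).append(tokens)
    return index


class DictionaryAnnotator:
    """Hypothetical annotator that reuses a cached, shared index."""

    def __init__(self, entries):
        # The tuple makes the entries hashable so lru_cache can key on them.
        self.index = load_dictionary(tuple(entries))


# Two instances configured with the same entries hold the same index object.
a = DictionaryAnnotator(["New York", "New Delhi"])
b = DictionaryAnnotator(["New York", "New Delhi"])
```

With this arrangement the stored tokens exist once per process, however many annotator instances are created; the shared index has to be treated as read-only.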

Because of the limitations of the dictionaries in Ruta, I also created a
new simple dictionary annotator, but it currently lives in our own
components repository. Maybe I'll contribute it to Ruta at some point,
since it provides exactly the functionality the Ruta dictionaries lack.



On 30.11.2016 at 15:38, Donatas Remeika wrote:
> Hi,
> Just wanted to let you know that we created a new (probably yet another)
> dictionary annotator.
> The reasons for creating it were:
>  - Quite often we used Ruta in our pipelines only because of its MARKTABLE
> action, which is able to set several features on an annotation
>  - Sometimes dictionaries contain duplicate entries with different features
> and we need to create an annotation for each entry
>  - The possibility to use a custom tokenizer for dictionary entries (the
> default is a whitespace tokenizer)
> It was inspired by both DKPro dictionary-annotator and Ruta MARKTABLE. Big
> thanks to their developers!
> Code with examples can be found at
> https://github.com/tokenmill/dictionary-annotator
> BTW, does anyone know of a Concept Mapper alternative that is more
> uimaFIT-friendly?
> Best regards,
> Donatas
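
The behaviour described in the quoted message (several features per entry, one annotation per duplicate entry, a pluggable tokenizer) can be sketched as follows. This is only an illustration under stated assumptions, not the code or API of the linked annotator: `annotate`, `whitespace_tokenizer` and the sample rows are hypothetical, and matching is assumed to be exact and case-sensitive.

```python
import re


def whitespace_tokenizer(text):
    """Default tokenizer: returns (token, begin, end) for each run of non-whitespace."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def annotate(text, rows, tokenizer=whitespace_tokenizer):
    """Return one annotation per matching dictionary row.

    rows: list of (phrase, features) pairs; duplicate phrases are kept,
    so each duplicate yields its own annotation with its own features.
    """
    # Index token sequences to the list of feature sets that share them.
    index = {}
    for phrase, features in rows:
        key = tuple(tok for tok, _, _ in tokenizer(phrase))
        if key:
            index.setdefault(key, []).append(features)

    tokens = tokenizer(text)
    max_len = max((len(k) for k in index), default=0)
    annotations = []
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            window = tokens[i:i + n]
            if len(window) < n:
                break  # ran past the end of the text
            key = tuple(tok for tok, _, _ in window)
            for features in index.get(key, []):
                annotations.append(
                    {"begin": window[0][1], "end": window[-1][2], **features}
                )
    return annotations


# Hypothetical dictionary: "Paris" appears twice with different features.
rows = [
    ("Paris", {"type": "city", "country": "FR"}),
    ("Paris", {"type": "person", "country": "US"}),
    ("New York", {"type": "city", "country": "US"}),
]
result = annotate("Paris and New York", rows)
```

Passing a different `tokenizer` callable changes how both the dictionary entries and the text are split, which is the role the custom tokenizer plays in the description above.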
