uima-user mailing list archives

From Donatas Remeika <donatas.reme...@gmail.com>
Subject Re: New dictionary annotator
Date Mon, 05 Dec 2016 11:56:39 GMT
Hi,

Thanks for the feedback.
Yes, it would be interesting to see benchmark results. Do you know where
I could find example setups and data for benchmarking UIMA components?

Best regards,
Donatas
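
As a starting point for the speed and memory comparison discussed in this thread, a measurement harness could look roughly like the sketch below. It is a minimal illustration in plain Python, not UIMA code: `benchmark` and its parameters are hypothetical names, the annotator is assumed to be any callable that takes a document string, and `tracemalloc` only sees Python-level allocations, so a JVM-based pipeline would need its own profiling tools.

```python
import time
import tracemalloc
from typing import Callable, Iterable, Tuple


def benchmark(annotate: Callable[[str], object],
              documents: Iterable[str]) -> Tuple[float, int, int]:
    """Run `annotate` over all documents.

    Returns (elapsed seconds, peak traced bytes, number of documents).
    `annotate` is a stand-in for whatever annotator is being compared.
    """
    docs = list(documents)  # materialize first so loading is not timed
    tracemalloc.start()
    started = time.perf_counter()
    for doc in docs:
        annotate(doc)
    elapsed = time.perf_counter() - started
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak, len(docs)


if __name__ == "__main__":
    # str.split stands in for a real annotator here.
    seconds, peak_bytes, count = benchmark(str.split, ["a b c", "d e"])
    print(f"{count} docs in {seconds:.6f} s, peak {peak_bytes} bytes")
```

Running each candidate annotator through the same harness with the same documents and dictionary keeps the numbers comparable; the dictionary loading cost would have to be measured separately.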


On Mon, Dec 5, 2016 at 10:52 AM Peter Klügl <peter.kluegl@averbis.com>
wrote:

> Hi,
>
>
> a very nice annotator, thank you.
>
>
> Do you have figures on how the annotator compares to the others with
> respect to speed and memory usage?
>
> Storing the complete tokens may pose challenges in scenarios
> with parallelization if the dictionary is not shared between annotators.
>
> Would you be interested in setting up a benchmark?
>
>
> Because of the limitations of the dictionaries in Ruta, I also created a
> new simple dictionary annotator, but it currently lives in our own
> components repository. Maybe I'll contribute it to Ruta sometime, since it
> provides exactly the functionality the Ruta dictionaries lack.
>
>
> Best,
>
>
> Peter
>
>
> On 30.11.2016 at 15:38, Donatas Remeika wrote:
> > Hi,
> >
> > Just wanted to let you know that we created a new (yet another)
> > dictionary annotator.
> >
> > Reasons for creating it were:
> >  - Quite often we used Ruta in our pipelines only because of its
> >    MARKTABLE action, which can set several features on an annotation
> >  - Sometimes dictionaries contain duplicate entries with different
> >    features, and we need to create an annotation for each entry
> >  - Possibility to use a custom tokenizer for dictionary entries
> >    (default is a whitespace tokenizer)
> >
> > It was inspired by both the DKPro dictionary-annotator and Ruta
> > MARKTABLE. Big thanks to their developers!
> >
> > Code with examples can be found at
> > https://github.com/tokenmill/dictionary-annotator
> >
> > BTW, maybe someone knows a Concept Mapper alternative that is more
> > uimaFIT friendly?
> >
> > Best regards,
> > Donatas
> >
>
>
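
The three requirements listed in the original announcement (several features per annotation, one annotation per duplicate entry, and a pluggable tokenizer defaulting to whitespace) can be illustrated with a small sketch. This is plain Python written for illustration only, not the tokenmill implementation and not UIMA code: `DictionarySketch`, `whitespace_tokenizer`, and the feature names are all hypothetical, matching is case-sensitive, and spans are reported as token indices rather than character offsets.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Tokenizer = Callable[[str], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    """Default tokenizer: split on runs of whitespace."""
    return text.split()


class DictionarySketch:
    """Hypothetical multi-feature dictionary keyed by token sequences."""

    def __init__(self, tokenizer: Tokenizer = whitespace_tokenizer) -> None:
        self.tokenizer = tokenizer
        # token tuple -> list of feature dicts; duplicate phrases are kept
        self.entries: Dict[Tuple[str, ...], List[Dict[str, str]]] = defaultdict(list)
        self.max_len = 0  # longest entry, in tokens

    def add(self, phrase: str, **features: str) -> None:
        key = tuple(self.tokenizer(phrase))
        if not key:
            return  # ignore phrases that tokenize to nothing
        self.entries[key].append(features)
        self.max_len = max(self.max_len, len(key))

    def annotate(self, text: str) -> List[Tuple[int, int, Dict[str, str]]]:
        """Return (start_token, end_token_exclusive, features) per match."""
        tokens = self.tokenizer(text)
        found = []
        for start in range(len(tokens)):
            last = min(start + self.max_len, len(tokens))
            for end in range(start + 1, last + 1):
                key = tuple(tokens[start:end])
                # one result per dictionary entry, including duplicates
                for features in self.entries.get(key, ()):
                    found.append((start, end, features))
        return found


if __name__ == "__main__":
    d = DictionarySketch()
    d.add("New York", type="city", country="US")
    d.add("New York", type="state", country="US")  # duplicate phrase
    for match in d.annotate("I love New York"):
        print(match)
```

Passing a different callable as `tokenizer` changes how both dictionary entries and input text are split, which is the role the custom tokenizer plays in the description above.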
