uima-user mailing list archives

From Nikolai Krot <tal...@gmail.com>
Subject Re: fuzzy matching possible?
Date Tue, 07 May 2019 11:58:27 GMT
Hi Peter,

Thank you for your reply.


> at the end you need to check both, but you could maybe refactor the
> checks in a new condition like (not tested):
>
> CONDITION LemmaCT(ANNOTATION word, STRING check) = OR(word.lemma == check, word.ct == check);
>
> w: Word{LemmaCT(w, "gearbeitet")};
>
>
This looks interesting and is indeed shorter. Somehow I missed the section
about macros in the Ruta manual.

Thanks again and best regards,
Nikolai
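
For the original question in this thread (tolerating typos and OCR errors), the pre-annotation step could be sketched outside Ruta roughly as follows. This is only an illustration under assumptions: the `Word` class, the `matches` helper and the `max_distance` threshold are hypothetical names that do not come from Ruta or UIMA, and the distance is the plain unweighted Levenshtein distance rather than the weighted variant mentioned for the trie lookup. The helper checks both the covered text and the lemma, mirroring the OR condition discussed above.

```python
from dataclasses import dataclass


def edit_distance(a: str, b: str) -> int:
    """Plain (unweighted) Levenshtein distance using two-row dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


@dataclass
class Word:
    text: str   # covered text, e.g. "gearbeitet"
    lemma: str  # hypothetical lemma attribute set by an earlier engine


def matches(word: Word, check: str, max_distance: int = 1) -> bool:
    """True if the covered text or the lemma is within max_distance edits of check."""
    return any(
        edit_distance(value.lower(), check.lower()) <= max_distance
        for value in (word.text, word.lemma)
    )


word = Word(text="gearbeilet", lemma="arbeiten")  # OCR misread "t" as "l"
print(matches(word, "gearbeitet"))  # compared against the covered text
print(matches(word, "arbeiten"))    # compared against the lemma
print(matches(word, "gelaufen"))    # compared against both fields
```

An engine built along these lines would only create the annotations; the Ruta rules would then match on those annotations instead of on the raw text.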



> Best,
>
>
> Peter
>
>
>
> On 04.05.2019 at 13:44, Nikolai Krot wrote:
> > Hi Peter,
> >
> > Thank you for the answer.
> >
> >> that mainly depends on the type system. Your rule could look something
> >> like:
> >>
> >> w:Word{OR(w.lemma == "arbeiten", w.ct == "gearbeitet")};
> > I know this syntax. My question is whether there is a shorter way to
> > say that, whenever I need to match the word text, the matching should check
> > both the lemma and ct fields. Think of a few dozen rules like this...
> >
> > Best regards,
> > Nikolai
> >>
> >> Best,
> >>
> >>
> >> Peter
> >>
> >> On 03.05.2019 at 18:28, Nikolai Krot wrote:
> >>> Hi Peter,
> >>>
> >>> Thank you for your prompt reply.
> >>>
> >>> Speaking about pre-annotation with another engine. Say, I managed to
> >>> annotate words of interest and additionally set an attribute, something
> >>> like this
> >>>
> >>> ... <word lemma="arbeiten">gearbeitet</word>...
> >>>
> >>> Is there a simple way to configure the object of matching in Ruta rules so
> >>> that a rule matches against the actual text ("gearbeitet" in our case) or
> >>> the value of the attribute "lemma" ("arbeiten" in our case)?
> >>> That is, the match should return True if either of the fields evaluates to
> >>> True.
> >>> This would make some rules simpler.
> >>>
> >>> Best regards,
> >>> Nikolai
> >>>
> >>> On Fri, May 3, 2019 at 2:03 PM Peter Klügl <peter.kluegl@averbis.com> wrote:
> >>>> Hi,
> >>>>
> >>>>
> >>>> there is/was support for a weighted edit distance in the trie lookup,
> >>>> but that functionality was not maintained for many years.
> >>>>
> >>>> The dictionary lookup functionality in Ruta is overall very limited.
> >>>> Normally, one uses a separate analysis engine with extended logic
> >>>> (ConceptMapper?) for creating the annotations, which are then later
> >>>> reused in rules.
> >>>>
> >>>>
> >>>> Best,
> >>>>
> >>>>
> >>>> Peter
> >>>>
> >>>> On 03.05.2019 at 13:16, Nikolai Krot wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> Is there a possibility to match a word somehow fuzzily in the UIMA Ruta
> >>>>> language? I am thinking about how to overcome problems with typos and OCR
> >>>>> mistakes... It is hardly possible to list all the ways in which a word
> >>>>> could have been broken.
> >>>>>
> >>>>> Best regards,
> >>>>> Nikolai Krot
> >>>>>
> >>>> --
> >>>> Dr. Peter Klügl
> >>>> R&D Text Mining/Machine Learning
> >>>>
> >>>> Averbis GmbH
> >>>> Salzstr. 15
> >>>> 79098 Freiburg
> >>>> Germany
> >>>>
> >>>> Fon: +49 761 708 394 0
> >>>> Fax: +49 761 708 394 10
> >>>> Email: peter.kluegl@averbis.com
> >>>> Web: https://averbis.com
> >>>>
> >>>> Headquarters: Freiburg im Breisgau
> >>>> Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
> >>>> Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó
> >>>>
> >>>>
