uima-user mailing list archives

From Dominik Terweh <d.ter...@drooms.com>
Subject Re: Usage of anchors
Date Wed, 21 Aug 2019 13:47:25 GMT
Hi Peter,

Thanks a lot for the clarification. I was wondering about (10) too.

Following your explanation, I was wondering: does it make sense to anchor sequences, such as
in (8), and is it "legal" to use multiple anchors in a hierarchical fashion,
like A @(B @C D)?

Also, is there a difference between the processing of sequences of annotations and sequences
of literals (given that "A" is annotated as A and so on)?
A @(B C D)
Vs
"A" @("B" "C" "D")
Vs
A @("B" C "D")

Best
Dominik



Dominik Terweh
Praktikant

DROOMS


Drooms GmbH
Eschersheimer Landstraße 6
60322 Frankfurt, Germany
www.drooms.com

Mail: d.terweh@drooms.com


Drooms GmbH; Sitz der Gesellschaft / Registered Office: Eschersheimer Landstr. 6, D-60322
Frankfurt am Main; Geschaeftsfuehrung / Management Board: Alexandre Grellier;
Registergericht / Court of Registration: Amtsgericht Frankfurt am Main, HRB 76454; Finanzamt
/ Tax Office: Finanzamt Frankfurt am Main, USt-IdNr.: DE 224007190

On 21.08.19, 12:10, "Peter Klügl" <peter.kluegl@averbis.com> wrote:

    Hi,

    On 20.08.2019 at 16:09, Dominik Terweh wrote:
    >
    > Dear All,
    >
    >
    >
    > I have some questions regarding processing times and anchors ("@").
    >
    >
    >
    > First of all, is it possible to define an anchor on a disjunction?
    >
    > What I tested was to have a simple rule (1) that should start on the
    > Element in the middle (2). Now this element had a variation (3) but I
    > could not use the anchor in that case anymore:
    >
    > 1) A    B   C;       // works
    >
    > 2) A   @B   C;       // works
    >
    > 3) A @(B|D) C;       // NOT WORKING
    >
    > Is this behaviour intended or simply not supported?
    >
    > [NOTE: NOT WORKING means that Eclipse does not complain, but the rule
    > never matches]
    >
    >
    >
    > The above led to some testing with a different setup (4). However,
    > since disjunctions don't seem to work, this did not work either.
    >
    > 4) A @((B C) | (D C));   // NOT WORKING
    >

    Anchors at disjunct rule elements are syntactically supported but do not
    work correctly. I will open a bug ticket.
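
    Until this is fixed, one possible workaround (a sketch, not something
    proposed in this thread) is to split the disjunction in (3) into one
    anchored rule per alternative, so that each rule has the shape of (2),
    which is reported to work. The sketch assumes that A, B, C and D are
    already declared annotation types; Hit is a hypothetical type that only
    makes the matches visible.

    ```ruta
    // Hypothetical result type, introduced only for illustration.
    DECLARE Hit;

    // Rule (3), A @(B|D) C, written as two rules without a disjunction.
    // Matching starts at B (or D), then the neighbouring A and C are checked.
    A @B C{-> MARK(Hit, 1, 3)};
    A @D C{-> MARK(Hit, 1, 3)};
    ```

    Together, the two rules should cover the matches intended by (3), at
    the cost of repeating the shared elements A and C.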


    >
    >
    > Is there a scenario where anchors are valid in and before brackets?
    > From what I have observed, (5)-(10) all work as expected and all start
    > matching on B. But do they differ in terms of processing? I noticed
    > slightly longer processing times in (5), and marginally longer ones in
    > (6), but the differences were too small to be conclusive. Could (5)-(10)
    > differ in processing time?
    >
    > 5)   A   @B C
    >
    > 6)  (A   @B C)
    >
    > 7) @(A   @B C)
    >
    > 8)   A  @(B C)
    >
    > 9)   A @(@B C)
    >
    > 10)  A  (@B C)
    >

    Yes, since different combinations of methods are called, but I think
    there should not be a big difference between (5)-(9).


    >
    >
    > Since rule (10) works as expected, why does (11) work differently and
    > start on A but not on B and D? (This would be useful in a scenario
    > where B and D combined appear less often than A)
    >
    > 11) A  ((@B C) | (@D C));   // starts matching on A
    >
    >
    >
    >
    >

    I have to check that. I think (10) starts with A too.



    Two comments for anchors and disjunct rule elements:

    Anchors started as a manual option to optimize the rule execution time
    compared to the automatic dynamic anchoring. However, the anchor can
    considerably change the consequences of a rule. For me, the anchor is
    more of an engineering option which can also be used to speed up the rules.


    Disjunct rule elements are not well supported and maintained in Ruta.
    Their implementation is not efficient and they can lead to unintended
    matches. Thus, their usage is not allowed in my team and I would not
    recommend using them right now.


    (I will try to find the time to improve the implementation)


    Best,


    Peter


    > Thank you in advance for your answers,
    >
    > Best
    >
    > Dominik
    --
    Dr. Peter Klügl
    R&D Text Mining/Machine Learning

    Averbis GmbH
    Salzstr. 15
    79098 Freiburg
    Germany

    Fon: +49 761 708 394 0
    Fax: +49 761 708 394 10
    Email: peter.kluegl@averbis.com
    Web: https://averbis.com

    Headquarters: Freiburg im Breisgau
    Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
    Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó


