uima-user mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: question on Ruta Query View
Date Sun, 19 Jun 2016 12:52:26 GMT

The annotation browser just lists all annotations in the CAS; it is
completely independent of the Ruta language and is just an extension of the
CAS Editor. The query view applies rules to a CAS and lists the rule
matches. So the query view is much more powerful than the annotation
browser, since it can use the complete expressiveness of the language.
However, that is also the reason why it is sensitive to the visibility

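The visibility rule described in the quoted reply below (an annotation is invisible to the rules when its begin or end is covered by a filtered type such as SPACE or BREAK) can be illustrated with a small stand-alone model. The Java sketch below is not Ruta's implementation: the class `VisibilitySketch`, its methods, and the use of `Character.isWhitespace` as a stand-in for the filtered types are all assumptions made for illustration.

```java
class VisibilitySketch {

    // Stand-in for the filtered types SPACE and BREAK (assumption):
    // any whitespace character counts as filtered.
    static boolean isFiltered(char c) {
        return Character.isWhitespace(c);
    }

    // A span [begin, end) is treated as visible only if neither its
    // first nor its last character is filtered.
    public static boolean isVisible(String text, int begin, int end) {
        if (begin >= end) {
            return false; // an empty span has no visible boundary
        }
        return !isFiltered(text.charAt(begin)) && !isFiltered(text.charAt(end - 1));
    }

    // Rough analogue of TRIM(SPACE,BREAK): shrink the span from both
    // sides until its boundaries are no longer filtered.
    public static int[] trim(String text, int begin, int end) {
        int b = begin;
        int e = end;
        while (b < e && isFiltered(text.charAt(b))) {
            b++;
        }
        while (e > b && isFiltered(text.charAt(e - 1))) {
            e--;
        }
        return new int[] {b, e};
    }

    public static void main(String[] args) {
        String text = "Status: recruiting \nEnd";
        int begin = text.indexOf("recruiting");
        int end = begin + "recruiting ".length(); // span includes the trailing space

        System.out.println("visible before trim: " + isVisible(text, begin, end));
        int[] trimmed = trim(text, begin, end);
        System.out.println("visible after trim:  " + isVisible(text, trimmed[0], trimmed[1]));
        System.out.println("trimmed text: [" + text.substring(trimmed[0], trimmed[1]) + "]");
    }
}
```

Under this model, a span that ends with a space is skipped by the rules but would still be listed by a tool that ignores filtering, which mirrors the difference between the query view and the annotation browser. Trimming the span makes it visible again.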

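The counting approach suggested in the quoted reply below (count in process(), report the aggregated result in collectionProcessingComplete()) can be sketched in plain Java. This is only a stand-in for the pattern and uses no UIMA classes: the class `AnnotationCounterSketch`, the representation of a document as a list of annotation type names, and the type name "Other" are assumptions. In a real uimaFIT component the class would typically extend `JCasAnnotator_ImplBase` and `process` would receive a `JCas`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AnnotationCounterSketch {

    // Running totals per annotation type name, kept across all documents.
    private final Map<String, Integer> totals = new TreeMap<>();

    // Called once per document. Here a document is modelled (assumption)
    // as the list of type names of the annotations it contains.
    public void process(List<String> annotationTypesInDocument) {
        for (String type : annotationTypesInDocument) {
            totals.merge(type, 1, Integer::sum);
        }
    }

    // Called once after the last document; returns a copy of the totals.
    public Map<String, Integer> collectionProcessingComplete() {
        return new TreeMap<>(totals);
    }

    public static void main(String[] args) {
        AnnotationCounterSketch counter = new AnnotationCounterSketch();
        counter.process(Arrays.asList("tsCurrent", "tsCurrent", "Other"));
        counter.process(Arrays.asList("tsCurrent"));
        System.out.println(counter.collectionProcessingComplete());
    }
}
```

Because the totals live in a single instance field, this sketch only works when one instance sees every document in sequence. That is the limitation the reply points out for parallel pipelines.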

On 19.06.2016 at 14:39, Bonnie MacKellar wrote:
> The idea that spaces are making the annotations invisible is totally
> plausible. But why does the AnnotationBrowser see them then? The
> annotations are there - they haven't been skipped - just the query view is
> not picking them up. What is different about Annotation Browser that would
> make those annotations not visible?
> thanks,
> Bonnie MacKellar
> On Sun, Jun 19, 2016 at 8:03 AM, Peter Klügl <peter.kluegl@averbis.com>
> wrote:
>> Hi,
>> Attachments are removed on this mailing list.
>> I would bet that some annotations are not visible to the rules, so they
>> are simply skipped -> the query view returns no matches.
>> In Ruta, annotations are invisible if their begin or end is covered by
>> something invisible, that is, by an annotation of a type that is filtered.
>> Most often, the annotations are missed because they start or end with a
>> space or line break.
>> You can trim annotations, e.g., with
>> tsCurrent{-> TRIM(SPACE,BREAK)};
>> You can use the query view for this use case. I have to mention that the
>> query view was built to serve as a tool during rule engineering: to get
>> a quick overview of the annotated documents. It does not scale with
>> the number of documents since there is no indexing across CASes and you
>> need to deserialize all CASes.
>> If it is fast enough, it is totally fine for counting annotations with
>> the query view.
>> You can also write a simple uimaFIT analysis engine and add it to the
>> pipeline or to the Ruta script. The analysis engine counts the
>> annotations in process() and outputs the aggregated result in
>> collectionProcessingComplete() (or the overridden method with the
>> correct name). If you want to parallelize it, you need a different
>> solution with a resource or something.
>> Best,
>> Peter
>> On 17.06.2016 at 21:21, Bonnie MacKellar wrote:
>>> Hi
>>> I am trying to use Ruta Query View to get a view of all matches for a
>>> particular annotation type across a large set of .xmi files. However,
>>> I am noticing something strange about Ruta Query View - it doesn't
>>> report lots of matches that are shown in the Annotation browser (and
>>> which I believe are correct matches). For example, a given annotation
>>> type tsCurrent has 4 matches in the file NCT0036712, but these matches
>>> do not appear at all in the list of results in Ruta Query View when I
>>> query for tsCurrent.  For some files, though, the results for all
>>> matches do show up, and for other files, only a partial set of matches
>>> is in the query results. I cannot understand why this is happening.
>>> Perhaps my query syntax is wrong?  I can only find the one example in
>>> the manual, which isn't much to go on.
>>> I am attaching a screenshot showing the AnnotationBrowser on the top
>>> right in Eclipse, with all of the matches for tsCurrent, and the Ruta
>>> Query view on bottom, which does not contain those matches. I think it
>>> is easier to see the problem visually.
>>> Also, ultimately I am just trying to get a count of the number of times
>>> certain annotations are made across all of my files. Is there a better
>>> way to do that instead of Ruta Query View?  I can't find another way
>>> to total matches across lots of files.
>>> thanks,
>>> Bonnie MacKellar
