uima-user mailing list archives

From Martin Gerlach <Martin.Gerl...@neofonie.de>
Subject Re: Regular expressions over UIMA annotations
Date Tue, 23 Jun 2009 09:45:13 GMT
Hi Ekaterina and everyone who replied,

We migrated the bridge written by the GATE team to Apache UIMA 2.2.2 and
find that it performs quite well, although it does indeed create a lot of
objects. It is also possible to map rich UIMA type systems to GATE
annotations if you run some extra JAPE phases with Java code that
converts the FSArray structures etc. to GATE annotations. However, we
found that mapping rich GATE annotations back to UIMA is somewhat
difficult because the bridge's mapping is not very powerful. You might
need to do some extra conversion in an AE that follows the GATE AE.
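
Purely as an illustration of the kind of conversion meant here, and not
the bridge's actual code: the sketch below flattens the members of an
array-valued feature into separate flat annotations. The FlatAnnotation
class, the flatten_array_feature function and the "parent"/"index"
feature names are all hypothetical stand-ins, and plain tuples take the
place of the elements of an FSArray.

```python
from dataclasses import dataclass, field


@dataclass
class FlatAnnotation:
    """Hypothetical stand-in for a flat annotation with simple features."""
    type: str
    start: int
    end: int
    features: dict = field(default_factory=dict)


def flatten_array_feature(parent_type, members):
    """Turn the members of an array-valued feature into flat annotations.

    `members` is a list of (type, start, end) tuples standing in for the
    elements of an FSArray. Each result records the type of its parent
    and its position in the array, so the grouping can be rebuilt later.
    """
    return [
        FlatAnnotation(type=t, start=s, end=e,
                       features={"parent": parent_type, "index": i})
        for i, (t, s, e) in enumerate(members)
    ]


flat = flatten_array_feature("Relation", [("Arg", 0, 5), ("Arg", 10, 14)])
for annotation in flat:
    print(annotation)
```

Keeping the parent type and array index as simple features is one way
to make the reverse step (in the AE after the GATE AE) easier, since the
nested structure can then be reassembled from the flat annotations.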

We're planning to donate our code back to GATE or even to Apache at some
point but have to clarify legal issues first, which may take some time.

However, everything you need is covered by the presentation Jochen
pointed out earlier:

http://gate.ac.uk/sale/talks/gate-course-oct06/uima-integration.ppt

(unfortunately gate.ac.uk seems to be down at the moment)

To migrate the GATE bridge from IBM UIMA to Apache, start with changing
the package names. It's almost all you need to do.
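
A minimal sketch of that first step, assuming the IBM-era sources use
the com.ibm.uima package prefix and the Apache release uses
org.apache.uima. The migrate_source function is a hypothetical helper
written for this example; it only rewrites the prefix in a piece of
source text and does not cover the few classes that need more than a
rename.

```python
import re

# Assumption: old IBM UIMA packages start with com.ibm.uima and the
# Apache UIMA equivalents start with org.apache.uima.
OLD_PREFIX = "com.ibm.uima"
NEW_PREFIX = "org.apache.uima"

# Match the old prefix only as a whole dotted name, so that unrelated
# identifiers that merely contain it as a substring are left alone.
_PATTERN = re.compile(r"\b" + re.escape(OLD_PREFIX) + r"\b")


def migrate_source(text: str) -> str:
    """Return the source text with the old package prefix replaced."""
    return _PATTERN.sub(NEW_PREFIX, text)


sample = "import com.ibm.uima.cas.CAS;\nimport java.util.List;\n"
print(migrate_source(sample))
```

The same replacement would also have to be applied to descriptor XML
files that mention the old package names, and the result compiled
against the Apache jars to find whatever the rename alone does not fix.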

Let me know if you decide to go that way and run into problems - my
colleague who did the main work on this and I can then try to assist you.

Regards,
Martin

Roberto Franchini wrote:
> On Mon, Jun 22, 2009 at 5:03 PM, Ekaterina Buyko <
> ekaterina.buyko@uni-jena.de> wrote:
> 
>> Hi,
>>
>> I am interested in using JAPE grammar or something with similar
>> functionality in UIMA. Has anybody already experience in that?
>>
>> Thank you
>>
>>
> We developed a bridge to use JAPE inside a UIMA pipeline.
> It works, but I'm not very happy with it:
> - low performance: mapping from UIMA to JAPE and then back to UIMA
> generates a lot of objects and a lot of GC cycles: if you are going to
> analyze a lot of documents you can:
> -- buy new HW
> -- study the voodoo to optimize the JVM performance :)
> - difficult to map a rich UIMA type system to JAPE annotations
> 
> If you want, I can share my very ugly code :)
> 
> The GATE team wrote a bridge, but as far as I know it supports the old
> IBM UIMA.
> TextMarker seems very interesting to me; I will take a close look at it in
> August.
> cheers,
> R.
> 

-- 
--------------------------------
Martin Gerlach
Software Development

neofonie
Technologieentwicklung und
Informationsmanagement GmbH
Robert-Koch-Platz 4
10115 Berlin
fon: +49.30 24627 413
fax: +49.30 24627 120
Martin.Gerlach@neofonie.de
http://www.neofonie.de

Commercial register
Berlin-Charlottenburg: HRB 67460

Management
Helmut Hoffer von Ankershoffen
(Spokesman of the management)
Nurhan Yildirim
-------------------------------

