uima-user mailing list archives

From Hugues de Mazancourt <hug...@mazancourt.com>
Subject Limiting the memory used by an annotator ?
Date Sat, 29 Apr 2017 10:53:12 GMT
Hello UIMA users,

I’m currently putting a Ruta-based system into production and I sometimes run out of memory.
This is usually caused by a combinatorial explosion in the Ruta rules. The rules are not
necessarily faulty: they are well adapted to the documents I expect to parse. But since this
is an open system, people can upload whatever they want, and the parser then crashes after
multiplying annotations (or at least spends 20 minutes garbage-collecting millions of them).

So my question is: is there a way to limit the memory used by an annotator, to cap the
number of annotations it creates, or to bound the number of matches Ruta attempts?
I would rather cancel the parse of a single document than suffer a 20-minute stall of the
whole system.
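For what it’s worth, one generic way to get the "cancel one document, keep the service alive" behaviour, independent of any Ruta-specific setting, is to run each document’s analysis on a worker thread and give up after a deadline. This is only a sketch under assumptions: `BoundedAnalysis` and `runWithTimeout` are illustrative names, and in a real pipeline the `Runnable` would wrap something like `ae.process(cas)`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: bound the wall-clock time of one document's analysis by running it
// on a worker thread and cancelling on timeout. In a real UIMA pipeline the
// Runnable would wrap the ae.process(cas) call (illustrative, not shown here).
public class BoundedAnalysis {

    // Daemon worker so a stuck analysis cannot keep the JVM alive on shutdown.
    private static final ExecutorService POOL = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "analysis-worker");
        t.setDaemon(true);
        return t;
    });

    /** Returns true if the work finished within timeoutMs, false if it was cancelled. */
    public static boolean runWithTimeout(Runnable work, long timeoutMs) throws Exception {
        Future<?> future = POOL.submit(work);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Interrupt the worker and discard this document's result.
            future.cancel(true);
            return false;
        }
    }
}
```

Two caveats: cancellation is cooperative, so it only helps if the annotator ever checks the interrupt flag; a hard memory guarantee would need each worker in its own forked JVM with its own -Xmx. Ruta itself also exposes tuning parameters for rule matching (e.g. maxRuleMatches / maxRuleElementMatches on the engine, if I remember correctly — check the Ruta documentation for your version).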

Since several UIMA-based services run in production, I imagine others have hit the same
problem.

Any hint on that topic would be very helpful.


Hugues de Mazancourt
