uima-user mailing list archives

From "Pascal Coupet" <pascal.cou...@temis.com>
Subject RE: Annotation (Indexing) a bottleneck in UIMA in terms of speed
Date Thu, 26 Jun 2008 13:36:57 GMT
Hi Rohan,

6000 records in 6 min => 1000/min
800000 records in 10H => 1333/min

Not that bad! I guess one of these numbers is wrong. 
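
To double-check the arithmetic above, here is a minimal Python sketch. The function names are made up for illustration and have nothing to do with the UIMA API; the inputs are just the figures quoted in this thread.

```python
def records_per_minute(records: int, minutes: float) -> float:
    """Average throughput, in records per minute."""
    return records / minutes


def seconds_per_record(records: int, minutes: float) -> float:
    """Average wall-clock time per record, in seconds."""
    return minutes * 60 / records


# Figures quoted in the thread (hypothetical variable names).
small_run = (6_000, 6)          # 6000 records in 6 minutes
large_run = (800_000, 10 * 60)  # 800000 records in 10 hours

for name, (records, minutes) in [("small", small_run), ("large", large_run)]:
    print(
        f"{name}: {records_per_minute(records, minutes):.0f} records/min, "
        f"{seconds_per_record(records, minutes):.3f} s/record"
    )
```

If both figures are right, the large run works out slightly faster per record than the small one (about 0.045 s against 0.06 s), which is the opposite of a scaling problem.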

Are you distributing the load across several machines? Vinci is not that
good at load balancing across a lot of machines (more than 20-50, depending
on your annotator speed).

> -----Original Message-----
> From: rohan rai [mailto:hirohanin@gmail.com]
> Sent: Thursday, June 26, 2008 7:36 AM
> To: uima-user@incubator.apache.org
> Subject: Annotation (Indexing) a bottleneck in UIMA in terms of speed
> When I profile a UIMA application
> What I see is that annotation takes a lot of time
> If I profile I see that to annotate 1 record, it takes around 0.06
> seconds
> Now you may say it's good
> Now scale up
> Although it does not scale up linearly. But here is rough estimate on
> experiments done
> 6000 records take 6 min to annotate
> 800000 records take around 10 hrs to annotate
> Which is bad.
> One thing is that I am treating each record individually as a CAS
> Even if I treat all the records as a single CAS it takes around 6-7 hrs
> Which is still not good in terms of speed
> Is there a way out?
> Can I improve performance by any means??
> Regards
> Rohan
