uima-user mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: performance of JCas.reset()
Date Fri, 18 Jun 2010 03:10:25 GMT


On 6/17/2010 10:16 PM, Eddie Epstein wrote:
> Using 2.3.0, and a CAS defined by the PersonTitleAnnotator, this code
> runs in 100ms on my laptop.
>   

Philip, 2.3.0 has some performance improvements to reset(); please see
if it helps in your test configuration.

-Marshall
> Eddie
>
> On Wed, Jun 16, 2010 at 4:23 PM, Philip Ogren <philip@ogren.info> wrote:
>   
>> When I run the following loop it takes about 6 seconds on my 2GHz machine:
>>
>> for (int i = 0; i < 10000; i++) {
>>     jCas.reset();
>> }
>>
>> Which comes out to 0.6 milliseconds per call. This is pretty slow for cases
>> in which you have many short documents. For example, this would add 10
>> minutes of processing time for a 1M-document corpus. Is this a known issue, and
>> is there anything that I can do to minimize this impact?
>>
>> Thanks,
>>
>> Philip
>>
>>
>>     
>
>   
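For anyone who wants to repeat the measurement, here is a rough sketch of the arithmetic from the original question (6 seconds over 10000 calls, scaled to a 1M-document corpus) together with a small timing helper. The class and method names are made up for illustration, and the Runnable in main is only a stand-in: in a real test it would presumably call jCas.reset() on a JCas created from your own type system, which is not shown here.

```java
public class ResetCostEstimate {

    /** Average cost per call in milliseconds, from total elapsed seconds and call count. */
    public static double perCallMillis(double totalSeconds, long calls) {
        return totalSeconds * 1000.0 / calls;
    }

    /** Total overhead in minutes when the per-call cost is paid once per document. */
    public static double corpusOverheadMinutes(double perCallMillis, long documents) {
        return perCallMillis * documents / 1000.0 / 60.0;
    }

    /** Runs the action the given number of times and returns the elapsed nanoseconds. */
    public static long timeLoop(Runnable action, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            action.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Figures quoted in the thread: 10000 calls in about 6 seconds.
        double perCall = perCallMillis(6.0, 10_000);
        double minutes = corpusOverheadMinutes(perCall, 1_000_000);
        System.out.printf("per call: %.3f ms, 1M documents: %.1f min%n", perCall, minutes);

        // Hypothetical stand-in for jCas.reset(); replace with the real call
        // when a JCas instance is available.
        Runnable standIn = () -> { };
        long elapsedNanos = timeLoop(standIn, 10_000);
        System.out.printf("stand-in loop: %.3f ms total%n", elapsedNanos / 1_000_000.0);
    }
}
```

A single pass like this includes JIT warm-up, so running the loop a few times and discarding the first result is probably a fairer comparison between versions.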
