opennlp-dev mailing list archives

From Svetoslav Marinov
Subject Re: Size of training data
Date Fri, 26 Apr 2013 07:30:52 GMT
I use the API. Can one specify the memory size via the command line? I
think the default there is 1024M? With 8G of memory, the out-of-memory
exception came during "Computing event counts...". With 16G, it died
during indexing; the last output was "Computing event counts...  done.
50153300 events".
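
Since the training here is started through the API, the heap limit is whatever the launching JVM was given (for example through the standard -Xmx option), so it may be worth printing that limit before training starts to confirm the 8G or 16G setting actually took effect. A minimal sketch, not taken from this thread; the class name HeapCheck and its helper methods are hypothetical:

```java
// Hypothetical helper: reports the maximum heap the running JVM may use.
final class HeapCheck {

    // Converts a byte count to whole mebibytes (rounding down).
    static long bytesToMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum heap size in MiB, as reported by the JVM itself.
    // Runtime.maxMemory() returns Long.MAX_VALUE if there is no inherent limit.
    static long maxHeapMiB() {
        return bytesToMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

Calling HeapCheck.maxHeapMiB() at the start of the training program, launched with something like java -Xmx16g, should report a value close to 16384 MiB; a much smaller number would suggest the option did not reach the JVM that runs the training.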


On 2013-04-26 09:12, "Jörn Kottmann" wrote:

>On 04/26/2013 09:06 AM, Svetoslav Marinov wrote:
>> I'm wondering what is the max size (if such exists) for training a NER
>>model? I have a corpus of 2 600 000 sentences annotated with just one
>>category, 310M in size. However, the training never finishes: 8G memory
>>resulted in java out of memory exception, and when I increased it to 16G
>>it just died with no error message.
>Do you use the command line interface or the API for the training?
>At which stage of the training did you get the out of memory exception?
>Where did it just die when you used 16G of memory (maybe do a jstack) ?
