opennlp-dev mailing list archives

From Samik Raychaudhuri <sam...@gmail.com>
Subject Re: Need to speed up the model creation process of OpenNLP
Date Wed, 19 Nov 2014 00:30:23 GMT
Hi,
This is essentially a machine learning problem rather than anything
specific to OpenNLP. With a corpus that large, training a model will
take a substantial amount of time. You could try smaller training sets
and check whether the models deteriorate substantially. Another strategy
is to introduce the training data incrementally, starting with sets that
contain a specific class of token names; that would give a quicker
turnaround.
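
As a rough illustration of the two ideas above (a smaller training set, and
sets limited to one class of names), here is a minimal Java sketch. It assumes
the training data is in the Name Finder's usual format, one sentence per line
with entities marked as <START:type> ... <END>. The class name TrainingSubset
and the methods sample and withType are made up for this example and are not
part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not an OpenNLP class: builds smaller training sets
// from a list of annotated sentences (assumed: one sentence per line).
public class TrainingSubset {

    // Keep roughly `fraction` of the sentences, chosen at random.
    // A fixed seed makes the subset reproducible between experiments.
    public static List<String> sample(List<String> sentences, double fraction, long seed) {
        Random random = new Random(seed);
        List<String> kept = new ArrayList<>();
        for (String sentence : sentences) {
            if (random.nextDouble() < fraction) {
                kept.add(sentence);
            }
        }
        return kept;
    }

    // Keep only sentences that contain at least one entity of the given
    // type, e.g. "person" matches the marker <START:person>.
    public static List<String> withType(List<String> sentences, String type) {
        String marker = "<START:" + type + ">";
        List<String> kept = new ArrayList<>();
        for (String sentence : sentences) {
            if (sentence.contains(marker)) {
                kept.add(sentence);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Tiny in-memory stand-in for the real corpus; in practice the lines
        // would be read from the training file.
        List<String> corpus = List.of(
            "<START:person> Pierre Vinken <END> will join the board .",
            "The meeting is in <START:location> Paris <END> .",
            "Nothing to tag in this sentence .");

        List<String> tenPercent = sample(corpus, 0.10, 42L);
        List<String> personOnly = withType(corpus, "person");

        System.out.println("sampled sentences: " + tenPercent.size());
        System.out.println("person sentences:  " + personOnly.size());
    }
}
```

The reduced list could then be written back to a file and passed to the
trainer as usual. Keeping only sentences with a given tag also drops sentences
without any names, which may affect model quality, so treat this as a starting
point for experiments.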
Hope this helps.
Best,
-Samik


On 18/11/2014 8:46 AM, nikhil jain wrote:
> Hi,
> I asked the question below yesterday. Did anyone get a chance to look at it?
> I am new to OpenNLP and really need some help. Please provide a clue, link, or example.
> Thanks,
> Nikhil
>        From: nikhil jain <nikhil_jain1234@yahoo.com.INVALID>
>   To: "users@opennlp.apache.org" <users@opennlp.apache.org>; Dev at Opennlp Apache <dev@opennlp.apache.org>
>   Sent: Tuesday, November 18, 2014 12:02 AM
>   Subject: Need to speed up the model creation process of OpenNLP
>     
> Hi,
> I am using the OpenNLP Token Name Finder to parse unstructured data. I have created
> a corpus of about 4 million records. When I create a model from the training set using
> the OpenNLP APIs in Eclipse with the default settings (cut-off 5 and 100 iterations), the
> process takes a good amount of time, around 2-3 hours.
> Can someone suggest how I can reduce the time? I want to experiment with different
> iteration counts, but because model creation takes so long, I am not able to do so.
> It is a really time-consuming process.
> Please provide some feedback.
> Thanks in advance.
> Nikhil Jain
>
>    

