opennlp-dev mailing list archives

From Jörn Kottmann <kottm...@gmail.com>
Subject Re: Document Categorizer - Classifying: Help
Date Fri, 30 Mar 2012 13:39:19 GMT
Sorry for this bug. We have a Jira issue for it, but no one ever took the
time to fix it.
Instead of the stack trace, you should see an error message telling you
that you don't have enough training data.

You should try with at least a few hundred examples; otherwise
the model you produce will not really work.

Jörn
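
To illustrate why two samples are not enough: the log quoted below shows both events being dropped at a cutoff of 5, which leaves nothing to train on. The following is a minimal Python sketch of that filtering step, not OpenNLP code. It assumes a training file with one document per line (category first, then whitespace-separated tokens), bag-of-words features named `bow=<token>`, and a rule that keeps a feature only if it occurs at least `cutoff` times across all samples. The function name `surviving_events` is hypothetical, and the real indexer may differ in detail.

```python
from collections import Counter


def surviving_events(lines, cutoff=5):
    """Approximate which training samples keep at least one feature.

    Assumption: a feature survives when it occurs at least `cutoff`
    times across all samples; a sample with no surviving feature is
    dropped. This mirrors the "Dropped event" lines in the log only
    roughly and is not the OpenNLP implementation.
    """
    samples = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and lines with no text after the category
        category, tokens = parts[0], parts[1:]
        samples.append((category, ["bow=" + token for token in tokens]))

    # Count every occurrence of every feature over the whole training set.
    counts = Counter(feature for _, features in samples for feature in features)

    kept = []
    for category, features in samples:
        remaining = [f for f in features if counts[f] >= cutoff]
        if remaining:
            kept.append((category, remaining))
    return kept


if __name__ == "__main__":
    # Two short samples, loosely modelled on the categories in the log.
    training = [
        "GMDecrease Major acquisitions lower gross margin",
        "GMIncrease The upward movement of gross margin",
    ]
    survivors = surviving_events(training, cutoff=5)
    print(len(survivors), "of", len(training), "samples survive the cutoff")
```

With only a handful of samples, no word reaches the cutoff, so every sample is discarded. Adding more training data per category is the reliable fix.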

On 03/30/2012 03:36 PM, Adriano Santos wrote:
> Hi Jörn, thanks for helping me.
>
> I changed the classpath and the OpenNLP version. I ran the sample again and
> it returned this error:
>
> C:\apache-opennlp-1.5.2\bin>opennlp DoccatTrainer -encoding UTF-8 -lang en -data en-doccat.train -model en-doccat.bin
> Indexing events using cutoff of 5
>
>          Computing event counts...  done. 2 events
>          Indexing...  Dropped event GMDecrease:[bow=Major, bow=acquisitions, bow=that, bow=have, bow=a, bow=lower, bow=gross, bow=margin, bow=than, bow=the, bow=existing, bow=network, bow=also]
> Dropped event GMIncrease:[bow=The, bow=upward, bow=movement, bow=of, bow=gross, bow=margin, bow=resulted, bow=from, bow=amounts, bow=pursuant, bow=to, bow=adjustments]
> done.
> Sorting and merging events... Done indexing.
> Incorporating indexed data for training...
> Exception in thread "main" java.lang.NullPointerException
>          at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
>          at opennlp.maxent.GIS.trainModel(GIS.java:256)
>          at opennlp.model.TrainUtil.train(TrainUtil.java:182)
>          at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:154)
>          at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:176)
>          at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:192)
>          at opennlp.tools.cmdline.doccat.DoccatTrainerTool.run(DoccatTrainerTool.java:91)
>          at opennlp.tools.cmdline.CLI.main(CLI.java:191)
>
>
>
> On Fri, Mar 30, 2012 at 10:18 AM, Jörn Kottmann<kottmann@gmail.com>  wrote:
>
>> Looks like you do not have the maxent jar on the classpath.
>> Maybe it is just an issue with our script (does that work with head?).
>>
>> Anyway, try to go to this dir:
>>
>> C:\Program Files\Apache Software Foundation\opennlp-tools-1.5.0
>>
>> and type: bin/opennlp
>>
>> Or does it not work because of the whitespace in Program Files?
>>
>> I suggest that you try 1.5.2; if I remember correctly, we spent some
>> time fixing this script.
>>
>> Jörn
>>
>>
>> On 03/30/2012 03:14 PM, Adriano Santos wrote:
>>
>>> Hi, people.
>>>
>>> So... I ran the example and it returned this error:
>>>
>>> C:\Program Files\Apache Software Foundation\opennlp-tools-1.5.0\bin>opennlp DoccatTrainer -encoding UTF-8 -lang en -data en-doccat.train -model en-doccat.bin
>>> Exception in thread "main" java.lang.NoClassDefFoundError: opennlp/model/EventStream
>>>          at opennlp.tools.cmdline.CLI.<clinit>(CLI.java:107)
>>> Caused by: java.lang.ClassNotFoundException: opennlp.model.EventStream
>>>          at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>          at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>          at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>          at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>          at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>          at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>          ... 1 more
>>>
>>> I'm using version 1.5.0 of opennlp-tools.
>>>
>>> Thanks for everything.
>>>
>>>
>>> On Tue, Mar 27, 2012 at 8:40 PM, william.colen@gmail.com <william.colen@gmail.com> wrote:
>>>
>>>> Hi, Adriano,
>>>> We don't have any ready-to-use model for the Document Categorizer yet. You
>>>> should try training your own using the instructions.
>>>>
>>>> Regards,
>>>> William
>>>>
>>>>
>>>> On Tue, Mar 27, 2012 at 5:31 PM, Adriano Santos<adriano.nego@gmail.com>
>>>> wrote:
>>>>
>>>>> To perform classification I need a maxent model, but I don't have an
>>>>> example of one. The other tasks (Name Finder, Tokenizer, Sentence
>>>>> Detector...) come with examples. I'm a beginner with OpenNLP and I'd like
>>>>> to run all the existing examples.
>>>>>
>>>>> Can you help me?
>>>>>
>>>>> On Tue, Mar 27, 2012 at 5:17 PM, Jörn Kottmann<kottmann@gmail.com> wrote:
>>>>>
>>>>>> On 03/27/2012 10:04 PM, Adriano Santos wrote:
>>>>>>> I'm trying to use Document Categorizer - Classifying, but I could not
>>>>>>> run the example.
>>>>>>>
>>>>>> What problem do you have? Do you get an exception?
>>>>>>
>>>>>> Jörn
>>>>>>
>>>>>>
>>>>> --
>>>>>
>>>>> Adriano Araújo Santos
>>>>>
>>>>> Professor at the Escola Superior de Aviação Civil - ESAC
>>>>> Professor in the Information Systems program - FACISA
>>>>> Professor in the Department of Computing at UEPB
>>>>> PMI Membership
>>>>> Master's student in Computer Science at UFCG
>>>>> Postgraduate student in Business Project Management - MBA
>>>>> MSP Lead - Microsoft Student Partner
>>>>> Leader of the .NUG User Group
>>>>> Twitter: @Adriano_Santos
>>>>> Site: https://sites.google.com/site/adrianosantospb
>>>>>
>>>
>

