lucene-general mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Train Lucene with topic-defined files
Date Sun, 22 Jun 2014 13:59:24 GMT
Hi benglish,

You are almost there. Since you seem to have an index already,
all you need to do to train is call the classifier's train() method.

 > for(int i = 0; i < NumberOfTraningFiles; i++)
 > {
 >       classifier.train(ar, bodyTextOfTheFile, categoryOfTheFile, new
 > JapaneseAnalyzer(Version.LUCENE_46));
 > }

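But you should call train() once, outside the loop. It takes the index
reader and the names of the text and category fields, not the contents
of each file, and it trains on every document in the index.
Also, you need to use an appropriate Analyzer for your text field,
e.g. StandardAnalyzer for English.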
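
To make the "train once" point concrete, here is a minimal sketch against
the Lucene 4.6 API. It is not taken from your code: the field names "text"
and "category", the class name, and the tiny in-memory training set are all
assumptions for illustration, and it expects the lucene-core,
lucene-analyzers-common and lucene-classification jars (plus their
dependencies) on the classpath. The loop is used only for indexing, and
train() is then called a single time with the field names.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.classification.ClassificationResult;
import org.apache.lucene.classification.SimpleNaiveBayesClassifier;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.SlowCompositeReaderWrapper;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;

public class TrainOnceSketch {

    // Hypothetical field names; use whatever names your own index uses.
    static final String TEXT_FIELD = "text";
    static final String CATEGORY_FIELD = "category";

    // Made-up training data for illustration: {category, body text}.
    static final String[][] TRAINING = {
        {"sports", "the team won the football match"},
        {"sports", "the striker scored a goal in the match"},
        {"tech", "lucene builds a search index"},
        {"tech", "the search engine indexes documents"},
    };

    static String classify(String input) throws IOException {
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46);
        Directory dir = new RAMDirectory();

        // The loop belongs to indexing: one Document per training file.
        IndexWriter writer = new IndexWriter(
                dir, new IndexWriterConfig(Version.LUCENE_46, analyzer));
        for (String[] file : TRAINING) {
            Document doc = new Document();
            // The category is indexed as a single untokenized term.
            doc.add(new StringField(CATEGORY_FIELD, file[0], Field.Store.YES));
            // The body text is tokenized by the analyzer.
            doc.add(new TextField(TEXT_FIELD, file[1], Field.Store.YES));
            writer.addDocument(doc);
        }
        writer.close();

        DirectoryReader reader = DirectoryReader.open(dir);
        try {
            // train() wants an AtomicReader, so wrap the composite reader.
            AtomicReader ar = SlowCompositeReaderWrapper.wrap(reader);
            SimpleNaiveBayesClassifier classifier = new SimpleNaiveBayesClassifier();
            // Called once, outside any loop, with field names rather than file contents.
            classifier.train(ar, TEXT_FIELD, CATEGORY_FIELD, analyzer);
            ClassificationResult<BytesRef> result = classifier.assignClass(input);
            return result.getAssignedClass().utf8ToString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Prints the category the classifier assigns to the input text.
        System.out.println(classify("football goal"));
    }
}
```

If your index already exists on disk, you would skip the indexing loop and
open the reader on that directory instead; the train() call stays the same.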

koji
-- 
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html

(2014/06/22 16:27), benglish wrote:
> Dear Koji,
>
> Firstly, thank you so much.
>
> I have a number of files and their categories. Each file can have just 2
> attributes: category and text. Unfortunately, I could not understand how you
> inserted your training data into the SimpleNaiveBayes classifier. In other
> words, I did not get the .xml file. I guess it is something related to Solr,
> but I have no experience with that. I was wondering if you'd mind helping me
> and tell me how to have my files inserted into the training part of the
> classifier. Is it possible to do something like this:
>
> for(int i = 0; i < NumberOfTraningFiles; i++)
> {
>       classifier.train(ar, bodyTextOfTheFile, categoryOfTheFile, new
> JapaneseAnalyzer(Version.LUCENE_46));
> }
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Train-Lucene-with-topic-defined-files-tp4141979p4143296.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
>



