lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wettin <karl.wet...@gmail.com>
Subject Re: Hot to get word importance in lucene index
Date Fri, 23 Jul 2010 09:22:33 GMT
Are you perhaps looking for this:

http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/search/similar/MoreLikeThis.html

?

	karl

23 jul 2010 kl. 10.54 skrev Xaida:

>
> Hi! thanks for reply! I will try to explain better, sorry if it was  
> unclear.
>
> I have user text document collection. Not too big. Goal is to get  
> the most
> "important" concepts  which would in a way represent user  
> interests.  That
> is what i mean when i say important :)
>
>
> So lets say, in my collection I have my school documents, i have some
> snowboarding articles, i have some backpacking and easy travelling  
> guides,
> my favorite cooking recipes......and so on. Collection is more - less
> supervised, so number of documents for each "area" is similar. Not  
> equal,
> but there is some balance.
>
> So i would like, as result to get terms which are important in the  
> entire
> collection. For example, i think that term "cheese" should appear in  
> my
> results, because i know in my recipes there is a lot of cheese. Also  
> i would
> like to get the term "database"...from my school documents. And so on.
>
> So nothing more smart comes to my mind than this :)
> step 1 take one document
> step 2 calculate tfidf for all its terms
> step 3 take the terms with best tfidf and save them somewhere....
> step 4 go to step 1......and so on for all the documents
>
> And in the end to merge these results somehow :/
>
> I guess there is better way :)
>
> Thank you!!
>
>
>
>
>
> -- 
> View this message in context: http://lucene.472066.n3.nabble.com/Hot-to-get-word-importance-in-lucene-index-tp988836p989301.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message