lucene-java-user mailing list archives

From Xaida <hota.a...@gmail.com>
Subject How to get word importance in lucene index
Date Fri, 23 Jul 2010 02:44:23 GMT

Hi all!

I need to find out how important a word is across the entire document
collection that is indexed in a Lucene index. I want to extract some
"representative words": concepts that are common and representative of the
whole collection, i.e. collection "keywords". I did full-text indexing, and
the only field I am using is the text content, because the titles of the
documents are mostly not representative (numbers, codes, etc.).

If I calculate tf-idf, it gives me the importance of a single term with
respect to a single document. But if that word occurs in many documents, how
can I calculate its total importance within the index?
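
To make the question concrete: one naive reading of "total importance" is to
sum a term's tf-idf weight over every document that contains it. The sketch
below shows only that idea on an in-memory toy collection. It is plain Java
with no Lucene calls, the class and method names are hypothetical, and
idf = ln(N / df) is an assumed weighting, not necessarily the formula Lucene
uses for scoring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: sums per-document tf-idf per term.
class CollectionTermWeight {

    // docs: each document is a list of already-tokenized terms.
    // Returns term -> sum over documents of tf(term, doc) * ln(N / df(term)).
    static Map<String, Double> summedTfIdf(List<List<String>> docs) {
        int n = docs.size();
        Map<String, Integer> df = new HashMap<>();          // document frequency
        List<Map<String, Integer>> tfs = new ArrayList<>(); // term counts per doc

        for (List<String> doc : docs) {
            Map<String, Integer> tf = new HashMap<>();
            for (String term : doc) {
                tf.merge(term, 1, Integer::sum);
            }
            tfs.add(tf);
            // Each distinct term in this document adds 1 to its df.
            for (String term : tf.keySet()) {
                df.merge(term, 1, Integer::sum);
            }
        }

        Map<String, Double> score = new HashMap<>();
        for (Map<String, Integer> tf : tfs) {
            for (Map.Entry<String, Integer> e : tf.entrySet()) {
                double idf = Math.log((double) n / df.get(e.getKey()));
                score.merge(e.getKey(), e.getValue() * idf, Double::sum);
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("lucene", "index", "the"),
            Arrays.asList("lucene", "search", "the"),
            Arrays.asList("the", "cat"));
        // Print every term with its summed weight (map order is unspecified).
        summedTfIdf(docs).forEach((term, w) -> System.out.println(term + " " + w));
    }
}
```

Because idf does not depend on the document, this sum is just the term's total
number of occurrences in the collection multiplied by its idf. A term that
occurs in every document therefore scores 0 under this weighting, which removes
stop words but also penalizes exactly the "common" concepts described above, so
it may not be the right measure as it stands.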

All help appreciated!! Thank you!!!

