lucene-general mailing list archives

From William Koscho
Subject Weight for all Terms in all Documents
Date Tue, 05 Oct 2010 18:01:13 GMT
How do I get the weights for all terms in all documents?

For a given set of documents, what series of API calls do I need to
make to get the following type of information:

doc1, termA_weight, termB_weight, etc.
doc2, termC_weight, termD_weight, etc.
doc3, termE_weight, termZ_weight, etc.

It seems that I have to start with a Query object, which is typically
provided by an end user.  In my case, however, I have no end user and no
specific query.  Instead, I am analyzing the documents themselves and want
the weights of all terms so that I can compute statistics about the
similarity among documents.
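
To make the request concrete, here is a minimal sketch in plain Java of the
kind of per-document table I am after.  It does not use any Lucene API.  The
class name TermWeights, the methods weights and cosine, the sample documents,
and the choice of tf * ln(N / df) as the weight are all assumptions for
illustration only; Lucene's own scoring may weight terms differently.

```java
import java.util.*;

// Hypothetical helper, not part of Lucene: computes a weight for every
// term in every document without needing a query.
class TermWeights {

    // Input:  document id -> list of already-tokenized terms.
    // Output: document id -> (term -> weight), sorted for stable printing.
    public static Map<String, Map<String, Double>> weights(Map<String, List<String>> docs) {
        // Document frequency: the number of documents each term occurs in.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> terms : docs.values()) {
            for (String t : new HashSet<>(terms)) {
                df.merge(t, 1, Integer::sum);
            }
        }
        int n = docs.size();
        Map<String, Map<String, Double>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : docs.entrySet()) {
            // Raw term frequency within this document.
            Map<String, Integer> tf = new HashMap<>();
            for (String t : e.getValue()) {
                tf.merge(t, 1, Integer::sum);
            }
            Map<String, Double> row = new TreeMap<>();
            for (Map.Entry<String, Integer> f : tf.entrySet()) {
                // Assumed weighting: tf * ln(N / df). A term that occurs in
                // every document therefore gets weight 0.
                double idf = Math.log((double) n / df.get(f.getKey()));
                row.put(f.getKey(), f.getValue() * idf);
            }
            result.put(e.getKey(), row);
        }
        return result;
    }

    // Cosine similarity between two sparse weight vectors.
    public static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            na += e.getValue() * e.getValue();
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        for (double w : b.values()) {
            nb += w * w;
        }
        return (na == 0 || nb == 0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    public static void main(String[] args) {
        // Made-up sample documents, for illustration only.
        Map<String, List<String>> docs = new LinkedHashMap<>();
        docs.put("doc1", Arrays.asList("apple", "banana", "apple"));
        docs.put("doc2", Arrays.asList("banana", "cherry"));
        docs.put("doc3", Arrays.asList("cherry", "date"));

        Map<String, Map<String, Double>> w = weights(docs);
        for (Map.Entry<String, Map<String, Double>> e : w.entrySet()) {
            System.out.println(e.getKey() + ", " + e.getValue());
        }
        System.out.println("sim(doc1, doc2) = " + cosine(w.get("doc1"), w.get("doc2")));
    }
}
```

What I am looking for is the Lucene equivalent of weights(): a way to read
the per-document term weights out of an existing index, so that a similarity
measure such as cosine() can be computed on top of them.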

Thanks in advance,
