lucene-java-user mailing list archives

From Joseph Turian <tur...@gmail.com>
Subject MemoryIndex or RAMDirectory, but score using term statistics from a corpus given during preprocessing?
Date Fri, 29 Oct 2010 00:06:55 GMT
How do I use MemoryIndex or RAMDirectory, but score using term statistics
from a corpus given during preprocessing?

Let's say I want to use a MemoryIndex or RAMDirectory to store a *single*
document, and then run a query against it, and get the score of the query
using just this one document.
I already know how to do this. For example code, see this blog post on
prospective search:
http://www.sajalkayan.com/prospective-search-using-python.html

Now what I want to do is take a *corpus* of K documents, and "index" it
during preprocessing to calculate the term statistics (e.g. idf).
I then want to freeze these term statistics, and use them whenever I insert
and compute the query score of a new document.
I.e. I want to *quickly* query a new document, using preprocessed term
statistics in the scoring function.
How can I do this?
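
A minimal sketch of the intended behaviour, in plain Java with no Lucene
calls: document frequencies are counted once over the corpus, the resulting
idf table is frozen, and each new document is then scored against a query
using only its own term frequencies plus that table. The class name
FrozenIdfScorer, the naive tokenizer, the smoothed idf 1 + ln(N / (df + 1)),
and the plain tf * idf sum are all assumptions made for illustration. They
are not Lucene's API or its actual scoring formula.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not a Lucene class): idf is computed once and never updated. */
final class FrozenIdfScorer {
    private final Map<String, Double> idf = new HashMap<>();
    private final double unseenIdf; // idf used for terms that never occurred in the corpus

    /** Preprocessing step: count document frequencies over the corpus and freeze idf. */
    FrozenIdfScorer(List<String> corpus) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : corpus) {
            // Each term counts at most once per document.
            for (String term : new HashSet<>(tokenize(doc))) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        int numDocs = corpus.size();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            idf.put(e.getKey(), idfOf(e.getValue(), numDocs));
        }
        unseenIdf = idfOf(0, numDocs);
    }

    /** Assumed smoothed idf: 1 + ln(N / (df + 1)). */
    static double idfOf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /** Naive tokenizer (an assumption): lower-case, split on non-alphanumerics. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Query time: tf comes from the new document only, idf from the frozen table. */
    double score(String query, String newDoc) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokenize(newDoc)) {
            tf.merge(t, 1, Integer::sum);
        }
        double total = 0.0;
        for (String term : new HashSet<>(tokenize(query))) {
            int freq = tf.getOrDefault(term, 0);
            total += freq * idf.getOrDefault(term, unseenIdf);
        }
        return total;
    }

    public static void main(String[] args) {
        FrozenIdfScorer scorer = new FrozenIdfScorer(
                Arrays.asList("the cat sat", "the dog ran", "the bird flew"));
        // The new document is never added to the corpus statistics.
        System.out.println(scorer.score("cat", "the cat and the cat"));
        System.out.println(scorer.score("the", "the cat and the cat"));
    }
}
```

In this sketch the corpus is only read in the constructor, so scoring a new
document costs one pass over that document and one lookup per query term.
What I am looking for is the equivalent hook in Lucene itself.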

Thanks,
   Joseph
