lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: IndexSearcher with two Indexes
Date Fri, 27 Jan 2012 22:10:04 GMT
On Fri, Jan 27, 2012 at 4:53 PM, Hany Azzam <hany@eecs.qmul.ac.uk> wrote:
> Hi Robert,
>
> Thanks for the reply. I am trying to do something different. If I use a
> MultiReader, then the searching/scoring will take place over the two indexes
> at the same time. However, in my case the subcomponents of the retrieval
> model are calculated over separate evidence spaces. For example, the
> retrieval model calculates something like this:
>
> score := P(query_term | documents) * P(query_term | relevant_documents)
>
> The P(query_term | documents) can be estimated using the index over the
> whole collection of documents. The P(query_term | relevant_documents) can be
> estimated using the index over the relevant documents only (which are known
> prior to the execution of the query).
>

In this situation, if you want to combine the statistics from
different indexes in your own way, you can look at
IndexSearcher.termStatistics() and
IndexSearcher.collectionStatistics().
These are intended for situations like distributed search, but maybe
you can make use of them.
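
To illustrate the distributed-search case: CollectionStatistics carries four per-field counts (maxDoc, docCount, sumTotalTermFreq, sumDocFreq). Below is a minimal sketch of merging two such sets by plain addition. It assumes the two indexes hold disjoint documents and that every count is available. `FieldStats` is a hypothetical stand-in written for this example, not a Lucene class, and the numbers are made up.

```java
// Hypothetical stand-in for the counts carried by Lucene's CollectionStatistics.
final class FieldStats {
    final long maxDoc;           // documents in the index, regardless of field
    final long docCount;         // documents with at least one term in the field
    final long sumTotalTermFreq; // total number of tokens in the field
    final long sumDocFreq;       // total number of postings in the field

    FieldStats(long maxDoc, long docCount, long sumTotalTermFreq, long sumDocFreq) {
        this.maxDoc = maxDoc;
        this.docCount = docCount;
        this.sumTotalTermFreq = sumTotalTermFreq;
        this.sumDocFreq = sumDocFreq;
    }

    // Assumption: the two indexes share no documents, so each count simply adds up.
    static FieldStats sum(FieldStats a, FieldStats b) {
        return new FieldStats(
            a.maxDoc + b.maxDoc,
            a.docCount + b.docCount,
            a.sumTotalTermFreq + b.sumTotalTermFreq,
            a.sumDocFreq + b.sumDocFreq);
    }
}

class FieldStatsDemo {
    public static void main(String[] args) {
        // Made-up numbers, for illustration only.
        FieldStats shardA = new FieldStats(1000, 990, 250000, 120000);
        FieldStats shardB = new FieldStats(50, 50, 12000, 6000);
        FieldStats merged = FieldStats.sum(shardA, shardB);
        System.out.println(merged.maxDoc + " docs, " + merged.sumTotalTermFreq + " tokens");
    }
}
```

If the relevant documents are also present in the whole-collection index, adding the counts would double count them, so for the model described above the two sets of statistics should probably stay separate rather than be summed.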

Here is some pseudocode:

    // One reader per evidence space.
    IndexReader relevant = IndexReader.open(relevantDirectory);
    IndexReader documents = IndexReader.open(documentsDirectory);

    // final so that the anonymous subclass below can reference it.
    final IndexSearcher relevantSearcher = new IndexSearcher(relevant);
    IndexSearcher documentsSearcher = new IndexSearcher(documents) {

      @Override
      public CollectionStatistics collectionStatistics(String field) throws IOException {
        CollectionStatistics documentStats = super.collectionStatistics(field);
        // Pseudocode: combine the two sets of statistics however your model needs.
        return new CollectionStatistics(...
            someCombinationOf(documentStats + stuff from relevantSearcher));
      }

      // do a similar thing for termStatistics()....
    };

    documentsSearcher.search(...)
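
For the two-evidence-space model quoted above, one possible reading is sketched below in plain Java, with no Lucene dependency. It assumes each probability is a maximum-likelihood estimate, i.e. the term's total frequency in an index divided by the total number of tokens in that index. Those are the kinds of numbers termStatistics() and collectionStatistics() expose as totalTermFreq and sumTotalTermFreq. The class `TwoSpaceScore`, its method names, and the sample numbers are all hypothetical. The original question does not say how the probabilities are estimated, and no smoothing is applied here.

```java
// Hypothetical helper: score := P(term | documents) * P(term | relevant_documents).
final class TwoSpaceScore {

    // Assumed estimator: maximum likelihood, totalTermFreq / sumTotalTermFreq.
    // Returns 0.0 for an empty evidence space to avoid dividing by zero.
    static double termProbability(long totalTermFreq, long sumTotalTermFreq) {
        if (sumTotalTermFreq <= 0) {
            return 0.0;
        }
        return (double) totalTermFreq / (double) sumTotalTermFreq;
    }

    // Multiplies the estimates taken from the two separate indexes.
    static double score(long termFreqInDocuments, long tokensInDocuments,
                        long termFreqInRelevant, long tokensInRelevant) {
        return termProbability(termFreqInDocuments, tokensInDocuments)
             * termProbability(termFreqInRelevant, tokensInRelevant);
    }

    public static void main(String[] args) {
        // Made-up counts: the term occurs 500 times among 250000 tokens in the
        // whole collection, and 120 times among 12000 tokens in the relevant set.
        System.out.println(score(500, 250000, 120, 12000));
    }
}
```

In a real setup the four counts would come from the two searchers, and the combination would live in a custom Similarity or in the overridden statistics methods rather than in a standalone helper.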

-- 
lucidimagination.com
