lucene-dev mailing list archives

From 车 东 <ched...@hotmail.com>
Subject contrib: IndexSearcher with docID sorting
Date Sat, 21 Sep 2002 08:52:26 GMT
//Che Dong wrote:
//> 1. custom sorting besides the default score sorting: make docID an alias
//> of the field you need the output sorted by.
//> Solved by sorting the data before indexing (for example, by the field
//> PostDate), so docID can be an alias of the sort field, if we make the
//> HitCollector sort by docID, or 1/docID, or an even more complex strategy
//> (docID * score)...
//>
//> http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=115469

//> IndexOrderSearcher: sort data before indexing and use 1/docID instead of score
//
//That's an interesting approach.  I don't recall ever seeing this message 
//when it was originally posted.  Sorry.
//
//I had imagined instead adding this functionality to Hits.java.  Having 
//a different Searcher implementation makes it possible for folks to use 
//MultiSearcher to combine results from an IndexSearcher and an 
//IndexOrderSearcher, which would not make sense.  If the functionality 
//instead resides in Hits.java, then it could not be misused in this way.
//
//So the way I was going to do it was to add something to Hits.java like:
//   public static final int ORDER_BY_SCORE = 1;
//   public static final int ORDER_BY_DOC_NUM = 2;
//   public void setHitOrdering(int order);
//
//If ORDER_BY_SCORE is specified then Hits would work as it does now.  This 
//would be the default.  But when ORDER_BY_DOC_NUM is specified then 
//Hits.java would use a HitCollector to implement this ordering.
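
The pre-sorting idea quoted above can be illustrated without any Lucene classes. The sketch below is only an assumption-laden toy: `DocIdAliasSketch`, `Post`, `buildIndex` and `newestFirst` are hypothetical names, and the "index" is just a list whose positions stand in for docIDs. It shows that once records are added in `postDate` order, ordering matches by docID is the same as ordering them by date.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy stand-in for an index (hypothetical, not Lucene): a docID is simply
// the position at which a record was added.
class DocIdAliasSketch {

    // Hypothetical record type; postDate is a yyyyMMdd integer.
    static final class Post {
        final String title;
        final int postDate;

        Post(String title, int postDate) {
            this.title = title;
            this.postDate = postDate;
        }
    }

    // Sort by postDate first, then "index" in that order, so that
    // docID i refers to the i-th oldest post.
    static List<Post> buildIndex(List<Post> posts) {
        List<Post> sorted = new ArrayList<>(posts);
        sorted.sort(Comparator.comparingInt((Post p) -> p.postDate));
        return sorted;
    }

    // Order matching docIDs from highest to lowest. Because of the
    // pre-sort, this is also newest-first by postDate.
    static List<String> newestFirst(List<Post> index, List<Integer> matchingDocIds) {
        List<Integer> ids = new ArrayList<>(matchingDocIds);
        ids.sort(Comparator.reverseOrder());
        List<String> titles = new ArrayList<>();
        for (int id : ids) {
            titles.add(index.get(id).title);
        }
        return titles;
    }
}
```

The alias only holds while documents keep being added in sort order; records indexed later with an earlier date would break it.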

I added docID sorting in IndexSearcher.
Please check it out.

Regards

Che, Dong
70,84d67
<   /**
<    * Customize search result sort behavior:
<    * if the data source is sorted by some field before indexing, docID can
<    * be taken as an alias of the sort field, so sorting search results
<    * by docID (or docID desc) equals sorting by that field.
<    * 
<    * search results sort method:
<    *  0:  sort by score (default)
<    *  1:  sort by docID 
<    *  -1: sort by docID desc
<    */
<   public static final int ORDER_BY_SCORE = 0;
<   public static final int ORDER_BY_DOCID = 1;
<   public static final int ORDER_BY_DOCID_DESC = -1;
<   public int sortType = ORDER_BY_SCORE; 
129,157c112,127
<     final int md = reader.maxDoc();
< 
<     scorer.score(new HitCollector() 
<       {
<               private float minScore = 0.0f;
<               public final void collect(int doc, float score) {
<                 if (score > 0.0f &&                     // ignore zeroed buckets
<                     (bits==null || bits.get(doc))) {    // skip docs not in bits
<                   totalHits[0]++;
<                   if (score >= minScore) {
<                     // update hit queue
<                     switch (sortType) {
<                           case ORDER_BY_SCORE:   //sort results by score
<                             hq.put(new ScoreDoc(doc, score)); break;
<                           case ORDER_BY_DOCID:   //sort results by docID
<                             hq.put(new ScoreDoc(doc, doc)); break;
<                           case ORDER_BY_DOCID_DESC:  //sort results by docID desc
<                             hq.put(new ScoreDoc(doc, (md - doc) ) ); break;
<                           default:  //sort results by score(default)
<                             hq.put(new ScoreDoc(doc, score));
<                         }      
<                     if (hq.size() > nDocs) {            // if hit queue overfull
<                               hq.pop();                         // remove lowest in hit queue
<                               minScore = ((ScoreDoc)hq.top()).score; // reset minScore
<                     }
<                     }
<                   }
<                 }
<               }
<       }, md);
---
>     scorer.score(new HitCollector() {
>       private float minScore = 0.0f;
>       public final void collect(int doc, float score) {
>         if (score > 0.0f &&                     // ignore zeroed buckets
>             (bits==null || bits.get(doc))) {    // skip docs not in bits
>           totalHits[0]++;
>           if (score >= minScore) {
>             hq.put(new ScoreDoc(doc, score));   // update hit queue
>             if (hq.size() > nDocs) {            // if hit queue overfull
>               hq.pop();                         // remove lowest in hit queue
>               minScore = ((ScoreDoc)hq.top()).score; // reset minScore
>             }
>           }
>         }
>       }
>       }, reader.maxDoc());
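
One point in the modified collector may deserve a second look: the gate `score >= minScore` tests the raw score, while `minScore` is read back from the queue, which in the docID modes holds docID-derived values. The sketch below is a standalone illustration of one possible way around that, not the patch itself and not Lucene API: it computes a single sort key first and uses that key both for the threshold test and for the queue. `SortKeyCollectorSketch`, `Hit`, `sortKey` and `topDocs` are hypothetical names, and `java.util.PriorityQueue` stands in for the hit queue. The key mapping mirrors the patch (`doc` and `maxDoc - doc`), so a larger key ranks first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical top-N collector; java.util.PriorityQueue replaces the hit queue.
class SortKeyCollectorSketch {
    static final int ORDER_BY_SCORE = 0;
    static final int ORDER_BY_DOCID = 1;
    static final int ORDER_BY_DOCID_DESC = -1;

    // One collected hit: the document number and the key it is ranked by.
    static final class Hit {
        final int doc;
        final double key;

        Hit(int doc, double key) {
            this.doc = doc;
            this.key = key;
        }
    }

    private final int sortType;
    private final int nDocs;
    private final int maxDoc;
    // Min-heap on the key: the head is the weakest hit kept so far.
    private final PriorityQueue<Hit> queue =
        new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.key));

    SortKeyCollectorSketch(int sortType, int nDocs, int maxDoc) {
        if (nDocs < 1) {
            throw new IllegalArgumentException("nDocs must be at least 1");
        }
        this.sortType = sortType;
        this.nDocs = nDocs;
        this.maxDoc = maxDoc;
    }

    // Same mapping as the patch: with ORDER_BY_DOCID the highest docIDs
    // rank first, with ORDER_BY_DOCID_DESC the lowest docIDs rank first.
    double sortKey(int doc, float score) {
        switch (sortType) {
            case ORDER_BY_DOCID:
                return doc;
            case ORDER_BY_DOCID_DESC:
                return maxDoc - doc;
            default:
                return score;
        }
    }

    void collect(int doc, float score) {
        if (score <= 0.0f) {
            return; // ignore zero-score documents, as the original does
        }
        double key = sortKey(doc, score);
        if (queue.size() < nDocs) {
            queue.add(new Hit(doc, key));
        } else if (key > queue.peek().key) {
            queue.poll(); // drop the weakest hit, then keep the new one
            queue.add(new Hit(doc, key));
        }
    }

    // Collected docIDs, best key first.
    List<Integer> topDocs() {
        List<Hit> hits = new ArrayList<>(queue);
        hits.sort((a, b) -> Double.compare(b.key, a.key));
        List<Integer> docs = new ArrayList<>();
        for (Hit h : hits) {
            docs.add(h.doc);
        }
        return docs;
    }
}
```

The sketch keeps the key as a `double`; storing a docID in a `float` score field, as the patch does, can no longer distinguish neighbouring docIDs above 2^24.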
