lucene-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction
Date Fri, 07 Aug 2009 02:09:15 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740362#action_12740362 ]

Mark Miller commented on LUCENE-1789:
-------------------------------------

It's basically what I did as a first attempt at 1771, actually (you get a glimpse of how
hectic my brain is in that I didn't remember that 30 minutes ago):

(with some of this in ReaderUtil now, it can be written in half the length)
{code}
+    // constructor
+    private ValueSourceScorer(Similarity similarity, IndexReader reader, ValueSourceWeight w, boolean valuesFromSubReaders) throws IOException {
       super(similarity);
+      if(!valuesFromSubReaders) {
+        this.weight = w;
+        this.qWeight = w.getValue();
+        // this is when/where the values are first created.
+        vals = valSrc.getValues(reader);
+        termDocs = reader.termDocs(null);
+        return;
+      }
+      
       this.weight = w;
       this.qWeight = w.getValue();
-      // this is when/where the values are first created.
-      vals = valSrc.getValues(reader);
+      List subReadersList = new ArrayList();
+      ReaderUtil.gatherSubReaders(subReadersList, reader);
+      subReaders = (IndexReader[]) subReadersList.toArray(new IndexReader[subReadersList.size()]);
+      valsArray = new DocValues[subReaders.length];
+      docStarts = new int[subReaders.length];
+      int maxDoc = 0;
+      for (int i = 0; i < subReaders.length; i++) {
+        docStarts[i] = maxDoc;
+        maxDoc += subReaders[i].maxDoc();
+        valsArray[i] = valSrc.getValues(subReaders[i]);
+      }
+      
+      vals = new DocValues() {
+
+        //@Override
+        public float floatVal(int doc) {
+          int n = ReaderUtil.subSearcher(doc, subReaders.length, docStarts);
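+          // each sub-reader's DocValues is indexed by segment-local doc ID, so rebase first
+          return valsArray[n].floatVal(doc - docStarts[n]);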
+        }
+
+        //@Override
+        public String toString(int doc) {
+          return Float.toString(floatVal(doc));
+        }
+        
+      };
       termDocs = reader.termDocs(null);
     }

{code}
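For anyone following along, here is a minimal, self-contained sketch of the doc ID mapping that a composite like this relies on. It does not use Lucene: `FloatValues`, `docStarts`, `subIndex`, `composite` and `fromArray` are hypothetical names made up for illustration, and `FloatValues` only stands in for the `floatVal` part of DocValues. The sketch assumes each per-segment value source is indexed by segment-local doc IDs, so it subtracts the segment's start offset before delegating.

```java
public class MultiDocValuesSketch {

  /** Hypothetical stand-in for DocValues; only floatVal is modeled. */
  public interface FloatValues {
    float floatVal(int doc);
  }

  /** Starting global doc ID of each segment, given each segment's maxDoc. */
  public static int[] docStarts(int[] maxDocs) {
    int[] starts = new int[maxDocs.length];
    int maxDoc = 0;
    for (int i = 0; i < maxDocs.length; i++) {
      starts[i] = maxDoc;
      maxDoc += maxDocs[i];
    }
    return starts;
  }

  /** Index of the segment holding a global doc ID: the last i with starts[i] <= doc. */
  public static int subIndex(int doc, int[] starts) {
    int lo = 0;
    int hi = starts.length - 1;
    while (lo < hi) {
      int mid = (lo + hi + 1) >>> 1; // round up so the loop always makes progress
      if (starts[mid] <= doc) {
        lo = mid;
      } else {
        hi = mid - 1;
      }
    }
    return lo;
  }

  /** Wraps per-segment values so they can be read with global doc IDs. */
  public static FloatValues composite(final FloatValues[] subs, final int[] starts) {
    return new FloatValues() {
      @Override
      public float floatVal(int doc) {
        int n = subIndex(doc, starts);
        // rebase the global ID to a segment-local ID before delegating
        return subs[n].floatVal(doc - starts[n]);
      }
    };
  }

  /** Array-backed values, playing the role of one segment's cached values. */
  public static FloatValues fromArray(final float[] values) {
    return new FloatValues() {
      @Override
      public float floatVal(int doc) {
        return values[doc];
      }
    };
  }
}
```

With two segments of 3 and 2 docs, `docStarts` gives `{0, 3}`, so global doc 3 resolves to segment 1, local doc 0. Picking the last segment whose start is at or below the doc ID also skips over empty segments, which share a start offset with the segment that follows them.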

> getDocValues should provide a MultiReader DocValues abstraction
> ---------------------------------------------------------------
>
>                 Key: LUCENE-1789
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1789
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Hoss Man
>            Priority: Minor
>             Fix For: 2.9
>
>
> When scoring a ValueSourceQuery, the scoring code calls ValueSource.getValues(reader)
> on *each* leaf-level subreader, so DocValues instances are backed by the individual FieldCache
> entries of the subreaders. But if client code were to inadvertently call getValues()
> on a MultiReader (or DirectoryReader), it would wind up using the "outer" FieldCache.
> Since getValues(IndexReader) returns DocValues, we have an advantage here that we don't
> have with the FieldCache API (which is required to provide direct array access). getValues(IndexReader)
> could be implemented so that *IF* a caller inadvertently passes in a reader with non-null
> subReaders, getValues could generate a DocValues instance for each of the subReaders, and
> then wrap them in a composite "MultiDocValues".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

