lucene-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #633: LUCENE-8753 UniformSplit PostingsFormat
Date Wed, 12 Jun 2019 14:54:09 GMT
bruno-roustant commented on a change in pull request #633: LUCENE-8753 UniformSplit PostingsFormat
URL: https://github.com/apache/lucene-solr/pull/633#discussion_r292959376
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/codecs/lucene80/Lucene80Codec.java
 ##########
 @@ -91,7 +91,11 @@ public Lucene80Codec() {
    *             flushed/merged segments.
    */
   public Lucene80Codec(Mode mode) {
-    super("Lucene80");
+    this("Lucene80", mode);
+  }
+  
+  protected Lucene80Codec(String name, Mode mode) {
+    super(name);
 
 Review comment:
   > Use FilterCodec instead?
   Yes! Good point. I'll do that.
   
   > Use Lucene50PostingsWriter instead?
   DeltaBaseTermStateSerializer writes each TermState with its file pointers delta-encoded
   relative to a base pointer (the block's base file pointer), and that is all it does. This
   differs from Lucene50PostingsWriter.encodeTerm, which writes each TermState with its file
   pointers delta-encoded relative to the previously written TermState.
   Delta-encoding relative to the base pointer lets us read TermStates with random access;
   no sequential read is required. We can therefore run a binary search inside the block
   itself and read only one TermState.
   To summarize: no, we cannot use Lucene50PostingsWriter.encodeTerm. That said,
   DeltaBaseTermStateSerializer could be located elsewhere. The only constraint is that it
   writes IntBlockTermState, so that we stay compatible with Lucene50PostingsReader.postings(),
   which requires an IntBlockTermState because it internally casts the BlockTermState it is given.
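
   To illustrate the difference between the two encodings, here is a minimal, self-contained
   sketch. It is not the code from the pull request: the class DeltaBaseSketch, its method
   names, and the single-pointer-per-term layout are all hypothetical simplifications (a real
   TermState carries several file pointers). It only shows why base-relative deltas allow one
   entry to be decoded in isolation after a binary search, while previous-relative deltas
   force a sequential scan.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Lucene.
public class DeltaBaseSketch {

    // Sequential scheme: each pointer is stored as a delta from the previous pointer.
    static long[] encodeRelativeToPrevious(long[] pointers) {
        long[] deltas = new long[pointers.length];
        long previous = 0L;
        for (int i = 0; i < pointers.length; i++) {
            deltas[i] = pointers[i] - previous;
            previous = pointers[i];
        }
        return deltas;
    }

    // Decoding entry `index` must accumulate every delta from 0 to index.
    static long decodeRelativeToPrevious(long[] deltas, int index) {
        long pointer = 0L;
        for (int i = 0; i <= index; i++) {
            pointer += deltas[i];
        }
        return pointer;
    }

    // Base-relative scheme: each pointer is stored as a delta from the block base pointer.
    static long[] encodeRelativeToBase(long[] pointers, long base) {
        long[] deltas = new long[pointers.length];
        for (int i = 0; i < pointers.length; i++) {
            deltas[i] = pointers[i] - base;
        }
        return deltas;
    }

    // Decoding entry `index` needs only the base and that single delta.
    static long decodeRelativeToBase(long[] deltas, long base, int index) {
        return base + deltas[index];
    }

    // Binary search over the block's sorted terms, then decode only the matching entry.
    // Returns -1 when the term is not in the block.
    static long lookup(String[] sortedTerms, long[] baseDeltas, long base, String term) {
        int index = Arrays.binarySearch(sortedTerms, term);
        if (index < 0) {
            return -1L;
        }
        return decodeRelativeToBase(baseDeltas, base, index);
    }

    public static void main(String[] args) {
        String[] terms = {"apple", "banana", "cherry", "date"};
        long[] pointers = {1000L, 1012L, 1040L, 1047L};
        long base = 1000L;

        long[] fromPrevious = encodeRelativeToPrevious(pointers);
        long[] fromBase = encodeRelativeToBase(pointers, base);

        // Both schemes recover the same pointer; only the amount of reading differs.
        System.out.println(decodeRelativeToPrevious(fromPrevious, 2));
        System.out.println(lookup(terms, fromBase, base, "cherry"));
    }
}
```

   In this sketch the base-relative deltas are usually larger than the previous-relative
   ones, which is the price paid for being able to read a single entry.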

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services


