lucene-java-user mailing list archives

From Roman Margolis <roman.margo...@gmail.com>
Subject Re: Question about a documentation note in CompressingStoredFieldsIndexWriter
Date Fri, 24 Nov 2017 09:39:59 GMT
Sorry about that.
In my original message I highlighted the relevant parts, but the
highlighting probably didn't make it to the mail archive.

I would expect the note to state the following (unless I misunderstood some
of the details):
"Once data is loaded into memory, you can lookup the start pointer of any
document chunk by performing two binary searches: a first one based on the
values of DocBase in order to find the right block, and then inside the
block based on DocBaseDeltas (by reconstructing the doc bases for every
chunk)."

instead of:
"Once data is loaded into memory, you can lookup the start pointer of any
document by performing two binary searches: a first one based on the values
of DocBase in order to find the right block, and then inside the block
based on DocBaseDeltas (by reconstructing the doc bases for every chunk)."

The difference between the two is the added word 'chunk' after the word
'document'.
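
To illustrate why I read it as "document chunk", here is a minimal sketch
of the two-level lookup as I understand it. This is not Lucene's actual
code: the class, field, and parameter names are hypothetical, and I assume
a simplified layout where each block stores its DocBase and start pointer
plus, per chunk, a document count and a byte length (the real format
encodes deltas differently). I also assume every chunk holds at least one
document.

```java
import java.util.Arrays;

// Hypothetical in-memory index; names and layout are assumptions, not Lucene's API.
class ChunkLookupSketch {
    private final int[] blockDocBases;          // doc ID of the first document in each block
    private final int[][] chunkDocBases;        // per block: first doc ID of each chunk
    private final long[][] chunkStartPointers;  // per block: start pointer of each chunk

    ChunkLookupSketch(int[] blockDocBases, long[] blockStartPointers,
                      int[][] docCounts, long[][] chunkLengths) {
        int blocks = blockDocBases.length;
        this.blockDocBases = blockDocBases.clone();
        this.chunkDocBases = new int[blocks][];
        this.chunkStartPointers = new long[blocks][];
        for (int b = 0; b < blocks; b++) {
            int chunks = docCounts[b].length;
            chunkDocBases[b] = new int[chunks];
            chunkStartPointers[b] = new long[chunks];
            // Reconstruct the doc base and start pointer of every chunk in the block.
            int docBase = blockDocBases[b];
            long pointer = blockStartPointers[b];
            for (int c = 0; c < chunks; c++) {
                chunkDocBases[b][c] = docBase;
                chunkStartPointers[b][c] = pointer;
                docBase += docCounts[b][c];
                pointer += chunkLengths[b][c];
            }
        }
    }

    // Index of the last element <= key; assumes sorted[0] <= key.
    private static int floorIndex(int[] sorted, int key) {
        int i = Arrays.binarySearch(sorted, key);
        return i >= 0 ? i : -i - 2;
    }

    // Returns the start pointer of the chunk that contains docId.
    long startPointer(int docId) {
        int block = floorIndex(blockDocBases, docId);        // first binary search
        int chunk = floorIndex(chunkDocBases[block], docId); // second binary search
        return chunkStartPointers[block][chunk];
    }
}
```

In this sketch, every document in the same chunk maps to the same pointer,
which is the start of the chunk and not of the individual document.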

Thanks,
Roman Margolis


On Fri, Nov 24, 2017 at 11:24 AM, Adrien Grand <jpountz@gmail.com> wrote:

> Hi Roman,
>
> It's unclear to me what modification you are suggesting, could you please
> share what the updated comment would look like?
>
> On Wed, Nov 22, 2017 at 14:17, Roman Margolis <roman.margolis@gmail.com>
> wrote:
>
> > Hi,
> >
> > I was reading some internal info about Lucene, and was confused by a note
> > on this page:
> >
> > https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/compressing/CompressingStoredFieldsIndexWriter.html
> >
> > The note (the last note at the bottom) says:
> >
> >    - Once data is loaded into memory, you can lookup the start pointer of
> >    any document by performing two binary searches: a first one based on
> >    the values of DocBase in order to find the right block, and then inside
> >    the block based on DocBaseDeltas (by reconstructing the doc bases for
> >    every chunk).
> >
> > Shouldn't it say chunk, or document chunk (referring to document chunks
> > in the field data file)?
> >
> > Thanks in advance,
> > Roman Margolis
> >
>
