lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <>
Subject [jira] [Updated] (LUCENE-7399) Speed up flush of points v2
Date Tue, 02 Aug 2016 08:19:20 GMT


Adrien Grand updated LUCENE-7399:
    Attachment: LUCENE-7399.patch

Here is a new patch. I fixed assertHistogram so that it is only called from within an assertion, and added the suggested

bq. Maybe the visitor should also take BytesRef? Codec impls could read a whole byte[] values
block in at once

I am not sure codecs could leverage this. I think a serious codec impl would do prefix compression
to save space, so it could not read a large byte[] in one go anyway: at every iteration it would
need to concatenate the shared prefix with the suffix that is specific to the value.
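
To illustrate the concern, here is a minimal sketch of decoding a prefix-compressed block. The
layout (one shared prefix followed by fixed-length suffixes) and the names PrefixBlockSketch and
decode are assumptions made up for this example; this is not the actual codec format or Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixBlockSketch {
    /**
     * Hypothetical block layout: one shared prefix, then one fixed-length
     * suffix per value, stored back to back in {@code suffixes}.
     */
    static List<byte[]> decode(byte[] prefix, byte[] suffixes, int suffixLen) {
        if (suffixLen <= 0 || suffixes.length % suffixLen != 0) {
            throw new IllegalArgumentException("bad suffix length: " + suffixLen);
        }
        int count = suffixes.length / suffixLen;
        // The prefix is copied once into a scratch buffer...
        byte[] scratch = new byte[prefix.length + suffixLen];
        System.arraycopy(prefix, 0, scratch, 0, prefix.length);
        List<byte[]> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // ...but each value still needs its own suffix appended, so the
            // full values never exist as one contiguous byte[] on disk.
            System.arraycopy(suffixes, i * suffixLen, scratch, prefix.length, suffixLen);
            values.add(scratch.clone());
        }
        return values;
    }
}
```

The per-value copy into the scratch buffer is the step that a bulk byte[] read could not avoid
under this assumed layout.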

bq. We could also fix BKDWriter.writeCommonPrefixes to save the copy there, though that's
just once per leaf block.

I remember trying it out and it didn't help.

bq. Have you tweaked 20 to see if that's a good value? Sorting BKD points is rather costly
since when we swap, we swap whole values (docID, maybe ord, then the byte[] value for this

I remember tweaking it a long time ago when I worked on this Sorter abstraction, and values
in the [20,50] range looked fine when sorting a simple int[] (so both comparisons and swaps were
cheap), so I picked 20 to err on the safe side. It's true that it might be different with points,
which have costly swaps.
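
For context, the threshold under discussion is the range length below which a sort falls back to
insertion sort. Here is a minimal sketch on a plain int[], assuming a simple quicksort; the class
ThresholdSortSketch and its methods are made up for this example and are not the Lucene Sorter code.

```java
class ThresholdSortSketch {
    /** Sorts a[from, to) with quicksort, using insertion sort on short ranges. */
    static void sort(int[] a, int from, int to, int threshold) {
        // Ranges of length <= 2 always go to insertion sort so that the
        // partitioning below only sees ranges with at least 3 elements.
        if (to - from <= Math.max(threshold, 2)) {
            insertionSort(a, from, to);
            return;
        }
        int pivot = a[from + (to - from) / 2];
        int i = from, j = to - 1;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                // Cheap for an int[]; for points a swap moves whole values.
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
                i++;
                j--;
            }
        }
        sort(a, from, j + 1, threshold);
        sort(a, i, to, threshold);
    }

    static void insertionSort(int[] a, int from, int to) {
        for (int i = from + 1; i < to; i++) {
            int v = a[i];
            int j = i - 1;
            // Shift larger elements right to make room for v.
            while (j >= from && a[j] > v) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = v;
        }
    }
}
```

Calling sort(values, 0, values.length, 20) corresponds to the current choice. Insertion sort moves
elements many times on its range, so when each move copies a docID plus a byte[] value the best
threshold may well be different from the one measured on an int[].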

> Speed up flush of points v2
> ---------------------------
>                 Key: LUCENE-7399
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7399.patch, LUCENE-7399.patch
> There are improvements we can make on top of LUCENE-7396 to get even better flush performance.

This message was sent by Atlassian JIRA
