lucene-java-user mailing list archives

From Rob Audenaerde <rob.audenae...@gmail.com>
Subject Re: indexing performance 6.6 vs 7.1
Date Mon, 29 Jan 2018 10:29:27 GMT
Hi all,

Some follow-up (sorry for the delay).

We built a benchmark in our application, and profiled it (on a smallish
data set). What we currently see in the profiler is that in Lucene 7.1 the
calls to `commit()` take much longer.

Self-time spent committing in 6.6: 3,215 ms
Self-time spent committing in 7.1: 10,187 ms

We will try a larger data set next, and later run it again with the
IndexWriter (IW) info stream enabled.
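
For anyone who wants to cross-check the profiler numbers above with plain wall-clock timing, here is a minimal sketch of a timing wrapper. It is not Lucene API: `CommitTimer`, `IoAction` and `slowdown` are hypothetical names made up for this illustration, and wall-clock time is not the same thing as profiler self-time, so the absolute figures will differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: accumulates wall-clock time spent in a repeated I/O action. */
final class CommitTimer {

    /** Stand-in for any call that may throw IOException, such as a commit. */
    @FunctionalInterface
    interface IoAction {
        void run() throws IOException;
    }

    private long totalNanos;
    private long maxNanos;
    private int calls;

    /** Runs the action once and records its duration, even if it fails. */
    void time(IoAction action) {
        long start = System.nanoTime();
        try {
            action.run();
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        } finally {
            long elapsed = System.nanoTime() - start;
            totalNanos += elapsed;
            maxNanos = Math.max(maxNanos, elapsed);
            calls++;
        }
    }

    int calls() {
        return calls;
    }

    long totalMillis() {
        return TimeUnit.NANOSECONDS.toMillis(totalNanos);
    }

    long maxMillis() {
        return TimeUnit.NANOSECONDS.toMillis(maxNanos);
    }

    /** Ratio of two totals, for example slowdown(10_187, 3_215) for the figures above. */
    static double slowdown(long newerMillis, long olderMillis) {
        return (double) newerMillis / olderMillis;
    }
}
```

Assuming `writer` is the application's `IndexWriter`, each commit could be wrapped as `timer.time(writer::commit)` and the totals printed at the end of the benchmark. Applied to the two self-times quoted above, `slowdown` gives a factor of roughly 3.2.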

-Rob

On Thu, Jan 18, 2018 at 7:03 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Robert:
>
> Ah, right. I keep confusing my gmail lists
> "lucene dev"
> and
> "lucene list"....
>
> Siiigggghhhhh.
>
>
>
> On Thu, Jan 18, 2018 at 9:18 AM, Adrien Grand <jpountz@gmail.com> wrote:
> > If you have sparse data, I would have expected index time to *decrease*,
> > not increase.
> >
> > Can you enable the IW info stream and share flush + merge times to see
> > where indexing time goes?
> >
> > If you can run with a profiler, this might also give useful information.
> >
> > On Thu, Jan 18, 2018 at 11:23 AM, Rob Audenaerde <rob.audenaerde@gmail.com>
> > wrote:
> >
> >> Hi all,
> >>
> >> We recently upgraded from Lucene 6.6 to 7.1. We see a significant drop in
> >> indexing performance.
> >>
> >> We have an atypical use of Lucene, as we (also) index some database tables
> >> and add all the values as AssociationFacetFields as well. This allows us to
> >> create pivot tables on search results really fast.
> >>
> >> These tables have some overlapping columns, but also disjoint ones.
> >>
> >> We anticipated a decrease in index size because of the sparse docvalues. We
> >> see this happening, with decreases to ~50%-80% of the original index size.
> >> But we did not expect a drop in indexing performance (client systems'
> >> indexing time increased by 50% to 250%).
> >>
> >> (Our indexing speed used to be mainly bound by the speed at which the
> >> Taxonomy could deliver new ordinals for new values. We are currently
> >> investigating whether this is still the case and will report back once a
> >> profiler run has been done.)
> >>
> >> Does anyone know if this increase in indexing time is to be expected as a
> >> result of the sparse docvalues change?
> >>
> >> Kind regards,
> >>
> >> Rob Audenaerde
> >>
>
