lucene-java-user mailing list archives

From Dawid Weiss
Subject Re: Static index, fastest way to do forceMerge
Date Fri, 02 Nov 2018 19:52:29 GMT
> int processors = Runtime.getRuntime().availableProcessors();
> ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
> cms.setMaxMergesAndThreads(processors,processors);

Note that the number of threads in the CMS only matters if you have
concurrent merges of independent segments. What you're doing
effectively forces an eventual X -> 1 merge, which is done by a single
thread (regardless of the maximum thread count set above).
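
To illustrate the point, here is a toy model that uses a plain Java thread pool. It is not Lucene's ConcurrentMergeScheduler, and the class and method names (MergeThreadsModel, threadsUsed) are made up for this sketch. Each "merge" is modeled as one indivisible task, so a single X -> 1 merge occupies one worker no matter how large the pool is, while many independent merges can spread across workers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MergeThreadsModel {
    // Runs `merges` independent tasks on a pool of `threads` workers and
    // returns how many distinct worker threads executed at least one task.
    // Assumption of the model: one merge == one task that cannot be split.
    static int threadsUsed(int merges, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> workers = ConcurrentHashMap.newKeySet();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < merges; i++) {
                pending.add(pool.submit(() -> {
                    // Record which worker thread picked up this "merge".
                    workers.add(Thread.currentThread().getName());
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every task to finish
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return workers.size();
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("one X -> 1 merge, threads used: "
                + threadsUsed(1, processors));
        System.out.println("many independent merges, threads used: "
                + threadsUsed(4 * processors, processors));
    }
}
```

Raising the pool size changes the second number but never the first, which is the situation a forceMerge down to one segment ends up in.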

>    38G _583u.fdt
>    25M _583u.fdx
>    13K _583u.fnm
>    47G _583u_Lucene50_0.doc
>    54G _583u_Lucene50_0.pos
>    30G _583u_Lucene50_0.tim
>   413M _583u_Lucene50_0.tip
>   2.1G _583u_Lucene70_0.dvd
>    213 _583u_Lucene70_0.dvm

Merging segments as large as this one requires not just CPU, but also
serious I/O throughput. I assume you have fast NVMe drives
on that machine; otherwise it'll be slow, no matter what. It's just a
lot of bytes going back and forth.
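
A rough back-of-the-envelope sketch of what "a lot of bytes" means here. The file sizes are taken from the listing above, assuming du-style units (powers of 1024); they add up to about 171.5 GiB. The throughput figures in main are hypothetical placeholders, not measurements, and the class name MergeIoEstimate is made up. The estimate covers only streaming that many bytes once, so it is a lower bound; a real merge also reads its inputs and spends CPU time.

```java
public class MergeIoEstimate {
    static final double KIB = 1024.0;
    static final double MIB = KIB * 1024;
    static final double GIB = MIB * 1024;

    // Total size of the segment files quoted in the listing,
    // assuming the units are powers of 1024.
    static double segmentBytes() {
        return 38 * GIB     // _583u.fdt
             + 25 * MIB     // _583u.fdx
             + 13 * KIB     // _583u.fnm
             + 47 * GIB     // _583u_Lucene50_0.doc
             + 54 * GIB     // _583u_Lucene50_0.pos
             + 30 * GIB     // _583u_Lucene50_0.tim
             + 413 * MIB    // _583u_Lucene50_0.tip
             + 2.1 * GIB    // _583u_Lucene70_0.dvd
             + 213;         // _583u_Lucene70_0.dvm
    }

    // Seconds needed to stream `bytes` once at a sustained rate in MiB/s.
    static double secondsAt(double bytes, double mibPerSecond) {
        return bytes / (mibPerSecond * MIB);
    }

    public static void main(String[] args) {
        double total = segmentBytes();
        System.out.printf("segment size: %.1f GiB%n", total / GIB);
        // Hypothetical sustained throughputs, for illustration only.
        double[] ratesMiBps = {150, 500, 2000};
        for (double rate : ratesMiBps) {
            System.out.printf("one pass at %.0f MiB/s: %.1f minutes%n",
                    rate, secondsAt(total, rate) / 60.0);
        }
    }
}
```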

> If we did such a max resource merge code would there be interest to have this merged?

I think so. Try experimenting locally first, though, and see what you
can find out. Hacking the code I pointed at shouldn't be too
difficult; see what happens.

> Or should we maybe do something like this assuming 64 cpus
> writer.forceMerge(64, true);
> writer.forceMerge(32, true);
> writer.forceMerge(16, true);
> writer.forceMerge(8, true);
> writer.forceMerge(4, true);
> writer.forceMerge(2, true);
> writer.forceMerge(1, true);

No, this doesn't make much sense. If your goal is 1 segment then you
want to read from as many of them at once as possible and merge into a
single segment. Doing what you did above would only bump I/O traffic a
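
The extra I/O of the staged approach can be sketched with a deliberately simplified model. The assumptions are mine, not a description of Lucene's merge policy: segments are equal in size, and every forceMerge call that lowers the segment count rewrites the whole index once. The real merge policy may pick segments differently, so treat the result as an upper-bound illustration. The names StagedMergeCost and fullRewrites are made up.

```java
public class StagedMergeCost {
    // Counts how many times the whole index is rewritten when forceMerge
    // is called with each target in turn.
    // Model assumption: a call that lowers the segment count rewrites
    // every byte once; a call that does not lower it costs nothing.
    static int fullRewrites(int startSegments, int[] targets) {
        int current = startSegments;
        int passes = 0;
        for (int target : targets) {
            if (target < current) {
                passes++;
                current = target;
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        int start = 128; // hypothetical initial segment count
        int[] staged = {64, 32, 16, 8, 4, 2, 1};
        int[] direct = {1};
        System.out.println("staged forceMerge passes: " + fullRewrites(start, staged));
        System.out.println("direct forceMerge passes: " + fullRewrites(start, direct));
    }
}
```

Under this model, multiplying the pass count by the index size gives the bytes written; the staged sequence writes the index several times over, while a direct merge to one segment writes it once.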

