lucene-general mailing list archives

From Sanjoy Das <san...@playingwithpointers.com>
Subject Benchmarking Lucene
Date Mon, 23 Nov 2015 19:42:59 GMT
Hi all,

I work for a JVM vendor, and we're interested in obtaining / creating
a set of Lucene benchmarks for internal use.  We plan to use these for
performance regression testing and general performance analysis
(i.e. to make sure Lucene performs well on our JVM).  I'm especially
interested in benchmarks that demonstrate opportunities for
improvements in our JIT compiler.

While I imagine that the lucene/benchmark/ directory is probably the
right place to start, I have a few high-level questions that are best
answered by people on this mailing list:

- Are there realistic Lucene workloads that are bottlenecked on the
   JVM's performance (JIT, GC, etc.) and *not* on, e.g., disk / network
   IO?  If so, what are some examples?

- How relevant are the DaCapo "luindex" and "lusearch" benchmarks
   today?  Will porting them to the latest version of Lucene give me a
   benchmark representative of modern Lucene usage, or have Lucene's
   performance characteristics evolved in fundamental ways since DaCapo
   was published?

- What is the distribution of Lucene versions in production
   deployments?  Do users tend to aggressively upgrade to the "latest
   and greatest" Lucene version, or is there usually a non-trivial lag?

Any other information that you think is useful or relevant is
welcome.

Thanks!
-- Sanjoy
