lucene-java-user mailing list archives

From mark harwood <>
Subject Re: Bet you didn't know Lucene can...
Date Tue, 25 Oct 2011 15:26:42 GMT
>>using Lucene that don't fit under the core premise of full text search

I've had several use cases over the years that use features peculiar to Lucene, but here's
a very simple one I came across today that illustrates its raw index lookup capability:

I needed a fast, scalable and persistent "Set" implementation to maintain a large cold-list
(millions of string-based keys).
I benchmarked various implementations using a set of ~6 million keys with 10,000 random key
lookups.
When it comes to RAM use, retrieval times and start-up costs, Lucene stands up very well against
equivalent embedded databases for this task:

* Benchmarks for times to initially open the set when stored on disk:
* Benchmarks for Avg key lookup time once opened:
* Stats for RAM use after 10,000 lookups:

I don't doubt all of these implementations could be tweaked (e.g. optimizing the Lucene index,
various DB-specific settings) but I tried to use sensible defaults to make the tests fair
e.g. use of prepared statements, indexes, minimal data retrieved.
Speeds varied with each run of the random lookup test due to OS-level caching effects so the
best times were recorded in each case.
The HashSet tests are loaded entirely from file (hence the long start-up time) and are not
a scalable solution because of RAM costs.
MySQL requires an inter-process call as it was not embedded, but even using a remote Lucene
call I get significantly better performance (avg 0.5 ms lookup vs. 10 ms for MySQL).
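
For anyone wanting to repeat this kind of test, the measurement described above (a batch of
random key lookups, repeated several times, keeping the best run) could be sketched roughly as
follows. This is not the code used for the numbers in this message; it is a minimal sketch in
modern Java, and the class name, method names, key format, seed and run count are all
illustrative assumptions. The set under test is passed in as a Predicate<String>, so a HashSet,
an embedded database or an index-backed lookup could each be plugged in behind the same call.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical harness: names and parameters are illustrative, not from the original tests.
public class SetLookupBenchmark {

    /** Picks `count` keys at random (with repetition) from the known key list. */
    static List<String> sampleKeys(List<String> keys, int count, long seed) {
        Random random = new Random(seed);
        List<String> sample = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            sample.add(keys.get(random.nextInt(keys.size())));
        }
        return sample;
    }

    /**
     * Runs the probe batch `runs` times and returns the best (lowest) average
     * lookup time in nanoseconds, since later runs benefit from OS-level caching.
     */
    static double bestAvgLookupNanos(Predicate<String> contains, List<String> probes, int runs) {
        double best = Double.MAX_VALUE;
        for (int run = 0; run < runs; run++) {
            int hits = 0;
            long start = System.nanoTime();
            for (String key : probes) {
                if (contains.test(key)) {
                    hits++;
                }
            }
            long elapsed = System.nanoTime() - start;
            // Every probe was drawn from the stored keys, so a miss means a broken set.
            if (hits != probes.size()) {
                throw new IllegalStateException("lookup missed a key that should be present");
            }
            best = Math.min(best, (double) elapsed / probes.size());
        }
        return best;
    }

    public static void main(String[] args) {
        // Small in-memory stand-in for the real key set (assumed key format).
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            keys.add("key-" + i);
        }
        Set<String> set = new HashSet<>(keys);

        List<String> probes = sampleKeys(keys, 10_000, 42L);
        double best = bestAvgLookupNanos(set::contains, probes, 5);
        System.out.printf("best avg lookup: %.1f ns%n", best);
    }
}
```

Taking the minimum over runs rather than the mean matches the "best times were recorded"
approach, at the cost of hiding cold-cache behaviour, so start-up time needs measuring separately.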


----- Original Message -----
From: Grant Ingersoll <>
Sent: Saturday, 22 October 2011, 10:11
Subject: Bet you didn't know Lucene can...

Hi All,

I'm giving a talk at ApacheCon titled "Bet you didn't know Lucene can..." ( 
It's based on my observation that, over the years, a number of us in the community have done
some pretty cool things using Lucene that don't fit under the core premise of full text search.
I've got a fair number of ideas for the talk (easily enough for 1 hour), but I wanted to reach
out to hear your stories of ways you've (ab)used Lucene and Solr to see if we couldn't extend
the conversation to a bit more than the conference and also see if I can't inject more ideas
beyond the ones I have.  I don't need deep technical details, but just high level use case
and the basic insight that led you to believe Lucene could solve the problem.

Thanks in advance,

Grant Ingersoll
