From: Grant Ingersoll
Reply-To: java-user@lucene.apache.org
Subject: Re: Bet you didn't know Lucene can...
Date: Tue, 25 Oct 2011 15:57:27 -0400
To: java-user@lucene.apache.org, mark harwood
In-Reply-To: <1319556402.67340.YahooMailNeo@web29007.mail.ird.yahoo.com>
References: <05AE2EA4-204C-4C0F-B80F-75F847AD1828@apache.org> <1319556402.67340.YahooMailNeo@web29007.mail.ird.yahoo.com>
Message-Id: <06DFB62E-71DA-489F-B13E-669358FE49F2@apache.org>

On Oct 25, 2011, at 11:26 AM, mark harwood wrote:

>>> using Lucene that don't fit under the core premise of full text search
>
> I've had several use cases over the years that use features peculiar to Lucene but here's a very simple one I came across today that illustrates its raw index lookup capability:
>
> I needed a fast, scalable and persistent "Set" implementation to maintain a large cold-list (millions of string-based keys).
> I benchmarked various implementations using a set of ~6 million keys with 10,000 random key lookups.
> When it comes to RAM use, retrieval times and start-up costs Lucene stands up very well against equivalent embedded databases for this task:
>
> * Benchmarks for times to initially open the set when stored on disk: http://goo.gl/dJL3g
> * Benchmarks for Avg key lookup time once opened: http://goo.gl/SG79N
> * Stats for RAM use after 10,000 lookups: http://goo.gl/MyJDn

Those charts are beautiful. I have Lucene/Solr down as an excellent key-value store (I've seen this done many times) and these charts further cement it.

>
> I don't doubt all of these implementations could be tweaked (e.g. optimizing the Lucene index, various DB-specific settings) but I tried to use sensible defaults to make the tests fair e.g. use of prepared statements, indexes, minimal data retrieved.
> Speeds varied with each run of the random lookup test due to OS-level caching effects so the best times were recorded in each case.
> The HashSet tests are loaded entirely from file (hence the long start-up time) and are not a scalable solution because of RAM costs.
> MySQL requires an inter-process call as it was not embedded but even using a remoted Lucene call I get significantly better performance (avg 0.5ms lookup vs MySQL 10ms)
>
> Cheers
> Mark
>
> ----- Original Message -----
> From: Grant Ingersoll
> To: java-user@lucene.apache.org
> Cc:
> Sent: Saturday, 22 October 2011, 10:11
> Subject: Bet you didn't know Lucene can...
>
> Hi All,
>
> I'm giving a talk at ApacheCon titled "Bet you didn't know Lucene can..." (http://na11.apachecon.com/talks/18396). It's based on my observation, that over the years, a number of us in the community have done some pretty cool things using Lucene that don't fit under the core premise of full text search. I've got a fair number of ideas for the talk (easily enough for 1 hour), but I wanted to reach out to hear your stories of ways you've (ab)used Lucene and Solr to see if we couldn't extend the conversation to a bit more than the conference and also see if I can't inject more ideas beyond the ones I have. I don't need deep technical details, but just high level use case and the basic insight that led you to believe Lucene could solve the problem.
>
> Thanks in advance,
> Grant
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com