From: DM Smith
To: java-dev@lucene.apache.org
Subject: Re: Lucene's default settings & back compatibility
Date: Tue, 19 May 2009 08:26:13 -0400

On May 19, 2009, at 7:45 AM, Michael McCandless wrote:

> On Tue, May 19, 2009 at 6:47 AM, DM Smith wrote:
>
>> It is common in my application, a Bible program, that indexes each
>> verse (think of a verse as a numbered sentence) as a separate
>> document. We index everything, including words that are typically
>> stop words as those might be important to our end users. Besides
>> this, the top 280 word roots represent 90% of the occurrences.
>> And on searches, we return everything in book order, unless the
>> user wants to score the result. In that case, we return a small,
>> user configurable amount of hits ordered by score.
>
> The ability to turn off scoring when sorting by field, new in 2.9,
> should be a good performance boost for your use case (if performance
> is important).
>
>> And we are using Lucene out of the box for the most part. We've
>> deviated only to incrementally solve performance problems.
>
> Right, my impression is most people will stick w/ Lucene's defaults,
> incrementally changing only limited settings they come across, which
> is why selecting good defaults is vital to Lucene's growth/adoption
> (new users especially simply start w/ our defaults).
>
> But we can't pick good defaults when we're so heavily bound by
> back-compat.
>
> Which is why I find the Settings approach so appealing :) Suddenly,
> on all improvements to Lucene, we have the freedom to change our
> defaults so a new user sees all such improvements.

From my perspective as a user:

Backward compatibility is important, but it is not the be-all and
end-all. To me, if I can drop in the new jar and get bug fixes, that's
great. My expectation is that searches against an existing index will
still return the same or, in the case of bug fixes, better results.
What I need to know is when that is not the case.

Today, we use a naming convention of the Lucene jars to indicate
whether that is true. I'd be just as happy if there were a
compatibility level that I could check (I'm having to do that in our
code, as I change our analyzers frequently enough to be embarrassed).

The problem, which might be addressed in the "fixing" of core vs
contrib, is that we use lots of contrib (analyzers, snowball,
highlighting) and want it to maintain backward compatibility too. (I'm
happy that has been the case!) So, perhaps a compatibility level per
contribution.

The packagers for jpackage consider nearly every release of Lucene to
break backward compatibility, because they treat Lucene as a whole.
Perhaps that is the same with other Linux distributions.
But because backward compatibility does not apply to contrib in a
strict fashion, one cannot reliably use Lucene from distributions
unless such a policy is in place.

In any case, I don't think anyone should just drop in a new jar
without some testing. At a minimum, they should compile with
deprecations turned on.

Regarding deprecations, I'd also be just as happy if a method was
marked

    @deprecated This behavior has changed with this release, 2.4.3.

That is, as a warning of changed behavior. And then on the 3.0 release
the warning could be removed.

But then again, my use of Lucene, while very important to my
application, is very simple and easy to change.

-- DM
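
The "compatibility level per contribution" idea in the message above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name `CompatibilityCheck`, the component names, and the level numbers are all hypothetical, and nothing here is a Lucene API. The application records the level of each component when it builds an index, then compares those recorded levels with the levels it currently expects before reusing the index.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-component compatibility check; not part of Lucene. */
public class CompatibilityCheck {

    /** Levels this build of the application expects (made-up names and numbers). */
    public static Map<String, Integer> expectedLevels() {
        Map<String, Integer> levels = new LinkedHashMap<>();
        levels.put("core", 3);
        levels.put("analyzers", 5);
        levels.put("snowball", 2);
        return levels;
    }

    /**
     * Returns the first component whose recorded level is missing or differs
     * from the expected level, or null when every expected component matches.
     */
    public static String firstMismatch(Map<String, Integer> recorded,
                                       Map<String, Integer> expected) {
        for (Map.Entry<String, Integer> entry : expected.entrySet()) {
            Integer seen = recorded.get(entry.getKey());
            if (seen == null || !seen.equals(entry.getValue())) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Pretend the index was built when the analyzers were at level 4.
        Map<String, Integer> recorded = new LinkedHashMap<>(expectedLevels());
        recorded.put("analyzers", 4);

        String stale = firstMismatch(recorded, expectedLevels());
        System.out.println(stale == null
                ? "index is compatible"
                : "reindex needed: " + stale);
    }
}
```

Keeping one level per component, rather than one for the library as a whole, means a change in a single analyzer would flag only the indexes that depend on it.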