From: Otis Gospodnetic
To: Lucene Developers List
Subject: Re: Should Token be immutable?
Date: Mon, 6 Jan 2003 16:01:32 -0800 (PST)
Message-ID: <20030107000132.99901.qmail@web12701.mail.yahoo.com>
In-Reply-To: <3E1A0E8C.50605@lub.umontreal.ca>
List-Id: "Lucene Developers List"

Ah, sorry about bringing up performance; I mixed that up with another
thread.

Anyhow, I still think that setPositionIncrement offers a nice feature
that some people may want to use. It was on a to-do list for a while,
and it was there because people requested it, so even though Lucene
doesn't use setPositionIncrement internally, maybe Lucene-based apps
out there do.

Otis

--- stephane vaucher wrote:
> I'm not sure if I understand your question. I'm not trying to optimise
> anything. This thread was spawned because the usage of Token was
> unclear and inconsistent (I don't see the purpose here of
> package-scoped members). The result of this is that a few of us
> thought that an immutable Token might be clearer.
>
> The simplest change (I personally believe it's an essential change)
> is to make the members private.
> The second change, for the object to be immutable, would be to remove
> the positionIncrement, but since I'm no Lucene guru, I can't tell what
> is better (hence the email).
>
> I'll test the simple changes tonight to see if there is a sizable
> performance hit, and I'll wait to see if a guru speaks out about the
> controversial second change (which is also trivial).
>
> Stephane
>
> Otis Gospodnetic wrote:
>
> >It sounds to me that having the ability to do what point 13 in
> >CHANGES describes is more important than trying to only slightly
> >decrease the number of temporary objects instantiated.
> >
> >By the way, have you observed or measured the difference in
> >performance, memory consumption or anything else, before and after
> >your local changes?
> >Not having those and making Token immutable for performance reasons
> >would be wrong.
> >
> >Thanks,
> >Otis
> >
> >
> >--- stephane vaucher wrote:
> >
> >>I've noticed that there is a method public void
> >>setPositionIncrement(int positionIncrement) that would probably have
> >>to disappear for Token to be immutable. The CHANGES.txt doc seems to
> >>mention some good reasons why it was added, but there is no code in
> >>CVS that seems to depend on it.
> >>
> >> From CHANGES:
> >> 13. Added new method Token.setPositionIncrement().
> >>
> >>     This permits, for the purpose of phrase searching, placing
> >>     multiple terms in a single position.  This is useful with
> >>     stemmers that produce multiple possible stems for a word.
> >>
> >>     This also permits the introduction of gaps between terms, so
> >>     that terms which are adjacent in a token stream will not be
> >>     matched by an exact phrase query.  This makes it possible,
> >>     e.g., to build an analyzer where phrases are not matched over
> >>     stop words which have been removed.
> >>
> >>     Finally, repeating a token with an increment of zero can also
> >>     be used to boost scores of matches on that token. (cutting)
> >>
> >>Any comments? With an immutable Token, does the positionIncrement
> >>still have a reason for being there? If not, then I'll remove
> >>getPositionIncrement as well.
> >>
> >>Stephane
> >>
> >>Doug Cutting wrote:
> >>
> >>>stephane vaucher wrote:
> >>>
> >>>>1) Does anyone mind? Will it break anything?
> >>>
> >>>It shouldn't break anything.
> >>>
> >>>>2) Are there unit tests for this? (particularly PorterStemFilter).
> >>>>The changes are obviously not spectacular, but I prefer not to
> >>>>screw everyone up...
> >>>
> >>>I don't know of any unit tests specifically for this. Mostly this
> >>>change will affect compilation. In general though, if you don't
> >>>see unit tests for things that you think you might break, then it
> >>>never hurts to write more unit tests.
> >>>
> >>>>3) I've checked out the latest version of lucene, is there
> >>>>anything special I need to do if I get the go-ahead to check my
> >>>>stuff in (like a dev list review)?
> >>>
> >>>If you're not a regular committer then please send diffs to
> >>>lucene-dev before committing and give folks a few days to consider
> >>>the changes.
> >>>
> >>>Doug
> >>>
> >>>--
> >>>To unsubscribe, e-mail:
> >>>For additional commands, e-mail:
> >
> >__________________________________________________
> >Do you Yahoo!?
> >Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
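
To make item 13 of CHANGES, quoted in the thread above, a bit more
concrete, here is a minimal sketch of how position increments could
turn into term positions. It is not Lucene code: SimpleToken and
positions() are hypothetical names invented for this illustration, and
the rule "position = previous position + increment, starting from -1"
is an assumption about how an indexer might interpret increments. The
token here takes its increment in the constructor, which is one
possible way to keep the feature on an immutable class.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionIncrementSketch {

    /** Hypothetical immutable token; NOT Lucene's Token class. */
    static final class SimpleToken {
        private final String text;
        private final int positionIncrement;

        SimpleToken(String text, int positionIncrement) {
            if (positionIncrement < 0) {
                throw new IllegalArgumentException("increment must be >= 0");
            }
            this.text = text;
            this.positionIncrement = positionIncrement;
        }

        String text() { return text; }
        int positionIncrement() { return positionIncrement; }
    }

    /**
     * Assumed rule for this sketch: each token's position is the previous
     * position plus the token's increment, counting from -1.
     */
    static List<Integer> positions(List<SimpleToken> tokens) {
        List<Integer> result = new ArrayList<>();
        int position = -1;
        for (SimpleToken t : tokens) {
            position += t.positionIncrement();
            result.add(position);
        }
        return result;
    }

    public static void main(String[] args) {
        List<SimpleToken> tokens = new ArrayList<>();
        tokens.add(new SimpleToken("quick", 1));   // ordinary token
        tokens.add(new SimpleToken("running", 1)); // ordinary token
        tokens.add(new SimpleToken("run", 0));     // second stem, same position as "running"
        tokens.add(new SimpleToken("fox", 2));     // increment 2 leaves a gap for a removed stop word

        List<Integer> pos = positions(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println(tokens.get(i).text() + " @ " + pos.get(i));
        }
    }
}
```

Under these assumptions, "running" and "run" share a position, so a
phrase query could match either stem, while the gap before "fox" keeps
a phrase from matching across the removed stop word.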