lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Should Token be immutable?
Date Tue, 07 Jan 2003 00:01:32 GMT
Ah, sorry about bringing up performance, I mixed that with another
thread.
Anyhow, I still think that setPositionIncrement offers a nice feature that
some people may want to use.  It was on the to-do list for a while, and it
was there because people requested it, so even though Lucene doesn't use
setPositionIncrement internally, some Lucene-based apps out there may.
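
To make the feature concrete, here is a rough sketch of how position
increments could turn into token positions. This is not Lucene code:
SketchToken, positions() and the convention that the running position
starts at -1 are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT Lucene's Token class.
class PositionIncrementSketch {

    // Immutable token: the increment is fixed when the token is created.
    static final class SketchToken {
        final String text;
        final int positionIncrement;

        SketchToken(String text, int positionIncrement) {
            if (positionIncrement < 0) {
                throw new IllegalArgumentException("increment must be >= 0");
            }
            this.text = text;
            this.positionIncrement = positionIncrement;
        }
    }

    // Each token's position is the running sum of the increments.
    // Assumption: the sum starts at -1, so a first token with the
    // usual increment of 1 lands at position 0.
    static List<Integer> positions(List<SketchToken> tokens) {
        List<Integer> result = new ArrayList<Integer>();
        int position = -1;
        for (SketchToken token : tokens) {
            position += token.positionIncrement;
            result.add(position);
        }
        return result;
    }

    public static void main(String[] args) {
        // Token stream for the text "saw the hill".
        List<SketchToken> stream = Arrays.asList(
            new SketchToken("saw", 1),   // position 0
            new SketchToken("see", 0),   // second stem, same position 0
            new SketchToken("hill", 2)); // "the" removed, gap keeps position 2
        System.out.println(positions(stream)); // prints [0, 0, 2]
    }
}
```

An increment of 0 stacks "see" on top of "saw", and an increment of 2
keeps the hole left by the removed stop word, so "saw hill" would not
look like an adjacent pair. The sketch token takes its increment in the
constructor, which suggests an immutable Token would not necessarily
have to give up position increments.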

Otis


--- stephane vaucher <vaucher@LUB.UMontreal.CA> wrote:
> I'm not sure if I understand your question. I'm not trying to optimise
> anything. This thread was spawned because the usage of Token was unclear
> and inconsistent (I don't see the purpose of the package-scoped members).
> The result is that a few of us thought that an immutable Token might be
> clearer.
> 
> The simplest change (I personally believe it's an essential change) is to
> make the members private.
> The second change needed for the object to be immutable would be to remove
> the positionIncrement, but since I'm no Lucene guru, I can't tell which is
> better (hence the email).
> 
> I'll test the simple changes tonight to see if there is a sizable
> performance hit, and I'll wait to see if a guru speaks out about the
> controversial second change (which is also trivial).
> 
> Stephane
> 
> Otis Gospodnetic wrote:
> 
> >It sounds to me that having the ability to do what point 13 in CHANGES
> >describes is more important than trying to only slightly decrease the
> >number of temporary objects instantiated.
> >
> >By the way, have you observed or measured the difference in performance,
> >memory consumption or anything else, before and after your local changes?
> >Without such measurements, making Token immutable for performance reasons
> >would be wrong.
> >
> >Thanks,
> >Otis
> >
> >
> >--- stephane vaucher <vaucher@LUB.UMontreal.CA> wrote:
> >
> >>I've noticed that there is a method public void
> >>setPositionIncrement(int positionIncrement) that would probably have to
> >>disappear for Token to be immutable. The CHANGES.txt doc seems to mention
> >>some good reasons why it was added, but there is no code in CVS that
> >>seems to depend on it.
> >>
> >> From CHANGES:
> >> 13. Added new method Token.setPositionIncrement().
> >>
> >>     This permits, for the purpose of phrase searching, placing
> >>     multiple terms in a single position.  This is useful with
> >>     stemmers that produce multiple possible stems for a word.
> >>
> >>     This also permits the introduction of gaps between terms, so that
> >>     terms which are adjacent in a token stream will not be matched by
> >>     an exact phrase query.  This makes it possible, e.g., to build
> >>     an analyzer where phrases are not matched over stop words which
> >>     have been removed.
> >>
> >>     Finally, repeating a token with an increment of zero can also be
> >>     used to boost scores of matches on that token.  (cutting)
> >>
> >>Any comments? With an immutable Token, does the positionIncrement still
> >>have a reason for being there? If not, then I'll remove
> >>getPositionIncrement as well.
> >>
> >>Stephane
> >>
> >>Doug Cutting wrote:
> >>
> >>>stephane vaucher wrote:
> >>>
> >>>>1) Does anyone mind? Will it break anything?
> >>>
> >>>It shouldn't break anything.
> >>>
> >>>>2) Are there unit tests for this (particularly PorterStemFilter)?
> >>>>The changes are obviously not spectacular, but I prefer not to screw
> >>>>everyone up...
> >>>
> >>>I don't know of any unit tests specifically for this.  Mostly this
> >>>change will affect compilation.  In general though, if you don't see
> >>>unit tests for things that you think you might break, then it never
> >>>hurts to write more unit tests.
> >>>
> >>>>3) I've checked out the latest version of Lucene. Is there anything
> >>>>special I need to do if I get the go-ahead to check my stuff in (like
> >>>>a dev list review)?
> >>>
> >>>If you're not a regular committer then please send diffs to lucene-dev
> >>>before committing and give folks a few days to consider the changes.
> >>>
> >>>Doug
> >>>
> >>>
> >>>-- 
> >>>To unsubscribe, e-mail:   
> >>><mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> >>>For additional commands, e-mail: 
> >>><mailto:lucene-dev-help@jakarta.apache.org>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> >
> >
> 
> 
> 
> 




