Ah, sorry about bringing up performance; I mixed this up with another
thread.
Anyhow, I still think that setPositionIncrement offers a nice feature
that some people may want to use. It was on the to-do list for a while,
and it was there because people requested it, so even though Lucene
doesn't use setPositionIncrement internally, Lucene-based apps out there
may.
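
To make point 13 in CHANGES (quoted below) a bit more concrete, here is a
rough sketch of how position increments turn into token positions. It is
only an illustration under my own assumptions: SketchToken and
PositionIncrementSketch are made-up stand-ins, not Lucene's Token or any
real Lucene class, and the sample text "green leaves of grass" is
invented. I am also assuming that positions are counted by summing the
increments, starting just before the first token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token; this is NOT Lucene's Token class.
class SketchToken {
    private final String text;
    private int positionIncrement = 1; // default: one position after the previous token

    SketchToken(String text) {
        this.text = text;
    }

    void setPositionIncrement(int positionIncrement) {
        if (positionIncrement < 0) {
            throw new IllegalArgumentException("increment must be >= 0");
        }
        this.positionIncrement = positionIncrement;
    }

    int getPositionIncrement() {
        return positionIncrement;
    }

    String getText() {
        return text;
    }
}

class PositionIncrementSketch {
    // Absolute position of each token: a running sum of the increments,
    // starting at -1 so that a first token with increment 1 lands on 0.
    static List<Integer> positions(List<SketchToken> tokens) {
        List<Integer> result = new ArrayList<>();
        int position = -1;
        for (SketchToken token : tokens) {
            position += token.getPositionIncrement();
            result.add(position);
        }
        return result;
    }

    // Tokens an imagined analyzer might emit for "green leaves of grass".
    static List<SketchToken> sample() {
        SketchToken green = new SketchToken("green");

        // Two candidate stems for "leaves" share a single position.
        SketchToken leaf = new SketchToken("leaf");
        SketchToken leave = new SketchToken("leave");
        leave.setPositionIncrement(0);

        // The stop word "of" was dropped, so leave a gap of one position.
        SketchToken grass = new SketchToken("grass");
        grass.setPositionIncrement(2);

        List<SketchToken> tokens = new ArrayList<>();
        tokens.add(green);
        tokens.add(leaf);
        tokens.add(leave);
        tokens.add(grass);
        return tokens;
    }

    public static void main(String[] args) {
        List<SketchToken> tokens = sample();
        List<Integer> positions = positions(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println(tokens.get(i).getText() + " @ " + positions.get(i));
        }
    }
}
```

In this sketch "leaf" and "leave" both sit at position 1, and "grass"
lands at position 3 rather than 2, so an exact phrase "leaf grass" would
not line up across the removed stop word.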
Otis
--- stephane vaucher <vaucher@LUB.UMontreal.CA> wrote:
> I'm not sure I understand your question. I'm not trying to optimise
> anything. This thread was spawned because the usage of Token was
> unclear and inconsistent (I don't see the purpose of the
> package-scoped members). The result is that a few of us thought that
> an immutable Token might be clearer.
>
> The simplest change (I personally believe it's an essential change)
> is to make the members private.
> The second change, for the object to be immutable, would be to remove
> the positionIncrement, but since I'm no Lucene guru, I can't tell
> which is better (hence the email).
>
> I'll test the simple changes tonight to see if there is a sizable
> performance hit, and I'll wait to see if a guru speaks out about the
> controversial second change (which is also trivial).
>
> Stephane
>
> Otis Gospodnetic wrote:
>
> >It sounds to me that having the ability to do what point 13 in
> >CHANGES describes is more important than trying to only slightly
> >decrease the number of temporary objects instantiated.
> >
> >By the way, have you observed or measured the difference in
> >performance, memory consumption or anything else, before and after
> >your local changes?
> >Making Token immutable for performance reasons without such
> >measurements would be wrong.
> >
> >Thanks,
> >Otis
> >
> >
> >--- stephane vaucher <vaucher@LUB.UMontreal.CA> wrote:
> >
> >>I've noticed that there is a method public void
> >>setPositionIncrement(int positionIncrement) that would probably have
> >>to disappear for Token to be immutable. The CHANGES.txt doc seems to
> >>mention some good reasons why it was added, but there is no code in
> >>CVS that seems to depend on it.
> >>
> >> From CHANGES:
> >> 13. Added new method Token.setPositionIncrement().
> >>
> >> This permits, for the purpose of phrase searching, placing
> >> multiple terms in a single position. This is useful with
> >> stemmers that produce multiple possible stems for a word.
> >>
> >> This also permits the introduction of gaps between terms, so
> >> that terms which are adjacent in a token stream will not be
> >> matched by an exact phrase query. This makes it possible, e.g.,
> >> to build an analyzer where phrases are not matched over stop
> >> words which have been removed.
> >>
> >> Finally, repeating a token with an increment of zero can also
> >> be used to boost scores of matches on that token. (cutting)
> >>
> >>Any comments? With an immutable Token, does the positionIncrement
> >>still have a reason for being there? If not, then I'll remove
> >>getPositionIncrement as well.
> >>
> >>Stephane
> >>
> >>Doug Cutting wrote:
> >>
> >>>stephane vaucher wrote:
> >>>
> >>>>1) Does anyone mind? Will it break anything?
> >>>>
> >>>
> >>>It shouldn't break anything.
> >>>
> >>>>2) Are there unit tests for this? (particularly
> >>>>PorterStemFilter).
> >>>>The changes are obviously not spectacular, but I prefer not to
> >>>>screw everyone up...
> >>>>
> >>>
> >>>I don't know of any unit tests specifically for this. Mostly this
> >>>change will affect compilation. In general though, if you don't
> >>>see unit tests for things that you think you might break, then it
> >>>never hurts to write more unit tests.
> >>>
> >>>>3) I've checked out the latest version of Lucene; is there
> >>>>anything special I need to do if I get the go-ahead to check my
> >>>>stuff in (like a dev list review)?
> >>>>
> >>>
> >>>If you're not a regular committer then please send diffs to
> >>>lucene-dev before committing and give folks a few days to consider
> >>>the changes.
> >>>
> >>>Doug
> >>>
> >>>
> >>>--
> >>>To unsubscribe, e-mail:
> >>><mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> >>>For additional commands, e-mail:
> >>><mailto:lucene-dev-help@jakarta.apache.org>
> >>>
> >>>
> >>
> >
> >
> >__________________________________________________
> >Do you Yahoo!?
> >Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
> >http://mailplus.yahoo.com
> >