lucene-dev mailing list archives

From stephane vaucher <vauc...@LUB.UMontreal.CA>
Subject Re: Should Token be immutable?
Date Mon, 06 Jan 2003 23:17:32 GMT
I'm not sure I understand your question. I'm not trying to optimise 
anything. This thread was spawned because the usage of Token was unclear 
and inconsistent (I don't see the purpose of package-scoped members 
here). As a result, a few of us thought that an immutable Token might 
be clearer.

The simplest change (I personally believe it's an essential one) is to 
make the members private.
The second change, needed for the object to be immutable, would be to 
remove the positionIncrement, but since I'm no Lucene guru, I can't tell 
which is better (hence the email).
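
To make the first change concrete, here is a minimal sketch of what an 
immutable token could look like. It is not the actual Token source: the 
class name ImmutableTokenSketch and the withTermText helper are 
hypothetical, and the four fields are only assumed to mirror Token's 
existing members. It leaves out positionIncrement, since whether that 
should stay is the open question.

```java
import java.util.Locale;

// Hypothetical sketch only: this is NOT org.apache.lucene.analysis.Token.
// It illustrates private, final members with read-only accessors.
final class ImmutableTokenSketch {
    // Assumed to mirror Token's members; private and final, so they can
    // only be set once, in the constructor.
    private final String termText;
    private final int startOffset;
    private final int endOffset;
    private final String type;

    ImmutableTokenSketch(String termText, int startOffset, int endOffset, String type) {
        this.termText = termText;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        this.type = type;
    }

    String termText() { return termText; }
    int startOffset() { return startOffset; }
    int endOffset() { return endOffset; }
    String type() { return type; }

    // Hypothetical helper: a filter that wants different text builds a
    // new token instead of assigning to a field of the old one.
    ImmutableTokenSketch withTermText(String newText) {
        return new ImmutableTokenSketch(newText, startOffset, endOffset, type);
    }

    public static void main(String[] args) {
        ImmutableTokenSketch original = new ImmutableTokenSketch("Running", 0, 7, "word");
        ImmutableTokenSketch lowered =
            original.withTermText(original.termText().toLowerCase(Locale.ROOT));
        // The original token is untouched; only the copy carries the new text.
        System.out.println(original.termText() + " -> " + lowered.termText());
    }
}
```

The cost of this design is one extra object per rewritten token, which 
is the kind of overhead the performance test is meant to check.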

I'll test the simple changes tonight to see if there is a sizable 
performance hit, and I'll wait to see if a guru speaks out about the 
controversial second change (which is also trivial).

Stephane

Otis Gospodnetic wrote:

>It sounds to me that having the ability to do what point 13 in
>CHANGES describes is more important than trying to only slightly decrease
>the number of temporary objects instantiated.
>
>By the way, have you observed or measured the difference in
>performance, memory consumption or anything else, before and after your
>local changes?
>Not having those and making Token immutable for performance reasons
>would be wrong.
>
>Thanks,
>Otis
>
>
>--- stephane vaucher <vaucher@LUB.UMontreal.CA> wrote:
>
>>I've noticed that there is a method public void
>>setPositionIncrement(int positionIncrement) that would probably have
>>to disappear for Token to be immutable. The CHANGES.txt doc seems to
>>mention some good reasons why it was added, but there is no code in
>>CVS that seems to depend on it.
>>
>> From CHANGES:
>> 13. Added new method Token.setPositionIncrement().
>>
>>     This permits, for the purpose of phrase searching, placing
>>     multiple terms in a single position.  This is useful with
>>     stemmers that produce multiple possible stems for a word.
>>
>>     This also permits the introduction of gaps between terms, so that
>>     terms which are adjacent in a token stream will not be matched by
>>     an exact phrase query.  This makes it possible, e.g., to build
>>     an analyzer where phrases are not matched over stop words which
>>     have been removed.
>>
>>     Finally, repeating a token with an increment of zero can also be
>>     used to boost scores of matches on that token.  (cutting)
>>
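
The effect described in item 13 can be sketched as a running sum. This 
is an illustration under assumptions, not Lucene code: the class 
PositionIncrementSketch and its positions method are hypothetical, and 
the sketch assumes that a token's position is the previous position 
plus its increment, with a first token of increment 1 landing at 
position 0.

```java
import java.util.Arrays;

// Hypothetical illustration of how position increments could map to
// term positions. Not taken from the Lucene source.
final class PositionIncrementSketch {
    // Assumption: position = previous position + increment, starting so
    // that a first token with increment 1 ends up at position 0.
    static int[] positions(int[] increments) {
        int[] result = new int[increments.length];
        int position = -1;
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            result[i] = position;
        }
        return result;
    }

    public static void main(String[] args) {
        // "quick" (increment 1), "fast" stacked on it (increment 0), then
        // "fox" after one removed stop word (increment 2, leaving a gap).
        int[] increments = {1, 0, 2};
        System.out.println(Arrays.toString(positions(increments))); // prints [0, 0, 2]
    }
}
```

An increment of zero puts two terms in the same position, and an 
increment above one leaves a gap, so an exact phrase query would not 
treat "quick" and "fox" as adjacent.
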
>>Any comments? With an immutable Token, does the positionIncrement
>>still have a reason for being there? If not, then I'll remove
>>getPositionIncrement as well.
>>
>>Stephane
>>
>>Doug Cutting wrote:
>>
>>>stephane vaucher wrote:
>>>
>>>>1) Does anyone mind? Will it break anything?
>>>
>>>It shouldn't break anything.
>>>
>>>>2) Are there unit tests for this? (particularly PorterStemFilter).
>>>>The changes are obviously not spectacular, but I prefer not to screw
>>>>everyone up...
>>>
>>>I don't know of any unit tests specifically for this.  Mostly this
>>>change will affect compilation.  In general though, if you don't see
>>>unit tests for things that you think you might break, then it never
>>>hurts to write more unit tests.
>>>
>>>>3) I've checked-out the latest version of lucene, is there anything
>>>>special I need to do if I get the go ahead to check my stuff in (like
>>>>a dev list review)?
>>>
>>>If you're not a regular committer then please send diffs to lucene-dev
>>>before committing and give folks a few days to consider the changes.
>>>
>>>Doug
>>>
>>>
>>>-- 
>>>To unsubscribe, e-mail:   
>>><mailto:lucene-dev-unsubscribe@jakarta.apache.org>
>>>For additional commands, e-mail: 
>>><mailto:lucene-dev-help@jakarta.apache.org>