lucene-dev mailing list archives

From Matthew Hall <mh...@informatics.jax.org>
Subject Re: Lucene's default settings & back compatibility
Date Thu, 21 May 2009 16:54:27 GMT
Sorry, I wasn't quite sure what to call this new class you guys have 
been talking about.

I was referring to the class that's being discussed to encapsulate all
of the defaults for a given Lucene release (its caching strategies,
etc.).

I'm just not certain that something like a static list of words belongs
in a higher-level defaults class like the one you guys are talking about,
especially considering that anyone using a stop-enabled analyzer really
should be familiar with this list, and often needs to override it.

Meh, now that I'm actually typing it out, perhaps I'm incorrect here.
Assuming this class you guys are describing will be well advertised and
documented, maybe it will actually make it easier for end developers to
twiddle with this list, or at least make them more aware that it's even
something they have the ability to change.
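
To make the idea concrete, here is a minimal sketch of what such an
arrangement might look like. Everything in it is an assumption for
illustration only: StopwordDefaults, ENGLISH, ENGLISH_WITHOUT_THE and
SimpleStopFilter are hypothetical names, not existing Lucene classes,
and the word lists are deliberately tiny. It follows Mike's extreme
example: the corrected list becomes the default, while the old
"the"-less list stays reachable but deprecated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical holder for a release's stopword defaults (not a Lucene class).
final class StopwordDefaults {
    private StopwordDefaults() {}

    // The list as first shipped in this made-up scenario: it lacks "the".
    // Kept so upgraded users can match an index built with the old list.
    @Deprecated
    static final Set<String> ENGLISH_WITHOUT_THE =
            unmodifiable("a", "an", "and", "of");

    // The corrected default that new users would pick up automatically.
    static final Set<String> ENGLISH =
            unmodifiable("a", "an", "and", "of", "the");

    private static Set<String> unmodifiable(String... words) {
        return Collections.unmodifiableSet(
                new LinkedHashSet<>(Arrays.asList(words)));
    }
}

// Toy stand-in for a stop filter: drops any token found in the given set.
final class SimpleStopFilter {
    private final Set<String> stopwords;

    SimpleStopFilter(Set<String> stopwords) {
        this.stopwords = stopwords;
    }

    List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Compare case-insensitively; the sets hold lowercase words.
            if (!stopwords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

A new user would pass StopwordDefaults.ENGLISH to the filter, while
someone who upgraded without re-indexing would explicitly pass the
deprecated ENGLISH_WITHOUT_THE set so query-time analysis still matches
what is in the index.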

Matt

Michael McCandless wrote:
> What is the "lucene defaults class"?
>
> Mike
>
> On Thu, May 21, 2009 at 12:37 PM, Matthew Hall
> <mhall@informatics.jax.org> wrote:
>   
>> For extreme examples like this, couldn't the stopword list be encapsulated
>> into a single class that's used by the Lucene defaults class?
>>
>> That way if you folks released updates to mostly static content like a
>> stopword list, new or old users could get it easily with a simple drop in
>> fix?
>>
>> Just my two cents.
>>
>> Matt
>>
>> Michael McCandless wrote:
>>     
>>> On Thu, May 21, 2009 at 12:19 PM, Robert Muir <rcmuir@gmail.com> wrote:
>>>
>>>       
>>>> even as simple as changing default stopword list for some analyzer could
>>>> be
>>>> an issue, if the user doesn't re-index in response to that change.
>>>>
>>>>         
>>> OK, right.
>>>
>>> So say we forgot to include "the" in the default English stopwords
>>> list (yes, an extreme example...).
>>>
>>> Under the proposed changes 1 & 2 to back-compat policy, we would add
>>> "the" to the default stopword list, so new users get the fix, but
>>> still keep the the-less list accessible (deprecated).  We'd add an
>>> entry in CHANGES.txt saying this happened, and then show code on how
>>> to get back to the the-less stopword list.
>>>
>>> New users using that StopFilter would properly see "the" filtered out.
>>>  Users who upgraded would need to fix their code to switch back to the
>>> deprecated the-less list.
>>>
>>> Mike
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>>
>>>       
>> --
>> Matthew Hall
>> Software Engineer
>> Mouse Genome Informatics
>> mhall@informatics.jax.org
>> (207) 288-6012
>>


-- 
Matthew Hall
Software Engineer
Mouse Genome Informatics
mhall@informatics.jax.org
(207) 288-6012




