lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2368) stopword files should be versioned; accessor for default(s) should take a Version property
Date Tue, 06 Apr 2010 01:48:28 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853688#action_12853688 ]

Hoss Man commented on LUCENE-2368:
----------------------------------

This is something I brought up with Robert on IRC a few days ago, and forgot to file an issue
for...

* We should make all the language-specific stopword files have something in their name that
identifies them, so we can add newer versions of them over time with distinguished names.  The
simplest convention moving forward would probably be to name the file after the first Lucene
version it was added in (e.g. "russian_stop_3_3.txt"), but there is no reason why the names
have to directly correspond to the Lucene Version -- they could just as easily have completely
sequential names (e.g. "russian_stop_001.txt" or "russian_stop_AAA.txt").

* All of the static "getDefaultStopSet()" methods in all of the various Analyzers should be
changed to take in a Version param which picks the appropriate file (or statically compiled
set) based on the param.  Any Analyzer that already has Version-based stopword switching logic
in its constructor should instead just delegate to the getDefaultStopSet() method.
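
A minimal sketch of what the two points above could look like together. It does not use
Lucene itself: the Version enum, the class names, the "russian_stop_3_0.txt" file name and
the word lists are all hypothetical placeholders; only "russian_stop_3_3.txt" and the
method name getDefaultStopSet() come from the comment above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class VersionedStopwordsSketch {

    /** Hypothetical stand-in for Lucene's Version enum; the constants are illustrative. */
    enum Version { LUCENE_30, LUCENE_31, LUCENE_33 }

    // Placeholder word lists, NOT the contents of any real Lucene stopword file.
    private static final Set<String> STOP_3_0 = unmodifiable("i", "v", "ne");
    private static final Set<String> STOP_3_3 = unmodifiable("i", "v", "ne", "na");

    private static Set<String> unmodifiable(String... words) {
        return Collections.unmodifiableSet(new HashSet<String>(Arrays.asList(words)));
    }

    /** Picks the file named after the newest stopword revision not newer than matchVersion. */
    static String stopwordResourceName(Version matchVersion) {
        return matchVersion.compareTo(Version.LUCENE_33) >= 0
                ? "russian_stop_3_3.txt"   // name taken from the example in the comment
                : "russian_stop_3_0.txt";  // assumed name for the older revision
    }

    /** Version-aware accessor: the single place where the switching logic lives. */
    static Set<String> getDefaultStopSet(Version matchVersion) {
        return matchVersion.compareTo(Version.LUCENE_33) >= 0 ? STOP_3_3 : STOP_3_0;
    }

    /** Hypothetical analyzer whose constructor delegates instead of switching itself. */
    static final class RussianAnalyzerSketch {
        final Set<String> stopwords;

        RussianAnalyzerSketch(Version matchVersion) {
            this.stopwords = getDefaultStopSet(matchVersion);
        }
    }

    public static void main(String[] args) {
        for (Version v : Version.values()) {
            System.out.println(v + " -> " + stopwordResourceName(v)
                    + " (" + getDefaultStopSet(v).size() + " words)");
        }
    }
}
```

With this shape, client code that asks for an older Version keeps getting the older set even
after a newer file is added, which is the back-compat property the issue is after.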



> stopword files should be versioned; accessor for default(s) should take a Version property
> ------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2368
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2368
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Hoss Man
>             Fix For: 2.3.3
>
>
> The existing language-specific stopword files on the trunk have no version info in their
filenames -- this will make it awkward/confusing to update them as time goes on.  Likewise,
many classes have a "getDefaultStopSet()" method, and these methods (when called by client
code) suffer from the same API back-compat issues that the Analyzers themselves did before
we added Version.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

