lucene-java-user mailing list archives

From jm <jmugur...@gmail.com>
Subject Re: questions on upgrading to 3.0: Version.LUCENE_* and Field.setOmitNorms()
Date Thu, 18 Feb 2010 15:00:32 GMT
Thanks for the replies, Ian and Robert. In my case, I am in a bit of an
uneasy position: I cannot reindex because the original docs are gone.

What would you recommend? I have to choose one value, but some
customers started using our system with Lucene 2.3, others with Lucene
2.4 and others with 2.9.
My usages of the problematic classes (in the older code) are:

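import java.io.Reader;
import java.util.Set;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;

public final class CustomStopAnalyzer extends Analyzer {
    private Set stopWords;

    public CustomStopAnalyzer() {
        this.stopWords =
            CustomStopFilter.makeStopSet(StopWordCustomList.getStopwords());
    }

    public TokenStream tokenStream(String fieldName, Reader reader) {
        // Chain: custom tokenizer (letters and digits, lower-cased) -> stop filter.
        return new CustomStopFilter(
            new LowerCaseLetterNumberTokenizer(reader), stopWords);
    }

    public static CustomStopAnalyzer getTestCustomAnalyzer(String lang) {
        // Calls a CustomStopAnalyzer(String) constructor that is not shown in this excerpt.
        CustomStopAnalyzer ca = new CustomStopAnalyzer(lang);
        return ca;
    }
}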

public class CustomStopFilter extends TokenFilter {...}
public final class LowerCaseLetterNumberTokenizer extends LetterTokenizer {
    public LowerCaseLetterNumberTokenizer(Reader in) {
        super(in);
    }

    protected boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c);
    }

    protected char normalize(char c) {
        return Character.toLowerCase(c);
    }
}
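
To make the terms produced by the chain above easier to inspect, here is a minimal plain-Java sketch of the same character rules. It uses no Lucene classes. TokenChainSketch, tokenize and removeStopWords are names made up for illustration, and because the body of CustomStopFilter is elided above, the stop-word step only assumes that it drops exact matches. Any token-length limit applied by the real tokenizer is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Plain-Java stand-in for the analyzer chain above; no Lucene classes are used.
final class TokenChainSketch {

    // Mirrors isTokenChar/normalize from LowerCaseLetterNumberTokenizer:
    // a token is a maximal run of letters or digits, lower-cased char by char.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Assumed behaviour of CustomStopFilter (its body is not shown above):
    // drop every token that appears in the stop set.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<String>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

For example, TokenChainSketch.tokenize("Lucene 2.9, then 3.0!") yields lucene, 2, 9, then, 3, 0. With "then" in the stop set, removeStopWords drops that one token.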

javi

On Thu, Feb 18, 2010 at 3:35 PM, Robert Muir <rcmuir@gmail.com> wrote:
> yes, if you use LUCENE_CURRENT, you may have to reindex (if any
> analyzers/tokenizers you are using have changed).
>
> if you use an actual version (for example LUCENE_30), you can upgrade your
> jar file to say a future 3.1 jar without reindexing, then later at your
> leisure (after testing/qa whatever you want), you can bump your version to
> LUCENE_31 and reindex.
>
> On Thu, Feb 18, 2010 at 9:24 AM, Ian Lea <ian.lea@gmail.com> wrote:
>
>> But typically you wouldn't need to reindex, would you?  From the 3.0
>> javadocs for LUCENE_CURRENT:
>>
>> WARNING: if you use this setting, and then upgrade to a newer release
>> of Lucene, sizable changes may happen. If precise back compatibility
>> is important then you should instead explicitly specify an actual
>> version.
>>
>> I read this as meaning that it is safe to use it unless you want
>> precise back compatibility and are prepared to accept the risk that
>> you may have to reindex.  When upgrading my code and indexes to 3.0
>> I've used LUCENE_CURRENT and haven't reindexed, and haven't noticed
>> any problems.
>>
>>
>> --
>> Ian.
>>
>>
>>
>> On Thu, Feb 18, 2010 at 1:20 PM, Robert Muir <rcmuir@gmail.com> wrote:
>> > Only use LUCENE_CURRENT if you do not care about backwards compatibility
>> > at all: e.g. you are perfectly happy re-indexing all data when you upgrade
>> > the lucene jar file in future.
>> >
>> > It's not about relying on quirks in previous versions of lucene; it's about
>> > being compatible with changes in future versions. You set it to LUCENE_30
>> > or whatever so that you can upgrade to the 3.1 jar without reindexing.
>> >
>> > On Thu, Feb 18, 2010 at 6:42 AM, Ian Lea <ian.lea@gmail.com> wrote:
>> >
>> >>
>> >> Unless you are relying on quirks in particular versions of lucene
>> >> setting it to LUCENE_CURRENT is probably best.
>> >>
>> >>
>> > --
>> > Robert Muir
>> > rcmuir@gmail.com
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>
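
The point in the quoted replies about pinning a version can be illustrated with a small sketch. The classes below are hypothetical stand-ins, not Lucene code: SketchVersion imitates a version constant, and VersionedSplitter imitates an analyzer whose tokenization changed between releases. The specific change (digits count as token characters only from V_31 on) is invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a library version constant; not Lucene's Version.
enum SketchVersion {
    V_30, V_31;

    // True when this version is the same as or newer than 'other'.
    boolean onOrAfter(SketchVersion other) {
        return compareTo(other) >= 0;
    }
}

// Hypothetical analyzer-like class as shipped in a newer jar: it carries both
// the old and the new behaviour and picks one from the pinned version.
final class VersionedSplitter {
    private final SketchVersion matchVersion;

    VersionedSplitter(SketchVersion matchVersion) {
        this.matchVersion = matchVersion;
    }

    // Invented change: digits are token characters only from V_31 on;
    // with V_30 pinned they still act as separators, as in the older release.
    private boolean isTokenChar(char c) {
        if (matchVersion.onOrAfter(SketchVersion.V_31)) {
            return Character.isLetterOrDigit(c);
        }
        return Character.isLetter(c);
    }

    List<String> split(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With V_30 pinned, split("abc123 def") gives abc, def even though the newer code is running; with V_31 it gives abc123, def, which would no longer match terms indexed under the old rule. A "current" setting behaves like always passing the newest constant, so the switch happens as soon as the jar is upgraded.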
