lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Burton-West, Tom" <tburt...@umich.edu>
Subject RE: Does change to ICU in Lucene/Solr 3.3 require re-indexing?
Date Thu, 14 Jul 2011 19:34:31 GMT
Thanks Robert,

Looks like we indexed with icu4j-4_4_2.jar, which I assume is a 4.4 version using unicode
5.2

3.1 dev: icu4j-4_4_2.jar
3.3:     icu4j-4_8.jar

So do I just put the icu4j-4_4_2.jar in $SOLR_HOME/lib alongside the lucene-icu-3.1-SNAPSHOT.jar?

Is there any easy way to test?

Sounds like we will need to re-index the whole 9 million books with the Solr/Lucene 3.3 (4.8
jar) to be on the safe side.


Tom
-----Original Message-----
From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Thursday, July 14, 2011 2:29 PM
To: java-user@lucene.apache.org
Subject: Re: Does change to ICU in Lucene/Solr 3.3 require re-indexing?

It could be the case, but I am not sure what version of icu jar you
had before without looking thru svn logs.

if you are currently using 4.6, you are probably ok, as that was when
the unicode version was bumped to 6.0.
most of the rules etc are driven by the unicode version itself.

I would suggest just using your old icu jar and lucene-icu.jar until
you yourself want to upgrade... its not guaranteed to work but I
suspect it will :)

On Thu, Jul 14, 2011 at 2:08 PM, Burton-West, Tom <tburtonw@umich.edu> wrote:
> We are about to upgrade to Solr/Lucene 3.3 from a 3.1dev version (Lucene Implementation
Version: 3.1-SNAPSHOT 1036094 - 2010-11-19 16:01:10)
>
> We have a 6 TB + index that includes somewhere over 200 languages that was indexed with
the ICUTokenizer and ICUFoldingFilter from  3.1dev and would like to avoid re-indexing if
possible.
>
> LUCENE-3149<http://issues.apache.org/jira/browse/LUCENE-3149>: Upgrade contrib/icu's
ICU jar file to ICU 4.8.
> I couldn't tell from looking at the release notes from ICU 4.8 whether the changes affected
internal API's or actual rules for tokenizing or folding
>
> Do the changes to the ICU filters/tokenizers in Solr/Lucene 3.3 change how tokenizing
and the folding filter work in terms of queries run through the 3.3 filters possibly not matching
documents indexed with the 3.1dev filters?
>
> Tom Burton-West
>
>
>



-- 
lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Mime
View raw message