lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Upgrading from Solr 1.2 to 3.1/2 (was: Re: tentative! release notes drafts)
Date Fri, 03 Jun 2011 20:58:52 GMT
Hi,

Just to clarify some things:

a) IndexUpgrader has nothing to do with the problem of analyzer
incompatibility or changes in analyzers. The tool only upgrades all
segments of an index to the latest *file format*. This will be especially
important once we upgrade to Lucene 4.0, as old indexes may cause a
slowdown because of the so-called "surrogates dance" (reordering of terms
from UTF-16 to UTF-8 sort order). The IndexUpgrader tool does not change
index contents or indexed terms/postings/whatever. It just upgrades the file
format. It is also useful if you want to upgrade to Lucene 4.0 and your
index is pre-3.0. Because 1.x/2.x indexes will no longer be readable by
Lucene 4.0, IndexUpgrader makes it possible to first upgrade all index
segments to version 3.2, so it's safe to upgrade to 4.0 once that is
finished (see also my talk @ Lucene Revolution and also next week @ Berlin
Buzzwords: http://s.apache.org/vQ)
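
To illustrate why terms have to be reordered at all: UTF-16 code unit order
and UTF-8 byte order disagree for characters outside the Basic Multilingual
Plane. The following is only a standalone sketch using the plain JDK, not
Lucene code; the class name SortOrderDemo and the helper compareUtf8 are
made up for this example.

```java
import java.nio.charset.StandardCharsets;

final class SortOrderDemo {
    // Compare two strings by their UTF-8 encoding, byte by byte (unsigned).
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String bmp = "\uFF41";          // U+FF41, one UTF-16 code unit
        String supp = "\uD800\uDC00";   // U+10000, stored as a surrogate pair

        // String.compareTo compares UTF-16 code units: 0xFF41 > 0xD800.
        System.out.println("UTF-16 order: " + Integer.signum(bmp.compareTo(supp)));   // prints 1
        // In UTF-8 the lead bytes are 0xEF (U+FF41) and 0xF0 (U+10000).
        System.out.println("UTF-8 order:  " + Integer.signum(compareUtf8(bmp, supp))); // prints -1
    }
}
```

The same two terms sort in opposite order depending on the encoding, which
is the mismatch that the term reordering mentioned above has to bridge.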

b) Since Lucene 2.9 (and Solr 3.1; unfortunately Solr 1.4 did not support
this: Solr 1.4 always uses fixed Lucene 2.4 compatibility for all
analyzers, even though it ships with Lucene 2.9), you can configure
Tokenizers/TokenFilters/... to behave like previous Lucene versions. To make
the analysis chain of this old Lucene 2.1 index *compatible* with
Lucene/Solr 3.1+, you can pass a "matchVersion" property to lots of
Tokenizers/TokenFilters. To see what changed depending on the Lucene
version, see the javadocs of the corresponding class. One example is
http://lucene.apache.org/java/3_1_0/api/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
<-- you see that it changed its behavior several times since Lucene 2.0. The
same applies to Filters, too. Since Solr 3.1 you can pass this version
number to the Factory (see examples in SolrConfig). If you don’t specify a
luceneMatchVersion anywhere, the default is Lucene 2.4. Because of this,
it's safe to upgrade your Solr installation to 3.1/3.2: your schema.xml will
not contain a matchVersion anywhere, so the Analyzer components behave like
Lucene 2.4.
To be 100% sure, pass a 2.x luceneMatchVersion in the main solrconfig, so
everything will behave as closely as possible to that Lucene version and
analysis stays compatible. If you want the latest features of Solr outside
of analysis, you should pass a recent luceneMatchVersion in your main
config, but specify a previous version in your analyzer definitions in
schema.xml.
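
As a rough sketch of that last setup (recent version globally, older version
on the analysis chain), the two files could look something like the fragment
below. The field type name "text_legacy", the chosen factories and the
version values are assumptions for illustration only; check the example
configs shipped with your Solr release for the exact form.

```xml
<!-- solrconfig.xml: recent version for everything outside analysis -->
<luceneMatchVersion>LUCENE_32</luceneMatchVersion>

<!-- schema.xml: pin one field type's analysis chain to the older behavior.
     "text_legacy" is a hypothetical name used only in this example. -->
<fieldType name="text_legacy" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" luceneMatchVersion="LUCENE_24"/>
    <filter class="solr.LowerCaseFilterFactory" luceneMatchVersion="LUCENE_24"/>
  </analyzer>
</fieldType>
```

The per-factory attribute overrides the global setting only for that
component, so the rest of Solr still sees the recent version.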

I hope that helps,
Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Ryan McKinley [mailto:ryantxu@gmail.com]
> Sent: Friday, June 03, 2011 9:14 PM
> To: dev@lucene.apache.org
> Subject: Re: Upgrading from Solr 1.2 to 3.1/2 (was: Re: tentative! release
> notes drafts)
> 
> When upgrading Solr from 1.x to 3.x it is *highly* recommended to re-index
> your data.
> 
> The tool Uwe is referring to is a low level, advanced tool -- it *may* work
> to get your index updated, but unless you really know the inner workings of
> things I would strongly suggest just reindexing.
> 
> 
> On Fri, Jun 3, 2011 at 2:33 PM,  <johnmunir@aol.com> wrote:
> > I'm confused. So, if I'm upgrading from Solr 1.2 to 3.2 and I don't
> > change my <analyzer type="index"> and <analyzer type="query">
> > sections, I don't have to re-index my data?! I was told otherwise (i'm
> > trying to find that
> > post) because the analyzers have changed. In case it matters, i'm
> > using EnglishPorterFilterFactory.
> >
> >
> > -JM
> >
> > -----Original Message-----
> > From: Uwe Schindler <uwe@thetaphi.de>
> > To: dev@lucene.apache.org
> > Sent: Fri, Jun 3, 2011 12:04 pm
> > Subject: RE: tentative! release notes drafts
> >
> > You only need to reindex if you *change* the Analyzers to different
> > ones. If you keep the old schemas and make sure the tokenizer
> > luceneMatchVersion settings stay in place, you don’t need to.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> > From: johnmunir@aol.com [mailto:johnmunir@aol.com]
> > Sent: Friday, June 03, 2011 5:01 PM
> > To: dev@lucene.apache.org
> > Subject: Re: tentative! release notes drafts
> >
> > So, in my case, upgrading from Solr 1.2 to 3.2, I must re-index. OK, I
> > got that, thanks.
> >
> > Btw, where can I learn more about the "new IndexUpgrader tool"?  Is
> > there a doc/wiki for it?
> >
> > -JM
> >
> > -----Original Message-----
> > From: Uwe Schindler <uwe@thetaphi.de>
> > To: dev@lucene.apache.org
> > Sent: Fri, Jun 3, 2011 10:52 am
> > Subject: RE: tentative! release notes drafts
> >
> > Hi,
> >
> > When analyzers change behavior you have to reindex, there is no way
> > around it. The index-upgrader tool simply rewrites index segments to the
> > new file format, but does not change contents.
> >
> > Uwe
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> > From: johnmunir@aol.com [mailto:johnmunir@aol.com]
> > Sent: Friday, June 03, 2011 4:47 PM
> > To: dev@lucene.apache.org
> > Subject: Re: tentative! release notes drafts
> >
> > Hi,
> >
> > Looking at http://wiki.apache.org/lucene-java/ReleaseNote32, the item:
> >
> >  * A new IndexUpgrader tool fully converts an old index to the
> >    current format.
> >
> > Where can I learn more about this?  Is this for upgrading the Lucene
> > index from Lucene 2.1 to 3.2 due to the analyzer change?
> >
> > My question is around the fact that an index created using Solr 1.2 is
> > not compatible with Solr 3.2, and I'm trying to see if the above will
> > solve my problem or must I re-index.
> >
> > Thanks in advance!
> >
> > -JM
> >
> > -----Original Message-----
> > From: Robert Muir <rcmuir@gmail.com>
> > To: dev@lucene.apache.org
> > Sent: Thu, Jun 2, 2011 7:02 pm
> > Subject: tentative! release notes drafts
> >
> > Hello,
> >
> > While we await the results of the vote (72 hours), I thought it would
> > be good to propose something new for the release emails, by putting
> > something up on the wiki where anyone can just edit it. In the past,
> > we would email each other back and forth and I think that's not the
> > easiest way to build up a document.
> >
> > So, if you are motivated, please review and improve:
> >
> > http://wiki.apache.org/lucene-java/ReleaseNote32
> > http://wiki.apache.org/solr/ReleaseNote32
> >
> > Again, this is a little premature and tentative since the vote still
> > has not passed, but it looks like it stands a good chance of doing so,
> > and it would be good to incorporate more feedback... these little
> > "marketing"-type things probably deserve more attention than we
> > normally give them.
> >
> > Also for discussion: for the next release, we could consider copying
> > the templates to ReleaseNote33 and make these live documents that we
> > edit as we work towards the release, instead of at the end (we could
> > always fix them up at the end too). So if we add a really huge feature
> > we could add it when it's actually committed.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: dev-help@lucene.apache.org
> >
> 


