lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Proposal about Version API "relaxation"
Date Thu, 15 Apr 2010 19:25:03 GMT
Earwin, the reason online migration is faster is that by the time you
finally need to *fully* migrate your index, chances are that most
of the segments are already in the newer format. Offline migration
keeps the application idle until ALL segments are migrated.

During the lifecycle of the index, segments are merged anyway, so
migrating them on the fly costs virtually nothing. In the end, when you
upgrade to a Lucene version which doesn't support the previous index
format, you'll in the worst case need to migrate a few large segments
which were never merged. I don't know how many of those there will be,
as it really depends on the application, but I'd bet this process will
touch just a few segments. Hence, throughput-wise it will be a lot
faster.

We should create a migrate() API on IW which will touch just the
segments still in the old format and not incur a full optimize. That
API can also be used for an offline migration tool, if we decide that's
what we want.
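
To make the idea concrete, here is a rough sketch of the selection step
such a migrate() could perform. It is NOT Lucene code: the Segment class,
the integer format version and segmentsToMigrate() are all hypothetical
stand-ins, and the real thing would read this from the segment metadata.
The only point is that the work is proportional to the old-format
segments, not to the whole index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Lucene's API: it only illustrates picking the
// segments that are still in an old format instead of rewriting all of them.
public class MigrationSketch {

    /** Minimal stand-in for per-segment metadata (assumed, for illustration). */
    static final class Segment {
        final String name;
        final int formatVersion;

        Segment(String name, int formatVersion) {
            this.name = name;
            this.formatVersion = formatVersion;
        }
    }

    /** Returns only the segments written in a format older than currentFormat. */
    static List<Segment> segmentsToMigrate(List<Segment> segments, int currentFormat) {
        List<Segment> stale = new ArrayList<>();
        for (Segment s : segments) {
            if (s.formatVersion < currentFormat) {
                stale.add(s);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // One large, never-merged old segment plus two that were already
        // rewritten in the new format by ordinary merges.
        List<Segment> index = List.of(
                new Segment("_0", 1),
                new Segment("_7", 2),
                new Segment("_9", 2));

        for (Segment s : segmentsToMigrate(index, 2)) {
            System.out.println("would migrate " + s.name);
        }
    }
}
```

With the three segments above, only _0 is selected; the two segments
that ordinary merging already rewrote are left alone, which is the
difference from a full optimize.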

Shai

On Thursday, April 15, 2010, jm <jmuguruza@gmail.com> wrote:
> Not sure if plain users are allowed/encouraged to post in this list,
> but wanted to mention (just an opinion from a happy user), as other
> users have, that not all of us can reindex just like that. It would
> not be 10 min for one of our installations for sure...
>
> First, I would need to implement some code to reindex, because my source
> data is postprocessed/compressed/encrypted/moved after it arrives at
> the application, so I would need to retrieve it all, etc. And then
> reindexing it would take days.
> javier
>
> On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <earwin@gmail.com> wrote:
>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>>> manual migration on the segments that are still on old versions.
>>> That's not the point about whether optimize() is good or not. It is
>>> the difference between telling the customer to run a 5-day migration
>>> process, or a couple of hours. At the end of the day, the same
>>> migration code will need to be written whether for the manual or
>>> automatic case. And probably by the same developer who changed the
>>> index format. The difference is when it happens.
>>
>> Converting stuff is easier than emulating, that's exactly why I want a
>> separate tool.
>> There's no need to support cross-version merging, nor to emulate old APIs.
>>
>> I also don't understand why offline migration is going to take days
>> instead of hours for online migration??
>> WTF, it's gonna be even faster, as it doesn't have to merge things.
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>
