lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: SOLR Cloud Rebuild core
Date Sat, 14 Jun 2014 20:02:39 GMT
Exactly. Do one commit at the end. I do this for indexes with 4 million or more documents.
Works fine.

I don't know of a way to flush the queued add-document commands. Solr does not have the concept
of a single transaction that can be dropped.
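
A minimal sketch of the "one commit at the end" rebuild, not a definitive implementation. It assumes Solr's JSON update handler at `/update`, and that the collection's `autoSoftCommit` is disabled and `autoCommit` uses `openSearcher=false`, so nothing becomes visible to queries before the explicit commit. The base URL, the collection name `products`, and the helper names are hypothetical. It makes no attempt to discard pending documents if the rebuild fails partway.

```python
import json
import urllib.request
from typing import Dict, Iterator, List, Tuple


def batches(docs: List[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `docs`, each with at most `size` items."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def build_rebuild_requests(base_url: str, collection: str,
                           docs: List[Dict],
                           batch_size: int = 1000) -> List[Tuple[str, str]]:
    """Return the (url, json_body) pairs for a full rebuild, in order.

    Order: delete everything, add all documents in batches, then a single
    commit. Only the last request asks Solr to commit.
    """
    update_url = f"{base_url}/{collection}/update"
    plan = [(update_url, json.dumps({"delete": {"query": "*:*"}}))]
    for batch in batches(docs, batch_size):
        plan.append((update_url, json.dumps(batch)))
    plan.append((update_url, json.dumps({"commit": {}})))
    return plan


def send(plan: List[Tuple[str, str]]) -> None:
    """POST each request in order; raises on the first HTTP error."""
    for url, body in plan:
        req = urllib.request.Request(
            url,
            data=body.encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()


if __name__ == "__main__":
    # Hypothetical documents and endpoint, for illustration only.
    example_docs = [{"id": str(i), "name": f"doc {i}"} for i in range(5)]
    for url, body in build_rebuild_requests(
            "http://localhost:8983/solr", "products", example_docs, 2):
        print(url, body)
```

Building the request list separately from sending it makes the ordering easy to inspect before anything touches a live index.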

wunder

On Jun 14, 2014, at 12:57 PM, "Branham, Jeremy [HR]" <Jeremy.D.Branham@sprint.com> wrote:

> Thanks Walter -
> 
> We have a few different use cases where there is no good way [currently] to uniquely
> identify a document.
> We also have some cores where real-time freshness is key.
> I'd rather move to cloud than have 2 different instances.
> 
> Since some cores will need to have all documents deleted during the rebuild, is there a
> way to do a complete rebuild without disturbing queries to the core?
> Maybe postponing the commit to the end of the rebuild process?
> 
> I'm thinking the commit would then delete all existing documents and add all the new documents.
> Is this flawed?
> 
> Assuming that would work - if the reindexing fails could we bail on the commit and still
> have a sane core?
> 
> Jeremy Branham
> 
> -----Original Message-----
> From: Walter Underwood [mailto:wunder@wunderwood.org]
> Sent: Saturday, June 14, 2014 2:44 PM
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR Cloud Rebuild core
> 
> If you don't need near real-time index freshness, why are you implementing Solr Cloud?
> It is harder to set up and adds functionality you are not using. Solr Cloud is designed for
> a fully live index, not offline indexing.
> 
> Also, I don't understand why people do this complicated offline build and swapping stuff.
> That is precisely what replication already does for you and it is built-in. You build an index
> on one system and swap it in over the network.
> 
> wunder
> 
> On Jun 14, 2014, at 12:29 PM, "Branham, Jeremy [HR]" <Jeremy.D.Branham@sprint.com> wrote:
> 
>> We are looking to move from legacy master/slave configuration to the cloud configuration.
>> 
>> In the past we have handled rebuilding cores by using a 'live' core and a core for
>> performing the rebuild on.
>> When a rebuild is complete, we swap the rebuilt core with the live core.
>> 
>> Is this still a good way to do offline rebuilding when using cloud?
>> 
>> Full re-indexing on our largest index only takes 25 min.
>> 
>> Thanks!
>> 
>> 
>> Jeremy Branham
>> 
> 

--
Walter Underwood
wunder@wunderwood.org



