lucene-solr-user mailing list archives

From "Kelly, Frank" <frank.ke...@here.com>
Subject Re: Copying SolrCloud collections (Replication? Backup/Restore?)
Date Fri, 10 Feb 2017 18:41:56 GMT
Thanks Erick for that idea and the fast response


Cheers!

F

On 2/10/17, 1:24 PM, "Erick Erickson" <erickerickson@gmail.com> wrote:

>First, perhaps the slickest way to reindex without as much downtime would
>be to just index to a _new_ collection, then use "collection aliasing" to
>point incoming requests for the old collection at the new one. True, you do
>need extra hardware....
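
For illustration, the aliasing step might look like the sketch below. It only builds the Collections API CREATEALIAS request URL and sends nothing. The host, the alias name "products", and the collection name "products_v2" are made-up placeholders, not names from this thread; it assumes clients query the alias name rather than a collection name directly.

```python
from urllib.parse import urlencode


def create_alias_url(solr_base: str, alias: str, collection: str) -> str:
    """Build a Collections API CREATEALIAS request URL (nothing is sent)."""
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{solr_base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical names: clients query "products", and the freshly
    # reindexed data lives in the new collection "products_v2".
    print(create_alias_url("http://localhost:8983/solr", "products", "products_v2"))
```

Issuing the same request again with a different collection name repoints the alias, which is what makes the switch quick once the new collection is fully indexed.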
>
>But that aside, Solr (well, Lucene really) indexes are just files. There's
>a collection-wide backup/restore, but check the PDF for your Solr version
>to see if it's available to you.
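
Where the collection-wide backup/restore is available, the two requests could be built roughly as below. This is a sketch under assumptions: the host, collection names, backup name, and location path are placeholders, and the location is assumed to be a directory that the Solr nodes handling the request can all reach. Whether the BACKUP and RESTORE actions exist depends on the Solr version, as noted above.

```python
from urllib.parse import urlencode


def collections_api_url(solr_base: str, action: str, **params: str) -> str:
    """Build a Collections API URL for the given action (nothing is sent)."""
    query = {"action": action, **params}
    return f"{solr_base}/admin/collections?{urlencode(query)}"


def backup_url(solr_base: str, collection: str, backup_name: str, location: str) -> str:
    """URL asking Solr to back up `collection` under `backup_name` at `location`."""
    return collections_api_url(
        solr_base, "BACKUP", name=backup_name, collection=collection, location=location
    )


def restore_url(solr_base: str, collection: str, backup_name: str, location: str) -> str:
    """URL asking Solr to restore `backup_name` from `location` into `collection`."""
    return collections_api_url(
        solr_base, "RESTORE", name=backup_name, collection=collection, location=location
    )


if __name__ == "__main__":
    # All names and paths here are hypothetical placeholders.
    print(backup_url("http://region1:8983/solr", "products_v2", "products_snap", "/mnt/solr_backups"))
    print(restore_url("http://region2:8983/solr", "products_v2", "products_snap", "/mnt/solr_backups"))
```

Between the two calls, the backup directory would still have to be made available to the cluster in the other region.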
>
>Beyond that, just copy things around. So here's a process, modify as you
>see fit:
>1> index to your new collection in region 1
>2> in region 2, create a new collection with the same number of shards (no
>followers, leader-only).
>3> with the Solr instances in region 2 down, copy the data dir from your
>servers in region 1 to the corresponding data dir on your servers in region
>2. It is _very_ important that the hash ranges match. If you look at your
>state.json you'll see an entry for each shard like
>"range":"80000000-ffffffff". The hash range on the source must exactly
>match the hash range on the destination in region 2. Double check this, as
>you basically copy from collection_shard1_replica1...data (on region 1) to
>collection_shard1_replica1...data on region 2.
>4> Once this is done for all shards, bring up Solr on region 2 and verify
>it's as you expect.
>5> Use the Collections API to ADDREPLICA in region 2 to build out your
>collection. ADDREPLICA will automatically copy the index from the
>leader.
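
The hash-range check in step 3 and the ADDREPLICA calls in step 5 can be sketched as follows. This assumes each region's state.json has already been fetched and parsed into a dict, and that the collection has the same name in both regions; the collection name, host, and sample ranges are invented for the example. Nothing is sent to Solr, the script only compares ranges and prints request URLs.

```python
from urllib.parse import urlencode


def shard_ranges(state: dict, collection: str) -> dict:
    """Map shard name -> hash range string from a parsed state.json."""
    shards = state[collection]["shards"]
    return {name: shard.get("range") for name, shard in shards.items()}


def mismatched_shards(src_state: dict, dst_state: dict, collection: str) -> list:
    """Return shard names whose range differs, or that exist on one side only."""
    src = shard_ranges(src_state, collection)
    dst = shard_ranges(dst_state, collection)
    return sorted(n for n in src.keys() | dst.keys() if src.get(n) != dst.get(n))


def add_replica_url(solr_base: str, collection: str, shard: str) -> str:
    """Build a Collections API ADDREPLICA request URL (nothing is sent)."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    return f"{solr_base}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical, trimmed-down state.json contents for the two regions.
    region1 = {"products_v2": {"shards": {
        "shard1": {"range": "80000000-ffffffff"},
        "shard2": {"range": "0-7fffffff"},
    }}}
    region2 = {"products_v2": {"shards": {
        "shard1": {"range": "80000000-ffffffff"},
        "shard2": {"range": "0-7fffffff"},
    }}}

    bad = mismatched_shards(region1, region2, "products_v2")
    if bad:
        print("Do not copy; hash ranges differ for:", bad)
    else:
        # Ranges line up, so after the copy each shard can get more replicas.
        for shard in sorted(shard_ranges(region2, "products_v2")):
            print(add_replica_url("http://region2:8983/solr", "products_v2", shard))
```

Running the comparison before copying any data dirs is the cheap part; a mismatch found afterwards would mean documents sitting in the wrong shard.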
>
>Best,
>Erick
>
>On Fri, Feb 10, 2017 at 10:12 AM, Kelly, Frank <frank.kelly@here.com>
>wrote:
>
>> Hello,
>>
>>   We have 100M+ documents across 2 collections and need to reindex the
>> entirety of the collections as we need to turn on "docValues":true on a
>> number of fields (see previous emails from this week :-] ).
>> Unfortunately we have 4 AWS regions, each with its own SolrCloud cluster,
>> each with its own copy of the entire search index.
>> So we have to do this reindex 4 times, and in each case we have to take
>> down each region as we need to delete the collection. And reindexing
>> takes about 2-3 days.
>>
>> Is there some way we can reindex in one (offline) region and then use
>> some mechanism (replication? backup/restore? EBS snapshot?) to "copy and
>> paste" a known Solr state from one SolrCloud instance to another?
>> From that state we'd then just reindex the delta (from when the snapshot
>> was taken to now).
>>
>> Appreciate any thoughts or ideas or hear how other folks do it,
>>
>> Thanks!
>>
>> -Frank
>>
>> *Frank Kelly*
>>
>> *Principal Software Engineer*
>>
>>
>>
>> HERE
>>
>> 5 Wayside Rd, Burlington, MA 01803, USA
>>
>> *42° 29' 7" N 71° 11' 32" W*

