lucene-solr-user mailing list archives

From Jeff Wartes <jwar...@whitepages.com>
Subject Re: Copying a SolrCloud collection to other hosts
Date Wed, 28 Mar 2018 21:27:25 GMT

I really like the fetchindex approach. Once I figured out the undocumented API, it worked
really well, and I haven't had to change my usage for any Solr version I've tried from 4.7
through 7.2. I recall having some issues if I tried to apply a fetchindex to a shard that
already had data, where it'd get confused about whether it was already newer. But if you're
using it against a clean index, it's pretty slick, and about the only way to cleanly copy a
collection to a different cluster (using a different ZK) without a shared filesystem.
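For reference, the fetchindex call described above looks roughly like this. It uses the replication handler's documented fetchindex command; the host names and the core name below are hypothetical, and the replica suffix depends on what your cluster actually created:

```shell
# Hedged sketch: ask a core on the target cluster to pull its index
# directly from the matching core on the source cluster.
# Host and core names here are hypothetical placeholders.
SRC="http://source-host:8983/solr/main_index_shard1_replica1"
DST="http://target-host:8983/solr/main_index_shard1_replica1"

# Trigger the pull (the target core should be empty for a clean copy):
curl -s "$DST/replication?command=fetchindex&masterUrl=$SRC/replication" \
  || echo "fetchindex request failed"

# Check progress; isReplicating in the response flips to false when done:
curl -s "$DST/replication?command=details&wt=json" \
  || echo "details request failed"
```

Repeat per shard, pairing each target core with the source core for the same shard.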

In my tool, I rigged it so that if you were copying into a new collection with RF > 1,
it'd copy each shard's replica1 in the new collection from the old collection, and then copy
replica2 from replica1 in the *new* collection, saving theoretical cross-cluster bandwidth,
which is the CDCR concern too. (fetchindex apparently doesn't trigger the standard
replication, since it doesn't go through the tlog.)

In short, if the target collection is getting writes during this process, you're probably
going to have issues. If not, though, I agree it'd be pretty simple to set up. You just need
to make sure the destination cluster can talk to the source cluster, and that the target
collection has the same shard count and routing strategy.
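A quick way to verify that precondition on both sides is the Collections API CLUSTERSTATUS call; here's a hedged sketch with hypothetical host names (the grep is just a crude eyeball check on the JSON, not a robust parser):

```shell
# Hedged sketch: compare the router (and, by inspecting the full output,
# the shard count) on source and target before copying.
# Host names and the collection name are hypothetical.
SRC_HOST="http://source-host:8983"
DST_HOST="http://target-host:8983"
COLL="main_index"

for host in "$SRC_HOST" "$DST_HOST"; do
  # CLUSTERSTATUS returns the collection's shards and router in JSON;
  # pull the router name out for a quick comparison.
  curl -s "$host/solr/admin/collections?action=CLUSTERSTATUS&collection=$COLL&wt=json" \
    | grep -o '"router":{"name":"[^"]*"' \
    || echo "could not query $host"
done
```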


On 3/28/18, 10:06 AM, "Erick Erickson" <erickerickson@gmail.com> wrote:

    Hmmm, wouldn't even be all that hard, would it? A collections API call.
    
    Assuming both collections' state.json nodes were available from ZooKeeper,
    a command would have all the necessary information; only an HTTP connection
    would be required. I don't think it would be too much of a stretch to be able
    to provide the other collection's ZooKeeper ensemble if they were different.
    
    I used "other collection" purposely here; among the many details to be
    worked out would be whether this was run against the source or target
    collection, and how all the replicas on the target collection would be
    updated...
    
    There may even be infrastructure already in place for CDCR that could
    be leveraged since the bootstrapping already does this, even across
    data centers.
    
    Not that I'm going to have time to work on it in the near term, though.
    
    On Wed, Mar 28, 2018 at 9:53 AM, David Smiley <david.w.smiley@gmail.com> wrote:
    > Right, there is a shared filesystem requirement.  It would be nice if this
    > Solr feature could be enhanced to have more options like backing up
    > directly to another SolrCloud using replication/fetchIndex like your cool
    > solrcloud_manager thing.
    >
    > On Wed, Mar 28, 2018 at 12:34 PM Jeff Wartes <jwartes@whitepages.com> wrote:
    >
    >> The backup/restore still requires setting up a shared filesystem on all
    >> your nodes though right?
    >>
    >> I've been using the fetchindex trick in my solrcloud_manager tool for ages
    >> now: https://github.com/whitepages/solrcloud_manager#cluster-commands
    >> Some of the original features in that tool have been incorporated into
    >> Solr itself these days, but I still use clonecollection/copycollection
    >> regularly. (most recently with Solr 7.2)
    >>
    >>
    >> On 3/27/18, 9:55 PM, "David Smiley" <david.w.smiley@gmail.com> wrote:
    >>
    >>     The backup/restore API is intended to address this.
    >>
    >> https://builds.apache.org/job/Solr-reference-guide-master/javadoc/making-and-restoring-backups.html
    >>
    >>     Erick's advice is good (and I once drafted docs for the same
    >>     scheme years ago as well), but I consider it dated -- it's what
    >>     people had to do before the backup/restore API existed.
    >>     Internally, backup/restore is doing similar stuff.  It's easy to
    >>     give backup/restore a try; surely you have by now?
    >>
    >>     ~ David
    >>
    >>     On Tue, Mar 6, 2018 at 9:47 AM Patrick Schemitz <ps@solute.de> wrote:
    >>
    >>     > Hi List,
    >>     >
    >>     > so I'm running a bunch of SolrCloud clusters (each cluster is:
    >>     > 8 shards on 2 servers, with 4 instances per server, no replicas,
    >>     > i.e. 1 shard per instance).
    >>     >
    >>     > Building the index afresh takes 15+ hours, so when I have to
    >>     > deploy a new index, I build it once, on one cluster, and then
    >>     > copy (scp) over the data/<main_index>/index directories
    >>     > (shutting down the Solr instances first).
    >>     >
    >>     > I could get Solr 6.5.1 to number the shard/replica directories
    >>     > nicely via the createNodeSet and createNodeSet.shuffle options:
    >>     >
    >>     > Solr 6.5.1 /var/lib/solr:
    >>     >
    >>     > Server node 1:
    >>     > instance00/data/main_index_shard1_replica1
    >>     > instance01/data/main_index_shard2_replica1
    >>     > instance02/data/main_index_shard3_replica1
    >>     > instance03/data/main_index_shard4_replica1
    >>     >
    >>     > Server node 2:
    >>     > instance00/data/main_index_shard5_replica1
    >>     > instance01/data/main_index_shard6_replica1
    >>     > instance02/data/main_index_shard7_replica1
    >>     > instance03/data/main_index_shard8_replica1
    >>     >
    >>     > However, while attempting to upgrade to 7.2.1, this numbering
    >>     > has changed:
    >>     >
    >>     > Solr 7.2.1 /var/lib/solr:
    >>     >
    >>     > Server node 1:
    >>     > instance00/data/main_index_shard1_replica_n1
    >>     > instance01/data/main_index_shard2_replica_n2
    >>     > instance02/data/main_index_shard3_replica_n4
    >>     > instance03/data/main_index_shard4_replica_n6
    >>     >
    >>     > Server node 2:
    >>     > instance00/data/main_index_shard5_replica_n8
    >>     > instance01/data/main_index_shard6_replica_n10
    >>     > instance02/data/main_index_shard7_replica_n12
    >>     > instance03/data/main_index_shard8_replica_n14
    >>     >
    >>     > This new numbering breaks my copy script, and furthermore, I'm
    >>     > worried about what happens when the numbering differs among
    >>     > target clusters.
    >>     >
    >>     > How can I switch this back to the old numbering scheme?
    >>     >
    >>     > Side note: is there a recommended way of doing this? Is the
    >>     > backup/restore mechanism suitable for this? The ref guide is
    >>     > kind of terse here.
    >>     >
    >>     > Thanks in advance,
    >>     >
    >>     > Ciao, Patrick
    >>     >
    >>     --
    >>     Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
    >>     LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
    >>     http://www.solrenterprisesearchserver.com
    >>
    >>
    >> --
    > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
    > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
    > http://www.solrenterprisesearchserver.com
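One way around the replica-naming change Patrick describes (replica1 in 6.x vs. replica_n1, replica_n2, ... in 7.x) is to key the copy script on the shard number and glob over the replica suffix, rather than hard-coding directory names. A rough sketch, with hypothetical paths and layout:

```shell
# Hedged sketch: walk the instance data directories, match shard index
# directories regardless of the 6.x (replica1) or 7.x (replica_n1) naming,
# and pair source/target by shard number. Paths, host names, and the
# instance layout are hypothetical.
DATA=/var/lib/solr
TARGET=target-node1
for inst in "$DATA"/instance0*; do
  for dir in "$inst"/data/main_index_shard*_replica*; do
    [ -d "$dir/index" ] || continue
    # Extract "shardN" so source and target pair up by shard, not by the
    # replica suffix, which may differ between clusters:
    shard=$(basename "$dir" | sed 's/.*_\(shard[0-9]*\)_replica.*/\1/')
    echo "would copy $dir/index to the $shard directory on $TARGET"
  done
done
```

The same shard-number matching also sidesteps the worry about the numbering differing among target clusters, since the replica suffix never enters into the pairing.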
    
