lucene-solr-user mailing list archives

From ralph tice <ralph.t...@gmail.com>
Subject Re: Loading an index (generated by map reduce) in SolrCloud
Date Thu, 18 Sep 2014 03:26:53 GMT
If you are updating or deleting from your indexes, I don't believe it is
possible to get a consistent copy of the index from the file system
directly without monkeying with hard links.  The safest thing is to use the
ADDREPLICA command in the Collections API and then an UNLOAD from the
CoreAdmin API if you want to take the data offline.  If you don't care to use
additional servers/JVMs, you can use the replication handler to make a backup
instead.
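
A rough sketch of both options, not a tested recipe. The host names, core
names, and backup path below are placeholders I made up for illustration;
check the Cores page in your admin UI for the real core names. The script
only builds and prints the request URLs so you can review them before
sending anything with curl.

```shell
#!/bin/sh
# All values below are hypothetical; substitute your own.
SOLR_HOST=box1.mysolr.corp            # node currently hosting the shard
NEW_HOST=box2.mysolr.corp             # node that will receive the extra replica
PORT=8983
COLLECTION=mycollection
SHARD=shard1
SRC_CORE=mycollection_shard1_replica1 # assumed name of the existing core
NEW_CORE=mycollection_shard1_replica2 # assumed name of the core ADDREPLICA creates
BACKUP_DIR=/backups/solr              # assumed path, must be writable by Solr

# Option 1a: Collections API, add a replica of the shard on the new node.
ADDREPLICA_URL="http://${SOLR_HOST}:${PORT}/solr/admin/collections?action=ADDREPLICA&collection=${COLLECTION}&shard=${SHARD}&node=${NEW_HOST}:${PORT}_solr"

# Option 1b: CoreAdmin API on the new node, unload the replica once it has
# finished recovering, which leaves its index directory on disk.
UNLOAD_URL="http://${NEW_HOST}:${PORT}/solr/admin/cores?action=UNLOAD&core=${NEW_CORE}"

# Option 2: replication handler on the existing core, snapshot to BACKUP_DIR.
BACKUP_URL="http://${SOLR_HOST}:${PORT}/solr/${SRC_CORE}/replication?command=backup&location=${BACKUP_DIR}"

# Print the URLs for review; send each one with: curl "<url>"
printf '%s\n' "$ADDREPLICA_URL" "$UNLOAD_URL" "$BACKUP_URL"
```

Wait for the new replica to show as active before sending the UNLOAD,
otherwise the copy you take offline may be incomplete.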

This older discussion covers most any backup strategy I can think of:
http://grokbase.com/t/lucene/solr-user/12c37h0g18/backing-up-solr-4-0

On Wed, Sep 17, 2014 at 9:01 PM, shushuai zhu <sszhu@yahoo.com.invalid>
wrote:

> Hi, my case is a little simpler. For example, I have 100 collections now
> in my solr cloud, and I want to back up 20 of them so I can restore them
> later. I think I can just copy the index and log for each shard/core to
> another location, then delete the collections. Later, I can create new
> collections (likely with different names), then copy the index and log back
> to the right directory structure on the node. After that, I can either
> reload the collection or core.
>
> However, some testing shows these do not work. I could not reload the
> collection or core. Have not tried re-starting the solr cloud. Can someone
> point out the best way to achieve the goal? I prefer not to re-start solr
> cloud.
>
> Shushuai
>
>
> ________________________________
>  From: ralph tice <ralph.tice@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, September 17, 2014 6:53 PM
> Subject: Re: Loading an index (generated by map reduce) in SolrCloud
>
>
> FWIW, I do a lot of moving Lucene indexes around and as long as the core is
> unloaded it's never been an issue for Solr to be running at the same time.
>
> If you move a core into the correct hierarchy for a replica, you can call
> the Collections API's CREATESHARD action with the appropriate params (make
> sure you use createNodeSet to point to the right server) and Solr will load
> the index appropriately.  It's easier to create a dummy shard and see
> where data lands on your installation than to try to guess.
>
> Ex:
> PORT=8983
> SHARD=myshard
> COLLECTION=mycollection
> SOLR_HOST=box1.mysolr.corp
> curl "http://${SOLR_HOST}:${PORT}/solr/admin/collections?action=CREATESHARD&shard=${SHARD}&collection=${COLLECTION}&createNodeSet=${SOLR_HOST}:${PORT}_solr"
>
> One file to watch out for if you are moving cores across machines/JVMs is
> the core.properties file, which you don't want to duplicate to another
> server/location when moving a data directory.  I don't recommend trying to
> move transaction logs around either.
>
>
>
>
>
> On Wed, Sep 17, 2014 at 5:22 PM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
> > Details please. You say MapReduce. Is this the
> > MapReduceIndexerTool? If so, you can use
> > the --go-live option to auto-merge them. Your
> > Solr instances need to be running over HDFS
> > though.
> >
> > If you don't have Solr running over HDFS, you can
> > just copy the results for each shard "to the right place".
> > What that means is that you must ensure that the
> > shards produced via MRIT get copied to the corresponding
> > Solr local directory for each shard. If you put the wrong
> > one in the wrong place you'll have trouble with multiple
> > copies of documents showing up when you re-add any
> > doc that already exists in your Solr installation.
> >
> > BTW, I'd surely stop all my Solr instances while copying
> > all this around.
> >
> > Best,
> > Erick
> >
> > On Wed, Sep 17, 2014 at 1:41 PM, KNitin <nitin.tnvl@gmail.com> wrote:
> > > Hello
> > >
> > >  I have generated a lucene index (with 6 shards) using Map Reduce. I want
> > > to load this into a SolrCloud Cluster inside a collection.
> > >
> > > Is there any out of the box way of doing this?  Any ideas are much
> > > appreciated
> > >
> > > Thanks
> > > Nitin
> >
>
