lucene-solr-user mailing list archives

From Jason Biggin <jbig...@hipdigital.com>
Subject RE: Replicating Large Indexes
Date Tue, 01 Nov 2011 16:59:10 GMT
Thanks Erick,

Will take a look at this article.

Cheers,
Jason

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, November 01, 2011 8:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Replicating Large Indexes

Yes, that's expected behavior. When you optimize, all segments are copied over to new segment(s).
Since all changed/new segments are replicated to the slave, you'll (temporarily) have twice
the data on your disk.
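
As a rough check against the numbers in the original question (a ~55GB index on a 100GB disk), the sketch below estimates peak disk use on a slave while old and new segments coexist. The function name `peak_disk_gb` and the `changed_fraction` parameter are illustrative assumptions, not part of Solr, and the estimate ignores any other files on the disk.

```python
def peak_disk_gb(index_gb, changed_fraction=1.0):
    """Estimate peak disk use while old segments and their replacements coexist.

    changed_fraction is the share of the index (0.0 to 1.0) that arrives as
    new segments; after an optimize it is assumed to be 1.0.
    """
    return index_gb + index_gb * changed_fraction


index_gb = 55   # index size from the original question
disk_gb = 100   # disk size from the original question

peak = peak_disk_gb(index_gb)
print(f"peak ~{peak:.0f}GB on a {disk_gb}GB disk; fits: {peak <= disk_gb}")
```

With these figures the temporary second copy would not fit on the disk described, which is why the headroom matters.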

You can stop optimizing; it's often not really very useful despite its name.

That said, due to how segments are merged you will always have the potential for replicating
your entire index to the slave if you happen to hit the magic segment merge event.
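
To illustrate why an ordinary merge can occasionally rewrite the whole index, here is a toy model only. It assumes equal-sized flushes, no deletes, and a simple rule that `merge_factor` segments of one size merge into a single segment of the next size up. It is not Lucene's actual merge policy code, and `simulate` is a hypothetical helper.

```python
def simulate(flushes, merge_factor=10):
    """Return the flush counts at which a cascading merge rewrote the whole index.

    levels[i] counts segments whose size is merge_factor**i flushes.
    """
    levels = [0]
    full_rewrites = []
    for n in range(1, flushes + 1):
        levels[0] += 1          # a new small segment is flushed
        i = 0
        merged_size = 0         # size of the largest segment written this step
        while levels[i] == merge_factor:
            # merge_factor segments at level i become one segment at level i+1
            levels[i] = 0
            if i + 1 == len(levels):
                levels.append(0)
            levels[i + 1] += 1
            merged_size = merge_factor ** (i + 1)
            i += 1
        if merged_size == n:    # the merged segment holds every flush so far
            full_rewrites.append(n)
    return full_rewrites


print(simulate(1000))
```

In this simplified model most merges touch only a small part of the index, but at certain points the cascade produces one segment containing everything, and that segment would be entirely new to the slave.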

And *that* said, there's quite a bit of control you can exercise over how segments are merged.
Here's a place to start:
http://juanggrande.wordpress.com/2011/02/07/merge-policy-internals/

Merge Policy lets you control some of this behavior, but I'd still be nervous if I had less
space on my disk than would allow a full copy of the index to be there for a while.

Best
Erick

On Tue, Nov 1, 2011 at 12:46 AM, Jason Biggin <jbiggin@hipdigital.com> wrote:
> Wondering if anyone has experience with replicating large indexes.  We have a Solr deployment
> with 1 master, 1 master/slave and 5 slaves.  Our index contains 15+ million articles and
> is ~55GB in size.
>
> Performance is great on all systems.
>
> Debian Linux
> Apache-Tomcat
> 100GB disk
> 6GB RAM
> 2 proc
>
> on VMWare ESXi 4.0
>
>
> We notice however that whenever the master is optimized, the complete index is replicated
> to the slaves.  This causes a 100%+ bloat in disk requirements.
>
> Is this normal?  Is there a way around this?
>
> Currently our optimize is configured as such:
>
>        curl 'http://localhost:8080/solr/update?optimize=true&maxSegments=1&waitFlush=true&expungeDeletes=true'
>
> Willing to share our experiences with Solr.
>
> Thanks,
> Jason
>
