lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Himanshu Mehrotra <himanshu.mehro...@snapdeal.com>
Subject Re: Solr and SolrCloud repllcation, and load balancing questions.
Date Sun, 06 Jul 2014 09:31:04 GMT
Erick, first up thanks for thoroughly answering my questions.

[A]  I had read the blot mentioned, and yet failed to 'get it'.  Now I
understand the flow.
[B]  The automatic, heuristic based approach as you said will be difficult
to get right, that is why I thought 'beefiness' index configuration similar
to load balancer might help get same effective result for the most part.  I
guess it is feature that most people won't need, only the ones in process
of upgrading their machines.
[C] I will go through the blog, and do empirical analysis.  Speaking of
caches, I see that for us filter cache hit ratio is good 97% while document
cache hit ratio is below 10%, does it mean that document cache (size=4096)
 is not big enough and I should increase the size or does it mean that we
are getting queries that result in too wide a result set and hence we would
probably better off switching off the document cache altogether if we could
do it.

Thanks,
Himanshu



On Sun, Jul 6, 2014 at 5:27 AM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Question1, both sub-cases.
>
> You're off on the wrong track here, you have to forget about replication.
>
> When documents are added to the index, they get forwarded to _all_
> replicas. So the flow is like this...
> 1> leader gets update request
> 2> leader indexes docs locally, and adds to (local) transaction log
>       _and_ forwards request to all followers
> 3> followers add docs to tlog and index locally
> 4> followers ack back to leader
> 5> leader acks back to client.
>
> There is no replication in the old sense at all in this scenario. I'll
> add parenthetically that old-style replication _is_ still used to
> "catch up" a follower that is waaaaaay behind, but the follower is
> in the "recovering" state if this ever occurs.
>
> About commit. If you commit from the client, the commit is forwarded
> to all followers (actually, all nodes in the collection). If you have
> autocommit configured, each of the replicas will fire their commit when
> the time period expires.
>
> Here's a blog that might help:
>
> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> [B] right, SolrCloud really supposes that the machines are pretty
> similar so doesn't provide any way to do what you're asking. Really,
> you're asking for some way to assign "beefiness" to the node in terms
> of load sent to it... I don't know of a way to do that and I'm not
> sure it's on the roadmap either.
>
> What you'd really want, though, is some kind of heuristic that was
> automatically applied. That would take into account transient load
> problems, i.e. replica N happened to get a really nasty query to run
> and is just slow for a while. I can see this being very tricky to get
> right though. Would a GC pause get weighted as "slow" even though the
> pause could be over already? Anyway, I don't think this is on the
> roadmap at present but could well be wrong.
>
> In your specific example, though (this works because of the convenient
> 2x....) you could host 2x the number of shards/replicas on the beefier
> machines.
>
> [C] Right, memory allocation is difficult. The general recommendation
> is that memory for Solr allocated in the JVM should be as small as
> possible, and leave let the op system use memory for MMapDirectory.
> See the excellent blog here:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html.
> If you over-allocate memory to the JVM, your GC profile worsens...
>
> Generally, when people throw "memory" around they're talking about JVM
> memory...
>
> And don't be mislead by the notion of "the index fitting into memory".
> You're absolutely right that when you get into a swapping situation,
> performance will suffer. But there are some very interesting tricks
> played to keep JVM consumption down. For instance, only every 128th
> term is stored in the JVM memory. Other terms are then read as needed.
> And stored in the OS memory via MMapDirectory implementations....
>
> Your GC stats look quite reasonable. You can get a snapshot of memory
> usage by attaching, say, jConsole to the running JVM and see what
> memory usage was after a forced GC. Sounds like you've already seen
> this, but in case not:
> http://searchhub.org/2011/03/27/garbage-collection-bootcamp-1-0/. It
> was written before there was much mileage on the new G1 garbage
> collector which has received mixed reviews.
>
> Note that the stored fields kept in memory are controlled by the
> documentCache in solrconfig.xml. I think of this as just something
> that holds documents when assembling the return list, it really
> doesn't have anything to do with searching per-se, just keeping disk
> seeks down during processing for a particular query. I.e. for a query
> returning 10 rows, only 10 docs will be kept here not the 5M rows that
> matched.
>
> Whether 4G is sufficient is.... not answerable. I've doubled the
> memory requirements by changing the query without changing the index.
> Here's a blog outlining why we can't predict and how to get an answer
> empirically:
>
> http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> Best,
> Erick
>
> On Sat, Jul 5, 2014 at 1:57 AM, Himanshu Mehrotra
> <himanshumehrotra2002@gmail.com> wrote:
> > Hi,
> >
> > I had three quesions/doubts regarding Solr and SolrCloud functionality.
> > Can anyone help clarify these? I know these are bit long, please bear
> with
> > me.
> >
> > [A] Replication related - As I understand before SolrCloud, under a
> classic
> > master/slave replication setup, every 'X' minutes slaves will pull/poll
> the
> > updated index (index segments added and deleted/merged away ).  And when
> a
> > client explicitly issues a 'commit' only master solr closes/finalizes
> > current index segment and creates a new current index segment.  As port
> of
> > this index segment merges as well as 'fsync' ensuring data is on the disk
> > also happens.
> >
> > I read documentation regarding replication on SolrCloud but unfortunately
> > it is still not very clear to me.
> >
> > Say I have solr cloud setup of 3 solr servers with just a single shard.
> > Let's call them L (the leader) and F1 and F2, the followers.
> >
> > Case 1: We are not using autoCommits, and explictly issue 'commit' via
> > Client.  How does replication happen now?
> > Does the each update to leader L that goes into tlog get replicated to
> > followers F1, and F2 (wher they also put update in tlog ) before client
> > sees response from leader L?  What happens when client issues a 'commit'?
> > Does  the creation of new segment, merging of index segments if required,
> > and fsync happen on all three solrs or that just happens on leader L and
> > followers F1, F2 simply sync the post commit state of index.  More-over
> > does leader L wait for fsync in followers F1, F2, before responding
> > sucessfully to Client?  If yes does it sequentially updates F1 and then
> F2
> > or is the process concurrent/parallel via threads.
> >
> > Case 2: We use autoCommit every 'X' minutes and do not issue 'commit' via
> > Client.  Is this setup similar to classic master slave in terms of
> > data/index updates?
> > As in since autoCommit happens every 'X' minutes replication will happen
> > after commit, every 'X' minutes followers get updated index.  But does
> > simple updates, the ones that go int tlog get replicated immediately to
> > follower's tlog .
> >
> > Another thing I noticed in Solr Admin UI, is that replication is set to
> > afterCommit, what are other possible settings for this knob.  And what
> > behaviour do we get out of them.
> >
> >
> >
> >
> > [B] Load balancing related - In traditional master/slave setup we use
> load
> > balancer to distribute load search query load equally over slaves.  In
> case
> > one of the slave solr is running on 'beefier' machine (say more RAM or
> CPU
> > or both) than others, then load balancers allow distributing load by
> > weights so that we can distribute load proportional to percieved machine
> > capacity.
> >
> > With solr cloud setup, lets take an example, 2 shards, 3 replicas per
> > shard, totaling to 6 solr servers are running and say we have
> > Servers S1L1, S1F1, S1F2 hosting replicas of shard1 and servers S2L1,
> S2F1,
> > S2F2 hosting replicas of shard2.  S1L1 and S2L2 happen to be leaders of
> > their respective shard.  And lets say S1F2, and S2F1 happen to be twice
> as
> > big machines as others (twice the RAM and CPU).
> >
> > Ideally speaking in such case we would want S2F1 and S1F2 to handle twice
> > the search query load as their peers.  That is if 100 search queries come
> > we know each shard will receive these 100 queries.  So we want S1L1, and
> > S1F1 to handle 25 queries each, and S1F2 to handle 50 queries.  Similarly
> > we would want S2L1 and S2F2 to handle 25 queries and S2F1 to handle 50
> > queries.
> >
> > As far as I understand, this is not possible via smart client provided in
> > SolrJ.  All solr servers will handle 33% of the query load.
> >
> > Alternative is to use dumb client and load balancer over all servers.
>  But
> > even then I guess we won't get correct/desired distribution of queries.
> > Say we put following weights for each server
> >
> > 1 - S1L1
> > 1 - S1F1
> > 2 - S1F2
> > 1 - S1L1
> > 2 - S1F1
> > 1 - S1F2
> >
> > Now 1/4 of total number of requests go to S1F2 directly, plus now it
> > recieves  1/6 ( 1/2 * 1/3 ) of request that went to some server on shard
> 2.
> > This totals up to 10/24 of request load, not half as we would expect.
> >
> > One way could be to chose weight y and x such that y/(2*(y + 2x)) + 1/6 =
> > 1/2 . It seems too much of trouble to get them ( y = 4 and x = 1 ).
> > Every time we add/remove/upgrade servers we need to recalculate new
> weights.
> >
> > A simpler alternative it appears would be that each solr node register
> its
> > 'query_weight' with zoo-keeper on joining the cluster. This
> 'query_weight'
> > could be a property similar to 'solr.solr.home' or 'zkHosts' that we
> > specify with startup commandline for solr server.
> >
> > And all smart clients and solr servers, to honour that weight when they
> > distribute load.  Is there such a feature planned for Solr Cloud?
> >
> >
> >
> >
> > [C] GC/Memory usage related - From the documentation and videos available
> > on internet, it appears that solr perform well if index fits into the
> > memory and stord fields fit in the memory.  Holding just index in memory
> > has more degrading impact on solr performance and if we don't have enough
> > memory to hold index solr is still slower, and the moment java process
> hits
> > swap space solr will slow to a crawl.
> >
> > My question is what the 'memory' being talked about is? Is it the Java
> Heap
> > we specify via Xmx and Xms options.  Or is it the free memory, or
> buffered,
> > or cached memory as available from output free command on *nix systems.
> > And how do we know if our index and stored fields will fit the memory.
>  For
> > example say the data directory for the core/collection occupies 200MB on
> > disk ( 150,000 live documents and 180,000 max documents per Solr UI) ,
> then
> > is a 8GB machine with solr being configured with Xmx 4G going to be
> > sufficient?
> >
> > Are there any guidlines as to configuring the java heap and total RAM,
> > given an index size and the expected query rate ( queries per
> > minute/second).
> > On production system I observed via gc logs that minor collections happen
> > at rate of 20 per minute, full gc happens every seven to ten minutes, are
> > these  too high or low given direct search query load on that solr node
> is
> > about 2500 request per minute.  What kind of GC behaviour I should expect
> > from an healthy and fast/optimal solr node in solr-cloud setup.  Is the
> > answer it depends on your specific response time and throughput
> > requirements or is there some kind of rule of thumb that can followed
> > irrespective of the situation.  Or should I see if any improvements can
> be
> > made via regular measure, tweak , measure cycles.
> >
> > Thanks,
> > Himanshu
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message