lucene-solr-user mailing list archives

From Upayavira ...@odoko.co.uk>
Subject Re: Usage of CloudSolrServer?
Date Tue, 16 Apr 2013 09:44:58 GMT
If you are accessing Solr from Java code, you will likely use the SolrJ
client to do so. If your users are hitting Solr directly, you should
think about whether this is wise: as well as giving them direct search
access, you are also giving them the ability to delete your entire
index with a single command.

SolrJ isn't really a load balancer as such. When SolrJ is used to make a
request against a collection, it will ask ZooKeeper for the names of the
shards that make up that collection, and for the hosts/cores that make
up the set of replicas for those shards.

It will then choose one of those hosts/cores for each shard, and send a
request to them as a distributed search request.

This has the advantage over traditional load balancing that if you bring
up a new node, that node will register itself with ZooKeeper, and thus
your SolrJ client(s) will know about it, without any intervention.
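
To make the selection step concrete, here is a toy model in plain Java.
It is not SolrJ code and does not talk to ZooKeeper: the ClusterView
class and its methods are hypothetical names made up for illustration,
and picking a replica at random is an assumption on my part, since the
exact selection policy isn't described above. It only shows the shape
of the idea: one replica chosen per shard, and a newly registered
replica becoming eligible with no change on the client side.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy stand-in for the cluster state a client would read from ZooKeeper.
// Hypothetical class, not part of SolrJ.
class ClusterView {
    // shard name -> replicas (host/core) currently registered for that shard
    private final Map<String, List<String>> replicasByShard = new LinkedHashMap<>();

    // Models a node coming up and announcing a replica for a shard.
    void register(String shard, String hostAndCore) {
        replicasByShard.computeIfAbsent(shard, k -> new ArrayList<>()).add(hostAndCore);
    }

    // Picks one replica per shard. The returned map is the set of targets
    // for a single distributed search request.
    // Random choice is an assumption; the real policy may differ.
    Map<String, String> pickOnePerShard(Random random) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : replicasByShard.entrySet()) {
            List<String> replicas = entry.getValue();
            chosen.put(entry.getKey(), replicas.get(random.nextInt(replicas.size())));
        }
        return chosen;
    }

    public static void main(String[] args) {
        ClusterView view = new ClusterView();
        view.register("shard1", "hostA:8983/solr/core1");
        view.register("shard2", "hostB:8983/solr/core2");
        System.out.println(view.pickOnePerShard(new Random()));

        // A new node comes up; later requests may be routed to it
        // without any reconfiguration of the caller.
        view.register("shard1", "hostC:8983/solr/core1");
        System.out.println(view.pickOnePerShard(new Random()));
    }
}
```

The caller only ever asks the view for targets, so the list of hosts
is never hard-coded in the client. That is the difference from a
statically configured load balancer.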

Upayavira

On Tue, Apr 16, 2013, at 08:36 AM, Furkan KAMACI wrote:
> Hi Shawn;
> 
> I am sorry, but what kind of load balancing is that? I mean, does it
> check whether some leaders are using much CPU or RAM etc.? I think a
> problem may occur in this kind of scenario: if some leaders are getting
> more documents than other leaders (I don't know how it is decided which
> shard a document will go into), then there will be a bottleneck on that
> leader?
> 
> 
> 2013/4/15 Shawn Heisey <solr@elyograg.org>
> 
> > On 4/15/2013 8:05 AM, Furkan KAMACI wrote:
> >
> >> My system is as follows: I crawl data with Nutch and send them into
> >> SolrCloud. Users will search at Solr.
> >>
> >> What is that CloudSolrServer, should I use it for load balancing or is it
> >> something else different?
> >>
> >
> > It appears that the Solr integration in Nutch currently does not use
> > CloudSolrServer.  There is an issue to add it.  The mutual dependency on
> > HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses
> > HttpClient 4.
> >
> > https://issues.apache.org/jira/browse/NUTCH-1377
> >
> > Until that is fixed, a load balancer would be required for full redundancy
> > for updates with SolrCloud.  You don't have to use a load balancer for it
> > to work, but if the Solr server that Nutch is using goes down, then
> > indexing will stop unless you reconfigure Nutch or bring the Solr server
> > back up.
> >
> > Thanks,
> > Shawn
> >
> >
