lucene-solr-user mailing list archives

From Furkan KAMACI <furkankam...@gmail.com>
Subject Re: Usage of CloudSolrServer?
Date Tue, 16 Apr 2013 11:23:01 GMT
Thanks for your detailed explanation. However, you said:

"It will then choose one of those hosts/cores for each shard, and send a
request to them as a distributed search request." Is there any document
that explains distributed search? What are the criteria for it?


2013/4/16 Upayavira <uv@odoko.co.uk>

> If you are accessing Solr from Java code, you will likely use the SolrJ
> client to do so. If your users are hitting Solr directly, you should
> think about whether this is wise - as well as providing them with direct
> search access, you are also providing them with the ability to delete
> your entire index with a single command.
>
> SolrJ isn't really a load balancer as such. When SolrJ is used to make a
> request against a collection, it will ask ZooKeeper for the names of the
> shards that make up that collection, and for the hosts/cores that make
> up the set of replicas for those shards.
>
> It will then choose one of those hosts/cores for each shard, and send a
> request to them as a distributed search request.
>
> This has the advantage over traditional load balancing that if you bring
> up a new node, that node will register itself with ZooKeeper, and thus
> your SolrJ client(s) will know about it, without any intervention.
>
> Upayavira
>
> On Tue, Apr 16, 2013, at 08:36 AM, Furkan KAMACI wrote:
> > Hi Shawn;
> >
> > I am sorry, but what kind of load balancing is that? I mean, does it check
> > whether some leaders are using a lot of CPU or RAM, etc.? I think a problem
> > may occur in this kind of scenario: if some leaders get more documents
> > than other leaders (I don't know how it is decided which shard a
> > document will go to), then there will be a bottleneck on that leader?
> >
> >
> > 2013/4/15 Shawn Heisey <solr@elyograg.org>
> >
> > > On 4/15/2013 8:05 AM, Furkan KAMACI wrote:
> > >
> > >> My system is as follows: I crawl data with Nutch and send it to
> > >> SolrCloud. Users will search in Solr.
> > >>
> > >> What is CloudSolrServer? Should I use it for load balancing, or is it
> > >> something else?
> > >>
> > >
> > > It appears that the Solr integration in Nutch currently does not use
> > > CloudSolrServer.  There is an issue to add it.  The mutual dependency
> > > on HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x
> > > uses HttpClient 4.
> > >
> > > https://issues.apache.org/jira/browse/NUTCH-1377
> > >
> > > Until that is fixed, a load balancer would be required for full
> > > redundancy for updates with SolrCloud.  You don't have to use a load
> > > balancer for it to work, but if the Solr server that Nutch is using
> > > goes down, then indexing will stop unless you reconfigure Nutch or
> > > bring the Solr server back up.
> > >
> > > Thanks,
> > > Shawn
> > >
> > >
>
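
The selection step Upayavira describes above (look up the shards of a collection and their replicas, then pick one host/core per shard) can be sketched in plain Java. This is only an illustration of that description, not SolrJ's actual implementation: the class name ReplicaChooser, the method chooseOnePerShard, the host URLs, and the random choice among live replicas are all assumptions made for the example. The two in-memory collections stand in for the cluster state a client would read from ZooKeeper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/**
 * Illustrative sketch only. ReplicaChooser is a hypothetical name,
 * not a SolrJ class; it mimics the "one host/core per shard" choice.
 */
class ReplicaChooser {

    /**
     * Picks one live replica URL for each shard.
     *
     * @param shardReplicas shard name -> URLs of all replicas of that shard
     *                      (stand-in for cluster state read from ZooKeeper)
     * @param liveNodes     replica URLs currently considered live
     * @param random        randomness used to spread requests across replicas
     * @return shard name -> replica URL chosen for this request
     */
    static Map<String, String> chooseOnePerShard(Map<String, List<String>> shardReplicas,
                                                 Set<String> liveNodes,
                                                 Random random) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> shard : shardReplicas.entrySet()) {
            // Keep only the replicas of this shard that are live.
            List<String> candidates = new ArrayList<>();
            for (String url : shard.getValue()) {
                if (liveNodes.contains(url)) {
                    candidates.add(url);
                }
            }
            if (candidates.isEmpty()) {
                throw new IllegalStateException("No live replica for " + shard.getKey());
            }
            // Assumption: a uniformly random pick among the live replicas.
            chosen.put(shard.getKey(), candidates.get(random.nextInt(candidates.size())));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Hypothetical two-shard collection with two replicas per shard.
        Map<String, List<String>> shards = new LinkedHashMap<>();
        shards.put("shard1", Arrays.asList("http://host1:8983/solr/c1", "http://host2:8983/solr/c1"));
        shards.put("shard2", Arrays.asList("http://host3:8983/solr/c1", "http://host4:8983/solr/c1"));

        // A newly started node only needs to appear in this set to become eligible.
        Set<String> live = new HashSet<>(Arrays.asList(
                "http://host1:8983/solr/c1", "http://host2:8983/solr/c1",
                "http://host3:8983/solr/c1", "http://host4:8983/solr/c1"));

        System.out.println(chooseOnePerShard(shards, live, new Random()));
    }
}
```

Note that nothing in this sketch looks at CPU or RAM usage: the choice depends only on which replicas are registered as live, which is why a new node becomes usable as soon as it registers itself.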
