lucene-solr-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: How to make SolrCloud more elastic
Date Fri, 13 Feb 2015 21:53:21 GMT
Hi Matt,

See:
http://search-lucene.com/?q=query+routing&fc_project=Solr
https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
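
As a rough illustration of what query routing can look like, here is a minimal
sketch. It assumes the collection uses the default compositeId router, where a
document id of the form "routekey!docid" determines the shard and the _route_
request parameter restricts a query to the shard(s) for that key. The host,
the collection name "events", and the helper names are hypothetical, chosen
only for the example.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only; adjust to your own cluster.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "events"


def routed_id(route_key: str, doc_id: str) -> str:
    """Build a compositeId-style document id: '<route_key>!<doc_id>'."""
    if "!" in route_key:
        raise ValueError("route key must not contain '!'")
    return f"{route_key}!{doc_id}"


def routed_query_url(route_key: str, query: str) -> str:
    """Build a select URL that asks Solr to search only the shard(s)
    holding documents indexed under the given route key."""
    params = urlencode({"q": query, "_route_": f"{route_key}!"})
    return f"{SOLR_BASE}/{COLLECTION}/select?{params}"


if __name__ == "__main__":
    # Index-time id and query-time URL for the same route key.
    print(routed_id("customerA", "42"))
    print(routed_query_url("customerA", "status:open"))
```

Documents indexed with ids built this way share a route key, so a query that
carries the same key does not have to fan out to every shard.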

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


On Thu, Feb 12, 2015 at 2:09 PM, Matt Kuiper <matt.kuiper@issinc.com> wrote:

> Otis,
>
> Thanks for your reply.  I see your point about too many shards and search
> efficiency.  I also agree that I need to get a better handle on customer
> requirements and expected loads.
>
> Initially I figured that with the shard splitting option, I would need to
> double my Solr nodes every time I split (as I would want to split every
> shard within the collection).  In fact, only the number of shards
> would double, and I would then have the opportunity to rebalance the shards
> over the existing Solr nodes plus however many new nodes make sense at
> the time.  This may be preferable to defining many micro shards up front.
>
> Time-based collections may be an option for this project.  I am not
> familiar with query routing; can you point me to any documentation on how
> this might be implemented?
>
> Thanks,
> Matt
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis.gospodnetic@gmail.com]
> Sent: Wednesday, February 11, 2015 9:13 PM
> To: solr-user@lucene.apache.org
> Subject: Re: How to make SolrCloud more elastic
>
> Hi Matt,
>
> You could create extra shards up front, but if your queries are fanned out
> to all of them, you can run into situations where there are too many
> concurrent queries per node, causing lots of context switching and
> ultimately being less efficient than if you had fewer shards.  So while
> this is an approach to take, I'd personally first try to run tests to see
> how much a single node can handle in terms of volume, expected query rates,
> and target latency, and then use monitoring/alerting/whatever-helps tools
> to keep an eye on the cluster so that when you start approaching the target
> limits you are ready with additional nodes and shard splitting if needed.
>
> Of course, if your data and queries are such that newer documents are
> queried more, you should look into time-based collections... and if your
> queries can only query a subset of data you should look into query routing.
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Wed, Feb 11, 2015 at 3:32 PM, Matt Kuiper <matt.kuiper@issinc.com>
> wrote:
>
> > I am starting a new project and one of the requirements is that Solr
> > must scale to handle increasing load (both search performance and index
> > size).
> >
> > My understanding is that one way to address search performance is by
> > adding more replicas.
> >
> > I am more concerned about handling a growing index size.  I have
> > already been given some good input on this topic and am considering a
> > shard splitting approach, but am more focused on a rebalancing
> > approach that includes defining many shards up front and then moving
> > these existing shards onto new Solr servers as needed.  I plan to
> > experiment with this approach first.
> >
> > Before I got too deep, I wondered if anyone has any tips or warnings
> > on these approaches, or has scaled Solr in a different manner.
> >
> > Thanks,
> > Matt
> >
>
