lucene-solr-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: rough maximum cores (shards) per machine?
Date Tue, 24 Mar 2015 18:42:07 GMT
Don't confuse customers and tenants.

-- Jack Krupansky

On Tue, Mar 24, 2015 at 2:24 PM, Shalin Shekhar Mangar <shalinmangar@gmail.com> wrote:

> Sorry Jack. That doesn't scale when you have millions of customers. And
> these are good problems to have!
>
> On Tue, Mar 24, 2015 at 10:47 AM, Jack Krupansky <jack.krupansky@gmail.com> wrote:
>
> > Multi-tenancy is a bad idea for a single Solr cluster. Better to give each
> > tenant a separate Solr instance that you spin up and spin down based on
> > demand.
> >
> > Think about it: If there are a small number of tenants, just giving each
> > their own machine will be cheaper than the effort spent managing a
> > multi-tenant cluster, and if there are a large number of tenants or even a
> > moderate number of large tenants, you can't expect them to all run
> > reasonably on a relatively small cluster. Think about scalability.
> >
> >
> > -- Jack Krupansky
> >
> > On Tue, Mar 24, 2015 at 1:22 PM, Ian Rose <ianrose@fullstory.com> wrote:
> >
> > > Let me give a bit of background.  Our Solr cluster is multi-tenant, where
> > > we use one collection for each of our customers.  In many cases, these
> > > customers are very tiny, so their collection consists of just a single
> > > shard on a single Solr node.  In fact, a non-trivial number of them are
> > > totally empty (e.g. trial customers that never did anything with their
> > > trial account).  However there are also some customers that are larger,
> > > requiring their collection to be sharded.  Our strategy is to try to keep
> > > the total documents in any one shard under 20 million (honestly not sure
> > > where my coworker got that number from - I am open to alternatives but I
> > > realize this is heavily app-specific).
> > >
> > > So my original question is not related to indexing or query traffic, but
> > > just the sheer number of cores.  For example, if I have 10 active cores on
> > > a machine and everything is working fine, should I expect that everything
> > > will still work fine if I add 10 nearly-idle cores to that machine?  What
> > > about 100?  1000?  I figure the overhead of each core is probably fairly
> > > low but at some point starts to matter.
> > >
> > > Does that make sense?
> > > - Ian
> > >
> > >
> > > On Tue, Mar 24, 2015 at 11:12 AM, Jack Krupansky <jack.krupansky@gmail.com> wrote:
> > >
> > > > Shards per collection, or across all collections on the node?
> > > >
> > > > It will all depend on:
> > > >
> > > > 1. Your ingestion/indexing rate. High, medium or low?
> > > > 2. Your query access pattern. Note that a typical query fans out to all
> > > > shards, so having more shards than CPU cores means less parallelism.
> > > > 3. How many collections you will have per node.
> > > >
> > > > In short, it depends on what you want to achieve, not some limit of Solr
> > > > per se.
> > > >
> > > > Why are you even sharding the node anyway? Why not just run with a single
> > > > shard per node, and do sharding by having separate nodes, to maximize
> > > > parallel processing and availability?
> > > >
> > > > Also be careful to be clear about using the Solr term "shard" (a slice,
> > > > across all replica nodes) as distinct from the Elasticsearch term "shard"
> > > > (a single slice of an index for a single replica, analogous to a Solr
> > > > "core".)
> > > >
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > On Tue, Mar 24, 2015 at 9:02 AM, Ian Rose <ianrose@fullstory.com> wrote:
> > > >
> > > > > Hi all -
> > > > >
> > > > > I'm sure this topic has been covered before but I was unable to find any
> > > > > clear references online or in the mailing list.
> > > > >
> > > > > Are there any rules of thumb for how many cores (aka shards, since I am
> > > > > using SolrCloud) is "too many" for one machine?  I realize there is no one
> > > > > answer (depends on size of the machine, etc.) so I'm just looking for a
> > > > > rough idea.  Something like the following would be very useful:
> > > > >
> > > > > * People commonly run up to X cores/shards on a mid-sized (4 or 8 core)
> > > > > server without any problems.
> > > > > * I have never heard of anyone successfully running X cores/shards on a
> > > > > single machine, even if you throw a lot of hardware at it.
> > > > >
> > > > > Thanks!
> > > > > - Ian
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
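
Illustration (not part of the original thread): the sizing rule Ian describes,
one collection per customer with each shard kept under roughly 20 million
documents, can be sketched as simple arithmetic. The function names, the
replication factor parameter, and the sample document counts below are
hypothetical, and the 20 million cap is the app-specific figure quoted in the
thread, not a Solr limit. The sketch only estimates how many cores such a
layout would produce; it says nothing about how many cores a machine can
actually hold.

```python
# Hypothetical sizing sketch. The cap is the figure quoted in the thread
# and is app-specific; it is not a limit imposed by Solr.
MAX_DOCS_PER_SHARD = 20_000_000


def shards_needed(doc_count, max_docs_per_shard=MAX_DOCS_PER_SHARD):
    """Smallest shard count that keeps every shard at or under the cap.

    An empty collection (e.g. an unused trial account) still needs one shard.
    """
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    # Integer ceiling division, then floor the result at one shard.
    return max(1, (doc_count + max_docs_per_shard - 1) // max_docs_per_shard)


def total_cores(doc_counts, replication_factor=1):
    """Total cores across all collections: shards times replicas."""
    return sum(shards_needed(n) for n in doc_counts) * replication_factor


if __name__ == "__main__":
    # Made-up customer sizes: two empty trials, one small, one large.
    customers = [0, 0, 150_000, 45_000_000]
    print(total_cores(customers))
```

With many tiny or empty customers, the total is dominated by the one-shard
minimum per collection rather than by document volume, which is the situation
the original question is about.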
