lucene-solr-user mailing list archives

From Jilani Shaik <jilani24...@gmail.com>
Subject Re: Limit the documents for each shard in solr cloud
Date Fri, 08 May 2015 16:55:49 GMT
Hi,

Actually, we are facing a lot of issues with Solr shards in our environment.
Our environment is fully loaded with around 150 million documents, where
each document has 50+ stored fields, many of them multi-valued.
We also have a lot of custom components in this environment that
use "FieldCache" and various other Solr features.

The main issue we are facing is that shards go down frequently in Solr cloud.

As you mentioned in this reply, and as I have also seen in various other
replies on memory issues, I will try to debug further and will post here if
I find any issues in the process.

Thanks,
Jilani

On Thu, May 7, 2015 at 10:17 PM, Daniel Collins <danwcollins@gmail.com>
wrote:

> Jilani, you did say "My team needs that option if at all possible"; my
> first response would be "why?". Why do they want to limit the number of
> documents per shard? What's the rationale/use case behind that
> requirement? Once we understand that, we can explain why it's a bad idea.
> :)
>
> I suspect I'm re-iterating Jack's comments, but why are you sharding in the
> first place? 8 shards split across 4 machines, so 2 shards per machine.
> But you have 2 replicas of each shard, so you have 16 Solr cores, and hence
> 4 Solr cores per machine?  Since you need an instance of all 8 shards to be
> up in order to service requests, you can get away with everything on 2
> machines, but you still have 8 Solr cores to manage in order to have a
> fully functioning system.  What's the benefit of sharding in this
> scenario?  Sharding adds complexity, so you normally only add sharding if
> your search times are too slow without it.
>
> You need to work out how much disk space the whole 20m docs is going to
> take (maybe index 1m or 5m docs and extrapolate if they are all equivalent
> in size), then split it across 4 machines.  But as Erick points out you
> need to allow for merges to occur, so whatever the space of the "static"
> data set, you need to allow for double that from time to time if background
> merges are happening.
>
>
> On 7 May 2015 at 16:05, Jack Krupansky <jack.krupansky@gmail.com> wrote:
>
> > A leader is also a replica - SolrCloud is not a master/slave architecture.
> > Any replica can be elected to be the leader, but that is only temporary and
> > can change over time.
> >
> > You can place multiple shards on a single node, but was that really your
> > intention?
> >
> > Generally, number of nodes equals number of shards times the replication
> > factor. But then divided by shards per node if you do place more than one
> > shard per node.
> >
> > -- Jack Krupansky
> >
> > On Thu, May 7, 2015 at 1:29 AM, Jilani Shaik <jilani24239@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Is it possible to restrict number of documents per shard in Solr cloud?
> > >
> > > Let's say we have Solr cloud with 4 nodes, and on each node we have one
> > > leader and one replica. Likewise, in total we have 8 shards, including
> > > replicas. Now I need to index my documents in such a way that each shard
> > > will have only 5 million documents. Total documents in Solr cloud should be
> > > 20 million documents.
> > >
> > >
> > > Thanks,
> > > Jilani
> > >
> >
>
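
The sizing arithmetic discussed in the thread above can be sketched roughly as follows: Jack's rule of thumb for node count (shards times replication factor, divided by cores per node), and Daniel's suggestion to extrapolate disk usage from a sample index and allow double that for background merges. The function names, the 3 GB sample index size, and the choice to count every replica in the disk figure are assumptions for illustration only; none of them come from the thread.

```python
import math


def nodes_needed(num_shards, replication_factor, cores_per_node=1):
    """Rule of thumb: nodes = shards * replication factor / cores per node."""
    total_cores = num_shards * replication_factor
    return math.ceil(total_cores / cores_per_node)


def peak_disk_per_node_gb(sample_docs, sample_index_gb, total_docs,
                          replication_factor, num_nodes, merge_headroom=2.0):
    """Extrapolate index size from a sample, then allow headroom for merges.

    Assumes all documents are roughly the same size and that each replica
    holds a full copy of its shard (both are assumptions of this sketch).
    """
    static_gb = sample_index_gb * (total_docs / sample_docs)
    per_node_gb = static_gb * replication_factor / num_nodes
    return per_node_gb * merge_headroom


# Layout from the original question: 4 shards, 2 copies each, 2 cores per node.
print(nodes_needed(4, 2, cores_per_node=2))

# Hypothetical sample: indexing 1 million docs produced a 3 GB index.
print(peak_disk_per_node_gb(1_000_000, 3.0, 20_000_000,
                            replication_factor=2, num_nodes=4))
```

With these hypothetical inputs, the static index is about 30 GB per node, and the merge headroom doubles that as the amount of disk to budget.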
