lucene-solr-user mailing list archives

From Gili Nachum <gilinac...@gmail.com>
Subject Re: clarification regarding shard splitting and composite IDs
Date Thu, 05 Feb 2015 00:05:57 GMT
Alright. So shard splitting and composite routing play nicely together.
Thank you Anshum.

On Wed, Feb 4, 2015 at 11:24 AM, Anshum Gupta <anshum@anshumgupta.net>
wrote:

> In one line, shard splitting doesn't depend on the routing mechanism, only
> on the hash range, so you could have documents for the same prefix split
> up.
>
> Here's an overview of routing in SolrCloud:
> * Happens based on a hash value
> * The hash is calculated using the multiple parts of the routing key. In
> case of A!B, the upper 16 bits of the hash are obtained from murmurhash(A)
> and the lower 16 bits are obtained from murmurhash(B). This sends the docs
> to the right shard.
> * When querying using A!, all shards that contain hashes in the range from
> murmurhash(A)-0000 to murmurhash(A)-ffff (the 16 bits from murmurhash(A)
> followed by 0000 through ffff) are used.
>
> When you split a shard, say for the range 00000000 - ffffffff, it is split
> in the middle (by default), and over multiple splits docs for the same A!
> prefix might end up on different shards, but the request routing should
> take care of that.
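
The split behavior described above can be illustrated with a small sketch.
It assumes unsigned 32-bit inclusive ranges and a plain midpoint split; the
function names and the example parent range are hypothetical and chosen only
to show a case where one prefix block ends up spanning both child shards.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (low, high), unsigned 32-bit


def split_in_middle(r: Range) -> Tuple[Range, Range]:
    """Split an inclusive hash range into two halves at its midpoint."""
    low, high = r
    mid = low + (high - low) // 2
    return (low, mid), (mid + 1, high)


def prefix_block(upper16: int) -> Range:
    """Hash range of all docs whose prefix hashes to the given upper 16 bits."""
    base = (upper16 & 0xFFFF) << 16
    return base, base | 0xFFFF


def shards_for(block: Range, shards: List[Range]) -> List[Range]:
    """Shards whose range overlaps the block; a prefix query must hit all of them."""
    return [s for s in shards if s[0] <= block[1] and block[0] <= s[1]]


if __name__ == "__main__":
    # Splitting the full range lands on a 16-bit boundary, so no prefix is cut.
    print([f"{lo:08x}-{hi:08x}" for lo, hi in split_in_middle((0x00000000, 0xFFFFFFFF))])

    # Hypothetical parent range that is not aligned to 16-bit boundaries.
    children = list(split_in_middle((0x00008000, 0x00027FFF)))
    hit = shards_for(prefix_block(0x0001), children)
    print([f"{lo:08x}-{hi:08x}" for lo, hi in hit])
```

When a prefix block overlaps more than one child range, documents for that
prefix live on several shards, and a query routed by prefix has to be sent to
every shard returned by the overlap check.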
>
> You can read more about routing here:
> https://lucidworks.com/blog/solr-cloud-document-routing/
> http://lucidworks.com/blog/multi-level-composite-id-routing-solrcloud/
>
> and shard splitting here:
> http://lucidworks.com/blog/shard-splitting-in-solrcloud/
>
>
> On Wed, Feb 4, 2015 at 12:59 AM, Gili Nachum <gilinachum@gmail.com> wrote:
>
> > Hi, I'm also interested. When using the composite ID, the _route_
> > information is not kept on the document itself, so to me it looks like it's
> > not possible, as the split API
> > <https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3>
> > doesn't have a relevant parameter to split correctly.
> > I could report back once I try it in practice.
> >
> > On Mon, Nov 10, 2014 at 7:27 PM, Ian Rose <ianrose@fullstory.com> wrote:
> >
> > > Howdy -
> > >
> > > We are using composite IDs of the form <user>!<event>.  This ensures that
> > > all events for a user are stored in the same shard.
> > >
> > > I'm assuming from the description of how composite ID routing works, that
> > > if you split a shard the "split point" of the hash range for that shard is
> > > chosen to maintain the invariant that all documents that share a routing
> > > prefix (before the "!") will still map to the same (new) shard.  Is that
> > > accurate?
> > >
> > > A naive shard-split implementation (e.g. that chose the hash range split
> > > point arbitrarily) could end up with "child" shards that split a routing
> > > prefix.
> > >
> > > Thanks,
> > > Ian
> > >
> >
>
>
>
> --
> Anshum Gupta
> http://about.me/anshumgupta
>
