hbase-user mailing list archives

From James Taylor <jtay...@salesforce.com>
Subject Re: Prefix salting pattern
Date Sun, 18 May 2014 04:31:43 GMT
No, there's nothing wrong with your thinking. That's exactly what Phoenix
does - use the modulo of the hash of the key. It's important that you can
calculate the prefix byte so that you can still do fast point lookups.
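
As a rough illustration of the idea above, here is a minimal Python sketch of a deterministic salt prefix. The bucket count (16), the use of MD5, and the names `salt_byte` and `salted_key` are assumptions made for this example only; Phoenix uses its own hash function internally, so this is not its actual implementation.

```python
import hashlib

NUM_BUCKETS = 16  # assumed bucket count, chosen for illustration only


def salt_byte(row_key: bytes, num_buckets: int = NUM_BUCKETS) -> int:
    """Derive a stable bucket number from the row key itself."""
    if not 1 <= num_buckets <= 256:
        raise ValueError("a one-byte prefix supports 1..256 buckets")
    # MD5 is used only as a convenient stable hash (an assumption of this sketch).
    digest = hashlib.md5(row_key).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def salted_key(row_key: bytes, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Prepend the computed salt byte to the original row key."""
    return bytes([salt_byte(row_key, num_buckets)]) + row_key


# The prefix is a pure function of the key, so a reader can rebuild the
# full salted key and issue a single point lookup for it.
stored = salted_key(b"user123")
lookup = salted_key(b"user123")
```

Because the prefix depends only on the key and the bucket count, `stored` and `lookup` are identical, which is what keeps point lookups to a single get.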

Using a modulo that's bigger than the number of region servers can make
sense as well (up to the overall number of cores in your cluster). You
can't change the modulo without rewriting the data, so factoring in future
growth makes sense.
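
To see why the modulo cannot be changed in place, the following Python sketch counts how many keys would land in a different bucket if the bucket count changed. The hash (MD5), the sample keys, and the bucket counts 16 and 32 are all assumptions for illustration.

```python
import hashlib


def bucket(row_key: bytes, num_buckets: int) -> int:
    """Bucket number for a key: hash of the key modulo the bucket count."""
    h = int.from_bytes(hashlib.md5(row_key).digest()[:4], "big")
    return h % num_buckets


def moved_fraction(keys, old_buckets: int, new_buckets: int) -> float:
    """Fraction of keys whose prefix differs between the two bucket counts."""
    moved = sum(
        1 for k in keys if bucket(k, old_buckets) != bucket(k, new_buckets)
    )
    return moved / len(keys)


# Hypothetical sample keys, used only to exercise the functions.
keys = [f"row-{i}".encode() for i in range(10000)]
fraction = moved_fraction(keys, 16, 32)
print(f"fraction of keys with a new prefix: {fraction:.2f}")
```

With a reasonably uniform hash, going from 16 to 32 buckets should give about half of the keys a new prefix. Every such row would sit under a stale prefix until it is rewritten, and a lookup that computes the prefix with the new modulo would miss it.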


On Sat, May 17, 2014 at 8:50 PM, Software Dev <static.void.dev@gmail.com> wrote:

> Well, I kept reading on this subject and realized my second question may
> not be appropriate, since this prefix salting pattern assumes that the
> prefix is random. I thought it was actually based on a hash that
> could be predetermined so you could always, if needed, get to the
> exact row key with one get. Would there be anything wrong with doing
> this, i.e., using a modulo of the hash of the key?
> On Sat, May 17, 2014 at 8:28 PM, Software Dev <static.void.dev@gmail.com>
> wrote:
> > I recently came across the pattern of adding a salting prefix to the
> > row keys to prevent hotspotting. Still trying to wrap my head around
> > it and I have a few questions.
> >
> > - Is there ever a reason to salt to more buckets than there are region
> > servers? The only reason I can think of is to anticipate future
> > growth.
> >
> > - Is it beneficial to always hash against a known number of buckets
> > (i.e., never change the size) so that for any individual row key you
> > can always determine the prefix?
> >
> > - Are there any good use cases of this pattern out in the wild?
> >
> > Thanks
