hbase-user mailing list archives

From James Taylor <jtay...@salesforce.com>
Subject Re: Prefix salting pattern
Date Sun, 18 May 2014 21:09:54 GMT
@Software Dev - might be feasible to implement a Thrift client that speaks
Phoenix JDBC. I believe this is similar to what Hive has done.
Thanks,
James


On Sun, May 18, 2014 at 1:19 PM, Mike Axiak <mike@axiak.net> wrote:

> In our measurements, scanning is improved by issuing n range scans
> rather than 1 (since you are effectively striping the reads). This is
> even better when you don't necessarily care about the order of every
> row, but want every row in a given range (then you can just get
> whatever row is available from a buffer in the client).
>
> -Mike
>
> On Sun, May 18, 2014 at 1:07 PM, Michael Segel
> <michael_segel@hotmail.com> wrote:
> > No, you’re missing the point.
> > It’s not a good idea or design.
> >
> > Is your data mutable or static?
> >
> > To your point: every time you want to do a simple get() you have to open
> > up n get() statements. On your range scans you will have to do n range
> > scans, then join and sort the result sets. The fact that each result set is
> > in sort order will help a little, but it’s still not that clean.
> >
> >
> >
> > On May 18, 2014, at 4:58 PM, Software Dev <static.void.dev@gmail.com> wrote:
> >
> >> You may be missing the point. The primary reason for the salt prefix
> >> pattern is to avoid hotspotting when inserting time series data AND at
> >> the same time provide a way to perform range scans.
> >>
> >> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
> >>
> >>> NOTE:  Many people worry about hot spotting when they really don’t
> >>> have to do so. Hot spotting that occurs on the initial load of a table is
> >>> OK. It’s when you have a sequential row key that you run into problems
> >>> with hot spotting and regions being only half filled.
> >>
> >> The data being inserted will be a constant stream of time ordered data
> >> so yes, hotspotting will be an issue
> >>
> >>> Adding a random value to give you a bit of randomness now means that
> >>> you can’t do a range scan.
> >>
> >> That's not accurate. To perform a range scan you would just need to
> >> open up N scanners, where N is the number of buckets (random prefixes)
> >> used.
> >>
> >>> Don’t take the modulo, just truncate to the first byte.  Taking the
> >>> modulo is again a dumb idea, but not as dumb as using a salt.
> >>
> >> Well the only reason why I would think using a salt would be
> >> beneficial is to limit the number of scanners when performing a range
> >> scan. See above comment. And yes, performing a range scan will be our
> >> primary read pattern.
> >>
> >> On Sun, May 18, 2014 at 2:36 AM, Michael Segel
> >> <michael_segel@hotmail.com> wrote:
> >>> I think I should dust off my schema design talk… clearly the talks
> >>> given by some of the vendors don’t really explain things …
> >>> (Hmmm. Strata London?)
> >>>
> >>> See my reply below…. Note I used SHA-1. MD5 should also give you
> >>> roughly the same results.
> >>>
> >>> On May 18, 2014, at 4:28 AM, Software Dev <static.void.dev@gmail.com> wrote:
> >>>
> >>>> I recently came across the pattern of adding a salting prefix to the
> >>>> row keys to prevent hotspotting. Still trying to wrap my head around
> >>>> it and I have a few questions.
> >>>>
> >>>
> >>> If you add a salt, you’re prepending a random number to a row key in order
> >>> to avoid hot spotting.  It amazes me that Sematext never went back and
> >>> either removed the blog post or fixed it, and now the bad idea is getting
> >>> propagated.  Adding a random value to give you a bit of randomness now
> >>> means that you can’t do a range scan, or fetch a specific row with a
> >>> single get(), so you’re going to end up boiling the ocean to get your data.
> >>> You’re better off using hive/spark/shark than hbase.
> >>>
> >>> As James tries to point out, you take the hash of the row key so that you
> >>> can easily retrieve the value. But rather than prepend a 160-bit hash, you
> >>> can achieve the same thing by just truncating the hash to the first
> >>> byte in order to get enough randomness to avoid hot spotting. Of course,
> >>> the one question you should ask is why don’t you just take the hash as the
> >>> row key and then have a 160-bit row key (20 bytes raw, or 40 characters
> >>> when hex-encoded)? Then store the actual key as a column in the table.
> >>>
> >>> And then there’s a bigger question… why are you worried about hot
> >>> spotting? Are you adding rows where the row key is sequential?  Or are you
> >>> worried that you are hot spotting when you first start loading rows,
> >>> but the underlying row key is random enough that once the first set of rows
> >>> is added, HBase splitting regions will be enough?
> >>>
> >>>> - Is there ever a reason to salt to more buckets than there are region
> >>>> servers? The only reason why I think that may be beneficial is to
> >>>> anticipate future growth???
> >>>>
> >>> Doesn’t matter.
> >>> Think about how HBase splits regions.
> >>> Don’t take the modulo, just truncate to the first byte.  Taking the
> >>> modulo is again a dumb idea, but not as dumb as using a salt.
> >>>
> >>> Keep in mind that the first character of the hex-encoded hash is going to
> >>> be 0-f (4 bits of the 160-bit hash), so you have 16 values to start with.
> >>> That should be enough.
> >>>
> >>>> - Is it beneficial to always hash against a known number of buckets
> >>>> (i.e. never change the size), so that for any individual row key you can
> >>>> always determine the prefix?
> >>>>
> >>> Your question doesn’t make sense.
> >>>
> >>>> - Are there any good use cases of this pattern out in the wild?
> >>>>
> >>> Yup.
> >>> Deduping data sets.
> >>>
> >>>> Thanks
> >>>>
> >>> NOTE:  Many people worry about hot spotting when they really don’t
> >>> have to do so. Hot spotting that occurs on the initial load of a table is
> >>> OK. It’s when you have a sequential row key that you run into problems
> >>> with hot spotting and regions being only half filled.
> >>>
> >>
> >
>
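
A minimal sketch of the bucketed-prefix approach debated in this thread, assuming a fixed bucket count and using a sorted in-memory dict as a stand-in for an HBase table (no HBase client is involved). The names NUM_BUCKETS, bucket_for, salted_key, scan_ranges and merged_scan are hypothetical. The prefix here is derived from a SHA-1 hash of the key rather than a random number, so a single get() can recompute it; a range scan becomes one scan per bucket, merged on the client.

```python
import hashlib
import heapq

NUM_BUCKETS = 16  # assumption: fixed for the lifetime of the table


def bucket_for(key: bytes, num_buckets: int = NUM_BUCKETS) -> int:
    """Deterministic bucket: first byte of the SHA-1 digest, modulo the bucket count."""
    return hashlib.sha1(key).digest()[0] % num_buckets


def salted_key(key: bytes, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Prepend the one-byte bucket prefix to the original row key."""
    return bytes([bucket_for(key, num_buckets)]) + key


def scan_ranges(start: bytes, stop: bytes, num_buckets: int = NUM_BUCKETS):
    """One (start, stop) pair per bucket, together covering the logical range [start, stop)."""
    return [(bytes([b]) + start, bytes([b]) + stop) for b in range(num_buckets)]


def merged_scan(table: dict, start: bytes, stop: bytes, num_buckets: int = NUM_BUCKETS):
    """Run one scan per bucket over the stand-in table and merge the sorted results.

    `table` maps salted key -> value. Each per-bucket scan yields rows in
    original-key order, so heapq.merge can combine them without a full re-sort.
    """
    sorted_keys = sorted(table)

    def one_scan(lo: bytes, hi: bytes):
        for k in sorted_keys:
            if lo <= k < hi:
                yield k[1:], table[k]  # strip the prefix byte before returning

    scans = [one_scan(lo, hi) for lo, hi in scan_ranges(start, stop, num_buckets)]
    return list(heapq.merge(*scans, key=lambda kv: kv[0]))


# Usage: write with salted keys, read back a logical range in key order.
table = {salted_key(k): k.decode() for k in [b"t001", b"t002", b"t003", b"t010", b"t020"]}
rows = merged_scan(table, b"t002", b"t011")
```

If ordering does not matter (Mike's case above), the merge step can be dropped and rows consumed from whichever bucket scan returns first. Changing NUM_BUCKETS after data is written would change the prefix computed for existing keys, which is why it is treated as fixed here.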
