hbase-user mailing list archives

From Amandeep Khurana <ama...@gmail.com>
Subject Re: md5 hash key and splits
Date Thu, 30 Aug 2012 23:30:55 GMT
Also, you might have read that an initial loading of data can be better
distributed across the cluster if the table is pre-split rather than
starting with a single region and splitting (possibly aggressively,
depending on the throughput) as the data loads in. Once you are in a stable
state with regions distributed across the cluster, there is really no
benefit in terms of spreading load by managing splitting manually versus
letting HBase do it for you. At that point it's about what Ian mentioned -
predictability of latencies by avoiding splits happening at a busy time.
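
To make the pre-split idea concrete, here is a minimal sketch of how split points could be computed for MD5-based row keys. It assumes the row key is the lowercase hex digest of the MD5 hash (32 characters), so keys are roughly uniform over the hex keyspace. The function names and the sample key are hypothetical, and nothing here calls HBase itself; the resulting list would be supplied as split keys when the table is created.

```python
import bisect
import hashlib


def hex_split_points(num_regions, key_width=32):
    """Return num_regions - 1 evenly spaced split keys.

    Assumes row keys are lowercase hex strings of fixed width
    (32 characters for an MD5 hex digest).
    """
    if num_regions < 2:
        return []
    keyspace = 16 ** key_width
    return [
        format(i * keyspace // num_regions, "0{}x".format(key_width))
        for i in range(1, num_regions)
    ]


def region_index(row_key, split_points):
    """Index of the region a key falls in, treating each split key
    as the inclusive start of the next region."""
    return bisect.bisect_right(split_points, row_key)


# Hypothetical usage: 16 regions for MD5 hex row keys.
splits = hex_split_points(16)
row_key = hashlib.md5(b"user123").hexdigest()
target_region = region_index(row_key, splits)
```

Because MD5 output is close to uniform, evenly spaced split points should give each region a similar share of the initial load. If the row keys are raw 16-byte digests rather than hex strings, the split keys would need to be computed as bytes instead.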

On Thu, Aug 30, 2012 at 4:26 PM, Ian Varley <ivarley@salesforce.com> wrote:

> The Facebook devs have mentioned in public talks that they pre-split their
> tables and don't use automated region splitting. But as far as I remember,
> the reason for that isn't predictability of spreading load, so much as
> predictability of uptime & latency (they don't want an automated split to
> happen at a random busy time). Maybe that's what you mean, Mohit?
> Ian
> On Aug 30, 2012, at 5:45 PM, Stack wrote:
> > On Thu, Aug 30, 2012 at 7:35 AM, Mohit Anchlia <mohitanchlia@gmail.com>
> > wrote:
> > > From what I've read it's advisable to do manual splits since you are able
> > > to spread the load in a more predictable way. If I am missing something
> > > please let me know.
> > Where did you read that?
> > St.Ack
