hbase-user mailing list archives

From Asaf Mesika <asaf.mes...@gmail.com>
Subject Re: Possibility of using timestamp as row key in HBase
Date Wed, 19 Jun 2013 20:58:22 GMT
The newly split region might be moved by the load balancer. But aren't you
experiencing classic hot spotting, with only one region server (RS) getting
all the write traffic? Just prepend a single byte to the timestamp and
round-robin each put over the values 1 to the number of region servers.
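
A minimal sketch of that key layout, in plain Python with no HBase client involved. The names NUM_BUCKETS, salted_key and scan_ranges are made up for illustration, and the 8-byte big-endian timestamp encoding is an assumption about how the key is serialized:

```python
import struct

# Assumption: one salt bucket per region server (3 in the example below).
NUM_BUCKETS = 3


def salted_key(timestamp_ms: int, seq: int) -> bytes:
    """Build a row key: 1 salt byte, then the timestamp as 8 big-endian bytes.

    `seq` is a running counter of puts; consecutive puts cycle through
    the salt values 1..NUM_BUCKETS (round robin).
    """
    bucket = seq % NUM_BUCKETS + 1
    return struct.pack(">BQ", bucket, timestamp_ms)


def scan_ranges(start_ms: int, stop_ms: int) -> list:
    """Return the (start_key, stop_key) pair to scan for each salt value.

    Rows for one time window are spread over all buckets, so reading
    the window back takes one range scan per salt value.
    """
    return [
        (struct.pack(">BQ", b, start_ms), struct.pack(">BQ", b, stop_ms))
        for b in range(1, NUM_BUCKETS + 1)
    ]
```

The trade-off is on the read side: a time-range query can no longer be a single scan and has to merge the results of NUM_BUCKETS scans.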

On Wednesday, June 19, 2013, yun peng wrote:

> Hi all,
> Our use case requires persisting a stream into a system like HBase. The
> stream data is in the format <timestamp, value>. In other words, the
> timestamp is used as the row key. We want to explore whether HBase is
> suitable for this kind of data.
> The problem is that the domain of the row key (the timestamp) grows
> constantly. For example, given 3 nodes n1, n2, n3, hosting the row key
> partitions [0,4], [5,9], [10,12] respectively, it is currently the last
> node, n3, that is busy receiving incoming writes (row keys 13 and 14).
> This continues until the region reaches its max size of 5 (that is, the
> partition grows to [10,14]) and potentially splits.
> I am not an expert on HBase splits, but I am wondering whether, after a
> split, the new writes will still go to node n3 (for [10,14]) or whether
> the write stream can be intelligently redirected to another, less busy
> node, like n1.
> In case HBase can't do this, how easy would it be to extend HBase with
> such functionality? Thanks...
> Yun
