hbase-user mailing list archives

From k8 robot <robot...@gmail.com>
Subject Re: Is it necessary to set MD5 on rowkey?
Date Wed, 06 Feb 2013 01:46:43 GMT
Mike, sometimes hashing is not an option. Consider the case where, instead
of a monotonically increasing rowkey such as a timestamp, you have a
composite rowkey of, say, userId+someOtherId, and the use case is to look up
data by a specific userId. If the data distribution by userId is uneven,
with a large number of records for only a few userIds (say 1 million records
for some userIds and only a few hundred for others), then a randomly
generated salt would be used to distribute the writes evenly among the
region servers.
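
A rough sketch of the salting idea, in plain Python rather than the HBase
client API. The bucket count, the "|" key layout, and the names NUM_BUCKETS,
salted_key and bucket_counts are assumptions made for illustration only:

```python
import random
from collections import Counter

NUM_BUCKETS = 8  # assumed number of salt buckets, e.g. one per region


def salted_key(user_id, other_id, rng):
    """Prefix the composite key with a randomly chosen salt bucket."""
    salt = rng.randrange(NUM_BUCKETS)
    return f"{salt:02d}|{user_id}|{other_id}"


def bucket_counts(keys):
    """Count how many keys fall under each salt prefix."""
    return Counter(key.split("|", 1)[0] for key in keys)


rng = random.Random(42)
# One heavy user with 100,000 records: the random salt spreads
# that user's writes across all buckets instead of one key range.
heavy_keys = [salted_key("user42", i, rng) for i in range(100_000)]
counts = bucket_counts(heavy_keys)
```

The cost of a random salt shows up on the read side: under this layout,
fetching everything for one userId means one scan per salt bucket, because
the salt cannot be recomputed from the userId.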

In this scenario, hashing does no good. It does not prevent the hotspotting
caused by a heavy userId at all: every record for a given userId hashes to
the same prefix, so the hashed userIds have the same uneven distribution as
the original userIds.
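
To make the point concrete, here is a small sketch with made-up numbers. The
region count, the first-byte split of the hash space, and the name
region_for are assumptions for illustration, not how HBase assigns regions:

```python
import hashlib
from collections import Counter

NUM_REGIONS = 8  # assumed regions, splitting the hash space evenly


def region_for(user_id):
    """Pick a region from the first byte of MD5(userId) (illustrative split)."""
    first_byte = hashlib.md5(user_id.encode("utf-8")).digest()[0]
    return first_byte * NUM_REGIONS // 256


# Skewed workload: one user with 1,000,000 records, 1,000 users with 300 each.
records_per_user = {"heavy_user": 1_000_000}
records_per_user.update({f"user{i}": 300 for i in range(1000)})

load = Counter()
for user_id, n in records_per_user.items():
    # Every record of a user shares the same hashed prefix,
    # so all n records land in the same region.
    load[region_for(user_id)] += n

hot_region, hot_load = load.most_common(1)[0]
```

Whichever region the heavy userId hashes into still receives all of its
records, so the skew simply moves to a different place in the key space.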

~Kate
