hbase-dev mailing list archives

From Fuad Efendi <fefe...@outsideiq.com>
Subject Primary Key Design
Date Wed, 03 Aug 2011 14:25:27 GMT

I am starting to use the following scheme for primary keys:
SHA256(URL) + "-RAW" Primary Key Schema

* PKs in Lily (user-defined) will be prefixed with "USER.", and I can't use a URI,
for instance (it contains dots, which are a special character in current
* In addition to the SHA-256-generated PK, Lily will still use a UUID (which is
really unique) for versioning...
* IMPORTANT: we need to randomize PKs; it is best practice with HBase (data
will be distributed randomly across the cluster)

and I suggest using a similar scheme, SHA256(JSON-Object-in-UTF8) + "-OIQ" (the
tag is a postfix so that we get good "randomization"; in HBase, all data is
physically sorted by PK)
- since all OIQ objects will be stored denormalized as JSON (string type in
Lily) (note, it will be UTF-8 encoded; I believe it is also part of
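
To illustrate the two key schemes, here is a minimal sketch, not taken from Lily or HBase. The class and method names (KeySchemeSketch, rawKey, oiqKey, sortedRows) and the example.com URLs are hypothetical. It uses a TreeMap as a stand-in for HBase's sorted row order (for ASCII hex keys, String ordering matches byte ordering) and assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Lily or HBase.
final class KeySchemeSketch {

    // Lowercase hex SHA-256 of the UTF-8 bytes of the input.
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // always two digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // SHA256(URL) + "-RAW": the hash comes first, the type tag is a postfix.
    static String rawKey(String url) {
        return sha256Hex(url) + "-RAW";
    }

    // SHA256(JSON-Object-in-UTF8) + "-OIQ"
    static String oiqKey(String json) {
        return sha256Hex(json) + "-OIQ";
    }

    // Stores URLs under their hashed keys; iteration order is sorted by key,
    // which is how the rows would be laid out if sorted by PK.
    static TreeMap<String, String> sortedRows(String... urls) {
        TreeMap<String, String> rows = new TreeMap<String, String>();
        for (String url : urls) {
            rows.put(rawKey(url), url);
        }
        return rows;
    }

    public static void main(String[] args) {
        TreeMap<String, String> rows = sortedRows(
                "http://example.com/a",
                "http://example.com/b",
                "http://example.com/c");
        for (Map.Entry<String, String> row : rows.entrySet()) {
            System.out.println(row.getKey() + "  <-  " + row.getValue());
        }
    }
}
```

Because the hash is the leading part of the key, the sort position of a row is determined by the digest rather than by the URL text. If the tag were a prefix instead, all keys of one type would share the same leading bytes and sort next to each other.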


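import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * {@link 
 *
 * @author Fuad
 */
public class SHA256 {

    /** Returns the SHA-256 digest of the given bytes as a lowercase hex string. */
    public static final String SHA256(byte[] bytes) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(bytes); // feed the input; otherwise the digest is of empty input
        byte[] mdbytes = md.digest();

        // convert each byte to two hex digits
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < mdbytes.length; i++) {
            String hex = Integer.toHexString(0xff & mdbytes[i]);
            if (hex.length() == 1) {
                hexString.append('0'); // left-pad single-digit values
            }
            hexString.append(hex);
        }
        return hexString.toString();
    }

    /** Hashes the UTF-8 encoding of the given text. */
    public static final String SHA256(String text)
            throws NoSuchAlgorithmException, UnsupportedEncodingException {
        return SHA256(text.getBytes("UTF-8"));
    }
}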
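
The length check in the hex-conversion loop matters for the key format: Integer.toHexString drops leading zeros, so bytes below 0x10 would produce one digit instead of two and keys would vary in length. Here is a small self-contained sketch of that point, checked against the published SHA-256 test vector for "abc". The class Sha256HexCheck and its method names are hypothetical and are not part of the class above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical check class for illustration only.
final class Sha256HexCheck {

    // Raw SHA-256 digest of the UTF-8 bytes of the input.
    static byte[] digest(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Two hex digits per byte: pads single-digit values with a leading '0'.
    static String hexPadded(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            String hex = Integer.toHexString(0xff & b);
            if (hex.length() == 1) {
                sb.append('0');
            }
            sb.append(hex);
        }
        return sb.toString();
    }

    // Same loop without the padding step, to show what goes wrong.
    static String hexUnpadded(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Integer.toHexString(0xff & b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x0a, (byte) 0xff};
        System.out.println(hexPadded(sample));   // 0aff
        System.out.println(hexUnpadded(sample)); // aff
        System.out.println(hexPadded(digest("abc")));
    }
}
```

With padding, every key has exactly 64 hex characters before the postfix, so keys compare consistently when sorted.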



Fuad Efendi
