hbase-dev mailing list archives

From Fuad Efendi <fefe...@outsideiq.com>
Subject Primary Key Design
Date Wed, 03 Aug 2011 14:25:27 GMT
Hi,


I am starting to use the following scheme for primary keys:
SHA256(URL) + "-RAW" Primary Key Schema
<https://outsideiq.jira.com/browse/CA-107>



RATIONALE:
* User-defined PKs in Lily will be prefixed with "USER.", and I can't use a
URI directly, for instance (it contains dots, which are a special character
in the current version)
* In addition to the SHA-256-generated PK, Lily will still use a UUID (which
is really unique) for versioning...
* IMPORTANT: we need to randomize PKs; it is a best practice with HBase (data
will be randomly distributed across the cluster)

I also suggest using a similar scheme, SHA256(JSON-Object-in-UTF8) + "-OIQ"
(the tag is a postfix so that we keep good "randomization"; in HBase, all
data is physically sorted by PK), since all OIQ objects will be stored
denormalized as JSON (string type in Lily). Note that the JSON will be UTF-8
encoded; I believe that is also part of the ECMA specs.
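
To make the two key shapes concrete, here is a minimal sketch of how they
could be built. The class name RowKeys and the helpers sha256Hex, rawKey and
oiqKey are made up for illustration (they are not part of Lily or HBase), and
the sketch assumes the key is the lowercase hex digest followed by the tag.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only; not a Lily or HBase API.
final class RowKeys {

    private RowKeys() {}

    // Lowercase hex SHA-256 of the UTF-8 bytes of the input (64 characters).
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes do not sign-extend.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // SHA256(URL) + "-RAW": the tag goes last so the hash leads the sort order.
    static String rawKey(String url) {
        return sha256Hex(url) + "-RAW";
    }

    // SHA256(JSON-Object-in-UTF8) + "-OIQ", built the same way.
    static String oiqKey(String json) {
        return sha256Hex(json) + "-OIQ";
    }
}
```

Because the hex digest contains only the characters 0-9 and a-f, a key built
this way has no dots, and two URLs that differ in a single character should
end up with unrelated leading characters, which is what spreads them across
the sorted key space.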




import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * @see <a href="http://stackoverflow.com/questions/221165/pros-and-cons-of-using-md5-hash-of-uri-as-the-primary-key-in-a-database">Pros and cons of using MD5 hash of URI as the primary key in a database</a>
 *
 * @author Fuad
 */
public class SHA256 {

    /** Returns the SHA-256 digest of the given bytes as a lowercase hex string. */
    public static final String SHA256(byte[] bytes) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(bytes);
        byte[] mdbytes = md.digest();

        // Convert each byte to two hex digits, padding with a leading zero.
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < mdbytes.length; i++) {
            String hex = Integer.toHexString(0xff & mdbytes[i]);
            if (hex.length() == 1) {
                hexString.append('0');
            }
            hexString.append(hex);
        }
        return hexString.toString();
    }

    /** Hashes the UTF-8 encoding of the given text. */
    public static final String SHA256(String text)
            throws NoSuchAlgorithmException, UnsupportedEncodingException {
        return SHA256(text.getBytes("UTF-8"));
    }
}











-- 
Fuad Efendi





