hbase-user mailing list archives

From Fernando Padilla <f...@alum.mit.edu>
Subject key hashing?
Date Mon, 27 Jul 2009 19:01:46 GMT
I will be generating lots of rows in the db, keyed by userId and 
inserted in userId order.

I have already learned through this mailing list that this use case is 
not ideal, since it means most row inserts will land on one region 
server.  I know that some people suggest adding some randomization to 
the keys so that the writes get spread around, but I can't do that, since 
I'll need to be able to do random-access lookups on the rows via userId.


But I'm wondering if I could map/hash the real userId into another 
number that will spread the inserts around.  I could still do random-access 
lookups given a real userId, by calculating the hash.
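
A minimal sketch of this idea, assuming the row key is built as a short 
hash prefix followed by the original 8-byte userId. The class name 
HashedRowKey, the methods rowKey and userIdOf, and the 4-byte prefix 
length are all hypothetical choices for illustration, not part of HBase. 
Because the prefix is computed from the userId alone, the same key can be 
recomputed for a lookup, and the userId can be read back out of the key.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not an HBase API: derives a row key from a userId.
final class HashedRowKey {
    // Number of leading MD5 bytes used as the prefix (an arbitrary assumption).
    static final int PREFIX_LEN = 4;

    // Row key layout: [first PREFIX_LEN bytes of MD5(userId)] + [8-byte big-endian userId].
    static byte[] rowKey(long userId) {
        byte[] id = ByteBuffer.allocate(Long.BYTES).putLong(userId).array();
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5").digest(id);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        byte[] key = new byte[PREFIX_LEN + id.length];
        System.arraycopy(digest, 0, key, 0, PREFIX_LEN);
        System.arraycopy(id, 0, key, PREFIX_LEN, id.length);
        return key;
    }

    // Recovers the original userId from a key produced by rowKey().
    static long userIdOf(byte[] rowKey) {
        return ByteBuffer.wrap(rowKey, PREFIX_LEN, Long.BYTES).getLong();
    }
}
```

The resulting bytes would be used as the row key for both the insert and 
the later lookup. The trade-off is that rows are no longer stored in 
userId order, so a range scan over consecutive userIds is not possible 
with this layout.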



1) I think I like this idea. Does anyone have any experience with this?

2) Assuming userId is an 8-byte long, what would be some good hashing 
functions?  I would be lazy and just write the bytes in little-endian 
order, but I bet one of you could come up with something better. :)
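
To make the two options concrete, here is a rough sketch comparing the 
"lazy" byte reversal with a stronger bit mixer. The class name 
UserIdMixer and its method names are hypothetical. The mixer shown is the 
64-bit finalizer from MurmurHash3, chosen here only as one well-known 
example; it is not something the list has recommended. Both mappings are 
one-to-one on 64-bit values, so two different userIds never map to the 
same key.

```java
// Hypothetical helper comparing two reversible 64-bit mappings of a userId.
final class UserIdMixer {
    // The "lazy" option: reverse the byte order, so the byte that changes
    // fastest in a sequential id becomes the leading byte of the key.
    static long reverseBytes(long userId) {
        return Long.reverseBytes(userId);
    }

    // MurmurHash3's 64-bit finalizer (fmix64). Every step is invertible
    // (xor-shift, multiplication by an odd constant), so no collisions occur.
    static long mix(long userId) {
        long k = userId;
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }
}
```

With byte reversal, sequential userIds cycle through the 256 possible 
leading byte values, which may already be enough to spread writes across 
regions. The mixer scatters all 64 bits, at the cost of a key that is 
harder to read by eye.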

