hbase-user mailing list archives

From schausson <schaus...@softera.fr>
Subject Re: Writting bottleneck in HBase ?
Date Wed, 07 Dec 2016 13:24:38 GMT
Hi Ted, thanks for your help!

It seems my explanation was not clear, so let me try again.
My input file contains, say, 2000 parameters, and for each parameter
5000 values recorded over a given timeframe.
I read the file part by part, using a sliding time window: for instance,
I read all parameter values between t0 and t1, which returns
approximately 5 values per parameter. I write this chunk of data to HBase,
then read the file for the next time window (t1 to t2), write that
data to HBase, and so on.
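
To make the chunking concrete, here is a minimal sketch of the windowing step only. The Sample type, the chunkByWindow helper and the fixed window width are assumptions made for illustration; the real reader works on the file incrementally, and the HBase write is left out and only marked by a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SlidingWindowSketch {

    // Hypothetical sample type: one recorded value of one parameter.
    static final class Sample {
        final String parameter;
        final long timestamp;
        final double value;

        Sample(String parameter, long timestamp, double value) {
            this.parameter = parameter;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Groups samples into consecutive windows
    // [t0 + k*width, t0 + (k+1)*width), keyed by the window index k.
    static TreeMap<Long, List<Sample>> chunkByWindow(List<Sample> samples, long t0, long width) {
        TreeMap<Long, List<Sample>> windows = new TreeMap<>();
        for (Sample s : samples) {
            long index = Math.floorDiv(s.timestamp - t0, width);
            windows.computeIfAbsent(index, k -> new ArrayList<>()).add(s);
        }
        return windows;
    }

    public static void main(String[] args) {
        // Small stand-in for the input file: 3 parameters, 20 timestamps each.
        List<Sample> samples = new ArrayList<>();
        for (int p = 0; p < 3; p++) {
            for (long t = 0; t < 20; t++) {
                samples.add(new Sample("param" + p, t, p + t / 10.0));
            }
        }
        // Each chunk is what would be written to HBase as one batch.
        chunkByWindow(samples, 0L, 5L).forEach((index, chunk) ->
                System.out.println("window " + index + ": " + chunk.size() + " samples"));
    }
}
```

With a window width of 5 time units, each chunk carries 5 values per parameter, which matches the order of magnitude described above.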

As for the hashing applied to the rowId, here is the algorithm:

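    // Polynomial string hash: same recurrence as String.hashCode(),
    // but accumulated in a long.
    public long hash(String string) {
        long h = 1125899906842597L; // prime
        int len = string.length();

        for (int i = 0; i < len; i++) {
            h = 31 * h + string.charAt(i); // wraps silently on overflow
        }
        return h;
    }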

From what I understand, this does not guarantee an even distribution...
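
One way to check this concern empirically is to count how many keys fall into each of a fixed number of buckets. The sketch below is only an illustration: the bucket and histogram helpers, the choice of 16 buckets and the "param<i>" key names are assumptions, not part of my actual code or of the HBase API.

```java
import java.util.Arrays;

class RowKeyHashSketch {

    // Same polynomial hash as in the message above.
    static long hash(String string) {
        long h = 1125899906842597L; // prime
        for (int i = 0; i < string.length(); i++) {
            h = 31 * h + string.charAt(i);
        }
        return h;
    }

    // Hypothetical helper: maps a key to a bucket in [0, numBuckets).
    // floorMod keeps the result non-negative even when the hash is negative.
    static int bucket(String key, int numBuckets) {
        return (int) Math.floorMod(hash(key), (long) numBuckets);
    }

    // Counts how many keys land in each bucket.
    static int[] histogram(String[] keys, int numBuckets) {
        int[] counts = new int[numBuckets];
        for (String key : keys) {
            counts[bucket(key, numBuckets)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed key shape: one key per parameter, named "param<i>".
        String[] keys = new String[2000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = "param" + i;
        }
        System.out.println(Arrays.toString(histogram(keys, 16)));
    }
}
```

Running this against the real rowIds, rather than the made-up names used here, would show whether some buckets receive far more keys than others.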

Regards


