hbase-user mailing list archives

From Vivek Padmanabhan <vpadmanab...@aryaka.com>
Subject Some Hbase questions
Date Sun, 19 May 2013 15:29:37 GMT

  I am pretty new to HBase, so it would be great if someone could help me out with the questions below.

(Ours is time series data, and all queries will be range scans on composite row keys.)

a) What is the usual practice for storing data types?
   We have noticed that converting data types directly to bytes renders the data unreadable while debugging.
   For ids or int values, we see only the raw byte representation. So for some important columns
   we converted datatype -> characters -> bytes, rather than datatype -> bytes.
   (Maybe we could write a wrapper over the hbase shell to solve this, but is there a simpler way?)
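
   To make the two encodings concrete, here is a minimal sketch in plain Java (no HBase classes;
   the class and method names are made up for illustration). It assumes the direct conversion is
   a 4-byte big-endian int and that keys are compared byte by byte as unsigned values. It also
   shows one thing we would need to watch with the character route: unpadded digits do not sort
   numerically, which matters for range scans.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class, not part of HBase.
class EncodingDemo {
    // datatype -> bytes: fixed-width 4-byte big-endian form (ByteBuffer's default order).
    static byte[] binary(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // datatype -> characters -> bytes: decimal digits as UTF-8, readable when printed.
    static byte[] text(int v) {
        return Integer.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    // Same as text(), but zero-padded to 10 digits so that byte order
    // matches numeric order for non-negative values.
    static byte[] paddedText(int v) {
        return String.format(Locale.ROOT, "%010d", v).getBytes(StandardCharsets.UTF_8);
    }

    // Unsigned lexicographic byte comparison (assumed key ordering).
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Unpadded text: "9" sorts after "10", so a range scan would see 10 before 9.
        System.out.println(compare(text(9), text(10)) > 0);             // true
        // Binary and zero-padded text both keep 9 before 10.
        System.out.println(compare(binary(9), binary(10)) < 0);         // true
        System.out.println(compare(paddedText(9), paddedText(10)) < 0); // true
    }
}
```

   So the readable form costs extra bytes per value (10 instead of 4 here) and needs padding
   to stay scannable, which is why we are unsure it is the right default.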

b) What is the best way to implement operations like AVG, SUM, or some custom formula for real-time
queries: coprocessors, or in-memory computation on the query result?
   (The formula that we apply might change at any time, so storing precomputed results is not an option.)
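
   For the in-memory option, what we have in mind is roughly the sketch below: decode the scan
   result on the client and pass the formula in as a function, so it can change without touching
   stored data. This is plain Java with a made-up `Sample` type standing in for a decoded row;
   it does not use any HBase API and is only meant to show the shape of the idea.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical client-side aggregation; names are illustrative only.
class ClientSideAggregate {
    // Stand-in for one decoded row of a range scan.
    static final class Sample {
        final long timestamp;
        final double value;
        Sample(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // SUM of formula(row) over all rows already fetched into memory.
    static double sum(List<Sample> rows, ToDoubleFunction<Sample> formula) {
        double total = 0.0;
        for (Sample s : rows) {
            total += formula.applyAsDouble(s);
        }
        return total;
    }

    // AVG of formula(row); NaN when the scan returned no rows.
    static double avg(List<Sample> rows, ToDoubleFunction<Sample> formula) {
        return rows.isEmpty() ? Double.NaN : sum(rows, formula) / rows.size();
    }

    public static void main(String[] args) {
        List<Sample> rows = Arrays.asList(
                new Sample(1L, 1.0), new Sample(2L, 2.0), new Sample(3L, 3.0));
        System.out.println(sum(rows, s -> s.value));       // 6.0
        System.out.println(avg(rows, s -> s.value));       // 2.0
        // The formula can be swapped at call time, e.g. doubling each value.
        System.out.println(sum(rows, s -> s.value * 2.0)); // 12.0
    }
}
```

   The obvious cost is that every row in the range has to travel to the client first, which is
   what makes us wonder whether coprocessors are the better fit.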

c) We are planning to start off with a four-node cluster running both HBase and MR jobs.
   I have heard that it is not recommended to run both HBase and MR on the same cluster, but I would
   like to understand what the possible bottlenecks could be.

   (We plan to run MR on HDFS and MR on HBase. Most of our MR jobs are IO bound rather than CPU bound.)
