hbase-user mailing list archives

From kun yan <yankunhad...@gmail.com>
Subject calculate the record size of HBase
Date Thu, 12 Sep 2013 09:20:40 GMT
Hi all
The output of scan 'test' on the HBase table looks like this:

 010012010114200           column=s:STATION, timestamp=1378892292800, value=00001
 010012010114200           column=s:YEAR, timestamp=1378892292800, value=2010
 010012010114210           column=s:DAY, timestamp=1378892292800, value=14
 010012010114210           column=s:HOUR, timestamp=1378892292800, value=21
 010012010114210           column=s:MINUTE, timestamp=1378892292800, value=0
 010012010114210           column=s:MONTH, timestamp=1378892292800, value=1

I want to calculate the record size:
Fixed part needed by the KeyValue format = Key Length + Value Length + Row
Length + CF Length + Timestamp + Key Type = (4 + 4 + 2 + 1 + 8 + 1) = 20 Bytes

Variable part needed by the KeyValue format = Row + Column Family + Column
Qualifier + Value

Total bytes required = Fixed part + Variable part

1 Column (STATION) = 20 + (15 + 1 + 7 + 5) = 48 Bytes
1 Column (YEAR)    = 20 + (15 + 1 + 4 + 4) = 44 Bytes
1 Column (DAY)     = 20 + (15 + 1 + 3 + 2) = 41 Bytes
1 Column (HOUR)    = 20 + (15 + 1 + 4 + 2) = 42 Bytes
1 Column (MINUTE)  = 20 + (15 + 1 + 6 + 1) = 43 Bytes
1 Column (MONTH)   = 20 + (15 + 1 + 5 + 1) = 42 Bytes

One record needs 48 + 44 + 41 + 42 + 43 + 42 = 260 Bytes.
And I have 2 million records, so the total size is about 520 MB.
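As a sanity check, the per-cell arithmetic above can be sketched in Python. The 20-byte fixed overhead follows the classic KeyValue layout (4-byte key length + 4-byte value length + 2-byte row length + 1-byte CF length + 8-byte timestamp + 1-byte key type), and the cell data is taken from the scan output above; the helper function name is just for illustration:

```python
# Fixed overhead per cell in the KeyValue format:
# key length (4) + value length (4) + row length (2) +
# CF length (1) + timestamp (8) + key type (1) = 20 bytes.
FIXED = 4 + 4 + 2 + 1 + 8 + 1

def keyvalue_size(row, family, qualifier, value):
    """Bytes one cell occupies: fixed overhead plus the variable part."""
    return FIXED + len(row) + len(family) + len(qualifier) + len(value)

# Cells from the scan output above.
cells = [
    ("010012010114200", "s", "STATION", "00001"),
    ("010012010114200", "s", "YEAR",    "2010"),
    ("010012010114210", "s", "DAY",     "14"),
    ("010012010114210", "s", "HOUR",    "21"),
    ("010012010114210", "s", "MINUTE",  "0"),
    ("010012010114210", "s", "MONTH",   "1"),
]

record_bytes = sum(keyvalue_size(*c) for c in cells)
print(record_bytes)                    # -> 260 bytes per record
print(2_000_000 * record_bytes / 1e6)  # -> 520.0 (MB for 2 million records)
```

Note this only estimates the raw KeyValue payload; actual on-disk size will differ because of HFile block structure, block indexes, and any compression or encoding applied to the store files.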
My question is:
Is this calculation method right?
-- 

In the Hadoop world I am just a novice, exploring the entire Hadoop
ecosystem. I hope one day I can contribute my own code.

YanBit
yankunhadoop@gmail.com
