hbase-dev mailing list archives

From Varun Sharma <va...@pinterest.com>
Subject Puzzling behaviour with HBase checksums
Date Fri, 05 Jul 2013 21:55:25 GMT

We are running HBase with hbase.regionserver.checksum.verify set to true,
but we are seeing an equal number of seeks for .meta files on HDFS and for
data blocks. This is rather puzzling, and I don't know if it's broken. The
HBase jar is compiled against Hadoop 2.0.3-alpha, and this behaviour occurs
with both 0.94.3 and 0.94.7. Short-circuit local reads are enabled and
working well, since only the region server is accessing the disk.

We run an strace limited to lseek calls and get the following:

28162 lseek(668, 0, SEEK_SET)           = 0
28162 lseek(635, 57479463, SEEK_SET)    = 57479463
28162 lseek(2255, 0, SEEK_SET)          = 0
28162 lseek(1938, 29285843, SEEK_SET)   = 29285843

Then we use lsof to find the underlying files and match them against the
corresponding file descriptors:

java    27947 hbase   668u   REG   202,32     1048583 36176608
java    27947 hbase   635u   REG   202,32   134217728 36176607
java    27947 hbase  2255u   REG   202,16      802375 32768850
java    27947 hbase  1938u   REG   202,16   102702747 32768849
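
The matching of strace file descriptors against lsof entries can also be scripted. What follows is only a minimal sketch in Python, assuming the two outputs are pasted in as text exactly as shown above (no lsof NAME column). The helper names parse_lsof and join_seeks are hypothetical, chosen for illustration.

```python
import re

# Sample data copied from the strace and lsof listings in this message.
STRACE = """\
28162 lseek(668, 0, SEEK_SET)           = 0
28162 lseek(635, 57479463, SEEK_SET)    = 57479463
28162 lseek(2255, 0, SEEK_SET)          = 0
28162 lseek(1938, 29285843, SEEK_SET)   = 29285843
"""

LSOF = """\
java    27947 hbase   668u   REG   202,32     1048583 36176608
java    27947 hbase   635u   REG   202,32   134217728 36176607
java    27947 hbase  2255u   REG   202,16      802375 32768850
java    27947 hbase  1938u   REG   202,16   102702747 32768849
"""

# Matches lines such as: 28162 lseek(635, 57479463, SEEK_SET) = 57479463
LSEEK_RE = re.compile(r"^\d+\s+lseek\((\d+),\s*(\d+),\s*SEEK_SET\)\s*=\s*\d+")


def parse_lsof(text):
    """Map fd number -> (size_in_bytes, inode) for each lsof line.

    Assumes the column order COMMAND PID USER FD TYPE DEVICE SIZE NODE.
    Any '*' emphasis markers left over from the mail archive are stripped.
    """
    table = {}
    for line in text.replace("*", "").splitlines():
        cols = line.split()
        if len(cols) < 8:
            continue  # skip blank or truncated lines
        fd = int(cols[3].rstrip("urw"))  # "668u" -> 668
        table[fd] = (int(cols[6]), int(cols[7]))
    return table


def join_seeks(strace_text, lsof_text):
    """Return (fd, seek_offset, file_size, inode) for each lseek call, in order."""
    files = parse_lsof(lsof_text)
    rows = []
    for line in strace_text.replace("*", "").splitlines():
        m = LSEEK_RE.match(line.strip())
        if m is None:
            continue
        fd, offset = int(m.group(1)), int(m.group(2))
        size, inode = files.get(fd, (None, None))
        rows.append((fd, offset, size, inode))
    return rows


if __name__ == "__main__":
    for fd, offset, size, inode in join_seeks(STRACE, LSOF):
        print(f"fd={fd:<5} seek_offset={offset:<10} file_size={size:<10} inode={inode}")
```

Each output row pairs a seek with the size of the file behind that descriptor, which makes the alternation between a small file sought to offset 0 and a large file sought to a non-zero offset easy to see.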

The pattern in strace is pretty clear: first the .meta file is read, and
then the block is accessed. I am wondering whether there are other places,
apart from checksum verification, where the .meta file for the HDFS block is
accessed, or whether the checksum stuff is simply broken. From more strace
output, it seems we are reading 7-byte values from these .meta files. Is
there a way I can find out whether the checksums were actually written out
to the HFiles in the first place?

