hbase-issues mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5074) support checksums in HBase block cache
Date Wed, 22 Feb 2012 19:19:51 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213884#comment-13213884

Phabricator commented on HBASE-5074:

stack has commented on the revision "[jira] [HBASE-5074] Support checksums in HBase block

  Answering Dhruba.

  src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:115 Seems like we could have better
names for these methods, ones that give more of a clue as to what they are about.  getBackingFS,

  Maybe you are keeping them generic like this because you will be back in this area again
soon doing another beautiful speedup on top of this checksumming fix (When are we going to do
read-ahead?  Would that speed up scanning?)
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:44 ok. np.
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:49 Ok.  So, two
readers.  Our file count is going to go up?  We should release note this as a side effect of
enabling this feature (previously you may have been well below the xceivers limit, but now you
could go over the top?)  I didn't notice this was going on.  Need to foreground it, I'd say.
  src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:84 I figured.  It's fine
as is.


> support checksums in HBase block cache
> --------------------------------------
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch,
> D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch,
> D1521.6.patch, D1521.7.patch, D1521.7.patch
>
> The current implementation of HDFS stores the data in one block file and the metadata (checksum)
> in another block file. This means that every read into the HBase block cache actually consumes
> two disk iops, one to the datafile and one to the checksum file. This is a major problem for
> scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that
> the storage-hardware offers.
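
For readers following the thread, the idea in the description can be illustrated with a minimal
sketch: if a checksum travels inline with the block bytes, one read returns both the data and
what is needed to verify it, so no second iop to a separate checksum file is required. This is
not the code in the D1521 patches. The class name, the method names, and the
[payload][4-byte CRC32] layout are assumptions made up for illustration; only java.util.zip.CRC32
and java.nio.ByteBuffer are real APIs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration only; not the HBASE-5074 implementation.
public class InlineChecksumSketch {
    // Assumed layout: [payload bytes][4-byte CRC32 of the payload].
    static final int CHECKSUM_BYTES = 4;

    // Append a CRC32 of the payload so data and checksum sit in one contiguous block.
    static byte[] encode(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteBuffer buf = ByteBuffer.allocate(payload.length + CHECKSUM_BYTES);
        buf.put(payload);
        buf.putInt((int) crc.getValue());
        return buf.array();
    }

    // Verify the trailing CRC32 and return the payload; throws if the block is corrupt.
    static byte[] decode(byte[] block) {
        if (block.length < CHECKSUM_BYTES) {
            throw new IllegalArgumentException("block too short");
        }
        int payloadLen = block.length - CHECKSUM_BYTES;
        CRC32 crc = new CRC32();
        crc.update(block, 0, payloadLen);
        int stored = ByteBuffer.wrap(block, payloadLen, CHECKSUM_BYTES).getInt();
        if (stored != (int) crc.getValue()) {
            throw new IllegalStateException("checksum mismatch");
        }
        return Arrays.copyOf(block, payloadLen);
    }

    public static void main(String[] args) {
        byte[] block = encode("row1/cf:q=value".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decode(block), StandardCharsets.UTF_8));

        block[0] ^= 0x01; // flip one bit to simulate on-disk corruption
        try {
            decode(block);
        } catch (IllegalStateException e) {
            System.out.println("corruption detected: " + e.getMessage());
        }
    }
}
```

A single trailing checksum per block is the simplest layout for a sketch; a real format could
just as well checksum fixed-size chunks within a block.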


