hbase-issues mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5074) support checksums in HBase block cache
Date Tue, 06 Mar 2012 01:54:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222899#comment-13222899 ]

Phabricator commented on HBASE-5074:
------------------------------------

mbautin has accepted the revision "[jira] [HBASE-5074] Support checksums in HBase block cache".

  @dhruba: looks good! A few minor comments inline.

  Also, I still think there is some code duplication between TestHFileBlock and TestHFileBlockCompatibility
that we could get rid of, but we can do that in a separate patch.

  Could you please attach the final patch to the JIRA and run it on Hadoop QA?

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:48 s/do do/do/
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1242 do do -> do
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:161-174
It would be great to factor out the common part of this hard-coded gzip blob so that it is
not repeated in TestHFileBlock and here.

  This is an example of what I meant in my comment regarding code duplication.

  Alternatively, we can remove code duplication in a follow-up patch.

REVISION DETAIL
  https://reviews.facebook.net/D1521

BRANCH
  svn

                
> support checksums in HBase block cache
> --------------------------------------
>
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.94.0
>
>         Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch,
D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch,
D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch,
D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch,
D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch
>
>
> The current implementation of HDFS stores the data in one block file and the metadata (checksum)
> in another block file. This means that every read into the HBase block cache actually consumes
> two disk I/Os, one to the data file and one to the checksum file. This is a major problem for
> scaling HBase, because HBase is usually bottlenecked on the number of random disk IOPS that
> the storage hardware offers.
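
To illustrate the general idea behind the issue title, here is a minimal sketch of a block that carries its own checksum, so that a single read returns both the data and the bytes needed to verify it. The layout (payload followed by a 4-byte CRC32 trailer) and the class and method names are assumptions made for this example; they are not taken from the D1521 patch or from the HFile format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch, not HBase code: a block laid out as [payload][4-byte CRC32].
final class InlineChecksumSketch {
    static final int CHECKSUM_BYTES = 4;

    // Appends the CRC32 of the payload so data and checksum live in one buffer.
    static byte[] encode(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + CHECKSUM_BYTES)
                .put(payload)
                .putInt((int) crc.getValue())
                .array();
    }

    // Recomputes the CRC32 over the payload and compares it with the stored trailer.
    static boolean verify(byte[] block) {
        if (block.length < CHECKSUM_BYTES) {
            return false;
        }
        int payloadLength = block.length - CHECKSUM_BYTES;
        CRC32 crc = new CRC32();
        crc.update(block, 0, payloadLength);
        int stored = ByteBuffer.wrap(block, payloadLength, CHECKSUM_BYTES).getInt();
        return stored == (int) crc.getValue();
    }

    public static void main(String[] args) {
        byte[] block = encode("row1:value1".getBytes(StandardCharsets.UTF_8));
        System.out.println("intact block verifies: " + verify(block));
        block[0] ^= 0x01; // simulate a single flipped bit on disk
        System.out.println("corrupted block verifies: " + verify(block));
    }
}
```

Because the checksum sits next to the data it covers, the reader in this sketch needs only the one buffer to detect corruption, which is the property the description is asking for.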

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
