hbase-issues mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5074) support checksums in HBase block cache
Date Wed, 22 Feb 2012 23:39:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214107#comment-13214107 ]

Phabricator commented on HBASE-5074:
------------------------------------

stack has commented on the revision "[jira] [HBASE-5074] Support checksums in HBase block
cache".

  Dhruba, have you been running this patch anywhere?

  I'm +1 on commit if tests pass.  If it's not been run anywhere, I can test it locally before
committing.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 Is it odd that we only
take in the minor version here and not major too?
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:861 Why WARN?  This is a
'normal' operation?
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1235 So, yeah, aren't we
doubling the FDs when we do this?  The iops may be the same but the threads floating in the
datanode for reading will double?
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1244 I'm not getting why
no major version in here.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1584 So, again we are defaulting
to true (though it seems that if there are no checksums in the hfiles, we'll flip this flag to off
pretty immediately).
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1589 Smile.  Like now.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1630 Extreme nit: Should
we close the nochecksumistream if it's not going to be used?

  Hmm... now I see we can flip back to using them again later in the stream.
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 Now we have our
own filesystem, we can dump a bunch of crud in there!  We can add things like the hbase.version
check, etc. (joke -- sort of).
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java:86 I'm reluctant
to add stuff to this Interface but I think this method qualifies as important enough to be
allowed in.
  src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:70 Great

REVISION DETAIL
  https://reviews.facebook.net/D1521

                
> support checksums in HBase block cache
> --------------------------------------
>
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch,
D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch,
D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch
>
>
> The current implementation of HDFS stores the data in one block file and the metadata (checksum)
in another block file. This means that every read into the HBase block cache actually consumes
two disk iops, one to the data file and one to the checksum file. This is a major problem for
scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that
the storage hardware offers.
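
The idea described above, keeping the checksum next to the data so that one read fetches both, can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `InlineChecksumSketch`, its methods, and the 4-byte CRC32 trailer layout are hypothetical and are not taken from the HFileBlock code in the patch under review.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; not the HFileBlock implementation from the patch.
final class InlineChecksumSketch {
    // Assumed layout: payload bytes followed by a 4-byte big-endian CRC32 trailer.
    static final int CHECKSUM_BYTES = 4;

    // Returns the payload followed by its CRC32, so both can be fetched in one read.
    static byte[] withChecksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return ByteBuffer.allocate(data.length + CHECKSUM_BYTES)
                .put(data)
                .putInt((int) crc.getValue())
                .array();
    }

    // Recomputes the CRC32 over the payload and compares it with the stored trailer.
    static boolean verify(byte[] block) {
        if (block.length < CHECKSUM_BYTES) {
            return false;
        }
        int payloadLen = block.length - CHECKSUM_BYTES;
        CRC32 crc = new CRC32();
        crc.update(block, 0, payloadLen);
        int stored = ByteBuffer.wrap(block, payloadLen, CHECKSUM_BYTES).getInt();
        return stored == (int) crc.getValue();
    }

    public static void main(String[] args) {
        byte[] block = withChecksum("row-data".getBytes(StandardCharsets.UTF_8));
        System.out.println("intact block verifies: " + verify(block));
        block[0] ^= 0x01; // simulate on-disk corruption of one bit
        System.out.println("corrupted block verifies: " + verify(block));
    }
}
```

With a layout like this, a reader that verifies the trailer itself would not need a second read against a separate checksum file, which is the saving the issue description is after.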

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
