hbase-issues mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5074) support checksums in HBase block cache
Date Thu, 23 Feb 2012 22:11:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215108#comment-13215108
] 

Phabricator commented on HBASE-5074:
------------------------------------

dhruba has commented on the revision "[jira] [HBASE-5074] Support checksums in HBase block
cache".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 This constructor is used
only for V2, hence the major number is not a parameter.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1235 I think there won't
be any change to the number of threads in the datanode. A datanode thread is not tied to
a client FileSystem object. Instead, a global pool of threads in the datanode is free
to serve any read request from any client.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1244 The minor version indicates
disk-format changes inside an HFileBlock. The major version indicates disk-format changes
within an entire HFile. Since the AbstractFSReader only reads HFileBlocks, it is logical
that it contains the minorVersion, is it not?

  But I can put in the majorVersion in it as well, if you so desire.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1584 Yes, the default is
to enable hbase-checksum verification. And you are right that if the hfile is of the older
type, then we will quickly flip this back to false (in the next line).
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1630 I think we should keep
both streams active till the HFile itself is closed.
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1646 done

  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 Yes, precisely.
Going forward, I would like to see if we can make HLogs go to a filesystem object that is
different from the filesystem used for hfiles.
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java:86 I agree
with you completely. This is an interface that should not change often.
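
The reply on HFileBlock.java:1584 describes a default-on flag that is switched off for older hfiles. A minimal sketch of that decision follows. It is not the code from the patch: the class name, the method name, and the threshold constant are hypothetical, and the assumption that "older type" can be detected by comparing the block minor version is labeled in the code.

```java
/** Hypothetical sketch, not the code from D1521. */
final class ChecksumPolicySketch {

    // Assumption: blocks written with a minor version at or above this value
    // carry HBase-level checksums; older blocks do not.
    static final int MINOR_VERSION_WITH_CHECKSUM = 1;

    /**
     * Verification defaults to the configured value (enabled by default per the
     * review reply) and falls back to false when the file is of the older type.
     */
    static boolean useHBaseChecksum(boolean configuredDefault, int minorVersion) {
        return configuredDefault && minorVersion >= MINOR_VERSION_WITH_CHECKSUM;
    }
}
```

With this sketch, an older file (minor version 0) never attempts hbase-checksum verification, regardless of the configured default.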

REVISION DETAIL
  https://reviews.facebook.net/D1521

                
> support checksums in HBase block cache
> --------------------------------------
>
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch,
D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch,
D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch
>
>
> The current implementation of HDFS stores the data in one block file and the metadata (checksum)
in another block file. This means that every read into the HBase block cache actually consumes
two disk iops, one to the datafile and one to the checksum file. This is a major problem for
scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that
the storage-hardware offers.
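
To illustrate the idea behind the issue, the sketch below keeps checksums in the same buffer as the data, so a single read returns both. It is an illustration under stated assumptions, not the on-disk format from the patch: the class and method names are hypothetical, CRC32 is an assumed checksum type, and the layout (data followed by one 4-byte checksum per chunk) is chosen only for simplicity.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical sketch: per-chunk CRC32 values stored alongside the block data. */
final class InlineChecksumSketch {

    /** Returns the data followed by one 4-byte CRC32 per chunk of bytesPerChecksum bytes. */
    static byte[] withChecksums(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        ByteBuffer out = ByteBuffer.allocate(data.length + 4 * chunks);
        out.put(data);
        for (int off = 0; off < data.length; off += bytesPerChecksum) {
            int len = Math.min(bytesPerChecksum, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);
            out.putInt((int) crc.getValue());
        }
        return out.array();
    }

    /** Recomputes each chunk's CRC32 and compares it with the stored value. */
    static boolean verify(byte[] block, int dataLength, int bytesPerChecksum) {
        ByteBuffer sums = ByteBuffer.wrap(block, dataLength, block.length - dataLength);
        for (int off = 0; off < dataLength; off += bytesPerChecksum) {
            int len = Math.min(bytesPerChecksum, dataLength - off);
            CRC32 crc = new CRC32();
            crc.update(block, off, len);
            if (sums.remaining() < 4 || sums.getInt() != (int) crc.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

Because the checksums travel with the data, a reader that fetches the block can verify it without a second read from a separate checksum file.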

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
