Date: Thu, 2 Feb 2012 18:42:56 +0000 (UTC)
From: "Phabricator (Commented) (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-5074) support checksums in HBase block cache

    [ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199073#comment-13199073 ]

Phabricator commented on HBASE-5074:
------------------------------------

dhruba has commented on the revision "[jira] [HBASE-5074] Support checksums in HBase block cache".

INLINE COMMENTS

src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java:206-207
Will fix.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:1072
Sure.

src/main/java/org/apache/hadoop/hbase/HConstants.java:598
I will make this part of the code cleaner. I am still hoping to keep only one knob: whether to verify HBase checksums or not. If HBase checksums are switched on, HDFS checksums will automatically be switched off; if HBase checksums are configured off, HDFS checksums will automatically be switched on. The other combinations (no checksums at all, or both checksums) are not useful in any production environment, and I would like to keep the code complexity a little lower by avoiding them. Hope that is ok with you.

src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3597
Good idea, will do.

src/main/java/org/apache/hadoop/hbase/util/ChecksumByteArrayOutputStream.java:31
I tried this, but it needs a few changes, so I ended up needing my own object wrapper over DataOutputBuffer.

src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java:39
I too feel that we should add the checksum type to the HFileBlock header. That will make us future-proof for trying new checksum algorithms later. I will make this change; a rough sketch of the idea follows below.
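To make the idea concrete, here is a minimal sketch of tagging each block with its checksum algorithm. The enum codes, field names, and header layout below are illustrative assumptions only, not the actual HFileBlock on-disk format:

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative sketch only: carry the checksum algorithm in the block header
// so new algorithms can be added later without another disk-format change.
// The codes and field order here are hypothetical, not the real HFileBlock layout.
public class ChecksumHeaderSketch {

  /** One byte in the header identifies the checksum algorithm for the block. */
  public enum ChecksumType {
    NULL((byte) 0),    // checksums disabled for this block
    CRC32((byte) 1),   // java.util.zip.CRC32
    CRC32C((byte) 2);  // a possible future algorithm

    private final byte code;
    ChecksumType(byte code) { this.code = code; }
    public byte getCode() { return code; }
  }

  /** Serialize the size fields plus the checksum-type tag. */
  public static byte[] writeHeader(int onDiskDataSizeWithHeader,
      int checksumBytes, ChecksumType type) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(baos);
    // onDiskSizeWithHeader = onDiskDataSizeWithHeader + checksum bytes
    out.writeInt(onDiskDataSizeWithHeader + checksumBytes);
    out.writeInt(onDiskDataSizeWithHeader);
    out.writeByte(type.getCode());  // readers can pick the matching verifier
    out.flush();
    return baos.toByteArray();
  }
}

With a tag like this in the header, a reader built today can at least recognize (and reject cleanly) a block written with an algorithm it does not know.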
src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:132-133
This is equivalent to the existing FileSystem.get(), and many places in HBase use it.

src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:80
I will make this public so that users can create an HFileSystem object on a non-default path.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:102
I am making changes here based on Mikhail's suggestion too.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:229
As you will see, the existing code path that creates an HFileBlock using this constructor uses it only for in-memory caching, so it never fills in or uses the onDiskDataSizeWithHeader field. But I will set it to what you propose.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:252
onDiskSizeWithHeader = onDiskDataSizeWithHeader + the number of checksum bytes.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:751
I am in complete agreement with you. I wish I could have used the Hadoop trunk code here. Unfortunately, HBase pulls in Hadoop 1.0, which does not have this implementation. Another option is to copy this code from Hadoop into HBase, but that has its own maintainability problems. I am hoping that HBase will move to Hadoop 2.0 very soon, and then we can start on the more optimal checksum implementation. Hope that is ok with you.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1401-1402
This needs to be thread safe.

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1634
This is an internal method, and this error is handled by upper layers (by switching off HBase checksums). So I am following the paradigm of using exceptions only when true errors happen; I would like to avoid writing code that generates exceptions in one layer only to catch and handle them in another. The discussion with Doug Cutting on the hdfs-symlink patch is etched in my mind :-)

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1888
I will work (in a later patch) on using bulk checksum verification, native code, etc. (from Hadoop). I would like to keep this patch smaller than it already is by focusing on the disk format change, compatibility with older versions, etc. The main reason is that most of the Hadoop checksum optimizations are only in Hadoop 2.0. I am hoping that is ok with you.

REVISION DETAIL
  https://reviews.facebook.net/D1521


> support checksums in HBase block cache
> --------------------------------------
>
>                 Key: HBASE-5074
>                 URL: https://issues.apache.org/jira/browse/HBASE-5074
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: D1521.1.patch, D1521.1.patch
>
>
> The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops: one to the data file and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers.
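To illustrate the approach described above (checksums stored inline with the data, so a single contiguous read fetches both payload and checksums in one iop), here is a hedged sketch of per-chunk CRC32 verification. The one-4-byte-CRC-per-512-byte-chunk layout and the names are assumptions for illustration, not HBase's actual format:

import java.util.zip.CRC32;

// Illustrative sketch only: verify checksums stored inline with the block
// data, so one contiguous disk read covers both payload and checksums (one
// iop) instead of a second read against a separate checksum file.
// The layout (one 4-byte CRC32 per 512-byte chunk) is an assumption.
public class InlineChecksumSketch {

  static final int BYTES_PER_CHECKSUM = 512;  // chunk size, assumed

  /** Returns true if every chunk's stored checksum matches a recomputed CRC32. */
  public static boolean verify(byte[] data, int dataLen, byte[] checksums) {
    CRC32 crc = new CRC32();
    int chunk = 0;
    for (int off = 0; off < dataLen; off += BYTES_PER_CHECKSUM) {
      int len = Math.min(BYTES_PER_CHECKSUM, dataLen - off);
      crc.reset();
      crc.update(data, off, len);
      if ((int) crc.getValue() != readInt(checksums, chunk * 4)) {
        return false;  // caller would fall back to HDFS-level checksums
      }
      chunk++;
    }
    return true;
  }

  private static int readInt(byte[] b, int off) {
    return ((b[off] & 0xff) << 24) | ((b[off + 1] & 0xff) << 16)
        | ((b[off + 2] & 0xff) << 8) | (b[off + 3] & 0xff);
  }
}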