Date: Fri, 5 Jul 2013 17:21:54 -0700
Subject: Re: Puzzling behaviour with HBase checksums
From: Varun Sharma
To: "dev@hbase.apache.org"

I just set this value in hbase-site.xml, but the 7-byte reads and lseek(s) still persist.

On Fri, Jul 5, 2013 at 4:22 PM, Ted Yu wrote:

> What value did you set for dfs.client.read.shortcircuit.skip.checksum ?
>
> Cheers
>
> On Fri, Jul 5, 2013 at 2:55 PM, Varun Sharma wrote:
>
> > Hi,
> >
> > We are running hbase with hbase.regionserver.checksum.verify set to true.
> > But we are seeing an equal # of seeks for .meta files on HDFS and data
> > blocks. This is rather puzzling and I don't know if it's broken. The hbase
> > jar is compiled against 2.0.3-alpha and this behaviour occurs for both
> > 0.94.3 and 0.94.7. Short-circuit local reads are enabled and working well,
> > since only the region server is accessing the disk.
> >
> > We run an strace limited to lseek calls and get the following:
> >
> > 28162 lseek(668, 0, SEEK_SET) = 0
> > 28162 lseek(635, 57479463, SEEK_SET) = 57479463
> > 28162 lseek(2255, 0, SEEK_SET) = 0
> > 28162 lseek(1938, 29285843, SEEK_SET) = 29285843
> >
> > Then we use lsof to find the underlying files and match them against the
> > corresponding file descriptors...
> >
> > java 27947 hbase 668u REG 202,32 1048583 36176608 /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/blk_5081211948968918615_597521.meta
> >
> > java 27947 hbase 635u REG 202,32 134217728 36176607 /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/blk_5081211948968918615
> >
> > java 27947 hbase 2255u REG 202,16 802375 32768850 /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/blk_2670783290218647110_614641.meta
> >
> > java 27947 hbase 1938u REG 202,16 102702747 32768849 /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/blk_2670783290218647110
> >
> > The pattern in strace is pretty clear: first the .meta file is read and
> > then the block is accessed. I am wondering if there are other places apart
> > from the checksum where the .meta file for the HDFS block is being
> > accessed, or if the checksum stuff is simply broken? From more strace
> > output, it seems we are reading 7-byte values from these .meta files. Is
> > there a way I can find out if the checksums were actually written out to
> > HFiles in the first place?
> >
> > Thanks
> > Varun
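
The manual correlation described in the quoted message (lseek calls from strace matched against file descriptors from lsof) could be scripted roughly as follows. This is only a sketch, not the tooling used in the thread. It assumes strace lines of the form `PID lseek(fd, offset, whence) = result` and the default lsof column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) with each entry on a single line. The function names `parse_lseeks`, `parse_lsof`, and `summarize` are made up for illustration, and the sample data is copied from the message with emphasis markers removed and wrapped lines joined.

```python
import re

# Matches strace lines such as: "28162 lseek(668, 0, SEEK_SET) = 0"
LSEEK_RE = re.compile(r"^\s*\d+\s+lseek\((\d+),\s*(-?\d+),\s*\w+\)")

SAMPLE_STRACE = """\
28162 lseek(668, 0, SEEK_SET) = 0
28162 lseek(635, 57479463, SEEK_SET) = 57479463
28162 lseek(2255, 0, SEEK_SET) = 0
28162 lseek(1938, 29285843, SEEK_SET) = 29285843
"""

SAMPLE_LSOF = """\
java 27947 hbase 668u REG 202,32 1048583 36176608 /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/blk_5081211948968918615_597521.meta
java 27947 hbase 635u REG 202,32 134217728 36176607 /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/blk_5081211948968918615
java 27947 hbase 2255u REG 202,16 802375 32768850 /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/blk_2670783290218647110_614641.meta
java 27947 hbase 1938u REG 202,16 102702747 32768849 /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/blk_2670783290218647110
"""


def parse_lseeks(strace_text):
    """Return a list of (fd, offset) pairs, one per lseek call found."""
    calls = []
    for line in strace_text.splitlines():
        match = LSEEK_RE.match(line)
        if match:
            calls.append((int(match.group(1)), int(match.group(2))))
    return calls


def parse_lsof(lsof_text):
    """Return a dict mapping numeric fd -> file path (assumed column layout)."""
    fds = {}
    for line in lsof_text.splitlines():
        parts = line.split(None, 8)
        if len(parts) < 9:
            continue
        # The FD column looks like "668u"; non-numeric entries (cwd, mem) are skipped.
        match = re.match(r"(\d+)", parts[3])
        if match:
            fds[int(match.group(1))] = parts[8]
    return fds


def summarize(strace_text, lsof_text):
    """Count lseek calls that hit .meta files versus block data files."""
    fds = parse_lsof(lsof_text)
    counts = {}
    for fd, _offset in parse_lseeks(strace_text):
        path = fds.get(fd)
        if path is None:
            kind = "unknown"
        elif path.endswith(".meta"):
            kind = "meta"
        else:
            kind = "data"
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Calling `summarize(SAMPLE_STRACE, SAMPLE_LSOF)` on the sample gives one count for seeks on .meta files and one for seeks on block files, which is the comparison the message makes by hand.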