From: Tomaz Logar
Date: Tue, 20 Dec 2011 21:21:38 +0100
To: dev@hbase.apache.org
Subject: Re: Table refuses a scan of old data, but not new

Hey, Todd.

The relevant clipping is:

---
Version: 0.90.4-cdh3u2
..........
Number of Tables: 15
Number of live region servers: 8
Number of dead region servers: 0
.ERROR: Version file does not exist in root dir file:/tmp/hbase-ta/hbase
Number of empty REGIONINFO_QUALIFIER rows in .META.: 0
ERROR: Region xxx found in META, but not in HDFS, and deployed on nx (times 48)
Summary:
table is okay.
Number of regions: 48
Deployed on: n1, n2, ... n8
---

The summary says the table in question is OK. For every region I get "found in META, but not in HDFS", but that seems to be a false positive, as it is reported for all of them (11k+) and the other tables work fine. And the files are in HDFS, of course.

No mention of any specific region being broken... :(

T.

On 20.12.2011 18:56, Todd Lipcon wrote:
> Hi Tomaz,
>
> What does "hbase hbck" report? Maybe you have a broken region of sorts?
>
> -Todd
>
> On Tue, Dec 20, 2011 at 9:45 AM, Tomaz Logar wrote:
>> Hello, everybody.
>>
>> I hit a strange snag in HBase today. I have a table with 48 regions spread
>> over 8 regionservers. It grows by about one region per day. It's like 6M
>> small (30-100 bytes each) records at the moment, 3.2G of Snappy-encoded data
>> on disks.
>>
>> What happened is that suddenly I can't scan over any previously inserted
>> data in just one table.
>> Freshly put data seems to be ok:
>>
>> ---
>> hbase(main):035:0> put 'table', "\x00TEST", "*:t", "TEST"
>> 0 row(s) in 0.0300 seconds
>>
>> hbase(main):041:0* scan 'table', {STARTROW=>"\x00TEST", LIMIT=>2}
>> ROW COLUMN+CELL
>> \x00TEST column=*:t, timestamp=1324392041600, value=TEST
>> ERROR: java.lang.RuntimeException:
>> org.apache.hadoop.hbase.regionserver.LeaseException:
>> org.apache.hadoop.hbase.regionserver.LeaseException: lease
>> '-1785731371547934030' does not exist
>> ---
>>
>> So scan gets the record I put just before, but times out on old record that
>> comes right after it. :(
>>
>> If I target an old record I don't even get an exception, just a huge
>> timeout, no exception in regionserver log either:
>> ---
>> hbase(main):049:0> scan 'table', {STARTROW=>"0ua", LIMIT=>1}
>> ROW COLUMN+CELL
>> 0 row(s) in 146.2210 seconds
>> ---
>>
>> It may be relevant that I'm getting these on another, much bigger (3T
>> Snappy, 7+B records), yet working table:
>> ---
>> 11/12/20 17:50:37 WARN ipc.HBaseServer: IPC Server Responder, call
>> next(-15185895745499515, 1) from 192.168.32.192:64307: output error
>> 11/12/20 17:50:37 WARN ipc.HBaseServer: IPC Server handler 5 on 60020
>> caught: java.nio.channels.ClosedChannelException
>> 11/12/20 17:32:43 WARN snappy.LoadSnappy: Snappy native library is available
>> ---
>> But these scans seem to recover while map-reducing.
>>
>> I'm running hbase-0.90.4-cdh3u2 from Cloudera SCM bundle on mixed nodes (5 *
>> 2 core 4G RAM, 3 * 12 core 16G RAM) with 1.5G RAM allocated for each HBase
>> regionserver.
>>
>>
>> Can anyone share some wisdom? Anyone got a similar half-broken problem
>> solved before?
>>
>>
>> Thanks,
>>
>> T.
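
With 11k+ regions, reading the hbck ERROR lines by eye is impractical, so a small tally by message shape can show whether one complaint covers every region or only a few. The sketch below is only an illustration: the function name `summarize_hbck_errors` is hypothetical, and the line shape it matches ("Region ... found in META ... deployed on ...") is assumed from the paraphrased clipping in this thread, so real hbck output may need different patterns.

```python
import re
from collections import Counter


def summarize_hbck_errors(report: str) -> Counter:
    """Count ERROR lines in saved hbck output, grouped by message shape.

    Assumption: lines look like the clipping quoted in this thread.
    """
    counts = Counter()
    for line in report.splitlines():
        # hbck prints progress dots, so "ERROR:" may not start the line.
        _, sep, message = line.partition("ERROR:")
        if not sep:
            continue  # not an error line
        message = message.strip()
        # Replace per-region details with placeholders so that
        # identical complaints are counted together.
        message = re.sub(r"^Region .+? found in META",
                         "Region <name> found in META", message)
        message = re.sub(r"deployed on \S+$",
                         "deployed on <server>", message)
        counts[message] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python tally.py < hbck-output.txt
    for message, n in summarize_hbck_errors(sys.stdin.read()).most_common():
        print(f"{n:6d}  {message}")
```

If a single message accounts for every region of every table, that points at something global to the hbck run rather than at one damaged region.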