Date: Fri, 6 Aug 2010 14:59:57 -0700 (PDT)
From: Stuart Smith
Subject: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:
To: user@hbase.apache.org

Hello,

I'm running hbase 0.20.5, and seeing Puts() fail repeatedly when trying to insert a specific item into the database.

Client side I see:

  org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836 for region filestore,

I then looked up which node was hosting the given region (filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b) on the gui, and found the following debug message in the regionserver log:

  2010-08-06 14:23:47,414 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at index=0 because:Requested row out of range for HRegion filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836, startKey='bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b', getEndKey()='be0bc7b3f8bc2a30910b9c758b47cdb730a4691e93f92abb857a2dcc7aefa633', row='be1681910b02db5da061659c2cb08f501a135c2f065559a37a1761bf6e203d1d'

Which appears to be coming from:

  /regionserver/HRegionServer.java:1786:
    LOG.debug("Batch puts interrupted at index=" + i + " because:" +

Which is coming from:

  ./java/org/apache/hadoop/hbase/regionserver/HRegion.java:1658:
    throw new WrongRegionException("Requested row out of range for " +

This happens repeatedly on one specific item, and has for at least a day or so, even when not much is happening with the cluster.

As far as I can tell, the logic that selects the correct region for a given row is wrong. The row really is outside the key range of the region it was sent to (at least from what I can tell of the exception thrown), and the check in HRegion.java:1658:

  /** Make sure this is a valid row for the HRegion */
  private void checkRow(final byte [] row) throws IOException {
    if(!rowIsInRange(regionInfo, row)) {

is correctly rejecting the Put().
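
To double-check that reading of the log, here is a minimal standalone sketch of the range test as I understand it. It is not the actual HBase code: the class name RowRangeCheck and both helper methods are made up for illustration, and I am assuming the check is "startKey <= row < endKey" under an unsigned, byte-wise lexicographic comparison, with an empty key meaning "unbounded". The three keys are copied from the regionserver log message above.

```java
import java.nio.charset.StandardCharsets;

public class RowRangeCheck {

    // Lexicographic comparison of two byte arrays, treating each byte as unsigned.
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Assumed semantics: startKey <= row < endKey; an empty key is unbounded.
    public static boolean rowIsInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart = startKey.length == 0 || compareBytes(startKey, row) <= 0;
        boolean beforeEnd = endKey.length == 0 || compareBytes(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    public static void main(String[] args) {
        // Keys copied from the regionserver debug message.
        byte[] startKey = "bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] endKey = "be0bc7b3f8bc2a30910b9c758b47cdb730a4691e93f92abb857a2dcc7aefa633"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] row = "be1681910b02db5da061659c2cb08f501a135c2f065559a37a1761bf6e203d1d"
                .getBytes(StandardCharsets.US_ASCII);

        // "be16..." sorts after "be0b...", so the row is past the region's end key.
        System.out.println("row in range: " + rowIsInRange(startKey, endKey, row));
        // prints "row in range: false"
    }
}
```

The row and the end key share the prefix "be" and differ at the third character ('1' versus '0'), so the row sorts after the end key. Under these assumptions the region server is right to refuse it.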

So it appears the error would be somewhere in:

  HRegion.java:1550:
    private void put(final Map<byte [],List<KeyValue>> familyMap, boolean writeToWAL) throws IOException {

Which appears to be the actual guts of the insert operation. However, I don't know enough about the design of HRegions to really decipher this method. I'll dig into it more, but I thought it might be more efficient just to ask you guys first.

Any ideas?

I can update to 0.20.6, but I don't see any fixed JIRAs in 0.20.6 that seem related. I could be wrong. I'm not sure what I should do next. Is there any more information you guys need?

Note that I am inserting files into the database, and using each file's sha256sum as its key. The file that is failing does indeed have a sha that corresponds to the row key in the message above (and is out of range).

Take care,
-stu