Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Date: Fri, 30 Jan 2015 20:11:35 +0000 (UTC)
From: "Jerry He (JIRA)"
To: dev@hbase.apache.org
Reply-To: dev@hbase.apache.org
Subject: [jira] [Created] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted

Jerry He created HBASE-12949:
--------------------------------

             Summary: Scanner can be stuck in infinite loop if the HFile is corrupted
                 Key: HBASE-12949
                 URL: https://issues.apache.org/jira/browse/HBASE-12949
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.98.10
            Reporter: Jerry He

We've encountered a problem where compaction hangs and never completes.
After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See the stack below.

{noformat}
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
{noformat}

We identified the HFile that seems to be corrupted. Running the HFile tool on it shows the following:

{noformat}
[biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Scanning -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
WARNING, previous row is greater then current row
    filename -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
    previous -> \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
    current ->
Exception in thread "main" java.nio.BufferUnderflowException
    at java.nio.Buffer.nextGetIndex(Buffer.java:489)
    at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
    at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
{noformat}

Enabling Java assertions shows the following:

{noformat}
Exception in thread "main" java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
    at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
{noformat}

It shows that the HFile seems to be corrupted: its keys are not in sorted order.

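For illustration, the kind of ordering check that the assertion above performs can be sketched as a standalone Java snippet that throws an exception instead of depending on assertions being enabled. This is only a minimal sketch and not HBase code: `KeyOrderCheck`, `CorruptKeyOrderException` and `verifySorted` are hypothetical names, and keys are compared as plain unsigned byte strings, which is a simplification of how HBase actually orders KeyValues.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names exist in HBase. */
public class KeyOrderCheck {

    /** Hypothetical exception type raised when a key sorts before its predecessor. */
    public static class CorruptKeyOrderException extends IOException {
        public CorruptKeyOrderException(String message) {
            super(message);
        }
    }

    /**
     * Walks the keys in scan order and fails fast on the first key that is
     * smaller than the one before it. Equal neighbours are accepted.
     * Returns the number of keys checked.
     */
    public static int verifySorted(List<byte[]> keys) throws CorruptKeyOrderException {
        byte[] previous = null;
        int index = 0;
        for (byte[] current : keys) {
            if (previous != null && compareUnsigned(previous, current) > 0) {
                throw new CorruptKeyOrderException(
                    "Key at index " + (index - 1)
                        + " is followed by a smaller key at index " + index);
            }
            previous = current;
            index++;
        }
        return index;
    }

    /** Lexicographic comparison treating each byte as unsigned (assumed ordering). */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // A non-empty key followed by an empty key, loosely mirroring the report.
        List<byte[]> keys = Arrays.asList(
            "20110203-094231205".getBytes(StandardCharsets.UTF_8),
            new byte[0]);
        try {
            verifySorted(keys);
        } catch (CorruptKeyOrderException e) {
            System.out.println("Detected corruption: " + e.getMessage());
        }
    }
}
```

A check along these lines turns an out-of-order key into an explicit I/O error that the caller can report. Whether a check like this belongs on the scan path at all is a separate design question, since it adds a comparison per key.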
The scanner itself, however, is not able to give a meaningful error; instead it gets stuck in an infinite loop here:

{code}
KeyValueHeap.generalizedSeek()
while ((scanner = heap.poll()) != null) {
}
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)