Return-Path:
X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id 6E75F184CF
	for ; Tue, 13 Oct 2015 05:52:06 +0000 (UTC)
Received: (qmail 51520 invoked by uid 500); 13 Oct 2015 05:52:06 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 51451 invoked by uid 500); 13 Oct 2015 05:52:05 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 51434 invoked by uid 99); 13 Oct 2015 05:52:05 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
	by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 13 Oct 2015 05:52:05 +0000
Date: Tue, 13 Oct 2015 05:52:05 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HBASE-14501) NPE in replication with TDE
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HBASE-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954438#comment-14954438 ]

Hudson commented on HBASE-14501:
--------------------------------

FAILURE: Integrated in HBase-1.0 #1078 (See [https://builds.apache.org/job/HBase-1.0/1078/])
HBASE-14501 NPE in replication with TDE (enis: rev 2c6cd83b8add75927e6b84ddc412753fa19ba2f4)
* hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValueUtil.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/codec/CellCodec.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/codec/BaseDecoder.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureWALCellCodec.java


> NPE in replication with TDE
> ---------------------------
>
>                 Key: HBASE-14501
>                 URL: https://issues.apache.org/jira/browse/HBASE-14501
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
>         Attachments: hbase-14501_v1.patch
>
>
> We are seeing an NPE when replication (or in this case async wal replay for region replicas) is run on top of an HDFS cluster with TDE configured.
> This is the stack trace:
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.CellUtil.matchingRow(CellUtil.java:370)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.countDistinctRowKeys(ReplicationSource.java:649)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:450)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:346)
> {code}
> This stack trace can only happen if WALEdit.getCells() returns an array containing null entries. I believe this happens because {{KeyValueCodec.parseCell()}} uses {{KeyValueUtil.iscreate()}}, which returns null in case of EOF at the beginning. However, the contract for Decoder.parseCell() does not make clear whether returning null is acceptable. The other Decoders (CompressedKvDecoder, CellCodec, etc) do not return null, while KeyValueCodec does.
> BaseDecoder has this code:
> {code}
>   public boolean advance() throws IOException {
>     if (!this.hasNext) return this.hasNext;
>     if (this.in.available() == 0) {
>       this.hasNext = false;
>       return this.hasNext;
>     }
>     try {
>       this.current = parseCell();
>     } catch (IOException ioEx) {
>       rethrowEofException(ioEx);
>     }
>     return this.hasNext;
>   }
> {code}
> which is not correct, since it uses {{InputStream.available()}} in a way that the javadoc does not support (https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available()). DFSInputStream implements {{available()}} as the remaining bytes to read from the stream, so we do not see the issue there. {{CryptoInputStream.available()}} does a similar thing, but we still see the issue there.
> So two questions:
>  - What should be the interface for Decoder.parseCell()? Can it return null?
>  - How do we properly fix BaseDecoder.advance() so that it does not rely on the {{available()}} call?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
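
On the second question in the description, one possible direction is to detect end of stream by reading a single byte and pushing it back, rather than asking available(). The sketch below is only an illustration under stated assumptions: PeekingDecoder, ZeroAvailableStream, decodeAll() and the one-byte length-prefixed record format are hypothetical names and choices made up for this example. It is not HBase's BaseDecoder and it is not necessarily what the attached patch does. It relies only on java.io.PushbackInputStream and on the documented behavior that read() returns -1 at end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical decoder used only to illustrate EOF detection without available(). */
final class PeekingDecoder {

    /** Hypothetical stream whose available() always reports 0, which the javadoc permits. */
    static final class ZeroAvailableStream extends FilterInputStream {
        ZeroAvailableStream(InputStream in) {
            super(in);
        }

        @Override
        public int available() {
            return 0;
        }
    }

    private final PushbackInputStream in;
    private boolean hasNext = true;
    private byte[] current;

    PeekingDecoder(InputStream in) {
        // A pushback buffer of one byte is enough to "peek" at the stream.
        this.in = new PushbackInputStream(in, 1);
    }

    /** Moves to the next record; returns false once the stream is exhausted. */
    boolean advance() throws IOException {
        if (!hasNext) {
            return false;
        }
        int first = in.read(); // -1 is the only reliable end-of-stream signal
        if (first == -1) {
            hasNext = false;
            current = null;
            return false;
        }
        in.unread(first); // put the byte back so parseCell() sees the whole record
        current = parseCell();
        return true;
    }

    byte[] current() {
        return current;
    }

    /** Toy record format (an assumption): one length byte, then that many payload bytes. */
    private byte[] parseCell() throws IOException {
        int len = in.read();
        if (len == -1) {
            throw new EOFException("missing length byte");
        }
        byte[] buf = new byte[len];
        int off = 0;
        while (off < len) {
            int n = in.read(buf, off, len - off);
            if (n == -1) {
                throw new EOFException("truncated record");
            }
            off += n;
        }
        return buf;
    }

    /** Decodes every record from data, reading through a stream whose available() is 0. */
    static List<String> decodeAll(byte[] data) {
        PeekingDecoder decoder =
            new PeekingDecoder(new ZeroAvailableStream(new ByteArrayInputStream(data)));
        List<String> out = new ArrayList<>();
        try {
            while (decoder.advance()) {
                out.add(new String(decoder.current(), StandardCharsets.US_ASCII));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two records: "ab" and "c".
        System.out.println(decodeAll(new byte[] {2, 'a', 'b', 1, 'c'}));
    }
}
```

With this shape, advance() never sets current to null while returning true, so a caller that only consumes current() after a successful advance() would not see null entries. Whether parseCell() itself may return null is a separate contract question that this sketch does not settle.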