Date: Wed, 6 Feb 2013 05:27:14 +0000 (UTC)
From: "Jean-Daniel Cryans (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Resolved] (HBASE-7774) RegionObserver.prePut() cannot rely on the Put's timestamps, can even cause data loss

     [ https://issues.apache.org/jira/browse/HBASE-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans resolved HBASE-7774.
---------------------------------------
    Resolution: Duplicate

Yes I think we can resolve that as a duplicate, thanks for chiming in guys!

> RegionObserver.prePut() cannot rely on the Put's timestamps, can even cause data loss
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-7774
>                 URL: https://issues.apache.org/jira/browse/HBASE-7774
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.2, 0.96.0, 0.94.4
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>
> We had a user that had code that looked like this in a coprocessor's prePut():
> {code}
> if (put.has(expectedKv))
>   put.add(kvSayingIFoundIt);
> else
>   put.add(kvSayingNotFound);
> {code}
> If you have MSLAB turned *off*, and you have the {{expectedKv}} in your {{Put}}, doing a {{Get}} following your insert will only return {{kvSayingIFoundIt}} and not the KV you were actually inserting.
> More so, if you only do {{put.has(expectedKv)}}, you will not get anything back. Your data seems to be gone.
> The reason is that in {{prePut()}} the timestamp hasn't been set yet, so calling {{kv.getTimestamp()}} during the comparisons in {{put.has()}} will populate {{kv.timestampCache}} with {{Long.MAX_VALUE}}. Then it will stay in the {{MemStore}} with that big timestamp and be filtered out because {{TimeRange}} will compare {{Long.MAX_VALUE}} >= {{Long.MAX_VALUE}} and return {{SKIP}}.
> And the reason it works correctly with MSLAB *on* is that the KV is cloned in {{maybeCloneWithAllocator()}} and the cache is reset.
> Now, I think this has bigger implications. Basically, you can't rely on the timestamp at all in {{prePut()}}. I'm sure this can screw someone else in a creative way later.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
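
For readers who want to see the mechanism from the issue description in isolation, here is a minimal sketch in plain Java. It does not use the HBase classes: ToyKv, updateLatestStamp(), cloneWithFreshCache() and withinTimeRange() are hypothetical stand-ins. It assumes the server stamps the write time into the KV without refreshing a timestamp cache that was already populated, and that the time range treats its upper bound as exclusive, as the description implies.

```java
/** Toy model of the stale timestamp cache; these are NOT the HBase classes. */
class TimestampCacheSketch {
    // Long.MAX_VALUE marks "timestamp not set yet", as in the issue description.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /** Hypothetical stand-in for a KV with a lazily filled timestamp cache. */
    static final class ToyKv {
        private long timestamp = LATEST_TIMESTAMP; // stands in for the backing bytes
        private long timestampCache = -1;          // -1 means "nothing cached yet"

        long getTimestamp() {
            if (timestampCache == -1) {
                timestampCache = timestamp; // first read fills the cache
            }
            return timestampCache;
        }

        /** Assumption: the server stamps the write time but leaves the cache alone. */
        void updateLatestStamp(long now) {
            if (timestamp == LATEST_TIMESTAMP) {
                timestamp = now;
            }
        }

        /** Models the MSLAB clone: same stored timestamp, fresh (empty) cache. */
        ToyKv cloneWithFreshCache() {
            ToyKv copy = new ToyKv();
            copy.timestamp = this.timestamp;
            return copy;
        }
    }

    /** Toy time range [min, max): a timestamp >= max is skipped. */
    static boolean withinTimeRange(long ts, long min, long max) {
        return ts >= min && ts < max;
    }

    /** Returns whether a read over [0, Long.MAX_VALUE) would see the stored KV. */
    static boolean visibleAfterPut(boolean touchTimestampInPrePut, boolean mslabOn) {
        ToyKv kv = new ToyKv();
        if (touchTimestampInPrePut) {
            kv.getTimestamp(); // what a comparison inside put.has() does implicitly
        }
        kv.updateLatestStamp(1000L); // arbitrary "now" chosen for the sketch
        ToyKv stored = mslabOn ? kv.cloneWithFreshCache() : kv;
        return withinTimeRange(stored.getTimestamp(), 0L, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println("touched, MSLAB off: " + visibleAfterPut(true, false));
        System.out.println("touched, MSLAB on:  " + visibleAfterPut(true, true));
        System.out.println("untouched:          " + visibleAfterPut(false, false));
    }
}
```

Under these assumptions, only the case where the timestamp is read before it is stamped and the KV is stored without a clone ends up hidden from reads, which matches the MSLAB on/off behaviour reported above.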