Date: Sat, 22 Sep 2012 09:36:08 +1100 (NCT)
From: "Eugene Morozov (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Created] (HBASE-6861) HFileOutputFormat set TIMERANGE wrongly

Eugene Morozov created HBASE-6861:
-------------------------------------

             Summary: HFileOutputFormat set TIMERANGE wrongly
                 Key: HBASE-6861
                 URL: https://issues.apache.org/jira/browse/HBASE-6861
             Project: HBase
          Issue Type: Bug
            Reporter: Eugene Morozov
            Priority: Minor

If the timestamps of KeyValues are specified differently for different column families, the TIMERANGEs of both resulting HFiles will be wrong.

Example (in pseudocode): my reducer has a condition

    if (condition) {
        keyValue = new KeyValue(.., CF1, .., timestamp, ..);
    } else {
        keyValue = new KeyValue(.., CF2, .., ..); // <- no timestamp
    }
    context.write(keyValue);

These two KeyValues would be written to two different HFiles.
But the code that actually performs the write does the following:

    // we now have the proper HLog writer. full steam ahead
    kv.updateLatestStamp(this.now);
    trt.includeTimestamp(kv);
    wl.writer.append(kv);

Basically, the two HFiles share the same instance of trt (TimeRangeTracker), so both end up with the same TIMERANGE. That is definitely incorrect: the first HFile must have TIMERANGE=timestamp...timestamp, because no other timestamps are written there, and by the same reasoning the other HFile must have TIMERANGE=now...now, since updateLatestStamp assigns now to KeyValues that carry no explicit timestamp.
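
To make the effect of the shared tracker concrete, here is a minimal, self-contained sketch in Java. It does not use the HBase classes: Tracker, sharedTracker and trackerPerFamily are hypothetical stand-ins written for this illustration, and the timestamps 1000 and 5000 are made-up values playing the roles of "timestamp" and "now". Keeping one tracker per column family is shown only as one possible remedy, not necessarily the fix adopted for this issue.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative stand-ins only; these are NOT the HBase classes.
final class TimeRangeSketch {

    /** Minimal stand-in for a time-range tracker: remembers the min and max timestamp seen. */
    static final class Tracker {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;

        void include(long ts) {
            min = Math.min(min, ts);
            max = Math.max(max, ts);
        }

        @Override
        public String toString() {
            return min + "..." + max;
        }
    }

    /** Behaviour described in the report: one tracker instance shared by every family. */
    static Map<String, Tracker> sharedTracker(String[] families, long[] timestamps) {
        Tracker shared = new Tracker();
        Map<String, Tracker> perFile = new TreeMap<>();
        for (int i = 0; i < families.length; i++) {
            shared.include(timestamps[i]);
            perFile.put(families[i], shared); // every file points at the same tracker
        }
        return perFile;
    }

    /** Possible remedy (an assumption): a separate tracker for each family's file. */
    static Map<String, Tracker> trackerPerFamily(String[] families, long[] timestamps) {
        Map<String, Tracker> perFile = new TreeMap<>();
        for (int i = 0; i < families.length; i++) {
            perFile.computeIfAbsent(families[i], f -> new Tracker()).include(timestamps[i]);
        }
        return perFile;
    }

    public static void main(String[] args) {
        long timestamp = 1000L; // explicit timestamp used for CF1 (made-up value)
        long now = 5000L;       // stand-in for the writer's "now", used for CF2
        String[] families = {"CF1", "CF2"};
        long[] stamps = {timestamp, now};

        System.out.println("shared:     " + sharedTracker(families, stamps));
        System.out.println("per family: " + trackerPerFamily(families, stamps));
    }
}
```

With the shared tracker, both families report the range 1000...5000. With one tracker per family, CF1 reports 1000...1000 and CF2 reports 5000...5000, which is the result the report argues for.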