Subject: Re: Mixing Puts and Deletes in a single RPC
From: Michael Segel
Date: Tue, 10 Jul 2012 06:57:54 -0500
To: user@hbase.apache.org
CC: lars hofhansl

Regardless, it's still a bad design.

On Jul 9, 2012, at 10:02 PM, Jonathan Hsieh wrote:

> Keith,
>
> The HBASE-3584 feature is in 0.94, and we are strongly considering a 0.94
> version for a future CDH4 update. There is very little chance this
> will get into a CDH3 release.
>
> Jon.
>
> On Thu, Jul 5, 2012 at 4:50 PM, lars hofhansl wrote:
>
>> I'll let the Cloudera folks speak, but I had assumed CDH4 would include
>> HBase 0.94.
>>
>> -- Lars
>>
>> ________________________________
>> From: Ted Yu
>> To: user@hbase.apache.org
>> Sent: Thursday, July 5, 2012 11:28 AM
>> Subject: Re: Mixing Puts and Deletes in a single RPC
>>
>> Take a look at HBASE-3584: Allow atomic put/delete in one call
>> It is in 0.94, meaning it is not even in cdh4
>>
>> Cheers
>>
>> On Thu, Jul 5, 2012 at 11:19 AM, Keith Wyss wrote:
>>
>>> Hi,
>>>
>>> My organization has been doing something zany to simulate atomic row
>>> operations in HBase.
>>>
>>> We have a converter-object model for the writables that are populated in
>>> an HBase table, and one of the governing assumptions
>>> is that if you are dealing with an Object record, you read all the columns
>>> that compose it out of HBase or a different data source.
>>>
>>> When we read lots of data in from a source system that we are trying to
>>> mirror with HBase, if a column is null that means that whatever is
>>> in HBase for that column is no longer valid. We have simulated what I
>>> believe is now called an AtomicRowMutation by using a single Put
>>> and populating it with blanks. The downside is the wasted space accrued by
>>> the metadata for the blank columns.
>>>
>>> Atomicity is not of utmost importance to us, but performance is.
>>> My approach has been to create a Put and Delete object for a record and
>>> populate the Delete with the null columns. Then we call
>>> HTable.batch(List) on a bunch of these. It is my impression that this
>>> shouldn't appreciably increase network traffic as the RPC calls will be
>>> bundled.
>>>
>>> Has anyone else addressed this problem? Does this seem like a reasonable
>>> approach?
>>> What sort of performance overhead should I expect?
>>>
>>> Also, I've seen some Jira tickets about making this an atomic operation in
>>> its own right. Is that something that
>>> I can expect with CDH3U4?
>>>
>>> Thanks,
>>>
>>> Keith Wyss
>>>
>>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
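
For readers of the archive, the per-record split Keith describes (non-null columns go to a Put, null columns go to a Delete) might look roughly like the sketch below. It is plain Java with no HBase dependency: the class RecordSplitter, the Mutations holder, and the column names are hypothetical stand-ins, not part of the HBase client API or of Keith's code. In a real client the two halves would presumably be turned into HBase Put and Delete objects and handed to HTable.batch, as the thread describes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RecordSplitter {

    /** Hypothetical holder for what would become one Put and one Delete. */
    public static final class Mutations {
        // Column -> value pairs that should be written.
        public final Map<String, String> putColumns = new LinkedHashMap<>();
        // Columns whose stored value is no longer valid and should be removed.
        public final List<String> deleteColumns = new ArrayList<>();
    }

    /**
     * Splits one source record into columns to write and columns to delete.
     * A null value means "whatever is stored for this column is stale".
     */
    public static Mutations split(Map<String, String> record) {
        Mutations m = new Mutations();
        for (Map.Entry<String, String> e : record.entrySet()) {
            if (e.getValue() == null) {
                m.deleteColumns.add(e.getKey());
            } else {
                m.putColumns.put(e.getKey(), e.getValue());
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Example record with one null column (names are made up).
        Map<String, String> record = new LinkedHashMap<>();
        record.put("name", "widget");
        record.put("color", null);
        record.put("price", "9.99");

        Mutations m = split(record);
        System.out.println("put: " + m.putColumns);
        System.out.println("delete: " + m.deleteColumns);
    }
}
```

Compared with writing blanks in a single Put, this avoids storing cells for empty columns, but the write and the delete are two separate operations, so it gives up the row-level atomicity that HBASE-3584 is meant to provide.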