From: Manoj Babu
Date: Thu, 6 Dec 2012 08:44:49 +0530
Subject: Re: Reg:delete performance on HBase table
To: user

Team,

Thank you very much for the valuable information.

The HBase version I am using is: HBase Version 0.90.3-cdh3u1, r

Our use case: we are collecting information on where users spend time on
our site (tracking user events). We are also migrating historical data from
an existing system, and based on that data we need to populate metrics for
the year. For example: Customer A hits option x n times and option y n
times; Customer B hits option x1 n times and option y1 n times.

Until now we have used Hadoop MapReduce to aggregate the whole year's data
once every 2 or 4 days, emitting to an Oracle table through DBOutputFormat.
Inserting 181 million rows took only 20 minutes with 20 reducers writing in
parallel. Before populating the year table we used to delete the existing
181 million rows for that year alone, but the delete ran for more than 3
hours without finishing, so we killed the session and did a truncate
instead. We are still in the development stage, so we are planning to try
HBase for this case, since deleting millions of rows takes too long in
Oracle.

We need to delete rows based on the year only and cannot drop the table. In
Oracle, too, truncate is extremely fast.

Cheers!
Manoj.

On Wed, Dec 5, 2012 at 11:44 PM, Nick Dimiduk wrote:

> On Wed, Dec 5, 2012 at 7:46 AM, Doug Meil wrote:
>
> > You probably want to read this section on the RefGuide about deleting
> > from HBase.
> >
> > http://hbase.apache.org/book.html#perf.deleting
>
> So hold on. From the guide:
>
> > 11.9.2. Delete RPC Behavior
> >
> > Be aware that htable.delete(Delete) doesn't use the writeBuffer. It will
> > execute an RegionServer RPC with each invocation. For a large number of
> > deletes, consider htable.delete(List).
> >
> > See
> > http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#delete%28org.apache.hadoop.hbase.client.Delete%29
>
> So Deletes are like Puts except they're not executed the same way. Indeed,
> HTable.put() is implemented using the write buffer while HTable.delete()
> makes a MutateRequest directly. What is the reason for this? Why is the
> semantic of Delete subtly different from Put?
>
> For that matter, why not buffer all mutation operations?
> HTable.checkAndPut(), checkAndDelete() both make direct MutateRequest calls
> as well.
>
> Thanks,
> -n
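
For anyone following the thread, the guide's advice quoted above (prefer htable.delete(List) over one htable.delete(Delete) per row) amounts to collecting row keys into chunks and handing each chunk over in a single call. The following is only a rough sketch of that chunking, with no HBase dependency: RowSink, CountingSink and deleteInBatches are hypothetical names invented for illustration, and the batch size of 1000 is an arbitrary assumption. With the real client, the deleteBatch body would presumably build Delete objects and call htable.delete(List); how many RPCs one such call turns into is up to the client and is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only. RowSink is a hypothetical stand-in for a table handle;
// it is NOT part of the HBase API.
class BatchedDeleteSketch {

    /** Hypothetical abstraction: one call models one client-side delete invocation. */
    interface RowSink {
        void deleteBatch(List<String> rowKeys);
    }

    /** Stand-in sink that only counts invocations and rows, so the sketch runs anywhere. */
    static class CountingSink implements RowSink {
        int calls = 0;
        int rows = 0;

        @Override
        public void deleteBatch(List<String> rowKeys) {
            calls++;
            rows += rowKeys.size();
        }
    }

    /**
     * Hands rowKeys to the sink in chunks of at most batchSize keys
     * and returns the number of sink invocations made.
     */
    static int deleteInBatches(Iterable<String> rowKeys, int batchSize, RowSink sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> buffer = new ArrayList<>(batchSize);
        int calls = 0;
        for (String key : rowKeys) {
            buffer.add(key);
            if (buffer.size() == batchSize) {
                // Pass a copy so clearing the buffer cannot affect the sink.
                sink.deleteBatch(new ArrayList<>(buffer));
                buffer.clear();
                calls++;
            }
        }
        if (!buffer.isEmpty()) {
            // Flush the final, partially filled chunk.
            sink.deleteBatch(new ArrayList<>(buffer));
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        // Made-up row keys; the "2012-" prefix is purely illustrative.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add("2012-" + i);
        }
        CountingSink sink = new CountingSink();
        int calls = deleteInBatches(keys, 1000, sink);
        System.out.println(calls + " calls for " + sink.rows + " rows");
    }
}
```

The point of the sketch is only the shape of the loop: the number of client invocations grows with the number of chunks rather than the number of rows. Whether that helps for a delete on the scale described in this thread is something to measure, not something the sketch shows.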