From: Bryan Beaudreault
Date: Tue, 9 Jul 2013 10:48:50 -0400
Subject: Re: Disabled automated compaction - table still compacting
To: user@hbase.apache.org

You should be able to limit what JM describes by tuning the following two
configs:

hbase.hstore.compactionThreshold
hbase.hstore.compaction.max

Beware of this property as well when tuning the above, so you don't
accidentally cause blocking of flushes. I imagine you would be tuning down,
not up, so it shouldn't be a problem:

hbase.hstore.blockingStoreFiles

On Tue, Jul 9, 2013 at 10:41 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi David,
>
> Minor compactions can be promoted to Major compactions when all the
> files are selected for compaction. And the property below will not
> prevent that from occurring.
>
> Section 9.7.6.5 there: http://hbase.apache.org/book/regions.arch.html
>
> JM
>
>
> 2013/7/9 David Koch:
> > Hello,
> >
> > We disabled automated major compactions by setting
> > hbase.hregion.majorcompaction=0.
> > This was to avoid issues during bulk import of data since compactions
> > seemed to cause the running imports to crash. However, even after
> > disabling, region server logs still show compactions going on, as well as
> > aborted compactions. We also get compaction queue size warnings in
> > Cloudera Manager.
> >
> > Why is this the case?
> >
> > To be fair, we only disabled automated compactions AFTER the import failed
> > for the first time (yes, HBase was restarted) so maybe there are some
> > trailing compactions, but the queue size keeps increasing, which I guess
> > should not be the case. Then again, I don't know how aborted compactions
> > are counted, i.e., I'm not sure whether or not to trust the metrics on this.
> >
> > A bit more about what I am trying to accomplish:
> >
> > I am bulk loading about 100 indexed .lzo files with 20 * 10^6 Key-Values
> > (0.5kb) each into an HBase table. Each file is loaded by a separate Mapper
> > job; several of these jobs run in parallel to make sure all task trackers
> > are used. Key distribution is the same in each file, so even region growth
> > is to be expected. We did not pre-split the table as it does not seem to
> > have been a limiting factor earlier.
> >
> > On a related note: what, if any, experience do other HBase/Cloudera users
> > have with the Snapshotting feature detailed below?
> >
> > http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_20_12.html
> >
> > We need a robust way to do inter-cluster cloning/back-up of tables,
> > preferably without taking the source table offline or impacting performance
> > of the source cluster. We only use HDFS files for importing because the
> > CopyTable job needs to run on the source cluster and cannot be resumed once
> > it fails.
> >
> > Thanks,
> >
> > /David
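
For anyone reading this thread later, here is a rough sketch of how the three properties mentioned above interact. It is a deliberately simplified model, not HBase's actual selection code: it ignores size-ratio based file selection, and the function names and the numeric values are hypothetical, chosen only for illustration. The one rule taken from the thread is JM's: a minor compaction that selects every file in the store is promoted to a major one. The blocking check assumes "blocked" means strictly more store files than hbase.hstore.blockingStoreFiles.

```python
def select_for_compaction(store_file_count, compaction_threshold, compaction_max):
    """Simplified model of minor compaction selection (hypothetical helper).

    compaction_threshold ~ hbase.hstore.compactionThreshold
    compaction_max       ~ hbase.hstore.compaction.max
    Returns (number_of_files_selected, promoted_to_major).
    """
    # Too few files in the store: no compaction is considered at all.
    if store_file_count < compaction_threshold:
        return 0, False
    # Assumption: pick as many files as allowed, capped by compaction_max.
    selected = min(store_file_count, compaction_max)
    # Rule from the thread: selecting all files promotes minor -> major.
    promoted = selected == store_file_count
    return selected, promoted


def flushes_blocked(store_file_count, blocking_store_files):
    """Assumed check for hbase.hstore.blockingStoreFiles (strictly more than)."""
    return store_file_count > blocking_store_files


if __name__ == "__main__":
    # Illustrative values only, not recommended settings.
    for files in (2, 5, 12):
        selected, promoted = select_for_compaction(files, 3, 10)
        print(files, "files ->", selected, "selected, promoted:", promoted)
```

In this model, promotion is avoided only while the store holds more files than a single compaction may select, which is also why the blocking limit has to be kept in mind when changing the other two values.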