Subject: Re: HBase snapshots and Compaction
From: Vladimir Rodionov
To: user@hbase.apache.org
Date: Thu, 28 May 2015 09:30:53 -0700

It's possible that the logic of ExploringCompactionPolicy (the default
compaction policy) is broken. I am looking into this code (master):

  private boolean isBetterSelection(List<StoreFile> bestSelection, long bestSize,
      List<StoreFile> selection, long size, boolean mightBeStuck) {
    if (mightBeStuck && bestSize > 0 && size > 0) {
      // Keep the selection that removes most files for least size. That penaltizes adding
      // large files to compaction, but not small files, so we don't become totally inefficient
      // (might want to tweak that in future). Also, given the current order of looking at
      // permutations, prefer earlier files and smaller selection if the difference is small.
      final double REPLACE_IF_BETTER_BY = 1.05;
      double thresholdQuality = ((double) bestSelection.size() / bestSize) * REPLACE_IF_BETTER_BY;
      return thresholdQuality < ((double) selection.size() / size);
    }
    // Keep if this gets rid of more files. Or the same number of files for less io.
    return selection.size() > bestSelection.size()
        || (selection.size() == bestSelection.size() && size < bestSize);
  }

which compares two selections, and what I see here is that when mightBeStuck =
false, the selection with more files will always be preferred. Correct me if I
am wrong.

-Vlad

On Thu, May 28, 2015 at 8:00 AM, Akmal Abbasov wrote:
> Hi Ted,
> Thank you for your reply.
> Yes, it was promoted to a major compaction because all files were
> eligible, but the thing I don't understand is why all of them were
> eligible?
> AFAIK, the compaction algorithm should select the best match for
> compaction, and it should include files with similar sizes.
> But as you can see from the logs, the files selected have: 4.7 K, 5.1 K,
> 3.8 K and 10.8 M.
> Why is it including the 10.8 M file?
> Which setting should be tuned to avoid this?
> Thank you.
>
> Kind regards,
> Akmal Abbasov
>
> > On 28 May 2015, at 16:54, Ted Yu wrote:
> >
> > bq. Completed major compaction of 4 file(s) in s of metrics,
> > V\xA36\x56\x5E\xC5}\xA1\x43\x00\x32\x00T\x1BU\xE0,
> > 3547f43afae5ac3f4e8a162d43a892b4.1417707276446.
> >
> > The compaction involved all the files of store 's' for the region. Thus
> > it was considered a major compaction.
> >
> > Cheers
> >
> > On Thu, May 28, 2015 at 2:16 AM, Akmal Abbasov wrote:
> >
> >> Hi Ted,
> >> Sorry for a late reply.
> >> Here is a snippet from the log file:
> >> 2015-05-28 00:54:39,754 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311]
> >> regionserver.CompactSplitThread: CompactSplitThread Status:
> >> compaction_queue=(0:27), split_queue=0, merge_queue=0
> >> 2015-05-28 00:54:39,754 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311]
> >> compactions.RatioBasedCompactionPolicy: Selecting compaction from 4 store
> >> files, 0 compacting, 4 eligible, 10 blocking
> >> 2015-05-28 00:54:39,755 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311]
> >> compactions.ExploringCompactionPolicy: Exploring compaction algorithm has
> >> selected 4 files of size 11304175 starting at candidate #0 after
> >> considering 3 permutations with 3 in ratio
> >> 2015-05-28 00:54:39,755 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] regionserver.HStore:
> >> 3547f43afae5ac3f4e8a162d43a892b4 - s: Initiating major compaction
> >> 2015-05-28 00:54:39,755 INFO
> >> [regionserver60020-smallCompactions-1432714643311] regionserver.HRegion:
> >> Starting compaction on s in region
> >> metrics,V\xA36\x56\x5E\xC5}\xA1\x43\x00\x32\x00T\x1BU\xE0,1417707276446.3547f43afae5ac3f4e8a162d43a892b4.
> >> 2015-05-28 00:54:39,755 INFO
> >> [regionserver60020-smallCompactions-1432714643311] regionserver.HStore:
> >> Starting compaction of 4 file(s) in s of
> >> metrics,V\xA36\x56\x5E\xC5}\xA1\x43\x00\x32\x00T\x1BU\xE0,1417707276446.3547f43afae5ac3f4e8a162d43a892b4.
> >> into
> >> tmpdir=hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/.tmp,
> >> totalSize=10.8 M
> >> 2015-05-28 00:54:39,756 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] compactions.Compactor:
> >> Compacting
> >> hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/dab3e768593e44a39097451038c5ebd0,
> >> keycount=3203, bloomtype=ROW, size=10.8 M, encoding=NONE, seqNum=172299974,
> >> earliestPutTs=1407941317178
> >> 2015-05-28 00:54:39,756 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] compactions.Compactor:
> >> Compacting
> >> hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/2d6472ef99a5478689f7ba822bc407a7,
> >> keycount=4, bloomtype=ROW, size=4.7 K, encoding=NONE, seqNum=172299976,
> >> earliestPutTs=1432761158066
> >> 2015-05-28 00:54:39,756 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] compactions.Compactor:
> >> Compacting
> >> hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/bdbc806d045740e69ab34e3ea2e113c4,
> >> keycount=6, bloomtype=ROW, size=5.1 K, encoding=NONE, seqNum=172299977,
> >> earliestPutTs=1432764757438
> >> 2015-05-28 00:54:39,756 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] compactions.Compactor:
> >> Compacting
> >> hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/561f93db484b4b9fb6446152c3eef5b8,
> >> keycount=2, bloomtype=ROW, size=3.8 K, encoding=NONE, seqNum=172299978,
> >> earliestPutTs=1432768358747
> >> 2015-05-28 00:54:41,881 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311]
> >> regionserver.HRegionFileSystem: Committing store file
> >> hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/.tmp/144f05a9546f446984a5b8fa173dd13e
> >> as
> >> hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/144f05a9546f446984a5b8fa173dd13e
> >> 2015-05-28 00:54:41,918 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] regionserver.HStore:
> >> Removing store files after compaction...
> >> 2015-05-28 00:54:41,959 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] backup.HFileArchiver:
> >> Finished archiving from class
> >> org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile,
> >> file:hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/dab3e768593e44a39097451038c5ebd0,
> >> to
> >> hdfs://prod1/hbase/archive/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/dab3e768593e44a39097451038c5ebd0
> >> 2015-05-28 00:54:42,030 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] backup.HFileArchiver:
> >> Finished archiving from class
> >> org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile,
> >> file:hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/2d6472ef99a5478689f7ba822bc407a7,
> >> to
> >> hdfs://prod1/hbase/archive/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/2d6472ef99a5478689f7ba822bc407a7
> >> 2015-05-28 00:54:42,051 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] backup.HFileArchiver:
> >> Finished archiving from class
> >> org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile,
> >> file:hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/bdbc806d045740e69ab34e3ea2e113c4,
> >> to
> >> hdfs://prod1/hbase/archive/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/bdbc806d045740e69ab34e3ea2e113c4
> >> 2015-05-28 00:54:42,071 DEBUG
> >> [regionserver60020-smallCompactions-1432714643311] backup.HFileArchiver:
> >> Finished archiving from class
> >> org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile,
> >> file:hdfs://prod1/hbase/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/561f93db484b4b9fb6446152c3eef5b8,
> >> to
> >> hdfs://prod1/hbase/archive/data/default/metrics/3547f43afae5ac3f4e8a162d43a892b4/s/561f93db484b4b9fb6446152c3eef5b8
> >> 2015-05-28 00:54:42,072 INFO
> >> [regionserver60020-smallCompactions-1432714643311] regionserver.HStore:
> >> Completed major compaction of 4 file(s) in s of
> >> metrics,V\xA36\x56\x5E\xC5}\xA1\x43\x00\x32\x00T\x1BU\xE0,1417707276446.3547f43afae5ac3f4e8a162d43a892b4.
> >> into 144f05a9546f446984a5b8fa173dd13e(size=10.8 M), total size for store is
> >> 10.8 M. This selection was in queue for 0sec, and took 2sec to execute.
> >> 2015-05-28 00:54:42,072 INFO
> >> [regionserver60020-smallCompactions-1432714643311]
> >> regionserver.CompactSplitThread: Completed compaction: Request =
> >> regionName=metrics,V\xA36\x56\x5E\xC5}\xA1\x43\x00\x32\x00T\x1BU\xE0,1417707276446.3547f43afae5ac3f4e8a162d43a892b4.,
> >> storeName=s, fileCount=4, fileSize=10.8 M, priority=6,
> >> time=1368019430741233; duration=2sec
> >>
> >> My question is: why was a major compaction executed instead of a minor
> >> compaction?
> >> I have these messages all over the log file.
> >> Thank you!
> >>
> >>> On 12 May 2015, at 23:53, Ted Yu wrote:
> >>>
> >>> Can you pastebin major compaction related log snippets?
> >>> See the following for an example of such logs:
> >>>
> >>> 2015-05-09 10:57:58,961 INFO
> >>> [PriorityRpcServer.handler=13,queue=1,port=16020]
> >>> regionserver.RSRpcServices: Compacting
> >>> IntegrationTestBigLinkedList,\x91\x11\x11\x11\x11\x11\x11\x08,1431193978741.700b34f5d2a3aa10804eff35906fd6d8.
> >>> 2015-05-09 10:57:58,962 DEBUG
> >>> [PriorityRpcServer.handler=13,queue=1,port=16020] regionserver.HStore:
> >>> Skipping expired store file removal due to min version being 1
> >>> 2015-05-09 10:57:58,962 DEBUG
> >>> [PriorityRpcServer.handler=13,queue=1,port=16020]
> >>> compactions.RatioBasedCompactionPolicy: Selecting compaction from 5 store
> >>> files, 0 compacting, 5 eligible, 10 blocking
> >>> 2015-05-09 10:57:58,963 DEBUG
> >>> [PriorityRpcServer.handler=13,queue=1,port=16020] regionserver.HStore:
> >>> 700b34f5d2a3aa10804eff35906fd6d8 - meta: Initiating major compaction (all
> >>> files)
> >>>
> >>> Cheers
> >>>
> >>> On Tue, May 12, 2015 at 2:06 PM, Akmal Abbasov <akmal.abbasov@icloud.com>
> >>> wrote:
> >>>
> >>>> Hi Ted,
> >>>> Thank you for your reply.
> >>>> I am running with the default settings.
> >>>>
> >>>> Sent from my iPhone
> >>>>
> >>>>> On 12 May 2015, at 22:02, Ted Yu wrote:
> >>>>>
> >>>>> Can you show us the compaction related parameters you use?
> >>>>>
> >>>>> e.g. hbase.hregion.majorcompaction,
> >>>>> hbase.hregion.majorcompaction.jitter, etc.
> >>>>>
> >>>>> On Tue, May 12, 2015 at 9:52 AM, Akmal Abbasov <akmal.abbasov@icloud.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> I am using HBase 0.98.7.
> >>>>>> I am using HBase snapshots to back up data. I create a snapshot of
> >>>>>> tables each hour.
> >>>>>> Each create-snapshot process causes a flush of the memstore and the
> >>>>>> creation of hfiles.
> >>>>>> When the number of hfiles reaches 3, the MINOR compaction process will
> >>>>>> start for each CF.
> >>>>>> I was expecting that the compaction would process only small hfiles,
> >>>>>> and that I wouldn't have problems with moving all data to the archive
> >>>>>> folder each time after the compaction process ends.
> >>>>>> But most of the time, the minor compaction is promoted to major (more
> >>>>>> than 100 in 24 hours without loads).
> >>>>>> As far as I know, the only possibility for this is that all hfiles
> >>>>>> are eligible for compaction.
> >>>>>> But when I checked the archive folder for a CF, I see a strange
> >>>>>> situation:
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 06:04
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/36dc06f4c34242daadc343d857a35734
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 06:04
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/7e8b993f97b84f4594542144f15b0a1e
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.1 K 2015-05-10 06:04
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/b9afc64792ba4bf99a08f34033cc46ac
> >>>>>> -rw-r--r-- 3 akmal supergroup 638.4 K 2015-05-10 06:04
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/dff846ae4fc24d418289a95322b35d46
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 08:50
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/228eee22c32e458e8eb7f5d031f64b58
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 08:50
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/529257432308466f971e41db49ecffdf
> >>>>>> -rw-r--r-- 3 akmal supergroup 638.5 K 2015-05-10 08:50
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/839d3a6fc523435d8b44f63315fd11b8
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 08:50
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/8c245e8661b140439e719f69a535d57f
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 11:37
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/23497a31d3e721fe9b63c58fbe0224d5
> >>>>>> -rw-r--r-- 3 akmal supergroup 638.7 K 2015-05-10 11:37
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/8c9af0357d164221ad46b336cd660b30
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 11:37
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/8eb55b43c22d434954e2e0bfda656018
> >>>>>> -rw-r--r-- 3 akmal supergroup 1.0 K 2015-05-10 11:37
> >>>>>> /hbase/archive/data/default/table1/0e8e3bf44a2ea5dfaa8a9c58d99b92e6/c/b8b6210d9e6d4ec2344238c6e9c17ddf
> >>>>>>
> >>>>>> As I understood, these files were copied to the archive folder after
> >>>>>> compaction.
> >>>>>> The part I didn't understand is: why was the file with 638 K also
> >>>>>> selected for compaction?
> >>>>>> Any ideas?
> >>>>>> Thank you.
> >>>>>>
> >>>>>> Kind regards,
> >>>>>> Akmal Abbasov
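The selection comparison Vladimir quotes can be exercised in isolation. Below is a minimal standalone sketch: the class name and the use of a `List<Long>` of file sizes are stand-ins of mine (HBase's real method takes `List<StoreFile>`), but the body of `isBetterSelection` mirrors the quoted code, fed with the file sizes from the log above (4.7 K, 5.1 K, 3.8 K and the 10.8 M file, totalling the 11304175 bytes the log reports).

```java
import java.util.Arrays;
import java.util.List;

public class CompactionSelectionSketch {

    // Simplified port of the quoted isBetterSelection. A selection is
    // represented here by the list of its file sizes; the method body
    // mirrors the quoted HBase code.
    static boolean isBetterSelection(List<Long> bestSelection, long bestSize,
                                     List<Long> selection, long size,
                                     boolean mightBeStuck) {
        if (mightBeStuck && bestSize > 0 && size > 0) {
            // Prefer the selection that removes the most files per byte of IO.
            final double REPLACE_IF_BETTER_BY = 1.05;
            double thresholdQuality =
                ((double) bestSelection.size() / bestSize) * REPLACE_IF_BETTER_BY;
            return thresholdQuality < ((double) selection.size() / size);
        }
        // Otherwise: more files wins; an equal file count is broken by smaller IO.
        return selection.size() > bestSelection.size()
            || (selection.size() == bestSelection.size() && size < bestSize);
    }

    public static void main(String[] args) {
        // Sizes from the log: 4.7 K, 5.1 K, 3.8 K plus the 10.8 M file;
        // the four-file total matches the 11304175 bytes reported there.
        List<Long> smallOnly = Arrays.asList(4_700L, 5_100L, 3_800L);
        List<Long> withBig   = Arrays.asList(4_700L, 5_100L, 3_800L, 11_290_575L);

        // mightBeStuck == false: the four-file selection wins on file count
        // alone, even though it drags in the 10.8 M file.
        System.out.println(isBetterSelection(smallOnly, 13_600L,
                                             withBig, 11_304_175L, false)); // true

        // mightBeStuck == true: files-per-byte decides, so the three small
        // files beat the selection containing the big one.
        System.out.println(isBetterSelection(withBig, 11_304_175L,
                                             smallOnly, 13_600L, true));    // true
    }
}
```

Run as written, this reproduces the behavior discussed in the thread: with `mightBeStuck == false` the comparison never penalizes a selection for total size as long as it contains more files, which is consistent with the 10.8 M file being swept into every "minor" compaction and the result being promoted to major.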