From: "Dickson, Matt MR" <matt.dickson@defence.gov.au>
To: user@accumulo.apache.org
Date: Thu, 3 Oct 2013 16:29:44 +1000
Subject: RE: Efficient Tablet Merging [SEC=UNOFFICIAL]

UNOFFICIAL

Hi Eric,

We have gone with the second, more conservative option. We changed our split threshold to 10GB and then ran a merge over a week's worth of tablets, which has resulted in one tablet with a massive number of files. We then ran a query over that range and it is returning a message saying:

Tablet has too many files (3n;20130914;20130907...) retrying...

We assumed that when the merge was done a major compaction would be started, which would notice that the tablet is too large and split it into 10GB tablets. We assumed that we would not have to manually start any compaction, but instead that it would be scheduled at some point after the merge finished.

We have completed three separate merges of week-long ranges and now have identified 3 tablet extents with too many files.

Can you please explain what is supposed to happen, and whether after the merge the compact command needs to be run for those ranges (or will it happen automatically, as we have not seen any compactions started)?

Cheers
Matt
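(A minimal sketch of the manual compaction being asked about, using the Accumulo shell. The table name "mytable" is a placeholder, the row bounds are taken from the extent in the error message above, and -w waits for the compaction to complete.)

  compact -t mytable -b 20130907 -e 20130914 -w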
________________________________
From: Eric Newton [mailto:eric.newton@gmail.com]
Sent: Thursday, 3 October 2013 13:28
To: user@accumulo.apache.org
Subject: Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

I'll use ASCII graphics to demonstrate the size of a tablet.

Small: []
Medium: [  ]
Large: [    ]

Think of it like this... if you are running age-off... you probably have lots of little buckets of rows at the beginning and larger buckets at the end:

[][][][][][][][][]...[ ][ ][ ][ ][ ][  ][  ][    ][    ][    ][    ][    ][    ]

What you probably want is something like this:

[                    ][       ][       ][       ][       ][       ][       ][       ]

Some big bucket at the start, with old data, and some larger buckets for everything afterwards. But... this would probably work:

[       ][       ][       ][       ][       ][       ][       ][       ][       ]

Just a bunch of larger tablets throughout.

So you need to set your merge size to "[      ]" (4G), and you can always keep creating smaller tablets for future rows with manual splits:

[       ][       ][       ][       ][       ][       ][       ][       ][       ][  ][  ][  ][  ][  ]

So increase the split threshold to 4G, and merge on 4G, but continue to make manual splits for your current days, as necessary. Merge them away later.

-Eric
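(A rough sketch of that advice as Accumulo shell commands; the table name and date range are placeholders, not values from this thread. config raises the split threshold, merge -s merges adjacent tablets in the range up to roughly the given size, and addsplits pre-creates tablets for the current day's rows.)

  config -t mytable -s table.split.threshold=4G
  merge -t mytable -s 4G -b 20130901 -e 20130928
  addsplits -t mytable 20131003-0000 20131003-0010 20131003-0020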
On Wed, Oct 2, 2013 at 6:35 PM, Dickson, Matt MR <matt.dickson@defence.gov.au> wrote:

UNOFFICIAL

Thanks Eric,

If I do the merge with size of 4G, does the split threshold need to be increased to 4G also?

________________________________
From: Eric Newton [mailto:eric.newton@gmail.com]
Sent: Wednesday, 2 October 2013 23:05
To: user@accumulo.apache.org
Subject: Re: Efficient Tablet Merging [SEC=UNOFFICIAL]

The most efficient way is kind of scary. If this is a production system, I would not recommend it.

First, find out the size of your 10x tablets. Let's say it's 10G. Set your split threshold to 10G. Then merge all old tablets... all of them into one tablet. This will dump thousands of files into a single tablet, but it will soon split out again into the nice 10G tablets you are looking for. The system will probably be unusable during this operation.

The more conservative way is to specify the merge in single steps (the master will only coordinate a single merge on a table at a time anyhow). You can do it by range or by size... I would do it by size, especially if you are aging off your old data.

Compacting the data won't have any effect on the speed of the merge.

-Eric

On Tue, Oct 1, 2013 at 11:58 PM, Dickson, Matt MR <matt.dickson@defence.gov.au> wrote:

UNOFFICIAL

I have a table where we create splits of the form yyyymmdd-nnnn, where nnnn ranges from 0000 to 0840. The bulk of our data is loaded for the current date, with no data loaded for days older than 3 days, so from my understanding it would be wise to merge splits older than 3 days in order to reduce the overall tablet count. It would still be optimal to maintain some distribution of tablets for a day across the cluster, so I'm looking at merging splits in increments of 10, e.g. merge -b 20130901-0000 -e 20130901-0009, therefore reducing 840 splits per day to 84.

Currently we have 120K tablets (size 1G) on a cluster of 56 nodes, and our ingest has slowed as the data quantity and tablet count have grown. Initially we were achieving 200-300K, now 50-100K.

My question is, what is the best way to do this merge? Should we use the merge command with the size option set at something like 5G, or maybe use the compaction command?

From my tests this process could take some time, so I'm keen to understand the most efficient approach.

Thanks in advance,
Matt Dickson
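(For illustration, the per-block-of-ten range merge described above, issued from the Accumulo shell. The -t table name is a placeholder, and only the first two of the 84 per-day merges are shown.)

  merge -t mytable -b 20130901-0000 -e 20130901-0009
  merge -t mytable -b 20130901-0010 -e 20130901-0019
  ...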