From: lars hofhansl
Reply-To: lars hofhansl
To: "user@hbase.apache.org"
Date: Fri, 8 Aug 2014 21:51:58 -0700
Subject: Re: Large discrepancy in hdfs hbase rootdir size after copytable operation.
Message-ID: <1407559918.21193.YahooMailNeo@web140602.mail.bf1.yahoo.com>

Hi Colin,

you might want to consider upgrading.
The current stable version is 0.98.4 (soon .5).

Even just going to 0.94 will give a lot of new features, stability, and performance.
0.92.x can be upgraded to 0.94.x without any downtime and without any upgrade steps necessary.
For an upgrade to 0.98 and later you'd need some downtime and also execute an upgrade step.

-- Lars


----- Original Message -----
From: Colin Kincaid Williams
To: user@hbase.apache.org
Cc:
Sent: Friday, August 8, 2014 1:16 PM
Subject: Re: Large discrepancy in hdfs hbase rootdir size after copytable operation.

Not in the hbase shell I have:

hbase version
14/08/08 14:16:08 INFO util.VersionInfo: HBase 0.92.1-cdh4.1.3
14/08/08 14:16:08 INFO util.VersionInfo: Subversion file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hbase-0.92.1-cdh4.1.3 -r Unknown
14/08/08 14:16:08 INFO util.VersionInfo: Compiled by jenkins on Sat Jan 26 17:11:38 PST 2013


On Fri, Aug 8, 2014 at 12:56 PM, Ted Yu wrote:

> Using a simplified version of your command, I saw the following in shell
> output (you may have noticed as well):
>
> An argument ignored (unknown or overridden): BLOOMFILTER
> An argument ignored (unknown or overridden): VERSIONS
> 0 row(s) in 2.1110 seconds
>
> Cheers
>
>
> On Fri, Aug 8, 2014 at 12:23 PM, Colin Kincaid Williams wrote:
>
> > I have discovered the error. I made the mistake regarding the compression
> > and the bloom filter. The new table doesn't have them enabled, and the old
> > does. However I'm wondering how I can create tables with splits and bf and
> > compression enabled. Shouldn't the following command return an error?
> >
> > hbase(main):001:0> create 'ADMd5','a',{
> > hbase(main):002:1* BLOOMFILTER => 'ROW',
> > hbase(main):003:1* VERSIONS => '1',
> > hbase(main):004:1* COMPRESSION => 'SNAPPY',
> > hbase(main):005:1* MIN_VERSIONS => '0',
> > hbase(main):006:1* SPLITS =>['/++ASUZm4u7YsTcF/VtK6Q==',
> > hbase(main):007:2* '/zyuFR1VmhJyF4rbWsFnEg==',
> > hbase(main):008:2* '0sZYnBd83ul58d1O8I2JnA==',
> > hbase(main):009:2* '2+03N7IicZH3ltrqZUX6kQ==',
> > hbase(main):010:2* '4+/slRQtkBDU7Px6C9MAbg==',
> > hbase(main):011:2* '6+1dGCQ/IBrCsrNQXe/9xQ==',
> > hbase(main):012:2* '7+2pvtpHUQHWkZJoouR9wQ==',
> > hbase(main):013:2* '8+4n2deXhzmrpe//2Fo6Fg==',
> > hbase(main):014:2* '9+4SKW/BmNzpL68cXwKV1Q==',
> > hbase(main):015:2* 'A+4ajStFkjEMf36cX5D9xg==',
> > hbase(main):016:2* 'B+6Zm6Kccb3l6iM2L0epxQ==',
> > hbase(main):017:2* 'C+6lKKDiOWl5qrRn72fNCw==',
> > hbase(main):018:2* 'D+6dZMyn7m+NhJ7G07gqaw==',
> > hbase(main):019:2* 'E+6BrimmrpAd92gZJ5hyMw==',
> > hbase(main):020:2* 'G+5tisu4xWZMOJnDHeYBJg==',
> > hbase(main):021:2* 'I+7fRy4dvqcM/L6dFRQk9g==',
> > hbase(main):022:2* 'J+8ECMw1zeOyjfOg/ypXJA==',
> > hbase(main):023:2* 'K+7tenLYn6a1aNLniL6tbg==',]}
> > 0 row(s) in 1.8010 seconds
> >
> > hbase(main):024:0> describe 'ADMd5'
> > DESCRIPTION                                              ENABLED
> >  {NAME => 'ADMd5', FAMILIES => [{NAME => 'a',             true
> >  BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
> >  VERSIONS => '3', COMPRESSION => 'NONE',
> >  MIN_VERSIONS => '0', TTL => '2147483647',
> >  BLOCKSIZE => '65536', IN_MEMORY => 'false',
> >  BLOCKCACHE => 'true'}]}
> > 1 row(s) in 0.0420 seconds
> >
> >
> > On Thu, Aug 7, 2014 at 5:50 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
> >
> > > Hi Colin,
> > >
> > > Just to make sure.
> > >
> > > Is table A from the source cluster and not compressed, and table B in the
> > > destination cluster and SNAPPY compressed? Is that correct? Then the ratio
> > > should be the opposite. Are you able to du -h from hadoop to see if all
> > > regions are evenly bigger or if anything else is wrong?
> > >
> > >
> > > 2014-08-07 20:44 GMT-04:00 Colin Kincaid Williams:
> > >
> > > > I haven't yet tried to major compact table B. I will look up some
> > > > documentation on WALs and snapshots to find this information in the hdfs
> > > > filesystem tomorrow. Could it be caused by the bloomfilter existing on
> > > > table B, but not table A? The funny thing is the source table is smaller
> > > > than the destination.
> > > >
> > > >
> > > > On Thu, Aug 7, 2014 at 4:50 PM, Esteban Gutierrez <esteban@cloudera.com> wrote:
> > > >
> > > > > Hi Colin,
> > > > >
> > > > > Have you verified if the content of /a_d includes WALs and/or the content
> > > > > of the snapshots or the HBase archive? Have you tried to major compact
> > > > > table B? Does it make any difference?
> > > > >
> > > > > regards,
> > > > > esteban.
> > > > >
> > > > > --
> > > > > Cloudera, Inc.
> > > > >
> > > > >
> > > > > On Thu, Aug 7, 2014 at 2:00 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
> > > > >
> > > > > > I used the copy table command to copy a database between the original
> > > > > > cluster A and a new cluster B. I have noticed that the rootdir is larger
> > > > > > than 2X the size of the original. I am trying to account for such a large
> > > > > > difference. The following are some details about the table.
> > > > > >
> > > > > > I'm trying to figure out why my copied table is more than 2X the size of
> > > > > > the original table. Could the bloomfilter itself account for this?
> > > > > >
> > > > > > The guide I used as a reference:
> > > > > >
> > > > > > http://blog.pivotal.io/pivotal/products/migrating-an-apache-hbase-table-between-different-clusters
> > > > > >
> > > > > > Supposedly the original command used to create the table on cluster A:
> > > > > >
> > > > > > create 'ADMd5', {NAME => 'a', BLOOMFILTER => 'ROW', VERSIONS => '1',
> > > > > > COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0'}
> > > > > >
> > > > > > How I created the target table on cluster B:
> > > > > >
> > > > > > create 'ADMd5','a',{
> > > > > > BLOOMFILTER => 'ROW',
> > > > > > VERSIONS => '1',
> > > > > > COMPRESSION => 'SNAPPY',
> > > > > > MIN_VERSIONS => '0',
> > > > > > SPLITS =>['/++ASUZm4u7YsTcF/VtK6Q==',
> > > > > > '/zyuFR1VmhJyF4rbWsFnEg==',
> > > > > > '0sZYnBd83ul58d1O8I2JnA==',
> > > > > > '2+03N7IicZH3ltrqZUX6kQ==',
> > > > > > '4+/slRQtkBDU7Px6C9MAbg==',
> > > > > > '6+1dGCQ/IBrCsrNQXe/9xQ==',
> > > > > > '7+2pvtpHUQHWkZJoouR9wQ==',
> > > > > > '8+4n2deXhzmrpe//2Fo6Fg==',
> > > > > > '9+4SKW/BmNzpL68cXwKV1Q==',
> > > > > > 'A+4ajStFkjEMf36cX5D9xg==',
> > > > > > 'B+6Zm6Kccb3l6iM2L0epxQ==',
> > > > > > 'C+6lKKDiOWl5qrRn72fNCw==',
> > > > > > 'D+6dZMyn7m+NhJ7G07gqaw==',
> > > > > > 'E+6BrimmrpAd92gZJ5hyMw==',
> > > > > > 'G+5tisu4xWZMOJnDHeYBJg==',
> > > > > > 'I+7fRy4dvqcM/L6dFRQk9g==',
> > > > > > 'J+8ECMw1zeOyjfOg/ypXJA==',
> > > > > > 'K+7tenLYn6a1aNLniL6tbg==']}
> > > > > >
> > > > > > How the tables now appear in hbase shell:
> > > > > >
> > > > > > table A:
> > > > > >
> > > > > > describe 'ADMd5'
> > > > > > DESCRIPTION                                              ENABLED
> > > > > >  {NAME => 'ADMd5', FAMILIES => [{NAME => 'a',             true
> > > > > >  BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
> > > > > >  VERSIONS => '3', COMPRESSION => 'NONE',
> > > > > >  MIN_VERSIONS => '0', TTL => '2147483647',
> > > > > >  BLOCKSIZE => '65536', IN_MEMORY => 'false',
> > > > > >  BLOCKCACHE => 'true'}]}
> > > > > > 1 row(s) in 0.0370 seconds
> > > > > >
> > > > > > table B:
> > > > > >
> > > > > > hbase(main):003:0> describe 'ADMd5'
> > > > > > DESCRIPTION                                              ENABLED
> > > > > >  {NAME => 'ADMd5', FAMILIES => [{NAME => 'a',             true
> > > > > >  BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0',
> > > > > >  VERSIONS => '1', COMPRESSION => 'SNAPPY',
> > > > > >  MIN_VERSIONS => '0', TTL => '2147483647',
> > > > > >  BLOCKSIZE => '65536', IN_MEMORY => 'false',
> > > > > >  BLOCKCACHE => 'true'}]}
> > > > > > 1 row(s) in 0.0280 seconds
> > > > > >
> > > > > > The containing folder size in hdfs:
> > > > > >
> > > > > > table A:
> > > > > > sudo -u hdfs hadoop fs -dus -h /a_d
> > > > > > dus: DEPRECATED: Please use 'du -s' instead.
> > > > > > 227.4g  /a_d
> > > > > >
> > > > > > table B:
> > > > > > sudo -u hdfs hadoop fs -dus -h /a_d
> > > > > > dus: DEPRECATED: Please use 'du -s' instead.
> > > > > > 501.0g  /a_d
> > > > > >
> > > > > > https://gist.github.com/drocsid/80bba7b6b19d64fde6c2
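
A note on the create command quoted in this thread: the attributes passed in a separate dictionary after 'a' were not applied to the family. Ted's shell reported BLOOMFILTER and VERSIONS as ignored, and the describe output shows the defaults. Below is a minimal sketch of one way to combine family attributes with splits. It assumes the shell accepts a family dictionary that carries NAME, followed by a separate SPLITS dictionary; this has not been checked against 0.92.1 here, so confirm the result with describe. The Python helper render_create is hypothetical and only assembles the statement text to paste into the shell. Only the first two split keys from the thread are shown.

```python
def render_create(table, family, family_attrs, splits):
    """Build the text of an HBase shell `create` statement (hypothetical helper)."""
    # Family attributes go in the SAME dictionary as NAME, so they are
    # tied to that column family instead of floating in a separate argument.
    attrs = [f"NAME => '{family}'"]
    attrs += [f"{key} => '{value}'" for key, value in family_attrs.items()]
    family_spec = "{" + ", ".join(attrs) + "}"

    # Split keys go in their own dictionary, apart from the family
    # (assumed form; verify against your shell version).
    split_list = ", ".join(f"'{key}'" for key in splits)
    return f"create '{table}', {family_spec}, {{SPLITS => [{split_list}]}}"


statement = render_create(
    "ADMd5",
    "a",
    {
        "BLOOMFILTER": "ROW",
        "VERSIONS": "1",
        "COMPRESSION": "SNAPPY",
        "MIN_VERSIONS": "0",
    },
    # First two split keys from the thread; the rest are omitted for brevity.
    ["/++ASUZm4u7YsTcF/VtK6Q==", "/zyuFR1VmhJyF4rbWsFnEg=="],
)
print(statement)
```

After running the generated statement, describe 'ADMd5' should list BLOOMFILTER => 'ROW' and COMPRESSION => 'SNAPPY' if the attributes were accepted; if it still shows 'NONE', the shell did not apply them.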