hbase-user mailing list archives

From Lord Khan Han <khanuniver...@gmail.com>
Subject Re: Hbase export / import Why doubling the Table Size ?
Date Sat, 10 Dec 2011 18:09:29 GMT
When we export from an HBase table that has LZO compression enabled, is the
exported file decompressed, or is it written as-is with the LZO-compressed columns?



On Sat, Dec 10, 2011 at 6:40 PM, Lord Khan Han <khanuniverse1@gmail.com> wrote:

> It succeeds for both lzo and snappy. The content is HTML documents (web
> documents).
>
>
> hbase org.apache.hadoop.hbase.util.CompressionTest
> hdfs://localhost:8020/user/root/testfile.lzo lzo
>
> 11/12/10 18:37:04 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
>
> 11/12/10 18:37:04 INFO lzo.LzoCodec: Successfully loaded & initialized
> native-lzo library [hadoop-lzo rev 2ad6654f3e9cad97d13f716e51a0509253c0aabb]
>
> 11/12/10 18:37:04 INFO compress.CodecPool: Got brand-new compressor
>
> SUCCESS
>
>
>
>
>
> On Sat, Dec 10, 2011 at 1:03 PM, Lars George <lars.george@gmail.com> wrote:
>
>> Could you use the CompressionTest to verify that the library path is set
>> up properly?
>>
>> $ hbase org.apache.hadoop.hbase.util.CompressionTest
>> hdfs://<your-namenode>:8020/<some-writable-path>/test.lzo lzo
>>
>> Does it report OK? Same for Snappy? The reason I am asking is that when
>> it does not find the native libs it uses no compression at all, and if your
>> original was compressed then you will see the copied one being uncompressed
>> and therefore much larger.
>>
>> Also, what is the content like? How large are the cells that are stored?
>>
>> Lars
>>
>>
>> On Dec 10, 2011, at 8:53 AM, Lord Khan Han wrote:
>>
>> > I will check the reverse export/import to CDH3B4 today to see whether the
>> > size stays the same in that cluster.
>> >
>> > When we use Hadoop distcp, how can we deal with .META? We are copying
>> > one table, not all of them, and .META also holds region info, including
>> > DNS names, which are of course different in the new cluster.
>> >
>> > I tried the import again today with no compression. It doubled the
>> > exported file size! I mean, the exported HBase table is 200 GB, and when
>> > I import it without compression it grows to 400 GB. It is definitely
>> > writing something twice.
>> >
>> > thanks
>> >
>> >
>> >
>> > On Sat, Dec 10, 2011 at 2:19 AM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>> >
>> >> There's copytable (also an MR job - written by J-D), but it reuses the
>> >> mapper class from the Import.java, so it
>> >> probably won't make a difference.
>> >>
>> >> What I meant to say below... When you export/import the table from your
>> >> CDH3u2 cluster back to your CDH3B4
>> >> cluster, is the size still doubled?
>> >>
>> >>
>> >> If both clusters are shut down, you can use Hadoop's distcp to copy
>> >> directly on the filesystem level; in fact that might be your
>> >> best option.
>> >>
>> >> -- Lars
>> >>
>> >>
>> >> ----- Original Message -----
>> >> From: Lord Khan Han <khanuniverse1@gmail.com>
>> >> To: user@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
>> >> Cc:
>> >> Sent: Friday, December 9, 2011 4:05 PM
>> >> Subject: Re: Hbase export / import Why doubling the Table Size ?
>> >>
>> >> Thanks for your time..
>> >>
>> >> Is there any reliable way to copy a table between these clusters instead of
>> >> export/import?
>> >>
>> >>
>> >>
>> >> On Sat, Dec 10, 2011 at 1:39 AM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>> >>
>> >>> Hmm... I'm afraid I am out of options. If you want, you can try to copy the
>> >>> table from CDH3u2 to your CDH3B4 system, and see if the size remains doubled.
>> >>>
>> >>> Does this happen with a very small table, too? If so, you could take a small
>> >>> sample HFile and upload it (both the CDH3B4 and CDH3u2 versions) somewhere so
>> >>> that we can have a look.
>> >>>
>> >>>
>> >>> -- Lars
>> >>>
>> >>>
>> >>> ----- Original Message -----
>> >>> From: Lord Khan Han <khanuniverse1@gmail.com>
>> >>> To: user@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
>> >>> Cc:
>> >>> Sent: Friday, December 9, 2011 2:45 PM
>> >>> Subject: Re: Hbase export / import Why doubling the Table Size ?
>> >>>
>> >>> In an identically configured cluster (carbon copy), when I ran the import there
>> >>> was no increase in size; the size stayed the same.
>> >>>
>> >>> The problem is in CDH3u2.
>> >>>
>> >>>
>> >>> On Sat, Dec 10, 2011 at 12:42 AM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>> >>>
>> >>>> What happens when you export/import into the same (CDH3B4) cluster using a
>> >>>> new table name?
>> >>>> Does the size double as well?
>> >>>>
>> >>>>
>> >>>>
>> >>>> ----- Original Message -----
>> >>>> From: Lord Khan Han <khanuniverse1@gmail.com>
>> >>>> To: user@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
>> >>>> Cc:
>> >>>> Sent: Friday, December 9, 2011 2:27 PM
>> >>>> Subject: Re: Hbase export / import Why doubling the Table Size ?
>> >>>>
>> >>>> I flushed and major_compacted. Nothing changed. I have been stuck on
>> >>>> this for the last two days :( Any ideas?
>> >>>>
>> >>>>
>> >>>> On Sat, Dec 10, 2011 at 12:11 AM, Lord Khan Han <khanuniverse1@gmail.com> wrote:
>> >>>>
>> >>>>> I have now flushed and am compacting again.
>> >>>>>
>> >>>>> One more clue:
>> >>>>>
>> >>>>> I tested importing into CDH3B4 (same as the exporting cluster) with LZO.
>> >>>>> All is okay; the table size is the same.
>> >>>>> Then I upgraded that cluster to CDH3u2; the table is still okay and the same size.
>> >>>>>
>> >>>>> But when I try to import into CDH3u2, this size doubling happens.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Sat, Dec 10, 2011 at 12:07 AM, Lord Khan Han <khanuniverse1@gmail.com> wrote:
>> >>>>>
>> >>>>>> I ran major_compact but not flush. I will do it again now with flush.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Fri, Dec 9, 2011 at 11:58 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>> >>>>>>
>> >>>>>>> Can you try flushing and compacting the table? How did you measure the
>> >>>>>>> size?
>> >>>>>>>
>> >>>>>>> Both can be done from the shell using the 'flush' and 'major_compact'
>> >>>>>>> commands, resp.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> ----- Original Message -----
>> >>>>>>> From: Lord Khan Han <khanuniverse1@gmail.com>
>> >>>>>>> To: user@hbase.apache.org
>> >>>>>>> Cc:
>> >>>>>>> Sent: Friday, December 9, 2011 1:50 PM
>> >>>>>>> Subject: Hbase export / import Why doubling the Table Size ?
>> >>>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> We are using CDH3B4 and want to upgrade to CDH3u2. Before doing this,
>> >>>>>>> we built a separate cluster with the same config and installed CDH3u2.
>> >>>>>>>
>> >>>>>>> We exported our HBase table from the CDH3B4 cluster and imported it into
>> >>>>>>> the new CDH3u2 cluster. The table uses LZO, and both clusters have the same config.
>> >>>>>>>
>> >>>>>>> After the import finished, the HBase table size had doubled, even though it
>> >>>>>>> is configured to use LZO. We changed the table to Snappy and imported again,
>> >>>>>>> with the same result: the table size is multiplied by 2 in the new CDH3u2 cluster.
>> >>>>>>>
>> >>>>>>> We could not find out why. Are there any ideas about this?
>> >>>>>>>
>> >>>>>>> thanks
>> >>>>>>>
>> >>>>>>> Khan
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>
>> >>
>>
>>
>
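
For readers reproducing the size comparison discussed in this thread, the following is a minimal sketch, not a definitive procedure. The table name `mytable`, the `/hbase/mytable` path (which assumes the default HBase root directory layout), and the helper `size_ratio` are hypothetical. The `flush` and `major_compact` shell commands are the ones named in the thread, and `hadoop fs -dus` is assumed to be available as the summary form of `du` in the Hadoop releases of that era.

```shell
#!/bin/sh
# Sketch only: table name, HDFS path, and sizes below are placeholders.

# Print how many times larger the imported table is than the source table.
# Both arguments are sizes in the same unit, e.g. bytes from `hadoop fs -dus`.
size_ratio() {
  awk -v src="$1" -v dst="$2" 'BEGIN {
    if (src <= 0) { print "source size must be positive"; exit 1 }
    printf "%.2f\n", dst / src
  }'
}

# On each cluster, flush and major-compact first so the on-disk sizes are
# comparable (run manually; these need a live cluster):
#   echo "flush 'mytable'"         | hbase shell
#   echo "major_compact 'mytable'" | hbase shell
#   hadoop fs -dus /hbase/mytable

# Example with the figures reported in this thread (200 GB vs. 400 GB):
size_ratio 200 400
```

A ratio near 2.00 matches the doubling reported here. Note that `major_compact` only requests the compaction, so the size should be read after the compaction has finished.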
