From: w00t w00t
To: "user@hive.apache.org"
Date: Tue, 13 Aug 2013 08:13:11 +0100 (BST)
Subject: Re: Hive and Lzo Compression
Message-ID: <1376377991.60751.YahooMailNeo@web173201.mail.ir2.yahoo.com>
References: <1375952557.19563.YahooMailNeo@web173204.mail.ir2.yahoo.com>

Thanks for your replies and the link.

I got it working, but I still wonder why the CREATE TABLE statement also worked without the STORED AS clause. That's what puzzles me a bit.

But I will use the STORED AS clause to be on the safe side.



From: Lefty Leverenz <leftyleverenz@gmail.com>
To: user@hive.apache.org
Cc: w00t w00t <w00tel@yahoo.de>
Sent: Saturday, 10 August 2013, 19:06
Subject: Re: Hive and Lzo Compression

I'm not seeing any documentation link in Sanjay's message, so here it is again (in the Hive wiki's language manual): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO.

On Thu, Aug 8, 2013 at 3:30 PM, Sanjay Subramanian <Sanjay.Subramanian@wizecommerce.com> wrote:
Please refer to the documentation here.
Let me know if you need more clarification so that we can make this document better and more complete.

Thanks

sanjay

From: w00t w00t <w00tel@yahoo.de>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, w00t w00t <w00tel@yahoo.de>
Date: Thursday, August 8, 2013 2:02 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Hive and Lzo Compression

Hello,

I have started running Hive with Lzo compression on Hortonworks 1.2.

I have managed to install and configure Lzo, and hive -e "set io.compression.codecs" shows me the Lzo codecs:

io.compression.codecs=
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec,
org.apache.hadoop.io.compress.BZip2Codec

However, I have some questions and would be grateful for your help.

(1) CREATE TABLE statement

I read in various postings that the CREATE TABLE statement needs the following STORED AS clause:

CREATE EXTERNAL TABLE txt_table_lzo (
   txt_line STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '||||'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/user/myuser/data/in/lzo_compressed';

With this table, SELECT statements on the Lzo data now run without any problems.

However, I also created a table on the same data without the STORED AS clause:

CREATE EXTERNAL TABLE txt_table_lzo_tst (
   txt_line STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '||||'
LOCATION '/user/myuser/data/in/lzo_compressed';

The interesting thing is that SELECT statements on this table work as well.

Can you explain why the second CREATE TABLE statement also works?
What should I use in DDLs?
Is it best practice to use the STORED AS clause with the "DeprecatedLzoTextInputFormat"? Or should I remove it?
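
One way to look into the first question is to compare the storage details Hive recorded for the two tables. The sketch below only uses DESCRIBE FORMATTED on the two table names from the message. A likely explanation, which is an assumption and not confirmed anywhere in this thread, is that a table created without STORED AS falls back to the default text file format, and that its input format picks a decompression codec from the file extension, so .lzo files stay readable as long as the Lzo codecs are registered in io.compression.codecs.

```sql
-- Compare the "InputFormat:" and "OutputFormat:" lines in the two outputs.
DESCRIBE FORMATTED txt_table_lzo;
DESCRIBE FORMATTED txt_table_lzo_tst;
```

If the second table reports a plain text input format, keeping the explicit STORED AS clause is probably still the safer choice, because the Lzo input format is what allows indexed .lzo files to be split across mappers. That trade-off is an assumption here and is worth checking against the LanguageManual+LZO page.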
 
(2) Output and Intermediate Compression Settings

I want to use output compression.

In "Programming Hive" by Capriolo, Wampler, and Rutherglen, the following commands are recommended:
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

However, in some forum posts I found the following recommended settings:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;

Am I right that the first settings are for Hadoop versions prior to 0.23?
Or is there any other reason why the settings are different?

I am using Hadoop 1.1.2 with Hive 0.10.0.
Which settings would you recommend?
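
For the Hadoop 1.1.2 / Hive 0.10.0 setup described above, a minimal sketch of the output-compression settings could look like this. It assumes that the mapreduce.output.fileoutputformat.* names belong to the property renaming that arrived with the Hadoop 0.21/0.23 line, so a 1.x cluster would only recognize the older mapred.* names. Please verify this against the deprecated-properties list for your Hadoop release.

```sql
-- Hadoop 1.x property names (assumed to apply to Hadoop 1.1.2)
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- Hadoop 0.23 / 2.x equivalents, shown for comparison only
-- SET mapreduce.output.fileoutputformat.compress=true;
-- SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;
```

Running SET with just the property name (for example SET mapred.output.compression.codec;) prints the current value, which is a quick way to confirm the setting took effect in the session.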

--------------

I also want to compress intermediate results.

Again, "Programming Hive" recommends the following settings:
SET hive.exec.compress.intermediate=true;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

Is this the right setting?

Or should I again use these settings (which look more valid for Hadoop 0.23 and greater)?
SET hive.exec.compress.intermediate=true;
SET mapreduce.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
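
The intermediate-compression case can be sketched the same way. One assumption to flag: in the Hadoop deprecated-properties table, the newer name that corresponds to mapred.map.output.compression.codec appears to be mapreduce.map.output.compress.codec, not mapreduce.map.output.compression.codec as in the forum variant quoted above, so the spelling is worth double-checking before relying on it.

```sql
-- Hadoop 1.x property names (assumed to apply to Hadoop 1.1.2)
SET hive.exec.compress.intermediate=true;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- Hadoop 0.23 / 2.x equivalent, shown for comparison only
-- SET mapreduce.map.output.compress.codec=com.hadoop.compression.lzo.LzopCodec;

-- Print the effective value to confirm the setting was picked up
SET mapred.map.output.compression.codec;
```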
 
Thanks
 

-- Lefty
