From: Tony Burton
To: "'user@hadoop.apache.org'"
Date: Wed, 28 Nov 2012 12:00:11 +0000
Subject: RE: Map output compression in Hadoop 1.0.3

Got it - thanks Harsh.

-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: 28 November 2012 11:41
To:
Subject: Re: Map output compression in Hadoop 1.0.3

No, I see your point of confusion, and I can think of others who may be confused in the same way, but the API changes did not trigger the config naming changes. For simplicity, you can instead view the config naming changes as an MR1 vs. MR2 thing. So unless you move to YARN-based MR2, keep using the mapred.* style properties.

On Wed, Nov 28, 2012 at 5:07 PM, Tony Burton wrote:
> Also, another point that prompted my initial question: I'd come across "mapred.compress.map.output" in the documentation, but I wasn't 100% sure whether there has been, or will be, any correspondence between config settings like this one and the naming of the stable and new APIs.
>
> For example, we've got o.a.h.mapreduce.Job rather than o.a.h.mapred.JobConf, as previously mentioned, from the "mapred" and "mapreduce" parts of the API.
>
> Are config settings that begin with mapred.* related to the stable API, with the implication that there's a mapreduce.* equivalent (e.g. mapred.compress.map.output vs mapreduce.compress.map.output), or am I seeing a connection that doesn't exist?
>
> (Hope that makes sense!)
>
> -----Original Message-----
> From: Harsh J [mailto:harsh@cloudera.com]
> Sent: 28 November 2012 11:25
> To:
> Subject: Re: Map output compression in Hadoop 1.0.3
>
> Hi,
>
> The property mapred.output.compress, as its name reads, controls job-output compression, not intermediate/transient data compression, which is what you mean by "Map output compression".
>
> Also note that this property is a per-job one and can be toggled on or off for each job specifically, if a user wants.
>
> These should be, exhaustively, the ways to turn on "Map output compression" for MR1:
>
> 1. Set "mapred.compress.map.output" to true in your client's mapred-site.xml to turn it on for all jobs run from that client machine.
> 2. Set the above in the cluster, with <final>true</final> at every node (JT plus TTs), and restart them, to turn it on for all jobs, regardless of what the job itself specifies.
> 3. Turn it on on a per-job basis:
> 3.1. Stable API: JobConf.setCompressMapOutput(true);
> 3.2. New API: Job.getConfiguration().setBoolean("mapred.compress.map.output", true);
>
> On Wed, Nov 28, 2012 at 4:42 PM, Tony Burton wrote:
>> Hi,
>>
>> Quick question: What's the best way to turn on Map Output Compression in Hadoop 1.0.3? The tutorial at http://hadoop.apache.org/docs/r1.0.3/mapred_tutorial.html says to use JobConf.setCompressMapOutput(boolean), but I'm using o.a.h.mapreduce.Job rather than o.a.h.mapred.JobConf.
>>
>> Is it simply a case of using getConf.set("mapred.output.compress", true) then constructing my Job from the Configuration object, or is there a more direct way that I've missed?
>>
>> Thanks,
>>
>> Tony
>
> --
> Harsh J

--
Harsh J
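
For reference, options 1 and 2 in Harsh's list both come down to a property entry in mapred-site.xml. The fragment below is a minimal sketch and is not taken from the thread: it assumes the standard Hadoop configuration-file layout, and the <final> element reflects an assumption about what option 2 intends (a cluster-side setting that jobs cannot override). For the client-side option 1, the <final> line would be left out.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml (MR1): illustrative sketch only -->
<configuration>
  <property>
    <!-- Compress intermediate map output, not the final job output -->
    <name>mapred.compress.map.output</name>
    <value>true</value>
    <!-- Assumption: include only for the cluster-wide case, so that
         per-job settings cannot switch compression back off -->
    <final>true</final>
  </property>
</configuration>
```

With the cluster-wide variant, the JobTracker and TaskTrackers need a restart before the setting takes effect, as noted in the thread.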