From: Mohammad Tariq
Date: Wed, 12 Dec 2012 18:37:13 +0530
Subject: Re: Modify the number of map tasks
To: user

Can I have a look at your config files?

Regards,
    Mohammad Tariq


On Wed, Dec 12, 2012 at 6:31 PM, imen Megdiche wrote:

> I ran start-all.sh and all the daemons started without problems, but the
> log of the TaskTracker looks like this:
>
>
> 2012-12-12 13:53:45,495 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = megdiche-OptiPlex-GX280/127.0.1.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.4
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
> ************************************************************/
> 2012-12-12 13:53:47,009 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-12-12 13:53:47,331 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-12-12 13:53:47,336 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-12-12 13:53:47,336 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
> 2012-12-12 13:53:48,165 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-12-12 13:53:48,192 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
> 2012-12-12 13:53:48,513 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local
>     at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:162)
>     at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:128)
>     at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2560)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1426)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
> 2012-12-12 13:53:48,519 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at megdiche-OptiPlex-GX280/127.0.1.1
> ************************************************************/
>
>
> 2012/12/12 Mohammad Tariq
>
>> I would first check whether all the daemons are running properly, before
>> anything else. If a problem is found, the next place to look is the log of
>> each daemon.
>>
>> The correct command to check the status of a job from the command line is:
>> hadoop job -status jobID
>> (Mind the space after 'job' and remove 'command' from the statement.)
>>
>> HTH
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>> On Wed, Dec 12, 2012 at 6:14 PM, imen Megdiche wrote:
>>
>>> My goal is to analyze the response time of MapReduce depending on the size
>>> of the input files. I need to change the number of map and/or reduce
>>> tasks and retrieve the execution time.
>>> So it turns out that nothing works locally on my PC:
>>> neither the command "hadoop job-status command job_local_0001" (which
>>> returns "no job found")
>>> nor localhost:50030.
>>> I would be very grateful if you could help me better understand these
>>> problems.
>>>
>>>
>>> 2012/12/12 Mohammad Tariq
>>>
>>>> Are you working locally? What exactly is the issue?
>>>>
>>>> Regards,
>>>>     Mohammad Tariq
>>>>
>>>>
>>>> On Wed, Dec 12, 2012 at 6:00 PM, imen Megdiche wrote:
>>>>
>>>>> no
>>>>>
>>>>>
>>>>> 2012/12/12 Mohammad Tariq
>>>>>
>>>>>> Any luck with "localhost:50030"?
>>>>>>
>>>>>> Regards,
>>>>>>     Mohammad Tariq
>>>>>>
>>>>>>
>>>>>> On Wed, Dec 12, 2012 at 5:53 PM, imen Megdiche <imen.megdiche@gmail.com> wrote:
>>>>>>
>>>>>>> I run the job through the command line.
>>>>>>>
>>>>>>>
>>>>>>> 2012/12/12 Mohammad Tariq
>>>>>>>
>>>>>>>> You have to replace "JobTrackerHost" in "JobTrackerHost:50030"
>>>>>>>> with the actual name of the machine where the JobTracker is running.
>>>>>>>> For example, if you are working on a local cluster, you have to use
>>>>>>>> "localhost:50030".
>>>>>>>>
>>>>>>>> Are you running your job through the command line or some IDE?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>     Mohammad Tariq
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Dec 12, 2012 at 5:42 PM, imen Megdiche <imen.megdiche@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Excuse me, the data size is 98 MB.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2012/12/12 imen Megdiche
>>>>>>>>>
>>>>>>>>>> The size of the data is 49 MB and the number of maps is 4.
>>>>>>>>>> The web UI JobTrackerHost:50030 does not work. What should I do to
>>>>>>>>>> make it appear? I work on Ubuntu.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2012/12/12 Mohammad Tariq
>>>>>>>>>>
>>>>>>>>>>> Hi Imen,
>>>>>>>>>>>
>>>>>>>>>>>      You can visit the MR web UI at "JobTrackerHost:50030" and
>>>>>>>>>>> see all the useful information, like the no. of mappers, no. of
>>>>>>>>>>> reducers, time taken for the execution, etc.
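Besides the web UI, the elapsed time of a job can also be measured from the driver itself by timing the blocking call that runs the job. The sketch below is a minimal, self-contained illustration using only the JDK; `JobTimer` and `timeMillis` are hypothetical names, and the `Thread.sleep` stands in for the real job. In an old-API driver the timed task would wrap the blocking submission call (for example `JobClient.runJob(conf)`), which is an assumption about how the job in this thread is launched.

```java
import java.util.concurrent.TimeUnit;

public class JobTimer {
    /** Runs the given task and returns the elapsed wall-clock time in milliseconds. */
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime(); // monotonic clock, not affected by system clock changes
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-in workload; a real driver would run the MapReduce job here instead.
        long elapsed = timeMillis(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("Elapsed: " + elapsed + " ms");
    }
}
```

Running the same job several times for each input size and averaging the results would give a steadier picture than a single measurement.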
>>>>>>>>>>>
>>>>>>>>>>> One quick question for you: what is the size of your data, and
>>>>>>>>>>> what is the no. of maps you are getting right now?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 12, 2012 at 5:11 PM, imen Megdiche <imen.megdiche@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thank you Mohammad, but the number of map tasks is still the same
>>>>>>>>>>>> in the execution. Do you know how to capture the time spent on execution?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2012/12/12 Mohammad Tariq
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Imen,
>>>>>>>>>>>>>
>>>>>>>>>>>>>     You can add the "mapred.map.tasks" property to your
>>>>>>>>>>>>> mapred-site.xml file.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But it is just a hint for the InputFormat. The no. of
>>>>>>>>>>>>> maps is actually determined by the no. of InputSplits created by
>>>>>>>>>>>>> the InputFormat.
>>>>>>>>>>>>>
>>>>>>>>>>>>> HTH
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>     Mohammad Tariq
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Dec 12, 2012 at 4:11 PM, imen Megdiche <imen.megdiche@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am trying to force the number of maps for the MapReduce job with
>>>>>>>>>>>>>> the following code:
>>>>>>>>>>>>>> public static void main(String[] args) throws Exception {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     JobConf conf = new JobConf(WordCount.class);
>>>>>>>>>>>>>>     conf.set("mapred.job.tracker", "local");
>>>>>>>>>>>>>>     conf.set("fs.default.name", "local");
>>>>>>>>>>>>>>     conf.setJobName("wordcount");
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     conf.setOutputKeyClass(Text.class);
>>>>>>>>>>>>>>     conf.setOutputValueClass(IntWritable.class);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     // JobConf's method is setNumMapTasks; the value is only a hint
>>>>>>>>>>>>>>     conf.setNumMapTasks(6);
>>>>>>>>>>>>>>     conf.setMapperClass(Map.class);
>>>>>>>>>>>>>>     conf.setCombinerClass(Reduce.class);
>>>>>>>>>>>>>>     conf.setReducerClass(Reduce.class);
>>>>>>>>>>>>>>     ...
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But it doesn't work.
>>>>>>>>>>>>>> What can I do to modify the number of map and reduce tasks?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you
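To make the "it is just a hint" remark concrete, here is a small sketch of how a file-based InputFormat can turn the requested number of maps into a split size and then into a split count. It follows my understanding of the old-API `FileInputFormat` in Hadoop 1.x (split size = max(minSize, min(totalSize / hint, blockSize)), with a 10% slack on the last split); that formula is not stated in this thread and should be checked against the Hadoop source. The class and method names are hypothetical, the 64 MB block size is an assumed value, and only a single input file is modelled.

```java
public class SplitEstimate {
    // Slack factor: the last split may be up to 10% larger than the split size.
    private static final double SPLIT_SLOP = 1.1;

    /** Split size = max(minSize, min(goalSize, blockSize)), where goalSize = totalSize / hint. */
    static long splitSize(long totalSize, int numMapsHint, long minSize, long blockSize) {
        long goalSize = totalSize / Math.max(numMapsHint, 1);
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    /** Counts the splits that a single file of fileSize bytes would be cut into. */
    static int countSplits(long fileSize, int numMapsHint, long minSize, long blockSize) {
        long size = splitSize(fileSize, numMapsHint, minSize, blockSize);
        int splits = 0;
        long remaining = fileSize;
        while ((double) remaining / size > SPLIT_SLOP) {
            splits++;
            remaining -= size;
        }
        if (remaining > 0) {
            splits++; // the tail (at most 1.1 x split size) becomes the last split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 98 * mb;  // data size mentioned in the thread
        long blockSize = 64 * mb; // assumed block size, not taken from the thread
        for (int hint : new int[] {1, 2, 6}) {
            System.out.println("hint=" + hint + " -> "
                    + countSplits(fileSize, hint, 1, blockSize) + " split(s)");
        }
    }
}
```

Under this model the hint only matters while `totalSize / hint` stays below the block size and above the minimum split size. A larger minimum split size, or an InputFormat that computes its splits differently, overrides it.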

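On the TaskTracker log quoted in this thread: the daemon stops with "Does not contain a valid host:port authority: local", and the stack trace passes through `JobTracker.getAddress`. That most likely means the JobTracker address the daemon read was the literal value "local" rather than a host:port pair. The sketch below only illustrates the shape of such a check. It is not Hadoop's `NetUtils` code, `HostPortCheck` and `isHostPort` are hypothetical names, and "localhost:9001" is just an example value.

```java
public class HostPortCheck {
    /** Returns true if the value looks like "host:port" with a numeric port in 1..65535. */
    static boolean isHostPort(String value) {
        int colon = value.lastIndexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            return false; // no colon, empty host, or empty port
        }
        try {
            int port = Integer.parseInt(value.substring(colon + 1));
            return port > 0 && port <= 65535;
        } catch (NumberFormatException e) {
            return false; // port part is not a number
        }
    }

    public static void main(String[] args) {
        // "local" is the value from the error message; "localhost:9001" is an example only.
        for (String v : new String[] {"local", "localhost:9001"}) {
            System.out.println(v + " -> " + isHostPort(v));
        }
    }
}
```

A value like "local" is meaningful to a job that runs in-process, but a daemon that must connect to a JobTracker needs an address it can resolve, which is why the config files are the next thing to look at.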