From: unmesha sreeveni <unmeshabiju@gmail.com>
Date: Thu, 12 Jun 2014 10:29:11 +0530
Subject: Re: Counters in MapReduce
To: User Hadoop <user@hadoop.apache.org>

I tried setting an enum counter to count the number of lines in the output
file of job 3, but I am getting:

14/06/12 10:12:30 INFO mapred.JobClient:     Total committed heap usage (bytes)=1238630400
conf3
Exception in thread "main" java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
        at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:116)
        at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:491)

Below is my current code:

static enum UpdateCounter {
    INCOMING_ATTR
}

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    int res = ToolRunner.run(conf, new Driver(), args);
    System.exit(res);
}

@Override
public int run(String[] args) throws Exception {
    while (counter >= 0) {
        Configuration conf = getConf();
        /*
         * Job 1
         */
        Job job1 = new Job(conf, "");
        // other configuration
        job1.setMapperClass(ID3ClsLabelMapper.class);
        job1.setReducerClass(ID3ClsLabelReducer.class);
        Path in = new Path(args[0]);
        Path out1 = new Path(CL);
        if (counter == 0) {
            FileInputFormat.addInputPath(job1, in);
        } else {
            FileInputFormat.addInputPath(job1, out5);
        }
        FileInputFormat.addInputPath(job1, in);
        FileOutputFormat.setOutputPath(job1, out1);
        job1.waitForCompletion(true);
        /*
         * Job 2
         */
        Configuration conf2 = getConf();
        Job job2 = new Job(conf2, "");
        Path out2 = new Path(ANC);
        FileInputFormat.addInputPath(job2, in);
        FileOutputFormat.setOutputPath(job2, out2);
        job2.waitForCompletion(true);
        /*
         * Job 3
         */
        Configuration conf3 = getConf();
        Job job3 = new Job(conf3, "");
        System.out.println("conf3");
        Path out5 = new Path(args[1]);
        if (fs.exists(out5)) {
            fs.delete(out5, true);
        }
        FileInputFormat.addInputPath(job3, out2);
        FileOutputFormat.setOutputPath(job3, out5);
        job3.waitForCompletion(true);
        FileInputFormat.addInputPath(job3, new Path(args[0]));
        FileOutputFormat.setOutputPath(job3, out5);
        job3.waitForCompletion(true);
        counter = job3.getCounters().findCounter(UpdateCounter.INCOMING_ATTR).getValue();
    }
    return 0;
}

Am I doing anything wrong?


On Mon, Jun 9, 2014 at 4:37 PM, Krishna Kumar <kkumar@nanigans.com> wrote:

> You should use FileStatus to decide what files you want to include in the
> InputPath, and use the FileSystem class to delete or process the
> intermediate / final paths. Moving each job in your iteration logic into
> different methods would help keep things simple.
>
>
> From: unmesha sreeveni <unmeshabiju@gmail.com>
> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Date: Monday, June 9, 2014 at 6:02 AM
> To: User Hadoop <user@hadoop.apache.org>
> Subject: Re: Counters in MapReduce
>
> Ok, I will check out counters.
> And after the 1st iteration the input file to job 1 will be the output
> file of job 3. How do I give that,
> in order to satisfy two conditions:
> first iteration: the user's input file;
> after the first iteration: job 3's output file as job 1's input.
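Following the advice quoted above (one method per concern, FileSystem-style cleanup of intermediate paths, and switching job 1's input to job 3's output after the first pass), the driver's bookkeeping can be sketched without a cluster. The snippet below is a minimal stand-in, not the poster's actual code: java.nio.file replaces Hadoop's FileSystem/FileStatus, a plain long replaces the job counter, and all path names are made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IterativeDriver {
    // Job 1's input: the user's input on the first pass,
    // job 3's previous output on every later pass.
    static String job1Input(long iteration, String userInput, String job3Output) {
        return iteration == 0 ? userInput : job3Output;
    }

    // Stand-in for FileSystem.delete(path, true): recursively remove an
    // output directory so the next iteration can recreate it.
    static void deleteRecursive(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Delete children before parents by sorting deepest-first.
            for (Path p : walk.sorted(Comparator.reverseOrder())
                              .collect(Collectors.toList())) {
                Files.delete(p);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate job 3's output directory from a previous iteration.
        Path out = Files.createTempDirectory("job3-out");
        Files.createFile(out.resolve("part-r-00000"));

        System.out.println(job1Input(0, "data/in", out.toString())); // data/in
        System.out.println(job1Input(1, "data/in", out.toString())); // job 3's output dir

        deleteRecursive(out);
        System.out.println(Files.exists(out)); // false
    }
}
```

Keeping the path selection in one small method like this also avoids configuring the same job's input twice, which is easy to do when the if/else and the unconditional addInputPath calls sit side by side in the loop.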
--
Thanks & Regards

Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/