hadoop-common-user mailing list archives

From Adarsh Sharma <adarsh.sha...@orkash.com>
Subject Re: Running Back to Back Map-reduce jobs
Date Tue, 07 Jun 2011 10:45:42 GMT
Harsh J wrote:
> Yes, I believe Oozie does have Pipes and Streaming action helpers as well.
>
> On Thu, Jun 2, 2011 at 5:05 PM, Adarsh Sharma <adarsh.sharma@orkash.com> wrote:
>   
>> Ok. Is it valid for running jobs through Hadoop Pipes too?
>>
>> Thanks
>>
>> Harsh J wrote:
>>     
>>> Oozie's workflow feature may exactly be what you're looking for. It
>>> can also do much more than just chain jobs.
>>>
>>> Check out additional features at: http://yahoo.github.com/oozie/
>>>
>>> On Thu, Jun 2, 2011 at 4:48 PM, Adarsh Sharma <adarsh.sharma@orkash.com>
>>> wrote:
>>>
>>>       
After following the points below, I am confused about the examples used 
in the documentation:

http://yahoo.github.com/oozie/releases/3.0.0/WorkflowFunctionalSpec.html#a3.2.2.3_Pipes

What I want to achieve is to terminate a job only on my say-so, i.e. if 
I want to run another map-reduce job after the completion of one, the 
second job should run and then terminate only after my code has executed.
I have struggled to find a simple example that demonstrates this concept. 
In the Oozie documentation, they just set parameters and use them.
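If I read the spec correctly, it seems two map-reduce actions could be 
chained by pointing the <ok> transition of the first action at the 
second; roughly like this (all names and HDFS paths below are my own 
placeholders, not taken from the spec):

```xml
<workflow-app name="chained-wordcount" xmlns="uri:oozie:workflow:0.2">
  <start to="first-pass"/>

  <action name="first-pass">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <pipes>
        <program>bin/wordcount#wordcount</program>
      </pipes>
      <configuration>
        <property>
          <name>mapred.input.dir</name>
          <value>/user/adarsh/gutenberg</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>/user/adarsh/gutenberg-out</value>
        </property>
      </configuration>
    </map-reduce>
    <!-- the second job starts only if the first one succeeds -->
    <ok to="second-pass"/>
    <error to="fail"/>
  </action>

  <action name="second-pass">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <pipes>
        <program>bin/secondjob#secondjob</program>
      </pipes>
      <configuration>
        <property>
          <!-- input of the second job = output of the first -->
          <name>mapred.input.dir</name>
          <value>/user/adarsh/gutenberg-out</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>/user/adarsh/final-out</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Map-reduce job failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

But I am not sure this is what the examples in the spec intend.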

For example, a simple Hadoop Pipes job is executed by:

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"

int main(int argc, char *argv[]) {
  // Run the map or reduce task this process was launched for.
  return HadoopPipes::runTask(HadoopPipes::TemplateFactory<WordCountMap,
                              WordCountReduce>());
}

Now, if I want to run another job after this one on the reduced data in 
HDFS, how could that be done? Do I need to add some code?

Thanks




>>>> Dear all,
>>>>
>>>> I ran several map-reduce jobs in Hadoop Cluster of 4 nodes.
>>>>
>>>> Now this time I want a map-reduce job to be run again after one.
>>>>
>>>> For example, to clarify my point, suppose wordcount is run on the
>>>> gutenberg file in HDFS, and after completion:
>>>>
>>>> 11/06/02 15:14:35 WARN mapred.JobClient: No job jar file set.  User
>>>> classes
>>>> may not be found. See JobConf(Class) or JobConf#setJar(String).
>>>> 11/06/02 15:14:35 INFO mapred.FileInputFormat: Total input paths to
>>>> process
>>>> : 3
>>>> 11/06/02 15:14:36 INFO mapred.JobClient: Running job:
>>>> job_201106021143_0030
>>>> 11/06/02 15:14:37 INFO mapred.JobClient:  map 0% reduce 0%
>>>> 11/06/02 15:14:50 INFO mapred.JobClient:  map 33% reduce 0%
>>>> 11/06/02 15:14:59 INFO mapred.JobClient:  map 66% reduce 11%
>>>> 11/06/02 15:15:08 INFO mapred.JobClient:  map 100% reduce 22%
>>>> 11/06/02 15:15:17 INFO mapred.JobClient:  map 100% reduce 100%
>>>> 11/06/02 15:15:25 INFO mapred.JobClient: Job complete:
>>>> job_201106021143_0030
>>>> 11/06/02 15:15:25 INFO mapred.JobClient: Counters: 18
>>>>
>>>>
>>>>
>>>> Then another map-reduce job is started on the output (or the original
>>>> data), e.g.:
>>>>
>>>> 11/06/02 15:14:36 INFO mapred.JobClient: Running job:
>>>> job_201106021143_0030
>>>> 11/06/02 15:14:37 INFO mapred.JobClient:  map 0% reduce 0%
>>>> 11/06/02 15:14:50 INFO mapred.JobClient:  map 33% reduce 0%
>>>>
>>>> Is this possible, or are there any parameters to achieve it?
>>>>
>>>> Please guide .
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>>         
>>>
>>>
>>>       
>>     
>
>
>
>   

