From: Harsh J
Date: Fri, 10 Jan 2014 07:47:13 +0530
Subject: Re: Wordcount Hadoop pipes C++ Running issue
To: user@hadoop.apache.org

Hello,

Do you have a proper MR cluster configured? Does your
hadoop-1.2.1/conf/mapred-site.xml point mapred.job.tracker to a specific
hostname and port, and is a JT+TT pair running? I believe your error is
due to the Pipes app running into issues under the LocalJobRunner
default execution mode. (A minimal sketch of such a mapred-site.xml is
included at the end of this message.)

On Wed, Jan 8, 2014 at 4:40 PM, Massimo Simoniello wrote:
> Hi all,
>
> I am trying to run the wordcount example in C++ following this link,
> which describes how to run the WordCount program in C++.
>
> So I have this code in the file wordcount.cpp:
>
> #include <algorithm>
> #include <limits>
> #include <string>
>
> #include "stdint.h"  // <--- to prevent uint64_t errors!
> > #include "Pipes.hh" > #include "TemplateFactory.hh" > #include "StringUtils.hh" > > using namespace std; > > class WordCountMapper : public HadoopPipes::Mapper { > public: > // constructor: does nothing > WordCountMapper( HadoopPipes::TaskContext& context ) { > } > > // map function: receives a line, outputs (word,"1") > // to reducer. > void map( HadoopPipes::MapContext& context ) { > //--- get line of text --- > string line = context.getInputValue(); > > //--- split it into words --- > vector< string > words = HadoopUtils::splitString( line, " " ); > > //--- emit each word tuple (word, "1" ) --- > for ( unsigned int i=0; i < words.size(); i++ ) { > context.emit( words[i], HadoopUtils::toString( 1 ) ); > } > } > }; > > class WordCountReducer : public HadoopPipes::Reducer { > public: > // constructor: does nothing > WordCountReducer(HadoopPipes::TaskContext& context) { > } > > // reduce function > void reduce( HadoopPipes::ReduceContext& context ) { > int count = 0; > > //--- get all tuples with the same key, and count their numbers --- > while ( context.nextValue() ) { > count += HadoopUtils::toInt( context.getInputValue() ); > } > > //--- emit (word, count) --- > context.emit(context.getInputKey(), HadoopUtils::toString( count )); > } > }; > > int main(int argc, char *argv[]) { > return > HadoopPipes::runTask(HadoopPipes::TemplateFactory() > ); > } > > I have this Makefile: > > CC = g++ > HADOOP_INSTALL = /home/hduser/Scrivania/hadoop-1.2.1 > PLATFORM = Linux-amd64-64 > CPPFLAGS = -m64 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include/hadoop/ > > wordcount: wordcount.cpp > $(CC) $(CPPFLAGS) $< -Wall -lssl -lcrypto > -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils > -lpthread -g -O2 -o $@ > > The compilation works fine, but when I try to run my program as follow: > > $ hadoop-1.2.1/bin/hadoop pipes -D hadoop.pipes.java.recordreader=true \ > -D hadoop.pipes.java.recordwriter=true -input input -output output -program > wordcount > > I have this result: > > INFO util.NativeCodeLoader: Loaded the native-hadoop library > WARN mapred.JobClient: No job jar file set. User classes may not be found. > See JobConf(Class) or JobConf#setJar(String). 
> WARN snappy.LoadSnappy: Snappy native library not loaded
> INFO mapred.FileInputFormat: Total input paths to process : 4
> INFO filecache.TrackerDistributedCacheManager: Creating filewordcount in /tmp/hadoop-hduser/mapred/local/archive/8648114132384070327_893673541_1470671038-work--6818354830621303575 with rwxr-xr-x
> INFO filecache.TrackerDistributedCacheManager: Cached wordcount as /tmp/hadoop-hduser/mapred/local/archive/8648114132384070327_893673541_1470671038/filewordcount
> INFO filecache.TrackerDistributedCacheManager: Cached wordcount as /tmp/hadoop-hduser/mapred/local/archive/8648114132384070327_893673541_1470671038/filewordcount
> INFO mapred.JobClient: Running job: job_local2050700100_0001
> INFO mapred.LocalJobRunner: Waiting for map tasks
> INFO mapred.LocalJobRunner: Starting task: attempt_local2050700100_0001_m_000000_0
> INFO util.ProcessTree: setsid exited with exit code 0
> INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@15b734b
> INFO mapred.MapTask: Processing split: file:/home/hduser/Scrivania/input/sample.txt:0+530
> INFO mapred.MapTask: numReduceTasks: 1
> INFO mapred.MapTask: io.sort.mb = 100
> INFO mapred.MapTask: data buffer = 79691776/99614720
> INFO mapred.MapTask: record buffer = 262144/327680
> INFO mapred.LocalJobRunner: Starting task: attempt_local2050700100_0001_m_000001_0
> INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@62d1f025
> INFO mapred.MapTask: Processing split: file:/home/hduser/Scrivania/input/matrix.txt:0+255
> INFO mapred.MapTask: numReduceTasks: 1
> INFO mapred.MapTask: io.sort.mb = 100
> INFO mapred.MapTask: data buffer = 79691776/99614720
> INFO mapred.MapTask: record buffer = 262144/327680
> INFO mapred.LocalJobRunner: Starting task: attempt_local2050700100_0001_m_000002_0
> INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3d04562f
> INFO mapred.MapTask: Processing split: file:/home/hduser/Scrivania/input/matrix.txt~:0+235
> INFO mapred.MapTask: numReduceTasks: 1
> INFO mapred.MapTask: io.sort.mb = 100
> INFO mapred.MapTask: data buffer = 79691776/99614720
> INFO mapred.MapTask: record buffer = 262144/327680
> INFO mapred.LocalJobRunner: Starting task: attempt_local2050700100_0001_m_000003_0
> INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@133d9211
> INFO mapred.MapTask: Processing split: file:/home/hduser/Scrivania/input/sample.txt~:0+0
> INFO mapred.MapTask: numReduceTasks: 1
> INFO mapred.MapTask: io.sort.mb = 100
> INFO mapred.MapTask: data buffer = 79691776/99614720
> INFO mapred.MapTask: record buffer = 262144/327680
> INFO mapred.LocalJobRunner: Map task executor complete.
> WARN mapred.LocalJobRunner: job_local2050700100_0001
> java.lang.Exception: java.lang.NullPointerException
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:103)
>     at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:724)
> INFO mapred.JobClient:  map 0% reduce 0%
> INFO mapred.JobClient: Job complete: job_local2050700100_0001
> INFO mapred.JobClient: Counters: 0
> INFO mapred.JobClient: Job Failed: NA
> Exception in thread "main" java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
>     at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:248)
>     at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:479)
>     at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:494)
>
> I have tried with these Hadoop versions:
>
> 0.19.2
> 1.2.1
> 2.2.0

-- 
Harsh J
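
For reference, a minimal mapred-site.xml sketch along the lines suggested
above. The value localhost:9001 is an assumption for a single-node setup;
substitute the actual hostname and port of your JobTracker:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Point MapReduce at a real JobTracker instead of the LocalJobRunner
     default ("local"). localhost:9001 is an assumed single-node value. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

With this in place and the JobTracker and TaskTracker restarted, jps should
list both daemons before the pipes job is resubmitted.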