hadoop-hdfs-user mailing list archives

From SUJIT PAL <sujit....@comcast.net>
Subject Re: good way to debug map reduce code
Date Wed, 26 Dec 2012 16:53:31 GMT
Hi Jamal,

A missing semicolon should get flagged by the Java compiler, but one way to keep your debug
cycles short is to (1) use local mode and (2) use small data sets that you can run through in
under a minute. Once you are happy that your code works, move to distributed mode and your
target data sets.
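
As a rough illustration of the small-data-set idea, here is a minimal sketch of an in-process
check that mirrors the cat | mapper | sort | reducer pipeline from streaming. Everything in it
is an assumption for illustration: the class name LocalWordCount, the word-count logic, and the
sample input are hypothetical, and no Hadoop classes are involved, so it only exercises the
map and reduce logic, not the job wiring that local mode would cover.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a word-count job. No Hadoop classes are used,
// so this compiles and runs with a plain JDK (Java 8 or later).
public class LocalWordCount {

    // "Mapper": emit one lower-cased word per whitespace-separated token.
    public static List<String> map(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token.toLowerCase());
            }
        }
        return words;
    }

    // "Reducer": sum the counts collected for a single key.
    public static int reduce(String key, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: map every line, group values by key in sorted order
    // (the "sort" step of the pipeline), then reduce each group.
    public static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String word : map(line)) {
                grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Tiny sample input, small enough to check by eye.
        List<String> sample = Arrays.asList("to be or not", "to be");
        System.out.println(run(sample)); // prints {be=2, not=1, or=1, to=2}
    }
}
```

If the map and reduce logic lives in plain methods like these, the real Mapper and Reducer
classes can delegate to them, and compile errors or logic bugs show up in seconds rather than
after a job submission.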


On Dec 25, 2012, at 5:56 PM, jamal sasha wrote:

> Hi,
>   I have been using the Python Hadoop streaming framework to write my code, and now I am
> slowly moving towards the core Java APIs.
> I am getting comfortable with them, but what is the quickest way to debug native
> MapReduce code?
> In Hadoop streaming, this worked great:
> % cat input.txt | python mapper.py | sort | python reducer.py
> If there was a coding error, it would just throw it right away, so it was very fast
> to debug as you code.
> Is there a similar way, where I don't have to run Hadoop jobs, wait, and go through
> the Hadoop logs just to see that maybe I missed a semicolon?
> Thanks
> Jamal
