hadoop-common-user mailing list archives

From "Ariel Rabkin" <asrab...@gmail.com>
Subject Re: Hadoop Profiling!
Date Fri, 10 Oct 2008 18:22:15 GMT
That code is in, but unfortunately it doesn't quite solve the problem
on its own; you'd need to do some more work.  You'd have to write
subclasses of the instrumentation classes that emit the statistics you
want, then set the appropriate options in hadoop-site.xml so that those
subclasses get loaded.
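
A rough sketch of the subclass-and-load pattern, in case it helps.  It is
self-contained and does not use the real Hadoop classes: TaskHooks, its two
methods, and the task id string are hypothetical stand-ins, so check
TaskTrackerInstrumentation in your source tree for the actual method names
and for the configuration property that names the class to load.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InstrumentationSketch {

    /** Hypothetical stand-in for a framework hook class with no-op defaults. */
    static class TaskHooks {
        void taskStarted(String taskId, long nowMillis) { }
        void taskFinished(String taskId, long nowMillis) { }
    }

    /** Subclass that records start times and prints one line per finished task. */
    static class TimingTaskHooks extends TaskHooks {
        private final Map<String, Long> startTimes = new LinkedHashMap<>();
        private final Map<String, Long> durations = new LinkedHashMap<>();

        @Override
        synchronized void taskStarted(String taskId, long nowMillis) {
            startTimes.put(taskId, nowMillis);
        }

        @Override
        synchronized void taskFinished(String taskId, long nowMillis) {
            Long start = startTimes.remove(taskId);
            if (start == null) {
                return; // finish reported without a recorded start; ignore it
            }
            long elapsed = nowMillis - start;
            durations.put(taskId, elapsed);
            System.out.println("task=" + taskId + " elapsed_ms=" + elapsed);
        }

        /** Returns the recorded duration in milliseconds, or null if unknown. */
        synchronized Long durationOf(String taskId) {
            return durations.get(taskId);
        }
    }

    /** Instantiates hooks by class name, as a framework would from a config value. */
    static TaskHooks load(String className) throws ReflectiveOperationException {
        return (TaskHooks) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
    }

    public static void main(String[] args) throws Exception {
        // In a real deployment the class name would come from configuration.
        TaskHooks hooks = load(TimingTaskHooks.class.getName());
        hooks.taskStarted("map_0001", 1000L);
        hooks.taskFinished("map_0001", 1750L);
    }
}
```

The timestamps are passed in rather than read from the clock only to keep
the sketch easy to check; a real subclass would more likely call
System.currentTimeMillis() inside each hook.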

On Wed, Oct 8, 2008 at 12:30 PM, George Porter <George.Porter@sun.com> wrote:
> Hi Ashish,
>
> I believe that Ari committed two instrumentation classes,
> TaskTrackerInstrumentation and JobTrackerInstrumentation, (both in
> src/mapred/org/apache/hadoop/mapred) that can give you information on when
> components of your M/R jobs start and stop.  I'm in the process of writing
> some additional instrumentation APIs that collect timing information about
> the RPC and HDFS layers, and will hopefully be able to submit a patch in a
> few weeks.
>
> Thanks,
> George
>
> Ashish Venugopal wrote:
>>
>> Are you interested in simply profiling your own code (in which case you
>> can clearly use whatever Java profiler you want), or your construction of
>> the MapReduce job, i.e. how much time is being spent in the map vs. the
>> sort vs. the shuffle vs. the reduce? I am not aware of a good solution to
>> the second problem. Can anyone comment?
>>
>> Ashish
>>
>> On Wed, Oct 8, 2008 at 12:06 PM, Stefan Groschupf <sg@101tec.com> wrote:
>>
>>
>>>
>>> Just run your MapReduce job locally and connect your profiler. I use
>>> YourKit; it works great. In local mode you can profile a MapReduce
>>> job like any other Java app.
>>> We have also profiled on a grid. You just need to install the YourKit
>>> agent into the JVM of the node you want to profile, and then connect
>>> to that node while the job runs.
>>> You do need to time things well, though, since the task JVM is shut
>>> down as soon as your job is done.
>>> Stefan
>>>
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> 101tec Inc., Menlo Park, California
>>> web:  http://www.101tec.com
>>> blog: http://www.find23.net
>>>
>>>
>>>
>>>
>>> On Oct 8, 2008, at 11:27 AM, Gerardo Velez wrote:
>>>
>>>> Hi!
>>>>
>>>> I've developed a Map/Reduce algorithm to analyze some logs from a web
>>>> application.
>>>>
>>>> We are ready to start the QA test phase, so I would like to know how
>>>> efficient my application is from a performance point of view.
>>>>
>>>> Is there any procedure I could use to do some profiling?
>>>>
>>>> Basically I need basic data, like execution time or code bottlenecks.
>>>>
>>>>
>>>> Thanks in advance.
>>>>
>>>> -- Gerardo Velez
>>>>
>>>>
>>>
>>>
>>
>>
>
> --
> George Porter, Sun Labs/CTO
> Sun Microsystems - San Diego, Calif.
> george.porter@sun.com 1.858.526.9328
>
>



-- 
Ari Rabkin asrabkin@gmail.com
UC Berkeley Computer Science Department
