hadoop-common-user mailing list archives

From "Ashish Venugopal" <...@andrew.cmu.edu>
Subject Re: Hadoop Profiling!
Date Wed, 08 Oct 2008 19:34:02 GMT
Great, thanks for this info. Is there any chance that this information could
also be exposed for streaming jobs?
(All of the jobs that we run in our lab go through streaming...)
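
For reference, recent Hadoop versions expose built-in per-task hprof profiling
through JobConf properties, and a streaming job can set those with -jobconf.
A rough sketch, not a tested recipe: the property names follow the JobConf
profiling options, and the input/output paths and mapper/reducer scripts below
are placeholders.

```shell
# Enable the built-in hprof-based task profiling for a streaming job.
# Only the first two map tasks and first two reduce tasks are profiled
# here, to keep the overhead bounded.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
  -input  /logs/raw \
  -output /logs/profiled-run \
  -mapper  ./map.py \
  -reducer ./reduce.py \
  -jobconf mapred.task.profile=true \
  -jobconf mapred.task.profile.maps=0-1 \
  -jobconf mapred.task.profile.reduces=0-1
# Each profiled task writes a profile.out into its task log directory
# on the node that ran it.
```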

Thanks!

Ashish
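
Stefan's local-mode suggestion further down in the thread can be sketched
roughly like this; the job class, agent path, and input/output directories are
placeholders, and the -D flags only take effect if the job runs through
ToolRunner/GenericOptionsParser.

```shell
# Attach a JVM profiler agent and run the job in local mode, so the
# whole MapReduce pipeline executes in a single profilable JVM.
# The YourKit agent path is a placeholder; any -agentpath/-agentlib
# style profiler agent works the same way.
export HADOOP_OPTS="-agentpath:/opt/yourkit/bin/linux-x86-64/libyjpagent.so"
hadoop jar myjob.jar org.example.MyJob \
  -D mapred.job.tracker=local \
  -D fs.default.name=file:/// \
  input/ output/
```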

On Wed, Oct 8, 2008 at 12:30 PM, George Porter <George.Porter@sun.com> wrote:

> Hi Ashish,
>
> I believe that Ari committed two instrumentation classes,
> TaskTrackerInstrumentation and JobTrackerInstrumentation (both in
> src/mapred/org/apache/hadoop/mapred), that can give you information on when
> components of your M/R jobs start and stop.  I'm in the process of writing
> some additional instrumentation APIs that collect timing information about
> the RPC and HDFS layers, and will hopefully be able to submit a patch in a
> few weeks.
>
> Thanks,
> George
>
>
> Ashish Venugopal wrote:
>
>> Are you interested in simply profiling your own code (in which case you
>> can clearly use whatever Java profiler you want), or in profiling the
>> structure of the MapReduce job itself, i.e. how much time is being spent
>> in the map vs. the sort vs. the shuffle vs. the reduce? I am not aware of
>> a good solution to the second problem; can anyone comment?
>>
>> Ashish
>>
>> On Wed, Oct 8, 2008 at 12:06 PM, Stefan Groschupf <sg@101tec.com> wrote:
>>
>>
>>
>>> Just run your MapReduce job locally and connect your profiler. I use
>>> YourKit.
>>> Works great!
>>> You can profile your MapReduce job by running it in local mode, just as
>>> you would any other Java app.
>>> However, we have also profiled on a grid. You just need to install the
>>> YourKit agent into the JVM of the node you want to profile, and then you
>>> connect to the node while the job runs.
>>> However, you need to time things well, since the task JVM is shut down
>>> as soon as your job is done.
>>> Stefan
>>>
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> 101tec Inc., Menlo Park, California
>>> web:  http://www.101tec.com
>>> blog: http://www.find23.net
>>>
>>>
>>>
>>>
>>> On Oct 8, 2008, at 11:27 AM, Gerardo Velez wrote:
>>>
>>>> Hi!
>>>>
>>>> I've developed a Map/Reduce algorithm to analyze some logs from a web
>>>> application.
>>>>
>>>> We are now ready to start the QA test phase, so I would like to know
>>>> how efficient my application is from a performance point of view.
>>>>
>>>> Is there any procedure I could use to do some profiling?
>>>>
>>>> Basically I need basic data, like execution time or code bottlenecks.
>>>>
>>>> Thanks in advance.
>>>>
>>>> -- Gerardo Velez
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
> --
> George Porter, Sun Labs/CTO
> Sun Microsystems - San Diego, Calif.
> george.porter@sun.com 1.858.526.9328
>
>
