hadoop-common-user mailing list archives

From Vitthal Gogate <gog...@yahoo-inc.com>
Subject Re: Hadoop Vaidya tool
Date Mon, 22 Jun 2009 22:44:15 GMT
Hello Pratik, -joblog should also be a specific job history file path, not a
directory. Usually I copy the job conf XML file and the job history log file
to a local file system and then use the file:// protocol (although hdfs://
should also work), e.g.,

sh /home/hadoop/Desktop/hadoop-0.20.0/contrib/vaidya/bin/vaidya.sh -jobconf
file://localhost/logs/job_200906221335_0001_conf.xml  -joblog
file://localhost/logs/job_00906221335_0001_jobxxx
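The copy-out step can be sketched as below; the local directory /tmp/vaidya-logs is an assumption, and the history file name must be the actual file from your cluster (hadoop fs -get is the standard HDFS-to-local copy command):

```shell
# Sketch of the copy-then-analyze workflow; paths are illustrative.
LOGDIR=/tmp/vaidya-logs
mkdir -p "$LOGDIR"
# On a real cluster, pull the two files out of HDFS first, e.g.:
#   hadoop fs -get /logs/job_200906221335_0001_conf.xml "$LOGDIR/"
#   hadoop fs -get /logs/<job history file> "$LOGDIR/"
# Then point vaidya.sh at the individual files via file:// URLs:
#   sh $HADOOP_HOME/contrib/vaidya/bin/vaidya.sh \
#     -jobconf file://localhost$LOGDIR/job_200906221335_0001_conf.xml \
#     -joblog  file://localhost$LOGDIR/<job history file>
echo "file://localhost$LOGDIR/job_200906221335_0001_conf.xml"
```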

I discovered a few problems with the tool in Hadoop 0.20 for some specific
scenarios, such as map-only jobs. The following JIRAs fix the problems.

If you download the latest Hadoop (trunk), then HADOOP-5582 is already part of
it; otherwise, with Hadoop 0.20, you can apply the following JIRAs in sequence:

https://issues.apache.org/jira/browse/HADOOP-5582
https://issues.apache.org/jira/browse/HADOOP-5950

1. Since Hadoop Vaidya is a standalone tool, you may not need to change your
existing installed version of Hadoop. Instead, separately download the Hadoop
trunk, apply patch HADOOP-5950, rebuild, and replace the
$HADOOP_HOME/contrib/vaidya/hadoop-0.20.0-vaidya.jar file in your existing
Hadoop 0.20 installation with the newly built one.
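A hedged sketch of that rebuild step; the repo URL, patch file name, and ant invocation are assumptions, so check the JIRA attachment and the trunk build instructions for your version:

```shell
# Rebuild the Vaidya jar from trunk and replace the installed copy
# (network/build steps shown as comments; names are assumptions):
#   svn co http://svn.apache.org/repos/asf/hadoop/common/trunk hadoop-trunk
#   cd hadoop-trunk
#   patch -p0 < HADOOP-5950.patch   # patch attached to the JIRA
#   ant                             # rebuilds contrib, including vaidya
# Destination for the freshly built jar in the existing 0.20 install:
JAR_DEST="${HADOOP_HOME:-/home/hadoop/Desktop/hadoop-0.20.0}/contrib/vaidya"
echo "copy the new hadoop-*-vaidya.jar to $JAR_DEST"
```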

2. Also, if you have a big job (i.e., lots of map/reduce tasks), you may hit
an out-of-memory problem while analyzing it. In that case, edit
$HADOOP_HOME/contrib/vaidya/bin/vaidya.sh and add the -Xmx1024m option to the
java command line, before the classpath.
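As a concrete sketch of that edit (the java line below is a simulated stand-in written to a temp file; the real invocation in vaidya.sh may be worded slightly differently, so adjust the sed pattern to match your script):

```shell
# Simulated copy of the java line from vaidya.sh (the real script's line
# is similar but may differ between versions):
echo 'java -classpath $vaidyaJar org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser' > /tmp/vaidya-demo.sh
# Insert -Xmx1024m before the classpath option:
sed -i 's/java -classpath/java -Xmx1024m -classpath/' /tmp/vaidya-demo.sh
cat /tmp/vaidya-demo.sh
```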

Hope this helps.

Thanks & Regards, Suhas


On 6/22/09 1:13 PM, "Pankil Doshi" <forpankil@gmail.com> wrote:

> Hello ,
> 
> I am trying to use the Hadoop Vaidya tool. It's available with version 0.20.0,
> but I see the following error. Can anyone guide me on that? I have a
> pseudo-distributed (i.e., single-node) cluster for testing.
> 
> *cmd I submit is *" sh
> /home/hadoop/Desktop/hadoop-0.20.0/contrib/vaidya/bin/vaidya.sh -jobconf
> hdfs://localhost:9000/logs/job_200906221335_0001_conf.xml  -joblog
> hdfs://localhost:9000/logs/ "
> 
> *Error :-*
> Exception:java.net.MalformedURLException: unknown protocol: hdfs
> java.net.MalformedURLException: unknown protocol: hdfs
>     at java.net.URL.<init>(URL.java:590)
>     at java.net.URL.<init>(URL.java:480)
>     at java.net.URL.<init>(URL.java:429)
>     at org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.readJobInformation(PostExPerformanceDiagnoser.java:124)
>     at org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.<init>(PostExPerformanceDiagnoser.java:112)
>     at org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.main(PostExPerformanceDiagnoser.java:220)
> 
> Can anyone guide me on that..
> 
> Regards
> Pankil

--Regards Suhas
[Getting started w/ Grid]
http://twiki.corp.yahoo.com/view/GridDocumentation/GridDocAbout
[Search HADOOP/PIG Information]
http://ucdev20.yst.corp.yahoo.com/griduserportal/griduserportal.php



