hadoop-common-user mailing list archives

From Song Liu <lamfeeli...@gmail.com>
Subject Re: 2 HOD Questions
Date Mon, 15 Mar 2010 20:52:01 GMT
Thanks Peeyush,
   I tried your solution, and it works some of the time. I suspect this is a bug
in torque, because I still see jobs in the queue after I delete them using qdel.
I will keep an eye on this issue.

  The second is a little tougher. I tried allocating more than 3 nodes, but
the error remains. I compared the two clusters and found that they run different
torque versions. The cluster where HOD works uses torque 2.3.3, while the other
one uses 2.3.7 and has Python 2.6 installed.
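
  For reference, "need more than 2 values to unpack" is the wording Python 2
uses for a ValueError raised when a sequence of two items is unpacked into three
or more names. Which line of HOD raises it cannot be determined from this
thread. The sketch below only reproduces that class of error; split_three and
try_split are hypothetical helpers, not part of HOD, and the ":" separator is
an assumption chosen for illustration.

```python
def split_three(text, sep=":"):
    """Hypothetical helper (not from HOD): expect exactly three fields."""
    # Unpacking into three names raises ValueError if split() returns
    # fewer or more than three items.
    first, second, third = text.split(sep)
    return first, second, third


def try_split(text):
    """Return the three fields, or a readable error string on failure."""
    try:
        return split_three(text)
    except ValueError as err:
        # Python 2 reports "need more than 2 values to unpack" here;
        # Python 3 words the same failure differently.
        return "ValueError: %s" % err


if __name__ == "__main__":
    print(try_split("a:b:c"))  # three fields: unpacking succeeds
    print(try_split("a:b"))    # two fields: unpacking fails
```

  If the failing call unpacks a string that HOD parses from the resource
manager, a difference in that string between the two torque versions would be
one place to look, though that is only a guess.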

  BTW, I'm a bit confused about the Cluster Name in the config file. It says to
take the value of "Cluster Name", but where can I see this attribute for my
cluster? When I run the showbf command, it shows:

Partition     Tasks  Nodes   StartOffset      Duration       StartDate
---------     -----  -----  ------------  ------------  --------------
ALL             497     72      00:00:00      INFINITY  20:50:22_03/15
main            265     43      00:00:00      INFINITY  20:50:22_03/15
test            232     29      00:00:00      INFINITY  20:50:22_03/15

  Does the "Partition" column give the cluster name?

  Thanks a lot!

Regards
Song Liu

On Mon, Mar 15, 2010 at 6:03 PM, Peeyush Bishnoi <peeyushb@yahoo-inc.com> wrote:

> Song,
>
> To answer question 1:
> Before deallocating the nodes (hod deallocate ...), are you setting
> HADOOP_CONF_DIR to the HOD-allocated directory?
>
> To answer question 2:
> Yes, HOD needs a minimum of 3 nodes. One node runs the JobTracker, a second
> node runs the NameNode, and the third and any further nodes run a TaskTracker
> and DataNode.
>
> If you have any questions please let me know :-)
>
> Thanks,
> ---
> Peeyush
> ________________________________________
> From: Song Liu [lamfeeling2@gmail.com]
> Sent: Monday, March 15, 2010 8:25 PM
> To: common-user@hadoop.apache.org
> Subject: 2 HOD Questions
>
> Hi all, I have two questions about HOD
>
>  1. I configured and set up HOD on one cluster and it works fine, but after I
> finished my jobs and deallocated the nodes, I found my job ID can still be seen
> using "qstat" until I kill it using "qdel". Is this normal? Do I have
> to do this manually, or just live with it?
>
>  2. I failed to configure HOD on another cluster using the approach
> shown in the user guide. It fails when allocating the nodes and shows this
> error:
>
> Uncaught Exception : need more than 2 values to unpack
>
> Has anyone seen this message before?
>
> BTW, to make the hod script work, I commented out line 576, which is
> "finally".
>
> Thanks
>
> Song Liu
>
