hadoop-common-user mailing list archives

From Vinod KV <vino...@yahoo-inc.com>
Subject Re: Problems with HOD and HDFS
Date Mon, 14 Jun 2010 05:57:28 GMT
On Monday 14 June 2010 09:51 AM, David Milne wrote:
> Ok, thanks Jeff.
>
> This is pretty surprising though. I would have thought many people
> would be in my position, where they have to use Hadoop on a general
> purpose cluster, and need it to play nice with a resource manager?
> What do other people do in this position, if they don't use HOD?
> Deprecated normally means there is a better alternative.
>
> - Dave
>    


It isn't formally deprecated, though. Maybe we need to do that
explicitly; that would help in putting up proper documentation about
what to use instead.

The quick answer is to start a static cluster on a set of nodes. A
static cluster means bringing up the Hadoop daemons on a fixed set of
nodes using the startup scripts distributed in the bin/ directory.
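
As a rough illustration of the static-cluster approach, the sketch below lists the stock startup scripts in the order they are usually run. It assumes the install path from the hodrc quoted further down (/opt/hadoop-0.20.2), that conf/slaves already lists the worker nodes, and that the NameNode has already been formatted. The function names are hypothetical, and nothing is executed unless dry_run is turned off.

```python
import subprocess

# Assumption: install location taken from the "pkgs" setting in the quoted hodrc.
HADOOP_HOME = "/opt/hadoop-0.20.2"


def static_cluster_commands(hadoop_home):
    """Return the startup commands for a static cluster, in order.

    start-dfs.sh brings up the HDFS daemons and start-mapred.sh brings up
    the Map/Reduce daemons; both ship in Hadoop's bin/ directory.
    """
    return [
        [f"{hadoop_home}/bin/start-dfs.sh"],
        [f"{hadoop_home}/bin/start-mapred.sh"],
    ]


def start_static_cluster(hadoop_home, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in static_cluster_commands(hadoop_home):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    # Dry run: only shows what would be executed on the master node.
    start_static_cluster(HADOOP_HOME, dry_run=True)
```

On a cluster managed by a resource manager such as Torque, the nodes for the static cluster would still have to be reserved by some other means; this sketch does not cover that part.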

That said, there are no changes to HOD in 0.21 and beyond. Deploying
0.21 clusters should mostly work out of the box. Beyond 0.21, though, it
may not work, because HOD would need to be updated for the
Hadoop-specific configuration parameters and environment variables that
it generates itself, some of which have been removed or changed.
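
To make the compatibility concern concrete, here is a minimal sketch that scans a generated site configuration for property names that a later release has renamed. The rename table is an assumption for illustration (two renames believed to apply from 0.21 onwards) and should be checked against the deprecation list of the target release. The sample XML, its host name, and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed rename table: verify against the target release's deprecation list.
RENAMED_KEYS = {
    "fs.default.name": "fs.defaultFS",
    "mapred.job.tracker": "mapreduce.jobtracker.address",
}

# Hypothetical sample of a generated site file.
SAMPLE = """<configuration>
  <property><name>fs.default.name</name><value>hdfs://node1:8020</value></property>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>"""


def find_renamed_keys(site_xml_text, renamed=RENAMED_KEYS):
    """Return (old_name, new_name) pairs for renamed properties in the XML."""
    root = ET.fromstring(site_xml_text)
    hits = []
    # <property> elements are direct children of <configuration>.
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name in renamed:
            hits.append((name, renamed[name]))
    return hits


if __name__ == "__main__":
    for old, new in find_renamed_keys(SAMPLE):
        print(f"{old} -> {new}")
```

A check like this only flags names; whether a renamed parameter still works depends on the release's own handling of deprecated keys.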

HTH,
+vinod

> On Mon, Jun 14, 2010 at 2:39 PM, Jeff Hammerbacher <hammer@cloudera.com> wrote:
>    
>> Hey Dave,
>>
>> I can't speak for the folks at Yahoo!, but from watching the JIRA, I don't
>> think HOD is actively used or developed anywhere these days. You're
>> attempting to use a mostly deprecated project, and hence not receiving any
>> support on the mailing list.
>>
>> Thanks,
>> Jeff
>>
>> On Sun, Jun 13, 2010 at 7:33 PM, David Milne <d.n.milne@gmail.com> wrote:
>>
>>      
>>> Anybody? I am completely stuck here. I have no idea who else I can ask
>>> or where I can go for more information. Is there somewhere specific
>>> where I should be asking about HOD?
>>>
>>> Thank you,
>>> Dave
>>>
>>> On Thu, Jun 10, 2010 at 2:56 PM, David Milne <d.n.milne@gmail.com> wrote:
>>>        
>>>> Hi there,
>>>>
>>>> I am trying to get Hadoop on Demand up and running, but am having
>>>> problems with the ringmaster not being able to communicate with HDFS.
>>>>
>>>> The output from the hod allocate command ends with this, with full
>>>> verbosity:
>>>>
>>>> [2010-06-10 14:40:22,650] CRITICAL/50 hadoop:298 - Failed to retrieve
>>>> 'hdfs' service address.
>>>> [2010-06-10 14:40:22,654] DEBUG/10 hadoop:631 - Cleaning up cluster id
>>>> 34029.symphony.cs.waikato.ac.nz, as cluster could not be allocated.
>>>> [2010-06-10 14:40:22,655] DEBUG/10 hadoop:635 - Calling rm.stop()
>>>> [2010-06-10 14:40:22,665] DEBUG/10 hadoop:637 - Returning from rm.stop()
>>>> [2010-06-10 14:40:22,666] CRITICAL/50 hod:401 - Cannot allocate
>>>> cluster /home/dmilne/hadoop/cluster
>>>> [2010-06-10 14:40:23,090] DEBUG/10 hod:597 - return code: 7
>>>>
>>>>
>>>> I've attached the hodrc file below, but briefly HOD is supposed to
>>>> provision an HDFS cluster as well as a Map/Reduce cluster, and seems
>>>> to be failing to do so. The ringmaster log looks like this:
>>>>
>>>> [2010-06-10 14:36:05,144] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
>>>> [2010-06-10 14:36:05,145] DEBUG/10 ringMaster:487 - getServiceAddr
>>>> service:<hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
>>>> [2010-06-10 14:36:05,147] DEBUG/10 ringMaster:504 - getServiceAddr
>>>> addr hdfs: not found
>>>> [2010-06-10 14:36:06,195] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
>>>> [2010-06-10 14:36:06,197] DEBUG/10 ringMaster:487 - getServiceAddr
>>>> service:<hodlib.GridServices.hdfs.Hdfs instance at 0x8f97e8>
>>>> [2010-06-10 14:36:06,198] DEBUG/10 ringMaster:504 - getServiceAddr
>>>> addr hdfs: not found
>>>>
>>>> ... and so on, until it gives up
>>>>
>>>> Any ideas why? One red flag is that when running the allocate command,
>>>> some of the variables echo-ed back look dodgy:
>>>>
>>>> --gridservice-hdfs.fs_port 0
>>>> --gridservice-hdfs.host localhost
>>>> --gridservice-hdfs.info_port 0
>>>>
>>>> These are not what I specified in the hodrc. Are the port numbers just
>>>> set to 0 because I am not using an external HDFS, or is this a
>>>> problem?
>>>>
>>>>
>>>> The software versions involved are:
>>>>   - Hadoop 0.20.2
>>>>   - Python 2.5.2 (no Twisted)
>>>>   - Java 1.6.0_20
>>>>   - Torque 2.4.5
>>>>
>>>>
>>>> The hodrc file looks like this:
>>>>
>>>> [hod]
>>>> stream                          = True
>>>> java-home                       = /opt/jdk1.6.0_20
>>>> cluster                         = debian5
>>>> cluster-factor                  = 1.8
>>>> xrs-port-range                  = 32768-65536
>>>> debug                           = 3
>>>> allocate-wait-time              = 3600
>>>> temp-dir                        = /scratch/local/dmilne/hod
>>>>
>>>> [ringmaster]
>>>> register                        = True
>>>> stream                          = False
>>>> temp-dir                        = /scratch/local/dmilne/hod
>>>> log-dir                         = /scratch/local/dmilne/hod/log
>>>> http-port-range                 = 8000-9000
>>>> idleness-limit                  = 864000
>>>> work-dirs                       = /scratch/local/dmilne/hod/1,/scratch/local/dmilne/hod/2
>>>> xrs-port-range                  = 32768-65536
>>>> debug                           = 4
>>>>
>>>> [hodring]
>>>> stream                          = False
>>>> temp-dir                        = /scratch/local/dmilne/hod
>>>> log-dir                         = /scratch/local/dmilne/hod/log
>>>> register                        = True
>>>> java-home                       = /opt/jdk1.6.0_20
>>>> http-port-range                 = 8000-9000
>>>> xrs-port-range                  = 32768-65536
>>>> debug                           = 4
>>>>
>>>> [resource_manager]
>>>> queue                           = express
>>>> batch-home                      = /opt/torque-2.4.5
>>>> id                              = torque
>>>> options                         = l:pmem=3812M,W:X="NACCESSPOLICY:SINGLEJOB"
>>>> #env-vars                       = HOD_PYTHON_HOME=/foo/bar/python-2.5.1/bin/python
>>>>
>>>> [gridservice-mapred]
>>>> external                        = False
>>>> pkgs                            = /opt/hadoop-0.20.2
>>>> tracker_port                    = 8030
>>>> info_port                       = 50080
>>>>
>>>> [gridservice-hdfs]
>>>> external                        = False
>>>> pkgs                            = /opt/hadoop-0.20.2
>>>> fs_port                         = 8020
>>>> info_port                       = 50070
>>>>
>>>> Cheers,
>>>> Dave
>>>>

