whirr-dev mailing list archives

From Michael Segel <msegel_had...@hotmail.com>
Subject Re: [GSOC]failed to start a yarn cluster on EC2
Date Wed, 31 Jul 2013 17:48:01 GMT
Actually, I found out what is causing your script to ignore your hardware-id.

Comment out the whirr.template line. It seems to be suggesting a machine size, which
would then override the specific instance size you asked for.
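
As a rough sketch of that change, assuming the rest of the recipe stays as it is and using only the values quoted further down, the relevant lines of the properties file would look something like this:

```properties
# Explicit instance type requested for the cluster nodes
whirr.hardware-id=m1.large
#whirr.image-id=us-east-1/ami-da0cf8b3
# Commented out so that its constraints (for example minRam=2048) no longer
# influence which instance size is selected
#whirr.template=osFamily=UBUNTU,osVersionMatches=10.04,os64Bit=true,minRam=2048
```

With both the image-id and template lines disabled, nothing in the file pins the OS image any more, so it may be worth setting whirr.image-id explicitly. That part is an assumption and has not been verified in this thread.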

HTH

-Mike

On Jul 30, 2013, at 1:57 PM, "Han,Meng" <menghan@ufl.edu> wrote:

> I didn't customise the recipes; I only set up the AWS credential environment variables.
> 
> 
> whirr.hardware-id=m1.large
> #whirr.image-id=us-east-1/ami-da0cf8b3
> whirr.template=osFamily=UBUNTU,osVersionMatches=10.04,os64Bit=true,minRam=2048
> 
> The AMI information is as above. Although I asked for a large instance, when the nodes
> are up, their type is m1.medium.
> 
> 
> 
> 
> 
> 
> On Tue, 30 Jul 2013 21:51:22 +0300, Andrei Savu wrote:
>> Have you customised the recipe? What AMI are you using?
>> 
>> It looks like Whirr is unable to connect to the machines over SSH.
>> 
>> -- Andrei Savu / axemblr.com
>> 
>> 
>> On Tue, Jul 30, 2013 at 9:47 PM, Han,Meng <menghan@ufl.edu> wrote:
>> 
>>> Hi all,
>>> 
>>> I tried to start a Yarn cluster on EC2 using the file
>>> hadoop-yarn-ec2.properties under the recipes directory in Whirr source. The
>>> following error showed up.
>>> 
>>> 2013-07-29 16:54:02,615 DEBUG [org.jclouds.http.handlers.BackoffLimitedRetryHandler]
>>> (user thread 10) Retry 5/7: delaying for 2000 ms:
>>> (meng:rsa[fingerprint(a4:6e:cc:53:10:73:0b:f4:a9:d0:19:01:7f:3f:99:dd),
>>> sha1(72:68:cf:a7:92:e8:92:5b:80:5b:a2:6f:10:20:ef:2e:e3:c7:11:ec)]@107.20.81.124:22)
>>> error acquiring {hostAndPort=107.20.81.124:22, loginUser=meng, ssh=null,
>>> connectTimeout=60000, sessionTimeout=60000}: connect timed out
>>> 2013-07-29 16:55:04,675 DEBUG [org.jclouds.http.handlers.BackoffLimitedRetryHandler]
>>> (user thread 10) Retry 6/7: delaying for 2000 ms:
>>> (meng:rsa[fingerprint(a4:6e:cc:53:10:73:0b:f4:a9:d0:19:01:7f:3f:99:dd),
>>> sha1(72:68:cf:a7:92:e8:92:5b:80:5b:a2:6f:10:20:ef:2e:e3:c7:11:ec)]@107.20.81.124:22)
>>> error acquiring {hostAndPort=107.20.81.124:22, loginUser=meng, ssh=null,
>>> connectTimeout=60000, sessionTimeout=60000}: connect timed out
>>> 2013-07-29 16:56:06,713 ERROR [jclouds.ssh] (user thread 10) <<
>>> (meng:rsa[fingerprint(a4:6e:cc:53:10:73:0b:f4:a9:d0:19:01:7f:3f:99:dd),
>>> sha1(72:68:cf:a7:92:e8:92:5b:80:5b:a2:6f:10:20:ef:2e:e3:c7:11:ec)]@107.20.81.124:22)
>>> error acquiring {hostAndPort=107.20.81.124:22, loginUser=meng, ssh=null,
>>> connectTimeout=60000, sessionTimeout=60000} (out of retries - max 7):
>>> connect timed out
>>> java.net.SocketTimeoutException: connect timed out
>>> 
>>> On the AWS Management Console I see that the nodes are up and running, but on
>>> the Whirr side, it is stuck. Could someone enlighten me on this?
>>> 
>>> Thank you all.
>>> 
>>> Cheers,
>>> Meng
>>> 
>>> 
> 
> 

