vcl-user mailing list archives

From Andy Kurth <andy_ku...@ncsu.edu>
Subject Re: New capture failed attempting Windows post-load tasks
Date Mon, 13 Aug 2012 17:59:00 GMT
A few minor changes to the steps for updating the file on the computer being loaded:

* Change https to http in the URL, or else wget may fail: wget http://svn.apache.org/repos/asf/vcl/trunk/managementnode/tools/Windows/Scripts/update_cygwin.cmd
* Make the file executable after you download it: chmod +x update_cygwin.cmd
* Manually run update_cygwin.cmd: ./update_cygwin.cmd
* After running update_cygwin.cmd, log off as root.
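
The steps above could be collected into a small script and run from the Cygwin shell as root. This is only a sketch: the `run` helper, the `DRY_RUN` switch and the `SCRIPT_DIR` variable are illustrative names, not part of VCL. The directory (~/VCL/Scripts) and the rm -f step are carried over from the earlier instructions quoted below.

```shell
#!/bin/sh
# Sketch of the in-image update steps. Nothing runs until
# update_cygwin_script is called.

# http rather than https, because wget may fail with https here.
SCRIPT_URL="http://svn.apache.org/repos/asf/vcl/trunk/managementnode/tools/Windows/Scripts/update_cygwin.cmd"

# Assumed location of the VCL scripts inside the image.
SCRIPT_DIR="${SCRIPT_DIR:-$HOME/VCL/Scripts}"

# Hypothetical helper: with DRY_RUN=1 the command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

update_cygwin_script() {
    run cd "$SCRIPT_DIR" || return 1
    run rm -f update_cygwin.cmd || return 1
    run wget "$SCRIPT_URL" || return 1
    run chmod +x update_cygwin.cmd || return 1
    run ./update_cygwin.cmd
}
```

Calling `DRY_RUN=1 update_cygwin_script` only prints the five commands, which is a cheap way to check the path and URL before running it for real. Logging off as root afterwards is still a manual step.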

-Andy

On Mon, Aug 13, 2012 at 1:49 PM, Andy Kurth <andy_kurth@ncsu.edu> wrote:
> I believe there is a bug in the latest version of Cygwin which is
> causing update_cygwin.cmd to fail.  As a result, the computer being
> loaded never responds to SSH.  When ssh-keygen.exe is run from a
> normal non-Cygwin command prompt, the following occurs:
>
> C:\cygwin\home\root\VCL\Scripts>C:\Cygwin\bin\ssh-keygen.exe -t rsa1 -f C:\cygwin\etc\ssh_host_key -N ""
> Generating public/private rsa1 key pair.
>       8 [main] ssh-keygen 224 exception::handle: Exception: STATUS_ACCESS_VIOLATION
>    2114 [main] ssh-keygen 224 open_stackdumpfile: Dumping stack trace to ssh-keygen.exe.stackdump
>   61325 [main] ssh-keygen 224 exception::handle: Exception: STATUS_ACCESS_VIOLATION
>   68272 [main] ssh-keygen 224 exception::handle: Error while dumping state (probably corrupted stack)
>
> Running rebaseall doesn't help. The command succeeds if run from a
> Cygwin shell.  I just committed an update to update_cygwin.cmd to wrap
> the ssh-keygen.exe commands in "bash.exe -c".
> (https://issues.apache.org/jira/browse/VCL-616)
>
> You're going to have to update the file on the management node and in
> any images which were captured but aren't loading:
>
> On the management node:
> * cd /usr/local/vcl/tools/Windows/Scripts
> * rm -f update_cygwin.cmd
> * wget https://svn.apache.org/repos/asf/vcl/trunk/managementnode/tools/Windows/Scripts/update_cygwin.cmd
>
> For images which aren't loading correctly, update_cygwin.cmd will need
> to be updated within the image and then a new revision of the VCL
> image must be created.
>
> * Make an imaging reservation for the problematic image.
> * Watch the console as the image is being loaded.  Assuming you're
> using ESXi, view the Console tab from the vSphere Client.  You should
> see the VM power on, then the root account automatically log in,
> run a few scripts, and log off.
> * After root is automatically logged off, manually log in as root.
> The password will be the value of WINDOWS_ROOT_PASSWORD configured in
> /etc/vcl/vcld.conf.
> * Once logged in as root, open the Cygwin shell.
> * cd ~/VCL/Scripts
> * rm -f update_cygwin.cmd
> * wget https://svn.apache.org/repos/asf/vcl/trunk/managementnode/tools/Windows/Scripts/update_cygwin.cmd
> * Manually run update_cygwin.cmd: ./update_cygwin.cmd
>
> The vcld process should still be running and waiting for the computer
> to respond to SSH (you have 900 seconds).  When you run
> update_cygwin.cmd, the computer should begin responding and the
> reservation should finish loading.  You should be able to log in
> normally from the information on the Current Reservations page.  Save
> a new revision of the image.  It should be saved with the updated copy
> of update_cygwin.cmd which was downloaded to the management node.
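
While vcld waits, the same kind of check can be run by hand from the management node to see when the computer starts answering. The loop below is a rough sketch of a wait-with-timeout, not vcld's actual code; `wait_for` is a hypothetical helper name.

```shell
#!/bin/sh
# wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
# Retries COMMAND every INTERVAL seconds until it succeeds or roughly
# TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
# Only the sleep time is counted, so slow commands stretch the total wait.
wait_for() {
    timeout_s=$1
    interval_s=$2
    shift 2
    elapsed=0
    while [ "$elapsed" -lt "$timeout_s" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
    return 1
}
```

For example, using the key path and host name that appear in the logs quoted further down: `wait_for 900 10 ssh -i /etc/vcl/vcl.key -o BatchMode=yes -o ConnectTimeout=5 -l root vmwg0-120-57 true`. The 10-second interval is an arbitrary choice.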
>
> -Andy
>
>
> On Fri, Aug 3, 2012 at 12:25 PM, Basilio, Norvin <nbasilio@odu.edu> wrote:
>> I am also experiencing this issue when using Cygwin 1.7. I've run update_cygwin.cmd
>> manually and saw that it's unable to regenerate the keys. I decided to try capturing my image
>> using the older Cygwin 1.5, and update_cygwin.cmd was able to regenerate the keys correctly,
>> allowing the reload process to complete.
>>
>> Norvin Basilio
>> nbasilio@odu.edu
>>
>>
>> -----Original Message-----
>> From: Hechler, Adam [mailto:hechla@rpi.edu]
>> Sent: Friday, August 03, 2012 12:14 PM
>> To: user@vcl.apache.org
>> Subject: RE: New capture failed attempting Windows post-load tasks
>>
>> Hello again,
>>
>> So, I walked out of the office last night thinking that my re-capture was running
>> smoothly. It got about 20 minutes in, I think, and then failed. I think the section below
>> contains the fatal error: it failed to configure the firewall to allow SSH. Is that my
>> problem?  If so, any idea how to correct that?
>>
>> Thanks,
>> Adam
>>
>> ----
>>
>> 2012-08-02 17:39:37|24852|125:125|image|Windows.pm:firewall_enable_ssh_private(4633)|SSH will be enabled on private interface: Local Area Connection 3
>> 2012-08-02 17:39:37|24852|125:125|image|utils.pm:run_ssh_command(5380)|executing SSH command on vmwg0-120-57:
>> |24852|125:125|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x vmwg0-120-57 'C:/Windows/System32/netsh.exe firewall delete portopening protocol = TCP port = 22 interface = "Local Area Connection 4" ;C:/Windows/System32/netsh.exe firewall delete portopening protocol = TCP port = 22 profile = ALL ;C:/Windows/System32/netsh.exe firewall set portopening name = "Cygwin SSHD" protocol = TCP port = 22 mode = ENABLE interface = "Local Area Connection 3"' 2>&1
>> 2012-08-02 17:39:42|24852|125:125|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
>> |24852|125:125|image| The interface was not found.
>> |24852|125:125|image| Ok.
>> |24852|125:125|image| The interface was not found.
>> 2012-08-02 17:39:42|24852|125:125|image|utils.pm:run_ssh_command(5474)|SSH command executed on vmwg0-120-57, command:
>> |24852|125:125|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x vmwg0-120-57 'C:/Windows/System32/netsh.exe firewall delete portopening protocol = TCP port = 22 interface = "Local Area Connection 4" ;C:/Windows/System32/netsh.exe firewall delete portopening protocol = TCP port = 22 profile = ALL ;C:/Windows/System32/netsh.exe firewall set portopening name = "Cygwin SSHD" protocol = TCP port = 22 mode = ENABLE interface = "Local Area Connection 3"' 2>&1
>> |24852|125:125|image| returning (1, "The interface was not found. O...")
>> |24852|125:125|image| ---- WARNING ----
>> |24852|125:125|image| 2012-08-02 17:39:42|24852|125:125|image|Windows.pm:firewall_enable_ssh_private(4665)|failed to configure firewall to allow SSH on private interface, exit status: 1, output:
>> |24852|125:125|image| The interface was not found. Ok. The interface was not found.
>> |24852|125:125|image| ( 0) Windows.pm, firewall_enable_ssh_private (line: 4665)
>> |24852|125:125|image| (-1) Windows.pm, reboot (line: 3335)
>> |24852|125:125|image| (-2) Windows.pm, disable_pagefile (line: 2077)
>> |24852|125:125|image| (-3) Windows.pm, pre_capture (line: 474)
>> |24852|125:125|image| (-4) Version_5.pm, pre_capture (line: 105)
>> |24852|125:125|image| (-5) VMware.pm, capture (line: 556)
>> |24852|125:125|image| ---- WARNING ----
>> |24852|125:125|image| 2012-08-02 17:39:42|24852|125:125|image|Windows.pm:reboot(3336)|reboot not attempted, failed to enable ssh from private IP addresses
>>
>>
>>
>>> -----Original Message-----
>>> From: Hechler, Adam [mailto:hechla@rpi.edu]
>>> Sent: Thursday, August 02, 2012 4:38 PM
>>> To: user@vcl.apache.org
>>> Subject: RE: New capture failed attempting Windows post-load tasks
>>>
>>> Thanks Dmitri,
>>>
>>> I was able to ssh to the vm from the management node before I captured.
>>>
>>> Curious.. because I never thought about it before... I can re-capture
>>> an existing vm that's already been captured? I guess it makes logical
>>> sense. It's still just a vm existing in VMWare Server.
>>>
>>> I'll give that a try.
>>>
>>> Adam
>>>
>>> > -----Original Message-----
>>> > From: dchebota@gmu.edu [mailto:dchebota@gmu.edu]
>>> > Sent: Thursday, August 02, 2012 4:35 PM
>>> > To: user@vcl.apache.org
>>> > Subject: Re: New capture failed attempting Windows post-load tasks
>>> >
>>> > Adam
>>> >
>>> > Were you able to 'ssh -i /etc/vcl/vcl.key image-computer-name'
>>> > before you captured the image?
>>> >
>>> > Yes, it seems like a good idea to redo the SSH config, run
>>> > gen-node-key.sh from the management node, and re-capture the image.
>>> > You will have a new image under Manage Images and can delete the old
>>> > image, which is not working.
>>> >
>>> > Reboot the image before you start capture to make sure Cygwin SSH
>>> > starts up.
>>> >
>>> > Thanks
>>> >
>>> >
>>> > On Aug 2, 2012, at 16:18 , "Hechler, Adam" <hechla@rpi.edu> wrote:
>>> >
>>> > > Hi Dmitri,
>>> > >
>>> > > I tried that and it's not working. I even went into Cygwin and
>>> > > tried to manually start sshd from in there, and it's giving me the
>>> > > following error messages:
>>> > >
>>> > > Could not load host key: /etc/ssh_host_rsa_key
>>> > > Could not load host key: /etc/ssh_host_dsa_key
>>> > > Could not load host key: /etc/ssh_host_ecdsa_key
>>> > > Disabling protocol version 2. Could not load host key
>>> > > sshd: no hostkeys available -- exiting.
>>> > >
>>> > > When I check in /etc, there are files for the host keys but they're
>>> > > empty now.  When I check the sshd log there's a bunch of entries
>>> > > showing that it matched host keys and then three sets of "no host
>>> > > keys available" at the bottom of the log (presumably from my last
>>> > > three attempts to start sshd beginning with the reload).
>>> > >
>>> > > Can I just run cygwin-sshd-config.sh again on the VM and then
>>> > > run gen-node-key.sh again on the management node?  It's already been
>>> > > captured so I'm not sure if that would cause havoc.
>>> > >
>>> > > Adam
>>> > >
>>> > >
>>> > >> -----Original Message-----
>>> > >> From: dchebota@gmu.edu [mailto:dchebota@gmu.edu]
>>> > >> Sent: Thursday, August 02, 2012 4:03 PM
>>> > >> To: user@vcl.apache.org
>>> > >> Subject: Re: New capture failed attempting Windows post-load tasks
>>> > >>
>>> > >> Hi Adam
>>> > >>
>>> > >> Once you connect to Windows XP using VI client, can you start the
>>> > >> Cygwin SSH service manually under Control Panel -> Services?
>>> > >>
>>> > >> Thanks.
>>> > >> On Aug 2, 2012, at 15:34 , "Hechler, Adam" <hechla@rpi.edu> wrote:
>>> > >>
>>> > >>> Hi again,
>>> > >>>
>>> > >>> So after getting the new sshd-config file this morning, I
>>> > >>> configured it and all seemed good. I then attempted to capture my
>>> > >>> base image. The capture itself completed successfully, but then I
>>> > >>> got an error that the reload process failed right after this:
>>> > >>>
>>> > >>> 2012-08-02 12:23:18|21124|124:124|reload|Windows.pm:post_load(583)|beginning Windows post-load tasks on vmwg0-120-57
>>> > >>>
>>> > >>> After numerous attempts (about 107) to connect to SSH, it finally
>>> > >>> failed, reporting:
>>> > >>>
>>> > >>> 2012-08-02 12:38:35|21124|124:124|reload|Module.pm:code_loop_timeout(767)|waiting for vmwg0-120-57 to respond to SSH, code did not return true after waiting 900 seconds
>>> > >>>
>>> > >>> Since it didn't finish the post-load tasks, I was still able to
>>> > >>> log in as root to my Windows XP image using the VI Client console.
>>> > >>> I opened Cygwin and typed ps -ef to see if sshd was running, but
>>> > >>> it's not. The only processes running are ps, bash and mintty.
>>> > >>> Should I be able to see if sshd is running using this method of
>>> > >>> checking? I know about ps -ef from very limited Unix interactions,
>>> > >>> so I thought I'd try it.
>>> > >>>
>>> > >>> I know that in the past, when sshd didn't start (before capturing
>>> > >>> into VCL), I would have to open a cmd prompt and run rebaseall,
>>> > >>> but it looks like that cmd file gets deleted during the capture,
>>> > >>> because it's no longer in C:\cygwin\home\root, which is where it
>>> > >>> used to be. I was thinking I would just try to run that again.
>>> > >>>
>>> > >>> Any clues?
>>> > >>>
>>> > >>> Thanks,
>>> > >>> Adam
>>> > >>>
>>> > >>>
>>> > >>>
>>> > >>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>> > >>> Adam Hechler
>>> > >>> Senior Analyst / PC Systems Administrator
>>> > >>> Rensselaer Polytechnic Institute
>>> > >>> 275 Windsor Street
>>> > >>> Hartford, CT 06120 USA
>>> > >>> Ph: 860-548-2446
>>> > >>> Email: hechla@rpi.edu
>>> > >>> Web: http://www.ewp.rpi.edu
>>> > >>>
>>> > >>
>>> > >>
>>> > >>
>>> > >> --
>>> > >> Thank you,
>>> > >>
>>> > >> Dmitri Chebotarov
>>> > >> Virtual Computing Lab Systems Engineer, TSD - Ent Servers &
>>> > >> Messaging
>>> > >> 223 Aquia Building, Ffx, MSN: 1B5
>>> > >> Phone: (703) 993-6175
>>> > >> Fax: (703) 993-3404
>>> > >>
>>> > >>
>>> > >>
>>> > >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>
>>
>>
>>
