hawq-dev mailing list archives

From Konstantin Boudnik <...@apache.org>
Subject Re: ssh'ing around the cluster
Date Sun, 28 Feb 2016 06:33:02 GMT
Hi Lei.

Looks like in the current incarnation, Hawq isn't capable of running local
operations at all. I finally found the source of this issue, which took me a
while, considering that I've dealt with Python three times in my lifetime,
including this one. Anyway, hawq_ctl _only_ operates with remote_ssh when
it gets to the init of the different parts of the cluster. The function comes
from hawqpylib/hawqlib.py and runs ssh explicitly.
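
For reference, a minimal sketch of what these helpers appear to boil down to,
judging by the traceback quoted below (local_ssh matches the stack frames; the
remote_ssh body and signature are my assumption, not the actual hawqlib code):

    import subprocess

    def local_ssh(cmd):
        # Per the traceback: just shells out on the local node; the
        # ssh-into-localhost invocation arrives inside the cmd string.
        return subprocess.Popen(cmd, shell=True).wait()

    def remote_ssh(cmd, host, user='hawq'):
        # Assumed shape of the hawqlib helper: wraps the command in an
        # explicit ssh invocation, which is why password-less SSH
        # between nodes becomes a hard requirement.
        return subprocess.Popen('ssh %s@%s "%s"' % (user, host, cmd),
                                shell=True).wait()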

I propose we implement the functionality so that local execution is indeed
possible, as I've described earlier, and sketched below. Thoughts?
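
To make the split concrete, here's a rough sketch of what I have in mind
(hypothetical helpers and names, not actual hawq code):

    import subprocess

    def init_segment_local():
        # Pure local-node operation: no ssh anywhere, so an external
        # orchestrator (Puppet, Chef, and the like) can invoke it on
        # each node in full isolation.
        return subprocess.call(['hawq', 'init', 'segment'])

    def init_segment_remote(host, user='hawq'):
        # Thin convenience wrapper for the dev environment: runs the
        # same local operation over ssh. Only this layer requires
        # password-less SSH between nodes.
        return subprocess.call(['ssh', '%s@%s' % (user, host),
                                'hawq init segment'])

    # 'hawq init cluster' then simply loops init_segment_remote() over
    # the segment hosts, while real deployments drive
    # init_segment_local() directly on every node.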

Regards,
  Cos

On Sat, Feb 27, 2016 at 02:44PM, Konstantin Boudnik wrote:
> On Thu, Feb 25, 2016 at 09:03PM, Konstantin Boudnik wrote:
> > On Fri, Feb 26, 2016 at 10:53AM, Lei Chang wrote:
> > > Hi Konstantin,
> > > 
> > > If I understand correctly, what you requested is in the current hawq code base.
> > > To init hawq, "hawq init cluster" is one way, you can also use "hawq init
> > > master" and "hawq init segment" on all cluster nodes. Master and segments
> > > are decoupled in 2.0.
> > 
> > Ah, nice. Let me try that. Thanks!
> 
> Well, unfortunately, it doesn't work as expected. Even as I do
>     $ sudo -u hawq /usr/bin/hawq init master
> 
> I see that the process always goes through the local_ssh call, which ends up
> in an attempt to ssh into localhost.
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/usr/lib64/python2.7/trace.py", line 819, in <module>
>     main()
>   File "/usr/lib64/python2.7/trace.py", line 807, in main
>     t.runctx(code, globs, globs)
>   File "/usr/lib64/python2.7/trace.py", line 513, in runctx
>     exec cmd in globals, locals
>   File "/usr/lib/hawq/bin/hawq", line 133, in <module>
>     main()
>   File "/usr/lib/hawq/bin/hawq", line 80, in main
>     result = local_ssh(cmd)
>   File "/usr/lib/hawq/bin/hawq", line 35, in local_ssh
>     result = subprocess.Popen(cmd, shell=True).wait()
>   File "/usr/lib64/python2.7/subprocess.py", line 1376, in wait
>     pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
>   File "/usr/lib64/python2.7/subprocess.py", line 478, in _eintr_retry_call
>     return func(*args)
> 
> Any ideas what I am doing wrong? Thanks
>   Cos
> 
> > > On Fri, Feb 26, 2016 at 9:40 AM, Konstantin Boudnik <cos@apache.org> wrote:
> > > 
> > > > Guys,
> > > >
> > > > more revelations about the way Hawq is designed to work with the service
> > > > bootstraps, config management, and so on. Looking into hawq_ctl and
> > > > observing the behavior of 'hawq init cluster', I see that a number of
> > > > operations are intended to be initiated from, presumably, a master node,
> > > > and carried out on all the nodes of a hawq cluster by means of ssh (or
> > > > rsync). While doing so might be a convenient shortcut for the development
> > > > environment, it isn't as much in a real deployment. For one, it requires
> > > > password-less SSH access between nodes, which isn't (generally speaking)
> > > > how data centers might operate.
> > > >
> > > > Perhaps a better way of separating the concerns here is to have isolated
> > > > functions that perform only local-node operations, and a wrapper to run
> > > > the same functionality on all the remote nodes, via ssh or otherwise. If
> > > > such a split is done, then an orchestration mechanism (such as a state
> > > > machine similar to Puppet or Chef) would execute the scripts on separate
> > > > nodes in full isolation. And if so desired in a dev environment, the
> > > > current functionality would be available as well.
> > > >
> > > > Thoughts? Regards,
> > > >   Cos
> > > >
> 
> 


