brooklyn-dev mailing list archives

From Alex Heneveld <alex.henev...@cloudsoftcorp.com>
Subject Re: [PROPOSAL] Separate management addresses from the concept of an entity's public address
Date Wed, 07 Dec 2016 15:44:01 GMT
Hi Sam,

How does this relate to the strategy suggested in the Networking Proposal
[1] ?

TL;DR I agree with the intention but think some tweaks to the mechanism
would make for even better clarity and consistency going forward.


The proposal [1] suggests several things.  One is a format for network
addresses:   *host.address.network*
For instance:

    host.address.public: 123.0.0.45
    host.address.private1: 10.0.0.45
    host.address.private2: 192.168.0.45

Another is a format for services:  *service.port*  (and *url* and others).
I've added a suggestion to use the convention  *service.network*  to be
able to indicate a specific one of the networks above.
Where required, policies can open up additional ingress or port forwards,
of the form  *service.field.mapped.network* , e.g. *http.port.mapped.public*.

The format suggested in Sam's proposal -- *management.host.address* --
feels inconsistent with the above, and it neglects the need to indicate the
port (e.g. if port forwarding is needed) and bearer (if it's not
straightforward ssh).  I think of management access on a particular
protocol as a *service* -- eg management or management.ssh or *brooklyn.ssh*.
So the proposal would instead say:

    brooklyn.ssh.network:  private1
    brooklyn.ssh.port:  22

With enrichers then able to create eg  *brooklyn.ssh.endpoint: 10.0.0.45:22*

This is saying "the service this entity exposes for Brooklyn to SSH in is
10.0.0.45:22".  It isn't assuming a general management.ssh endpoint (you
could, as a different service, but we shouldn't assume that's always the
case), nor is it assuming a dedicated management network, although again it
could support it (eg replace "private1" with "management" in the above
network name).
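
To make the enricher step concrete, here is a rough sketch of the derivation.
It is Python purely for brevity, and `derive_endpoint` and the plain dict of
sensors are illustrative stand-ins, not Brooklyn APIs:

```python
def derive_endpoint(sensors, service):
    """Return 'address:port' for a service, or None if an input sensor is missing.

    Hypothetical helper: looks up <service>.network, resolves it through
    host.address.<network>, and appends <service>.port.
    """
    network = sensors.get(f"{service}.network")
    port = sensors.get(f"{service}.port")
    address = sensors.get(f"host.address.{network}") if network else None
    if address is None or port is None:
        return None
    return f"{address}:{port}"


# Sensor values taken from the examples above.
sensors = {
    "host.address.public": "123.0.0.45",
    "host.address.private1": "10.0.0.45",
    "host.address.private2": "192.168.0.45",
    "brooklyn.ssh.network": "private1",
    "brooklyn.ssh.port": 22,
}
sensors["brooklyn.ssh.endpoint"] = derive_endpoint(sensors, "brooklyn.ssh")
```

Swapping "private1" for "management" only changes which host.address.* entry
is looked up; nothing else in the derivation needs to know about it.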

This would let us be consistent with other port-forwarding / enriching
strategies as well as naming conventions.  For instance if Brooklyn needs
to ssh through port-forwarding (PFW) with say a rule created on "firewall1"
at 10.0.0.1:10001 CIDR'd to Brooklyn we might start with:

    ssh.network:  private1
    ssh.port:  22

and

    host.address.firewall1: 10.0.0.1

enrichers would create this:

    ssh.endpoint:  10.0.0.45:22

but brooklyn wouldn't be able to access it ... instead a PFW customiser
would set up the following:

    ssh.port.mapped.firewall1:  10001
    ssh.endpoint.mapped.firewall1:  10.0.0.1:10001

and then brooklyn/location setup would create the following sensors from
the above:

    brooklyn.ssh.network:  firewall1
    brooklyn.ssh.port:  10001
*    brooklyn.ssh.endpoint:  10.0.0.1:10001*

The bold line above is what Brooklyn will use to make ssh connections --
completely unambiguous and it can be populated through different
strategies.  (And if it isn't direct ssh but instead
ssh-via-an-intermediate-machine, or https, or some other ssh-encoding
strategy we could also have  "brooklyn.ssh.bearer:  my-https" ... not
immediately relevant here but important to keep in mind that not all SSH
commands are sent via a straightforward ssh.)
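
The port-forwarding flow above could be sketched roughly as follows. This is
Python for brevity only; `apply_port_forward` and `publish_brooklyn_access`
are made-up names standing in for the PFW customiser and the brooklyn/location
setup step, and the dict stands in for an entity's sensors:

```python
def apply_port_forward(sensors, service, firewall, mapped_port):
    """Publish the *.mapped.<firewall> sensors for one forwarding rule."""
    firewall_address = sensors[f"host.address.{firewall}"]
    sensors[f"{service}.port.mapped.{firewall}"] = mapped_port
    sensors[f"{service}.endpoint.mapped.{firewall}"] = (
        f"{firewall_address}:{mapped_port}"
    )


def publish_brooklyn_access(sensors, service, firewall):
    """Copy the mapped sensors into the brooklyn.<service>.* sensors."""
    sensors[f"brooklyn.{service}.network"] = firewall
    sensors[f"brooklyn.{service}.port"] = sensors[f"{service}.port.mapped.{firewall}"]
    sensors[f"brooklyn.{service}.endpoint"] = sensors[
        f"{service}.endpoint.mapped.{firewall}"
    ]


# Starting point from the example above.
sensors = {
    "host.address.private1": "10.0.0.45",
    "host.address.firewall1": "10.0.0.1",
    "ssh.network": "private1",
    "ssh.port": 22,
    "ssh.endpoint": "10.0.0.45:22",
}
apply_port_forward(sensors, "ssh", "firewall1", 10001)
publish_brooklyn_access(sensors, "ssh", "firewall1")
```

The entity's own ssh.endpoint is left untouched; only the brooklyn.ssh.*
sensors point at the forwarded address.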


Other points to address, some of which you touch on, are:

(a) which network is deemed the "default" (used to populate host.address)
(b) which network Brooklyn uses to connect for ssh purposes
(c) how are networks named (eg "public", "private1", "private2")
(d) how do we infer the host name
(e) which network Brooklyn uses to connect for other monitoring/control
purposes (http etc)

Currently as you note, Brooklyn when creating a JcloudsSshMachineLocation
tries to connect to sockets on the reported public addresses and then on
the reported private addresses, and decides that the first one which
listens on port 22 is to be used for (a) and (b).  It neglects (c)
altogether and publishes only `host.address` (and `host.name`).
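
As a simplified picture of that current behaviour (this is not the actual
JcloudsLocation code, just a Python sketch of the logic as described; the
function names and the `probe` parameter are my own):

```python
import socket


def is_listening(address, port=22, timeout=2.0):
    """True if a TCP connection to address:port can be opened.

    This only proves something is listening, not that it is our machine.
    """
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(public_addresses, private_addresses, probe=is_listening):
    """Try public addresses first, then private; return the first that answers."""
    for address in list(public_addresses) + list(private_addresses):
        if probe(address):
            return address
    return None
```

The `probe` argument is only there so the ordering can be exercised without a
network, e.g. `first_reachable(["123.0.0.45"], ["10.0.0.45"], probe=lambda a: True)`
returns the public address.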

Whilst it is a big ask to make the perfect strategies for all this, I think
a big improvement would be to permit this behaviour to be customised.  I
suggest we write a `LocationNetworkInfoCustomizer` instance
(implementing `LocationCustomizer`) to perform (a)-(d), together with a
`networkInfoCustomizer` config key to load it (so that we don't disrupt
other uses of LocationCustomizers).  That class could be the default, but
it could take some additional customization (eg to define specific
strategies for (b)) and of course a developer could subclass it; this
allows behaviour to be overridden in either the location's definition or in
an entity's provisioning properties.

The initial behaviour of such an instance could be as follows:

* attempt to find an address where brooklyn is able to successfully log in
via ssh on port 22 (not just reach it, as you noted Svet is going to fix ... I've had
bizarre problems in hotels where Brooklyn won't connect because an
inaccessible 10.x.x.x address of the machine is an address of a machine on
the hotel wifi!)
  * preferring private addresses
  * but configurable something like "preferPublic" "preferPrivate"
"allowPublic" etc ... and/or a CIDR preference order
* give names "public1/2/3" and "private1/2/3" to those networks reported by
jclouds (in future we could support CIDR constraints or subclasses could
connect to the machine and see which nics they correspond to)
* do simple things for now like make the network brooklyn uses the default
and use the current strategies for hostname, but again these could be
extended in the future
* publish the sensors above
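
A rough sketch of that initial behaviour, to show how small it could be. This
is Python for brevity rather than the Java it would really be; every name here
(`name_networks`, `choose_network`, `sensors_for`, `prefer`, `cidr_order`) is
hypothetical, and `can_log_in` stands in for a real ssh login attempt with
credentials:

```python
import ipaddress


def name_networks(public_addresses, private_addresses):
    """Name the reported addresses public1/2/3 and private1/2/3."""
    named = {}
    for i, address in enumerate(public_addresses, start=1):
        named[f"public{i}"] = address
    for i, address in enumerate(private_addresses, start=1):
        named[f"private{i}"] = address
    return named


def choose_network(named, can_log_in, prefer="private", cidr_order=()):
    """Return the name of the first network brooklyn can log in on.

    Candidates matching cidr_order come first (in that order), then
    networks of the preferred kind, then everything else.
    """
    def rank(item):
        name, address = item
        ip = ipaddress.ip_address(address)
        for position, cidr in enumerate(cidr_order):
            if ip in ipaddress.ip_network(cidr):
                return (0, position)
        return (1, 0) if name.startswith(prefer) else (2, 0)

    for name, address in sorted(named.items(), key=rank):
        if can_log_in(address):
            return name
    return None


def sensors_for(named, chosen, port=22):
    """Build the sensors to publish, using the chosen network as the default."""
    sensors = {f"host.address.{name}": addr for name, addr in named.items()}
    sensors["host.address"] = named[chosen]
    sensors["brooklyn.ssh.network"] = chosen
    sensors["brooklyn.ssh.port"] = port
    sensors["brooklyn.ssh.endpoint"] = f"{named[chosen]}:{port}"
    return sensors


named = name_networks(["123.0.0.45"], ["10.0.0.45", "192.168.0.45"])
chosen = choose_network(named, can_log_in=lambda address: True)
published = sensors_for(named, chosen)
```

With the defaults this picks "private1"; passing `prefer="public"` or a
`cidr_order` such as `["192.168.0.0/16"]` changes the order in which
candidates are tried without touching the rest.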

As for (e) I suggest a similar pattern to explicitly identify the
brooklyn-accessible endpoints for other services that brooklyn needs access
to, eg creating `brooklyn.http.url` from a `http.port` and `http.url` or
`http.network`, possibly through an intermediate `http.url.mapped.jumphost`.

This should let us solve a lot of issues, not just management network
conflation but also public/private issues we have when configuring clusters
of nodes who all need to talk to each other.

Best
Alex


[1]
https://docs.google.com/document/d/1IrWLWunWSl_ScwY3MRICped8eJMjQEH1FbWZJcoK0Iw

On 7 December 2016 at 11:52, Geoff Macartney <
geoff.macartney@cloudsoftcorp.com> wrote:

> +1  This sounds like a good idea.  In most customer deployments I've seen
> in "previous lives" there has been a separate management network for
> production deployments, it would be good for Brooklyn to have this as an
> explicit concept factored out from 'host.address'.
>
> Geoff
>
>
>
> On Wed, 7 Dec 2016 at 11:31 Richard Downer <richard@apache.org> wrote:
>
> > +1
> >
> > The host.address and host.subnet.address has always been confusing, at
> > least for me, and especially so for figuring out what SshMachineLocation
> > is going to do. Having a dedicated sensor for the management address with
> > unambiguous purpose for SshMachineLocation (and others!) seems obvious to
> > me!
> >
> > Thanks
> > Richard.
> >
> >
> > On 6 December 2016 at 16:26, Sam Corbett <sam.corbett@cloudsoftcorp.com>
> > wrote:
> >
> > > Summary:
> > >
> > > Brooklyn conflates the management address of an entity with its public
> > > address. I want to break this by publishing a new sensor called
> > > management.host.address on entities that use JcloudsLocation and using
> > > it in preference to host.address when creating SSH connections and polling
> > > feeds. Its value is the entity's host.subnet.address if that is accessible,
> > > otherwise host.address.
> > >
> > > Some background:
> > >
> > > JcloudsLocation makes a guess at the best host and port to use for SSH
> > > connections to each instance. The value it chooses subsequently informs a
> > > variety of entity sensors, most importantly the "host.address" sensor which
> > > itself informs the address at which various feeds are polled and other
> > > sensors like "main.uri". Right now Brooklyn always prefers a value from the
> > > set of "public" addresses returned by the cloud. Internally to
> > > JcloudsLocation the value that is picked is referred to as the "management
> > > host and port", but by subsequently using it for host.address we conflate
> > > it with the publicly accessible address.
> > >
> > > An obvious change to make to this is to have Brooklyn use the private
> > > address for SSH connections when it's on the same subnet as the thing it
> > > has provisioned. In order to do this without mucking up all of the existing
> > > assumptions I propose we introduce a new sensor called
> > > "management.host.address", whose value is a reachable private address, if
> > > one exists, and otherwise the value chosen for the public address. When
> > > creating SSH connections SshMachineLocation would check for the management
> > > address and fall back to its current behaviour if it's unset.
> > >
> > > An alternative is to have SshMachineLocation itself work out whether it
> > > can connect to a private address. I do not like this option because we
> > > would be unable to reuse the information that a private address is better
> > > in entity feeds and BrooklynAccessUtils.
> > >
> > > Svet raised the excellent point that we risk having an instance's private
> > > address match an irrelevant machine on the same network as Brooklyn. To
> > > resolve this he suggests that we change the check for reachability to also
> > > test credentials rather than simply trying to open a socket as happens at
> > > the moment.
> > >
> > > I'm going to start on an implementation of this. Any feedback or
> > > questions?
> > >
> > > Sam
> > >
> >
>
