cloudstack-dev mailing list archives

From Syed Ahmed <sah...@cloudops.com>
Subject Re: [DISCUSS] Replacing the VR
Date Fri, 16 Sep 2016 13:37:44 GMT
I agree with Will Ilya. There are so many problems with the VR right now.
Most of the outages we've had recently have somehow involved the VR. We set
custom iptables rules on the VR which can and have easily gone wrong.
Openswan is broken, and the Strongswan replacement still needs to be tested.
VRRP with the redundant router still needs work, not to mention the problems
we will have when we introduce IPv6 into the whole picture.

I think the spirit of the discussion is to rely on a 3rd party to do the
networking for us (eg VyOS) and have us handle just the orchestration. All
the problems that I've described have already been solved in VyOS. We also
get the advantage of a potential wider community to fix and maintain the VR
and given our current development velocity, I think it totally makes sense
to look for a 3rd party option.

-Syed
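
To make the "transaction based changes with rollback on error" idea from this
thread concrete, here is a minimal sketch. It is not the VyOS interface or any
CloudStack code: the ConfigSession class, the validate rule and the config keys
are all hypothetical names chosen for illustration. The point is only that
changes are staged on a candidate copy and the running config is replaced in
one step, or not at all.

```python
import copy


class ConfigSession:
    """Hypothetical helper: stage edits on a copy, commit all or nothing."""

    def __init__(self, running):
        self.running = running                    # live config (a dict)
        self.candidate = copy.deepcopy(running)   # staged working copy

    def set(self, key, value):
        # Edits only touch the candidate, never the running config.
        self.candidate[key] = value

    def commit(self, validate):
        try:
            validate(self.candidate)
        except ValueError:
            # Rollback: drop the staged edits, running config is untouched.
            self.candidate = copy.deepcopy(self.running)
            return False
        # Validation passed: swap in the whole candidate at once.
        self.running.clear()
        self.running.update(copy.deepcopy(self.candidate))
        return True


def validate(cfg):
    # Made-up rule for the example: port forwarding needs a public IP.
    if "port_forward" in cfg and "public_ip" not in cfg:
        raise ValueError("port_forward requires public_ip")


running = {"hostname": "vr-1"}
session = ConfigSession(running)

# Incomplete change set: rejected, nothing is applied.
session.set("port_forward", "80->10.1.1.5:80")
ok_bad = session.commit(validate)
after_bad = dict(running)

# Complete change set: applied in one step.
session.set("public_ip", "203.0.113.10")
session.set("port_forward", "80->10.1.1.5:80")
ok_good = session.commit(validate)
```

Contrast this with pushing individual iptables commands over SSH, where a
failure halfway through leaves the router in a partially configured state.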


On Fri, Sep 16, 2016 at 9:18 AM, Will Stevens <wstevens@cloudops.com> wrote:

> The VR has been biting us far too often recently, which is why we have
> started looking into alternative implementations.
>
> One of the things that is nice about potentially using the VyOS is that it
> is based on Debian, so we should be able to run the other services that we
> currently have like the password server and userdata on the VyOS.  This
> means we would not have to change our architecture initially and could
> focus on only replacing the networking paths.
>
> *Will STEVENS*
> Lead Developer
>
> *CloudOps* *| *Cloud Solutions Experts
> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> w cloudops.com *|* tw @CloudOps_
>
> On Fri, Sep 16, 2016 at 6:20 AM, Nux! <nux@li.nux.ro> wrote:
>
> > The more this is discussed the more I think we should stick with our VR.
> >
> > All these other options either seem unfinished or come with an
> > incompatible license.
> >
> > VyOS looks the most promising so far, it's a serious, mature project.
> > Adopting it though means we'll have to microservice our way out of it with
> > extra machines for DNS/USERDATA/etc, unless we can make VyOS serve those
> > too. Imho this adds complexity we should avoid.
> >
> > --
> > Sent from the Delta quadrant using Borg technology!
> >
> > Nux!
> > www.nux.ro
> >
> > ----- Original Message -----
> > > From: "Will Stevens" <wstevens@cloudops.com>
> > > To: dev@cloudstack.apache.org
> > > Sent: Thursday, 15 September, 2016 17:21:28
> > > Subject: Re: [DISCUSS] Replacing the VR
> >
> > > Ya, we would need to add a daemon for VPN as well.  Load balancing is
> > > another aspect which we will need to consider if we went this route.
> > > Something like https://traefik.io/ could potentially be a good fit due to
> > > its API driven configuration, but it may be more than what we need.
> > >
> > > We should probably try to define which pieces make sense to be solved
> > > together and which pieces would be best suited to be broken out.
> > >
> > > I think the network connectivity, routing and firewalling should probably
> > > all stay together since the majority of the tools we would potentially use
> > > would handle all of that together in a single implementation.
> > >
> > > The password server and userdata seem like good candidates for being
> > > broken out and handled independently (and probably rewritten completely
> > > since they currently have some issues).
> > >
> > > Load balancing is another that could warrant splitting out, but that
> > > depends on what direction we go and how we would be managing it.  DHCP and
> > > DNS are others which could go either way.
> > >
> > > If we do split out services, I think we should consolidate as much as we
> > > can into each service we break out.  Ideally a network packet would never
> > > hit more than one, maybe two, services.  I don't think we should be
> > > splitting services 'just because', I think we need a valid case for
> > > splitting any service out because it adds complexity.  Our project is
> > > already complex enough, we need to avoid adding complexity unless it is
> > > really needed.
> > >
> > > Some more of my thoughts on this anyway...
> > >
> > > On Thu, Sep 15, 2016 at 10:28 AM, Simon Weller <sweller@ena.com> wrote:
> > >
> > >> I do agree with you that this probably isn't the right place for the
> > >> password service and user data.
> > >>
> > >>
> > >> Having said that, after taking a cursory look at the dev docs, it doesn't
> > >> seem that difficult to add new daemons:
> > >> https://opensnaproute.github.io/docs/developer.html#creating-new-component
> > >>
> > >>
> > >> They've definitely built it with a microservices architecture in mind, so
> > >> each individual feature is abstracted into its own small daemon process.
> > >> We could just create a daemon for the password server and the userdata
> > >> components if we really had to.
> > >>
> > >>
> > >> - Si
> > >>
> > >>
> > >> ________________________________
> > >> From: williamstevens@gmail.com <williamstevens@gmail.com> on behalf of
> > >> Will Stevens <wstevens@cloudops.com>
> > >> Sent: Thursday, September 15, 2016 9:17 AM
> > >> To: dev@cloudstack.apache.org
> > >> Subject: Re: [DISCUSS] Replacing the VR
> > >>
> > >> A big part of why I know about it is because it is written in Go.  :P
> > >>
> > >> Yes, it is definitely interesting for the routing and traffic handling
> > >> aspects of the VR.  We will likely have to rethink some of the pieces a
> > >> little bit, like the password server and userdata, if we are to adopt a
> > >> different VR approach.  This is where I think some of JohnB and Chiradeep's
> > >> ideas make sense.  In many ways, it does not make sense for the device
> > >> handling routing and network traffic to also be responsible for passwords
> > >> and userdata.
> > >>
> > >> On Thu, Sep 15, 2016 at 9:10 AM, Simon Weller <sweller@ena.com> wrote:
> > >>
> > >> > I hadn't heard of Flexswitch until you mentioned it. It looks pretty cool!
> > >> > It even supports ONIE install.
> > >> >
> > >> > To be honest, the ipsec feature could be added, or we could offload it to
> > >> > a separate VM if we needed to. The fact that it is so feature rich from a
> > >> > routing perspective (and all API driven) is really nice.
> > >> >
> > >> >
> > >> > Based on the roadmap, it looks like they plan to also support capabilities
> > >> > such as BGP-MPLS based L3VPN, EVPN, VPLS in the future. This will be huge
> > >> > for our carrier community that rely on these technologies to do private
> > >> > gateway and inter-VPC interconnections today. We handle this stuff on our
> > >> > ASRs right now with a vlan interconnect into the VR. Being able to do MPLS
> > >> > all the way to the VR would be awesome.
> > >> >
> > >> >
> > >> > It also seems to be written in Go (a language we know very well here at
> > >> > ENA).
> > >> >
> > >> >
> > >> > - Si
> > >> >
> > >> >
> > >> >
> > >> > ________________________________
> > >> > From: Will Stevens <williamstevens@gmail.com>
> > >> > Sent: Thursday, September 15, 2016 7:06 AM
> > >> > To: dev@cloudstack.apache.org
> > >> > Subject: RE: [DISCUSS] Replacing the VR
> > >> >
> > >> > Ya. I don't think it covers our whole use case, but what it does cover is
> > >> > all api driven...
> > >> >
> > >> > On Sep 15, 2016 1:48 AM, "Marty Godsey" <marty@gonsource.com> wrote:
> > >> >
> > >> > > Though I don't see VPN in Snaproute.. Makes sense since it was not
> > >> > > intended to do IPSec.
> > >> > >
> > >> > > It seems as though VyOS is starting to look like the best option.
> > >> > >
> > >> > > Regards,
> > >> > > Marty Godsey
> > >> > > nSource Solutions
> > >> > >
> > >> > > -----Original Message-----
> > >> > > From: williamstevens@gmail.com [mailto:williamstevens@gmail.com] On
> > >> > > Behalf Of Will Stevens
> > >> > > Sent: Wednesday, September 14, 2016 11:06 PM
> > >> > > To: dev@cloudstack.apache.org
> > >> > > Subject: Re: [DISCUSS] Replacing the VR
> > >> > >
> > >> > > Or we could go completely crazy and go with something like FlexSwitch
> > >> > > from SnapRoute
> > >> > > - http://www.snaproute.com/
> > >> > > - https://opensnaproute.github.io/docs/apis.html
> > >> > >
> > >> > > On Wed, Sep 14, 2016 at 10:55 PM, Will Stevens <wstevens@cloudops.com>
> > >> > > wrote:
> > >> > >
> > >> > > > I tend to agree with Syed and Marty.  I am not sure what problems are
> > >> > > > solved by splitting up the function of the VR into a bunch of separate
> > >> > > > services.  As Syed points out, the complexity added is non-trivial.
> > >> > > > We now have to manage all the intercontainer networking as well as the
> > >> > > > orchestrated ACS networking.
> > >> > > >
> > >> > > > VyOS is interesting to me because it covers the majority of our use
> > >> > > > case with a single unified control plane.  It also has good support
> > >> > > > for extending features we care about, like IPv6, VXLAN, VRRP,
> > >> > > > transactions, etc...
> > >> > > >
> > >> > > > On Wed, Sep 14, 2016 at 9:49 PM, Syed Ahmed <sahmed@cloudops.com> wrote:
> > >> > > >
> > >> > > >> Agree with Marty. Adding Docker containers to the picture can make
> > >> > > >> the VR more flexible, but the added complexity is just not worth it.
> > >> > > >> Not to mention we would need to take care of networking each
> > >> > > >> container manually, and given that our iptables rules are very
> > >> > > >> unstable at the moment, I don't see a big value add.
> > >> > > >>
> > >> > > >> VyOS looks like a better solution to me. I know that it does not
> > >> > > >> provide an API, but it does fit the bill quite well otherwise. I
> > >> > > >> especially like the fact that it has a transaction based model and
> > >> > > >> you can roll back changes if something goes wrong.
> > >> > > >>
> > >> > > >> On Wed, Sep 14, 2016 at 9:06 PM Marty Godsey <marty@gonsource.com>
> > >> > > >> wrote:
> > >> > > >>
> > >> > > >> > Licensing aside, I think splitting the various functions into
> > >> > > >> > containers is not a good route either. This will force users to
> > >> > > >> > maintain and use containers, and adds complexity to the networking
> > >> > > >> > aspects of ACS. Complexity decreases stability. Now I understand
> > >> > > >> > the argument that a monolithic approach also brings its own set of
> > >> > > >> > issues, but it also simplifies things.
> > >> > > >> >
> > >> > > >> > Regards,
> > >> > > >> > Marty Godsey
> > >> > > >> > nSource Solutions
> > >> > > >> >
> > >> > > >> > -----Original Message-----
> > >> > > >> > From: Chiradeep Vittal [mailto:chiradeepv@gmail.com]
> > >> > > >> > Sent: Wednesday, September 14, 2016 5:37 PM
> > >> > > >> > To: dev@cloudstack.apache.org
> > >> > > >> > Subject: Re: [DISCUSS] Replacing the VR
> > >> > > >> >
> > >> > > >> > I rather doubt that the Cloudrouter will fit the needs of the
> > >> > > >> > CloudStack project
> > >> > > >> >  - it is AGPL licensed. Many enterprises will not touch anything
> > >> > > >> >    that has AGPL
> > >> > > >> >  - the github repo shows rather infrequent updates. Quite likely
> > >> > > >> >    they aren't considering the use cases of the CloudStack community
> > >> > > >> >
> > >> > > >> > I'd back John B's comments on disaggregating the VR. Split it into
> > >> > > >> > many docker containers
> > >> > > >> >  - password server
> > >> > > >> >  - userdata server
> > >> > > >> >  - DHCP / DNS
> > >> > > >> >  - s2s VPN
> > >> > > >> >  - RA VPN
> > >> > > >> >  - intra-VPC routing and ACL
> > >> > > >> >  - Port forwarding + NAT
> > >> > > >> >  - FW
> > >> > > >> >  - LB (public)
> > >> > > >> >  - LB (internal)
> > >> > > >> >  - secondary storage
> > >> > > >> >  - agent
> > >> > > >> > Glue them together with docker compose files (one per use case -
> > >> > > >> > basic zone, isolated, VPC, SSVM, etc).
> > >> > > >> >
> > >> > > >> > The VR image then becomes a JeOS + docker. You can test each of the
> > >> > > >> > components independently and fixing one bug in the field (say DHCP)
> > >> > > >> > is hitless to the other components. You don't need to build
> > >> > > >> > per-hypervisor VRs. You could even run on baremetal.
> > >> > > >> >
> > >> > > >> > Along the way you need to figure out how to
> > >> > > >> >  - make the traffic traverse the containers that need to be
> > >> > > >> >    traversed (in most cases just 1)
> > >> > > >> >  - bootstrap the router (how does it find its compose file? where
> > >> > > >> >    is the registry?)
> > >> > > >> >  - rethink the command and control of the VR functions. SSH works,
> > >> > > >> >    but something more declarative and idempotent should be explored.
> > >> > > >> >
> > >> > > >> > As you do this, it becomes clearer which of the functions can be
> > >> > > >> > substituted by, for example, CloudRouter. Command and Control of the
> > >> > > >> > docker containers can be moved out to another container. Etc.
> > >> > > >> >
> > >> > > >> > On Wed, Sep 14, 2016 at 12:59 AM, Marty Godsey
> > >> > > >> > <marty@gonsource.com>
> > >> > > >> > wrote:
> > >> > > >> >
> > >> > > >> > > This one does look nice. My biggest concern is the lack of
> > >> > > >> > > VXLANs. It seems that none of the ones we mentioned have an
> > >> > > >> > > API, so we may be stuck with the SSH method.
> > >> > > >> > >
> > >> > > >> > > Regards,
> > >> > > >> > > Marty Godsey
> > >> > > >> > > nSource Solutions
> > >> > > >> > >
> > >> > > >> > > -----Original Message-----
> > >> > > >> > > From: Abhinandan Prateek
> > >> > > >> > > [mailto:abhinandan.prateek@shapeblue.com]
> > >> > > >> > > Sent: Wednesday, September 14, 2016 2:26 AM
> > >> > > >> > > To: dev@cloudstack.apache.org
> > >> > > >> > > Subject: Re: [DISCUSS] Replacing the VR
> > >> > > >> > >
> > >> > > >> > > Cloudrouter looks promising. These have the potential to save
> > >> > > >> > > future engineering effort, for example on IPv6 routing, OSPF etc.
> > >> > > >> > > And the best part is they come with a test automation framework.
> > >> > > >> > >
> > >> > > >> > >
> > >> > > >> > >
> > >> > > >> > >
> > >> > > >> > >
> > >> > > >> > > On 13/09/16, 4:22 PM, "Jayapal Uradi"
> > >> > > >> > > <jayapal.uradi@accelerite.com>
> > >> > > >> > > wrote:
> > >> > > >> > >
> > >> > > >> > > >Hi,
> > >> > > >> > > >
> > >> > > >> > > >Instead of replacing the VR in the first place we should add
> > >> > > >> > > >VyOS/cloudrouter as a provider. Once it is stable, network
> > >> > > >> > > >offerings (on upgrade) can be updated to use it and we can drop
> > >> > > >> > > >the VR from that release onwards if we want.
> > >> > > >> > > >
> > >> > > >> > > >The VR has stabilized over a period of time and some of them are
> > >> > > >> > > >running without issues.  When we replicate the ACS VR features in
> > >> > > >> > > >a new solution it takes some time to find the missing pieces
> > >> > > >> > > >(hidden bugs).
> > >> > > >> > > >
> > >> > > >> > > >Thanks,
> > >> > > >> > > >Jayapal
> > >> > > >> > > >
> > >> > > >> > > >> On Sep 13, 2016, at 2:52 PM, Nux! <nux@li.nux.ro> wrote:
> > >> > > >> > > >>
> > >> > > >> > > >> Hi,
> > >> > > >> > > >>
> > >> > > >> > > >> I like the idea.
> > >> > > >> > > >>
> > >> > > >> > > >> Cloudrouter looks really promising, I'm not too keen on VyOS
> > >> > > >> > > >> (it doesn't have a proper http api etc).
> > >> > > >> > > >>
> > >> > > >> > > >> --
> > >> > > >> > > >> Sent from the Delta quadrant using
Borg technology!
> > >> > > >> > > >>
> > >> > > >> > > >> Nux!
> > >> > > >> > > >> www.nux.ro
> > >> > > >> > > >>
> > >> > > >> > > >>
> > >> > > >> > > abhinandan.prateek@shapeblue.com
> > >> > > >> > > www.shapeblue.com
> > >> > > >> > > 53 Chandos Place, Covent Garden, London  WC2N 4HS UK
> > >> > > >> > > @shapeblue
> > >> > > >> > >
> > >> > > >> > >
> > >> > > >> > >
> > >> > > >> > > ----- Original Message -----
> > >> > > >> > > >>> From: "Will Stevens" <williamstevens@gmail.com>
> > >> > > >> > > >>> To: dev@cloudstack.apache.org
> > >> > > >> > > >>> Sent: Monday, 12 September, 2016 21:20:11
> > >> > > >> > > >>> Subject: [DISCUSS] Replacing the VR
> > >> > > >> > > >>
> > >> > > >> > > >>> *Disclaimer:* This is a thought experiment and should be
> > >> > > >> > > >>> treated as such.
> > >> > > >> > > >>> Please weigh in with the good and bad of this idea...
> > >> > > >> > > >>>
> > >> > > >> > > >>> A couple of us have been discussing the idea of potentially
> > >> > > >> > > >>> replacing the ACS VR with the VyOS [1] (Open Source Vyatta VM).
> > >> > > >> > > >>> There may be a license issue because I think it is licensed
> > >> > > >> > > >>> under GPL, but for the sake of discussion, let's assume we
> > >> > > >> > > >>> can overcome any license issues.
> > >> > > >> > > >>>
> > >> > > >> > > >>> I have spent some time recently with the VyOS and I have to
> > >> > > >> > > >>> admit, I was pretty impressed.  It is simple and intuitive
> > >> > > >> > > >>> and it gives you a lot more options for auditing the
> > >> > > >> > > >>> configuration etc...
> > >> > > >> > > >>>
> > >> > > >> > > >>> Items of potential interest:
> > >> > > >> > > >>> - Clean up our current VR script spaghetti to a simpler more
> > >> > > >> > > >>>   auditable configuration workflow.
> > >> > > >> > > >>> - Gives a cleaner path for IPv6 support.
> > >> > > >> > > >>> - Handles VPN configuration via the same configuration interface.
> > >> > > >> > > >>> - Support for OSPF & BGP.
> > >> > > >> > > >>> - VPN support through OpenVPN & StrongSwan.
> > >> > > >> > > >>> - Easily supports HA (redundant routers) through VRRP.
> > >> > > >> > > >>> - VXLAN support.
> > >> > > >> > > >>> - Transaction based changes to the VR with rollback on error.
> > >> > > >> > > >>>
> > >> > > >> > > >>> Items that could be difficult to solve:
> > >> > > >> > > >>> - Userdata password reset workflow and implementation.
> > >> > > >> > > >>> - Upgrade process.
> > >> > > >> > > >>>
> > >> > > >> > > >>> The VyOS is not the only option if we were to consider this
> > >> > > >> > > >>> approach. Another option, which I don't know as well, would be
> > >> > > >> > > >>> CloudRouter (AGPL license) [2] which is purely API driven.
> > >> > > >> > > >>>
> > >> > > >> > > >>> Anyway, would love to hear your thoughts...
> > >> > > >> > > >>>
> > >> > > >> > > >>> Will
> > >> > > >> > > >>>
> > >> > > >> > > >>> [1] https://vyos.io/
> > >> > > >> > > >>> [2] https://cloudrouter.org/
> > >> > > >> > > >
> > >> > > >> > >
> > >> > > >> >
> > >> > > >>
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> >
>
