cloudstack-dev mailing list archives

From Marc-Aurèle Brothier <ma...@exoscale.ch>
Subject Re: [Discuss] Management cluster / Zookeeper holding locks
Date Mon, 18 Dec 2017 10:55:49 GMT
I understand your point, but there isn't any "transaction" in ZK.
Transactions and commits belong to the DB and are not part of ZK. All
entries (if you start writing data in some nodes) are versioned. For
example, you could enforce that, to overwrite a node value, the writer must
submit the version id it last read, which ensures it is overwriting the
latest value/state of that node. Bear in mind that you should not put too
much data into your ZK: it's not a database replacement, nor a NoSQL DB.
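
To make the versioning point concrete, here is a minimal sketch of the idea.
It is an in-memory stand-in for a single node, not the ZooKeeper or Curator
API: the names VersionedNode and setData(newData, expectedVersion) are made
up for illustration. As far as I know, the Curator counterpart is a
setData().withVersion(...) call, which is rejected when the version is stale.

```java
// In-memory stand-in for one ZK node (assumption: NOT the real ZK/Curator API).
final class VersionedNode {
    private String data;
    private int version = 0; // bumped on every successful write

    VersionedNode(String initialData) {
        this.data = initialData;
    }

    synchronized String getData() {
        return data;
    }

    synchronized int getVersion() {
        return version;
    }

    // Overwrites the value only if the caller read the latest version.
    // Returns false and changes nothing when expectedVersion is stale.
    synchronized boolean setData(String newData, int expectedVersion) {
        if (expectedVersion != version) {
            return false;
        }
        data = newData;
        version++;
        return true;
    }
}

final class VersionedWriteDemo {
    public static void main(String[] args) {
        VersionedNode node = new VersionedNode("free");
        // Two writers read the same version before either one writes.
        int seenByA = node.getVersion();
        int seenByB = node.getVersion();
        boolean a = node.setData("taken-by-A", seenByA); // first write wins
        boolean b = node.setData("taken-by-B", seenByB); // stale version, rejected
        System.out.println(a + " " + b + " " + node.getData());
    }
}
```

The second writer has to re-read the node and decide again, which is the
whole point: nobody overwrites a state they have not seen.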

The ZK client (CuratorFramework object) is started at server startup,
and you only need to pass it along in your calls so that the connection is
reused, or retried, depending on its state. Nothing has to be done manually;
it's all handled by the Curator library.

On Mon, Dec 18, 2017 at 11:44 AM, Rafael Weingärtner <
rafaelweingartner@gmail.com> wrote:

> I did not check the link before. Sorry about that.
>
> Reading some of the pages there, I see curator more like a client library
> such as MySQL JDBC client.
>
> When I mentioned framework, I was looking for something like Spring-data.
> So, we could simply rely on the framework to manage connections and
> transactions. For instance, we could define a pattern that would open a
> connection with a read-only transaction. And then, we could annotate
> methods that write something to the database with
> @Transactional(readOnly = false). If we are going for a change like this, we
> need to remove manually opened connections and transactions. Also, we have to
> remove the transaction management code from our code base.
>
> I would like to see something like this [1] in our future. No manually
> written transaction code, and no transaction management in our code base.
> Just simple annotation usage or transaction pattern in Spring XML files.
>
> [1]
> https://github.com/rafaelweingartner/daily-tasks/blob/master/src/main/java/br/com/supero/desafio/services/TaskService.java
>
> On Mon, Dec 18, 2017 at 8:32 AM, Marc-Aurèle Brothier <marco@exoscale.ch>
> wrote:
>
> > @rafael, yes there is a framework (curator), it's the link I posted in my
> > first message: https://curator.apache.org/curator-recipes/shared-lock.html
> > This framework helps handle all the complexity of ZK.
> >
> > The ZK client stays connected all the time (like the DB connection pool),
> > and only one connection (ZKClient) is needed to communicate with the ZK
> > server. The framework handles reconnection as well.
> >
> > Have a look at the Curator website to understand its goal:
> > https://curator.apache.org/
> >
> > On Mon, Dec 18, 2017 at 11:01 AM, Rafael Weingärtner <
> > rafaelweingartner@gmail.com> wrote:
> >
> > > Do we have a framework to do this kind of locking in ZK?
> > > I mean, you said "create a new InterProcessSemaphoreMutex which handles
> > > the locking mechanism." This feels like we would have to continue opening
> > > and closing this transaction manually, which is what causes a lot of our
> > > headaches with transactions (it is not entirely the fault of MySQL locks,
> > > but of our code structure).
> > >
> > > On Mon, Dec 18, 2017 at 7:47 AM, Marc-Aurèle Brothier <marco@exoscale.ch>
> > > wrote:
> > >
> > > > We added a ZK lock to fix this issue, but we will remove all current
> > > > locks in ZK in favor of the ZK one. The ZK lock is already encapsulated
> > > > in a project with an interface, but more work should be done to have a
> > > > proper interface for locks which could be implemented with the "tool"
> > > > you want, either a DB lock for simplicity, or ZK for more advanced
> > > > scenarios.
> > > >
> > > > @Daan you will need to add the ZK libraries in CS and have a running ZK
> > > > server somewhere. The configuration value is read from
> > > > server.properties. If the line is empty, the ZK client is not created
> > > > and any lock request will immediately return (not holding any lock).
> > > >
> > > > @Rafael: ZK is pretty easy to set up and keep running, as long as you
> > > > don't put too much data in it. In our scenario here, with only locks,
> > > > it's easy. ZK would only be the gatekeeper to locks in the code,
> > > > ensuring that multiple JVMs can request a true lock.
> > > > From the code point of view, you open a connection to a ZK node (any
> > > > node of a cluster) and you create a new InterProcessSemaphoreMutex which
> > > > handles the locking mechanism.
> > > >
> > > > On Mon, Dec 18, 2017 at 10:24 AM, Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>
> > > > wrote:
> > > >
> > > > > Rafael,
> > > > >
> > > > > - It's easy to configure and run ZK either in single node or cluster
> > > > > - zookeeper should replace the mysql locking mechanism used inside ACS
> > > > > code (places where ACS locks tables or rows).
> > > > >
> > > > > On the other side, I don't think that moving from MySQL locks to ZK
> > > > > locks is an easy, light (or even implementable) path.
> > > > >
> > > > > 2017-12-18 16:20 GMT+07:00 Rafael Weingärtner <rafaelweingartner@gmail.com>:
> > > > >
> > > > > > How hard is it to configure Zookeeper and get everything up and
> > > > > > running?
> > > > > > BTW: what would zookeeper be managing? CloudStack management servers
> > > > > > or MySQL nodes?
> > > > > >
> > > > > > On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <
> > > > > > kudryavtsev_ia@bw-sw.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hello, Marc-Aurele, I strongly believe that all mysql locks should
> > > > > > > be removed in favour of a true DLM solution like Zookeeper. The
> > > > > > > performance of a 3-node ZK ensemble should be enough to hold up to
> > > > > > > 1000-2000 locks per second, and it helps to move to a truly
> > > > > > > clustered MySQL like galera without a single master server.
> > > > > > >
> > > > > > > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier <marco@exoscale.ch>:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > I was wondering how many of you are running CloudStack with a
> > > > > > > > cluster of management servers. I would think most of you, but it
> > > > > > > > would be nice to hear everyone's voice. And do you get hosts
> > > > > > > > going over their capacity limits?
> > > > > > > >
> > > > > > > > We discovered that during VM allocation, if you get a lot of
> > > > > > > > parallel requests to create new VMs, most notably with large
> > > > > > > > profiles, the capacity increase is done too long after the host
> > > > > > > > capacity checks, which results in hosts going over their capacity
> > > > > > > > limits. To detail the steps: the deployment planner checks
> > > > > > > > cluster/host capacity and picks one deployment plan (zone,
> > > > > > > > cluster, host). The plan is stored in the database under a VMwork
> > > > > > > > job, and another thread picks up that entry and starts the
> > > > > > > > deployment, increasing the host capacity and sending the
> > > > > > > > commands. Here there's a time gap of a couple of seconds between
> > > > > > > > the host being picked and the capacity increase for that host,
> > > > > > > > which is more than enough to go over the capacity on one or more
> > > > > > > > hosts. A few VMwork jobs can be added to the DB queue targeting
> > > > > > > > the same host before one gets picked up.
> > > > > > > >
> > > > > > > > To fix this issue, we're using Zookeeper to act as the multi-JVM
> > > > > > > > lock manager thanks to their curator library
> > > > > > > > (https://curator.apache.org/curator-recipes/shared-lock.html). We
> > > > > > > > also changed the time when the capacity is increased, which now
> > > > > > > > occurs pretty much right after the deployment plan is found and
> > > > > > > > inside the zookeeper lock. This ensures we don't go over the
> > > > > > > > capacity of any host, and it has proven effective for a month in
> > > > > > > > our management server cluster.
> > > > > > > >
> > > > > > > > This adds another potential requirement which should be discussed
> > > > > > > > before proposing a PR. Today the code works seamlessly without ZK
> > > > > > > > too, to ensure it's not a hard requirement, for example in a lab.
> > > > > > > >
> > > > > > > > Comments?
> > > > > > > >
> > > > > > > > Kind regards,
> > > > > > > > Marc-Aurèle
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > With best regards, Ivan Kudryavtsev
> > > > > > > Bitworks Software, Ltd.
> > > > > > > Cell: +7-923-414-1515
> > > > > > > WWW: http://bitworks.software/ <http://bw-sw.com/>
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rafael Weingärtner
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > With best regards, Ivan Kudryavtsev
> > > > > Bitworks Software, Ltd.
> > > > > Cell: +7-923-414-1515
> > > > > WWW: http://bitworks.software/ <http://bw-sw.com/>
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
>
>
>
> --
> Rafael Weingärtner
>
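
For reference, the fix described in the first message of the thread (do the
capacity check and the capacity increase inside the same lock) can be
sketched roughly as below. This is only an illustration under assumptions,
not the CloudStack code: CapacityLock, LocalCapacityLock and Host are
hypothetical names, and the lock here is a single-JVM ReentrantLock standing
in for the role a distributed lock such as Curator's
InterProcessSemaphoreMutex would play across several management servers.

```java
// Hypothetical lock abstraction; names are illustrative, not CloudStack code.
interface CapacityLock {
    void acquire();
    void release();
}

// Single-JVM stand-in (assumption). Across several JVMs a distributed lock
// would have to take this place.
final class LocalCapacityLock implements CapacityLock {
    private final java.util.concurrent.locks.ReentrantLock lock =
            new java.util.concurrent.locks.ReentrantLock();

    public void acquire() {
        lock.lock();
    }

    public void release() {
        lock.unlock();
    }
}

final class Host {
    private final CapacityLock lock;
    private final int totalCpu;
    private int usedCpu = 0;

    Host(int totalCpu, CapacityLock lock) {
        this.totalCpu = totalCpu;
        this.lock = lock;
    }

    // Check and reserve in one critical section, so no other request can
    // pass the check between our check and our capacity increase.
    boolean tryReserve(int cpu) {
        lock.acquire();
        try {
            if (usedCpu + cpu > totalCpu) {
                return false;
            }
            usedCpu += cpu;
            return true;
        } finally {
            lock.release();
        }
    }

    int usedCpu() {
        lock.acquire();
        try {
            return usedCpu;
        } finally {
            lock.release();
        }
    }
}

final class ReserveDemo {
    public static void main(String[] args) throws InterruptedException {
        Host host = new Host(8, new LocalCapacityLock());
        Thread[] workers = new Thread[10];
        for (int i = 0; i < workers.length; i++) {
            // Ten parallel requests of 4 CPUs each against an 8-CPU host.
            workers[i] = new Thread(() -> host.tryReserve(4));
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(host.usedCpu()); // never exceeds 8
    }
}
```

If the check and the increase were done in two separate steps, several of the
ten requests could pass the check before any of them increased the used
capacity, which is the over-capacity situation described in that message.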
