Mailing-List: contact dev-help@cloudstack.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cloudstack.apache.org
Subject: =?UTF-8?Q?Re:_Let=e2=80=99s_discuss_database_upgrades?=
To: dev@cloudstack.apache.org
References: 
 <CAG97raeKwetAmNsQOqRFpLXve7rL1WnJ2eELyaMZkMF+YBn=_Q@mail.gmail.com>
 <56814C27.3090102@widodh.nl>
 <CAG97rafLKADE-0gk++NOROW0cWaG=_9dvZqAj72bvH9oYFiH3w@mail.gmail.com>
From: Wido den Hollander <wido@widodh.nl>
Message-ID: <568157F0.7010708@widodh.nl>
Date: Mon, 28 Dec 2015 16:40:32 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.4.0
MIME-Version: 1.0
In-Reply-To: 
 <CAG97rafLKADE-0gk++NOROW0cWaG=_9dvZqAj72bvH9oYFiH3w@mail.gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit


On 28-12-15 16:21, Rafael Weingärtner wrote:
> Thanks for your contribution Wido,
> I have not seen Rohit’s email; I will take a look at it.
> 

Ok, he has a FS here:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Chimp

> About database schema changes happening only in X.Y, I also agree with you
> (that is a convention we all could agree on, and such as conding and
> release procedures we could have a wiki page for that). However, I think we
> still might have scripts in versions X.Y.Z  to add data to a table such as
> “guest_os_hypervisor”.
> 

Yes, that is true. A bugfix could be a addition into the database, but
we have to prevent it as much as possible.

> The point to manage such scripts is that, if we are in version such as
> 4.7.0 and a new script emerges in version 4.5.3, we would have to decide to
> run or not to run it. I would rather not run them, since if they add
> something to the code base; those changes should also be applied into
> master and as a consequence it will be available in a future update.
> 

I understand, but this is where our release cycle becomes the problem.
It is because we release a X.Y.Z release we run into these kind of problems.

If we as a project simple do not release the .Z releases we would be
fine as well ;)

You can try to complicate things with technical things, or if we release
every two / three weeks we don't run into these kind of situations :)

We might even cut the database version loose from the code version.

Database version is simple 100, 101, 102, 103, 104, 105. And a code
version requires a certain version of the database.

Wido

> On Mon, Dec 28, 2015 at 12:50 PM, Wido den Hollander <wido@widodh.nl> wrote:
> 
>>
>>
>> On 28-12-15 14:16, Rafael Weingärtner wrote:
>>> Hi all devs,
>>> First of all, sorry the long text, but I hope we can start a discussion
>>> here and improve that part of ACS.
>>>
>>> A while ago I have faced the code that Apache CloudStack (ACS) uses to
>>> upgrade from a version to newer one and that did not seem to be a good
>> way
>>> to execute our upgrades. Therefore, I decided to use some time to search
>>> for alternatives.
>>>
>>
>> I think we all saw that happen once or more :)
>>
>>> I have read some material about versioning of scripts used to upgrade a
>>> database (DB) of a system and went through some frameworks that could
>> help
>>> us.
>>>
>>> In the literature of software engineering, it is firmly stated that we
>> have
>>> to version DB scripts as we do with the source code of the application,
>>> using the baseline approach. Gladly, we were not that bad at this point,
>> we
>>> already versioned our routines for DB upgrade (.sql and .java).
>> Therefore,
>>> it seemed that we just did not have used a practical approach to help us
>>> during DB upgrades.
>>>
>>> From my readings and looking at the ACS source code I raised the
>> following
>>> requirement:
>>> •    We should be able to write more than one routine to upgrade to a
>>> version; those routines can be written in Java and SQL. We might have
>> more
>>> than a routine to be executed for each version and we should be able to
>>> define an order of execution. Additionally, to go to an upper version, we
>>> have to run all of the routines from smaller versions first, until we
>>> achieve the desired version.
>>>
>>> We could also add another requirement that is the downgrade from a
>> version,
>>> which we currently do not support. With that comes my first question for
>>> discussion:
>>> •    Do we want/need a method to downgrade from a version to a previous
>> one?
>>>
>>
>> I personally do not care. Usually people should create a backup PRIOR to
>> a upgrade. If that fails they can restore the backup.
>>
>>> I found an explanation for not supporting downgrades, and I liked it:
>>> http://flywaydb.org/documentation/faq.html#downgrade
>>>
>>> So, what I devised for us:
>>> First the bureaucracy part  - our migrations occur basically in three (3)
>>> steps, first we have a "prepare script", then a cleanup script and
>> finally
>>> the migration per se that is written in Java, at least, that is what we
>> can
>>> expect when reading the interface “com.cloud.upgrade.dao.DbUpgrade”.
>>>
>>> Additionally, our scripts have the following naming convention:
>>> schema-<currentVersion>to<desiredVersion>, which in IMHO may cause some
>>> confusion because at first sight we may think that from the same version
>> we
>>> could have different paths to an upper version, which in practice is not
>>> happening. Instead of a <currentVersion>to<version> we could simply use
>>> V_<numberOfVersion>_<sequencial>.<fileExtension>, giving that, we have to
>>> execute all of the V_<version> scripts that are smaller than the version
>> we
>>> want to upgrade.
>>>
>>> To clarify what I am saying, I will use an example. Let’s say we have
>> just
>>> installed ACS and ran the cloudstack-setup-database. That command will
>>> create a database schema in version 4.0.0. To upgrade that schema to
>>> version 4.3.0 (it is just an example, it could be any other version), ACS
>>> will use the following mapping:
>>>
>>> _upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(), new
>>> Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430())
>>>
>>> After loading the mapping, ACS will execute the scripts defined in each
>> one
>>> of the Upgrade path classes and the migration code per se.
>>>
>>> Now, let’s say we change the “.sql” scripts name to the pattern I
>>> mentioned, we would have the following scripts; those are the scripts
>> found
>>> that aim to upgrade to versions between the interval 4.0.0 – 4.3.0
>>> (considering 4.3.0, since that is the goal version):
>>>
>>>
>>>    - schema-40to410, can be named to:  V_410_A.sql
>>>    - schema-40to410-cleanup, can be named to:  V_410_B.sql
>>>    - schema-410to420, can be named to:  V_420_A.sql
>>>    - schema-410to420-cleanup , can be named to:  V_420_b.sql
>>>    - schema-420to421, can be named to:  V_421_A.sql
>>>    - schema-421to430, can be named to:  V_430_A.sql
>>>    - schema-421to430-cleanup, can be named to:  V_430_B.sql
>>>
>>>
>>> Additionally, all of the java code would have to follow the same
>>> convention. For instance, we have “com.cloud.upgrade.dao.Upgrade40to41”,
>>> which has some java code to migrate from 4.0.0 to 4.1.0. The idea is to
>>> extract that migration code to a Java class named:  V_410_C.java, giving
>>> that it has to execute the SQL scripts before the java code.
>>>
>>> In order to go from a smaller version (4.0.0) to an upper one (4.3.0), we
>>> have to run all of the migration routines from intermediate versions.
>> That
>>> is what we are already doing, but we do all of that manually.
>>>
>>> Bottom line, I think we could simple use the convention
>>> V_<numberOfVersion>_<sequencial>.<fileExtension> to name upgrade
>> routines.
>>> That would facilitate us to use a framework to help us with that process.
>>> Additionally, I believe that we should always assume that to go from a
>>> smaller version to a higher one, we should run all of the scripts that
>>> exist between them. What do you guys think of that?
>>>
>>
>> That seems good to me. But we still have to prevent that we perform
>> database changes in a X.Y.Z release since that is branched off to a
>> different branch.
>>
>> Imho database changes should only happen in X.Y releases.
>>
>>> After the bureaucracy, we can discuss tools. If we use that convention to
>>> name migration (upgrade) routines, we can start thinking on tools to
>>> support our migration process. I found two (2) promising ones: Liquibase
>>> and Flywaydb (both seem to be under Apache license, but the first one has
>>> an enterprise version?!). After reading the documentation and some usage
>>> examples I found the flywaydb easier and simpler to use.
>>>
>>> What are the options of tools that we can use to help us manage the
>>> database upgrade, without needing to code the upgrade path that you know?
>>>
>>> After that, I think we should decide if we should create another
>>> project/component to take care of migrations, or we can just  add the
>>> dependency of the tool to a project such as “cloud-framework-db” and
>> start
>>> using it.
>>>
>>> The “cloud-framework-db” project seems to have a focus on other things
>> such
>>> as managing transactions and generating SQLs from annotations (?!? That
>>> should be a topic for another discussion). Therefore, I would rather
>> create
>>> a new project that has the specific goal of managing ACS DB upgrades.  I
>>> would also move all of the routines (SQL and Java) to this new project.
>>> This project would be a module of the CloudStack project and it would
>>> execute the upgrade routines at the startup of ACS.
>>>
>>> I believe that going from a homemade solution to one that is more
>>> consolidated and used by other communities would be the way to go.
>>>
>>> I can volunteer myself to create a PR with the aforementioned changes and
>>> using flywaydb to manage our upgrades. However, I prefer to have a good
>>> discussion with other devs first, before starting coding.
>>>
>>> Do you have suggestions or points that should be raised before we start
>>> working on that?
>>
>> Rohit suggested Chimp earlier this year:
>>
>> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201508.mbox/%3C677BD09F-FC75-4888-8DC8-2B7AF7439221@shapeblue.com%3E
>>
>> The thread is called: "[DISCUSS] Let's fix CloudStack Upgrades and DB
>> migrations with CloudStack Chimp"
>>
>> Maybe there is something good in there.
>>
>>>
>>> --
>>> Rafael Weingärtner
>>>
>>
> 
> 
>