asterixdb-notifications mailing list archives

From "Murtadha Hubail (Code Review)" <>
Subject Change in asterixdb[master]: Divide Cluster into Unique Partitions
Date Thu, 31 Dec 2015 02:23:18 GMT
Murtadha Hubail has posted comments on this change.

Change subject: Divide Cluster into Unique Partitions

Patch Set 4:

Here is some background.
The user is supposed to provide two files:
1. cluster.xml
In this file, the user specifies the nodes, their IO devices, log directory, etc., as well as
a tag called <store>, which can be added at both the cluster level and each NC level.
This tag is just a single string that specifies the storage directory name. If it is not defined
on an NC, that NC uses the CC value. If it is defined at neither level, a configuration validation
error is reported.
This cluster.xml file is supposed to be on the CC only, but now we distribute it to all NCs
as well.

2. asterix-configuration:
This file contains the other runtime properties, such as page size, memory parameters, etc.
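
The <store> fallback rule described in item 1 (as it worked before this change removed the NC-level tag) can be sketched roughly as follows. This is only an illustration in Python, not AsterixDB code: the function name resolve_store and the exception class ConfigValidationError are hypothetical, and the real validation lives in the Java configuration code.

```python
from typing import Optional


class ConfigValidationError(ValueError):
    """Hypothetical error type: no <store> value could be resolved for a node."""


def resolve_store(nc_store: Optional[str], cluster_store: Optional[str]) -> str:
    """Return the storage directory name for one NC.

    Assumption: a missing tag is represented as None or an empty string.
    """
    # An NC-level <store> takes precedence when it is present.
    if nc_store:
        return nc_store
    # Otherwise fall back to the cluster-level (CC) value.
    if cluster_store:
        return cluster_store
    # Defined at neither level: report a configuration validation error.
    raise ConfigValidationError(
        "<store> is defined at neither the NC level nor the cluster level"
    )
```

With the change under review, only the cluster-level branch remains and the cluster-level value becomes mandatory.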

When an instance is created, these two files are read, and a new file that follows the AsterixConfiguration
class is created and distributed to all NCs.
The value of the new tag in this file (storageDirName) is just copied from the cluster.xml
<store> tag. The <store> tag in the AsterixConfiguration file has a different format: it has
an NC id and a storeDirs child tag. These values are constructed by taking each IO device
from the cluster.xml file per NC and appending the single-string store to it, with a comma
added after each IO device. Finally, these values are assigned to the <store> tag
of the AsterixConfiguration class per NC. These stores are still needed if you would like
to get the complete storage path per IO device on an NC. For example, these paths are used to
clean up temp datasets on NC shutdown. Of course, one can find these values at runtime
by checking the IOManager mount points and appending the new <storageDirName> tag value to them.
I kept them because there are still callers of them and because they provide a static, easy
way of finding this information.
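
As a rough illustration of how the per-NC storeDirs values described above are put together, here is a minimal Python sketch. The function names and example paths are hypothetical, and two details are assumptions not stated in this comment: that the store name is joined to each IO device with a path separator, and that the comma acts as a separator between entries rather than a trailing character.

```python
import posixpath
from typing import Dict, List


def build_store_dirs(io_devices: List[str], store_name: str) -> str:
    """Build the comma-separated storeDirs string for one NC."""
    # Append the single-string store name to every IO device path
    # (assumption: joined with "/"), then separate entries with commas.
    return ",".join(posixpath.join(dev, store_name) for dev in io_devices)


def build_all_stores(nodes: Dict[str, List[str]], store_name: str) -> Dict[str, str]:
    """Map each NC id to its storeDirs string, one entry per NC."""
    return {nc_id: build_store_dirs(devs, store_name) for nc_id, devs in nodes.items()}


if __name__ == "__main__":
    # Hypothetical two-node cluster with made-up IO device paths.
    cluster = {"nc1": ["/mnt/disk1", "/mnt/disk2"], "nc2": ["/mnt/disk1"]}
    for nc_id, dirs in build_all_stores(cluster, "storage").items():
        print(nc_id, dirs)
```

The same strings could be derived at runtime from the IO device mount points plus storageDirName, which is why the static values are a convenience rather than a necessity.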

One thing you should know is that the file you pointed out (asterix-build-configuration.xml)
is supposed to be the equivalent of the automatically produced file on instance creation,
but since this is not done in AsterixHyracksIntegrationUtil, this file was hacked manually
and made to look like the produced one.

The only change I made to this process is that I removed the single-string <store> tag from
the NC level in the cluster.xml file and made it required at the cluster level.

I know that using the <store> tag twice for two different things is confusing, but
I don't know who came up with it.

I hope this makes things clear. If not, please feel free to ask.

Of course, this change breaks backward compatibility. There is a complete change to the storage
file structure. This isn't caused by removing the <store> tag from NCs, but by introducing
the unique partition id in the path. Even the Hyracks part of this change breaks backward compatibility,
since we don't append the IO device number to the end of the file split path. There are many
pending changes that will break backward compatibility now in order to enable backward compatibility in the future.
There was a discussion about this on the metadata indexes change.


Gerrit-MessageType: comment
Gerrit-Change-Id: I8c7fbca5113dd7ad569a46dfa2591addb5bf8655
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Murtadha Hubail <>
Gerrit-Reviewer: Ian Maxon <>
Gerrit-Reviewer: Jenkins <>
Gerrit-Reviewer: Murtadha Hubail <>
Gerrit-Reviewer: Yingyi Bu <>
Gerrit-Reviewer: abdullah alamoudi <>
Gerrit-HasComments: No
