ignite-dev mailing list archives

From "Pavel Kovalenko (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-8459) Searching checkpoint history for WAL rebalance is broken
Date Tue, 08 May 2018 18:05:00 GMT
Pavel Kovalenko created IGNITE-8459:

             Summary: Searching checkpoint history for WAL rebalance is broken
                 Key: IGNITE-8459
                 URL: https://issues.apache.org/jira/browse/IGNITE-8459
             Project: Ignite
          Issue Type: Bug
          Components: cache
    Affects Versions: 2.5
            Reporter: Pavel Kovalenko
            Assignee: Pavel Kovalenko

Currently, the mechanism that searches the WAL for available checkpoint records, which provide
the history for WAL rebalance, is broken. As a result, WAL (historical) rebalance never finds
the history it needs, and full rebalance is always used.
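
To illustrate the fallback described above, here is a minimal sketch of the decision, not Ignite's
actual implementation. All names (RebalanceModeSketch, searchHistory, chooseMode) and the shape of
the checkpoint history map are hypothetical and chosen only for illustration; it assumes history
can be modelled as a map from checkpoint timestamp to the partition update counter at that checkpoint.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical sketch; none of these names are Ignite APIs. */
class RebalanceModeSketch {
    enum Mode { HISTORICAL, FULL }

    /**
     * Assumed model of checkpoint history for one partition:
     * checkpoint timestamp -> partition update counter recorded at that checkpoint.
     * Returns the newest checkpoint whose counter does not exceed the demander's counter.
     */
    static OptionalLong searchHistory(TreeMap<Long, Long> cpHistory, long demanderCntr) {
        // Walk from the newest checkpoint to the oldest.
        for (Map.Entry<Long, Long> e : cpHistory.descendingMap().entrySet()) {
            if (e.getValue() <= demanderCntr)
                return OptionalLong.of(e.getKey());
        }

        return OptionalLong.empty();
    }

    /** Historical rebalance is possible only if the search finds a usable checkpoint. */
    static Mode chooseMode(TreeMap<Long, Long> cpHistory, long demanderCntr) {
        return searchHistory(cpHistory, demanderCntr).isPresent() ? Mode.HISTORICAL : Mode.FULL;
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> hist = new TreeMap<>();
        hist.put(100L, 10L);
        hist.put(200L, 25L);

        System.out.println(chooseMode(hist, 30L));            // prints HISTORICAL
        System.out.println(chooseMode(new TreeMap<>(), 30L)); // prints FULL
    }
}
```

In this model, a search that always comes back empty (the symptom reported here) makes chooseMode
return FULL for every partition, regardless of how much WAL is actually available.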

This mechanism was broken in https://github.com/apache/ignite/commit/ec04cd174ed5476fba83e8682214390736321b37
for unclear reasons.

If we swap the following two code blocks (the database().beforeExchange() call and the exchCtx if block):

        /* It is necessary to run database callback before all topology callbacks.
           In case of persistent store is enabled we first restore partitions presented on
           We need to guarantee that there are no partition state changes logged to WAL before this callback
           to make sure that we correctly restored last actual states. */

        if (!exchCtx.mergeExchanges()) {
            for (CacheGroupContext grp : cctx.cache().cacheGroups()) {
                if (grp.isLocal() || cacheGroupStopping(grp.groupId()))
                    continue;

                // It is possible affinity is not initialized yet if node joins to cluster.
                if (grp.affinity().lastVersion().topologyVersion() > 0)
                    grp.topology().beforeExchange(this, !centralizedAff && !forceAffReassignment,

the search mechanism starts to work correctly. It is currently unclear why this happens.

This message was sent by Atlassian JIRA
