ignite-issues mailing list archives

From "Alexey Goncharuk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-8610) Searching checkpoint / WAL history for rebalancing is not properly working in case of local/global WAL disabling
Date Wed, 06 Jun 2018 15:08:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-8610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503419#comment-16503419
] 

Alexey Goncharuk commented on IGNITE-8610:
------------------------------------------

[~Jokser], a few comments:
* In {{GridDhtPreloader}} you've added the following code:
{code}
        if (!assignments.isEmpty() && grp.persistenceEnabled()) {
            ctx.database().checkpointReadLock();

            try {
                ((GridCacheDatabaseSharedManager) ctx.database()).lastCheckpointInapplicableForWalRebalance(grp.groupId());
            }
            finally {
                ctx.database().checkpointReadUnlock();
            }
        }
{code}
I suggest introducing such a method in the DatabaseSharedManager with an empty default
implementation, while the persistence-enabled implementation acquires the checkpoint read lock
and does the necessary work. This will hide both the {{instanceof}} and the {{if (persistenceEnabled())}} check.
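
A minimal sketch of the shape this suggestion could take. All class names here are hypothetical, simplified stand-ins rather than the real Ignite classes, and a plain read-write lock and a set stand in for the checkpoint lock and the actual bookkeeping:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the base database manager (not the real Ignite class). */
class DatabaseSharedManagerSketch {
    /** Default implementation is empty: without persistence there is nothing to mark. */
    public void lastCheckpointInapplicableForWalRebalance(int grpId) {
        // No-op.
    }
}

/** Hypothetical stand-in for the persistence-enabled manager. */
class PersistentDatabaseSharedManagerSketch extends DatabaseSharedManagerSketch {
    /** Assumption: a read-write lock stands in for the checkpoint lock. */
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();

    /** Assumption: a set of group IDs stands in for the real bookkeeping. */
    private final Set<Integer> inapplicableGrps = ConcurrentHashMap.newKeySet();

    @Override public void lastCheckpointInapplicableForWalRebalance(int grpId) {
        // The lock is taken inside the manager, so callers no longer deal with it.
        checkpointLock.readLock().lock();

        try {
            inapplicableGrps.add(grpId);
        }
        finally {
            checkpointLock.readLock().unlock();
        }
    }

    boolean isMarked(int grpId) {
        return inapplicableGrps.contains(grpId);
    }
}

/** Hypothetical caller: no cast, no persistence check, no explicit locking. */
class PreloaderSketch {
    static void onAssignments(DatabaseSharedManagerSketch db, boolean assignmentsEmpty, int grpId) {
        if (!assignmentsEmpty)
            db.lastCheckpointInapplicableForWalRebalance(grpId);
    }

    public static void main(String[] args) {
        PersistentDatabaseSharedManagerSketch db = new PersistentDatabaseSharedManagerSketch();

        onAssignments(db, false, 1);

        System.out.println("Group 1 marked: " + db.isMarked(1));
    }
}
```

With this shape the call site depends only on the base type, and the choice between the no-op and the locking behavior is made once, when the manager is created.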

* You've added a synchronous wait for partition re-creation in {{generateAssignments}}, which
happens in the exchange thread. Let's add our generic timed spin-wait and warn if the wait takes
too long.
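
A rough sketch of what a timed spin-wait with a warning might look like. The class and method names are hypothetical, and a {{CountDownLatch}} stands in for whatever future or condition actually signals that partition re-creation has finished:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper, not an existing Ignite utility. */
class TimedSpinWait {
    /**
     * Waits for the latch in bounded slices and reports a warning after every
     * slice that elapses without completion.
     *
     * @return Number of warnings issued before the latch was released.
     */
    static int awaitWithWarnings(CountDownLatch done, long sliceMs, Consumer<String> warn) {
        long startNanos = System.nanoTime();
        int warnings = 0;

        try {
            // await() returns false when the slice times out, so we loop and warn.
            while (!done.await(sliceMs, TimeUnit.MILLISECONDS)) {
                warnings++;

                long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);

                warn.accept("Still waiting for partition re-creation [waitedMs=" + waitedMs + ']');
            }
        }
        catch (InterruptedException e) {
            // Preserve the interrupt flag for the calling thread.
            Thread.currentThread().interrupt();

            throw new IllegalStateException("Interrupted while waiting", e);
        }

        return warnings;
    }

    public static void main(String[] args) {
        CountDownLatch done = new CountDownLatch(1);

        // Simulate a slow re-creation that finishes after roughly 120 ms.
        new Thread(() -> {
            try {
                Thread.sleep(120);
            }
            catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }

            done.countDown();
        }).start();

        int warnings = awaitWithWarnings(done, 50, System.out::println);

        System.out.println("Finished after " + warnings + " warning(s)");
    }
}
```

The exchange thread still blocks here, but a stuck wait becomes visible in the log instead of hanging silently.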

> Searching checkpoint / WAL history for rebalancing is not properly working in case of
local/global WAL disabling
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8610
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8610
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.5
>            Reporter: Pavel Kovalenko
>            Assignee: Pavel Kovalenko
>            Priority: Major
>             Fix For: 2.6
>
>
> After the implementation of IGNITE-6411 and IGNITE-8087, we can face a situation where, after
some checkpoint, WAL was temporarily disabled and then enabled again. In this case we can't treat
that checkpoint as a starting point for rebalancing, because the WAL history after such a checkpoint
may contain gaps.
> We should rework our checkpoint / WAL history search mechanism and ignore such checkpoints.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
