geode-issues mailing list archives

From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (GEODE-4285) Temporary failure with "Unable to determine PDXType" using WAN
Date Wed, 17 Jan 2018 23:39:00 GMT


ASF subversion and git services commented on GEODE-4285:

Commit da426077cc621196907df59c6a9e5f7dce907333 in geode's branch refs/heads/develop from
[;h=da42607 ]

GEODE-4285: Get a distributed lock if we can't find a PDX type

If we are unable to find a PDX type during a get, we will get a
distributed lock and try again. This prevents races where the type may
be in the middle of distribution.

Adding a dunit test for the race condition that requires this fix.
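
The lock-and-retry idea in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: `TypeRegistrySketch`, `addType`, and `getType` are hypothetical names, a `ConcurrentHashMap` stands in for the PDX type region, and a local `ReentrantLock` stands in for the distributed lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a PDX type registry; not Geode's PeerTypeRegistration. */
class TypeRegistrySketch {
  // Assumption: a local map stands in for the replicated typeId -> type region.
  private final Map<Integer, String> region = new ConcurrentHashMap<>();
  // Assumption: a local lock stands in for the distributed lock.
  private final ReentrantLock lock = new ReentrantLock();

  /** Publishes a type while holding the lock, as a writer would. */
  public void addType(int typeId, String type) {
    lock.lock();
    try {
      region.putIfAbsent(typeId, type);
    } finally {
      lock.unlock();
    }
  }

  /** Looks up a type; on a miss, waits for any in-flight writer and retries once. */
  public String getType(int typeId) {
    String type = region.get(typeId);
    if (type != null) {
      return type; // fast path: the type is already visible, no lock needed
    }
    // Slow path: taking the lock blocks until a concurrent addType has finished,
    // so the second lookup sees any type that was in the middle of being published.
    lock.lock();
    try {
      return region.get(typeId); // may still be null if the type is truly unknown
    } finally {
      lock.unlock();
    }
  }
}
```

The fast path stays lock-free, so the extra cost is only paid on a miss, which should be rare once a type has been distributed.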

> Temporary failure with "Unable to determine PDXType" using WAN
> --------------------------------------------------------------
>                 Key: GEODE-4285
>                 URL:
>             Project: Geode
>          Issue Type: Bug
>          Components: serialization
>            Reporter: Dan Smith
>            Priority: Major
>              Labels: pull-request-available
> We tracked down a race condition in distributing PDX types to the remote side of a WAN.
> When using a parallel sender, all primaries on the sending side dispatch the same
> PDX type in parallel.
> On the receiving side, the first gateway batch will get a distributed lock in PeerTypeRegistration.addRemoteType:
> {code}
> if (!r.containsKey(typeId)) {
>   // This type could actually be for this distributed system,
>   // so we need to make sure the type is published while holding
>   // the distributed lock.
>   lock();
>   try {
>     r.putIfAbsent(typeId, type);
>   } finally {
>     unlock();
>   }
> }
> {code}
> However, the second gateway batch that is received will continue without getting the
> distributed lock, because r.containsKey() will return true.
> The second batch could have values that require this type. But without getting the lock,
> those values can reach members that need the type, potentially before the first batch has
> finished distributing the type.

This message was sent by Atlassian JIRA
