curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Scott Blum (JIRA)" <>
Subject [jira] [Commented] (CURATOR-294) Optimize TreeCache, fix possible concurrency issue
Date Thu, 28 Jan 2016 16:42:39 GMT


Scott Blum commented on CURATOR-294:


Looking at this closer, I see your point!  There is a potential race between an update thread
updating stat+data, and a reader thread accessing it, and it looks like we could make some
changes to the implementation to simplify and fix this without affecting the public API.

Here's a tentative plan:

1) Change the existing ChildData to make a hard data ref instead of an embedded atomic ref.
2) Create a package-protected subclass (ChildDataWithClear?) that leaves 'data' null and has
a dataRef atomic ref, move clearData() there.
3) PathChildrenCache would use the subclass.
4) Refactor TreeCache to combine the stat and data fields into a single atomic ref on ChildData
as you suggested.

And as you point out, if we cache ChildData objects in the TreeNodes, then we can avoid a
lot of new object construction on queries.

Sound good?

> Optimize TreeCache, fix possible concurrency issue
> --------------------------------------------------
>                 Key: CURATOR-294
>                 URL:
>             Project: Apache Curator
>          Issue Type: Improvement
>          Components: Recipes
>            Reporter: Nick Hill
>            Assignee: Scott Blum
> Hi, I have been looking at the TreeCache impl and have some questions.
> It doesn't look right to me that there's separate atomic refs for a node's data and stat.
It seems the stat in a ChildData object obtained from getCurrentData() might not correspond
to the data that it's with. This could be problematic when doing conditional state changes
given assumptions about that data.
> An obvious and simple solution to this would be to have a single AtomicReference<ChildData>
field instead, which would have the additional significant benefit of eliminating ChildData
obj creation on every cache access. PathChildrenCache works this way, but my understanding
was that TreeCache is intended to be a (more flexible) replacement.
> Furthermore I'd propose that the data field of ChildData be just a final byte[] instead
of an AtomicReference. This would avoid needing two volatile reads to get to the data, and
mean that "sharing" these (per above) is a bit safer. The ChildData byte[] AtomicReference
is only used by PathChildrenCache.clearDataBytes() (not currently used by TreeCache at all),
and that capability could be easily maintained by having PathChildrenCache use it's own simple
subclass of ChildData containing the atomic reference.
> If similar capability were to be added to TreeCache, I'd suggest it would be better to
just replace the node's ChildData object with a copy that has the byte[] field nulled out
(but same stat ref).
> I'm fairly new to the code so apologies if there's something I've missed/misunderstood!
But if there is agreement, I'd also be happy to prepare a PR.
> Regards,
> Nick

This message was sent by Atlassian JIRA

View raw message