curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Scott Blum <dragonsi...@gmail.com>
Subject Re: TreeCache question
Date Thu, 28 Jan 2016 16:38:04 GMT
Nick,

Looking at this closer, I see your point!  There is a potential race
between an update thread updating stat+data, and a reader thread accessing
it, and it looks like we could make some changes to the implementation to
simplify and fix this without affecting the public API.

Here's a tentative plan:

1) Change the existing ChildData to make a hard data ref instead of an
embedded atomic ref.
2) Create a package-protected subclass (ChildDataWithClear?) that leaves
'data' null and has a dataRef atomic ref, move clearData() there.
3) PathChildrenCache would use the subclass.
4) Refactor TreeCache to combine the stat and data fields into a single
atomic ref on ChildData as you suggested.

And as you point out, if we cache ChildData objects in the TreeNodes, then
we can avoid a lot of new object construction on queries.

Sound good?
Scott


On Thu, Jan 28, 2016 at 10:21 AM, Nick Hill <apache@nickhill.org> wrote:

> Thanks Cameron, Scott,
>
> I'm not aware of any
>> concurrency bugs in TreeCache right now, I don't think I relied on the
>> atomic refs for safety.
>>
>
> I guess I wasn't aware in general that atomic refs had a purpose other
> than this.
>
> It just seems that not guaranteeing consistency of the stat and data in
> the returned ChildData objects restricts the usefulness of the cache, since
> you would typically have to get the data/stat again directly before doing
> any conditional update to a znode.
>
> Ignoring the less-important suggestion for simplifying ChildData itself,
> wouldn't there still be significant value in a simple change to move the
> TreeNode data and stat atomic refs into a single ChildData atomic ref?
>
> Cheers,
> Nick
>
>
> Quoting Scott Blum <dragonsinth@gmail.com>:
>
> HI Nick,
>>
>> TreeCache came later, and literally the only reason for reusing ChildData
>> was to not create additional API surface area, and make for an easier
>> transition / drop-in replacement for NodeCache & PathChildrenCache.
>> Ordinarily, I'm a big fan of immutable objects.  I'm not aware of any
>> concurrency bugs in TreeCache right now, I don't think I relied on the
>> atomic refs for safety.
>>
>> The main issue with changing it would just be additional surface area or
>> breaking old client code.
>>
>> HTH!
>> Scott
>>
>>
>> On Wed, Jan 27, 2016 at 10:16 PM, Cameron McKenzie <
>> cammckenzie@apache.org>
>> wrote:
>>
>> hey Nick,
>>> Sounds like a reasonable suggestion to me. I'll let Scott chime in though
>>> as he wrote that code originally and may have had some reason for
>>> structuring it as he has.
>>> cheers
>>>
>>> On Thu, Jan 28, 2016 at 1:42 PM, Nick Hill <apache@nickhill.org> wrote:
>>>
>>> > Hi, I have been looking at the TreeCache impl and have some questions.
>>> >
>>> > It doesn't look right to me that there's separate atomic refs for a
>>> node's
>>> > data and stat. It seems the stat in a ChildData object obtained from
>>> > getCurrentData() might not correspond to the data that it's with. This
>>> > could be problematic when doing conditional state changes given
>>> assumptions
>>> > about that data.
>>> >
>>> > An obvious and simple solution to this would be to have a single
>>> > AtomicReference<ChildData> field instead, which would have the
>>> additional
>>> > significant benefit of eliminating ChildData obj creation on every
>>> cache
>>> > access. PathChildrenCache works this way, but my understanding was that
>>> > TreeCache is intended to be a (more flexible) replacement.
>>> >
>>> > Furthermore I'd propose that the data field of ChildData be just a
>>> final
>>> > byte[] instead of an AtomicReference. This would avoid needing two
>>> volatile
>>> > reads to get to the data, and mean that "sharing" these (per above) is
>>> a
>>> > bit safer. The ChildData byte[] AtomicReference is only used by
>>> > PathChildrenCache.clearDataBytes() (not currently used by TreeCache at
>>> > all), and that capability could be easily maintained by having
>>> > PathChildrenCache use it's own simple subclass of ChildData containing
>>> the
>>> > atomic reference.
>>> > If similar capability were to be added to TreeCache, I'd suggest it
>>> would
>>> > be better to just replace the node's ChildData object with a copy that
>>> has
>>> > the byte[] field nulled out (but same stat ref).
>>> >
>>> > I'm fairly new to the code so apologies if there's something I've
>>> > missed/misunderstood! But if there is agreement, I'd also be happy to
>>> > prepare a PR.
>>> >
>>> > Regards,
>>> > Nick
>>> >
>>>
>>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message