curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nick Hill <apa...@nickhill.org>
Subject Re: TreeCache question
Date Thu, 28 Jan 2016 19:18:34 GMT
Sounds great to me!

Thanks!!

Quoting Scott Blum <dragonsinth@gmail.com>:

> https://issues.apache.org/jira/browse/CURATOR-294
>
> On Thu, Jan 28, 2016 at 11:38 AM, Scott Blum <dragonsinth@gmail.com> wrote:
>
>> Nick,
>>
>> Looking at this closer, I see your point!  There is a potential race
>> between an update thread updating stat+data, and a reader thread accessing
>> it, and it looks like we could make some changes to the implementation to
>> simplify and fix this without affecting the public API.
>>
>> Here's a tentative plan:
>>
>> 1) Change the existing ChildData to make a hard data ref instead of an
>> embedded atomic ref.
>> 2) Create a package-protected subclass (ChildDataWithClear?) that leaves
>> 'data' null and has a dataRef atomic ref, move clearData() there.
>> 3) PathChildrenCache would use the subclass.
>> 4) Refactor TreeCache to combine the stat and data fields into a single
>> atomic ref on ChildData as you suggested.
>>
>> And as you point out, if we cache ChildData objects in the TreeNodes, then
>> we can avoid a lot of new object construction on queries.
>>
>> Sound good?
>> Scott
>>
>>
>> On Thu, Jan 28, 2016 at 10:21 AM, Nick Hill <apache@nickhill.org> wrote:
>>
>>> Thanks Cameron, Scott,
>>>
>>> I'm not aware of any
>>>> concurrency bugs in TreeCache right now, I don't think I relied on the
>>>> atomic refs for safety.
>>>>
>>>
>>> I guess I wasn't aware in general that atomic refs had a purpose other
>>> than this.
>>>
>>> It just seems that not guaranteeing consistency of the stat and data in
>>> the returned ChildData objects restricts the usefulness of the cache, since
>>> you would typically have to get the data/stat again directly before doing
>>> any conditional update to a znode.
>>>
>>> Ignoring the less-important suggestion for simplifying ChildData itself,
>>> wouldn't there still be significant value in a simple change to move the
>>> TreeNode data and stat atomic refs into a single ChildData atomic ref?
>>>
>>> Cheers,
>>> Nick
>>>
>>>
>>> Quoting Scott Blum <dragonsinth@gmail.com>:
>>>
>>> HI Nick,
>>>>
>>>> TreeCache came later, and literally the only reason for reusing ChildData
>>>> was to not create additional API surface area, and make for an easier
>>>> transition / drop-in replacement for NodeCache & PathChildrenCache.
>>>> Ordinarily, I'm a big fan of immutable objects.  I'm not aware of any
>>>> concurrency bugs in TreeCache right now, I don't think I relied on the
>>>> atomic refs for safety.
>>>>
>>>> The main issue with changing it would just be additional surface area or
>>>> breaking old client code.
>>>>
>>>> HTH!
>>>> Scott
>>>>
>>>>
>>>> On Wed, Jan 27, 2016 at 10:16 PM, Cameron McKenzie <
>>>> cammckenzie@apache.org>
>>>> wrote:
>>>>
>>>> hey Nick,
>>>>> Sounds like a reasonable suggestion to me. I'll let Scott chime in
>>>>> though
>>>>> as he wrote that code originally and may have had some reason for
>>>>> structuring it as he has.
>>>>> cheers
>>>>>
>>>>> On Thu, Jan 28, 2016 at 1:42 PM, Nick Hill <apache@nickhill.org>
wrote:
>>>>>
>>>>> > Hi, I have been looking at the TreeCache impl and have some questions.
>>>>> >
>>>>> > It doesn't look right to me that there's separate atomic refs for
a
>>>>> node's
>>>>> > data and stat. It seems the stat in a ChildData object obtained
from
>>>>> > getCurrentData() might not correspond to the data that it's with.
This
>>>>> > could be problematic when doing conditional state changes given
>>>>> assumptions
>>>>> > about that data.
>>>>> >
>>>>> > An obvious and simple solution to this would be to have a single
>>>>> > AtomicReference<ChildData> field instead, which would have
the
>>>>> additional
>>>>> > significant benefit of eliminating ChildData obj creation on every
>>>>> cache
>>>>> > access. PathChildrenCache works this way, but my understanding was
>>>>> that
>>>>> > TreeCache is intended to be a (more flexible) replacement.
>>>>> >
>>>>> > Furthermore I'd propose that the data field of ChildData be just
a
>>>>> final
>>>>> > byte[] instead of an AtomicReference. This would avoid needing two
>>>>> volatile
>>>>> > reads to get to the data, and mean that "sharing" these (per above)
>>>>> is a
>>>>> > bit safer. The ChildData byte[] AtomicReference is only used by
>>>>> > PathChildrenCache.clearDataBytes() (not currently used by TreeCache
at
>>>>> > all), and that capability could be easily maintained by having
>>>>> > PathChildrenCache use it's own simple subclass of ChildData containing
>>>>> the
>>>>> > atomic reference.
>>>>> > If similar capability were to be added to TreeCache, I'd suggest
it
>>>>> would
>>>>> > be better to just replace the node's ChildData object with a copy
that
>>>>> has
>>>>> > the byte[] field nulled out (but same stat ref).
>>>>> >
>>>>> > I'm fairly new to the code so apologies if there's something I've
>>>>> > missed/misunderstood! But if there is agreement, I'd also be happy
to
>>>>> > prepare a PR.
>>>>> >
>>>>> > Regards,
>>>>> > Nick
>>>>> >
>>>>>
>>>>>
>>>
>>>
>>



Mime
View raw message