asterixdb-dev mailing list archives

From Jianfeng Jia <jianfeng....@gmail.com>
Subject Re: Possible Race condition in the new UTF8String implementation
Date Wed, 11 Nov 2015 20:21:59 GMT
I agree that we should avoid using a shared instance; that’s why I removed the UTF8StringSerializerDeserializer.INSTANCE
object and always create a new object whenever one is requested.
The problem is that it was requested and created in the Factory object, then shared by the Thread
object.

Using IPrinterFactory should be a good solution. 
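
To make the Factory-versus-Thread distinction concrete, here is a minimal Java sketch. It does not use the real Hyracks/AsterixDB classes: ReusableBufferSerDe, RuntimeFactory, SharedInstanceFactory, and PerRuntimeFactory are hypothetical names made up for illustration, and the serializer is only a stand-in for any SerDer that keeps a reusable byte[] as mutable state. The sketch only shows which object ends up shared; it does not reproduce the race itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SerDeSharingSketch {

    /** Hypothetical stateful serializer: it reuses one scratch buffer between calls. */
    static final class ReusableBufferSerDe {
        // Mutable state. If two threads call serialize() on the same instance,
        // they can overwrite each other's bytes in this buffer.
        private byte[] scratch = new byte[16];

        byte[] serialize(String s) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            if (scratch.length < utf8.length) {
                scratch = new byte[utf8.length];
            }
            System.arraycopy(utf8, 0, scratch, 0, utf8.length);
            return Arrays.copyOf(scratch, utf8.length);
        }

        String deserialize(byte[] bytes) {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    /** Hypothetical factory; each call models creating the runtime for one thread. */
    interface RuntimeFactory {
        ReusableBufferSerDe createRuntimeSerDe();
    }

    /** Problem pattern: the SerDe is created once in the factory and handed to every thread. */
    static final class SharedInstanceFactory implements RuntimeFactory {
        private final ReusableBufferSerDe shared = new ReusableBufferSerDe();

        @Override
        public ReusableBufferSerDe createRuntimeSerDe() {
            return shared;
        }
    }

    /** Safer pattern: each runtime (thread) gets its own SerDe and its own buffer. */
    static final class PerRuntimeFactory implements RuntimeFactory {
        @Override
        public ReusableBufferSerDe createRuntimeSerDe() {
            return new ReusableBufferSerDe();
        }
    }

    public static void main(String[] args) {
        RuntimeFactory shared = new SharedInstanceFactory();
        RuntimeFactory perRuntime = new PerRuntimeFactory();
        System.out.println("shared factory reuses one instance: "
                + (shared.createRuntimeSerDe() == shared.createRuntimeSerDe()));
        System.out.println("per-runtime factory reuses one instance: "
                + (perRuntime.createRuntimeSerDe() == perRuntime.createRuntimeSerDe()));
    }
}
```

With the second pattern the scratch buffer is never visible to more than one thread, so no locking is needed; the cost is one buffer per runtime instead of one per process.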


> On Nov 11, 2015, at 12:03 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
> 
> Jianfeng,
> 
>    I think we'd better not use any ISerializerDeserializer instance
> for complex
> objects like Records and Ordered/UnorderedLists in the runtime code ---
> they have race conditions.  Instead, we should always use IPrinterFactory
> instances to print things at runtime.
> 
> Best,
> Yingyi
> 
> On Wed, Nov 11, 2015 at 11:54 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
> 
>> Here are my findings and thoughts.
>> I think I’ve checked all the direct uses of UTF8SerDer. However, I
>> missed some indirect static/shared uses of UTF8SerDer.
>> 
>> One big suspect is the RecordDescriptor, which holds the
>> ISerializerDeserializers and is always passed into the Factory
>> method and shared by the ThreadMethod (usually a NodePushable).
>> E.g., in the ResultWriterOperatorDescriptor, the outRecordDesc is passed
>> to the createPushRuntime() factory method to create the “resultSerializer”,
>> and it is shared by the thread object
>> AbstractUnaryInputSinkOperatorNodePushable. This pushable object
>> gets the deserializer directly from the shared
>> recordDescriptor.getFields()[i]. That explains ASTERIXDB-1164.
>> 
>> I guess in your case there must be some deserializers provided by a shared
>> RecordDescriptor. That leads to a race condition if any
>> UTF8StringSerDer is involved.
>> 
>> Given that the SerDers are stored in the shared RecordDescriptor, I think
>> the original design intended all SerDers to be thread-safe. There may
>> also be other data structures that store SerDers and are passed/used in
>> the same way. So I propose rolling the UTF8SerDer back to the
>> stateless version (at the expense of creating an intermediate buffer array
>> per record).
>> 
>> Any opinions?
>> 
>> 
>>> On Nov 11, 2015, at 10:54 AM, abdullah alamoudi <bamousaa@gmail.com>
>> wrote:
>>> 
>>> That was my first thought and so I changed it. The issue is still there.
>>> I am also using the UTF8StringSerializerDeserializer to deserialize the
>>> strings and they always serialize it correctly.
>>> 
>>> I am thinking maybe it is related to the UTF8StringPointable but I am not
>>> sure how that could be.
>>> I am looking at this as well,
>>> Abdullah.
>>> 
>>> Amoudi, Abdullah.
>>> 
>>> On Wed, Nov 11, 2015 at 8:05 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>> wrote:
>>> 
>>>> The possible race condition could be that the
>>>> UTF8StringSerializerDeserializer is no longer a singleton. It
>>>> was implemented to reuse the byte[] used to serialize/deserialize the
>>>> string object. Let me look into this issue.
>>>> 
>>>>> On Nov 11, 2015, at 8:37 AM, abdullah alamoudi <bamousaa@gmail.com>
>>>> wrote:
>>>>> 
>>>>> Highly probable.
>>>>> Please, let's fix this soon.
>>>>> 
>>>>> Amoudi, Abdullah.
>>>>> 
>>>>> On Wed, Nov 11, 2015 at 7:32 PM, Till Westmann <tillw@apache.org>
>> wrote:
>>>>> 
>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1164
>>>>>> might be related.
>>>>>> 
>>>>>> Cheers,
>>>>>> Till
>>>>>> 
>>>>>> On 11 Nov 2015, at 8:25, abdullah alamoudi wrote:
>>>>>> 
>>>>>>> Hi all,
>>>>>>> I am having a hard time figuring this out. Here are the symptoms I am
>>>>>>> seeing, in case anyone has an idea what this could be.
>>>>>>> 
>>>>>>> I have a feed running that ingests data into a dataset. Sporadically, I
>>>>>>> get duplicate key exception errors (the key is of a string type), and I
>>>>>>> am 100% sure that I don't have duplicate records.
>>>>>>> 
>>>>>>> Moreover, I am printing the content of the frames about to be inserted
>>>>>>> into the primary index, and there are no duplicate records.
>>>>>>> 
>>>>>>> There are three reasons why I suspect the String implementation:
>>>>>>> 1. It is a fairly recent change.
>>>>>>> 2. When I run on a single node, or run one thread at a time, I never get
>>>>>>> this exception.
>>>>>>> 3. The key is a String.
>>>>>>> 
>>>>>>> I have looked at the change trying to figure out where a race condition
>>>>>>> might take place, but it is well hidden (if it exists at all).
>>>>>>> 
>>>>>>> Let me know if you have seen something similar.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Abdullah.
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> Best,
>>>> 
>>>> Jianfeng Jia
>>>> PhD Candidate of Computer Science
>>>> University of California, Irvine
>>>> 
>>>> 
>> 
>> 
>> 
>> Best,
>> 
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>> 
>> 



Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine

