asterixdb-dev mailing list archives

From Jianfeng Jia <jianfeng....@gmail.com>
Subject Re: Possible Race condition in the new UTF8String implementation
Date Thu, 12 Nov 2015 00:24:26 GMT
You are correct. I was looking at an implementation of ResultWriterOperator that is
used only in the hyracks tests. The normal ResultWriter uses the IPrinterFactory, as Yingyi
said.

There must be a different cause for this race condition. Let me check 1164 again.

> On Nov 11, 2015, at 12:34 PM, abdullah alamoudi <bamousaa@gmail.com> wrote:
> 
> I definitely rebuilt both and I am 100% sure that the
> AsterixIndexInsertDeleteNodePushable has the bug. Where? Not sure, but most
> likely hidden somewhere in the storage layer.
> 
> Tomorrow, I am going to check each of the components in that operator one by
> one until I can isolate the source of the bug.
> 
> Cheers,
> Abdullah.
> 
> Amoudi, Abdullah.
> 
> On Wed, Nov 11, 2015 at 11:27 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
> 
>> Then those would be two different issues.
>> Just want to make sure that you’ve rebuilt hyracks (not only
>> asterixdb) before testing your code, because those changes are in hyracks.
>> And could you send the logical plan and the hyracks job so that we can locate
>> which hyracks operators are involved?
>> 
>>> On Nov 11, 2015, at 12:10 PM, abdullah alamoudi <bamousaa@gmail.com> wrote:
>>> 
>>> That was my first thought as I said but I am 100% sure the issue is not in
>>> the SerDe. To confirm this, I removed the reader and writer from the serde
>>> and created a new instance of reader/writer in every call to serialize or
>>> deserialize just to determine if the problem is gone.
>>> 
>>> The problem didn't go away and I still had the same issue. That is why I
>>> know for sure it is not the SerDe.
>>> 
>>> Don't waste any more time in that direction.
>>> ~Abdullah.
>>> 
>>> Amoudi, Abdullah.
>>> 
>>> On Wed, Nov 11, 2015 at 10:54 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>> wrote:
>>> 
>>>> Here are my findings and thoughts.
>>>> I think I’ve checked all the direct uses of UTF8SerDer. However, I
>>>> missed some indirect static/shared uses of UTF8SerDer.
>>>> 
>>>> One big suspect is the RecordDescriptor, which holds the
>>>> ISerializerDeserializers and is always passed into the factory
>>>> method and shared by the per-thread objects (usually a NodePushable).
>>>> E.g., in the ResultWriterOperatorDescriptor, the outRecordDesc is passed
>>>> to the createPushRuntime() factory method to create the “resultSerializer”,
>>>> and it is shared by the thread object
>>>> AbstractUnaryInputSinkOperatorNodePushable. This pushable object
>>>> gets the deserializer directly from the shared
>>>> recordDescriptor.getFields()[i]. That would explain issue 1164.
>>>> 
>>>> I guess in your case there must be some deserializers supplied by a shared
>>>> RecordDescriptor. That leads to a race condition if any
>>>> UTF8StringSerDer is involved.
>>>> 
>>>> Given that the SerDers are stored in the shared RecordDescriptor, I think
>>>> the original design was to make all the SerDers thread-safe. And there
>>>> may be other data structures that store the SerDers and are passed/used in
>>>> the same way. So I’d have to propose rolling the UTF8SerDer back to the
>>>> stateless version (at the expense of creating an intermediate buffer array
>>>> per record).
>>>> 
>>>> Any opinions?
>>>> 
>>>> 
>>>>> On Nov 11, 2015, at 10:54 AM, abdullah alamoudi <bamousaa@gmail.com> wrote:
>>>>> 
>>>>> That was my first thought and so I changed it. The issue is still there.
>>>>> I am also using the UTF8StringSerializerDeserializer to deserialize the
>>>>> strings and they always serialize it correctly.
>>>>> 
>>>>> I am thinking maybe it is related to the UTF8StringPointable but I am not
>>>>> sure how that could be.
>>>>> I am looking at this as well,
>>>>> Abdullah.
>>>>> 
>>>>> Amoudi, Abdullah.
>>>>> 
>>>>> On Wed, Nov 11, 2015 at 8:05 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> The possible race condition could be that the
>>>>>> UTF8StringSerializerDeserializer is no longer a singleton. It
>>>>>> was implemented to reuse the byte[] used to serialize/deserialize the string
>>>>>> object. Let me look into this issue.
>>>>>> 
>>>>>>> On Nov 11, 2015, at 8:37 AM, abdullah alamoudi <bamousaa@gmail.com> wrote:
>>>>>>> 
>>>>>>> Highly probable.
>>>>>>> Please, let's fix this soon.
>>>>>>> 
>>>>>>> Amoudi, Abdullah.
>>>>>>> 
>>>>>>> On Wed, Nov 11, 2015 at 7:32 PM, Till Westmann <tillw@apache.org> wrote:
>>>>>>> 
>>>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1164
>>>>>>>> might be related.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Till
>>>>>>>> 
>>>>>>>> On 11 Nov 2015, at 8:25, abdullah alamoudi wrote:
>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> I am having a hard time figuring this out. Here are the symptoms I am
>>>>>>>>> seeing in case one has an idea what this could be.
>>>>>>>>> 
>>>>>>>>> I have a feed running ingesting data into a dataset. Sporadically, I get
>>>>>>>>> duplicate key exception errors (the key is of a string type) and I am 100%
>>>>>>>>> sure that I don't have duplicate records.
>>>>>>>>> 
>>>>>>>>> Moreover, I am printing the content of the frames about to be inserted into
>>>>>>>>> the primary index and there are no duplicate records.
>>>>>>>>> 
>>>>>>>>> There are three reasons why I am suspecting the String implementation:
>>>>>>>>> 1. It is a fairly recent change.
>>>>>>>>> 2. When I run on a single node, or run one thread at a time, I never get
>>>>>>>>> this exception.
>>>>>>>>> 3. The key is a String.
>>>>>>>>> 
>>>>>>>>> I have looked at the change trying to figure out where a race condition
>>>>>>>>> might take place but it is well hidden (if it is true at all).
>>>>>>>>> 
>>>>>>>>> Let me know if you have seen something similar.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Abdullah.
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Best,
>>>>>> 
>>>>>> Jianfeng Jia
>>>>>> PhD Candidate of Computer Science
>>>>>> University of California, Irvine
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> Best,
>>>> 
>>>> Jianfeng Jia
>>>> PhD Candidate of Computer Science
>>>> University of California, Irvine
>>>> 
>>>> 
>> 
>> 
>> 
>> Best,
>> 
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>> 
>> 



Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine
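
To illustrate the hypothesis discussed in the quoted thread (a SerDe that reuses an internal byte[] while a single instance is shared across threads through a RecordDescriptor), here is a minimal Java sketch. It does not use the real Hyracks classes: StatefulSerDe, StatelessSerDe, and the stage/flush split are hypothetical names invented for this example, and the interleaving that two threads could produce is replayed on one thread so the outcome is deterministic. The thread itself did not confirm this as the cause of the duplicate-key errors.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SharedSerDeSketch {

    /** Hypothetical stateful serializer: reuses one scratch buffer across calls. */
    static final class StatefulSerDe {
        private byte[] scratch = new byte[16];
        private int length;

        // Step 1: copy the encoded string into the reused scratch buffer.
        void stage(String s) {
            byte[] encoded = s.getBytes(StandardCharsets.UTF_8);
            if (encoded.length > scratch.length) {
                scratch = new byte[encoded.length];
            }
            System.arraycopy(encoded, 0, scratch, 0, encoded.length);
            length = encoded.length;
        }

        // Step 2: emit whatever is currently staged in the scratch buffer.
        byte[] flush() {
            return Arrays.copyOf(scratch, length);
        }
    }

    /** Hypothetical stateless variant: allocates per call, so nothing is shared. */
    static final class StatelessSerDe {
        byte[] serialize(String s) {
            return s.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Replays, on one thread, an interleaving two threads could produce. */
    static String interleavedOnShared(String a, String b) {
        StatefulSerDe shared = new StatefulSerDe();
        shared.stage(a); // "thread A" stages its key
        shared.stage(b); // "thread B" stages its key before A flushes
        return new String(shared.flush(), StandardCharsets.UTF_8); // A's flush
    }

    /** Same call order with the stateless variant: A's bytes are unaffected by B. */
    static String viaStateless(String a, String b) {
        StatelessSerDe serde = new StatelessSerDe();
        byte[] forA = serde.serialize(a);
        serde.serialize(b);
        return new String(forA, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(interleavedOnShared("key-1", "key-2"));
        System.out.println(viaStateless("key-1", "key-2"));
    }
}
```

With the shared stateful instance, "thread A" ends up emitting the other thread's key, which is the kind of corruption that could surface as a spurious duplicate-key error. The stateless variant avoids this at the cost of one allocation per record, the trade-off named in the rollback proposal.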

