asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Till Westmann" <ti...@apache.org>
Subject Re: Possible Race condition in the new UTF8String implementation
Date Thu, 12 Nov 2015 16:43:39 GMT
Is the order in which the records are inserted in the test deterministic 
or does it change?

Thanks,
Till


On 12 Nov 2015, at 4:40, abdullah alamoudi wrote:

> Another update. I found a way to trigger the bug deterministically. It 
> has
> nothing to do with objects being shared between multiple threads.
> the bug shows up when records are inserted into the index in a 
> specific
> order and doesn't show up otherwise.
>
> Will be back with more updates.
>
> Amoudi, Abdullah.
>
> On Thu, Nov 12, 2015 at 10:52 AM, abdullah alamoudi 
> <bamousaa@gmail.com>
> wrote:
>
>> So here is some updates about this:
>>
>> I have added some debug statement so that whenever the value of a
>> UTF8StringPointable is reset, it prints the following:
>> 1. Its own object Id
>> 2. whether the value changed from the previous value
>> 3. if the value was changed, it prints the change
>>
>> Here is some output when doing inserts
>>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 WARNIN: UTF8StringPointble: prev =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332> NOT equal 
>> current
>> = submit-5.chtc.wisc.edu#3946939.19#1438705343
>> <http://submit-5.chtc.wisc.edu#3946939.19%231438705343>
>> My Id = 967207753 WARNIN: UTF8StringPointble: prev =
>> submit-5.chtc.wisc.edu#3946939.19#1438705343
>> <http://submit-5.chtc.wisc.edu#3946939.19%231438705343> NOT equal 
>> current
>> = submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946933.27#1438705332
>> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>>
>> As can be seen, for some reason, the value of this pointable changed
>> briefly and immediately after that, I got the duplicate key 
>> exception. The
>> same thing can be seen in some other places too:
>>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 WARNIN: UTF8StringPointble: prev =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345> NOT equal 
>> current
>> = submit-5.chtc.wisc.edu#3946946.79#1438705351
>> <http://submit-5.chtc.wisc.edu#3946946.79%231438705351>
>> My Id = 1829631708 WARNIN: UTF8StringPointble: prev =
>> submit-5.chtc.wisc.edu#3946946.79#1438705351
>> <http://submit-5.chtc.wisc.edu#3946946.79%231438705351> NOT equal 
>> current
>> = submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
>> submit-5.chtc.wisc.edu#3946943.44#1438705345
>> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>>
>> Not sure if I am on the right track but this is more promising than 
>> the
>> SerDe. will be back with more updates soon.
>> ~Abdullah.
>>
>> Amoudi, Abdullah.
>>
>> On Thu, Nov 12, 2015 at 3:24 AM, Jianfeng Jia 
>> <jianfeng.jia@gmail.com>
>> wrote:
>>
>>> You are correct. I was looking at one implementation of
>>> ResultWriterOperator which was only used in hyracks test. The normal
>>> ResultWriter is using the IPrinterFactory as what Yingyi said.
>>>
>>> There must be a different reason for this racing condition. Let me 
>>> check
>>> 1164 again.
>>>
>>>> On Nov 11, 2015, at 12:34 PM, abdullah alamoudi 
>>>> <bamousaa@gmail.com>
>>> wrote:
>>>>
>>>> I definitely re-built both and I am 100% sure that the
>>>> AsterixIndexInsertDeleteNodePushable has the bug. where? not sure 
>>>> but
>>> most
>>>> likely hidden somewhere in the storage layer.
>>>>
>>>> Tomorrow, I am going to check each of the components in that 
>>>> operator 1
>>> by
>>>> 1 until I can isolate the source of the bug.
>>>>
>>>> Cheers,
>>>> Abdullah.
>>>>
>>>> Amoudi, Abdullah.
>>>>
>>>> On Wed, Nov 11, 2015 at 11:27 PM, Jianfeng Jia 
>>>> <jianfeng.jia@gmail.com>
>>>> wrote:
>>>>
>>>>> Then that will be two different issues.
>>>>> Just want to make sure that you’ve rebuilt the hyracks (not only
>>>>> asterixdb) before test your code, cause those changes are in 
>>>>> hyracks.
>>>>> And could you send the logic plan and the hyrack job so that we 
>>>>> can
>>> lock
>>>>> which hyracks operators that get involved?
>>>>>
>>>>>> On Nov 11, 2015, at 12:10 PM, abdullah alamoudi 
>>>>>> <bamousaa@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> That was my first thought as I said but I am 100% sure the issue

>>>>>> is
>>> not
>>>>> in
>>>>>> the SerDe. To confirm this, I removed the reader and writer from

>>>>>> the
>>>>> serde
>>>>>> and created a new instance of reader/writer in every call to
>>> serialize or
>>>>>> deserialize just to determine if the problem is gone.
>>>>>>
>>>>>> The problem didn't go away and I still had the same issue. That 
>>>>>> is
>>> why I
>>>>>> know for sure it is not the SerDe.
>>>>>>
>>>>>> Don't waste any more time in that direction.
>>>>>> ~Abdullah.
>>>>>>
>>>>>> Amoudi, Abdullah.
>>>>>>
>>>>>> On Wed, Nov 11, 2015 at 10:54 PM, Jianfeng Jia <
>>> jianfeng.jia@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Here is my finding and thoughts.
>>>>>>> I think I’ve checked all the direct use case of UTF8SerDer.

>>>>>>> However,
>>> I
>>>>>>> missed some indirect static/shared use case of UTF8SerDer.
>>>>>>>
>>>>>>> One big suspect is the RecordDescriptor which has the
>>>>>>> ISerializerDeserializers inside and is always passed into the

>>>>>>> Factory
>>>>>>> method and shared by the ThreadMethod (usually NodePushable).
>>>>>>> E.g., in the ResultWriterOperatorDescriptor, the outRecordDesc

>>>>>>> is
>>> passed
>>>>>>> to the createPushRuntime() factory method to create the
>>>>> “resultSerializer”,
>>>>>>> and it is shared by the thread object
>>>>>>> AbstractUnaryInputSinkOperatorNodePushable. This pushable object

>>>>>>> will
>>>>>>> directly get the deserializer from the shared
>>>>>>> recordDescpitor.getFields()[i]. It explains the issue-1164.
>>>>>>>
>>>>>>> I guess in your case there must be some deserializers given by

>>>>>>> shared
>>>>>>> RecordDescriptor. Then it will get into the racing condition
if 
>>>>>>> there
>>>>> are
>>>>>>> some UTF8StringSerDer involved.
>>>>>>>
>>>>>>> Given that the SerDers are stored in the shared 
>>>>>>> RecordDescriptor, I
>>>>> think
>>>>>>> the very initial design was to make the all the SerDers 
>>>>>>> thread-safe.
>>>>> And it
>>>>>>> maybe some other data structures stores the SerDers and are
>>> passed/used
>>>>> in
>>>>>>> a same way. Then I’d have to propose to roll back the 
>>>>>>> UTF8SerDer into
>>>>> the
>>>>>>> state-less version (at the expense of creating intermediate 
>>>>>>> buffer
>>> array
>>>>>>> per record).
>>>>>>>
>>>>>>> Any opinions?
>>>>>>>
>>>>>>>
>>>>>>>> On Nov 11, 2015, at 10:54 AM, abdullah alamoudi 
>>>>>>>> <bamousaa@gmail.com
>>>>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> That was my first thought and so I changed it. The issue
is 
>>>>>>>> still
>>>>> there.
>>>>>>>> I am also using the UTF8StringSerializerDeserializer to 
>>>>>>>> deserialize
>>> the
>>>>>>>> strings and they always serialize it correctly.
>>>>>>>>
>>>>>>>> I am thinking maybe it is related to the UTF8StringPointable

>>>>>>>> but I
>>> am
>>>>> not
>>>>>>>> sure how that could be.
>>>>>>>> I am looking at this as well,
>>>>>>>> Abdullah.
>>>>>>>>
>>>>>>>> Amoudi, Abdullah.
>>>>>>>>
>>>>>>>> On Wed, Nov 11, 2015 at 8:05 PM, Jianfeng Jia <
>>> jianfeng.jia@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The possible racing condition could be that the
>>>>>>>>> UTF8StringSerializerDeserializer now is not a singleton
method 
>>>>>>>>> any
>>>>>>> more. It
>>>>>>>>> was implemented to reuse the byte[] that serialize/deserialize

>>>>>>>>> the
>>>>>>> string
>>>>>>>>> object. Let me look into this issue.
>>>>>>>>>
>>>>>>>>>> On Nov 11, 2015, at 8:37 AM, abdullah alamoudi <
>>> bamousaa@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Highly probable.
>>>>>>>>>> Please, let's fix this soon.
>>>>>>>>>>
>>>>>>>>>> Amoudi, Abdullah.
>>>>>>>>>>
>>>>>>>>>> On Wed, Nov 11, 2015 at 7:32 PM, Till Westmann 
>>>>>>>>>> <tillw@apache.org>
>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1164
>>>>>>>>>>> might be related.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Till
>>>>>>>>>>>
>>>>>>>>>>> On 11 Nov 2015, at 8:25, abdullah alamoudi wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>> I am having a hard time figuring this out.
Here are the
>>> symptoms I
>>>>> am
>>>>>>>>>>>> seeing in case one has an idea what this
could be.
>>>>>>>>>>>>
>>>>>>>>>>>> I have a feed running ingesting data into
a dataset.
>>> sporadically,
>>>>> I
>>>>>>>>> get
>>>>>>>>>>>> duplicate key exception errors (The key is
of a string 
>>>>>>>>>>>> type)
>>> and I
>>>>> am
>>>>>>>>>>> 100%
>>>>>>>>>>>> sure that I don't have duplicate records.
>>>>>>>>>>>>
>>>>>>>>>>>> Moreover, I am printing the content of the
frames about to 
>>>>>>>>>>>> be
>>>>>>> inserted
>>>>>>>>>>> into
>>>>>>>>>>>> the primary index and there are no duplicate
records.
>>>>>>>>>>>>
>>>>>>>>>>>> There are three reasons why I am suspecting
the String
>>>>>>> implementation:
>>>>>>>>>>>> 1. It is fairly recent change.
>>>>>>>>>>>> 2. When I run on a single node, or run one
thread at a 
>>>>>>>>>>>> time, I
>>>>> never
>>>>>>>>> get
>>>>>>>>>>>> this exception.
>>>>>>>>>>>> 3. the key is a String.
>>>>>>>>>>>>
>>>>>>>>>>>> I have looked at the change trying to figure
out where a 
>>>>>>>>>>>> race
>>>>>>> condition
>>>>>>>>>>>> might take place but it is well hidden (if
it is true at 
>>>>>>>>>>>> all.).
>>>>>>>>>>>>
>>>>>>>>>>>> Let me know if you have seen something similar.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Abdullah.
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Jianfeng Jia
>>>>>>>>> PhD Candidate of Computer Science
>>>>>>>>> University of California, Irvine
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Jianfeng Jia
>>>>>>> PhD Candidate of Computer Science
>>>>>>> University of California, Irvine
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Jianfeng Jia
>>>>> PhD Candidate of Computer Science
>>>>> University of California, Irvine
>>>>>
>>>>>
>>>
>>>
>>>
>>> Best,
>>>
>>> Jianfeng Jia
>>> PhD Candidate of Computer Science
>>> University of California, Irvine
>>>
>>>
>>

Mime
View raw message