asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From abdullah alamoudi <bamou...@gmail.com>
Subject Re: Possible Race condition in the new UTF8String implementation
Date Thu, 12 Nov 2015 12:40:08 GMT
Another update. I found a way to trigger the bug deterministically. It has
nothing to do with objects being shared between multiple threads.
the bug shows up when records are inserted into the index in a specific
order and doesn't show up otherwise.

Will be back with more updates.

Amoudi, Abdullah.

On Thu, Nov 12, 2015 at 10:52 AM, abdullah alamoudi <bamousaa@gmail.com>
wrote:

> So here is some updates about this:
>
> I have added some debug statement so that whenever the value of a
> UTF8StringPointable is reset, it prints the following:
> 1. Its own object Id
> 2. whether the value changed from the previous value
> 3. if the value was changed, it prints the change
>
> Here is some output when doing inserts
>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 WARNIN: UTF8StringPointble: prev =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332> NOT equal current
> = submit-5.chtc.wisc.edu#3946939.19#1438705343
> <http://submit-5.chtc.wisc.edu#3946939.19%231438705343>
> My Id = 967207753 WARNIN: UTF8StringPointble: prev =
> submit-5.chtc.wisc.edu#3946939.19#1438705343
> <http://submit-5.chtc.wisc.edu#3946939.19%231438705343> NOT equal current
> = submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
> My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946933.27#1438705332
> <http://submit-5.chtc.wisc.edu#3946933.27%231438705332>
>
> As can be seen, for some reason, the value of this pointable changed
> briefly and immediately after that, I got the duplicate key exception. The
> same thing can be seen in some other places too:
>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 WARNIN: UTF8StringPointble: prev =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345> NOT equal current
> = submit-5.chtc.wisc.edu#3946946.79#1438705351
> <http://submit-5.chtc.wisc.edu#3946946.79%231438705351>
> My Id = 1829631708 WARNIN: UTF8StringPointble: prev =
> submit-5.chtc.wisc.edu#3946946.79#1438705351
> <http://submit-5.chtc.wisc.edu#3946946.79%231438705351> NOT equal current
> = submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
> My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
> submit-5.chtc.wisc.edu#3946943.44#1438705345
> <http://submit-5.chtc.wisc.edu#3946943.44%231438705345>
>
> Not sure if I am on the right track but this is more promising than the
> SerDe. will be back with more updates soon.
> ~Abdullah.
>
> Amoudi, Abdullah.
>
> On Thu, Nov 12, 2015 at 3:24 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
>
>> You are correct. I was looking at one implementation of
>> ResultWriterOperator which was only used in hyracks test. The normal
>> ResultWriter is using the IPrinterFactory as what Yingyi said.
>>
>> There must be a different reason for this racing condition. Let me check
>> 1164 again.
>>
>> > On Nov 11, 2015, at 12:34 PM, abdullah alamoudi <bamousaa@gmail.com>
>> wrote:
>> >
>> > I definitely re-built both and I am 100% sure that the
>> > AsterixIndexInsertDeleteNodePushable has the bug. where? not sure but
>> most
>> > likely hidden somewhere in the storage layer.
>> >
>> > Tomorrow, I am going to check each of the components in that operator 1
>> by
>> > 1 until I can isolate the source of the bug.
>> >
>> > Cheers,
>> > Abdullah.
>> >
>> > Amoudi, Abdullah.
>> >
>> > On Wed, Nov 11, 2015 at 11:27 PM, Jianfeng Jia <jianfeng.jia@gmail.com>
>> > wrote:
>> >
>> >> Then that will be two different issues.
>> >> Just want to make sure that you’ve rebuilt the hyracks (not only
>> >> asterixdb) before test your code, cause those changes are in hyracks.
>> >> And could you send the logic plan and the hyrack job so that we can
>> lock
>> >> which hyracks operators that get involved?
>> >>
>> >>> On Nov 11, 2015, at 12:10 PM, abdullah alamoudi <bamousaa@gmail.com>
>> >> wrote:
>> >>>
>> >>> That was my first thought as I said but I am 100% sure the issue is
>> not
>> >> in
>> >>> the SerDe. To confirm this, I removed the reader and writer from the
>> >> serde
>> >>> and created a new instance of reader/writer in every call to
>> serialize or
>> >>> deserialize just to determine if the problem is gone.
>> >>>
>> >>> The problem didn't go away and I still had the same issue. That is
>> why I
>> >>> know for sure it is not the SerDe.
>> >>>
>> >>> Don't waste any more time in that direction.
>> >>> ~Abdullah.
>> >>>
>> >>> Amoudi, Abdullah.
>> >>>
>> >>> On Wed, Nov 11, 2015 at 10:54 PM, Jianfeng Jia <
>> jianfeng.jia@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Here is my finding and thoughts.
>> >>>> I think I’ve checked all the direct use case of UTF8SerDer. However,
>> I
>> >>>> missed some indirect static/shared use case of UTF8SerDer.
>> >>>>
>> >>>> One big suspect is the RecordDescriptor which has the
>> >>>> ISerializerDeserializers inside and is always passed into the Factory
>> >>>> method and shared by the ThreadMethod (usually NodePushable).
>> >>>> E.g., in the ResultWriterOperatorDescriptor, the outRecordDesc is
>> passed
>> >>>> to the createPushRuntime() factory method to create the
>> >> “resultSerializer”,
>> >>>> and it is shared by the thread object
>> >>>> AbstractUnaryInputSinkOperatorNodePushable. This pushable object
will
>> >>>> directly get the deserializer from the shared
>> >>>> recordDescpitor.getFields()[i]. It explains the issue-1164.
>> >>>>
>> >>>> I guess in your case there must be some deserializers given by shared
>> >>>> RecordDescriptor. Then it will get into the racing condition if
there
>> >> are
>> >>>> some UTF8StringSerDer involved.
>> >>>>
>> >>>> Given that the SerDers are stored in the shared RecordDescriptor,
I
>> >> think
>> >>>> the very initial design was to make the all the SerDers thread-safe.
>> >> And it
>> >>>> maybe some other data structures stores the SerDers and are
>> passed/used
>> >> in
>> >>>> a same way. Then I’d have to propose to roll back the UTF8SerDer
into
>> >> the
>> >>>> state-less version (at the expense of creating intermediate buffer
>> array
>> >>>> per record).
>> >>>>
>> >>>> Any opinions?
>> >>>>
>> >>>>
>> >>>>> On Nov 11, 2015, at 10:54 AM, abdullah alamoudi <bamousaa@gmail.com
>> >
>> >>>> wrote:
>> >>>>>
>> >>>>> That was my first thought and so I changed it. The issue is
still
>> >> there.
>> >>>>> I am also using the UTF8StringSerializerDeserializer to deserialize
>> the
>> >>>>> strings and they always serialize it correctly.
>> >>>>>
>> >>>>> I am thinking maybe it is related to the UTF8StringPointable
but I
>> am
>> >> not
>> >>>>> sure how that could be.
>> >>>>> I am looking at this as well,
>> >>>>> Abdullah.
>> >>>>>
>> >>>>> Amoudi, Abdullah.
>> >>>>>
>> >>>>> On Wed, Nov 11, 2015 at 8:05 PM, Jianfeng Jia <
>> jianfeng.jia@gmail.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> The possible racing condition could be that the
>> >>>>>> UTF8StringSerializerDeserializer now is not a singleton
method any
>> >>>> more. It
>> >>>>>> was implemented to reuse the byte[] that serialize/deserialize
the
>> >>>> string
>> >>>>>> object. Let me look into this issue.
>> >>>>>>
>> >>>>>>> On Nov 11, 2015, at 8:37 AM, abdullah alamoudi <
>> bamousaa@gmail.com>
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>> Highly probable.
>> >>>>>>> Please, let's fix this soon.
>> >>>>>>>
>> >>>>>>> Amoudi, Abdullah.
>> >>>>>>>
>> >>>>>>> On Wed, Nov 11, 2015 at 7:32 PM, Till Westmann <tillw@apache.org>
>> >>>> wrote:
>> >>>>>>>
>> >>>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1164
>> >>>>>>>> might be related.
>> >>>>>>>>
>> >>>>>>>> Cheers,
>> >>>>>>>> Till
>> >>>>>>>>
>> >>>>>>>> On 11 Nov 2015, at 8:25, abdullah alamoudi wrote:
>> >>>>>>>>
>> >>>>>>>>> Hi all,
>> >>>>>>>>> I am having a hard time figuring this out. Here
are the
>> symptoms I
>> >> am
>> >>>>>>>>> seeing in case one has an idea what this could
be.
>> >>>>>>>>>
>> >>>>>>>>> I have a feed running ingesting data into a
dataset.
>> sporadically,
>> >> I
>> >>>>>> get
>> >>>>>>>>> duplicate key exception errors (The key is of
a string type)
>> and I
>> >> am
>> >>>>>>>> 100%
>> >>>>>>>>> sure that I don't have duplicate records.
>> >>>>>>>>>
>> >>>>>>>>> Moreover, I am printing the content of the frames
about to be
>> >>>> inserted
>> >>>>>>>> into
>> >>>>>>>>> the primary index and there are no duplicate
records.
>> >>>>>>>>>
>> >>>>>>>>> There are three reasons why I am suspecting
the String
>> >>>> implementation:
>> >>>>>>>>> 1. It is fairly recent change.
>> >>>>>>>>> 2. When I run on a single node, or run one thread
at a time, I
>> >> never
>> >>>>>> get
>> >>>>>>>>> this exception.
>> >>>>>>>>> 3. the key is a String.
>> >>>>>>>>>
>> >>>>>>>>> I have looked at the change trying to figure
out where a race
>> >>>> condition
>> >>>>>>>>> might take place but it is well hidden (if it
is true at all.).
>> >>>>>>>>>
>> >>>>>>>>> Let me know if you have seen something similar.
>> >>>>>>>>>
>> >>>>>>>>> Cheers,
>> >>>>>>>>> Abdullah.
>> >>>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> Best,
>> >>>>>>
>> >>>>>> Jianfeng Jia
>> >>>>>> PhD Candidate of Computer Science
>> >>>>>> University of California, Irvine
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> Best,
>> >>>>
>> >>>> Jianfeng Jia
>> >>>> PhD Candidate of Computer Science
>> >>>> University of California, Irvine
>> >>>>
>> >>>>
>> >>
>> >>
>> >>
>> >> Best,
>> >>
>> >> Jianfeng Jia
>> >> PhD Candidate of Computer Science
>> >> University of California, Irvine
>> >>
>> >>
>>
>>
>>
>> Best,
>>
>> Jianfeng Jia
>> PhD Candidate of Computer Science
>> University of California, Irvine
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message