asterixdb-dev mailing list archives

From Ildar Absalyamov <ildar.absalya...@gmail.com>
Subject Re: Severe issue relating to indexing on nullable fields within a closed type?
Date Thu, 11 Jun 2015 00:39:29 GMT
Just a bit of background: the bug is not specific to RTree; it will be triggered by any index
created on a nullable field.
In order to persist a datatype with a nullable field, a special internal type named 'Type_xxx_UnionType_yyy'
must be created on the MD node side.
However, the stale version of the datatype (where the name of the nullable field's type is not yet
computed) is returned to the client, where it gets cached.

Clearly, relying on an argument that is supposedly passed by reference getting updated
is wrong when things are sent via RMI (and Ian's patch resolves that).
However, I think a better way would be to invalidate the locally cached stale copy, so that when
the actual index is created it will query the MD and get the right Datatype.
That is in line with the question Niki (our visiting student from Germany) asked me today,
for which I did not have a clear answer: how do we manage the local MD cache and invalidate
it?
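
To make the two options concrete, here is a minimal sketch of the failure mode and of the
invalidate-then-reread alternative. It is not AsterixDB code: Datatype, Node, marshal,
staleInsert and invalidatingInsert are hypothetical stand-ins, the "_0" suffix in the generated
name is a placeholder, and RMI is only simulated by copying the argument through Java
serialization (which is how RMI passes non-remote objects).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class StaleCacheSketch {
    /** Hypothetical datatype whose union type name is filled in late. */
    static class Datatype implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        String unionTypeName; // stays null until the node computes it

        Datatype(String name) {
            this.name = name;
        }
    }

    /** Hypothetical metadata node: mutates its argument, then stores it. */
    static class Node {
        private final Map<String, Datatype> stored = new HashMap<>();

        void addDatatype(Datatype dt) {
            dt.unionTypeName = "Type_" + dt.name + "_UnionType_0"; // placeholder suffix
            stored.put(dt.name, dt);
        }

        Datatype getDatatype(String name) {
            return stored.get(name);
        }
    }

    /** Simulates the RMI boundary: the receiver only ever sees a serialized copy. */
    static Datatype marshal(Datatype dt) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(dt);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Datatype) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Buggy pattern: cache the caller's object, assuming the node updated it in place. */
    static String staleInsert(Node node, Map<String, Datatype> cache, Datatype dt) {
        node.addDatatype(marshal(dt)); // the node mutates its own copy
        cache.put(dt.name, dt);        // the caller's copy never got the name
        return cache.get(dt.name).unionTypeName;
    }

    /** Alternative: drop the cache entry and read through to the node on the next lookup. */
    static String invalidatingInsert(Node node, Map<String, Datatype> cache, Datatype dt) {
        node.addDatatype(marshal(dt));
        cache.remove(dt.name);
        Datatype cached = cache.get(dt.name);
        if (cached == null) {
            cached = marshal(node.getDatatype(dt.name)); // fresh copy from the node
            cache.put(dt.name, cached);
        }
        return cached.unionTypeName;
    }

    public static void main(String[] args) {
        Datatype dt = new Datatype("FacebookMessageType");
        System.out.println("stale: " + staleInsert(new Node(), new HashMap<>(), dt));
        System.out.println("fresh: " + invalidatingInsert(new Node(), new HashMap<>(), dt));
    }
}
```

With both sides in one JVM and no marshal step, the first variant would appear to work because
caller and node share the same object, which may be why the integration test setup does not
trip over it.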

> On Jun 10, 2015, at 02:43, Ian Maxon <imaxon@uci.edu> wrote:
> 
> Indeed, it is weird practice at best. The real index is an RTree index on
> FacebookMessages.sender-location, which is nullable, and the code path to
> create that index just happens to trigger this.
> 
> On Tue, Jun 9, 2015 at 11:35 PM, Mike Carey <dtabass@gmail.com> wrote:
> 
>> This sounds like a bad practice (doing any mods on the MD node - we should
>> view the call as passing in immutable data).
>> But I'm not sure which index is being referred to here - what index on
>> what dataset is the "actual index" below?
>> 
>> 
>> 
>> On 6/9/15 5:11 PM, Ian Maxon wrote:
>> 
>>> So I think I might have a fix for this:
>>> https://asterix-gerrit.ics.uci.edu/#/c/283/1
>>> 
>>> This bug is really insidious. What I think is happening is that in
>>> AqlTranslator, the types are computed and given to the MetadataManager as a
>>> new Datatype, to insert into the metadata node via RMI on MetadataNode.
>>> However MetadataNode modifies the object it is given before it inserts it
>>> into the actual index. That is where the names of the nullable types get
>>> generated, and so the MetadataManager doesn't get to see those changes
>>> unless MetadataManager and MetadataNode are actually sharing the same
>>> object. So if it is stale, that object gets inserted into the cache, and
>>> when the index is created, it sees the stale type information rather than
>>> the true value that's stored in the index.
>>> 
>>> My fix is just to read it again before inserting into the cache. I'm not
>>> really sure that's the right thing to do though. And if there are any other
>>> instances of this type (caller expecting changes on remote side to
>>> reflect), the same sort of behavior can happen.
>>> 
>>> - Ian
>>> 
>>> On Tue, Jun 9, 2015 at 10:12 AM, Ian Maxon <imaxon@uci.edu> wrote:
>>> 
>>>> My quack theory is that it is due to the RMI somehow, or that the RMI
>>>> reveals it. It isn't specific to managix. It happens as long as the CC and
>>>> NC reside in different JVMs.
>>>> 
>>>> - Ian
>>>> 
>>>> On Tue, Jun 9, 2015 at 10:03 AM, Ildar Absalyamov <
>>>> ildar.absalyamov@gmail.com> wrote:
>>>> 
>>>>> As Ian mentioned, a test case already exists (which I thought was good
>>>>> enough to prevent regressions like this one), but for some reason it's not
>>>>> triggering the error, whereas if the same DDL/query is executed on a local
>>>>> cluster prepared by managix, the error does trigger.
>>>>> 
>>>>> On Jun 9, 2015, at 08:07, Mike Carey <dtabass@gmail.com> wrote:
>>>>>> 
>>>>>> Ildar, can you try to nail this one down and also add a test case if you
>>>>>> can create one once this is making more sense...?
>>>>>> On Jun 8, 2015 11:58 PM, "Ildar Absalyamov" <ildar.absalyamov@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Interesting, I guess step 1 here would be to investigate what's different
>>>>>>> between the test framework/AsterixHyracksIntegrationUtil setup and the
>>>>>>> regular cluster configuration created by managix.
>>>>>>> 
>>>>>>> On Jun 8, 2015, at 23:37, Ian Maxon <imaxon@uci.edu> wrote:
>>>>>>>> 
>>>>>>>> I was playing around with the latest snapshot version, trying to create a
>>>>>>>> fresher version of my demo docker container, and so I ran through the
>>>>>>>> ADM/AQL 101. However I couldn't get it to work, which was really surprising
>>>>>>>> (and distressing). The error appears after trying to create an RTree index
>>>>>>>> on sender-location:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> SEVERE: java.lang.NullPointerException
>>>>>>>>> edu.uci.ics.asterix.metadata.MetadataException: java.lang.NullPointerException
>>>>>>>>>     at edu.uci.ics.asterix.metadata.MetadataNode.addIndex(MetadataNode.java:229)
>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:497)
>>>>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
>>>>>>>>>     at sun.rmi.transport.Transport$1.run(Transport.java:200)
>>>>>>>>>     at sun.rmi.transport.Transport$1.run(Transport.java:197)
>>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>     at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$97(TCPTransport.java:683)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$2/1502895914.run(Unknown Source)
>>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>     at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:276)
>>>>>>>>>     at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:253)
>>>>>>>>>     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:162)
>>>>>>>>>     at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:194)
>>>>>>>>>     at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:148)
>>>>>>>>>     at com.sun.proxy.$Proxy11.addIndex(Unknown Source)
>>>>>>>>>     at edu.uci.ics.asterix.metadata.MetadataManager.addIndex(MetadataManager.java:417)
>>>>>>>>>     at edu.uci.ics.asterix.aql.translator.AqlTranslator.handleCreateIndexStatement(AqlTranslator.java:920)
>>>>>>>>>     at edu.uci.ics.asterix.aql.translator.AqlTranslator.compileAndExecute(AqlTranslator.java:261)
>>>>>>>>>     at edu.uci.ics.asterix.api.http.servlet.APIServlet.doPost(APIServlet.java:97)
>>>>>>>>>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:754)
>>>>>>>>>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
>>>>>>>>>     at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:546)
>>>>>>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:483)
>>>>>>>>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>>>>>>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:970)
>>>>>>>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:411)
>>>>>>>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>>>>>>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:904)
>>>>>>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>>>>>>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
>>>>>>>>>     at org.eclipse.jetty.server.Server.handle(Server.java:347)
>>>>>>>>>     at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:439)
>>>>>>>>>     at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:924)
>>>>>>>>>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:781)
>>>>>>>>>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:220)
>>>>>>>>>     at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:43)
>>>>>>>>>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:545)
>>>>>>>>>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:43)
>>>>>>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
>>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>>>     at java.io.DataOutputStream.writeUTF(DataOutputStream.java:347)
>>>>>>>>>     at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
>>>>>>>>>     at edu.uci.ics.hyracks.dataflow.common.data.marshalling.UTF8StringSerializerDeserializer.serialize(UTF8StringSerializerDeserializer.java:44)
>>>>>>>>>     at edu.uci.ics.asterix.dataflow.data.nontagged.serde.AStringSerializerDeserializer.serialize(AStringSerializerDeserializer.java:47)
>>>>>>>>>     at edu.uci.ics.asterix.dataflow.data.nontagged.serde.AStringSerializerDeserializer.serialize(AStringSerializerDeserializer.java:26)
>>>>>>>>>     at edu.uci.ics.asterix.formats.nontagged.AqlSerializerDeserializerProvider$1.serialize(AqlSerializerDeserializerProvider.java:208)
>>>>>>>>>     at edu.uci.ics.asterix.formats.nontagged.AqlSerializerDeserializerProvider$1.serialize(AqlSerializerDeserializerProvider.java:189)
>>>>>>>>>     at edu.uci.ics.asterix.metadata.entitytupletranslators.IndexTupleTranslator.getTupleFromMetadataEntity(IndexTupleTranslator.java:260)
>>>>>>>>>     at edu.uci.ics.asterix.metadata.MetadataNode.addIndex(MetadataNode.java:224)
>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:497)
>>>>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
>>>>>>>>>     at sun.rmi.transport.Transport$1.run(Transport.java:200)
>>>>>>>>>     at sun.rmi.transport.Transport$1.run(Transport.java:197)
>>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>     at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$97(TCPTransport.java:683)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$2/1502895914.run(Unknown Source)
>>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>     ... 1 more
>>>>>>>> 
>>>>>>>> I started going back, trying this on older and older revisions until it
>>>>>>>> stopped happening. It looks like this somehow has been lurking in the
>>>>>>>> Open/nested indexing code for a long time. The error does not reproduce in
>>>>>>>> the testing framework, for some reason I don't understand.
>>>>>>>> 
>>>>>>>> The source of the bug seems to be that the RTree index is on a type of
>>>>>>>> UNION(sender-location, null). The name of the type is not set like it
>>>>>>>> should be, it is null. Then in IndexTupleTranslator, the 9th field of the
>>>>>>>> new metadata record is written with the name of the type- which is null,
>>>>>>>> so everything explodes.
>>>>>>>> 
>>>>>>>> What is especially puzzling- is that this doesn't reproduce in
>>>>>>>> AsterixHyracksIntegrationUtil. There, the name of the type is
>>>>>>>> "Field_sender-location_in_FacebookMessageType". This type is created too in
>>>>>>>> the deployed version, however it doesn't seem to be picked up.
>>>>>>>> 
>>>>>>>> I will keep digging on this, but I imagine someone more familiar with the
>>>>>>>> Open/Nested indexing patch might have an idea about why this is happening.
>>>>>>>> 
>>>>>>>> To reproduce just try deploying the snapshot version from the website
>>>>>>>> normally on your machine, and run through ADM/AQL 101.
>>>>>>>> 
>>>>>>>> - Ian
>>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Ildar
>>>>>>> 
>>>>>>> 
>>>>> Best regards,
>>>>> Ildar
>>>>> 
>>>>> 
>>>>> 
>> 

Best regards,
Ildar

