arrow-user mailing list archives

From John Peterson <donttrytocontact...@gmail.com>
Subject Re: [Java] Flight fails to handle list vector
Date Sat, 20 Feb 2021 03:54:30 GMT
Thanks for the explanation, Jacques. You're correct, the issue must be in
the appender. If I take that out and read the list data from Flight
directly, the data is in the VectorSchemaRoot.

Is this something a bug should be opened for? Or is it possible I'm
invoking the appender incorrectly? Based on your explanation, I'm wondering
if I need to pre-allocate the target VectorSchemaRoot so that it has enough
space to hold the data of any VectorSchemaRoots that I want to join. That
seems to be the case in the unit test for the individual vector appender:
https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/util/TestVectorAppender.java#L149
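
For anyone following along, the failure mode described in the explanation
quoted below can be reproduced in miniature without Arrow. This is only an
analogy under stated assumptions: java.nio.ByteBuffer stands in for ArrowBuf,
and the names EmptyBufferSketch, firstOffset and failsOnRead are made up for
illustration. It shows that a 4-byte read at index 0 fails on a zero-length
buffer and succeeds once the offsets are actually backed by memory.

```java
import java.nio.ByteBuffer;

class EmptyBufferSketch {
    // Stand-in for a shared zero-length "empty buffer".
    // Assumption: this is java.nio.ByteBuffer, not Arrow's ArrowBuf.
    static final ByteBuffer EMPTY = ByteBuffer.allocate(0);

    // Reads the 4-byte start offset of the first list element, i.e. bytes [0..4).
    static int firstOffset(ByteBuffer offsets) {
        return offsets.getInt(0);
    }

    // Returns true if reading the first offset is rejected as out of bounds.
    static boolean failsOnRead(ByteBuffer offsets) {
        try {
            firstOffset(offsets);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Unallocated "vector": zero bytes, so reading [0..4) is out of bounds.
        System.out.println("empty buffer fails: " + failsOnRead(EMPTY));

        // "Vector" holding one list: two offsets (start and end), 8 bytes total.
        ByteBuffer offsets = ByteBuffer.allocate(2 * Integer.BYTES);
        offsets.putInt(0, 0).putInt(4, 1);
        System.out.println("allocated buffer fails: " + failsOnRead(offsets));
    }
}
```

The sketch says nothing about whether the appender or my calling code is at
fault; it only illustrates why an unallocated offsets buffer would produce an
"index: 0, length: 4 (expected: range(0, 0))" style error.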

On Tue, Feb 16, 2021 at 12:31 PM Jacques Nadeau <jacques@apache.org> wrote:

> Hey John, a brief review of your code makes me wonder if the problem may
> be associated with VectorSchemaRootAppender. Can you try your test without
> that? Basically, once you get a batch of data back, inspect it to see that
> you have your values. VectorSchemaRootAppender is new code that I haven't
> reviewed and I'm wondering if it isn't handling reference counting
> correctly.
>
> The exception you're seeing is most frequently associated with what could
> be thought of as an NPE for the memory backing a vector. Vectors are like
> containers: the design was built so that a vector has batches streamed
> through it. When no buffer is available, rather than setting the buffer to
> null, we set it to the empty buffer (which is of zero length). If you try
> to do something with the vector when it is empty, you get this exception.
> In this case, my guess is you are trying to read the start offset for the
> first item in a list, e.g. the first four bytes [0..4) of the vector, but
> the vector is only 0 bytes in length (thus the exception).
>
> On Mon, Feb 15, 2021 at 7:21 PM John Peterson <
> donttrytocontactme2@gmail.com> wrote:
>
>> Appreciate the help Jacques. Unfortunately calling setPosition(0) on the
>> writer for the list did not solve it.
>>
>> I put the entirety of the code up on pastebin so it should be an easy
>> copy/paste if anybody else wants to try to reproduce it. I suppose it could
>> also be a bug in VectorAppender, but again I'm not sure if the error is in
>> my code or in Arrow.
>>
>> https://pastebin.com/vwvnYY40
>>
>> Thanks in advance.
>>
>>
>> On Mon, Feb 15, 2021 at 1:33 PM Jacques Nadeau <jacques@apache.org>
>> wrote:
>>
>>> I think you need to call setPosition(0) before you start writing the
>>> list. (This is from memory when I wrote the code 6-7 years ago so I may be
>>> off.)
>>>
>>> On Sun, Feb 14, 2021 at 6:20 PM John Peterson <
>>> donttrytocontactme2@gmail.com> wrote:
>>>
>>>> Hi Bryan,
>>>>
>>>> This is the stacktrace I get:
>>>>
>>>> java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
>>>> at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
>>>> at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
>>>> at org.apache.arrow.memory.ArrowBuf.getInt(ArrowBuf.java:424)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:97)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:45)
>>>> at org.apache.arrow.vector.BaseVariableWidthVector.accept(BaseVariableWidthVector.java:1402)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:233)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:45)
>>>> at org.apache.arrow.vector.complex.ListVector.accept(ListVector.java:449)
>>>> at org.apache.arrow.vector.util.VectorSchemaRootAppender.append(VectorSchemaRootAppender.java:67)
>>>> at org.apache.arrow.vector.util.VectorSchemaRootAppender.append(VectorSchemaRootAppender.java:81)
>>>>
>>>> Thanks for your help.
>>>>
>>>> On Thu, Jan 14, 2021 at 2:23 PM Bryan Cutler <cutlerb@gmail.com> wrote:
>>>>
>>>>> Hi John, could you include the error with stacktrace?
>>>>>
>>>>> On Sat, Jan 9, 2021 at 9:34 PM John Peterson <
>>>>> donttrytocontactme2@gmail.com> wrote:
>>>>>
>>>>>> I believe I'm running into a bug with Flight but I'd like to confirm
>>>>>> and get some advice on a potential fix. I'm not sure where to look or
>>>>>> what could be causing it.
>>>>>>
>>>>>> The code in question simply uploads a one-element List<Integer> to
>>>>>> the example server, fetches it from the server, and attempts to append
>>>>>> the data from the server to a new VectorSchemaRoot. It fails in the
>>>>>> same way regardless of whether or not I construct a VectorSchemaRoot
>>>>>> instance.
>>>>>>
>>>>>> Likewise, the data from the server can't be written out with the JSON
>>>>>> writer, it'll fail in the same way. However, changing the data from a
>>>>>> ListVector to an IntVector causes it to succeed.
>>>>>>
>>>>>> Any help would be appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> John
>>>>>>
>>>>>> Code in question:
>>>>>> // Set up the server and client
>>>>>> BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
>>>>>> Location l = Location.forGrpcInsecure(FlightTestUtil.LOCALHOST, 12233);
>>>>>> ExampleFlightServer server = new ExampleFlightServer(allocator, l);
>>>>>> server.start();
>>>>>> FlightClient client = FlightClient.builder(allocator, l).build();
>>>>>>
>>>>>> // Write a one-element List<Integer>
>>>>>> ListVector listVector = ListVector.empty("list", allocator);
>>>>>> UnionListWriter writer = listVector.getWriter();
>>>>>> writer.startList();
>>>>>> writer.integer().writeInt(1);
>>>>>> writer.endList();
>>>>>> writer.setValueCount(1);
>>>>>>
>>>>>> // Send that data to the server
>>>>>> VectorSchemaRoot root = VectorSchemaRoot.of(listVector);
>>>>>> ClientStreamListener listener =
>>>>>>     client.startPut(FlightDescriptor.path("test"), root, new AsyncPutListener());
>>>>>> root.setRowCount(1);
>>>>>> listener.putNext();
>>>>>> root.clear();
>>>>>> listener.completed();
>>>>>>
>>>>>> // wait for ack to avoid memory leaks.
>>>>>> listener.getResult();
>>>>>>
>>>>>> // Attempt to read it back
>>>>>> FlightInfo info = client.getInfo(FlightDescriptor.path("test"));
>>>>>> try (final FlightStream stream =
>>>>>>     client.getStream(info.getEndpoints().get(0).getTicket())) {
>>>>>>   VectorSchemaRoot newRoot = stream.getRoot();
>>>>>>   while (stream.next()) {
>>>>>>     // Copying into an entirely new VectorSchemaRoot fails
>>>>>>     try {
>>>>>>       ListVector newList = ListVector.empty("list", allocator);
>>>>>>       newList.addOrGetVector(FieldType.nullable(Types.MinorType.INT.getType()));
>>>>>>       VectorSchemaRoot copyRoot = VectorSchemaRoot.of(newList);
>>>>>>       VectorSchemaRootAppender.append(copyRoot, newRoot);
>>>>>>     } catch (IndexOutOfBoundsException e) {
>>>>>>       System.err.println("Expected IOOBE caught");
>>>>>>     }
>>>>>>
>>>>>>     // The same is true if we try to copy the data from the server
>>>>>>     // to our VectorSchemaRoot
>>>>>>     try {
>>>>>>       VectorSchemaRootAppender.append(root, newRoot);
>>>>>>     } catch (IndexOutOfBoundsException e) {
>>>>>>       System.err.println("Expected IOOBE caught again");
>>>>>>       throw e;
>>>>>>     }
>>>>>>
>>>>>>     root.clear();
>>>>>>     newRoot.clear();
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>
