Mailing-List: contact dev-help@arrow.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@arrow.apache.org
MIME-Version: 1.0
In-Reply-To: <CAJPUwMCMf7+COJz-qivgNEeTXWTsAu_jvhtXYx+1uFes31cK-Q@mail.gmail.com>
References: <ca583b35-f376-fc9a-54e0-16ae6e949ffe@ccri.com>
 <e9c906ba-9165-52eb-22cb-3f4eaf4be062@ccri.com> <CAJPUwMAvph4yg+VZAHSH8F7P9DAP0=qv97F416Vm=1EW-FOAvg@mail.gmail.com>
 <6c8bd004-da62-ef41-5060-7e7606da5c11@ccri.com> <cae75e11-ea34-fd34-fa4c-62cd8d9a7d72@ccri.com>
 <CAGY9duXpYOdkLdhLuYyT71VkbsSZuW3OOU1bZ4zfZgO4HN94cQ@mail.gmail.com>
 <6fd670c9-4648-beae-a4cc-97459a09d3d5@ccri.com> <CAJPUwMCMf7+COJz-qivgNEeTXWTsAu_jvhtXYx+1uFes31cK-Q@mail.gmail.com>
From: Wes McKinney <wesmckinn@gmail.com>
Date: Tue, 8 Aug 2017 14:06:05 -0400
Message-ID: <CAJPUwMBY7hvag=4+V0y9Tm_jwVGhBPSeqAC5eNaLOF5FQ6DRkQ@mail.gmail.com>
Subject: Re: buffer alignment (format/java/js)
To: dev@arrow.apache.org
Content-Type: text/plain; charset="UTF-8"
archived-at: Tue, 08 Aug 2017 18:06:58 -0000

I opened https://issues.apache.org/jira/browse/ARROW-1343. Let's try
to resolve this today so it can make the 0.6.0 release?

On Tue, Aug 8, 2017 at 2:03 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
> I'm happy to have a look at the branch / integration tests if you
> could put up a PR
>
> When you say "a single serialized record batch" you mean an
> encapsulated message, right (including length prefix and metadata)?
> Using the terminology from http://arrow.apache.org/docs/ipc.html. I
> guess the problem is that the total size of the Schema message at the
> start of the stream may not be a multiple of 8. We should totally fix
> this; I don't think it even constitutes a breakage of the format -- I
> am fairly sure with extra padding bytes between the schema and the
> first record batch (or first dictionary) that the stream will be
> backwards compatible.
>
> However, we should document in
> https://github.com/apache/arrow/blob/master/format/IPC.md that message
> sizes are expected to be a multiple of 8. We should also take a look
> at the File format implementation to ensure that padding is inserted
> after the magic number at the start of the file
>
> - Wes
>
> On Tue, Aug 8, 2017 at 1:32 PM, Emilio Lahr-Vivaz <elahrvivaz@ccri.com> wrote:
>> Sure, the workflow is a little complicated, but we have the following code
>> running in distributed databases (as accumulo iterators and hbase
>> coprocessors). They process data rows and transform them into arrow records,
>> then periodically write out record batches:
>>
>> https://github.com/locationtech/geomesa/blob/master/geomesa-index-api/src/main/scala/org/locationtech/geomesa/index/iterators/ArrowBatchScan.scala#L79-L105
>>
>> https://github.com/locationtech/geomesa/blob/master/geomesa-arrow/geomesa-arrow-gt/src/main/scala/org/locationtech/geomesa/arrow/io/records/RecordBatchUnloader.scala#L20
>>
>> The record batches come back to a single client and are concatenated with a
>> file header and footer (then wrapped in SimpleFeature objects, as we
>> implement a geotools data store):
>>
>> https://github.com/locationtech/geomesa/blob/master/geomesa-index-api/src/main/scala/org/locationtech/geomesa/index/iterators/ArrowBatchScan.scala#L265-L268
>>
>> The resulting bytes are written out as an arrow streaming file that we parse
>> with the arrow-js libraries in the browser.
>>
>> Thanks,
>>
>> Emilio
>>
>>
>> On 08/08/2017 01:24 PM, Li Jin wrote:
>>>
>>> Hi Emilio,
>>>
>>>> So I think the issue is that we are serializing record batches in a
>>>
>>> distributed fashion, and then > concatenating them in the streaming
>>> format.
>>>
>>> Can you show the code for this?
>>>
>>> On Tue, Aug 8, 2017 at 12:35 PM, Emilio Lahr-Vivaz <elahrvivaz@ccri.com>
>>> wrote:
>>>
>>>> So I think the issue is that we are serializing record batches in a
>>>> distributed fashion, and then concatenating them in the streaming format.
>>>> However, the message serialization only aligns the start of the buffers,
>>>> which requires it to know the current absolute offset of the output
>>>> stream.
>>>> Would there be any problem with padding the end of the message, so any
>>>> single serialized record batch would always be a multiple of 8 bytes?
>>>>
>>>> I've put together a branch that does this, and the existing java tests
>>>> all
>>>> pass. I'm having some trouble running the integration tests though.
>>>>
>>>> Thanks,
>>>>
>>>> Emilio
>>>>
>>>>
>>>> On 08/08/2017 09:18 AM, Emilio Lahr-Vivaz wrote:
>>>>
>>>>> Hi Wes,
>>>>>
>>>>> You're right, I just realized that. I think the alignment issue might be
>>>>> in some unrelated code, actually. From what I can tell the the arrow
>>>>> writers are aligning buffers correctly; if not I'll open a bug.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Emilio
>>>>>
>>>>> On 08/08/2017 09:15 AM, Wes McKinney wrote:
>>>>>
>>>>>> hi Emilio,
>>>>>>
>>>>>>   From your description, it isn't clear why 8-byte alignment is causing
>>>>>> a problem (as compare with 64-byte alignment). My understanding is
>>>>>> that JavaScript's TypedArray classes range in size from 1 to 8 bytes.
>>>>>>
>>>>>> The starting offset for all buffers should be 8-byte aligned, if not
>>>>>> that is a bug. Could you clarify?
>>>>>>
>>>>>> - Wes
>>>>>>
>>>>>> On Tue, Aug 8, 2017 at 8:52 AM, Emilio Lahr-Vivaz <elahrvivaz@ccri.com>
>>>>>> wrote:
>>>>>>
>>>>>>> After looking at it further, I think only the buffers themselves need
>>>>>>> to be
>>>>>>> aligned, not the metadata and/or schema. Would there be any problem
>>>>>>> with
>>>>>>> changing the alignment to 64 bytes then?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Emilio
>>>>>>>
>>>>>>>
>>>>>>> On 08/08/2017 08:08 AM, Emilio Lahr-Vivaz wrote:
>>>>>>>
>>>>>>>> I'm looking into buffer alignment in the java writer classes.
>>>>>>>> Currently
>>>>>>>> some files written with the java streaming writer can't be read due
>>>>>>>> to
>>>>>>>> the
>>>>>>>> javascript TypedArray's restriction that the start offset of the
>>>>>>>> array
>>>>>>>> must
>>>>>>>> be a multiple of the data size of the array type (i.e. Int32Vectors
>>>>>>>> must
>>>>>>>> start on a multiple of 4, Float64Vectors must start on a multiple of
>>>>>>>> 8,
>>>>>>>> etc). From a cursory look at the java writer, I believe that the
>>>>>>>> schema that
>>>>>>>> is written first is not aligned at all, and then each record batch
>>>>>>>> pads out
>>>>>>>> its size to a multiple of 8. So:
>>>>>>>>
>>>>>>>> 1. should the schema block pad itself so that the first record batch
>>>>>>>> is
>>>>>>>> aligned, and is there any problem with doing so?
>>>>>>>> 2. is there any problem with changing the alignment to 64 bytes, as
>>>>>>>> recommended (but not required) by the spec?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Emilio
>>>>>>>>
>>>>>>>
>>