arrow-dev mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: Is there plan to support BigEndian Systems like SUN SPARC Hardware ?
Date Sun, 07 Aug 2016 00:52:26 GMT
I suspect that relaxing the constraint to native endianness (and
including this in any IPC/RPC metadata, per ARROW-245) will not cause
too many problems. One of the challenges for us will be testing and
continuous integration -- what are the options for running the test
suite on a regular basis on big endian platforms? I know that in
pandas we occasionally ran into esoteric test failures for the PPC /
big-endian Debian package builds but for the most part there haven't
been any problems.

- Wes
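
The idea of recording endianness in the IPC/RPC metadata, so that
same-endianness transfers need no byte swapping and mixed-endianness
transfers are simply rejected at first, can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
one-byte tag, the class name EndiannessTag, and the message layout are
hypothetical and are not taken from the Arrow format or from ARROW-245.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: a message is one tag byte followed by 32-bit ints
// written in the producer's byte order. Not the actual Arrow metadata.
final class EndiannessTag {
    static final byte LITTLE = 0; // assumed tag value
    static final byte BIG = 1;    // assumed tag value

    static byte tagFor(ByteOrder order) {
        return order == ByteOrder.BIG_ENDIAN ? BIG : LITTLE;
    }

    /** Writes the tag byte, then the values in the given byte order. */
    static byte[] write(int[] values, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 * values.length).order(order);
        buf.put(tagFor(order));
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    /** Reads the values without swapping; a byte-order mismatch is rejected. */
    static int[] read(byte[] message, ByteOrder readerOrder) {
        if (message[0] != tagFor(readerOrder)) {
            throw new UnsupportedOperationException(
                "Producer and consumer byte order differ; swapping is not supported");
        }
        ByteBuffer buf = ByteBuffer.wrap(message, 1, message.length - 1).order(readerOrder);
        int[] out = new int[(message.length - 1) / 4];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getInt();
        }
        return out;
    }

    public static void main(String[] args) {
        ByteOrder host = ByteOrder.nativeOrder();
        int[] roundTrip = read(write(new int[] {1, 2, 3}, host), host);
        System.out.println(roundTrip.length); // same-order transfer succeeds
    }
}
```

With this shape, a reader on a big-endian host accepts big-endian
messages as they are, and the mixed case fails loudly instead of
silently producing wrong values.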

On Fri, Aug 5, 2016 at 4:39 AM, Sanjay Rao <getsanjayrao@live.com> wrote:
> Some places where there is an explicit check for little endian:
> ./memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java:    if (!NATIVE_ORDER || buf.order() != ByteOrder.BIG_ENDIAN) {
> ./memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java:      throw new IllegalStateException("Arrow only runs on LittleEndian systems.");
> Sanjay
>> From: pchandra@maprtech.com
>> Date: Thu, 4 Aug 2016 17:04:34 -0700
>> Subject: Re: Is there plan to support BigEndian Systems like SUN SPARC Hardware ?
>> To: dev@arrow.apache.org; emkornfield@gmail.com
>> CC: julien@dremio.com
>>
>> Drill's assumption of little endian is in the ValueVector code, and Arrow
>> has inherited the same assertion. (
>> https://github.com/apache/arrow/blob/master/java/memory/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java#L58
>> )
>>
>> In the Java implementation, the underlying Netty implementation handles the
>> conversion between endianness fairly well, so potentially this assert can
>> be removed from here and Drill can move this higher up in the Drill code.
>>
>>
>> Parth
>>
>> On Thu, Aug 4, 2016 at 1:14 PM, Micah Kornfield <emkornfield@gmail.com>
>> wrote:
>>
>> > Hi Julien,
>> > That's the theory.  I don't think that there is anything in the C++ code
>> > base that should break but we don't have access to hardware to verify that.
>> >
>> > The Java Arrow code currently asserts that it is running on a little endian
>> > machine.  I did a very quick scan of the Java code and didn't see anything
>> > there that would break on a big-endian system, but according to at least one
>> > person who is working on Drill, it seems that Drill assumes little
>> > endianness (I don't know if this is in Arrow/ValueVector code or it is
>> > higher up the stack in the Drill code).
>> >
>> > Thanks,
>> > Micah
>> >
>> >
>> > On Thu, Aug 4, 2016 at 11:36 AM, Julien Le Dem <julien@dremio.com> wrote:
>> >
>> > > So it sounds like right now it just works as long as there is no
>> > > inter-system communication (with different endianness) because both Java
>> > > and C++ code just use the underlying endianness.
>> > > Is that correct?
>> > >
>> > >
>> > > On Thu, Aug 4, 2016 at 11:17 AM, Micah Kornfield <emkornfield@gmail.com>
>> > > wrote:
>> > >
>> > >> Hi Sanjay,
>> > >> I think we are trying to work that out now.  As you've seen with some of
>> > >> your initial investigation, we have no coverage for big-endian machines
>> > >> yet.  But in the long run, we should be able to make it work (it seems like
>> > >> there might be some difference of opinion on how to make it work).
>> > >>
>> > >> Thanks,
>> > >> Micah
>> > >>
>> > >> On Mon, Aug 1, 2016 at 11:16 AM, Sanjay Rao <getsanjayrao@live.com>
>> > >> wrote:
>> > >>
>> > >> > Hi Wes, Hi Micah,
>> > >> > I understood what you meant, so for point 2, Arrow working from a big
>> > >> > endian machine to a big endian machine shouldn't be an issue, right?
>> > >> > Please confirm.
>> > >> > Thanks,
>> > >> > Sanjay
>> > >> > > From: wesmckinn@gmail.com
>> > >> > > Date: Mon, 1 Aug 2016 11:07:07 -0700
>> > >> > > Subject: Re: Is there plan to support BigEndian Systems like SUN SPARC Hardware ?
>> > >> > > To: dev@arrow.apache.org; emkornfield@gmail.com
>> > >> > >
>> > >> > > hey Micah,
>> > >> > >
>> > >> > > On Mon, Aug 1, 2016 at 11:02 AM, Micah Kornfield <emkornfield@gmail.com> wrote:
>> > >> > > > Hi Wes,
>> > >> > > > The point I was trying to argue from an earlier thread is that the
>> > >> > > > most common cases for relocation are:
>> > >> > > > 1.  Little endian machine to little endian machine (most likely
>> > >> > > >     same machine)
>> > >> > > > 2.  Big endian machine to big endian machine (most likely same
>> > >> > > >     machine)
>> > >> > > > 3.  Big endian machine to little endian machine or vice versa
>> > >> > > >
>> > >> > > > The purpose of the metadata would be to make use-cases 1 and 2
>> > >> > > > possible without byte-swapping.  Use case 3 would obviously require
>> > >> > > > byte swapping but for an initial implementation the code could simply
>> > >> > > > indicate that it is not supported.
>> > >> > > >
>> > >> > > > This seems less complex to me than actually implementing any sort of
>> > >> > > > byte-swapping logic while still supporting the widest variety of
>> > >> > > > hardware with the same code for the most common use-cases.
>> > >> > >
>> > >> > > This makes sense. My comments were for the situation that a big endian
>> > >> > > system would be exposing memory to an unknown consumer -- for example,
>> > >> > > if we implemented an RPC wire format for Arrow memory, then in general
>> > >> > > a big endian system would need to send little-endian integers to an
>> > >> > > arbitrary receiver. I'm not sure the best way to provide for easy
>> > >> > > native-endianness support for cases 1/2, but trying to fully solve
>> > >> > > this problem now seems premature until we've established some of these
>> > >> > > tools (so long as we haven't painted ourselves into a corner).
>> > >> > >
>> > >> > > - Wes
>> > >> > >
>> > >> > > >
>> > >> > > > Thanks,
>> > >> > > > Micah
>> > >> > > >
>> > >> > > > P.S. If anybody can provide pointers I'd be interested to understand
>> > >> > > > which pieces of the Java code make assumptions about little-endianness.
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Julien
>> > >
>> >
>
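
The RPC scenario discussed in the thread, where a big-endian system
sends little-endian integers to an arbitrary receiver, amounts to
fixing the wire byte order independently of the host. A minimal Java
sketch of that idea, using only java.nio, might look like the
following. The class name LittleEndianWire and the bare "array of
32-bit ints" payload are assumptions made for illustration and do not
describe any actual Arrow wire format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: always put little-endian bytes on the wire,
// whatever the host's native byte order is.
final class LittleEndianWire {
    /** Encodes the values as little-endian bytes on any host. */
    static byte[] encode(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            buf.putInt(v); // swapped by ByteBuffer only where the host requires it
        }
        return buf.array();
    }

    /** Decodes little-endian bytes back into ints on any host. */
    static int[] decode(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire).order(ByteOrder.LITTLE_ENDIAN);
        int[] out = new int[wire.length / 4];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getInt();
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] wire = encode(new int[] {1});
        System.out.println(wire[0]); // prints 1: least significant byte comes first
        System.out.println(Integer.reverseBytes(1) == 0x01000000); // prints true
    }
}
```

The cost of this approach is a swap per value on big-endian hosts,
which is the overhead that the native-endianness metadata for cases 1
and 2 is meant to avoid.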
