arrow-dev mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: char types in c and ipc in java
Date Fri, 10 Jun 2016 21:42:51 GMT
More design work is needed on the metadata specification -- I would
not use either codebase as a reference for doing a different
implementation until the specification / format documents are complete
enough to enable a "clean room" implementation.

I will leave Steven or Jacques from the Drill side to comment on ways
to jump into the Java code.

- Wes

On Fri, Jun 10, 2016 at 1:27 PM, Kiril Menshikov <kmenshikov@gmail.com> wrote:
> Hi Wes,
>
> What is the most complete Arrow version at the moment? I can see C++ and Python are the most
> active and Java was copied from Drill. So does this mean that we can use the C++ version as
> a reference?
>
> I also want to help you. I can do metadata read/write, if nobody is doing it.
>
> Thanks,
> -Kiril
>
>> On Jun 9, 2016, at 21:24, Wes McKinney <wesmckinn@gmail.com> wrote:
>>
>> Since we are at the "chicken" stage of the chicken-and-egg problem I
>> don't have straightforward guidance about how to proceed, other than
>> to dig into the current Java and / or C++ codebases and
>> help sort out what needs to be done. It may be beneficial on the
>> mailing list to discuss the incremental steps required to reach
>> working integration tests (I suspect there will be many JIRAs /
>> patches required to get there) -- defining these tasks (perhaps in a
>> shared Google document) and creating the associated JIRAs is a
>> valuable and necessary exercise.
>>
>> (Personally I've been most interested in "up stack" integration with
>> other projects like Apache Parquet as it relates to native code
>> consumers (e.g. Python libraries).)
>>
>> On the IPC side, you can look at the internal IPC round trip C++ tests:
>>
>> https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-adapter-test.cc
>>
>> On the Java side, an initial task would be to create a testing setup
>> that generates sample data and either sends and receives it through a
>> socket or memory map. There are other questions to analyze:
>>
>> - Schema negotiation
>> - Metadata read / write (see
>> https://github.com/apache/arrow/blob/master/format/Message.fbs)
>>
>> As discussed there are some inconsistencies between the reference
>> implementations that we will need to resolve before this work can
>> proceed to completion. The metadata (schemas and logical types, e.g.
>> what is in Message.fbs) are also in flux and will require a round of
>> iteration.
>>
>> Thanks,
>> Wes
>>
>> On Thu, Jun 9, 2016 at 8:32 AM, Nicole Nemer <Nicole.Nemer@rms.com> wrote:
>>> Hi Wes.
>>> Would love to help.  Just point me to the tests that need to be
>>> expanded/written and I will work on that today/tomorrow.
>>> Thanks!
>>> nn
>>> —
>>> Nicole Nemer, PhD
>>> Technical Architect/Dev Manager
>>>
>>> 303-641-3340
>>>
>>> On 6/8/16, 5:47 PM, "Wes McKinney" <wesmckinn@gmail.com> wrote:
>>>
>>>> hi Nicki
>>>>
>>>> Micah's patch for #1 is in progress here
>>>> https://github.com/apache/arrow/pull/85
>>>>
>>>> I believe Steven Phillips is working on a patch toward reconciling the
>>>> Java implementation with the current working version of the spec. We
>>>> need to be able to verify that memory can be passed between Java and
>>>> C++ with full fidelity (using files / memory maps as the exchange
>>>> medium to start); these integration tests will help enable other Arrow
>>>> implementations to validate their compatibility as well. It would be
>>>> great to have some additional help here.
>>>>
>>>> cheers
>>>> Wes
>>>>
>>>> On Thu, Jun 2, 2016 at 7:10 AM, Nicole Nemer <Nicole.Nemer@rms.com> wrote:
>>>>> Good Morning Micah,
>>>>> How is 1 going, please?  Is there anything that I can do to help?
>>>>>
>>>>> Anyone with more insight on 2 please?
>>>>>
>>>>> Thanks,
>>>>> nicki
>>>>> -
>>>>> Nicole Nemer, PhD
>>>>> Technical Architect/Dev Manager
>>>>>
>>>>> 303-641-3340
>>>>>
>>>>> On 5/27/16, 9:51 AM, "Micah Kornfield" <emkornfield@gmail.com> wrote:
>>>>>
>>>>>> Hi Nicki,
>>>>>> 1.  I'm currently working on the char/string support for C++.  I've
>>>>>> been a little bit backlogged on it.  If I don't make substantial
>>>>>> progress this weekend, I'm happy to relinquish the task.
>>>>>>
>>>>>> 2.  I'll let someone more knowledgeable about the Java implementation
>>>>>> chime in, but I think the answer is a qualified yes.  We were just
>>>>>> talking about trying to make the first integration test that proves
>>>>>> C++/Java compatibility [1]
>>>>>>
>>>>>> 3.  Yes, it is easy to become a contributor.  The general workflow is
>>>>>> to chime in on a JIRA item [2] (or someone on the PMC? can make you a
>>>>>> contributor so you can assign a ticket to yourself), and submit a pull
>>>>>> request via GitHub with "ARROW-<JIRA-NUMBER>:" as the start of the
>>>>>> pull request title.  In addition to the items mentioned below there is
>>>>>> a pretty substantial backlog of items to work on if you are interested
>>>>>> in contributing generally.
>>>>>>
>>>>>> Thanks,
>>>>>> Micah
>>>>>>
>>>>>> [1] http://mail-archives.apache.org/mod_mbox/arrow-dev/201605.mbox/%3CCAK7Z5T8X2OiWfSoQ0S-3vu0D4zgkuAO-SD_Q%3DF2Pu%3D4GhaTFbQ%40mail.gmail.com%3E
>>>>>> [2] https://issues.apache.org/jira/browse/ARROW/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, May 27, 2016 at 7:35 AM, Nicole Nemer <Nicole.Nemer@rms.com> wrote:
>>>>>>>
>>>>>>>  1.  does cpp/ipc support char/string types?  If not - when please?
>>>>>>>  2.  Is there a java implementation of the ipc feature please?  If not - when please?
>>>>>>>  3.  Is it easy to join and help as a contributor?  I would love to
>>>>>>> help with these 2 items if they are planned for the near future.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nicki
>>>>>>> -
>>>>>>> Nicole Nemer, PhD
>>>>>>> Technical Architect/Dev Manager
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>
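
The testing setup described in the thread (generate sample data, then send and receive it through a memory map) could be sketched roughly as follows. This is only an illustration in Python using the standard library; it does not use the Arrow format or any Arrow API, and the function names and the length-prefixed int32 layout are assumptions made up for the example.

```python
import mmap
import os
import struct
import tempfile


def write_int32_column(path, values):
    """Write a hypothetical layout: int64 count, then little-endian int32 values."""
    payload = struct.pack("<q", len(values)) + struct.pack(f"<{len(values)}i", *values)
    with open(path, "wb") as f:
        f.truncate(len(payload))  # size the file before mapping it
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), len(payload)) as mm:
            mm[:] = payload  # copy the bytes into the mapped region
            mm.flush()


def read_int32_column(path):
    """Map the file read-only and decode the same hypothetical layout."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            (count,) = struct.unpack_from("<q", mm, 0)
            return list(struct.unpack_from(f"<{count}i", mm, 8))


def round_trip(values):
    """Write sample data through a memory map and read it back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_int32_column(path, values)
        return read_int32_column(path)
    finally:
        os.remove(path)
```

A cross-language integration test would replace the reader or the writer with the other implementation and compare the decoded values, which is where agreement on the metadata and memory layout becomes necessary.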
