arrow-dev mailing list archives

From: Julien Le Dem
Subject: Re: Arrow sync in 30min
Date: Thu, 01 Sep 2016 17:07:13 GMT
Notes from the sync:
Next meeting: same day and time, in 2 weeks.

*Attendees and their interests:*
 - looking for example of IPC from Java to C++
 - C++ to C++
 - Java to Java

Jacques: Dremio

 - file format
 - plan for integration testing. (has time in the next 2 to 4 weeks)

Tsuyoshi: new to the project.
 - wants to try contributing.
 - how to contribute? What’s going on in the project?

  - trying to build against the default Python on Linux; packaging.

  - arrow file
  - make java code match the spec
  - release

*Agenda:*
 - java to C++ IPC
 - file format
 - integration testing
 - How to contribute. state of project.
 - python packaging for linux pip
 - release 0.1

*Topics discussed:*
Java to C++ IPC: ARROW-263
 - communication between 2 processes using shared memory.
 - some info:
 - Erol:
   - would rather use memory maps than files.
   - had trouble with type sizes not always matching between C++ and Java.
   - goal: move large amounts of data across languages (Python, MATLAB,
.NET) without going through files. A shared memory map would make it easier.
 - current thinking: use RPC for communicating the memory location.
 - Actions:
   - Julien: create a JIRA for the prototype.
   - Erol: share prototype of IPC.

 - for now looking at gRPC. Possibly use HTTP directly.
 - need a sidecar for RPC.
 - Kudu did their own (krpc)
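
A rough sketch of the idea discussed above, for illustration only: one side
writes values into a memory-mapped file using an explicit fixed-width
little-endian layout (which sidesteps the type-size mismatch between
languages), then sends only the memory location in a small message. This is
not the Arrow IPC format or any prototype from the sync. The descriptor
fields (path, offset, length, type), the function names, and the use of JSON
as a stand-in for the RPC payload are all assumptions made for the example.

```python
import json
import mmap
import os
import struct
import tempfile

# "<q" = little-endian, standard size: always 8 bytes per value, independent
# of the platform's native integer sizes.
INT64 = struct.Struct("<q")


def write_int64_buffer(path, values):
    """Producer: write values into a memory-mapped file, return a descriptor.

    The returned JSON string is a hypothetical stand-in for the RPC message
    that would carry the memory location to the other process.
    """
    if not values:
        raise ValueError("cannot map an empty buffer")
    nbytes = INT64.size * len(values)
    with open(path, "wb") as f:
        f.truncate(nbytes)  # size the file before mapping it
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), nbytes) as region:
        for i, value in enumerate(values):
            INT64.pack_into(region, i * INT64.size, value)
        region.flush()
    return json.dumps(
        {"path": path, "offset": 0, "length": len(values), "type": "int64"}
    )


def read_int64_buffer(message):
    """Consumer: map the region named in the descriptor and decode it."""
    desc = json.loads(message)
    with open(desc["path"], "rb") as f, mmap.mmap(
        f.fileno(), 0, access=mmap.ACCESS_READ
    ) as region:
        return [
            INT64.unpack_from(region, desc["offset"] + i * INT64.size)[0]
            for i in range(desc["length"])
        ]


if __name__ == "__main__":
    buffer_path = os.path.join(tempfile.mkdtemp(), "shared.bin")
    message = write_int64_buffer(buffer_path, [1, -2, 2**40])
    print(read_int64_buffer(message))
```

Here both sides run in one process for brevity. In the setup discussed in
the sync, the reader would be a separate process (possibly in another
language) that receives the descriptor over RPC and maps the same region.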

File format:
 - Julien created a first version of the file format (Java implementation).
 - Wes to do a C++ implementation of the file format.
 - create integration tests based on that (possibly use JNI for Java to
drive the C++ lib and check for compatibility)

How to contribute:
 - Pull requests/jira/dev list + sync to discuss.

Tsuyoshi: interested in the limitations of Java byte arrays. Use Arrow as
the backend for Spark byte arrays to make Spark more scalable.

 - Arrow to Spark DataFrames.
 - PySpark integration.
 - Action: Wes to open a JIRA.

Python packaging:
 - Uwe has Python packages to generate Parquet files through pandas and Arrow.
    - use pandas in Python to generate Arrow, then Arrow to Parquet files
and back.
 - conda-forge: build portable binaries not tied to a specific linux

Release 0.1:
 - create blocker JIRAs for the release:
   - Action (Uwe): JIRA to make parquet-cpp optional.
   - create a release of parquet-cpp
 - 0.1 is not considered stable yet.

On Thu, Sep 1, 2016 at 9:00 AM, Julien Le Dem wrote:

> starting now
> On Thu, Sep 1, 2016 at 8:32 AM, Julien Le Dem wrote:
>> Julien
> --
> Julien

