arrow-user mailing list archives

From Sutou Kouhei <...@clear-code.com>
Subject Re: PyArrow building from source issue
Date Sat, 02 May 2020 00:10:03 GMT
Hi,

It'll work.
Note that LD_LIBRARY_PATH is an environment variable that is
used at run time. You also need to specify the correct
ARROW_HOME or -DCMAKE_MODULE_PATH at build time.
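
A minimal sketch of that split, under stated assumptions: the
directories below are hypothetical placeholders, and ARROW_CMAKE_DIR
stands for wherever your Arrow install actually put its CMake files
(check your install tree rather than trusting the path shown here).

```shell
# Hypothetical locations; replace with your own layout.
ARROW_PREFIX="$HOME/arrow-dist"                # assumed CMAKE_INSTALL_PREFIX
ARROW_LIBDIR="$ARROW_PREFIX/arrowmylibs"       # assumed custom library directory
ARROW_CMAKE_DIR="$ARROW_LIBDIR/cmake/arrow"    # assumed location of Arrow's CMake files

# Build time: ARROW_HOME is unset, so the PyArrow build is pointed at the
# CMake files through PYARROW_CMAKE_OPTIONS instead.
unset ARROW_HOME
export PYARROW_CMAKE_OPTIONS="-DCMAKE_MODULE_PATH=$ARROW_CMAKE_DIR"

# Run time: the dynamic loader searches LD_LIBRARY_PATH for libarrow.so and
# libarrow_python.so. Append the old value only if it was non-empty.
export LD_LIBRARY_PATH="$ARROW_LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Then, from the arrow/python directory:
# python setup.py build_ext --inplace
```

The two exports are independent: the first only affects how the
PyArrow build locates Arrow, the second only affects how the built
extension finds the shared libraries when it is imported.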


Thanks,
--
kou

In <CABC-QT=U_+AE8LaMmHwfhJP+4wvVKJ0Y-z7styK0TxtrJyZdNQ@mail.gmail.com>
  "Re: PyArrow building from source issue" on Fri, 1 May 2020 18:13:54 -0400,
  Vibhatha Abeykoon <vibhatha@gmail.com> wrote:

> Let me elaborate on a couple of reasons.
>
> When there are a couple of versions of Arrow, used for different projects
> depending on various development choices, it is convenient for me to keep
> each of them in a folder of my choice, then refer to it and continue the
> work. Correct me if I am wrong: what if I point to this folder of my choice
> and add it to LD_LIBRARY_PATH? Will this cause issues?
> 
> With Regards,
> Vibhatha Abeykoon
> 
> 
> On Fri, May 1, 2020 at 6:09 PM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
> 
>> Okay, thank you for your response.
>>
>> With Regards,
>> Vibhatha Abeykoon
>>
>>
>> On Fri, May 1, 2020 at 5:17 PM Sutou Kouhei <kou@clear-code.com> wrote:
>>
>>> Hi,
>>>
>>> > I used these settings, I still want the libs to be in a custom directory,
>>> > not in lib/
>>>
>>> Why? We don't recommend it if you aren't a CMake expert.
>>>
>>> You can't do this if you use the ARROW_HOME environment
>>> variable.
>>>
>>> You may be able to do this by removing the ARROW_HOME environment
>>> variable and adding the
>>>
>>> PYARROW_CMAKE_OPTIONS="-DCMAKE_MODULE_PATH=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist/lib/cmake/arrow"
>>> environment variable or something.
>>>
>>>
>>> Thanks,
>>> --
>>> kou
>>>
>>> In <CABC-QTmsekZwjWG4X=aWrfQZgYi+qLdyEc_kvbN2njgtwJLofw@mail.gmail.com>
>>>   "Re: PyArrow building from source issue" on Fri, 1 May 2020 15:05:51 -0400,
>>>   Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>>>
>>> > export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>>> > export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>>> >
>>> > export PYARROW_WITH_PARQUET=1
>>> > export PYARROW_WITH_PYTHON=1
>>> > export PYARROW_WITH_BUILD_TESTS=1
>>> >
>>> > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>> >       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
>>> >       -DARROW_WITH_BZ2=OFF \
>>> >       -DARROW_WITH_ZLIB=OFF \
>>> >       -DARROW_WITH_ZSTD=OFF \
>>> >       -DARROW_WITH_LZ4=OFF \
>>> >       -DARROW_WITH_SNAPPY=OFF \
>>> >       -DARROW_WITH_BROTLI=OFF \
>>> >       -DARROW_PARQUET=ON \
>>> >       -DARROW_PYTHON=ON \
>>> >       -DARROW_BUILD_TESTS=ON \
>>> >       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
>>> >       ..
>>> >
>>> >
>>> > I used these settings, I still want the libs to be in a custom directory,
>>> > not in lib/
>>> >
>>> > Does it make things not work?
>>> >
>>> > Now I get the following error,
>>> >
>>> > python setup.py build_ext --inplace
>>> > WARNING: The wheel package is not available.
>>> > running build_ext
>>> > -- Running cmake for pyarrow
>>> > cmake -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/ENVARROW/bin/python
>>> > -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
>>> > -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
>>> > -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
>>> > -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
>>> > -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
>>> > -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
>>> > -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
>>> > -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
>>> > -- System processor: x86_64
>>> > -- Arrow build warning level: PRODUCTION
>>> > Using ld linker
>>> > Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
>>> > -- Build Type: RELEASE
>>> > -- Build output directory: /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
>>> > -- Arrow version: 0.18.0 (HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist)
>>> > -- Arrow SO and ABI version: 18
>>> > -- Arrow full SO version: 18.0.0
>>> > -- Found the Arrow core shared library:
>>> > /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>>> > -- Found the Arrow core import library:
>>> > /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>>> > -- Found the Arrow core static library:
>>> > /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
>>> > -- Found the Arrow Python by HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>>> > -- Found the Arrow Python shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
>>> > -- Found the Arrow Python import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
>>> > -- Found the Arrow Python static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.a
>>> > CMake Error at /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
>>> >   Could NOT find Parquet (missing: PARQUET_LIB_DIR) (found version "1.5.1")
>>> > Call Stack (most recent call first):
>>> >   /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
>>> >   cmake_modules/FindParquet.cmake:116 (find_package_handle_standard_args)
>>> >   CMakeLists.txt:426 (find_package)
>>> >
>>> > Does Parquet need to be installed separately?
>>> >
>>> > With Regards,
>>> > Vibhatha Abeykoon,
>>> > Research Assistant,
>>> > Intelligent Systems Engineering,
>>> > Indiana University Bloomington,
>>> > Cell : +1-812-955-1394
>>> > Web: https://www.vibhatha.org
>>> > <https://www.linkedin.com/in/vibhathaabeykoon/>
>>> >
>>> >
>>> > On Fri, May 1, 2020 at 1:55 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>>> >
>>> >> Try this instead
>>> >>
>>> >> export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>>> >> export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>>> >>
>>> >> Make sure to use
>>> >>
>>> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DCMAKE_INSTALL_LIBDIR=lib
>>> >>
>>> >> I renamed it from "arrowmylibs" to "dist" so that it's less confusing
>>> >> what that directory represents (it doesn't contain the libs
>>> >> directory, but rather the directories include/, lib/, etc.)
>>> >>
>>> >>
>>> >> On Fri, May 1, 2020 at 12:51 PM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>>> >> >
>>> >> > I didn't clearly mention the config I used,
>>> >> >
>>> >> > export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
>>> >> > export LD_LIBRARY_PATH=$ARROW_HOME:$LD_LIBRARY_PATH
>>> >> >
>>> >> > Before starting I added these ENV vars.
>>> >> >
>>> >> > My objective is to get all the libs inside my build setting. I use arrow
>>> >> > as one of the core libraries in my project.
>>> >> > I have to build it in the following way,
>>> >> >
>>> >> > 1. Arrow clone from git
>>> >> > 2. Arrow CPP built
>>> >> > 3. Using arrow CPP I develop custom functions upon loaded data
>>> >> > 4. Integrate Cython APIs for these custom functions
>>> >> > 5. Use Pyarrow Cython to provide more functionality from Python end to
>>> >> >    Cython to my CPP lib which uses Arrow.
>>> >> >
>>> >> > This is the build that I am trying to formulate. So I have to keep those
>>> >> > libs there, and the idea is to build python from
>>> >> > the same cloned source. The challenge was to keep all the shared libs
>>> >> > from arrow, my-cython libs to point in the right
>>> >> > direction. Is this a clear description?
>>> >> >
>>> >> > With Regards,
>>> >> > Vibhatha Abeykoon,
>>> >> > Research Assistant,
>>> >> > Intelligent Systems Engineering,
>>> >> > Indiana University Bloomington,
>>> >> > Cell : +1-812-955-1394
>>> >> > Web: https://www.vibhatha.org
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Fri, May 1, 2020 at 1:28 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>>> >> >>
>>> >> >> This part doesn't look correct
>>> >> >>
>>> >> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>> >> >>       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
>>> >> >>
>>> >> >> The usual incantation is
>>> >> >>
>>> >> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>> >> >>       -DCMAKE_INSTALL_LIBDIR=lib
>>> >> >>
>>> >> >> The reason we pass -DCMAKE_INSTALL_LIBDIR=lib is that some systems
>>> >> >> will install libraries in lib32 or lib64 instead of just lib
>>> >> >>
>>> >> >>
>>> >> >> On Fri, May 1, 2020 at 10:38 AM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>>> >> >> >
>>> >> >> > Hi Neal,
>>> >> >> >
>>> >> >> > Yes, I added the flag. But I installed my libs not to /usr/lib, but
>>> >> >> > to a different folder.
>>> >> >> >
>>> >> >> > This is the way I built arrow C++,
>>> >> >> >
>>> >> >> > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>> >> >> >       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
>>> >> >> >       -DARROW_WITH_BZ2=OFF \
>>> >> >> >       -DARROW_WITH_ZLIB=OFF \
>>> >> >> >       -DARROW_WITH_ZSTD=OFF \
>>> >> >> >       -DARROW_WITH_LZ4=OFF \
>>> >> >> >       -DARROW_WITH_SNAPPY=OFF \
>>> >> >> >       -DARROW_WITH_BROTLI=OFF \
>>> >> >> >       -DARROW_PARQUET=ON \
>>> >> >> >       -DARROW_PYTHON=ON \
>>> >> >> >       -DARROW_BUILD_TESTS=ON \
>>> >> >> >       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
>>> >> >> >       ..
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > With Regards,
>>> >> >> > Vibhatha Abeykoon,
>>> >> >> > Research Assistant,
>>> >> >> > Intelligent Systems Engineering,
>>> >> >> > Indiana University Bloomington,
>>> >> >> > Cell : +1-812-955-1394
>>> >> >> > Web: https://www.vibhatha.org
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > On Fri, May 1, 2020 at 11:29 AM Neal Richardson <neal.p.richardson@gmail.com> wrote:
>>> >> >> >>
>>> >> >> >> Hi Vibhatha,
>>> >> >> >> Did you build Arrow C++ with -DARROW_PYTHON=ON?
>>> >> >> >>
>>> >> >> >> Neal
>>> >> >> >>
>>> >> >> >> On Fri, May 1, 2020 at 8:24 AM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>>> >> >> >>>
>>> >> >> >>> Hi,
>>> >> >> >>>
>>> >> >> >>> I am trying to integrate Arrow with an application that I am
>>> >> >> >>> developing. Here I build Arrow from the source (CPP) and use the API to
>>> >> >> >>> develop some custom functions to do a scientific calculation after data
>>> >> >> >>> is loaded with the Arrow table API. On top of this, I develop a Cython
>>> >> >> >>> API to design a python API.
>>> >> >> >>>
>>> >> >> >>> In the current stage, I have a new necessity where I need to
>>> >> >> >>> consume the Arrow Cython API for my code.
>>> >> >> >>>
>>> >> >> >>> Here it was hard to link the built libarrow.so.16 with the
>>> >> >> >>> libarrow_python.so.16 from the installed pyarrow (installed separately
>>> >> >> >>> from pip). What I realised was that everything has to be built from the
>>> >> >> >>> same source, so that I can install pyarrow from the source in my
>>> >> >> >>> virtual environment.
>>> >> >> >>>
>>> >> >> >>> Before going into deeper things, I started by just building from
>>> >> >> >>> source (CPP) and then moving towards installing pyarrow from the source.
>>> >> >> >>>
>>> >> >> >>> I tried to follow the guideline from here,
>>> >> >> >>>
>>> >> >> >>> https://arrow.apache.org/docs/developers/python.html,
>>> >> >> >>>
>>> >> >> >>> But when I found issues in the python build, I followed this source
>>> >> >> >>> (but still, I used the clone from the master, not a released version),
>>> >> >> >>>
>>> >> >> >>> https://gist.github.com/heavyinfo/04e1326bb9bed9cecb19c2d603c8d521
>>> >> >> >>>
>>> >> >> >>> My environmental variables are as follows,
>>> >> >> >>>
>>> >> >> >>> python3 setup.py build_ext --inplace
>>> >> >> >>> running build_ext
>>> >> >> >>> -- Running cmake for pyarrow
>>> >> >> >>> cmake -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3
>>> >> >> >>> -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
>>> >> >> >>> -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
>>> >> >> >>> -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
>>> >> >> >>> -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
>>> >> >> >>> -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
>>> >> >> >>> -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
>>> >> >> >>> -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
>>> >> >> >>> -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
>>> >> >> >>> -- System processor: x86_64
>>> >> >> >>> -- Arrow build warning level: PRODUCTION
>>> >> >> >>> Using ld linker
>>> >> >> >>> Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
>>> >> >> >>> -- Build Type: RELEASE
>>> >> >> >>> -- Build output directory: /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
>>> >> >> >>> -- Arrow version: 0.18.0 (HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs)
>>> >> >> >>> -- Arrow SO and ABI version: 18
>>> >> >> >>> -- Arrow full SO version: 18.0.0
>>> >> >> >>> -- Found the Arrow core shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>>> >> >> >>> -- Found the Arrow core import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>>> >> >> >>> -- Found the Arrow core static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
>>> >> >> >>> CMake Error at /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
>>> >> >> >>>   Could NOT find ArrowPython (missing: ARROW_PYTHON_INCLUDE_DIR) (found
>>> >> >> >>>   version "0.18.0")
>>> >> >> >>> Call Stack (most recent call first):
>>> >> >> >>>   /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
>>> >> >> >>>   cmake_modules/FindArrowPython.cmake:76 (find_package_handle_standard_args)
>>> >> >> >>>   CMakeLists.txt:210 (find_package)
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>> -- Configuring incomplete, errors occurred!
>>> >> >> >>> See also "/home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/CMakeFiles/CMakeOutput.log".
>>> >> >> >>> error: command 'cmake' failed with exit status 1
>>> >> >> >>>
>>> >> >> >>> Maybe I am missing some step, and I am not quite sure what the
>>> >> >> >>> issue is.
>>> >> >> >>>
>>> >> >> >>> Any pointers to solve this issue?
>>> >> >> >>>
>>> >> >> >>> With Regards,
>>> >> >> >>> Vibhatha
>>> >> >> >>>
>>> >> >> >>>
>>> >>
>>>
>>
