arrow-user mailing list archives

From Vibhatha Abeykoon <vibha...@gmail.com>
Subject Re: PyArrow building from source issue
Date Sat, 02 May 2020 00:23:08 GMT
Thank you for the clarification.
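
A minimal sketch of the per-project setup discussed in the quoted thread, assuming a conventional layout where the libraries sit in lib/ under the install prefix. The helper name use_arrow_prefix is hypothetical; the path is the one used in this thread. ARROW_HOME is what the pyarrow build reads, while LD_LIBRARY_PATH only matters to the dynamic loader at run time:

```shell
# Hypothetical helper: point the current shell at one Arrow install prefix.
use_arrow_prefix() {
    # Build time: the pyarrow CMake scripts look for Arrow under ARROW_HOME.
    export ARROW_HOME="$1"
    # Run time: the dynamic loader searches LD_LIBRARY_PATH for libarrow.so.
    # The ${VAR:+...} form avoids a trailing ':' when the variable is empty.
    export LD_LIBRARY_PATH="$ARROW_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

use_arrow_prefix /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
echo "ARROW_HOME=$ARROW_HOME"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

Calling the helper with a different prefix in another shell would switch that shell to a different Arrow build without touching the first one.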

On Fri, May 1, 2020 at 8:10 PM Sutou Kouhei <kou@clear-code.com> wrote:

> Hi,
>
> It'll work.
> Note that LD_LIBRARY_PATH is a run-time environment variable.
> You also need to specify the correct ARROW_HOME or
> -DCMAKE_MODULE_PATH at build time.
>
>
> Thanks,
> --
> kou
>
> In <CABC-QT=U_+AE8LaMmHwfhJP+4wvVKJ0Y-z7styK0TxtrJyZdNQ@mail.gmail.com>
>   "Re: PyArrow building from source issue" on Fri, 1 May 2020 18:13:54 -0400,
>   Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>
> > I will elaborate on a couple of reasons.
> >
> > When there are a couple of versions of Arrow, used for different projects
> > depending on various development choices, it is convenient for me to keep
> > them pointed towards a folder of my choice,
> > then refer to it and continue the work. Correct me if I am wrong: what if I
> > point to this folder of my choice and add it to LD_LIBRARY_PATH?
> > Will this cause issues?
> >
> > With Regards,
> > Vibhatha Abeykoon
> >
> >
> > On Fri, May 1, 2020 at 6:09 PM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
> >
> >> Okay, thank you for your response.
> >>
> >> With Regards,
> >> Vibhatha Abeykoon
> >>
> >>
> >> On Fri, May 1, 2020 at 5:17 PM Sutou Kouhei <kou@clear-code.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> > I used these settings, I still want the libs to be in a custom directory,
> >>> > not in lib/
> >>>
> >>> Why? We don't recommend it if you aren't a CMake expert.
> >>>
> >>> You can't do this if you use the ARROW_HOME environment
> >>> variable.
> >>>
> >>> You may be able to do this by removing the ARROW_HOME environment
> >>> variable and adding a
> >>> PYARROW_CMAKE_OPTIONS="-DCMAKE_MODULE_PATH=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist/lib/cmake/arrow"
> >>> environment variable or something.
> >>>
> >>>
> >>> Thanks,
> >>> --
> >>> kou
> >>>
> >>> In <CABC-QTmsekZwjWG4X=aWrfQZgYi+qLdyEc_kvbN2njgtwJLofw@mail.gmail.com>
> >>>   "Re: PyArrow building from source issue" on Fri, 1 May 2020 15:05:51 -0400,
> >>>   Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
> >>>
> >>> > export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
> >>> > export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
> >>> >
> >>> > export PYARROW_WITH_PARQUET=1
> >>> > export PYARROW_WITH_PYTHON=1
> >>> > export PYARROW_WITH_BUILD_TESTS=1
> >>> >
> >>> > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
> >>> >       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
> >>> >       -DARROW_WITH_BZ2=OFF \
> >>> >       -DARROW_WITH_ZLIB=OFF \
> >>> >       -DARROW_WITH_ZSTD=OFF \
> >>> >       -DARROW_WITH_LZ4=OFF \
> >>> >       -DARROW_WITH_SNAPPY=OFF \
> >>> >       -DARROW_WITH_BROTLI=OFF \
> >>> >       -DARROW_PARQUET=ON \
> >>> >       -DARROW_PYTHON=ON \
> >>> >       -DARROW_BUILD_TESTS=ON \
> >>> >       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
> >>> >       ..
> >>> >
> >>> >
> >>> > I used these settings, I still want the libs to be in a custom directory,
> >>> > not in lib/
> >>> >
> >>> > Does it make things not work?
> >>> >
> >>> > Now I get the following error,
> >>> >
> >>> > python setup.py build_ext --inplace
> >>> > WARNING: The wheel package is not available.
> >>> > running build_ext
> >>> > -- Running cmake for pyarrow
> >>> > cmake -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/ENVARROW/bin/python
> >>> > -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
> >>> > -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
> >>> > -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
> >>> > -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
> >>> > -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
> >>> > -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
> >>> > -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
> >>> > -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
> >>> > -- System processor: x86_64
> >>> > -- Arrow build warning level: PRODUCTION
> >>> > Using ld linker
> >>> > Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
> >>> > -- Build Type: RELEASE
> >>> > -- Build output directory: /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
> >>> > -- Arrow version: 0.18.0 (HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist)
> >>> > -- Arrow SO and ABI version: 18
> >>> > -- Arrow full SO version: 18.0.0
> >>> > -- Found the Arrow core shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
> >>> > -- Found the Arrow core import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
> >>> > -- Found the Arrow core static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
> >>> > -- Found the Arrow Python by HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
> >>> > -- Found the Arrow Python shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
> >>> > -- Found the Arrow Python import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
> >>> > -- Found the Arrow Python static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.a
> >>> > CMake Error at /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
> >>> >   Could NOT find Parquet (missing: PARQUET_LIB_DIR) (found version "1.5.1")
> >>> > Call Stack (most recent call first):
> >>> >   /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
> >>> >   cmake_modules/FindParquet.cmake:116 (find_package_handle_standard_args)
> >>> >   CMakeLists.txt:426 (find_package)
> >>> >
> >>> > Does Parquet need to be installed separately?
> >>> >
> >>> > With Regards,
> >>> > Vibhatha Abeykoon,
> >>> > Research Assistant,
> >>> > Intelligent Systems Engineering,
> >>> > Indiana University Bloomington,
> >>> > Cell : +1-812-955-1394
> >>> > Web: https://www.vibhatha.org
> >>> > <https://www.linkedin.com/in/vibhathaabeykoon/>
> >>> >
> >>> >
> >>> > On Fri, May 1, 2020 at 1:55 PM Wes McKinney <wesmckinn@gmail.com> wrote:
> >>> >
> >>> >> Try this instead
> >>> >>
> >>> >> export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
> >>> >> export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
> >>> >>
> >>> >> Make sure to use
> >>> >>
> >>> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DCMAKE_INSTALL_LIBDIR=lib
> >>> >>
> >>> >> I renamed from "arrowmylibs" to "dist" so that it's less confusing
> >>> >> what that directory represents (it isn't the libs directory itself,
> >>> >> but rather contains the directories include/, lib/, etc.)
> >>> >>
> >>> >>
> >>> >> On Fri, May 1, 2020 at 12:51 PM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
> >>> >> >
> >>> >> > I didn't clearly mention the config I used,
> >>> >> >
> >>> >> > export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
> >>> >> > export LD_LIBRARY_PATH=$ARROW_HOME:$LD_LIBRARY_PATH
> >>> >> >
> >>> >> > Before starting I added these ENV vars.
> >>> >> >
> >>> >> > My objective is to get all the libs inside my build setting. I use arrow
> >>> >> > as one of the core libraries in my project.
> >>> >> > I have to build it in the following way,
> >>> >> >
> >>> >> > 1. Arrow clone from git
> >>> >> > 2. Arrow CPP built
> >>> >> > 3. Using arrow CPP I develop custom functions upon loaded data
> >>> >> > 4. Integrate Cython APIs for these custom functions
> >>> >> > 5. Use Pyarrow Cython to provide more functionality from Python end to
> >>> >> >    Cython to my CPP lib which uses Arrow.
> >>> >> >
> >>> >> > This is the build that I am trying to formulate. So I have to keep those
> >>> >> > libs there, and the idea is to build python from
> >>> >> > the same cloned source. The challenge was to keep all the shared libs
> >>> >> > from arrow, my-cython libs to point in the right
> >>> >> > direction. Is this a clear description?
> >>> >> >
> >>> >> > With Regards,
> >>> >> > Vibhatha Abeykoon,
> >>> >> > Research Assistant,
> >>> >> > Intelligent Systems Engineering,
> >>> >> > Indiana University Bloomington,
> >>> >> > Cell : +1-812-955-1394
> >>> >> > Web: https://www.vibhatha.org
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > On Fri, May 1, 2020 at 1:28 PM Wes McKinney <wesmckinn@gmail.com> wrote:
> >>> >> >>
> >>> >> >> This part doesn't look correct
> >>> >> >>
> >>> >> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
> >>> >> >>       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
> >>> >> >>
> >>> >> >> The usual incantation is
> >>> >> >>
> >>> >> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
> >>> >> >>       -DCMAKE_INSTALL_LIBDIR=lib
> >>> >> >>
> >>> >> >> The reason we pass -DCMAKE_INSTALL_LIBDIR=lib is that some systems
> >>> >> >> will install libraries in lib32 or lib64 instead of just lib
> >>> >> >>
> >>> >> >>
> >>> >> >> On Fri, May 1, 2020 at 10:38 AM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
> >>> >> >> >
> >>> >> >> > Hi Neal,
> >>> >> >> >
> >>> >> >> > Yes, I added the flag. But I installed my libs not to /usr/lib,
> >>> >> >> > but to a different folder.
> >>> >> >> >
> >>> >> >> > This is the way I built arrow C++,
> >>> >> >> >
> >>> >> >> > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
> >>> >> >> >       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
> >>> >> >> >       -DARROW_WITH_BZ2=OFF \
> >>> >> >> >       -DARROW_WITH_ZLIB=OFF \
> >>> >> >> >       -DARROW_WITH_ZSTD=OFF \
> >>> >> >> >       -DARROW_WITH_LZ4=OFF \
> >>> >> >> >       -DARROW_WITH_SNAPPY=OFF \
> >>> >> >> >       -DARROW_WITH_BROTLI=OFF \
> >>> >> >> >       -DARROW_PARQUET=ON \
> >>> >> >> >       -DARROW_PYTHON=ON \
> >>> >> >> >       -DARROW_BUILD_TESTS=ON \
> >>> >> >> >       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
> >>> >> >> >       ..
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > With Regards,
> >>> >> >> > Vibhatha Abeykoon,
> >>> >> >> > Research Assistant,
> >>> >> >> > Intelligent Systems Engineering,
> >>> >> >> > Indiana University Bloomington,
> >>> >> >> > Cell : +1-812-955-1394
> >>> >> >> > Web: https://www.vibhatha.org
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > On Fri, May 1, 2020 at 11:29 AM Neal Richardson <neal.p.richardson@gmail.com> wrote:
> >>> >> >> >>
> >>> >> >> >> Hi Vibhatha,
> >>> >> >> >> Did you build Arrow C++ with -DARROW_PYTHON=ON?
> >>> >> >> >>
> >>> >> >> >> Neal
> >>> >> >> >>
> >>> >> >> >> On Fri, May 1, 2020 at 8:24 AM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
> >>> >> >> >>>
> >>> >> >> >>> Hi,
> >>> >> >> >>>
> >>> >> >> >>> I am trying to integrate Arrow with an application that I am
> >>> >> >> >>> developing. Here I build Arrow from the source (CPP) and use the
> >>> >> >> >>> API to develop some custom functions to do a scientific calculation
> >>> >> >> >>> after data is loaded with the Arrow table API. On top of this, I
> >>> >> >> >>> develop a Cython API to design a Python API.
> >>> >> >> >>>
> >>> >> >> >>> In the current stage, I have a new necessity where I need to
> >>> >> >> >>> consume the Arrow Cython API for my code.
> >>> >> >> >>>
> >>> >> >> >>> Here it was hard to link the built libarrow.so.16 with the
> >>> >> >> >>> libarrow_python.so.16 from the installed pyarrow (separately from
> >>> >> >> >>> pip). What I realised was everything has to be built from the same
> >>> >> >> >>> source, so that I can install pyarrow from the source in my virtual
> >>> >> >> >>> environment.
> >>> >> >> >>>
> >>> >> >> >>> Before going through deeper things, I started by just building from
> >>> >> >> >>> source (CPP) and then moving towards installing pyarrow from the
> >>> >> >> >>> source.
> >>> >> >> >>>
> >>> >> >> >>> I tried to follow the guideline from here,
> >>> >> >> >>>
> >>> >> >> >>> https://arrow.apache.org/docs/developers/python.html,
> >>> >> >> >>>
> >>> >> >> >>> But when I found issues in the python build, I followed this source,
> >>> >> >> >>> (but still, I used the clone from the master, not a released version)
> >>> >> >> >>>
> >>> >> >> >>> https://gist.github.com/heavyinfo/04e1326bb9bed9cecb19c2d603c8d521
> >>> >> >> >>>
> >>> >> >> >>> My environmental variables are as follows,
> >>> >> >> >>>
> >>> >> >> >>> python3 setup.py build_ext --inplace
> >>> >> >> >>> running build_ext
> >>> >> >> >>> -- Running cmake for pyarrow
> >>> >> >> >>> cmake -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3
> >>> >> >> >>> -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
> >>> >> >> >>> -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
> >>> >> >> >>> -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
> >>> >> >> >>> -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
> >>> >> >> >>> -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
> >>> >> >> >>> -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
> >>> >> >> >>> -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
> >>> >> >> >>> -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
> >>> >> >> >>> -- System processor: x86_64
> >>> >> >> >>> -- Arrow build warning level: PRODUCTION
> >>> >> >> >>> Using ld linker
> >>> >> >> >>> Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
> >>> >> >> >>> -- Build Type: RELEASE
> >>> >> >> >>> -- Build output directory: /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
> >>> >> >> >>> -- Arrow version: 0.18.0 (HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs)
> >>> >> >> >>> -- Arrow SO and ABI version: 18
> >>> >> >> >>> -- Arrow full SO version: 18.0.0
> >>> >> >> >>> -- Found the Arrow core shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
> >>> >> >> >>> -- Found the Arrow core import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
> >>> >> >> >>> -- Found the Arrow core static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
> >>> >> >> >>> CMake Error at /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
> >>> >> >> >>>   Could NOT find ArrowPython (missing: ARROW_PYTHON_INCLUDE_DIR) (found
> >>> >> >> >>>   version "0.18.0")
> >>> >> >> >>> Call Stack (most recent call first):
> >>> >> >> >>>   /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
> >>> >> >> >>>   cmake_modules/FindArrowPython.cmake:76 (find_package_handle_standard_args)
> >>> >> >> >>>   CMakeLists.txt:210 (find_package)
> >>> >> >> >>>
> >>> >> >> >>>
> >>> >> >> >>> -- Configuring incomplete, errors occurred!
> >>> >> >> >>> See also "/home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/CMakeFiles/CMakeOutput.log".
> >>> >> >> >>> error: command 'cmake' failed with exit status 1
> >>> >> >> >>>
> >>> >> >> >>> Maybe I am missing some step and I am not quite sure what is the
> >>> >> >> >>> issue.
> >>> >> >> >>>
> >>> >> >> >>> Any pointers to solve this issue?
> >>> >> >> >>>
> >>> >> >> >>> With Regards,
> >>> >> >> >>> Vibhatha
> >>> >> >> >>>
> >>> >> >> >>>
> >>> >>
> >>>
> >>
>
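
Putting the advice from the thread together, the configure step would look roughly like the following. This sketch only assembles and prints the command rather than running it, so it assumes nothing about the machine; the variable name CONFIGURE_ARGS is hypothetical, and only flags quoted in this thread are used:

```shell
# Install prefix taken from the thread; adjust for your own checkout.
ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist

# A relative libdir keeps the libraries in $ARROW_HOME/lib, even on systems
# that would otherwise default to lib32 or lib64.
CONFIGURE_ARGS="-DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DCMAKE_INSTALL_LIBDIR=lib"
CONFIGURE_ARGS="$CONFIGURE_ARGS -DARROW_PARQUET=ON -DARROW_PYTHON=ON"

# Print the command that would be run from the cpp build directory.
echo "cmake $CONFIGURE_ARGS .."
```

With this layout the pyarrow build can locate both Arrow and Parquet through ARROW_HOME alone, which is the combination recommended earlier in the thread.
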
-- 
Vibhatha Abeykoon
