arrow-user mailing list archives

From Vibhatha Abeykoon <vibha...@gmail.com>
Subject Re: PyArrow building from source issue
Date Fri, 01 May 2020 22:13:54 GMT
Let me elaborate on my reasons.

I keep a couple of versions of Arrow, used for different projects
depending on various development choices, so it is convenient for me to keep
each one's libraries in a folder of my choice and then refer to that folder
as I continue the work. Correct me if I am wrong: what if I point the install
at this folder of my choice and add it to LD_LIBRARY_PATH?
Will this cause issues?
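
A minimal sketch of that kind of switching, assuming each Arrow version is installed into its own prefix with the standard layout (include/, lib/, ...) as recommended elsewhere in this thread. The function name use_arrow and the example path in the comment are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: select one of several Arrow install prefixes.
# Assumes each prefix was installed with
#   -DCMAKE_INSTALL_PREFIX=<prefix> -DCMAKE_INSTALL_LIBDIR=lib
# so the shared libraries live in <prefix>/lib.
use_arrow() {
    if [ -z "$1" ]; then
        echo "usage: use_arrow <install-prefix>" >&2
        return 1
    fi
    ARROW_HOME="$1"
    # Prepend <prefix>/lib. The ${VAR:+...} form avoids a trailing ':' when
    # LD_LIBRARY_PATH was empty (an empty entry means the current directory).
    LD_LIBRARY_PATH="$ARROW_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export ARROW_HOME LD_LIBRARY_PATH
}

# Example (hypothetical path):
#   use_arrow "$HOME/arrow-installs/0.17"
```

Calling the helper repeatedly in one shell keeps the earlier lib/ directories later in the search path, so a fresh shell per project is the safer habit.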

With Regards,
Vibhatha Abeykoon


On Fri, May 1, 2020 at 6:09 PM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:

> Okay, thank you for your response.
>
> With Regards,
> Vibhatha Abeykoon
>
>
> On Fri, May 1, 2020 at 5:17 PM Sutou Kouhei <kou@clear-code.com> wrote:
>
>> Hi,
>>
>> > I used these settings, I still want the libs to be in a custom directory,
>> > not in lib/
>>
>> Why? We don't recommend it if you aren't a CMake expert.
>>
>> You can't do this if you use the ARROW_HOME environment
>> variable.
>>
>> You may be able to do this by removing the ARROW_HOME environment
>> variable and adding a
>>
>> PYARROW_CMAKE_OPTIONS="-DCMAKE_MODULE_PATH=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist/lib/cmake/arrow"
>>
>> environment variable, or something similar.
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In <CABC-QTmsekZwjWG4X=aWrfQZgYi+qLdyEc_kvbN2njgtwJLofw@mail.gmail.com>
>>   "Re: PyArrow building from source issue" on Fri, 1 May 2020 15:05:51
>> -0400,
>>   Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>>
>> > export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>> > export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>> >
>> > export PYARROW_WITH_PARQUET=1
>> > export PYARROW_WITH_PYTHON=1
>> > export PYARROW_WITH_BUILD_TESTS=1
>> >
>> > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>> >       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
>> >       -DARROW_WITH_BZ2=OFF \
>> >       -DARROW_WITH_ZLIB=OFF \
>> >       -DARROW_WITH_ZSTD=OFF \
>> >       -DARROW_WITH_LZ4=OFF \
>> >       -DARROW_WITH_SNAPPY=OFF \
>> >       -DARROW_WITH_BROTLI=OFF \
>> >       -DARROW_PARQUET=ON \
>> >       -DARROW_PYTHON=ON \
>> >       -DARROW_BUILD_TESTS=ON \
>> >       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
>> >       ..
>> >
>> >
>> > I used these settings, I still want the libs to be in a custom directory,
>> > not in lib/
>> >
>> > Does it make things not work?
>> >
>> > Now I get the following error,
>> >
>> > python setup.py build_ext --inplace
>> > WARNING: The wheel package is not available.
>> > running build_ext
>> > -- Running cmake for pyarrow
>> > cmake -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/ENVARROW/bin/python
>> >       -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
>> >       -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
>> >       -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
>> >       -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
>> >       -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
>> >       -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
>> >       -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
>> >       -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
>> > -- System processor: x86_64
>> > -- Arrow build warning level: PRODUCTION
>> > Using ld linker
>> > Configured for RELEASE build (set with cmake
>> > -DCMAKE_BUILD_TYPE={release,debug,...})
>> > -- Build Type: RELEASE
>> > -- Build output directory: /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
>> > -- Arrow version: 0.18.0 (HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist)
>> > -- Arrow SO and ABI version: 18
>> > -- Arrow full SO version: 18.0.0
>> > -- Found the Arrow core shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>> > -- Found the Arrow core import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>> > -- Found the Arrow core static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
>> > -- Found the Arrow Python by HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>> > -- Found the Arrow Python shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
>> > -- Found the Arrow Python import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
>> > -- Found the Arrow Python static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.a
>> > CMake Error at /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
>> >   Could NOT find Parquet (missing: PARQUET_LIB_DIR) (found version "1.5.1")
>> > Call Stack (most recent call first):
>> >   /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
>> >   cmake_modules/FindParquet.cmake:116 (find_package_handle_standard_args)
>> >   CMakeLists.txt:426 (find_package)
>> >
>> > Does Parquet need to be installed separately?
>> >
>> > With Regards,
>> > Vibhatha Abeykoon,
>> > Research Assistant,
>> > Intelligent Systems Engineering,
>> > Indiana University Bloomington,
>> > Cell : +1-812-955-1394
>> > Web: https://www.vibhatha.org
>> > <https://www.linkedin.com/in/vibhathaabeykoon/>
>> >
>> >
>> > On Fri, May 1, 2020 at 1:55 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>> >
>> >> Try this instead
>> >>
>> >> export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>> >> export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>> >>
>> >> Make sure to use
>> >>
>> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DCMAKE_INSTALL_LIBDIR=lib
>> >>
>> >> I renamed it from "arrowmylibs" to "dist" so that it's less confusing
>> >> what that directory represents (it isn't the libs directory itself,
>> >> but rather contains the directories include/, lib/, etc.)
>> >>
>> >>
>> >> On Fri, May 1, 2020 at 12:51 PM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>> >> >
>> >> > I didn't clearly mention the config I used,
>> >> >
>> >> > export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
>> >> > export LD_LIBRARY_PATH=$ARROW_HOME:$LD_LIBRARY_PATH
>> >> >
>> >> > Before starting I added these ENV vars.
>> >> >
>> >> > My objective is to keep all the libs inside my build setup. I use
>> >> > Arrow as one of the core libraries in my project.
>> >> > I have to build it in the following way,
>> >> >
>> >> > 1. Clone Arrow from git
>> >> > 2. Build Arrow C++
>> >> > 3. Using Arrow C++, develop custom functions that operate on loaded data
>> >> > 4. Integrate Cython APIs for these custom functions
>> >> > 5. Use the PyArrow Cython API to expose more functionality from Python,
>> >> >    through Cython, to my C++ lib which uses Arrow
>> >> >
>> >> > This is the build that I am trying to formulate. So I have to keep
>> >> > those libs there, and the idea is to build the Python package from
>> >> > the same cloned source. The challenge was to get all the shared libs
>> >> > from Arrow and my Cython libs to point in the right
>> >> > direction. Is this a clear description?
>> >> >
>> >> > With Regards,
>> >> > Vibhatha Abeykoon,
>> >> > Research Assistant,
>> >> > Intelligent Systems Engineering,
>> >> > Indiana University Bloomington,
>> >> > Cell : +1-812-955-1394
>> >> > Web: https://www.vibhatha.org
>> >> >
>> >> >
>> >> >
>> >> > On Fri, May 1, 2020 at 1:28 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>> >> >>
>> >> >> This part doesn't look correct
>> >> >>
>> >> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>> >> >>       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
>> >> >>
>> >> >> The usual incantation is
>> >> >>
>> >> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>> >> >>       -DCMAKE_INSTALL_LIBDIR=lib
>> >> >>
>> >> >> The reason we pass -DCMAKE_INSTALL_LIBDIR=lib is that some systems
>> >> >> will install libraries in lib32 or lib64 instead of just lib
>> >> >>
>> >> >>
>> >> >> On Fri, May 1, 2020 at 10:38 AM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>> >> >> >
>> >> >> > Hi Neal,
>> >> >> >
>> >> >> > Yes, I added the flag. But I installed my libs not to /usr/lib,
>> >> >> > but to a different folder.
>> >> >> >
>> >> >> > This is the way I built arrow C++,
>> >> >> >
>> >> >> > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>> >> >> >       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
>> >> >> >       -DARROW_WITH_BZ2=OFF \
>> >> >> >       -DARROW_WITH_ZLIB=OFF \
>> >> >> >       -DARROW_WITH_ZSTD=OFF \
>> >> >> >       -DARROW_WITH_LZ4=OFF \
>> >> >> >       -DARROW_WITH_SNAPPY=OFF \
>> >> >> >       -DARROW_WITH_BROTLI=OFF \
>> >> >> >       -DARROW_PARQUET=ON \
>> >> >> >       -DARROW_PYTHON=ON \
>> >> >> >       -DARROW_BUILD_TESTS=ON \
>> >> >> >       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
>> >> >> >       ..
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > With Regards,
>> >> >> > Vibhatha Abeykoon,
>> >> >> > Research Assistant,
>> >> >> > Intelligent Systems Engineering,
>> >> >> > Indiana University Bloomington,
>> >> >> > Cell : +1-812-955-1394
>> >> >> > Web: https://www.vibhatha.org
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Fri, May 1, 2020 at 11:29 AM Neal Richardson <neal.p.richardson@gmail.com> wrote:
>> >> >> >>
>> >> >> >> Hi Vibhatha,
>> >> >> >> Did you build Arrow C++ with -DARROW_PYTHON=ON?
>> >> >> >>
>> >> >> >> Neal
>> >> >> >>
>> >> >> >> On Fri, May 1, 2020 at 8:24 AM Vibhatha Abeykoon <vibhatha@gmail.com> wrote:
>> >> >> >>>
>> >> >> >>> Hi,
>> >> >> >>>
>> >> >> >>> I am trying to integrate Arrow with an application that I am
>> >> >> >>> developing. I build Arrow from source (C++) and use its API to
>> >> >> >>> develop some custom functions that do a scientific calculation after
>> >> >> >>> the data is loaded with the Arrow Table API. On top of this, I develop
>> >> >> >>> a Cython API to expose a Python API.
>> >> >> >>>
>> >> >> >>> At the current stage, I have a new requirement: I need to consume the
>> >> >> >>> Arrow Cython API from my code.
>> >> >> >>>
>> >> >> >>> It was hard to link the libarrow.so.16 that I built with the
>> >> >> >>> libarrow_python.so.16 from the pyarrow installed separately via pip.
>> >> >> >>> What I realised was that everything has to be built from the same
>> >> >> >>> source, so that I can install pyarrow from source in my virtual
>> >> >> >>> environment.
>> >> >> >>>
>> >> >> >>> Before going deeper, I started by just building from source (C++)
>> >> >> >>> and then moved on to installing pyarrow from source.
>> >> >> >>>
>> >> >> >>> I tried to follow the guideline from here,
>> >> >> >>>
>> >> >> >>> https://arrow.apache.org/docs/developers/python.html
>> >> >> >>>
>> >> >> >>> But when I found issues in the Python build, I followed this source
>> >> >> >>> (though I still used a clone of master, not a released version),
>> >> >> >>>
>> >> >> >>> https://gist.github.com/heavyinfo/04e1326bb9bed9cecb19c2d603c8d521
>> >> >> >>>
>> >> >> >>> My environmental variables are as follows,
>> >> >> >>>
>> >> >> >>> python3 setup.py build_ext --inplace
>> >> >> >>> running build_ext
>> >> >> >>> -- Running cmake for pyarrow
>> >> >> >>> cmake -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3
>> >> >> >>>       -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
>> >> >> >>>       -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
>> >> >> >>>       -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
>> >> >> >>>       -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
>> >> >> >>>       -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
>> >> >> >>>       -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
>> >> >> >>>       -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
>> >> >> >>>       -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
>> >> >> >>> -- System processor: x86_64
>> >> >> >>> -- Arrow build warning level: PRODUCTION
>> >> >> >>> Using ld linker
>> >> >> >>> Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
>> >> >> >>> -- Build Type: RELEASE
>> >> >> >>> -- Build output directory: /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
>> >> >> >>> -- Arrow version: 0.18.0 (HOME: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs)
>> >> >> >>> -- Arrow SO and ABI version: 18
>> >> >> >>> -- Arrow full SO version: 18.0.0
>> >> >> >>> -- Found the Arrow core shared library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>> >> >> >>> -- Found the Arrow core import library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>> >> >> >>> -- Found the Arrow core static library: /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
>> >> >> >>> CMake Error at /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
>> >> >> >>>   Could NOT find ArrowPython (missing: ARROW_PYTHON_INCLUDE_DIR) (found
>> >> >> >>>   version "0.18.0")
>> >> >> >>> Call Stack (most recent call first):
>> >> >> >>>   /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
>> >> >> >>>   cmake_modules/FindArrowPython.cmake:76 (find_package_handle_standard_args)
>> >> >> >>>   CMakeLists.txt:210 (find_package)
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> -- Configuring incomplete, errors occurred!
>> >> >> >>> See also "/home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/CMakeFiles/CMakeOutput.log".
>> >> >> >>> error: command 'cmake' failed with exit status 1
>> >> >> >>>
>> >> >> >>> Maybe I am missing some step; I am not quite sure what the
>> >> >> >>> issue is.
>> >> >> >>>
>> >> >> >>> Any pointers to solve this issue?
>> >> >> >>>
>> >> >> >>> With Regards,
>> >> >> >>> Vibhatha
>> >> >> >>>
>> >> >> >>>
>> >>
>>
>
