arrow-user mailing list archives

From Burke Kaltenberger <bu...@firsttalentsearch.com>
Subject Re: PyArrow building from source issue
Date Fri, 01 May 2020 20:05:54 GMT
Hi, please take me off the mailing list. Thank you

On Fri, May 1, 2020 at 12:46 PM Vibhatha Abeykoon <vibhatha@gmail.com>
wrote:

> For the moment, I removed parquet from the build to test the idea.
>
> export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
> export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>
> export PYARROW_WITH_PYTHON=1
> export PYARROW_WITH_BUILD_TESTS=1
>
> cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
>       -DARROW_WITH_BZ2=OFF \
>       -DARROW_WITH_ZLIB=OFF \
>       -DARROW_WITH_ZSTD=OFF \
>       -DARROW_WITH_LZ4=OFF \
>       -DARROW_WITH_SNAPPY=OFF \
>       -DARROW_WITH_BROTLI=OFF \
>       -DARROW_PARQUET=OFF \
>       -DARROW_PYTHON=ON \
>       -DARROW_BUILD_TESTS=ON \
>       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
>       ..
>
> Then,
>
> python3 setup.py build_ext
>
> As I installed the libs not to /lib, but to a different directory, I added
> that path to LD_LIBRARY_PATH,
> (/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs)
>
> It cannot locate arrow libs, this is the error I got,
>
> running build_ext
> -- Running cmake for pyarrow
> cmake
> -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/ENVARROW/bin/python3
>  -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
> -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
> -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=off
> -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
> -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
> -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
> -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
> -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
> -- System processor: x86_64
> -- Arrow build warning level: PRODUCTION
> Using ld linker
> Configured for RELEASE build (set with cmake
> -DCMAKE_BUILD_TYPE={release,debug,...})
> -- Build Type: RELEASE
> -- Build output directory:
> /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
> CMake Error at cmake_modules/FindArrow.cmake:359 (file):
>   file failed to open for reading (No such file or directory):
>
>     /home/vibhatha/sandbox/arrow/repos/dist/include/arrow/util/config.h
> Call Stack (most recent call first):
>   cmake_modules/FindArrowPython.cmake:46 (find_package)
>   CMakeLists.txt:210 (find_package)
>
>
> CMake Error at
> /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146
> (message):
>   Could NOT find Arrow (missing: ARROW_FULL_SO_VERSION ARROW_SO_VERSION)
>   (found version "0.0.0")
> Call Stack (most recent call first):
>
> /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393
> (_FPHSA_FAILURE_MESSAGE)
>   cmake_modules/FindArrow.cmake:412 (find_package_handle_standard_args)
>   cmake_modules/FindArrowPython.cmake:46 (find_package)
>   CMakeLists.txt:210 (find_package)
>
>
>
> With Regards,
> Vibhatha Abeykoon,
> Research Assistant,
> Intelligent Systems Engineering,
> Indiana University Bloomington,
> Cell : +1-812-955-1394
> Web: https://www.vibhatha.org
> <https://www.linkedin.com/in/vibhathaabeykoon/>
>
>
> On Fri, May 1, 2020 at 3:05 PM Vibhatha Abeykoon <vibhatha@gmail.com>
> wrote:
>
>>
>> export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>> export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>>
>> export PYARROW_WITH_PARQUET=1
>> export PYARROW_WITH_PYTHON=1
>> export PYARROW_WITH_BUILD_TESTS=1
>>
>> cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
>>       -DARROW_WITH_BZ2=OFF \
>>       -DARROW_WITH_ZLIB=OFF \
>>       -DARROW_WITH_ZSTD=OFF \
>>       -DARROW_WITH_LZ4=OFF \
>>       -DARROW_WITH_SNAPPY=OFF \
>>       -DARROW_WITH_BROTLI=OFF \
>>       -DARROW_PARQUET=ON \
>>       -DARROW_PYTHON=ON \
>>       -DARROW_BUILD_TESTS=ON \
>>       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
>>       ..
>>
>>
>> I used these settings, I still want the libs to be in a custom directory,
>> not in lib/
>>
>> Does it make things not work?
>>
>> Now I get the following error,
>>
>> python setup.py build_ext --inplace
>> WARNING: The wheel package is not available.
>> running build_ext
>> -- Running cmake for pyarrow
>> cmake
>> -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/ENVARROW/bin/python
>>  -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
>> -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
>> -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
>> -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
>> -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
>> -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
>> -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
>> -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
>> -- System processor: x86_64
>> -- Arrow build warning level: PRODUCTION
>> Using ld linker
>> Configured for RELEASE build (set with cmake
>> -DCMAKE_BUILD_TYPE={release,debug,...})
>> -- Build Type: RELEASE
>> -- Build output directory:
>> /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
>> -- Arrow version: 0.18.0 (HOME:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist)
>> -- Arrow SO and ABI version: 18
>> -- Arrow full SO version: 18.0.0
>> -- Found the Arrow core shared library:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>> -- Found the Arrow core import library:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>> -- Found the Arrow core static library:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
>> -- Found the Arrow Python by HOME:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>> -- Found the Arrow Python shared library:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
>> -- Found the Arrow Python import library:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.so
>> -- Found the Arrow Python static library:
>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow_python.a
>> CMake Error at
>> /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146
>> (message):
>>   Could NOT find Parquet (missing: PARQUET_LIB_DIR) (found version
>> "1.5.1")
>> Call Stack (most recent call first):
>>
>> /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393
>> (_FPHSA_FAILURE_MESSAGE)
>>   cmake_modules/FindParquet.cmake:116 (find_package_handle_standard_args)
>>   CMakeLists.txt:426 (find_package)
>>
>> Does Parquet need to be installed separately?
>>
>> With Regards,
>> Vibhatha Abeykoon,
>> Research Assistant,
>> Intelligent Systems Engineering,
>> Indiana University Bloomington,
>> Cell : +1-812-955-1394
>> Web: https://www.vibhatha.org
>> <https://www.linkedin.com/in/vibhathaabeykoon/>
>>
>>
>> On Fri, May 1, 2020 at 1:55 PM Wes McKinney <wesmckinn@gmail.com> wrote:
>>
>>> Try this instead
>>>
>>> export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/dist
>>> export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>>>
>>> Make sure to use
>>>
>>> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME -DCMAKE_INSTALL_LIBDIR=lib
>>>
>>> I renamed from "arrowmylibs" to "dist" so that it's less confusing
>>> what that directory represents (it doesn't contain the libs
>>> directory, but rather the directories include/, lib/, etc.)
>>>
>>>
>>> On Fri, May 1, 2020 at 12:51 PM Vibhatha Abeykoon <vibhatha@gmail.com>
>>> wrote:
>>> >
>>> > I didn't clearly mention the config I used,
>>> >
>>> > export ARROW_HOME=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
>>> > export LD_LIBRARY_PATH=$ARROW_HOME:$LD_LIBRARY_PATH
>>> >
>>> > Before starting I added these ENV vars.
>>> >
>>> > My objective is to get all the libs inside my build setting. I use
>>> > arrow as one of the core libraries in my project.
>>> > I have to build it in the following way,
>>> >
>>> > 1. Arrow clone from git
>>> > 2. Arrow CPP built
>>> > 3. Using arrow CPP I develop custom functions upon loaded data
>>> > 4. Integrate Cython APIs for these custom functions
>>> > 5. Use Pyarrow Cython to provide more functionality from Python end to
>>> >    Cython to my CPP lib which uses Arrow.
>>> >
>>> > This is the build that I am trying to formulate. So I have to keep
>>> > those libs there, and the idea is to build python from
>>> > the same cloned source. The challenge was to keep all the shared libs
>>> > from arrow, my-cython libs to point in the right
>>> > direction. Is this a clear description?
>>> >
>>> > With Regards,
>>> > Vibhatha Abeykoon,
>>> > Research Assistant,
>>> > Intelligent Systems Engineering,
>>> > Indiana University Bloomington,
>>> > Cell : +1-812-955-1394
>>> > Web: https://www.vibhatha.org
>>> >
>>> >
>>> >
>>> > On Fri, May 1, 2020 at 1:28 PM Wes McKinney <wesmckinn@gmail.com>
>>> wrote:
>>> >>
>>> >> This part doesn't look correct
>>> >>
>>> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>> >>       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs
>>> >>
>>> >> The usual incantation is
>>> >>
>>> >> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>> >>       -DCMAKE_INSTALL_LIBDIR=lib
>>> >>
>>> >> The reason we pass -DCMAKE_INSTALL_LIBDIR=lib is that some systems
>>> >> will install libraries in lib32 or lib64 instead of just lib
>>> >>
>>> >>
>>> >> On Fri, May 1, 2020 at 10:38 AM Vibhatha Abeykoon <vibhatha@gmail.com>
>>> wrote:
>>> >> >
>>> >> > Hi Neal,
>>> >> >
>>> >> > Yes, I added the flag. But I installed my libs not to /usr/lib, but
>>> >> > to a different folder.
>>> >> >
>>> >> > This is the way I built arrow C++,
>>> >> >
>>> >> > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>>> >> >       -DCMAKE_INSTALL_LIBDIR=/home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs \
>>> >> >       -DARROW_WITH_BZ2=OFF \
>>> >> >       -DARROW_WITH_ZLIB=OFF \
>>> >> >       -DARROW_WITH_ZSTD=OFF \
>>> >> >       -DARROW_WITH_LZ4=OFF \
>>> >> >       -DARROW_WITH_SNAPPY=OFF \
>>> >> >       -DARROW_WITH_BROTLI=OFF \
>>> >> >       -DARROW_PARQUET=ON \
>>> >> >       -DARROW_PYTHON=ON \
>>> >> >       -DARROW_BUILD_TESTS=ON \
>>> >> >       -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3 \
>>> >> >       ..
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > With Regards,
>>> >> > Vibhatha Abeykoon,
>>> >> > Research Assistant,
>>> >> > Intelligent Systems Engineering,
>>> >> > Indiana University Bloomington,
>>> >> > Cell : +1-812-955-1394
>>> >> > Web: https://www.vibhatha.org
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Fri, May 1, 2020 at 11:29 AM Neal Richardson <
>>> neal.p.richardson@gmail.com> wrote:
>>> >> >>
>>> >> >> Hi Vibhatha,
>>> >> >> Did you build Arrow C++ with -DARROW_PYTHON=ON?
>>> >> >>
>>> >> >> Neal
>>> >> >>
>>> >> >> On Fri, May 1, 2020 at 8:24 AM Vibhatha Abeykoon <
>>> vibhatha@gmail.com> wrote:
>>> >> >>>
>>> >> >>> Hi,
>>> >> >>>
>>> >> >>> I am trying to integrate Arrow with an application that I am
>>> >> >>> developing. Here I build Arrow from the source (CPP) and use the API to
>>> >> >>> develop some custom functions to do a scientific calculation after data
>>> >> >>> is loaded with the Arrow table API. On top of this, I develop a Cython
>>> >> >>> API to design a Python API.
>>> >> >>>
>>> >> >>> In the current stage, I have a new necessity where I need to
>>> >> >>> consume Arrow Cython API for my code.
>>> >> >>>
>>> >> >>> Here it was hard to link the built libarrow.so.16 with the
>>> >> >>> libarrow_python.so.16 from the installed pyarrow (separately from pip).
>>> >> >>> What I realised was everything has to be built from the same source, so
>>> >> >>> that I can install pyarrow from the source in my virtual environment.
>>> >> >>>
>>> >> >>> Before going through deeper things, I started by just building
>>> from source (CPP) and then moving towards installing pyarrow from the
>>> source.
>>> >> >>>
>>> >> >>> I tried to follow the guideline from here,
>>> >> >>>
>>> >> >>> https://arrow.apache.org/docs/developers/python.html,
>>> >> >>>
>>> >> >>> But when I found issues in the python build, I followed this
>>> >> >>> source,
>>> >> >>> (but still, I used the clone from the master, not a released
>>> >> >>> version)
>>> >> >>>
>>> >> >>>
>>> https://gist.github.com/heavyinfo/04e1326bb9bed9cecb19c2d603c8d521
>>> >> >>>
>>> >> >>> My environmental variables are as follows,
>>> >> >>>
>>> >> >>> python3 setup.py build_ext --inplace
>>> >> >>> running build_ext
>>> >> >>> -- Running cmake for pyarrow
>>> >> >>> cmake
>>> >> >>> -DPYTHON_EXECUTABLE=/home/vibhatha/sandbox/arrow/repos/arrow/ENVARROW/bin/python3
>>> >> >>> -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off
>>> >> >>> -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off
>>> >> >>> -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=on
>>> >> >>> -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off
>>> >> >>> -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off
>>> >> >>> -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off
>>> >> >>> -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on
>>> >> >>> -DCMAKE_BUILD_TYPE=release /home/vibhatha/sandbox/arrow/repos/arrow/python
>>> >> >>> -- System processor: x86_64
>>> >> >>> -- Arrow build warning level: PRODUCTION
>>> >> >>> Using ld linker
>>> >> >>> Configured for RELEASE build (set with cmake
>>> >> >>> -DCMAKE_BUILD_TYPE={release,debug,...})
>>> >> >>> -- Build Type: RELEASE
>>> >> >>> -- Build output directory:
>>> >> >>> /home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/release
>>> >> >>> -- Arrow version: 0.18.0 (HOME:
>>> >> >>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs)
>>> >> >>> -- Arrow SO and ABI version: 18
>>> >> >>> -- Arrow full SO version: 18.0.0
>>> >> >>> -- Found the Arrow core shared library:
>>> >> >>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>>> >> >>> -- Found the Arrow core import library:
>>> >> >>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.so
>>> >> >>> -- Found the Arrow core static library:
>>> >> >>> /home/vibhatha/sandbox/arrow/repos/arrow/cpp/arrowmylibs/libarrow.a
>>> >> >>> CMake Error at
>>> >> >>> /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146
>>> >> >>> (message):
>>> >> >>>   Could NOT find ArrowPython (missing: ARROW_PYTHON_INCLUDE_DIR) (found
>>> >> >>>   version "0.18.0")
>>> >> >>> Call Stack (most recent call first):
>>> >> >>>
>>> >> >>> /usr/local/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393
>>> >> >>> (_FPHSA_FAILURE_MESSAGE)
>>> >> >>>   cmake_modules/FindArrowPython.cmake:76
>>> >> >>> (find_package_handle_standard_args)
>>> >> >>>   CMakeLists.txt:210 (find_package)
>>> >> >>>
>>> >> >>>
>>> >> >>> -- Configuring incomplete, errors occurred!
>>> >> >>> See also
>>> >> >>> "/home/vibhatha/sandbox/arrow/repos/arrow/python/build/temp.linux-x86_64-3.8/CMakeFiles/CMakeOutput.log".
>>> >> >>> error: command 'cmake' failed with exit status 1
>>> >> >>>
>>> >> >>> Maybe I am missing some step and I am not quite sure what is the
>>> >> >>> issue.
>>> >> >>>
>>> >> >>> Any pointers to solve this issue?
>>> >> >>>
>>> >> >>> With Regards,
>>> >> >>> Vibhatha
>>> >> >>>
>>> >> >>>
>>>
>>
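
Note on the quoted thread: the failures above appear to hinge on what the PyArrow build expects to find under ARROW_HOME. Assuming the layout recommended in the thread (-DCMAKE_INSTALL_PREFIX=$ARROW_HOME together with -DCMAKE_INSTALL_LIBDIR=lib), a small shell check along the following lines could confirm the prefix before running setup.py. This is only a sketch: the function name check_arrow_prefix is hypothetical and not part of Arrow, and the libarrow.so file name assumes a Linux shared-library build.

```shell
#!/bin/sh
# Sketch only: check_arrow_prefix is a hypothetical helper, not an Arrow tool.
# It checks an install prefix for two files under the assumed layout
# (CMAKE_INSTALL_PREFIX=<prefix>, CMAKE_INSTALL_LIBDIR=lib).
check_arrow_prefix() {
    prefix="$1"
    rc=0

    # Header that the FindArrow.cmake error quoted in the thread reports as missing.
    if [ ! -f "$prefix/include/arrow/util/config.h" ]; then
        echo "missing: $prefix/include/arrow/util/config.h"
        rc=1
    fi

    # Core shared library, expected under lib/ in the assumed layout (Linux naming).
    if [ ! -f "$prefix/lib/libarrow.so" ]; then
        echo "missing: $prefix/lib/libarrow.so"
        rc=1
    fi

    return "$rc"
}

# Example (not executed here):
#   check_arrow_prefix "$ARROW_HOME" && echo "layout looks OK"
```

Running the check against $ARROW_HOME before python3 setup.py build_ext would report which of the two files is absent. With an absolute custom libdir such as arrowmylibs, the lib/ check is expected to fail by construction.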

-- 
*First Talent Search & Placement*
*Burke Kaltenberger
<https://www.linkedin.com/in/burke-kaltenberger-3a41731/> | Founder*
*408.458.0071*
