One thing I might try is using the javabridge pip module. It has a basic search mechanism for finding libjvm.so, and perhaps you could replicate this or specify the location of libjvm in the options to the Hadoop filesystem.
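A rough, untested sketch of what replicating that search could look like on Windows (where the JVM library is jvm.dll rather than libjvm.so): it looks for jvm.dll under JAVA_HOME and prepends its directory to PATH before pyarrow tries to start the JVM.

import glob
import os

# Look for jvm.dll under JAVA_HOME (a crude stand-in for javabridge's
# search) and make its directory visible to the Windows loader before
# pyarrow loads the JVM.
java_home = os.environ["JAVA_HOME"]
candidates = glob.glob(os.path.join(java_home, "**", "jvm.dll"), recursive=True)
if not candidates:
    raise RuntimeError("jvm.dll not found under " + java_home)
os.environ["PATH"] = os.path.dirname(candidates[0]) + os.pathsep + os.environ["PATH"]

from pyarrow import fs
hdfs = fs.HadoopFileSystem('localhost', port=9000)

If the DLL is found but still fails to load, a 32-bit/64-bit mismatch between the Python interpreter and the JDK is another thing worth checking.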


On Mon, Mar 22, 2021 at 6:35 AM Wes McKinney <wesmckinn@gmail.com> wrote:
I have never developed or tested the HDFS integration on Windows (we don't test it in our CI either), so we would need to see if there is someone reading who has used it successfully to try to help, or a developer who wants to dig in and try to get it to work themselves (fixing anything that pops up along the way).

On Mon, Mar 22, 2021 at 7:10 AM 황세규 <gladiator67@naver.com> wrote:

Hello,

My name is Joseph Hwang. I am a developer in South Korea. 

I am trying to develop a Hadoop file system client application with pyarrow 3 on Windows 10. First, my development environment is as follows:

OS : Windows 10
Language : Anaconda 2020.11
IDE : Eclipse

And my environment variables are:

JAVA_HOME : C:\Program Files\Java\jdk-11.0.10
HADOOP_HOME : C:\hadoop-3.3.0
ARROW_LIBHDFS_DIR : C:\hadoop-3.3.0\lib\native
CLASSPATH = 'hdfs classpath --glob'
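
(By this I mean that CLASSPATH should hold the expanded output of the "hdfs classpath --glob" command, not the literal string. A rough, untested sketch of setting it from Python before pyarrow starts the JVM, assuming the hdfs command is on PATH, is:)

import os
import subprocess

# CLASSPATH needs the expanded Hadoop jar list; run the command and
# export its output before pyarrow brings up the JVM.
classpath = subprocess.check_output("hdfs classpath --glob", shell=True)
os.environ["CLASSPATH"] = classpath.decode().strip()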

 

Here is my short Python code with pyarrow:

from pyarrow import fs

hdfs = fs.HadoopFileSystem('localhost', port=9000)

 

But I cannot connect to my Hadoop file system. The error raised is:

hdfs = fs.HadoopFileSystem('localhost', port=9000)

  File "pyarrow\_hdfs.pyx", line 83, in pyarrow._hdfs.HadoopFileSystem.__init__

  File "pyarrow\error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status

  File "pyarrow\error.pxi", line 99, in pyarrow.lib.check_status

OSError: Unable to load libjvm:

 

I think my code has a problem with the Java configuration, but I have no idea how to correct this error.

Kindly advise me on how to correct this error. Thank you for reading my e-mail.

Best regards.