I have never developed or tested the HDFS integration on Windows, and we don't test it in our CI either. We would need someone reading this who has used it successfully to help, or a developer willing to dig in and get it working themselves (fixing anything that pops up along the way).
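
In case it helps narrow things down, here is a rough, untested-on-Windows sketch of a sanity check for the environment variables mentioned below. It is not how Arrow itself locates the libraries. The helper name `check_hdfs_env`, the list of library file names, and the recursive search are all assumptions made for illustration.

```python
import os
from pathlib import Path

# Assumed file names of the JVM and libhdfs shared libraries per platform.
JVM_NAMES = ("jvm.dll", "libjvm.so", "libjvm.dylib")
HDFS_NAMES = ("hdfs.dll", "libhdfs.so", "libhdfs.dylib")


def _contains(root, names):
    """Return True if a file with one of `names` exists anywhere under `root`."""
    if not root:
        return False
    root = Path(root)
    if not root.is_dir():
        return False
    return any(p.name in names for p in root.rglob("*") if p.is_file())


def check_hdfs_env(env=None):
    """Hypothetical helper: return a list of likely configuration problems."""
    env = os.environ if env is None else env
    problems = []

    # The JVM shared library is expected somewhere under JAVA_HOME.
    java_home = env.get("JAVA_HOME", "")
    if not _contains(java_home, JVM_NAMES):
        problems.append("JAVA_HOME: no JVM shared library found under %r" % java_home)

    # Fall back to HADOOP_HOME/lib/native when ARROW_LIBHDFS_DIR is unset.
    lib_dir = env.get("ARROW_LIBHDFS_DIR", "")
    if not lib_dir and env.get("HADOOP_HOME"):
        lib_dir = os.path.join(env["HADOOP_HOME"], "lib", "native")
    if not _contains(lib_dir, HDFS_NAMES):
        problems.append("libhdfs: no hdfs shared library found under %r" % lib_dir)

    # CLASSPATH should hold the *output* of the classpath command,
    # not the command text itself.
    classpath = env.get("CLASSPATH", "")
    if not classpath or "classpath --glob" in classpath:
        problems.append("CLASSPATH: empty, or holds the command instead of its output")

    return problems


if __name__ == "__main__":
    for problem in check_hdfs_env():
        print(problem)
```

An empty result only means the files and variables look plausible. It does not prove that the libraries can actually be loaded.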

On Mon, Mar 22, 2021 at 7:10 AM 황세규 <gladiator67@naver.com> wrote:

Hello dear. 

My name is Joseph Hwang. I am a developer in South Korea. 

I am trying to develop a Hadoop file system client application with pyarrow 3 on Windows 10. My development environment is as follows:


OS : Windows 10

Language : Anaconda 2020.11

IDE : eclipse


And my environment variables are


JAVA_HOME : C:\Program Files\Java\jdk-11.0.10

HADOOP_HOME : C:\hadoop-3.3.0

ARROW_LIBHDFS_DIR : C:\hadoop-3.3.0\lib\native 

CLASSPATH = 'hdfs classpath --glob'


This is my short Python code using pyarrow:


from pyarrow import fs

hdfs = fs.HadoopFileSystem('localhost', port=9000)


But I cannot connect to my Hadoop file system. The error raised is:


hdfs = fs.HadoopFileSystem('localhost', port=9000)

  File "pyarrow\_hdfs.pyx", line 83, in pyarrow._hdfs.HadoopFileSystem.__init__

  File "pyarrow\error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status

  File "pyarrow\error.pxi", line 99, in pyarrow.lib.check_status

OSError: Unable to load libjvm:


I think my code has a problem with the Java configuration, but I have no idea how to correct this error.

Kindly let me know your advice on how to fix it. Thank you for reading my e-mail.

Best regards.