hawq-dev mailing list archives

From "William Forson (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1210) Documentation regarding usage of libhdfs3 in concurrent environment
Date Fri, 09 Dec 2016 22:12:58 GMT
William Forson created HAWQ-1210:

             Summary: Documentation regarding usage of libhdfs3 in concurrent environment
                 Key: HAWQ-1210
                 URL: https://issues.apache.org/jira/browse/HAWQ-1210
             Project: Apache HAWQ
          Issue Type: Bug
          Components: libhdfs
            Reporter: William Forson
            Assignee: Lei Chang


I've been using libhdfs3 in a single-threaded environment for several months now, without
any problems. However, as soon as I tried using the library concurrently from multiple threads:
hello, segfaults.

Although the source of these segfaults is annoyingly subtle, I've managed to isolate it to
a relatively small block of my code that does nothing interesting aside from using libhdfs3
to download a single hdfs file.

To be clear: I assume that the mistake here is mine -- that is, that I am using your library
incorrectly. However, I have been unable to find any documentation as to how the libhdfs3
API _should_ be used in a multi-threaded environment. I initially interpreted this to mean,
"go to town, it's all more or less threadsafe", but I am now questioning that interpretation.

So, I have a question and a request.

Question: Are there any known, non-obvious concurrency gotchas regarding the usage of libhdfs3
(or whatever it's currently called)?

Request: Could you please add some documentation, to the README and/or hdfs.h, regarding usage
in a concurrent environment? (ideally, such notes would annotate individual components of
the API in hdfs.h, but if the answer to my question above is, "No", then this could perhaps
be a single sentence in the README which affirmatively states that the library is generally
safe for concurrent usage without additional/explicit synchronization -- anything would be
better than nothing :))
