hadoop-hdfs-issues mailing list archives

From "Zhanwei.Wang (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2656) Implement a pure c client based on webhdfs
Date Sun, 01 Apr 2012 05:54:57 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243664#comment-13243664 ]

Zhanwei.Wang commented on HDFS-2656:
------------------------------------


Hi donal,
Good question. Performance is an important issue, and the lib needs to be designed and
implemented carefully.

On the lib side, I use libcurl to handle the HTTP protocol and a buffer inside the lib to
optimize performance. We used the same design in another of our projects, and the performance
of libcurl was OK.
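
To illustrate the buffering idea, here is a minimal sketch, not the actual code of the lib.
It assumes the buffer is filled from a callback with the signature libcurl expects for
CURLOPT_WRITEFUNCTION. The names read_buf and buffer_write_cb are hypothetical, and the
4096-byte initial capacity is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; the name and layout are illustrative only. */
struct read_buf {
    char  *data;  /* heap storage, NULL until first use */
    size_t len;   /* bytes currently stored */
    size_t cap;   /* bytes allocated */
};

/* Has the signature libcurl expects for a CURLOPT_WRITEFUNCTION callback.
 * Appends the received bytes to the read_buf passed as userdata.
 * Returns the number of bytes consumed; returning a different count
 * tells libcurl to abort the transfer. */
size_t buffer_write_cb(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct read_buf *buf = userdata;
    size_t n = size * nmemb;

    assert(buf != NULL);
    if (n == 0)
        return 0;

    if (n > buf->cap - buf->len) {
        /* Grow by doubling so repeated appends stay cheap on average. */
        size_t new_cap = buf->cap ? buf->cap : 4096;
        while (new_cap - buf->len < n)
            new_cap *= 2;
        char *p = realloc(buf->data, new_cap);
        if (p == NULL)
            return 0; /* short count signals an error to the caller */
        buf->data = p;
        buf->cap = new_cap;
    }

    memcpy(buf->data + buf->len, ptr, n);
    buf->len += n;
    return n;
}
```

With libcurl, such a callback would be registered through curl_easy_setopt using
CURLOPT_WRITEFUNCTION, with a pointer to the buffer passed via CURLOPT_WRITEDATA.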

For the transmission, HTTP uses a TCP connection. When reading from the server, only the raw
data is transferred. When writing to the server, I use "chunked" transfer encoding, and the
overhead is just a small header per chunk.
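
To make the per-chunk overhead concrete, here is a small sketch of HTTP/1.1 chunk framing
(hex size, CRLF, data, CRLF). The function names chunk_overhead and chunk_frame are
hypothetical, and an HTTP library would normally do this framing itself; the code only
shows how many extra bytes each chunk carries. For a 64 KiB chunk the framing is 9 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bytes of framing added around one chunk of `len` payload bytes:
 * the hex size digits plus CRLF before the data, and CRLF after it. */
size_t chunk_overhead(size_t len)
{
    size_t digits = 1;
    while (len >= 16) {
        len /= 16;
        digits++;
    }
    return digits + 2 + 2;
}

/* Writes "<hex len>\r\n<data>\r\n" into out.
 * Returns the number of bytes written, or 0 if out is too small.
 * With len == 0 this produces the terminating chunk "0\r\n\r\n". */
size_t chunk_frame(const void *data, size_t len, char *out, size_t out_cap)
{
    assert(out != NULL);

    size_t total = len + chunk_overhead(len);
    if (total > out_cap)
        return 0;

    /* snprintf also writes a NUL byte, which the payload then overwrites;
     * total >= head + 2, so the NUL always fits. */
    int head = snprintf(out, out_cap, "%zx\r\n", len);
    if (head < 0)
        return 0;

    if (len > 0)
        memcpy(out + head, data, len);
    memcpy(out + head + len, "\r\n", 2);
    return total;
}
```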

On the server side, performance depends on the Jetty server. In the previous prototype,
the Jetty server or webhdfs had a performance problem when I used HTTP/1.1 to read data from
the server, but I could not reproduce the problem after switching to HTTP/1.0.

I ran a simple performance test on the previous prototype, and more performance testing is
planned.

Currently, writing to hdfs may still fail under heavy workload. I am not sure whether it is
a bug in my code or in hdfs; I am working on it (it seems not to be my bug -_-). The doc is
still being written, and the function test is finished. As soon as I get permission to open
source the code and finish the doc, you can test it yourself. I think it will not take too long.

                
> Implement a pure c client based on webhdfs
> ------------------------------------------
>
>                 Key: HDFS-2656
>                 URL: https://issues.apache.org/jira/browse/HDFS-2656
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Zhanwei.Wang
>
> Currently, the implementation of libhdfs is based on JNI. The overhead of JVM seems a
> little big, and libhdfs can also not be used in the environment without hdfs.
> It seems a good idea to implement a pure c client by wrapping webhdfs. It also can be
> used to access different version of hdfs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
