hadoop-hdfs-issues mailing list archives

From "Bob Hansen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client
Date Wed, 25 Mar 2015 13:23:56 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379854#comment-14379854 ]

Bob Hansen commented on HDFS-6994:

We are evaluating libhdfs3 for use in our product. Building on an asynchronous core, with
synchronous shims on top, is a very solid approach that has shown very good performance
and scalability properties in the past.
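
As a minimal sketch of that layering (written in Python for brevity; it is not libhdfs3 or libhdfspp code), the asynchronous function below is the core and the synchronous function is a thin shim over it. The names read_async and read_sync, the example path, and the placeholder payload are all assumptions made for illustration.

```python
import asyncio


async def read_async(path: str, offset: int, length: int) -> bytes:
    """Hypothetical asynchronous core: yields to the event loop while 'waiting' on I/O."""
    await asyncio.sleep(0.001)  # stands in for a non-blocking network read
    return bytes(length)        # placeholder payload: `length` zero bytes


def read_sync(path: str, offset: int, length: int) -> bytes:
    """Synchronous shim: runs the async core to completion and returns its result."""
    return asyncio.run(read_async(path, offset, length))


if __name__ == "__main__":
    data = read_sync("/example/file", 0, 16)
    print(len(data))
```

Callers that want blocking semantics use the shim, while callers that can manage many outstanding operations call the core directly.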

For our use case, we're especially interested in the ability to issue thousands of read requests
to hundreds of machines simultaneously without dedicating thousands of threads.  On machines
with lots of disks and NCQ, we've seen this reduce our small-read latency by an order of magnitude.
The async pattern of issuing all the requests at the start of the process and consuming the
results as they arrive is very powerful in increasing throughput.  We will frequently have
3000-10000 outstanding reads across a cluster (frequently satisfied from the disk cache on the
various HDFS nodes), and we need a client that can keep up with the load.
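
The issue-everything-then-consume pattern described above can be sketched as follows. This is an illustration only, written in Python with asyncio rather than against any HDFS client: fake_read, read_all, the simulated delays, and the payload format are assumptions, and a real client would replace the sleep with non-blocking socket I/O.

```python
import asyncio
import random
from typing import List, Tuple


async def fake_read(block_id: int, delay: float) -> Tuple[int, bytes]:
    """Hypothetical stand-in for one remote read; the sleep simulates latency."""
    await asyncio.sleep(delay)
    return block_id, f"data-{block_id}".encode()


async def read_all(num_reads: int, seed: int = 0) -> List[int]:
    """Issue every read up front, then handle results in completion order."""
    rng = random.Random(seed)
    # Each outstanding read is a lightweight task on one event loop, not a thread.
    tasks = [
        asyncio.create_task(fake_read(i, rng.uniform(0.001, 0.02)))
        for i in range(num_reads)
    ]
    completion_order = []
    # as_completed yields awaitables as the underlying reads finish,
    # so fast reads are consumed without waiting for slow ones.
    for finished in asyncio.as_completed(tasks):
        block_id, _payload = await finished
        completion_order.append(block_id)
    return completion_order


def run(num_reads: int = 3000) -> List[int]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(read_all(num_reads))


if __name__ == "__main__":
    order = run()
    print(f"completed {len(order)} reads; first finished: block {order[0]}")
```

All reads are in flight before the first result is consumed, which is what keeps many disks and nodes busy at once without one thread per request.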

We've spoken with Haohui about his approach in libhdfspp, and we think it will be a solid
basis for a C++ client going forward.

> libhdfs3 - A native C/C++ HDFS client
> -------------------------------------
>                 Key: HDFS-6994
>                 URL: https://issues.apache.org/jira/browse/HDFS-6994
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs-client
>            Reporter: Zhanwei Wang
>            Assignee: Zhanwei Wang
>         Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch
> Hi All
> I just got permission to open source libhdfs3, which is a native C/C++ HDFS client
> based on the Hadoop RPC protocol and the HDFS Data Transfer Protocol.
> libhdfs3 provides a libhdfs-style C interface and a C++ interface. It supports both Hadoop
> RPC versions 8 and 9, as well as Namenode HA and Kerberos authentication.
> libhdfs3 is currently used by HAWQ of Pivotal.
> I'd like to integrate libhdfs3 into the HDFS source code to benefit others.
> You can find the libhdfs3 code on GitHub:
> https://github.com/PivotalRD/libhdfs3
> http://pivotalrd.github.io/libhdfs3/

This message was sent by Atlassian JIRA
