hadoop-common-issues mailing list archives

From "Binglin Chang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10388) Pure native hadoop client
Date Tue, 01 Apr 2014 06:28:19 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956157#comment-13956157 ]

Binglin Chang commented on HADOOP-10388:
----------------------------------------

bq. We can even make the XML-reading code optional if you want.
Sure, adding XML support for compatibility is fine. But to keep strict compatibility we
may need to support all javax XML / Hadoop config features, and I doubt libexpat/libxml2
support all of those. A lot of effort may be spent on this, so I think it is better to make
it optional and do it later.

bq. Thread pools and async I/O, I'm afraid, are something we can't live without.
I also prefer to use async I/O and threads for performance reasons. The code I published
on github already has a working HDFS client with read/write, and HDFSOuputstream uses an
additional thread.
What I was saying is that the use of more threads should be limited. In the Java client,
simply reading/writing an HDFS file uses too many threads (RPC socket read/write, data
transfer socket read/write, other misc executors, lease renewer, etc.). Since we use async
I/O, the number of threads should be greatly reduced.
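
As a minimal sketch of the idea (Python for brevity; this is not the code published on github, and the `drain_all` helper and the `rpc`/`data` labels are hypothetical names chosen for illustration), a single thread can service several sockets through a readiness selector instead of dedicating a reader thread to each connection:

```python
import selectors
import socket


def drain_all(socks):
    """Read every socket in `socks` to EOF from one thread.

    `socks` maps a label to a connected socket; returns {label: bytes}.
    """
    sel = selectors.DefaultSelector()
    received = {}
    for label, sock in socks.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, data=label)
        received[label] = b""

    open_count = len(socks)
    while open_count:
        # One blocking call waits on all registered sockets at once.
        for key, _events in sel.select():
            chunk = key.fileobj.recv(4096)
            if chunk:
                received[key.data] += chunk
            else:
                # Empty read means the peer closed its end.
                sel.unregister(key.fileobj)
                open_count -= 1
    sel.close()
    return received


def demo():
    # In-process socket pairs stand in for an RPC connection and a
    # data-transfer connection (assumption: illustration only).
    rpc_local, rpc_peer = socket.socketpair()
    data_local, data_peer = socket.socketpair()
    rpc_peer.sendall(b"rpc-response")
    data_peer.sendall(b"block-data")
    rpc_peer.close()
    data_peer.close()
    result = drain_all({"rpc": rpc_local, "data": data_local})
    rpc_local.close()
    data_local.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Both connections are read by the calling thread alone; a native client would presumably use the equivalent OS facility (for example poll or epoll) rather than Python's `selectors` module.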


> Pure native hadoop client
> -------------------------
>
>                 Key: HADOOP-10388
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10388
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: HADOOP-10388
>            Reporter: Binglin Chang
>            Assignee: Colin Patrick McCabe
>
> A pure native hadoop client has the following use cases/advantages:
> 1.  writing YARN applications using C++
> 2.  direct access to HDFS, without extra proxy overhead, compared to the web/nfs interface.
> 3.  wrapping the native library to support more languages, e.g. python
> 4.  lightweight, small footprint compared to several hundred MB of JDK and hadoop library
> with various dependencies.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
