hadoop-yarn-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-8275) Create a JNI interface to interact with Windows
Date Mon, 14 May 2018 06:02:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16473789#comment-16473789 ]

Allen Wittenauer edited comment on YARN-8275 at 5/14/18 6:01 AM:
-----------------------------------------------------------------

bq.  I am planning to code everything in Commons to be used from YARN and HDFS.

The umbrella JIRA should really start out in HADOOP so that people aren't taken by surprise.
I suspect any YARN- and HDFS-specific code will be relatively tiny, since winutils is used all
over the place, including in the client code.

That fact probably makes ...

bq. a long running native process communicating with YARN over pipe

almost certainly a non-starter: never mind the security concerns, it would greatly increase
the complexity for likely very little gain.

The other thing to keep in mind is that winutils pre-dates Java 7.  Things like symlinks can
now be done with Java APIs.  No C required.  I'd highly recommend starting with replacing
the winutils calls with Java API calls first and then digging into something more complex
later.  [The Unix versions of those same calls will likely get a speed bump too.]
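
As a rough illustration of the point above, a symlink can be created with the java.nio.file API that shipped in Java 7, with no forked helper process. This is only a minimal sketch: the class name SymlinkSketch, the helper createLink, and the temp-file paths are made up for the example and are not taken from the Hadoop code base. On Windows the call may still fail if the process lacks the privilege to create symbolic links.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkSketch {
    /** Hypothetical helper: creates link -> target through the NIO API. */
    static Path createLink(Path link, Path target) throws IOException {
        // Throws UnsupportedOperationException if the file system has no
        // symlink support, and an IOException if the link cannot be created
        // (for example, because of missing privileges).
        return Files.createSymbolicLink(link, target);
    }

    public static void main(String[] args) throws IOException {
        // Illustrative paths only; a real caller would pass its own.
        Path dir = Files.createTempDirectory("symlink-sketch");
        Path target = Files.createFile(dir.resolve("target.txt"));
        Path link = createLink(dir.resolve("link.txt"), target);
        System.out.println(link + " -> " + Files.readSymbolicLink(link));
    }
}
```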

---

Before I forget: from a "what gets run on the maven command line" perspective, there is very
little difference between libhadoop (JNI) and winutils.  Windows *always* requires (and thus
triggers) -Pnative.

I suspect the direction was set because winutils was added when libhadoop was still being
built by autoconf.  But now that cmake is there and works properly on Windows (at least in
3.x), it'd be nice to place the core of winutils into libhadoop and just keep winutils as
a wrapper to use for debugging.  This might also move us away from using MSBuild, which would
greatly simplify the build process.



> Create a JNI interface to interact with Windows
> -----------------------------------------------
>
>                 Key: YARN-8275
>                 URL: https://issues.apache.org/jira/browse/YARN-8275
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Giovanni Matteo Fumarola
>            Assignee: Giovanni Matteo Fumarola
>            Priority: Major
>         Attachments: WinUtils-Functions.pdf, WinUtils.CSV
>
>
> I did a quick investigation of the performance of WinUtils in YARN. On average, the NM calls
> WinUtils 4.76 times per second and 65.51 times per container.
>  
> | |Requests|Requests/sec|Requests/min|Requests/container|
> |*Sum [WinUtils]*|*135354*|*4.761*|*286.160*|*65.51*|
> |[WinUtils] Execute -help|4148|0.145|8.769|2.007|
> |[WinUtils] Execute -ls|2842|0.0999|6.008|1.37|
> |[WinUtils] Execute -systeminfo|9153|0.321|19.35|4.43|
> |[WinUtils] Execute -symlink|115096|4.048|243.33|57.37|
> |[WinUtils] Execute -task isAlive|4115|0.144|8.699|2.05|
>  Interval: 7 hours, 53 minutes and 48 seconds
> Each execution of WinUtils does around *140 IO ops*, of which 130 are DDL ops.
> This means *666.58* IO ops/second due to WinUtils.
> We should start considering removing WinUtils from Hadoop and creating a JNI interface.
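
The rates quoted in the description can be re-derived from the table and the stated interval. The following is a small sketch of that arithmetic only; the class and method names are invented for the example, and the figure of 140 IO ops per WinUtils execution is taken as given from the description.

```java
public class WinUtilsRates {
    /** Hypothetical helper: average requests per second over an interval. */
    static double perSecond(long requests, long intervalSeconds) {
        return (double) requests / intervalSeconds;
    }

    public static void main(String[] args) {
        // Interval from the description: 7 hours, 53 minutes and 48 seconds.
        long intervalSeconds = 7 * 3600 + 53 * 60 + 48;

        // Sum of the per-command request counts from the table.
        long totalRequests = 4148 + 2842 + 9153 + 115096 + 4115;

        double callsPerSecond = perSecond(totalRequests, intervalSeconds);

        // The description reports around 140 IO ops per WinUtils execution.
        double ioOpsPerSecond = callsPerSecond * 140;

        System.out.println("WinUtils calls/sec: " + callsPerSecond);
        System.out.println("IO ops/sec:         " + ioOpsPerSecond);
    }
}
```

With these inputs the results land at roughly 4.76 calls per second and roughly 666.6 IO ops per second, in line with the figures in the description.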



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


