hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7191) Improve yarn-service documentation
Date Wed, 13 Sep 2017 20:24:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165232#comment-16165232 ]

Jian He commented on YARN-7191:

The comments below are from [~aw]; thank you for the suggestions. I'll address them in this
Somewhat. Greatly improved, but there’s still way too much “we’re working on this”
and “here’s a link to a JIRA” and just general brokenness going on.

	Here are some examples from concepts.  Concepts!  The document I’d expect to give me very
basic “when we talk about X, we mean Y” definitions:

"A host of scheduling features are being developed to support long running services.”

	Yeah, ok?  How is this a concept?


	"[YARN-3998](https://issues.apache.org/jira/browse/YARN-3998) implements a retry-policy to
let NM re-launch a service container when it fails.”

	The patch itself went through nine revisions and a long discussion. Would an end user care
about the details in that JIRA?  

	If the answer to the last question is YES, then the documentation has failed.  The whole
point of documentation is so they don’t have to go digging into the details of the implementation,
the decision process that got us there, etc.  If they care enough about the details, they’ll
run through the changelog and click on the JIRA link there.  If the summary line of the changelog
isn’t obvious, well… then we need better summaries.

	etc, etc.


	The sleep example is nice.  Now, let’s see a non-toy example:  multiple instances of Apache
httpd or MariaDB or something real and not from the Hadoop echo chamber (e.g., non-JVM-based).
 If this is for “native” services, this shouldn’t be a problem, right?  Give a real
example and users will buy what you’re selling.  I also think writing the docs and providing
an example of doing something big and outside the team’s comfort zone will clarify where
end users are going to need more help than what’s being provided.  Getting a MariaDB instance
or three up will help tremendously here.

	Which reminds me: something the documentation doesn’t cover is storage. What happens to
it, where does it come from, etc, etc.  That’s an important detail that I didn’t see covered.
(I may have missed it.)


	Why are there directions to enable other, partially unrelated services in here?  Shouldn’t
there be pointers to their specific documentation?  Is the expectation that if the requirements
for those other services change that contributors will need to update multiple documents?

"Start the DNS server”

	Just… yikes.

		a) yarn classname … This is not how we do user-facing things. The fact it’s not really
possible for a *daemon* to be put in the YarnCommands.md doc should be a giant red flag that
something isn’t going correctly here.
		b) no jsvc support for something that the docs strongly hint needs to run privileged
= an instant -1 for failing basic security practices.  There’s zero reason for it to be
running continually as root.
		c) If this had been hooked into the shell scripts appropriately, logs, user switching,
etc. would have come for free.
		d) Where’s stop?  Right. Since it’s outside the scripts, there is no pid support so
one has to do all of that manually….


	 "3. Supports reverse lookups (name based on IP). Note, this works only for Docker containers.”


	"It should not be used as a fully-functional corporate DNS.”

Scratch “corporate.”  It’s not a fully functional DNS server if it can’t do reverse lookups.
(Which, ironically, means it’s not suitable for use with Apache Hadoop, given that Hadoop requires
both forward and reverse DNS ...)
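
As a rough illustration of the forward and reverse DNS requirement mentioned above, here is a minimal Python sketch that checks whether a hostname resolves to an address that maps back to the same name. The function name `forward_and_reverse_agree` and the injectable `resolve`/`reverse` parameters are assumptions made for illustration; they are not part of Hadoop or the YARN DNS server.

```python
import socket


def forward_and_reverse_agree(hostname,
                              resolve=socket.gethostbyname,
                              reverse=socket.gethostbyaddr):
    """Return True if hostname -> IP -> name round-trips back to hostname.

    `resolve` and `reverse` default to the system resolver; they are
    parameters only so the check can be exercised without a network.
    """
    try:
        ip = resolve(hostname)                 # forward lookup (A record)
        name, aliases, _addrs = reverse(ip)    # reverse lookup (PTR record)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False

    def normalize(n):
        # Compare case-insensitively and ignore a trailing root dot.
        return n.lower().rstrip(".")

    candidates = {normalize(name)} | {normalize(a) for a in aliases}
    return normalize(hostname) in candidates
```

A DNS server that cannot answer the reverse lookup for a given host would make this check return False for that host, which is the situation the comment above objects to.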

> Improve yarn-service documentation
> ----------------------------------
>                 Key: YARN-7191
>                 URL: https://issues.apache.org/jira/browse/YARN-7191
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
