hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS writer service discovery
Date Wed, 25 Feb 2015 16:27:06 GMT

    [ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336707#comment-14336707 ]

Junping Du commented on YARN-3039:

Thanks [~Naganarasimha] and [~rkanter] for review and comments!

bq. I feel AM should be informed of AggregatorAddr as early as register itself than currently
being done in ApplicationMasterService.allocate().
That's a good point. Another idea (from Vinod in an offline discussion) is to add a blocking
call in AMRMClient to get the aggregator address directly from the RM. AMRMClient can then be
wrapped inside TimelineClient, so that an unknown aggregator address or an aggregator failure
can be handled transparently.
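To make the idea concrete, here is a rough sketch of what such a wrapper could look like. All class and method names below (AggregatorResolver, TimelineClientSketch, resolveAggregatorAddr, etc.) are hypothetical stand-ins, not the actual YARN APIs; the point is only that the caller never deals with the aggregator address or its failover directly:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the proposed blocking AMRMClient call to the RM.
interface AggregatorResolver {
  String resolveAggregatorAddr();
}

class TimelineClientSketch {
  private final AggregatorResolver resolver;
  private final AtomicReference<String> aggregatorAddr = new AtomicReference<>();

  TimelineClientSketch(AggregatorResolver resolver) {
    this.resolver = resolver;
  }

  // Callers never see the aggregator address: it is fetched lazily via the
  // blocking resolver call, and re-fetched once if a post fails, so both
  // discovery and aggregator failover stay transparent.
  void putEntities(String entity) {
    String addr = aggregatorAddr.get();
    if (addr == null) {
      addr = resolver.resolveAggregatorAddr();  // blocking call to the RM
      aggregatorAddr.set(addr);
    }
    try {
      post(addr, entity);
    } catch (RuntimeException e) {
      // The aggregator may have failed over to another NM; re-resolve and retry.
      addr = resolver.resolveAggregatorAddr();
      aggregatorAddr.set(addr);
      post(addr, entity);
    }
  }

  String currentAggregatorAddr() {
    return aggregatorAddr.get();
  }

  private void post(String addr, String entity) {
    // A real client would send the entity to the aggregator at addr here.
  }
}
```

With this shape, both the AM and the NMs could use the same client and get service discovery for free.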

bq. For NM's too, would it be better to update during registering itself (may be recovered
during recovery, not sure though) thoughts?
I think the NM case is slightly different here: the NM needs this knowledge only once the first
container of the app gets allocated/launched on it, so getting it updated in the heartbeat sounds
good enough. Isn't it? In addition, if adding a new API in AMRMClient can be accepted, the NM
will use TimelineClient too and can handle service discovery automatically.

bq. Was not clear about source of RMAppEventType.AGGREGATOR_UPDATE. Based on YARN-3030 (Aggregators
collection through NM's Aux service), PerNodeAggregatorServer(Aux service) launches AppLevelAggregatorService,
so will AppLevelAggregatorService inform RM about the aggregator for the application? and
then RM will inform NM about the appAggregatorAddr as part of heart beat response ? if this
is the flow will there be chances of race condition where in before NM gets appAggregatorAddr
from RM, NM might require to post some AM container Entities/events?
I think we can discuss this flow in two scenarios: the first-time launch of the app aggregator,
and the app aggregator failing over to another NM.
For the first-time launch, the NM aux service binds the app aggregator to the perNodeAggregator
when the AM container gets allocated (per YARN-3030). The NM then notifies the RM that this new
appAggregator is ready for use in its next heartbeat (missing in this patch). After receiving
this message from the NM, the RM updates its aggregator list and sends RMAppEventType.AGGREGATOR_UPDATE
to trigger persisting the updated aggregator list in the RMStateStore (for RM failover).
For app aggregator failover, the AM or the NMs (whoever called putEntities with a timelineClient)
will notify the RM of the failure. The RM first verifies that this app aggregator is out of
service, then kicks off rebinding the appAggregator to another NM's perNodeAggregatorService
when that NM's next heartbeat comes in. When it hears back from this new NM, the RM does the
same thing as in the first case.
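The RM-side handling is the same in both scenarios, so it could be sketched as one handler. Everything below is a hypothetical illustration of the flow described above (the class, method, and field names are made up; only RMAppEventType.AGGREGATOR_UPDATE comes from the patch):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the RM-side flow: an NM heartbeat reports that an
// app aggregator is ready (first launch) or rebound (failover), and the RM
// updates its list and persists the binding for RM failover.
class RmAggregatorTracker {
  enum RMAppEventType { AGGREGATOR_UPDATE }  // event name from the patch

  final Map<String, String> appToAggregatorAddr = new ConcurrentHashMap<>();
  String lastPersistedEvent;

  // Called for both scenarios: first-time launch and rebind after failover.
  void onAggregatorReported(String appId, String aggregatorAddr) {
    appToAggregatorAddr.put(appId, aggregatorAddr);
    // Fire AGGREGATOR_UPDATE so the new binding is written to the
    // RMStateStore and survives an RM failover.
    persist(RMAppEventType.AGGREGATOR_UPDATE, appId);
  }

  private void persist(RMAppEventType type, String appId) {
    lastPersistedEvent = type + ":" + appId;  // stand-in for an RMStateStore write
  }
}
```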
One gap here today is that we launch the appAggregatorService (via the NM's auxiliary service)
whenever the AM container gets launched, no matter whether it is the first launch or a reschedule
after a failure. As in my earlier comments above, an AM container that fails over and is
rescheduled to another NM may not have to cause a rebind of the aggregator service, just as the
app's aggregator going out of service may not cause the AM container to get killed. So I think
the appAggregatorService should get launched by the NM automatically only for the first attempt,
and be taken care of by the RM for subsequent attempts.
About the race condition between the NM heartbeat and posting entities: I don't think posting
entities should block any major logic, especially the NM heartbeat. In addition, if we make
TimelineClient handle service discovery automatically, this will never happen. What do you think?

bq. Sorry for not commenting earlier. Thanks for taking this up Junping Du.
No worries. Thanks!

bq. Not using YARN-913 is fine if it's not going to make sense. I haven't looked too closely
at it either; it just sounded like it might be helpful here.
Agree. My feeling now is that service discovery is tightly coupled with service lifecycle
management. Our app aggregator service does not live inside a dedicated container but has many
deployment options, and its consumers include YARN components, not only the AM. So I think
YARN-913 may not be the best fit at this moment.
[~stevel@apache.org] is the main author of YARN-913. Steve, do you have any comments here?

bq. Given that a particular NM is only interested in the Applications that are running on
it, is there some way to have it only receive the aggregator info for those apps? This would
decrease the amount of "throw away" data that gets sent.
In the current patch, the RM only sends an NM the aggregator list for the active apps on that
node. Please check the code in ResourceTrackerService:
+    ConcurrentMap<ApplicationId, String> liveAppAggregatorsMap = new 
+        ConcurrentHashMap<ApplicationId, String>();
+    List<ApplicationId> keepAliveApps = remoteNodeStatus.getKeepAliveApplications();
+    if (keepAliveApps != null) {
+      ConcurrentMap<ApplicationId, RMApp> rmApps = rmContext.getRMApps();
+      for (ApplicationId appId : keepAliveApps) {
+        String appAggregatorAddr = rmApps.get(appId).getAggregatorAddr();
+        if (appAggregatorAddr != null) {
+          liveAppAggregatorsMap.put(appId, appAggregatorAddr);
+        } else {
+          // Log a debug info if aggregator address is not found.
+          if (LOG.isDebugEnabled()) {
+            LOG.debug("Aggregator for application: " + appId + " hasn't registered yet!");
+          }
+        }
+      }
+      nodeHeartBeatResponse.setAppAggregatorsMap(liveAppAggregatorsMap);
+    }
In addition, as a reply to [~zjshen]'s comments above, we could even improve this so that only
the appAggregators an NM is interested in get sent, assuming the NM can detect the failure of
an aggregator (which may not hold if we wrap everything in TimelineClient, with the new API in
AMRMClient to retrieve the aggregator address).

bq. Also, can you update the design doc? Looking at the patch, it seems like some things have
changed. (e.g. it's using protobufs instead of REST; which I think makes more sense here anyway).
I will. Many things need to be updated, and many details are becoming clearer.

> [Aggregator wireup] Implement ATS writer service discovery
> ----------------------------------------------------------
>                 Key: YARN-3039
>                 URL: https://issues.apache.org/jira/browse/YARN-3039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Junping Du
>         Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, YARN-3039-no-test.patch
> Per design in YARN-2928, implement ATS writer service discovery. This is essential for
off-node clients to send writes to the right ATS writer. This should also handle the case
of AM failures.

This message was sent by Atlassian JIRA
