hadoop-yarn-dev mailing list archives

From Li Lu <...@hortonworks.com>
Subject Re: [Timeline V2 branch] Latest timeline v2 and SMP problem
Date Tue, 15 Dec 2015 19:00:33 GMT
Thanks Varun and Naga! I verified locally that the V2 publisher introduced in YARN-4129 caused
this problem. I’ll open a JIRA and post a quick fix right away. Thanks for the information!

Li Lu

On Dec 14, 2015, at 21:38, Naganarasimha G R (Naga) <garlanaganarasimha@huawei.com> wrote:

Hi Varun & Li,

Yes, Varun, the most likely reason is what you mentioned. The registration has to be done in
serviceInit, which is taken care of in the V1 Publisher but was missed in the V2 Publisher.
The entire logic present in serviceStart of V2Publisher should be moved to serviceInit.

But I was wondering: for which event/entity did this happen? Was it in RM recovery mode?

Regards,
+ Naga

________________________________
From: Varun Saxena [vsaxena.varun@gmail.com]
Sent: Tuesday, December 15, 2015 10:48
To: Li Lu
Cc: yarn-dev@hadoop.apache.org; Sangjin Lee; Junping Du; Vrushali Channapattan; Joep Rottinghuis; Naganarasimha G R (Naga)
Subject: Re: [Timeline V2 branch] Latest timeline v2 and SMP problem

Hi Li,

This is because we are registering the event type in serviceStart() instead of serviceInit().
As SMP is the last service in the list, it is started right at the end, i.e. even after all the
RPC and UI related stuff.

This can cause an app flow to start before the SMP/V2Publisher service has even started. This
is what causes the issue.
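
To make the ordering problem concrete, here is a minimal, self-contained sketch. It does not use the Hadoop classes: ToyDispatcher, ToyPublisher, eventSurvivesStartup and the "SYSTEM_METRICS" key are hypothetical names made up for illustration, and the two-phase init()/start() lifecycle is only assumed to mirror serviceInit()/serviceStart(). It simulates an event arriving after all services are initialized but before the last service has been started.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class HandlerRegistrationSketch {

    /** Toy stand-in for an event dispatcher (hypothetical, not AsyncDispatcher). */
    static class ToyDispatcher {
        private final Map<String, Consumer<String>> handlers = new HashMap<>();

        void register(String eventType, Consumer<String> handler) {
            handlers.put(eventType, handler);
        }

        void dispatch(String eventType, String payload) {
            Consumer<String> handler = handlers.get(eventType);
            if (handler == null) {
                // Analogous to the FATAL error in the log: nobody registered yet.
                throw new IllegalStateException("No handler registered for " + eventType);
            }
            handler.accept(payload);
        }
    }

    /** Toy publisher with a two-phase lifecycle (hypothetical). */
    static class ToyPublisher {
        private final ToyDispatcher dispatcher;
        private final boolean registerInInit;
        int handled = 0;

        ToyPublisher(ToyDispatcher dispatcher, boolean registerInInit) {
            this.dispatcher = dispatcher;
            this.registerInInit = registerInInit;
        }

        void init() {
            if (registerInInit) {
                register();
            }
        }

        void start() {
            if (!registerInInit) {
                register();
            }
        }

        private void register() {
            dispatcher.register("SYSTEM_METRICS", payload -> handled++);
        }
    }

    /**
     * Simulates a startup where an event is dispatched after init() but
     * before the publisher's start(). Returns true if the event was handled.
     */
    static boolean eventSurvivesStartup(boolean registerInInit) {
        ToyDispatcher dispatcher = new ToyDispatcher();
        ToyPublisher publisher = new ToyPublisher(dispatcher, registerInInit);
        publisher.init(); // every service is initialized before any is started
        try {
            // Services started earlier are already live and emit an event.
            dispatcher.dispatch("SYSTEM_METRICS", "app created");
        } catch (IllegalStateException e) {
            return false;
        } finally {
            publisher.start(); // the publisher is the last service to start
        }
        return publisher.handled == 1;
    }

    public static void main(String[] args) {
        System.out.println("register in start(): " + eventSurvivesStartup(false));
        System.out.println("register in init():  " + eventSurvivesStartup(true));
    }
}
```

In this toy model, registering in start() loses the early event while registering in init() handles it, which is the same shape as the fix being discussed (moving the registration to serviceInit()).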

Do you want to raise a JIRA for this issue, or should I? I can handle it.

Regards,
Varun Saxena.

On Tue, Dec 15, 2015 at 8:35 AM, Li Lu <llu@hortonworks.com> wrote:
Thanks Sangjin. I’ll keep tracing this. Meanwhile, if anybody has reproduced the problem,
please feel free to let me know. Thanks!

Li Lu

On Dec 14, 2015, at 18:16, Sangjin Lee <sjlee0@gmail.com> wrote:

Can you bisect the commits to see if you can isolate which commit
introduced the issue?

On Mon, Dec 14, 2015 at 5:39 PM, Li Lu <llu@hortonworks.com> wrote:

Hi YARN developers working on Timeline v2 (YARN-2928) branch,

I just realized I’ve accidentally turned off SMP for my local Timeline v2
build. After I turned yarn.system-metrics-publisher.enabled back on, the RM
fails to start with the following FATAL message:

2015-12-14 17:27:54,125 INFO  ipc.Server (Server.java:run(797)) - IPC Server listener on 8033: starting
2015-12-14 17:27:54,127 FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(189)) - Error in dispatcher thread true
java.lang.Exception: No handler for registered for class org.apache.hadoop.yarn.server.resourcemanager.metrics.AbstractSystemMetricsPublisher$SystemMetricsEventType
       at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:185)
       at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
       at java.lang.Thread.run(Thread.java:745)
2015-12-14 17:27:54,127 INFO  event.AsyncDispatcher (AsyncDispatcher.java:register(208)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.metrics.AbstractSystemMetricsPublisher$SystemMetricsEventType for class org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler

Interestingly, we’re registering this class to the timeline v2 handler in the
very next log line. I’m wondering whether this is caused by some missing
configs on my side, or by a newly introduced issue. Has anybody on the
feature-YARN-2928 branch noticed this issue? Thanks!

Li Lu

