hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4265) Provide new timeline plugin storage to support fine-grained entity caching
Date Mon, 04 Jan 2016 17:54:40 GMT

    [ https://issues.apache.org/jira/browse/YARN-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081472#comment-15081472

Junping Du commented on YARN-4265:

Thanks [~gtCarrera9] for clarification. Assume Jason is fine with going on with this patch,
I quickly go through the v3 patch. 
with following comments so far (I haven't finished my review yet as patch is pretty big):

In YarnConfiguration.java,
I noticed that we are setting 1 minutes as default scan interval but original patch in HDFS-3942
is 5 minutes. Why shall we do any update here? The same question on "app-cache-size", the
default value in HDFS-3942 is 5 but here is 10. Any reason to update the value?

In yarn-default.xml,
+    <description>DFS path to store active application’s timeline data</description>
+    <description>DFS path to store done application’s timeline data</description>
DFS is very old name, use HDFS instead to be more clear.

Why we don't have any default value specified in property of "yarn.timeline-service.entity-group-fs-store.group-id-plugin-classes"?

In hadoop-yarn-server-timeline-pluginstorage/pom.xml,

For EmptyTimelineEntityGroupPlugin.java, why we need this class? I didn't see any help provided
even in tests. We should remove it if useless.

In EntityCacheItem.java,
We should have a description for this class in Javadoc.

Can we optimize the synchronization logic here? Like in synchronized method refreshCache,
we are intialize/start/stop TimelineDataManager (and MemoryTimelineStore) which is quite expensive
and unnecessary to block other synchronized operations. Shall we move these operations out
of synchronized block?

+      LOG.warn("Error closing datamanager", e);
I think we are closing store here instead of datamanager. Isn't it?

+  public boolean needRefresh() {
+    //TODO: make a config for cache freshness
+    return (Time.monotonicNow() - lastRefresh > 10000);
+  }
Does refresh interval here need to do any coordination with scan interval specificed in "yarn.timeline-service.entity-group-fs-store.scan-interval-seconds"?

In EntityGroupFSTimelineStore.java,

+      if (appState != AppState.UNKNOWN) {
+        appLogs = new AppLogs(applicationId, appDirPath, appState);
+        LOG.debug("Create and try to add new appLogs to appIdLogMap for {}",
+            applicationId);
+        AppLogs oldAppLogs = appIdLogMap.putIfAbsent(applicationId, appLogs);
+        if (oldAppLogs != null) {
+          appLogs = oldAppLogs;
+        }
+      }
This logic is very similiar with method of getAndSetActiveLog(). Can we consolidate them together?

If parseSummaryLogs() is synchronized, it seems getSummaryLogs() should be synchronized too
or the getter will get stale(half-done) result.

Still checking if other multi-threads issues. More comments will come soon.

> Provide new timeline plugin storage to support fine-grained entity caching
> --------------------------------------------------------------------------
>                 Key: YARN-4265
>                 URL: https://issues.apache.org/jira/browse/YARN-4265
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-4265-trunk.001.patch, YARN-4265.YARN-4234.001.patch, YARN-4265.YARN-4234.002.patch
> To support the newly proposed APIs in YARN-4234, we need to create a new plugin timeline
store. The store may have similar behavior as the EntityFileTimelineStore proposed in YARN-3942,
but cache date in cache id granularity, instead of application id granularity. Let's have
this storage as a standalone one, instead of updating EntityFileTimelineStore, to keep the
existing store (EntityFileTimelineStore) stable. 

This message was sent by Atlassian JIRA

View raw message