Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Mon, 6 Jul 2015 22:31:06 +0000 (UTC)
From: "Varun Saxena (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12766998.1421110823000.111571.1436221866219@Atlassian.JIRA>
In-Reply-To: <JIRA.12766998.1421110823000@Atlassian.JIRA>
References: <JIRA.12766998.1421110823000@Atlassian.JIRA>
 <JIRA.12766998.1421110823412@arcas>
Subject: [jira] [Commented] (YARN-3051) [Storage abstraction] Create backing
 storage read interface for ATS readers
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615776#comment-14615776 ] 

Varun Saxena commented on YARN-3051:
------------------------------------

[~zjshen],
bq. we have chosen clusterId + appId to globally find a unique flow run. I think here we should do it similar by adding clusterId
The current FS implementation has the cluster as part of the path, so there will be an app_flow_mapping.csv for each cluster. In a way, the cluster is part of the primary key even though it's not in app_flow_mapping.csv itself.
I hope that addresses your concern.
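
To make the layout concrete, here is a minimal sketch of how the cluster ends up in the key through the path. It assumes a directory layout of <root>/<clusterId>/app_flow_mapping.csv; the class and method names are hypothetical and not taken from the patch, and java.nio.file is used only as a stand-in for the real file system layer.

```java
import java.nio.file.Path;

// Hypothetical helper, for illustration only (not from the patch).
final class AppFlowMappingPath {
    private AppFlowMappingPath() {}

    // Assumed layout: <root>/<clusterId>/app_flow_mapping.csv
    // Each cluster gets its own mapping file, so (clusterId, appId) is the
    // effective key even though the CSV rows only carry the appId.
    static Path mappingFile(Path root, String clusterId) {
        return root.resolve(clusterId).resolve("app_flow_mapping.csv");
    }
}
```

Two clusters that reuse the same appId would then resolve to different mapping files, so the lookups cannot collide.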

bq. 1. Maybe we want to cache the mapping instead of reading it from the file for every query.
Yes, we should be doing so. I plan to do these optimizations in a later JIRA. Some other optimizations are also required: we currently use a set instead of a map for storing metrics and events, so I have to iterate over all of them. Is there any issue with turning them into maps?
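
A rough sketch of both optimizations, under stated assumptions: the Metric class, the MappingCache class, and the loader function are hypothetical placeholders, not the types used in the patch. The cache parses the mapping once per cluster, and the index replaces a full scan of the set with a keyed lookup.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch, for illustration only (not from the patch).
final class ReaderOptimizations {
    private ReaderOptimizations() {}

    /** Stand-in for a timeline metric; only the id matters here. */
    static final class Metric {
        final String id;
        final long value;

        Metric(String id, long value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Caches the appId -> flow mapping per cluster so the file is read once. */
    static final class MappingCache {
        private final Map<String, Map<String, String>> byCluster = new ConcurrentHashMap<>();
        // Assumed to read and parse the cluster's mapping file; must not return null.
        private final Function<String, Map<String, String>> loader;

        MappingCache(Function<String, Map<String, String>> loader) {
            this.loader = loader;
        }

        String flowFor(String clusterId, String appId) {
            // computeIfAbsent invokes the loader only on the first query for a cluster.
            return byCluster.computeIfAbsent(clusterId, loader).get(appId);
        }
    }

    /** Indexes metrics by id so a lookup no longer iterates over all of them. */
    static Map<String, Metric> indexById(Collection<Metric> metrics) {
        Map<String, Metric> index = new HashMap<>();
        for (Metric m : metrics) {
            index.put(m.id, m);
        }
        return index;
    }
}
```

This sketch has no invalidation, so a real cache would also need to pick up new entries appended to the mapping file.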

bq. 2. limit should be push down into the for loop. It's unnecessary that if we want to just retrieve.
The issue here is that we want a limit on entities, but these should be the latest entities (sorted in descending order of created time). Having the created time in the entity file name will help avoid reading all the files.
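
As a sketch of the file-name idea: if entity files were named "<createdTimeMillis>_<entityId>" (a hypothetical naming scheme, not what the patch currently does), the reader could pick the newest files from the directory listing alone and open only those. The class and method names below are illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, for illustration only (not from the patch).
final class LatestEntityFiles {
    private LatestEntityFiles() {}

    // Assumed naming scheme: "<createdTimeMillis>_<entityId>"
    static long createdTime(String fileName) {
        return Long.parseLong(fileName.substring(0, fileName.indexOf('_')));
    }

    /** Returns at most `limit` file names, newest created time first. */
    static List<String> latest(List<String> fileNames, int limit) {
        List<String> sorted = new ArrayList<>(fileNames);
        Comparator<String> newestFirst =
            (a, b) -> Long.compare(createdTime(b), createdTime(a));
        sorted.sort(newestFirst);
        int end = Math.max(0, Math.min(limit, sorted.size()));
        // Only these files would need to be opened and parsed.
        return new ArrayList<>(sorted.subList(0, end));
    }
}
```

The limit is applied after sorting the names, which is cheap compared with reading every entity file.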

bq. 3. We'd better avoid hard code "/" as the path separator, and we should use FileSystem interface to operate the files, such that the impl can also work with HDFS.
Ok.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, YARN-3051-YARN-2928.07.patch, YARN-3051-YARN-2928.08.patch, YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be implemented by multiple backing storage implementations.

