Date: Mon, 6 Jul 2015 22:31:06 +0000 (UTC)
From: "Varun Saxena (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

    [ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615776#comment-14615776 ]

Varun Saxena commented on YARN-3051:
------------------------------------

[~zjshen],

bq. we have chosen clusterId + appId to globally find a unique flow run. I think here we should do it similar by adding clusterId

The current FS implementation has the cluster as part of the path, so there will be one app_flow_mapping.csv for each cluster. In that sense the cluster is part of the primary key even though it's not stored inside app_flow_mapping.csv. I hope that addresses your concern.

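To illustrate the point about the cluster being part of the key, here is a rough sketch (not the code in the patch). Only the file name app_flow_mapping.csv and the fact that the cluster appears in the path come from the discussion above; the root directory name, the exact directory layout, and the helper `mappingFileFor` are assumptions made for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ClusterScopedMapping {
    // File name taken from the comment above; everything else is illustrative.
    public static final String MAPPING_FILE = "app_flow_mapping.csv";

    // Hypothetical helper: resolves <root>/<clusterId>/app_flow_mapping.csv.
    // Because clusterId is a path component, each cluster gets its own file,
    // so (clusterId, appId) acts as the key even though the file itself
    // would only need to carry appId.
    public static Path mappingFileFor(Path root, String clusterId) {
        return root.resolve(clusterId).resolve(MAPPING_FILE);
    }

    public static void main(String[] args) {
        Path root = Paths.get("timeline_root"); // assumed root directory
        Path first = mappingFileFor(root, "cluster1");
        Path second = mappingFileFor(root, "cluster2");
        System.out.println(first);
        System.out.println(second);
        // The same appId looked up under two clusters reads two different files.
        System.out.println("distinct files: " + !first.equals(second));
    }
}
```

With a layout like this, a lookup for an appId never crosses cluster boundaries, which is why adding clusterId as a column in the mapping file would be redundant under these assumptions.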

bq. 1. Maybe we want to cache the mapping instead of reading it from the file for every query.

Yes, we should be doing so. I plan to do these optimizations in a later JIRA. Some other optimizations are also required: we are using a set instead of a map for storing metrics and events, so I have to iterate over all of them. Is there any issue with turning them into maps?

bq. 2. limit should be push down into the for loop. It's unnecessary that if we want to just retrieve.

The issue here is that we want the limit to apply to entities, but these should be the latest entities (sorted in descending order of created time). Having the created time in the entity file name will help avoid reading all the files.

bq. 3. We'd better avoid hard code "/" as the path separator, and we should use FileSystem interface to operate the files, such that the impl can also work with HDFS.

Ok.

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, YARN-3051-YARN-2928.05.patch, YARN-3051-YARN-2928.06.patch, YARN-3051-YARN-2928.07.patch, YARN-3051-YARN-2928.08.patch, YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be implemented by multiple backing storage implementations.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)