hadoop-yarn-issues mailing list archives

From "Li Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
Date Mon, 01 Jun 2015 22:18:19 GMT

    [ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568128#comment-14568128 ]

Li Lu commented on YARN-3051:

Hi [~varun_saxena], thanks for the work! Not sure if you've already made progress since the
latest patch, but I'm posting some of my comments and questions w.r.t. the reader API design
in the 003 patch. I may have more comments in the near future, but I wouldn't mind seeing a new
patch before posting them.

# I noticed there is a _readerLimit_ for read operations, which works for ATS v1. I'm wondering
if it's fine to use -1 to indicate there's no such limit? Not sure if this feature is already
# For the {{fromId}} parameter, we may need to be careful about the concept of "id". In timeline
v2 we need context information to identify each entity, such as cluster, user, flow, and run.
When querying with {{fromId}}, what assumptions should we make about the "id" here? Are
we assuming all entities belong to the same cluster, user, and/or flow, is the "id" a concatenation
of all that information, or is it something else?
# For all filter-related parameters, I'm not sure if the current object model and storage
implementation support a trivial solution. I'd certainly welcome any comments/suggestions
on this problem.
# Based on the previous two issues, a more general question is: shall we focus on an evolution
of the v1 API here, or start a v2 reader API design from scratch and then try to make
it compatible with the v1 APIs? The current patch looks to be pursuing the evolution approach.
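
To make the {{fromId}} question concrete, here is a minimal sketch of the "concatenation" option. It is purely illustrative and not taken from the patch: the class name {{EntityContextId}}, its fields, and the {{!}} separator are all assumptions made for the example.

```java
import java.util.Objects;
import java.util.StringJoiner;

/**
 * Hypothetical illustration only: not part of any YARN-3051 patch.
 * Bundles the context (cluster, user, flow, run) with the entity type and id,
 * so that an "id" is unambiguous across flows and clusters.
 */
final class EntityContextId {
    // Assumed separator; a real key encoding would also need an escaping scheme.
    private static final String SEP = "!";

    private final String cluster;
    private final String user;
    private final String flow;
    private final long runId;
    private final String entityType;
    private final String entityId;

    EntityContextId(String cluster, String user, String flow, long runId,
                    String entityType, String entityId) {
        this.cluster = check(cluster);
        this.user = check(user);
        this.flow = check(flow);
        this.runId = runId;
        this.entityType = check(entityType);
        this.entityId = check(entityId);
    }

    // Reject nulls and values that would make the concatenated form ambiguous.
    private static String check(String part) {
        Objects.requireNonNull(part, "id component must not be null");
        if (part.contains(SEP)) {
            throw new IllegalArgumentException("component contains separator: " + part);
        }
        return part;
    }

    /** The "id is a concatenation of all information" interpretation. */
    String toCompositeId() {
        return new StringJoiner(SEP)
            .add(cluster).add(user).add(flow).add(Long.toString(runId))
            .add(entityType).add(entityId)
            .toString();
    }

    /** The "all entities share the same context" interpretation of a fromId query. */
    boolean sameFlowRun(EntityContextId other) {
        return cluster.equals(other.cluster)
            && user.equals(other.user)
            && flow.equals(other.flow)
            && runId == other.runId;
    }
}
```

With the first interpretation a bare {{fromId}} is enough to resume a scan; with the second, the caller has to supply the shared context separately and {{fromId}} only needs to be unique within it.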

# In some APIs, we require clusterID and appID but not flow/run information.
In the current writer implementations, this implies a full table scan. Maybe we can have
flow and run information as optional parameters so that we can avoid full table scans when
the caller does have flow and run information?
# The current APIs require a pretty long list of parameters. For most of the use cases, I
think we can abstract something much simpler. Do we plan to add those "simple APIs" in a higher
layer? Passing a lot of nulls when calling the reader API looks suboptimal, but with only
these few APIs we may need to do this frequently.
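
As a rough sketch of what such a simpler layer could look like, the example below wraps the parameters in a small query object with a builder. Everything here is hypothetical ({{ReaderQuery}}, {{forApp}}, {{needsFullScan}}, and {{NO_LIMIT}} are names invented for illustration, not APIs from the patch). It assumes, as described above, that a lookup without flow and run information falls back to a full table scan, and it uses -1 as the "no limit" sentinel.

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical query object: illustrates optional flow/run and a -1 "no limit" sentinel. */
final class ReaderQuery {
    /** Assumed sentinel meaning "do not cap the number of returned entities". */
    static final long NO_LIMIT = -1L;

    private final String clusterId;
    private final String appId;
    private final String flowId;   // null when the caller does not know the flow
    private final Long flowRunId;  // null when the caller does not know the run
    private final long limit;

    private ReaderQuery(Builder b) {
        this.clusterId = b.clusterId;
        this.appId = b.appId;
        this.flowId = b.flowId;
        this.flowRunId = b.flowRunId;
        this.limit = b.limit;
    }

    /** Only the required parameters appear in the entry point. */
    static Builder forApp(String clusterId, String appId) {
        return new Builder(clusterId, appId);
    }

    String clusterId() { return clusterId; }
    String appId() { return appId; }
    Optional<String> flowId() { return Optional.ofNullable(flowId); }
    Optional<Long> flowRunId() { return Optional.ofNullable(flowRunId); }
    long limit() { return limit; }
    boolean hasLimit() { return limit != NO_LIMIT; }

    /** Under the stated assumption, a missing flow or run forces a full scan. */
    boolean needsFullScan() { return flowId == null || flowRunId == null; }

    static final class Builder {
        private final String clusterId;
        private final String appId;
        private String flowId;
        private Long flowRunId;
        private long limit = NO_LIMIT;

        private Builder(String clusterId, String appId) {
            this.clusterId = Objects.requireNonNull(clusterId, "clusterId");
            this.appId = Objects.requireNonNull(appId, "appId");
        }

        /** Optional context that lets the storage layer narrow the lookup. */
        Builder flow(String flowId, long flowRunId) {
            this.flowId = Objects.requireNonNull(flowId, "flowId");
            this.flowRunId = flowRunId;
            return this;
        }

        Builder limit(long limit) {
            if (limit < 0 && limit != NO_LIMIT) {
                throw new IllegalArgumentException("limit must be >= 0 or NO_LIMIT");
            }
            this.limit = limit;
            return this;
        }

        ReaderQuery build() { return new ReaderQuery(this); }
    }
}
```

A caller that only knows the application writes {{ReaderQuery.forApp("c1", "app_1").build()}}, while one that has the flow context adds {{.flow("wordcount", 42L)}}; neither has to pass nulls for the parameters it does not use.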

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---------------------------------------------------------------------------
>                 Key: YARN-3051
>                 URL: https://issues.apache.org/jira/browse/YARN-3051
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch
> Per design in YARN-2928, create backing storage read interface that can be implemented by multiple backing storage implementations.
