Date: Wed, 29 Jul 2015 23:46:04 +0000 (UTC)
From: "Zhijie Shen (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-3984) Rethink event column key issue

    [ https://issues.apache.org/jira/browse/YARN-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646924#comment-14646924 ]

Zhijie Shen commented on YARN-3984:
-----------------------------------

[~vrushalic], thanks for picking it up. The aforementioned cases are definitely good to support, but the query we want to support right now (in YARN-3051 and YARN-3049) is to retrieve all events belonging to an entity (e.g. application, attempt, container, etc.). With this basic query, we can easily extract the details of what happened to the entity, such as the diagnostic msg of the kill event.
In this case, the most efficient approach is to put the timestamp before the event ID, so that we don't need to sort the events in memory.

In addition to the key composition, I found another significant problem with the event store schema: if an event doesn't contain any info, it is ignored. We cannot always guarantee that the user will put something into info. For example, a user may define a KILL event without any diagnostic msg.

> Rethink event column key issue
> ------------------------------
>
>                 Key: YARN-3984
>                 URL: https://issues.apache.org/jira/browse/YARN-3984
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Vrushali C
>             Fix For: YARN-2928
>
>
> Currently, the event column key is event_id?info_key?timestamp, which is not friendly to fetching all the events of an entity and sorting them in chronological order. IMHO, timestamp?event_id?info_key may be a better key schema. I opened this jira to continue the discussion about it that began in the comments on YARN-3908.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
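The two points raised in the comment (key ordering and events without info) can be illustrated with a minimal Python sketch. It is not the timeline service code. It assumes a store that returns columns in byte-lexicographic key order (as HBase does for column qualifiers), a fixed-width big-endian timestamp encoding, one column per info entry, and `?` as the separator exactly as written in the issue text. The helper names and the sample events are hypothetical.

```python
import struct

SEP = b"?"  # separator as written in the issue; the real store may use another byte


def encode_ts(ts: int) -> bytes:
    """Fixed-width big-endian encoding, so byte order equals numeric order."""
    return struct.pack(">Q", ts)


def event_first_keys(event_id: str, ts: int, info: dict) -> list:
    """Current layout: event_id?info_key?timestamp, one column per info entry."""
    return [event_id.encode() + SEP + k.encode() + SEP + encode_ts(ts) for k in info]


def ts_first_keys(event_id: str, ts: int, info: dict) -> list:
    """Proposed layout: timestamp?event_id?info_key, one column per info entry."""
    return [encode_ts(ts) + SEP + event_id.encode() + SEP + k.encode() for k in info]


def scan_order(events, make_keys) -> list:
    """Return event IDs in the order a byte-sorted column scan would yield them."""
    columns = [(key, event_id) for event_id, ts, info in events
               for key in make_keys(event_id, ts, info)]
    return [event_id for _, event_id in sorted(columns)]


# Hypothetical events for one entity: (event_id, timestamp, info)
EVENTS = [
    ("SUBMITTED", 100, {"user": "alice"}),
    ("LAUNCHED", 200, {"node": "n1"}),
    ("KILL", 300, {}),  # no diagnostic msg, so no info entries
]

event_first = scan_order(EVENTS, event_first_keys)
ts_first = scan_order(EVENTS, ts_first_keys)
print(event_first)  # ordered by event ID, not by time
print(ts_first)     # ordered by time
```

Under these assumptions, the event-ID-first layout yields the events in ID order, so a reader would have to re-sort them in memory, while the timestamp-first layout yields them chronologically. In both layouts the KILL event produces no column at all, because columns are only written per info entry; that is the second problem described above.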