Date: Mon, 21 Dec 2015 23:07:33 -0800
Subject: hbase (coprocessors & cell tags) used in hadoop-yarn
From: Vrushali Channapattan
To: dev@hbase.apache.org, user@hbase.apache.org, Sangjin Lee, jrottinghuis@gmail.com

A group of us in the Hadoop community are working on YARN's next-generation timeline service (https://issues.apache.org/jira/browse/YARN-2928). For every application that runs on a Hadoop cluster, it will store all of the application stats, workflow metadata, and container metrics in HBase tables (some plain HBase tables and some Phoenix-based ones).

We would like to validate some of the implementation approaches we are taking with HBase. It would be great to get some feedback on the code and design from the HBase dev perspective.

Among other things, we use cell tags in coprocessors to perform sum, min, and max operations across the different versions of cells in a given column, both during reads and during flush and compaction operations.

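To make the cell-tag idea concrete, here is a minimal sketch of collapsing several versions of one column into a single value according to an aggregation tag. It is an illustration only: it does not use the HBase coprocessor or Tag APIs, it is not the timeline service code, and every name in it (TaggedCellAggregator, AggOp, CellVersion, aggregate) is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for tag-driven aggregation; does not use the HBase API. */
final class TaggedCellAggregator {

    /** Hypothetical aggregation tag carried by each cell version. */
    enum AggOp { SUM, MIN, MAX }

    /** Simplified cell version: a timestamp, a numeric value, and one tag. */
    static final class CellVersion {
        final long timestamp;
        final long value;
        final AggOp tag;

        CellVersion(long timestamp, long value, AggOp tag) {
            this.timestamp = timestamp;
            this.value = value;
            this.tag = tag;
        }
    }

    /** Collapses all versions of one column into a single value, as directed by their tag. */
    static long aggregate(List<CellVersion> versions) {
        if (versions.isEmpty()) {
            throw new IllegalArgumentException("no cell versions to aggregate");
        }
        AggOp op = versions.get(0).tag;
        long result = versions.get(0).value;
        for (int i = 1; i < versions.size(); i++) {
            CellVersion v = versions.get(i);
            if (v.tag != op) {
                // Assumption of this sketch: one column carries a single aggregation tag.
                throw new IllegalArgumentException("mixed aggregation tags in one column");
            }
            switch (op) {
                case SUM: result += v.value; break;
                case MIN: result = Math.min(result, v.value); break;
                case MAX: result = Math.max(result, v.value); break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<CellVersion> versions = Arrays.asList(
            new CellVersion(1L, 10L, AggOp.SUM),
            new CellVersion(2L, 25L, AggOp.SUM),
            new CellVersion(3L, 7L, AggOp.SUM));
        System.out.println(aggregate(versions)); // prints 42
    }
}
```

In a real coprocessor the same reduction would run inside the region server's read, flush, and compaction paths rather than over an in-memory list; the sketch only shows the per-column reduction step.
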
Some relevant sub-JIRAs that deal with HBase coprocessors:
https://issues.apache.org/jira/browse/YARN-4062
https://issues.apache.org/jira/browse/YARN-3901

We have the schema documented with example records in the code, as well as in a PDF on the JIRA:
https://github.com/apache/hadoop/blob/feature-YARN-2928/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunTable.java#L34
https://github.com/apache/hadoop/blob/feature-YARN-2928/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityTable.java#L40

Schema JIRA (the PDF attachment describes the schema):
https://issues.apache.org/jira/browse/YARN-3411

We would appreciate any feedback or comments you have, and would be glad to answer any questions in more depth.

Thanks,
Vrushali