Date: Thu, 9 Oct 2014 22:55:34 +0000 (UTC)
From: "Szehon Ho (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-6500) Stats collection via filesystem

    [ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165928#comment-14165928 ]

Szehon Ho commented on HIVE-6500:
---------------------------------

[~leftylev] This looks good to me, although my knowledge of stats is limited. My only comment is that there seems to be an unneeded dot on the configuration wiki page: jdbc(:.) I think it makes sense to fix that as you suggested in HIVE-6586. Thanks!

PS Yes, I think so.
On [https://cwiki.apache.org/confluence/display/Hive/Home|https://cwiki.apache.org/confluence/display/Hive/Home], it's not listed in the 'children' list on the left. I had mistakenly assumed that all supported pages were listed there.

> Stats collection via filesystem
> -------------------------------
>
>                 Key: HIVE-6500
>                 URL: https://issues.apache.org/jira/browse/HIVE-6500
>             Project: Hive
>          Issue Type: New Feature
>          Components: Statistics
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>              Labels: TODOC13, TODOC14
>             Fix For: 0.13.0
>
>         Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch
>
>
> Recently, support for stats gathering via counters was [added | https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has the following issues:
> * [Length of counter group name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
> * [Length of counter name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
> * [Number of distinct counter groups is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
> * [Number of distinct counters is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
> Although these limits are configurable, setting them to higher values implies increased memory load on the AM and the job history server.
> Now, whether these limits make sense or not is [debatable | https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that Hive doesn't make use of the framework's counter features, so that we can evolve this feature without relying on support from the framework. Filesystem-based counter collection is a step in that direction.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)