hive-dev mailing list archives

From "Lefty Leverenz (JIRA)" <>
Subject [jira] [Commented] (HIVE-6500) Stats collection via filesystem
Date Sun, 23 Mar 2014 09:39:42 GMT


Lefty Leverenz commented on HIVE-6500:

I updated the wiki for *hive.stats.dbclass* -- please review:

Hive 0.13 and later:  The storage that stores temporary Hive statistics. In FS based statistics
collection, each task writes statistics it has collected in a file on the filesystem, which
will be aggregated after the job has finished. Supported values are fs (filesystem), jdbc(:.*),
hbase, counter and custom (HIVE-6500).

* Configuration Properties: hive.stats.dbclass
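
To illustrate the FS-based collection described above, here is a minimal sketch in Python. It is not Hive's implementation: the function names, the JSON file format, and the statistic keys (numRows, rawDataSize) are assumptions made for illustration. It shows only the general shape, where each task writes its own file and the files are summed once the job has finished.

```python
import json
import os
import tempfile
from collections import Counter


def write_task_stats(stats_dir, task_id, stats):
    """Write one task's statistics to its own file (hypothetical layout)."""
    path = os.path.join(stats_dir, f"task_{task_id}.json")
    with open(path, "w") as f:
        json.dump(stats, f)
    return path


def aggregate_stats(stats_dir):
    """Sum every per-task file in stats_dir into a single dict of totals."""
    totals = Counter()
    for name in sorted(os.listdir(stats_dir)):
        with open(os.path.join(stats_dir, name)) as f:
            # Counter.update adds the values of matching keys.
            totals.update(json.load(f))
    return dict(totals)


def demo():
    """Simulate two tasks writing stats, then aggregate after the 'job'."""
    with tempfile.TemporaryDirectory() as stats_dir:
        write_task_stats(stats_dir, 0, {"numRows": 100, "rawDataSize": 2048})
        write_task_stats(stats_dir, 1, {"numRows": 50, "rawDataSize": 1024})
        return aggregate_stats(stats_dir)


if __name__ == "__main__":
    print(demo())
```

Because each task writes to a separate file, the tasks need no coordination while they run; the only shared step is the aggregation at the end.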

> Stats collection via filesystem
> -------------------------------
>                 Key: HIVE-6500
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Statistics
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>             Fix For: 0.13.0
>         Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch
> Recently, support for stats gathering via counter was added.
> Although it's useful, it has the following issues:
> * Length of counter group name is limited
> * Length of counter name is limited
> * Number of distinct counter groups is limited
> * Number of distinct counters is limited
> Although these limits are configurable, setting them to higher values implies increased
> memory load on the AM and job history server.
> Now, whether these limits make sense or not is debatable, but
> it is desirable that Hive doesn't make use of the counter features of the framework, so that
> we can evolve this feature without relying on support from the framework. Filesystem-based counter
> collection is a step in that direction.

