hive-dev mailing list archives

From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
Date Thu, 16 Oct 2014 02:28:33 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173247#comment-14173247 ]

Rui Li commented on HIVE-8456:
------------------------------

I'm not familiar with how the counter/accumulator works. Just a few high-level questions:

1. Shall we think of better names for the new classes? The current naming (e.g. SparkCounterGroup
and SparkCounters) seems a little confusing to me.

2. Have we defined all the counters in {{SparkCounters.initializeSparkCounters}}? For example,
it seems {{Operator.HIVECOUNTERFATAL}} isn't added there.

3. The Counter enum in operators doesn't seem to be used as a "Counter" in Hive. Rather, it's
just kept in {{statsMap : HashMap<Enum<?>, LongWritable>}}. Maybe we shouldn't
add them as SparkCounter? If we do want to wrap them as SparkCounter, there are operators
other than MapOperator to handle, e.g. FilterOperator and JoinOperator also have such an
enum.

4. Maybe we should always use {{HiveConf.ConfVars.HIVECOUNTERGROUP}} as the group name, rather
than the enum class name ({{key.getDeclaringClass().getName()}})?
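
To illustrate question 4, here is a rough sketch (not code from the patch; apart from the
identifiers quoted above, the class and method names below are made up) of publishing every
operator counter under one fixed group name instead of the enum's declaring class:

{code:java}
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;

// Sketch only: publish the per-operator statsMap entries under a single group name
// (the value of HiveConf.ConfVars.HIVECOUNTERGROUP) instead of grouping them by
// key.getDeclaringClass().getName(). SingleGroupCounterSketch is a made-up class.
public class SingleGroupCounterSketch {
  private final String groupName;   // e.g. conf.getVar(ConfVars.HIVECOUNTERGROUP)
  private final Map<String, Long> counters = new HashMap<String, Long>();

  public SingleGroupCounterSketch(String groupName) {
    this.groupName = groupName;
  }

  public void publishStats(Map<Enum<?>, LongWritable> statsMap) {
    for (Map.Entry<Enum<?>, LongWritable> e : statsMap.entrySet()) {
      // The group is always the fixed one; the counter name stays the enum constant name.
      publishCounter(groupName, e.getKey().name(), e.getValue().get());
    }
  }

  private void publishCounter(String group, String name, long value) {
    String key = group + "::" + name;
    Long current = counters.get(key);
    counters.put(key, current == null ? value : current + value);
  }
}
{code}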

> Support Hive Counter to collect spark job metric[Spark Branch]
> --------------------------------------------------------------
>
>                 Key: HIVE-8456
>                 URL: https://issues.apache.org/jira/browse/HIVE-8456
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M3
>         Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch
>
>
> Several Hive query metrics in Hive operators are collected by Hive Counters, such as CREATEDFILES
> and DESERIALIZE_ERRORS. Besides, Hive uses Counters as an option to collect table stats info.
> Spark supports Accumulators, which are pretty similar to Hive Counters; we could try to enable
> Hive Counters based on them.
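
For context, a minimal standalone illustration (Spark 1.x Java API; the class name, counter
name, and input data below are made up for the example and are not taken from the patch) of
how an Accumulator behaves like a Hive Counter: it is incremented on the executors and read
back on the driver once the job finishes.

{code:java}
import java.util.Arrays;

import org.apache.spark.Accumulator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class AccumulatorAsCounterDemo {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("accumulator-as-counter").setMaster("local[2]");
    JavaSparkContext jsc = new JavaSparkContext(conf);

    // Plays the role of a Hive Counter such as CREATEDFILES.
    Accumulator<Integer> createdFiles = jsc.accumulator(0);

    JavaRDD<String> paths = jsc.parallelize(Arrays.asList("f1", "f2", "f3"));
    paths.foreach(path -> createdFiles.add(1));                     // incremented on the executors

    System.out.println("CREATEDFILES = " + createdFiles.value());   // read on the driver
    jsc.stop();
  }
}
{code}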



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
