drill-issues mailing list archives

From "Venki Korukanti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1992) Add more stats for HashJoinBatch and HashAggBatch
Date Tue, 13 Jan 2015 01:47:34 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274578#comment-14274578 ]

Venki Korukanti commented on DRILL-1992:
----------------------------------------

Sample query used to analyze the stats:

{code:sql}
SELECT
  CASE 
    WHEN metric['metricId'] = 0 THEN 'NUM_BUCKETS'
    WHEN metric['metricId'] = 1 THEN 'NUM_ENTRIES'
    WHEN metric['metricId'] = 2 THEN 'NUM_RESIZES'
    WHEN metric['metricId'] = 4 THEN 'HT_MEMORY'
    WHEN metric['metricId'] = 5 THEN 'NUM_BHOLDERS'
    WHEN metric['metricId'] = 6 THEN 'HJH_MEMORY'
  END,
  sum(metric['longValue']) AggMetricValue FROM 
  (SELECT minorFragId, opProfile['operatorType'] opType, flatten(opProfile['metric']) as metric
FROM 
     (SELECT  minorFragProfile['minorFragmentId'] as minorFragId,
         flatten(minorFragProfile['operatorProfile']) opProfile
      FROM 
         (SELECT flatten(majorFragment['minorFragmentProfile']) as minorFragProfile
           FROM
              (SELECT flatten(fragmentProfile) as majorFragment from dfs.`/tmp/a.json`)
           -- WHERE majorFragment['majorFragmentId'] = 1 -- if we are interested in op in a particular major fragment
         )
     )
  )
WHERE
      (metric['metricId'] IN  (0, 1, 2, 4, 5, 6)) AND opType = 4 -- Change to 3 for HashAgg stats
GROUP BY
     metric['metricId']
ORDER BY
      metric['metricId'];
{code}

{code}
+--------------+----------------+
|    EXPR$0    | AggMetricValue |
+--------------+----------------+
| NUM_BUCKETS  | 6291456        |
| NUM_ENTRIES  | 4000000        |
| NUM_RESIZES  | 24             |
| HT_MEMORY    | 81395712       |
| NUM_BHOLDERS | 66             |
| HJH_MEMORY   | 33301504       |
+--------------+----------------+
{code}
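
For anyone who prefers to post-process the profile outside Drill, the nested flatten in the query above can be mirrored in a few lines of Python. This is only a sketch. It assumes the profile JSON has the layout the query implies (fragmentProfile -> minorFragmentProfile -> operatorProfile -> metric, with each metric carrying metricId and longValue). The function name sum_metrics and the sample data are hypothetical and are not part of Drill.

```python
from collections import defaultdict

# Metric names exactly as listed in the CASE expression of the SQL query.
METRIC_NAMES = {
    0: "NUM_BUCKETS",
    1: "NUM_ENTRIES",
    2: "NUM_RESIZES",
    4: "HT_MEMORY",
    5: "NUM_BHOLDERS",
    6: "HJH_MEMORY",
}


def sum_metrics(profile, op_type=4, major_fragment_id=None):
    """Sum longValue per metricId over every operator of the given type.

    op_type=4 follows the query above; per its comment, 3 selects HashAgg.
    major_fragment_id optionally restricts the scan to one major fragment,
    like the commented-out WHERE clause in the query.
    """
    totals = defaultdict(int)
    for major in profile.get("fragmentProfile", []):
        if (major_fragment_id is not None
                and major.get("majorFragmentId") != major_fragment_id):
            continue
        for minor in major.get("minorFragmentProfile", []):
            for op in minor.get("operatorProfile", []):
                if op.get("operatorType") != op_type:
                    continue
                for metric in op.get("metric", []):
                    metric_id = metric.get("metricId")
                    if metric_id in METRIC_NAMES:
                        # Treat a missing or null longValue as 0.
                        totals[metric_id] += metric.get("longValue") or 0
    # Order by metricId, as the ORDER BY clause does.
    return {METRIC_NAMES[k]: totals[k] for k in sorted(totals)}


# Hand-made sample for illustration only; not real Drill profile output.
sample = {
    "fragmentProfile": [
        {
            "majorFragmentId": 1,
            "minorFragmentProfile": [
                {
                    "minorFragmentId": 0,
                    "operatorProfile": [
                        {"operatorType": 4, "metric": [
                            {"metricId": 0, "longValue": 10},
                            {"metricId": 1, "longValue": 5},
                        ]},
                        {"operatorType": 3, "metric": [
                            {"metricId": 0, "longValue": 999},
                        ]},
                    ],
                },
                {
                    "minorFragmentId": 1,
                    "operatorProfile": [
                        {"operatorType": 4, "metric": [
                            {"metricId": 0, "longValue": 20},
                            {"metricId": 4, "longValue": 100},
                        ]},
                    ],
                },
            ],
        }
    ]
}

print(sum_metrics(sample))
```

To run it against a real profile, load the file with json.load and pass the resulting dict to sum_metrics in place of the sample.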



> Add more stats for HashJoinBatch and HashAggBatch
> -------------------------------------------------
>
>                 Key: DRILL-1992
>                 URL: https://issues.apache.org/jira/browse/DRILL-1992
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Operators
>    Affects Versions: 0.8.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>             Fix For: 0.8.0
>
>
> Adding more stats to analyze the memory usage of HashJoinBatch and HashAggBatch.
> HashJoinBatch
>   + HASHTABLE_MEMORY_ALLOCATION
>   + HASHTABLE_NUM_BATCHHOLDERS
>   + HASHJOINHELPER_MEMORY
> HashAgg
>   + HASHTABLE_MEMORY_ALLOCATION
>   + HASHTABLE_NUM_BATCHHOLDERS
>   + HASHAGG_MEMORY
>   + HASHAGG_NUM_BATCHHOLDERS
> Cleanup:
>   + Prefix "HASHTABLE_" to existing HashTable metrics such as NUM_BUCKETS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
