hive-issues mailing list archives

From "liyunzhang (JIRA)" <>
Subject [jira] [Commented] (HIVE-14285) Explain outputs: map-entry ordering of non-primitive objects.
Date Sun, 11 Feb 2018 05:21:02 GMT


liyunzhang commented on HIVE-14285:

[~kgyrtkirk]: I found a problem with {{ExplainTask#getBasictypeKeyedMap}}. It converts the
original input (of type Map) to a TreeMap, which orders the entries by key. For example, the original order is:
"Map 6" -> 
"Reducer 7" -> 
"Reducer 9" -> 
"Map 10" -> 
"Reducer 11" -> 
After this function, the order becomes:
"Map 10" -> 
"Map 6" -> 
"Reducer 11" -> 
"Reducer 7" -> 
"Reducer 9" -> 

{{Map 10}} comes before {{Map 6}} because the string "Map 10" compares as smaller than "Map 6". But {{Map 6}}
is actually executed before {{Map 10}}, which may be confusing in the explain output. So I want to ask whether
we care that the order in explain differs from the execution order. If we don't care, the current behaviour
is fine. If the explain order must match the execution order, we can refactor {{ExplainTask#getBasictypeKeyedMap}}.
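
To make the reordering concrete, here is a minimal standalone sketch (not the Hive code). The class and method names are hypothetical, and it assumes the input map iterates in the order listed above. It compares copying into a {{TreeMap}} with copying into a {{LinkedHashMap}}, which keeps the input's iteration order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; it does not reproduce ExplainTask itself.
public class ExplainOrderDemo {

    // Builds a map whose iteration order is the "original" order from the comment.
    // The values are placeholders; only the keys matter here.
    static Map<String, String> originalOrder() {
        Map<String, String> m = new LinkedHashMap<>();
        String[] names = {"Map 6", "Reducer 7", "Reducer 9", "Map 10", "Reducer 11"};
        for (String name : names) {
            m.put(name, "...");
        }
        return m;
    }

    // Copying into a TreeMap sorts the keys by String.compareTo,
    // so "Map 10" sorts before "Map 6" ('1' < '6').
    static List<String> lexicographicKeys(Map<String, String> input) {
        Map<String, String> sorted = new TreeMap<>(input);
        return new ArrayList<>(sorted.keySet());
    }

    // Copying into a LinkedHashMap keeps whatever iteration order the input has.
    static List<String> insertionOrderKeys(Map<String, String> input) {
        Map<String, String> kept = new LinkedHashMap<>(input);
        return new ArrayList<>(kept.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> input = originalOrder();
        System.out.println(lexicographicKeys(input));
        // prints [Map 10, Map 6, Reducer 11, Reducer 7, Reducer 9]
        System.out.println(insertionOrderKeys(input));
        // prints [Map 6, Reducer 7, Reducer 9, Map 10, Reducer 11]
    }
}
```

Keeping insertion order only helps if the input map already iterates in execution order. This sketch assumes that; it is not something confirmed for the real explain input.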

>  Explain outputs: map-entry ordering of non-primitive objects. 
> ---------------------------------------------------------------
>                 Key: HIVE-14285
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Minor
>             Fix For: 2.3.0
>         Attachments: HIVE-14285.1.patch
> In HIVE-12244 I've left behind some ugly backward compatible getters with {{@Explain}} decorations to keep the qtests from breaking.
> There were heavy explain plan changes when I used {{Path}} objects as keys in {{@Explain}} marked methods.
> I've looked into the causes of this:
>  * there is a {{TreeSet}} in there to keep all the keys in order.
>  * but: {{org.apache.hadoop.fs.Path}} uses a different sort order (inherited from {{}}) that sorts the paths using priorities: [scheme,schemeSpecificPart,host,path,query,fragment]
>   considering that the output is an explain result (possibly read by humans): I don't think this sophisticated sort order can be useful.
> {{ExplainTask#outputMap}} always calls toString() on the keys before using them;
> the most painless solution would be to change all the keys inside the {{TreeSet}} to simple strings (in case it's not a primitive already); this would restore the original behaviour for me.
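
The string-key idea from the description can be sketched as follows. This is an illustration under assumptions, not the attached patch. {{PriorityKey}} is a hypothetical stand-in for a key type such as {{Path}} whose {{compareTo}} follows its own rules, and the method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; it does not reproduce ExplainTask itself.
public class StringKeyDemo {

    // Hypothetical key type whose natural order (by priority) differs
    // from the order of its toString() form.
    static final class PriorityKey implements Comparable<PriorityKey> {
        final int priority;
        final String text;

        PriorityKey(int priority, String text) {
            this.priority = priority;
            this.text = text;
        }

        @Override
        public int compareTo(PriorityKey other) {
            return Integer.compare(priority, other.priority);
        }

        @Override
        public String toString() {
            return text;
        }
    }

    // Two keys with distinct priorities, chosen so the two orders disagree.
    static Map<PriorityKey, String> sample() {
        Map<PriorityKey, String> m = new LinkedHashMap<>();
        m.put(new PriorityKey(2, "/a"), "first");
        m.put(new PriorityKey(1, "/b"), "second");
        return m;
    }

    // Sorting on the key objects uses the key type's own compareTo.
    static List<String> orderWithObjectKeys(Map<PriorityKey, String> input) {
        Map<PriorityKey, String> sorted = new TreeMap<>(input);
        List<String> out = new ArrayList<>();
        for (PriorityKey key : sorted.keySet()) {
            out.add(key.toString());
        }
        return out;
    }

    // Converting every key to a String first gives plain string ordering.
    static List<String> orderWithStringKeys(Map<?, String> input) {
        Map<String, String> sorted = new TreeMap<>();
        for (Map.Entry<?, String> e : input.entrySet()) {
            sorted.put(String.valueOf(e.getKey()), e.getValue());
        }
        return new ArrayList<>(sorted.keySet());
    }

    public static void main(String[] args) {
        System.out.println(orderWithObjectKeys(sample())); // prints [/b, /a]
        System.out.println(orderWithStringKeys(sample())); // prints [/a, /b]
    }
}
```

One caveat with this approach: two distinct keys that share the same {{toString()}} value would collapse into a single entry once converted.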

