[ https://issues.apache.org/jira/browse/TEZ-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179174#comment-14179174 ]
Siddharth Seth commented on TEZ-1656:
-------------------------------------
It is marked Unstable primarily because the interaction with TEZ-1397 isn't defined yet. Also,
it can be a little difficult to understand: it will not be invoked twice within the same app
(in most cases). When invoked across apps, even if they're processing the same data, the splits
may not be consistent until TEZ-1397 goes in, since the ordering of the hosts could be different.
I'd guess a single config would control both TEZ-1396 and TEZ-1397. Once both are in, I think
we should stop marking it as unstable.
Some notes on expected behaviour would be useful, irrespective of the annotation.
> Grouping of splits should maintain the original ordering of splits within a group
> ---------------------------------------------------------------------------------
>
> Key: TEZ-1656
> URL: https://issues.apache.org/jira/browse/TEZ-1656
> Project: Apache Tez
> Issue Type: Task
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-1656.1.patch, TEZ-1656.2.patch, TEZ-1656.3.patch, TEZ-1656.4.patch
>
>
> Sometimes the original splits may have an ordering (e.g. splits from a sorted file). Maintaining
> the ordering of splits inside a group maintains the sort order.
> The node level grouping maintains ordering. When collecting leftover groups for rack
> level grouping, the ordering is lost in the current code.
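
To illustrate the behaviour described above, here is a minimal sketch of one way leftover
splits could be regrouped without losing their original order. It is not the TEZ-1656 patch
and does not use any Tez API: the class OrderedSplitGrouping, the type IndexedSplit, the
field originalIndex, and the method groupLeftovers are all hypothetical names introduced
for this example. It assumes each split remembers its position in the original split list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not Tez code.
final class OrderedSplitGrouping {

    /** A split tagged with its position in the original split list (assumed to be available). */
    static final class IndexedSplit {
        final int originalIndex;
        final String name;

        IndexedSplit(int originalIndex, String name) {
            this.originalIndex = originalIndex;
            this.name = name;
        }
    }

    /**
     * Regroups leftover splits into groups of at most groupSize splits.
     * The leftovers are first sorted by original index, so every resulting
     * group lists its splits in the same relative order as the input file.
     */
    static List<List<IndexedSplit>> groupLeftovers(List<IndexedSplit> leftovers, int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        // Copy before sorting so the caller's list is not modified.
        List<IndexedSplit> sorted = new ArrayList<>(leftovers);
        sorted.sort(Comparator.comparingInt((IndexedSplit s) -> s.originalIndex));

        List<List<IndexedSplit>> groups = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i += groupSize) {
            int end = Math.min(i + groupSize, sorted.size());
            groups.add(new ArrayList<>(sorted.subList(i, end)));
        }
        return groups;
    }
}
```

With leftovers collected in the order 7, 2, 5 and a group size of 2, this sketch would produce
the groups [2, 5] and [7], so a consumer reading each group front to back sees the splits in
their original order.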
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)