tez-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEZ-1656) Grouping of splits should maintain the original ordering of splits within a group
Date Tue, 21 Oct 2014 21:45:34 GMT

    [ https://issues.apache.org/jira/browse/TEZ-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179174#comment-14179174 ]

Siddharth Seth commented on TEZ-1656:
-------------------------------------

Unstable primarily because the interaction with TEZ-1397 isn't defined. It can also be a
little difficult to understand: in most cases it will not be invoked twice within the same
app. When invoked across apps, even if they're processing the same data, the splits may
not be consistent until TEZ-1397 goes in, since the ordering of the hosts could be different.
I'd guess a single config would control both TEZ-1396 and TEZ-1397. Once both are in, I think
we should stop marking it as unstable.
Some notes on the expected behaviour would be useful, irrespective of the annotation.

> Grouping of splits should maintain the original ordering of splits within a group
> ---------------------------------------------------------------------------------
>
>                 Key: TEZ-1656
>                 URL: https://issues.apache.org/jira/browse/TEZ-1656
>             Project: Apache Tez
>          Issue Type: Task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: TEZ-1656.1.patch, TEZ-1656.2.patch, TEZ-1656.3.patch, TEZ-1656.4.patch
>
>
> Sometimes the original splits may have an ordering (e.g. splits from a sorted file). Maintaining
> the ordering of splits inside a group maintains the sort order.
> The node-level grouping maintains ordering. When collecting leftover groups for rack-level
> grouping, the ordering is lost in the current code.
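
The behaviour described in the issue can be illustrated with a minimal sketch. This is
not the Tez implementation or the attached patch; the names Split, group_splits, rack_of
and per_group are hypothetical, and each split is assumed to have a single preferred host.
It only shows why leftovers collected host by host need to be put back into their original
order before rack-level groups are formed.

```python
from collections import OrderedDict
from typing import Dict, List, NamedTuple


class Split(NamedTuple):
    index: int  # position of the split in the original (possibly sorted) list
    host: str   # hypothetical: one preferred host per split


def group_splits(splits: List[Split], rack_of: Dict[str, str],
                 per_group: int) -> List[List[Split]]:
    """Group by host first, then by rack, keeping original order in each group."""
    by_host: "OrderedDict[str, List[Split]]" = OrderedDict()
    for s in splits:  # iterating in original order keeps each host list ordered
        by_host.setdefault(s.host, []).append(s)

    groups: List[List[Split]] = []
    leftovers_by_rack: "OrderedDict[str, List[Split]]" = OrderedDict()
    for host, host_splits in by_host.items():
        # Number of splits on this host that fill complete node-level groups.
        full = len(host_splits) - len(host_splits) % per_group
        for i in range(0, full, per_group):
            groups.append(host_splits[i:i + per_group])
        # Whatever does not fill a group falls through to rack-level grouping.
        leftovers_by_rack.setdefault(rack_of[host], []).extend(host_splits[full:])

    for rack_splits in leftovers_by_rack.values():
        # Leftovers were appended host by host, so they are interleaved.
        # Sorting on the original index restores the original ordering.
        rack_splits.sort(key=lambda s: s.index)
        for i in range(0, len(rack_splits), per_group):
            groups.append(rack_splits[i:i + per_group])
    return groups


hosts = ["h1", "h2", "h1", "h2", "h1", "h3", "h2"]
example = [Split(i, h) for i, h in enumerate(hosts)]
racks = {"h1": "r1", "h2": "r1", "h3": "r1"}
print([[s.index for s in g] for g in group_splits(example, racks, 2)])
# prints [[0, 2], [1, 3], [4, 5], [6]]
```

Without the sort on the original index, the rack-level leftovers in this example would be
grouped as [4, 6] and [5], which breaks the order of the input within a group.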



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)