tez-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEZ-1656) Grouping of splits should maintain the original ordering of splits within a group
Date Tue, 21 Oct 2014 21:45:34 GMT

    [ https://issues.apache.org/jira/browse/TEZ-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179174#comment-14179174 ]

Siddharth Seth commented on TEZ-1656:

It's marked Unstable primarily because the interaction with TEZ-1397 isn't defined. The
behaviour can also be a little difficult to understand: in most cases it will not be invoked
twice within the same app. When it is invoked across apps, even ones processing the same data,
the splits may not be consistent until TEZ-1397 goes in, since the ordering of the hosts could
be different. I'd guess a single config would control both TEZ-1396 and TEZ-1397. Once both are
in, I think we should stop marking it unstable.
Some notes on expected behaviour would be useful, irrespective of the annotation.

> Grouping of splits should maintain the original ordering of splits within a group
> ---------------------------------------------------------------------------------
>                 Key: TEZ-1656
>                 URL: https://issues.apache.org/jira/browse/TEZ-1656
>             Project: Apache Tez
>          Issue Type: Task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: TEZ-1656.1.patch, TEZ-1656.2.patch, TEZ-1656.3.patch, TEZ-1656.4.patch
> Sometimes the original splits may have an ordering (e.g. splits from a sorted file). Maintaining
> the ordering of splits inside a group preserves the sort order.
> The node-level grouping maintains ordering. When collecting leftover groups for rack-level
> grouping, the ordering is lost in the current code.
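
To illustrate the problem described in the issue, here is a minimal Python sketch of two-level grouping. It is not the Tez implementation: the `Split` record, the `group_splits` function, and the `min_group_size` threshold are all hypothetical names invented for this example. It assumes each split remembers its position in the original list, so that leftovers merged at rack level can be put back into that order.

```python
from collections import defaultdict
from typing import NamedTuple


class Split(NamedTuple):
    # Hypothetical record, not a Tez class.
    index: int  # position in the original (possibly sorted) split list
    host: str
    rack: str


def group_splits(splits, min_group_size=2):
    """Group splits by host, then fold undersized host groups into rack groups."""
    by_host = defaultdict(list)
    for s in splits:
        # Appending while iterating in input order keeps each host group ordered.
        by_host[s.host].append(s)

    groups = []
    leftovers = defaultdict(list)
    for host_group in by_host.values():
        if len(host_group) >= min_group_size:
            groups.append(host_group)
        else:
            leftovers[host_group[0].rack].extend(host_group)

    for rack_group in leftovers.values():
        # Leftovers were concatenated host by host, which can interleave the
        # original order, so restore it explicitly before emitting the group.
        rack_group.sort(key=lambda s: s.index)
        groups.append(rack_group)
    return groups
```

Without the explicit sort, the rack-level group is simply the concatenation of the leftover host groups, which is where the original ordering would be lost in this sketch.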

This message was sent by Atlassian JIRA
