tez-issues mailing list archives

From "Syed Shameerur Rahman (Jira)" <j...@apache.org>
Subject [jira] [Commented] (TEZ-2103) Implement a Partial completion VertexManagerPlugin
Date Wed, 20 Jan 2021 07:45:00 GMT

    [ https://issues.apache.org/jira/browse/TEZ-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268416#comment-17268416 ]

Syed Shameerur Rahman commented on TEZ-2103:
--------------------------------------------

[~rajesh.balamohan] Any thoughts on the approach mentioned in https://issues.apache.org/jira/browse/TEZ-2103?focusedCommentId=17110149&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17110149 ?

> Implement a Partial completion VertexManagerPlugin
> --------------------------------------------------
>
>                 Key: TEZ-2103
>                 URL: https://issues.apache.org/jira/browse/TEZ-2103
>             Project: Apache Tez
>          Issue Type: New Feature
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>              Labels: gsoc, gsoc2015, hadoop, java, tez
>         Attachments: TEZ-2103.01.patch, TEZ-2103.02.patch, TEZ-2103.03.patch, TEZ-2103.WIP.patch
>
>
> Currently, there is no sibling communication between tasks - this implies that a task
> can be completed by the first vertex in a wave of tasks, but the entire wave of tasks has
> to complete before success can be reported.
> This occurs in limit + filter query patterns common between the data access engines.
> {code}
> select * from data where x > 1 limit 10;
> {code}
> will run through a full-table scan worth of tasks to generate 10 rows per task, to aggregate
> it to produce the final 10 row result.
> The VertexManager receives counters/events early enough to short-circuit the rest of
> the vertex tasks, to prevent the remainder of tasks from getting scheduled when the limit
> condition has been satisfied by an initial sub-set of the tasks.
> This is a specialization of the VertexManagerPlugin for this common case scheduling pattern.
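
The short-circuit idea in the description above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: it does not use the Tez VertexManagerPlugin API, and the class name PartialCompletionSketch, the method run, and the parameters rowsPerTask, waveSize and limit are all hypothetical. It assumes tasks are launched in fixed-size waves, that each task emits at most `limit` rows, and that row counts only become visible to the scheduler once a wave has finished.

```java
// Hypothetical simulation of limit-aware scheduling; none of these names are Tez APIs.
final class PartialCompletionSketch {

    /** Outcome of one simulated vertex run. */
    static final class Result {
        final int tasksScheduled;  // tasks that were actually launched
        final int tasksSkipped;    // tasks never launched because the limit was already met
        final long rowsCollected;  // rows reported by the launched tasks

        Result(int tasksScheduled, int tasksSkipped, long rowsCollected) {
            this.tasksScheduled = tasksScheduled;
            this.tasksSkipped = tasksSkipped;
            this.rowsCollected = rowsCollected;
        }
    }

    /**
     * Launches tasks wave by wave and stops scheduling further waves once the
     * rows reported so far satisfy the limit.
     *
     * @param rowsPerTask matching rows each task would find in its split (assumed input)
     * @param waveSize    number of tasks that run concurrently in one wave
     * @param limit       the LIMIT value of the query
     */
    static Result run(int[] rowsPerTask, int waveSize, long limit) {
        if (waveSize <= 0) {
            throw new IllegalArgumentException("waveSize must be positive");
        }
        int scheduled = 0;
        long rows = 0;
        // Check the limit before each wave; a wave that has started always runs to completion.
        while (scheduled < rowsPerTask.length && rows < limit) {
            int waveEnd = Math.min(scheduled + waveSize, rowsPerTask.length);
            for (int t = scheduled; t < waveEnd; t++) {
                // Each task stops emitting once it has produced `limit` rows itself.
                rows += Math.min(rowsPerTask[t], limit);
            }
            scheduled = waveEnd;
        }
        return new Result(scheduled, rowsPerTask.length - scheduled, rows);
    }
}
```

With eight tasks, a wave size of 2 and a limit of 10, a call such as run(new int[] {4, 3, 5, 10, 10, 10, 10, 10}, 2, 10) launches only the first two waves in this model and leaves the remaining four tasks unscheduled. A real plugin would react to counters or events from running tasks rather than to precomputed row counts.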



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
