tez-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (TEZ-1547) Make use of state change notifier in VertexManagerPlugins
Date Fri, 31 Oct 2014 01:47:33 GMT

    [ https://issues.apache.org/jira/browse/TEZ-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191181#comment-14191181
] 

Bikas Saha edited comment on TEZ-1547 at 10/31/14 1:47 AM:
-----------------------------------------------------------

Fixed the race condition. Added test cases for that.

bq. vertexReconfigurationPlanned - In the JavaDoc, mentioning that scheduleTasks / addInputInitilizerEvents
does not qualify as re-configuration 
Done.

bq. public void doneReconfiguringVertex() - The same could be achieved by assuming reconfiguration
Its actually already done that way to make the common case easier. An incompletely configured
vertex will implicitly send the notification when its completely configured. Calling these
API's is optional for the common case. The new API's are needed only when a fully configured
vertex is re-configured. E.g. Shuffle vertex parallelism is set and auto-parallel can choose
to not change the parallelism and setParallelism() will not be invoked. However, doneReconfiguringVertex()
must be invoked to let the vertex know that there is not going to be any further reconfiguring
and it sends the notification at that time. This is also going to enable us to quickly add
multiple invocations of setParallelism() without encountering the current issues because the
doneReconfiguringVertex() can be used to notify the vertex that reconfiguration is finally
complete.

bq. public void vertexManagerDone(); - Do we need to add this right now.
This has been added for multiple reasons, not just for event unregistering. Recovery needs
to know when a vertex manager is done doing all things like sending events etc. However, given
that without the locking fix this is not going to work. And given that its an optional API
and things work without it, I am going to back it out for this patch. For the same reason
will not add an upcall to unregister in this patch. Its easy for users to ignore events if
needed.

bq. onVertexStateUpdated should throw Exception
Done. Those patches came in while this was in progress. So now its my problem :)

bq. COMPLETELY_CONFIGURED - This is only sent out after startVertex has been invoked in VertexImpl
Done this way because there are many path out of INITING. But all of those immediately go
trigger startVertex(). So it was easier to put it there without any drawbacks. This would
result in this and RUNNING coming one after another except for cases like auto-reduce/multiple
changes in parallelism etc.

bq. ImmediateStartVertexManager. SourceInfo no longer required
Removed. I think it was that way because only in those cases the ImmediateStartManager is
effective. If there is an SG edge then the shuffle vertex manager gets used. Doing it this
way removes need for special casing in both vertex managers and also re-enabled the min/max=0
case for perf in the shuffle vertex manager.

bq. VertexImpl "if (fromVertexManager && canInitVertex()) {" - this check should ideally
be af
Done. This is my last comment where I mentioned it needs to go inside the try block.

bq. Typo: defune
Will do.

bq. MRInputSplitDistributor and the correct number of tasks up front
>From what I see only MRInputAMSplitGenerator results in setParallelism to be called. In
which case numTasks will be -1. The distributor does not result in setParallelism to be called.
If needed, we could change the RootInputVertexManager to use the new APIs and make this check
pass. But dont think thats needed now.

bq.  s/unregisterForVertexStatusUpdates/unregisterForVertexStateUpdates
Will do

bq. Does everything on the VertexManager need to be synchronized ?
IMO its fine to go with heavy handed synchronization for now since this is not a high throughput
API and it allows VMs to safely call methods. We should have done this much earlier.

bq.  on InputReadyVertexManager. That is intentional ? Nothing required,
Right. Nothing required. This is the fully conservative scheduler.

Will post a revised patch soon.



was (Author: bikassaha):
Fixed the race condition. Added test cases for that.

bq. vertexReconfigurationPlanned - In the JavaDoc, mentioning that scheduleTasks / addInputInitilizerEvents
does not qualify as re-configuration 
Done.

bq. public void doneReconfiguringVertex() - The same could be achieved by assuming reconfiguration
Its actually already done that way to make the common case easier. An incompletely configured
vertex will implicitly send the notification when its completely configured. Calling these
API's is optional for the common case. The new API's are needed only when a fully configured
vertex is re-configured. E.g. Shuffle vertex parallelism is set and auto-parallel can choose
to not change the parallelism and setParallelism() will not be invoked. However, doneReconfiguringVertex()
must be invoked to let the vertex know that there is not going to be any further reconfiguring
and it sends the notification at that time. This is also going to enable us to quickly add
multiple invocations of setParallelism() without encountering the current issues because the
doneReconfiguringVertex() can be used to notify the vertex that reconfiguration is finally
complete.

bq. public void vertexManagerDone(); - Do we need to add this right now.
This has been added for multiple reasons, not just for event unregistering. Recovery needs
to know when a vertex manager is done doing all things like sending events etc. However, given
that without the locking fix this is not going to work. And given that its an optional API
and things work without it, I am going to back it out for this patch.

bq. onVertexStateUpdated should throw Exception
Done. Those patches came in while this was in progress. So now its my problem :)

bq. COMPLETELY_CONFIGURED - This is only sent out after startVertex has been invoked in VertexImpl
Done this way because there are many path out of INITING. But all of those immediately go
trigger startVertex(). So it was easier to put it there without any drawbacks. This would
result in this and RUNNING coming one after another except for cases like auto-reduce/multiple
changes in parallelism etc.

bq. ImmediateStartVertexManager. SourceInfo no longer required
Removed. I think it was that way because only in those cases the ImmediateStartManager is
effective. If there is an SG edge then the shuffle vertex manager gets used. Doing it this
way removes need for special casing in both vertex managers and also re-enabled the min/max=0
case for perf in the shuffle vertex manager.

bq. VertexImpl "if (fromVertexManager && canInitVertex()) {" - this check should ideally
be af
Done. This is my last comment where I mentioned it needs to go inside the try block.

bq. Typo: defune
Will do.

bq. MRInputSplitDistributor and the correct number of tasks up front
>From what I see only MRInputAMSplitGenerator results in setParallelism to be called. In
which case numTasks will be -1. The distributor does not result in setParallelism to be called.
If needed, we could change the RootInputVertexManager to use the new APIs and make this check
pass. But dont think thats needed now.

bq.  s/unregisterForVertexStatusUpdates/unregisterForVertexStateUpdates
Will do

bq. Does everything on the VertexManager need to be synchronized ?
IMO its fine to go with heavy handed synchronization for now since this is not a high throughput
API and it allows VMs to safely call methods. We should have done this much earlier.

bq.  on InputReadyVertexManager. That is intentional ? Nothing required,
Right. Nothing required. This is the fully conservative scheduler.

Will post a revised patch soon.


> Make use of state change notifier in VertexManagerPlugins
> ---------------------------------------------------------
>
>                 Key: TEZ-1547
>                 URL: https://issues.apache.org/jira/browse/TEZ-1547
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Bikas Saha
>         Attachments: TEZ-1547.1.patch, TEZ-1547.3.patch, TEZ-1547.4.patch, TEZ-1547.5.patch,
TEZ-1547.6.patch, TEZ-1547.7.patch
>
>
> Instead of the various APIs like onVertexStarted, simple notifications could be sent.
> Some existing APIs could end up being deprecated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message