falcon-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pallavi Rao" <pallavi....@inmobi.com>
Subject Re: Review Request 35724: Base framework of the native scheduler
Date Tue, 13 Oct 2015 11:54:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35724/#review101609
-----------------------------------------------------------



scheduler/pom.xml (line 92)
<https://reviews.apache.org/r/35724/#comment159041>

    Moved version to parent pom. Updated to latest version.



scheduler/pom.xml (line 103)
<https://reviews.apache.org/r/35724/#comment159043>

    Need it at 1.9.5 because PowerMock uses that and I intend to use PowerMock.



scheduler/pom.xml (line 109)
<https://reviews.apache.org/r/35724/#comment159042>

    Moved version to parent pom. Updated to latest release.



scheduler/src/main/java/org/apache/falcon/exception/DAGEngineException.java (line 23)
<https://reviews.apache.org/r/35724/#comment159046>

    Oops.. Fixed.



scheduler/src/main/java/org/apache/falcon/execution/EntityExecutor.java (line 33)
<https://reviews.apache.org/r/35724/#comment159064>

    Has a different package name, so NotificationHandler should be ok.



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 38)
<https://reviews.apache.org/r/35724/#comment159066>

    External ID is the ID used to identify the Job submitted to the DAG Engine, as returned
by the DAG Engine. For example, for Oozie this would be the workflow Id. Added some doc.
    
    Regarding tracking URL, right now, the URL can be easily constructed given a DAGEngine
and externalId, so don't intend to store it. But, I'm open to adding it, if required in the
future.



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 39)
<https://reviews.apache.org/r/35724/#comment159067>

    Renamed to instanceTime. 
    
    You are right. It is not mandatory. But, in case of a-periodic, it will still indicate
at what time the instance was created. Added another constructor without instanceTime.



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 104)
<https://reviews.apache.org/r/35724/#comment159069>

    My current thought is to create a new instance per run, in which the instance Id will
be a combination of entity name, cluster, instanceTime and runId. Will implement as part of
FALCON-1512.



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 145)
<https://reviews.apache.org/r/35724/#comment159068>

    Well. Individual instances may need to be suspended too by a user and not the the entity
itself. Hence, this method.
    
    But, yes, like we have EntityExecutor and Entity bean. I do want to separate out InstanceExecutor
and Instance bean. Will take that up as a separate JIRA and the instance bean will also have
to cater to state store.



scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java (line 46)
<https://reviews.apache.org/r/35724/#comment159071>

    Yes. Changed.



scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java (line 50)
<https://reviews.apache.org/r/35724/#comment159072>

    The private constructor is invoked during class loading itself as it is assigned to a
private static variable. So, there can only be single instance.



scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java (line 53)
<https://reviews.apache.org/r/35724/#comment159076>

    Changed to return Entity and not EntityState



scheduler/src/main/java/org/apache/falcon/execution/NotificationHandler.java (line 26)
<https://reviews.apache.org/r/35724/#comment159063>

    Since the package name is different, it should be ok.



scheduler/src/main/java/org/apache/falcon/execution/NotificationHandler.java (line 33)
<https://reviews.apache.org/r/35724/#comment159077>

    Makes sense. Changed to onEvent.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 60)
<https://reviews.apache.org/r/35724/#comment159078>

    The reason I didn't make this a plain bean was to support instance level operations such
as suspend, kill, resume. I can split ExecutionInstance as InstanceExecutor and the bean to
make things simpler.
    
    Makes sense to use DI for DAGEngine and ExecutionService.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 108)
<https://reviews.apache.org/r/35724/#comment159397>

    Since inputs are feeds, In line 100, I check if input is null. If no input, the execution
does not  reach here.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 109)
<https://reviews.apache.org/r/35724/#comment159398>

    Oops. Fixed it.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 120)
<https://reviews.apache.org/r/35724/#comment159399>

    This and next few lines are only illustrative, to say, we'll have to register to DataAvailabilityService,
as the TODO above indicates. I'll remove these to avoid confusion.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 241)
<https://reviews.apache.org/r/35724/#comment159400>

    Corrected it.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 273)
<https://reviews.apache.org/r/35724/#comment159401>

    1. Awaited predicates is not cleaned up on purpose. destroy() is called during abnormal
termination such as suspend or killed. In which case, we want to keep the awaited predicates
around for resume/rerun. Are you are suggesting that we re-evaluate the predicates on resumption
or re-run?
    
    2. Not sure what you mean here.. Will discuss offline.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutor.java (line 141)
<https://reviews.apache.org/r/35724/#comment159246>

    Could be. EntityExecutor has this method too. When we introduce the FeedExecutor will
refactor to move the common code into EntityExecutor.



scheduler/src/main/java/org/apache/falcon/notification/service/FalconNotificationService.java
(line 36)
<https://reviews.apache.org/r/35724/#comment159242>

    Changed.



scheduler/src/main/java/org/apache/falcon/notification/service/FalconNotificationService.java
(line 43)
<https://reviews.apache.org/r/35724/#comment159244>

    The request object parameters in addition to handler and callbackID that the notification
service uses. For example, the dataLocation, type, etc. in case of data availability. However,
handler and callbackID are sufficient to unregister. 
    
    I could use request. But, that would mean keeping the bulkier request object around although
it won't be used.



scheduler/src/main/java/org/apache/falcon/notification/service/FalconNotificationService.java
(line 48)
<https://reviews.apache.org/r/35724/#comment159241>

    Not really. Now, it returns a request builder.



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
32)
<https://reviews.apache.org/r/35724/#comment159237>

    Oops.. understand. Renamed this this NotificationServicesRegistry.



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
39)
<https://reviews.apache.org/r/35724/#comment159238>

    Done.



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
40)
<https://reviews.apache.org/r/35724/#comment159239>

    Done.



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
63)
<https://reviews.apache.org/r/35724/#comment159240>

    Yes. In fact, it is. This registry acts more like a router to the notification services.



scheduler/src/main/java/org/apache/falcon/notification/service/event/DataEvent.java (line
33)
<https://reviews.apache.org/r/35724/#comment159236>

    Have knocked this off. This isn't used. Latest patch for DataNotification is in FALCON-1230



scheduler/src/main/java/org/apache/falcon/notification/service/event/DataEvent.java (line
58)
<https://reviews.apache.org/r/35724/#comment159235>

    Have knocked this off. This isn't used. Latest patch for DataNotification is in FALCON-1230



scheduler/src/main/java/org/apache/falcon/notification/service/event/JobCompletedEvent.java
(line 28)
<https://reviews.apache.org/r/35724/#comment159080>

    Done.



scheduler/src/main/java/org/apache/falcon/notification/service/event/JobCompletedEvent.java
(line 37)
<https://reviews.apache.org/r/35724/#comment159213>

    Agreed. Fixed it.



scheduler/src/main/java/org/apache/falcon/notification/service/event/JobScheduledEvent.java
(line 30)
<https://reviews.apache.org/r/35724/#comment159214>

    Done.



scheduler/src/main/java/org/apache/falcon/notification/service/impl/JobCompletionService.java
(line 72)
<https://reviews.apache.org/r/35724/#comment159217>

    The intention is as follows : When a listener registers and says "let me know when this
Job completes". We check if the job has already completed, if so, we generate an event immediately.
If the job is not complete, we expect that this class will get notified when the job completes
as this class is a listener to WorkflowJobEndNotificationService.
    
    JobEndNotificationService has been made reliable as part of FALCON-1231.
    
    Added this clarification as comment.



scheduler/src/main/java/org/apache/falcon/notification/service/impl/JobCompletionService.java
(line 124)
<https://reviews.apache.org/r/35724/#comment159218>

    Sure. I have moved out the common code to an onEnd method.
    However, the onSuccess and onFailure methods need to be there as the class implements
WorkflowExecutionListener.



scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
(line 99)
<https://reviews.apache.org/r/35724/#comment159219>

    My bad. Fixed that.



scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
(line 220)
<https://reviews.apache.org/r/35724/#comment159220>

    When the cache exceeds its size, it is possible that the instance (which is running) can
be evicted from cache. That is why, to handle such cases, the onRemoval method persists it
in the state upon eviction.
    
    The cache is backed by state (loading cache), so, when we receive a completion event for
a particular instance, it will be loaded back into cache.



scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
(line 288)
<https://reviews.apache.org/r/35724/#comment159221>

    Makes sense. Done



scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
(line 335)
<https://reviews.apache.org/r/35724/#comment159222>

    Makes sense. Moved.



scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
(line 375)
<https://reviews.apache.org/r/35724/#comment159223>

    Same instance will not have a second run because the State has changed. A new instance
can be scheduled (based on parallelism)



scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
(line 430)
<https://reviews.apache.org/r/35724/#comment159224>

    From docs:    
         *  @return a negative integer, zero, or a positive integer as the first argument
is less than, equal to, or greater than the second.
         int compare(T o1, T o2);
         
    So, I guess it is ok?



scheduler/src/main/java/org/apache/falcon/notification/service/request/DataNotificationRequest.java
(line 30)
<https://reviews.apache.org/r/35724/#comment159225>

    Done



scheduler/src/main/java/org/apache/falcon/notification/service/request/DataNotificationRequest.java
(line 37)
<https://reviews.apache.org/r/35724/#comment159226>

    Put them in there, to just illustrate my thought process. Will leave it for now. These
have already been refactored in FALCON-1230, anyway.



scheduler/src/main/java/org/apache/falcon/notification/service/request/DataNotificationRequest.java
(line 43)
<https://reviews.apache.org/r/35724/#comment159227>

    Yep. As before, just to illustrate. Has been refactored in FALCON-1230.



scheduler/src/main/java/org/apache/falcon/notification/service/request/NotificationRequest.java
(line 38)
<https://reviews.apache.org/r/35724/#comment159228>

    Shouldn't be abstract class, knocked it off.



scheduler/src/main/java/org/apache/falcon/notification/service/request/TimeNotificationRequest.java
(line 31)
<https://reviews.apache.org/r/35724/#comment159229>

    I agree with the partial construction and that all mandatory parameters should be part
of constructor. Introduced builders to be consistent.



scheduler/src/main/java/org/apache/falcon/state/EntityState.java (line 71)
<https://reviews.apache.org/r/35724/#comment159232>

    Thanks. Rectified.



scheduler/src/main/java/org/apache/falcon/state/ID.java (line 33)
<https://reviews.apache.org/r/35724/#comment159538>

    Agree that it is bit overloaded. I tried separating this out into EntityID and InstanceID
(as subclasses of ID) and the code got a lot messier, with instanceof checks all over the
place as the notification services accept ID (and that I don't want to change).
    
    So, prefer to keep it this way for now.



scheduler/src/main/java/org/apache/falcon/state/ID.java (line 189)
<https://reviews.apache.org/r/35724/#comment159234>

    You are right. Fixed it.



scheduler/src/main/java/org/apache/falcon/state/store/AbstractStateStore.java (line 80)
<https://reviews.apache.org/r/35724/#comment159079>

    Yes. Otherwise there is no singleton guarantee. Corrected it.


- Pallavi Rao


On Oct. 13, 2015, 11:53 a.m., Pallavi Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35724/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2015, 11:53 a.m.)
> 
> 
> Review request for Falcon and Srikanth Sundarrajan.
> 
> 
> Bugs: FALCON-1213
>     https://issues.apache.org/jira/browse/FALCON-1213
> 
> 
> Repository: falcon-git
> 
> 
> Description
> -------
> 
> The patch has the basic framework for the scheduler. Each of the individual service needs
to be implemented completely and will be done as separate JIRAs. The intention of the patch
is ensure the base framework satisfies all use cases and get any early feedback in terms of
course correction.
> 
> 
> Diffs
> -----
> 
>   common/src/main/java/org/apache/falcon/entity/EntityUtil.java 3ab9339 
>   pom.xml 54e6cd1 
>   scheduler/pom.xml PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/exception/DAGEngineException.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/exception/InvalidStateTransitionException.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/exception/ServiceException.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/exception/StateStoreException.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/EntityExecutor.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/NotificationHandler.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutor.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/SchedulerUtil.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/notification/service/FalconNotificationService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/NotificationServicesRegistry.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/DataEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/Event.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/notification/service/event/JobCompletedEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/JobScheduledEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/TimeElapsedEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/AlarmService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/DataAvailabilityService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/JobCompletionService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/AlarmRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/DataNotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/JobCompletionNotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/JobScheduleNotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/NotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/predicate/Predicate.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/EntityState.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/EntityStateChangeHandler.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/ID.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/InstanceState.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/InstanceStateChangeHandler.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/StateMachine.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/StateService.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/store/AbstractStateStore.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/InMemoryStateStore.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/StateStore.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/workflow/engine/DAGEngine.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/workflow/engine/DAGEngineFactory.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/workflow/engine/FalconWorkflowEngine.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/workflow/engine/OozieDAGEngine.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/execution/FalconExecutionServiceTest.java
PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/execution/MockDAGEngine.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/execution/SchedulerUtilTest.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/notification/service/AlarmServiceTest.java
PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/notification/service/SchedulerServiceTest.java
PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/predicate/PredicateTest.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/state/EntityStateServiceTest.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/state/InstanceStateServiceTest.java PRE-CREATION

>   scheduler/src/test/resources/config/cluster/cluster-0.1.xml PRE-CREATION 
>   scheduler/src/test/resources/config/feed/feed-0.1.xml PRE-CREATION 
>   scheduler/src/test/resources/config/process/process-0.1.xml PRE-CREATION 
>   webapp/pom.xml e63aa44 
> 
> Diff: https://reviews.apache.org/r/35724/diff/
> 
> 
> Testing
> -------
> 
> New UTs have been added.
> 
> 
> Thanks,
> 
> Pallavi Rao
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message