falcon-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Srikanth Sundarrajan" <srik...@hotmail.com>
Subject Re: Review Request 35724: Base framework of the native scheduler
Date Mon, 14 Sep 2015 18:43:25 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35724/#review98194
-----------------------------------------------------------



common/src/main/java/org/apache/falcon/entity/EntityUtil.java (line 260)
<https://reviews.apache.org/r/35724/#comment154483>

    Are we sure we want to calculate months this way ? There are other places in falcon this
is done, but the usage mayn't have correctness implications, but being available in EntityUtil,
might result in inadvertent-incorrect usage.



common/src/main/java/org/apache/falcon/entity/EntityUtil.java (line 634)
<https://reviews.apache.org/r/35724/#comment154487>

    Is the staging dir owned by the entity submitting user ?



common/src/main/java/org/apache/falcon/entity/EntityUtil.java (line 641)
<https://reviews.apache.org/r/35724/#comment154485>

    Second condition may suffice



common/src/main/java/org/apache/falcon/entity/EntityUtil.java (line 647)
<https://reviews.apache.org/r/35724/#comment154493>

    Technically timestamp after md5 is a numeric value and is appended as is without any prefix
padding, so it might not sort it correctly, but may not cause issues for a long time to come.
Might be more readable as well to clearly compare on the FileStatus::getModificationTime()
or comparing on the timestamp after numeric::parse



scheduler/pom.xml (line 57)
<https://reviews.apache.org/r/35724/#comment155492>

    Why would we have oozie-adaptor dependency in Scheduler ?



scheduler/pom.xml (line 92)
<https://reviews.apache.org/r/35724/#comment155493>

    Version should be setup in dependencyManagement in parent pom as a best practice



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 38)
<https://reviews.apache.org/r/35724/#comment155503>

    What is external ID ? JavaDoc would be useful. Do we also intend to keep a tracking url.



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 39)
<https://reviews.apache.org/r/35724/#comment155506>

    nominalTime is a Oozie hangover. Can we use instanceTime instead ?
    
    Also is instanceTime/nominalTime mandatory for all scheduling types ? A Periodic ?



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 104)
<https://reviews.apache.org/r/35724/#comment155508>

    How do we handle re-runs in this world ? Will ExecutionInstance be different for different
runs ?



scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java (line 145)
<https://reviews.apache.org/r/35724/#comment155510>

    EntityExecutor has these behaviors, why are they again in the Instance ? If Instance is
modelled as a bean, then these behaviors can be avoided here.



scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java (line 46)
<https://reviews.apache.org/r/35724/#comment155511>

    Should this be held in a ConcurrentMap<> instead of Map ?



scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java (line 53)
<https://reviews.apache.org/r/35724/#comment155512>

    Why is StateStore::getEntities returning an iterable of EntityState ?



scheduler/src/main/java/org/apache/falcon/execution/NotificationHandler.java (line 33)
<https://reviews.apache.org/r/35724/#comment155528>

    onEvent() or onNotification() might be more apt. Currently NotificationHandler seems to
notify.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 60)
<https://reviews.apache.org/r/35724/#comment155531>

    Can we use DI here for the DAGEngine & ExecutionService ?
    
    Also if we make ExecutionInstance a bean, these might be unnessacary and will be limited
to EntityExecutor



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 109)
<https://reviews.apache.org/r/35724/#comment155539>

    Cluster specific locations has to be considered



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 120)
<https://reviews.apache.org/r/35724/#comment155550>

    DI ?



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 121)
<https://reviews.apache.org/r/35724/#comment155557>

    Can't quite get where the feed EL's being evaluated and paths being resolved.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 124)
<https://reviews.apache.org/r/35724/#comment155553>

    Shouldn't be polling frequency be lot smaller than the feed frequency ?



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 126)
<https://reviews.apache.org/r/35724/#comment155549>

    Not sure why data availability notification is being registered against ServicesRegistry.



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 127)
<https://reviews.apache.org/r/35724/#comment155556>

    Verbose logging



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 241)
<https://reviews.apache.org/r/35724/#comment155562>

    Comment seem to imply that process is scheduled in Oozie



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java (line 273)
<https://reviews.apache.org/r/35724/#comment155564>

    This has many issues:
    
    1. AwaitedPredicate is not cleaned up. This can lead to inconsistencies. 
    2. Suspend/Resume path also seems a bit suspect. Particularly concerned about scenarios
arising out of process having the same feed instances multiple times as part of different
dependency



scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutor.java (line 141)
<https://reviews.apache.org/r/35724/#comment155569>

    Would this not be common for both feed & process executors ?



scheduler/src/main/java/org/apache/falcon/notification/service/FalconNotificationService.java
(line 36)
<https://reviews.apache.org/r/35724/#comment155576>

    forNotification may be redundant since this is the NotificationService



scheduler/src/main/java/org/apache/falcon/notification/service/FalconNotificationService.java
(line 43)
<https://reviews.apache.org/r/35724/#comment155577>

    Why is unregister() is on a different entity other than NotificationRequest



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
32)
<https://reviews.apache.org/r/35724/#comment155572>

    Since there is already a service registry (Services), this is confusing. Can we rename
this to something more appropriate. Was mistaking this for the common Services all this while.



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
39)
<https://reviews.apache.org/r/35724/#comment155573>

    AlarmService ?



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
40)
<https://reviews.apache.org/r/35724/#comment155574>

    DataAvailabilityService ?



scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java (line
63)
<https://reviews.apache.org/r/35724/#comment155575>

    Can the NotificationService be registered with the general Services framework and accessed
through Services::get()



scheduler/src/main/java/org/apache/falcon/notification/service/event/DataEvent.java (line
58)
<https://reviews.apache.org/r/35724/#comment155578>

    Can this be documented. Not clear what is this for



scheduler/src/main/java/org/apache/falcon/notification/service/impl/JobCompletionService.java
(line 72)
<https://reviews.apache.org/r/35724/#comment155583>

    If the job is not complete, is the notification request ignored ? Can't see any polling
to check for job completion. Is this dependent on job end notifications ? Are we expecting
that to be resilient to failures ?



scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
(line 220)
<https://reviews.apache.org/r/35724/#comment155585>

    Is there a possiblity of removing running instances when the cache runs out of space.
Am unclear on what the behavior would be


- Srikanth Sundarrajan


On July 28, 2015, 11:07 a.m., Pallavi Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35724/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 11:07 a.m.)
> 
> 
> Review request for Falcon and Srikanth Sundarrajan.
> 
> 
> Bugs: FALCON-1213
>     https://issues.apache.org/jira/browse/FALCON-1213
> 
> 
> Repository: falcon-git
> 
> 
> Description
> -------
> 
> The patch has the basic framework for the scheduler. Each of the individual service needs
to be implemented completely and will be done as separate JIRAs. The intention of the patch
is ensure the base framework satisfies all use cases and get any early feedback in terms of
course correction.
> 
> 
> Diffs
> -----
> 
>   common/pom.xml 36de1f5 
>   common/src/main/java/org/apache/falcon/entity/EntityUtil.java b86d9d7 
>   common/src/main/java/org/apache/falcon/workflow/WorkflowJobEndNotificationService.java
c4f8843 
>   common/src/test/java/org/apache/falcon/entity/EntityUtilTest.java cfdc84d 
>   pom.xml 31997e8 
>   scheduler/pom.xml PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/exception/DAGEngineException.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/exception/InvalidStateTransitionException.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/exception/ServiceException.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/exception/StateStoreException.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/EntityExecutor.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/ExecutionInstance.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/FalconExecutionService.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/NotificationHandler.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutionInstance.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/execution/ProcessExecutor.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/notification/service/FalconNotificationService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/ServicesRegistry.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/DataEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/Event.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/notification/service/event/JobCompletedEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/JobScheduledEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/event/TimeElapsedEvent.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/DataNotificationService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/JobCompletionService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/SchedulerService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/impl/TimeNotificationService.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/DataNotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/JobCompletionNotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/JobScheduleNotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/NotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/notification/service/request/TimeNotificationRequest.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/predicate/Predicate.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/EntityState.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/EntityStateChangeHandler.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/ID.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/InstanceState.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/InstanceStateChangeHandler.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/StateMachine.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/StateService.java PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/state/store/AbstractStateStore.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/InMemoryStateStore.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/state/store/StateStore.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/workflow/engine/DAGEngine.java PRE-CREATION

>   scheduler/src/main/java/org/apache/falcon/workflow/engine/FalconWorkflowEngine.java
PRE-CREATION 
>   scheduler/src/main/java/org/apache/falcon/workflow/engine/OozieDAGEngine.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/execution/FalconExecutionServiceTest.java
PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/notification/service/SchedulerServiceTest.java
PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/notification/service/TimeNotificationServiceTest.java
PRE-CREATION 
>   scheduler/src/test/java/org/apache/falcon/predicate/PredicateTest.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/state/EntityStateServiceTest.java PRE-CREATION

>   scheduler/src/test/java/org/apache/falcon/state/InstanceStateServiceTest.java PRE-CREATION

>   scheduler/src/test/resources/config/cluster/cluster-0.1.xml PRE-CREATION 
>   scheduler/src/test/resources/config/feed/feed-0.1.xml PRE-CREATION 
>   scheduler/src/test/resources/config/process/process-0.1.xml PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35724/diff/
> 
> 
> Testing
> -------
> 
> New UTs have been added.
> 
> 
> Thanks,
> 
> Pallavi Rao
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message