falcon-dev mailing list archives

From "pavan kumar kolamuri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-1728) Process entity definition allows multiple clusters when it has output Feed defined.
Date Thu, 07 Jan 2016 05:10:39 GMT

    [ https://issues.apache.org/jira/browse/FALCON-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086815#comment-15086815 ]

pavan kumar kolamuri commented on FALCON-1728:
----------------------------------------------

Just adding a few more points to what [~ajayyadava] said:
{noformat}
<cluster name="cluster1" type="source" partition="${cluster.colo}">
      <validity start="2013-02-01T00:00Z" end="2099-01-01T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
</cluster>
{noformat}

In the feed definition we can specify a partition for each source cluster. Replication to
the target cluster then happens per partition, so data from one source cluster won't
overwrite data from another.

{noformat}
ex:  <feedpath>/{partition1}/cluster1-data
     <feedpath>/{partition2}/cluster2-data
{noformat}
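
To illustrate the point above, here is a minimal Python sketch. It is not Falcon code: the function names, the /data/feed path, and the colo1/colo2 partition values are hypothetical. It only models the idea that two source clusters producing the same file name collide on the target when paths are flat, and do not collide when each source writes under its own partition directory.

```python
from collections import defaultdict


def target_path(feed_path, file_name, partition=None):
    """Build the path a replicated file would land on in the target cluster.

    Hypothetical helper: with a partition, the file goes under
    <feed_path>/<partition>/; without one, directly under <feed_path>/.
    """
    base = feed_path.rstrip("/")
    if partition is None:
        return f"{base}/{file_name}"
    return f"{base}/{partition}/{file_name}"


def replicate(feed_path, sources, partitioned):
    """Map each target path to the list of source colos that write to it.

    `sources` maps a colo name (standing in for ${cluster.colo}) to the
    file names that colo produces for one feed instance.
    """
    writers = defaultdict(list)
    for colo, file_names in sources.items():
        for file_name in file_names:
            partition = colo if partitioned else None
            writers[target_path(feed_path, file_name, partition)].append(colo)
    return dict(writers)


def collisions(writers):
    """Return only the target paths written by more than one source."""
    return {path: colos for path, colos in writers.items() if len(colos) > 1}


# Two source clusters that both emit a file with the same name.
sources = {"colo1": ["part-00000"], "colo2": ["part-00000"]}

flat = replicate("/data/feed", sources, partitioned=False)
split = replicate("/data/feed", sources, partitioned=True)

print("flat collisions:", collisions(flat))
print("partitioned collisions:", collisions(split))
```

In this model the flat layout reports one contested path, while the partitioned layout reports none, which is the "no data override" property described above.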


> Process entity definition allows multiple clusters when it has output Feed defined. 
> ------------------------------------------------------------------------------------
>
>                 Key: FALCON-1728
>                 URL: https://issues.apache.org/jira/browse/FALCON-1728
>             Project: Falcon
>          Issue Type: Bug
>          Components: process
>    Affects Versions: 0.9
>            Reporter: Balu Vellanki
>            Assignee: Balu Vellanki
>            Priority: Critical
>
> Process XSD allows the user to specify multiple clusters per process entity. I am guessing
> this would allow a user to run duplicate instances of the process on multiple clusters at the
> same time (I do not really see a need for this). When the process has an output feed defined,
> you can have duplicate process instances writing to the same feed instance, causing data
> corruption/failures. The solution is to either:
> 1. Disallow multiple clusters per process. Let the user define a duplicate process
> if the user wants to run duplicate instances.
> OR
> 2. Allow multiple clusters, but only when there is no output feed defined.
> [~sriksun] please let me know if there is any other reason for allowing multiple clusters
> in a process.
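
For reference, option 2 in the quoted description could be expressed as a validation rule along the following lines. This is a hedged Python sketch only: Falcon's real entity classes and validators are not shown in this thread, so `ProcessEntity` and `validate_clusters` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessEntity:
    """Hypothetical, simplified stand-in for a process entity definition."""
    name: str
    clusters: list
    output_feeds: list = field(default_factory=list)


def validate_clusters(process):
    """Reject a process that runs on several clusters and also writes a feed.

    Models option 2: multiple clusters are allowed only when no output
    feed is defined, so duplicate instances cannot write to the same
    feed instance.
    """
    if len(process.clusters) > 1 and process.output_feeds:
        raise ValueError(
            f"process '{process.name}' defines {len(process.clusters)} clusters "
            f"and output feeds {process.output_feeds}; only one is allowed"
        )


# Allowed: several clusters, no output feed.
validate_clusters(ProcessEntity("reader", ["cluster1", "cluster2"]))

# Allowed: one cluster with an output feed.
validate_clusters(ProcessEntity("writer", ["cluster1"], ["out-feed"]))
```

Option 1 would be the stricter variant of the same check, raising whenever more than one cluster is listed regardless of output feeds.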



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
