gearpump-dev mailing list archives

From "Manu Zhang (JIRA)" <>
Subject [jira] [Commented] (GEARPUMP-68) If-statement support in DAG
Date Thu, 21 Apr 2016 08:02:25 GMT


Manu Zhang commented on GEARPUMP-68:

The comment history is too long to include here. Please refer to the original issue link for
more context.

> If-statement support in DAG
> ---------------------------
>                 Key: GEARPUMP-68
>                 URL:
>             Project: Apache Gearpump
>          Issue Type: New Feature
>            Reporter: Manu Zhang
> imported from [] on behalf of [~whjiang]
> h1. Goal
> Currently, in Gearpump, a publisher publishes each message to all of its subscriptions.
> However, some cases require publishing selectively to certain subscriptions. E.g. in a
> fraud detection use case, a threshold is checked to determine which route a message takes
> (a good user, a bad user, or a suspicious user). This routing is essentially an if-statement:
> {code}
> if (is_from_good_user(message)) 
>     no more check needed
> else if(is_from_bad_user(message))
>     alert and no more check needed
> else
>     perform additional check to decide
> {code}
> To support such routing, we need to route selectively at the processor level. (#1343 works
> at the task level rather than the processor level.)
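
For illustration, the three-way decision above can be sketched in Scala as a function that picks a route per message. This is only a sketch under stated assumptions: the `Message` type, the `riskScore` field, and both thresholds are hypothetical and do not come from Gearpump or from the issue.

```scala
object FraudRouting {
  // Hypothetical message type; the issue does not define one.
  final case class Message(userId: String, riskScore: Double)

  // The three routes named in the pseudocode.
  sealed trait Route
  case object NoCheck extends Route         // good user: no more checks needed
  case object Alert extends Route           // bad user: alert, no more checks needed
  case object AdditionalCheck extends Route // suspicious user: perform additional checks

  // Assumed thresholds, for illustration only.
  val GoodBelow = 0.2
  val BadAbove = 0.8

  // Choose exactly one route for each message.
  def route(msg: Message): Route =
    if (msg.riskScore < GoodBelow) NoCheck
    else if (msg.riskScore > BadAbove) Alert
    else AdditionalCheck
}
```

The point of the sketch is that each message takes exactly one route, which is the behavior the processor-level routing needs to provide.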
> h1. Solution
> h2. Solution 1
> Solution 1 is a workaround that requires no change to Gearpump core. Each if-then-else
> statement is represented as two filter processors:
> {code}
> upstream ~> conditionTrueFilter ~> thenClause   # filter out the false-condition messages in conditionTrueFilter
> upstream ~> conditionFalseFilter ~> elseClause  # filter out the true-condition messages in conditionFalseFilter
> {code}
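
The workaround can be modeled on plain Scala collections to show what each filter does. This is a minimal sketch, not Gearpump code: `condition`, the integer messages, and the object name are assumptions made for illustration. Because a publisher sends each message to all subscriptions, both filters see the full upstream and each drops the part it does not want.

```scala
object FilterWorkaround {
  // Assumed condition, for illustration only.
  val condition: Int => Boolean = _ % 2 == 0

  // Each filter receives every upstream message and keeps only its own half.
  def conditionTrueFilter(msgs: Seq[Int]): Seq[Int] = msgs.filter(condition)
  def conditionFalseFilter(msgs: Seq[Int]): Seq[Int] = msgs.filterNot(condition)

  val upstream: Seq[Int] = Seq(1, 2, 3, 4, 5)

  // Messages that reach thenClause and elseClause respectively.
  val toThenClause: Seq[Int] = conditionTrueFilter(upstream)
  val toElseClause: Seq[Int] = conditionFalseFilter(upstream)

  // Total deliveries from upstream: every message goes to both filters.
  val deliveries: Int = 2 * upstream.size
}
```

In this sketch both filters share one `condition` value, so they cannot drift apart. In the actual workaround the two filters are separate nodes in the DAG, each carrying its own copy of the condition.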
> The main advantage of this solution is that no Gearpump core code needs to change.
> The main disadvantages are:
>     It is hard to maintain. E.g. for a dynamic DAG, if the condition needs to change some
> day, both nodes must be changed carefully. Otherwise, they become inconsistent.
>     Bad performance. If the two condition filters are not in the same JVM as upstream,
> this causes significant and unnecessary network traffic.
>     Hard to understand. Users need to learn this BKM (best-known method) to write an if-statement.
>     Bad to express on the UI. From the DAG structure, it is impossible to tell which one is
> the then clause and which one is the else clause. So, users get no insight into how the
> condition check behaves. E.g. does the condition check succeed in most cases?
> h2. Solution 2
> Solution 2 is to add built-in support for if-statements. Basically, the design is:
> # Allow a processor to have more than one output channel. Each channel has a name. Each
> processor has a default output channel named "out". A processor can add alias names.
> # Each channel can have multiple subscribers. Thus, for a dynamic DAG, users can dynamically
> add/remove subscriptions for a certain channel.
> So, it is quite easy to implement if-statement support using this solution:
>     A special IFProcessor is created. It has two output channels, then and else. The UI can
> show the channel name on the edge. Inside IFProcessor, users can write output("then", msg)
> to output to the then channel.
>     The DSL can be expressed like this:
> {code}
>   upstream.if(condition, thenClause, elseClause)
> {code}
> or
> {code}
>    val ifStmt = upstream.if(condition)
>    ifStmt.then(thenClause)
>    ifStmt.else(elseClause)
> {code}
>     The low-level API can be expressed as:
> {code}
>   A#"then" ~ partitioner ~> B
> {code}
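
The channel design in Solution 2 can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `MultiChannelProcessor`, `IfProcessor`, `subscribe`, `unsubscribe`, and `process` are hypothetical names, not Gearpump APIs. Only `output("then", msg)`, the channel names, and the default "out" channel come from the proposal. Note that `if` and `else` are reserved words in Scala (and `then` is reserved in Scala 3), so a DSL method literally named `if` would need backticks or a different name. The sketch avoids the problem by using channel names as strings.

```scala
import scala.collection.mutable

object ChannelSketch {
  type Subscriber[T] = T => Unit

  // A processor with named output channels; each channel can have several subscribers.
  class MultiChannelProcessor[T] {
    private val channels =
      mutable.Map.empty[String, mutable.ListBuffer[Subscriber[T]]]

    // Subscriptions can be added or removed at any time (dynamic DAG).
    def subscribe(channel: String, s: Subscriber[T]): Unit = {
      channels.getOrElseUpdate(channel, mutable.ListBuffer.empty[Subscriber[T]]) += s
      ()
    }

    def unsubscribe(channel: String, s: Subscriber[T]): Unit =
      channels.get(channel).foreach(subs => subs -= s)

    // Publish only to the subscribers of the named channel.
    def output(channel: String, msg: T): Unit =
      channels.get(channel).foreach(subs => subs.foreach(s => s(msg)))

    // Default output channel named "out", as in the proposal.
    def output(msg: T): Unit = output("out", msg)
  }

  // Routes each message to the "then" or the "else" channel.
  class IfProcessor[T](condition: T => Boolean) extends MultiChannelProcessor[T] {
    def process(msg: T): Unit =
      output(if (condition(msg)) "then" else "else", msg)
  }

  // Usage: collect what each branch receives.
  def demo(): (List[Int], List[Int]) = {
    val thenSeen = mutable.ListBuffer.empty[Int]
    val elseSeen = mutable.ListBuffer.empty[Int]
    val p = new IfProcessor[Int](_ > 10)
    p.subscribe("then", (m: Int) => { thenSeen += m; () })
    p.subscribe("else", (m: Int) => { elseSeen += m; () })
    Seq(5, 20, 7, 42).foreach(p.process)
    (thenSeen.toList, elseSeen.toList)
  }
}
```

Unlike the two-filter workaround, each message here is delivered only to the subscribers of one channel, and the condition lives in a single place.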

This message was sent by Atlassian JIRA
