eagle-issues mailing list archives

From "Garrett Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (EAGLE-794) Enable publish bolt parallelism
Date Mon, 21 Nov 2016 09:57:58 GMT

    [ https://issues.apache.org/jira/browse/EAGLE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683060#comment-15683060 ]

Garrett Li commented on EAGLE-794:
----------------------------------

We cannot simply use a field grouping on AlertConstants.FIELD_0 to partition tuples, because
a single policy may have several publishers: for example, policy1 may have both publisher1 and
publisher2. If we send 2 events with the default group-by fields, this may cause duplicated
tuples.

To avoid this kind of duplicated tuple, we need to partition the stream with the same
strategy that the router bolt uses.
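
As a rough illustration of hash-based dispatch on group-by fields (this is not Eagle's
actual router bolt code), the sketch below picks a target task from the hash of a tuple's
group-by values, so tuples with equal values always go to the same task. The class
GroupByPartitioner, the method targetTask, and the sample values are hypothetical names
made up for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: maps a tuple's group-by values to one of numTasks tasks.
final class GroupByPartitioner {
    private final int numTasks;

    GroupByPartitioner(int numTasks) {
        if (numTasks <= 0) {
            throw new IllegalArgumentException("numTasks must be positive");
        }
        this.numTasks = numTasks;
    }

    // Returns a task index in [0, numTasks) derived from the group-by values.
    int targetTask(List<String> groupByValues) {
        // List.hashCode() is computed from the element hashes, so equal lists
        // hash equally; floorMod keeps the result non-negative.
        return Math.floorMod(groupByValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        GroupByPartitioner partitioner = new GroupByPartitioner(4);
        // Sample group-by values, assumed for illustration only.
        List<String> key = Arrays.asList("policy1", "host-a");
        System.out.println("task for " + key + " = " + partitioner.targetTask(key));
        System.out.println("same key again = " + partitioner.targetTask(key));
    }
}
```

Because the task is a pure function of the group-by values, any state cached locally for
those values only has to live on one task.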

> Enable publish bolt parallelism
> -------------------------------
>
>                 Key: EAGLE-794
>                 URL: https://issues.apache.org/jira/browse/EAGLE-794
>             Project: Eagle
>          Issue Type: Improvement
>    Affects Versions: v0.5.0
>            Reporter: Garrett Li
>            Assignee: Garrett Li
>             Fix For: v0.5.0
>
>
> Currently the publish bolt uses shuffle grouping, so we cannot enable parallelism for it:
> it may hold a local cache that is not available across the Storm cluster.
> We are going to use the same strategy that router bolts use to send to alert bolts: field
> grouping (defined with an empty list of fields), dispatching each tuple according to a hash
> of its group-by fields.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
