Date: Thu, 6 Jun 2013 08:08:53 -0400
Subject: Re: Question about processing architecture
From: Beth Lavender
To: dev@streams.incubator.apache.org

On Tue, Jun 4, 2013 at 8:32 AM, Matt Franklin wrote:

> On Mon, Jun 3, 2013 at 10:52 PM, Jason Letourneau wrote:
>
> > The current vision is that filters will be implemented agnostic to the
> > overall processing architecture. There may be subscribers using the
> > Lucene DSL as part of the initial Streams implementation, but the
> > interface won't dictate how a subscriber filters its activities.
>
> Maybe I am misunderstanding your statement. In my mind, we really need
> inbound and outbound data pipelines. I don't think a simple outbound
> filter can solve this easily. I don't see how the system can do
> de-duplication, supersession, aggregation, etc. during the outbound phase.
> We will need to do a lot of processing before we hit an intermediate
> persistence layer that can then be used by subscriber filters and query
> endpoints.
>
> The pipeline components themselves should be pluggable, and we just need a
> series of workflow events that they can hook into to do work against the
> incoming data.
>
> Am I off base?

The rollup [1] reference is useful. There is an implied set of use cases,
given the context for "views that support roll-up". Do we have a set of use
cases documented (or that can be referenced) that would help drive where in
the architecture the plug-ins are needed?

> > I don't know that we've figured out whether the subscriber delegate
> > tells the aggregator its filter via an interface, or whether the
> > aggregator tells each subscriber about every activity and the
> > subscriber filters. Either way, you can implement filters however you
> > want, provided they adhere to the common filter interface (which, to my
> > recollection, is very simplistic).
> >
> > On Monday, June 3, 2013, Lavender, Beth A wrote:
> >
> > > Many of our current systems that will feed the integrated activity
> > > stream are noisy. For example, if I update a page 4 times in 5 minutes,
> > > it generates an activity for each one. I want to be able to set a rule
> > > to discard the last n activities if they have the same actor, verb,
> > > and object in x time frame.
> > >
> > > This assumes a sub-processor that detects the pattern and takes an
> > > action described in a rule. Where do rules and sub-processors fit in
> > > this architecture? Is anyone doing this in their existing systems?
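
One way to picture the rule from the original question (discard repeated
activities with the same actor, verb, and object inside a time window) is as
a small inbound pipeline stage. The Python sketch below is only an
illustration under stated assumptions: Activity, DuplicateSuppressor, and
process are hypothetical names, not part of any Streams interface, and the
stage assumes activities arrive ordered by timestamp. It keeps the first
activity of a burst and drops the repeats that follow it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Activity:
    # Hypothetical minimal activity record; real activities carry more fields.
    actor: str
    verb: str
    object: str
    published: datetime


class DuplicateSuppressor:
    """Hypothetical inbound stage: drops repeats of (actor, verb, object)
    that occur within `window` of the previous matching activity."""

    def __init__(self, window: timedelta) -> None:
        self.window = window
        self._last_seen: Dict[Tuple[str, str, str], datetime] = {}

    def process(self, activities: Iterable[Activity]) -> Iterator[Activity]:
        # Assumes `activities` is ordered by `published`.
        for activity in activities:
            key = (activity.actor, activity.verb, activity.object)
            previous = self._last_seen.get(key)
            # Record every occurrence, so a steady burst keeps being suppressed.
            self._last_seen[key] = activity.published
            if previous is not None and activity.published - previous <= self.window:
                continue  # repeat inside the window: discard
            yield activity


# Four page updates within five minutes, then one more half an hour later.
start = datetime(2013, 6, 3, 9, 0)
edits = [
    Activity("beth", "update", "page-1", start + timedelta(minutes=m))
    for m in (0, 1, 3, 4, 30)
]
kept = list(DuplicateSuppressor(timedelta(minutes=5)).process(edits))
print([a.published.minute for a in kept])  # [0, 30]
```

Because the stage holds state across activities, it fits more naturally on
the inbound side, before the persistence layer, than in a per-subscriber
outbound filter, which is the distinction raised in the quoted reply.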