Subject: Re: Flow in Flume, could it make better?
From: Guillermo Ortiz
To: user@flume.apache.org
Date: Mon, 18 Aug 2014 22:35:03 +0200

In my test, everything runs in the same VM. Later, I'll have another flow that just spools or tails a file and sends the events through Avro to another source on my system.

Do I really need that replicating step? I think I have too many channels, and that means too many resources and too much configuration.

2014-08-18 19:51 GMT+02:00 terrey shih:

> Hi,
>
> Are your two sources (the spooling source and the Avro source fed by Sink2)
> in two different JVMs/machines?
>
> thx
>
> On Mon, Aug 18, 2014 at 9:53 AM, Guillermo Ortiz wrote:
>
>> Hi,
>>
>> I have built a flow with Flume and I don't know whether this is the right
>> way to do it or whether there is something better. I am spooling a
>> directory and need the data in three different HDFS paths with different
>> formats, so I have created two interceptors.
>>
>> Source(Spooling) + Replication + Interceptor1 --> to C1 and C2
>> C1 --> Sink1 to HDFS Path1 (a kind of historical archive)
>> C2 --> Sink2 to Avro --> Source Avro + Multiplexing + Interceptor2 --> C3 and C4
>> C3 --> Sink3 to HDFS Path2
>> C4 --> Sink4 to HDFS Path3
>>
>> Interceptor1 doesn't do much with the data; it just stores them as they
>> are, to keep a history of the original data.
>>
>> Interceptor2 configures a selector header. It processes the data and sets
>> the header so that the selector redirects each event to Sink3 or Sink4.
>> But this interceptor changes the original data.
>>
>> I tried to do the whole process without replicating the data, but I could
>> not. Now it seems like too many steps just because I want to store the
>> original data in HDFS as a history.
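
For reference, the flow described in the quoted message could be written roughly as the single-agent configuration below. This is only a sketch under assumptions: the agent name `a1`, the component names, the spool directory, the HDFS URLs, port 4141, the header name `route` with its values `path2`/`path3`, and the `com.example` interceptor classes are all hypothetical placeholders, not taken from the original setup.

```properties
# Hypothetical names throughout; adjust to the real environment.
a1.sources  = spool avroIn
a1.channels = c1 c2 c3 c4
a1.sinks    = k1 k2 k3 k4

# Spooling source: Interceptor1, then replicate every event to C1 and C2.
a1.sources.spool.type = spooldir
a1.sources.spool.spoolDir = /var/spool/flume-in
a1.sources.spool.channels = c1 c2
a1.sources.spool.selector.type = replicating
a1.sources.spool.interceptors = i1
a1.sources.spool.interceptors.i1.type = com.example.Interceptor1$Builder

# C1 -> Sink1: untouched copy of the data in HDFS Path1.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/path1

# C2 -> Sink2: Avro hop to the second source.
a1.sinks.k2.type = avro
a1.sinks.k2.channel = c2
a1.sinks.k2.hostname = localhost
a1.sinks.k2.port = 4141

# Avro source: Interceptor2 sets the "route" header, then multiplex on it.
a1.sources.avroIn.type = avro
a1.sources.avroIn.bind = 0.0.0.0
a1.sources.avroIn.port = 4141
a1.sources.avroIn.channels = c3 c4
a1.sources.avroIn.interceptors = i2
a1.sources.avroIn.interceptors.i2.type = com.example.Interceptor2$Builder
a1.sources.avroIn.selector.type = multiplexing
a1.sources.avroIn.selector.header = route
a1.sources.avroIn.selector.mapping.path2 = c3
a1.sources.avroIn.selector.mapping.path3 = c4

# C3 -> Sink3 (HDFS Path2), C4 -> Sink4 (HDFS Path3).
a1.sinks.k3.type = hdfs
a1.sinks.k3.channel = c3
a1.sinks.k3.hdfs.path = hdfs://namenode/flume/path2
a1.sinks.k4.type = hdfs
a1.sinks.k4.channel = c4
a1.sinks.k4.hdfs.path = hdfs://namenode/flume/path3

# Memory channels are assumed here for brevity.
a1.channels.c1.type = memory
a1.channels.c2.type = memory
a1.channels.c3.type = memory
a1.channels.c4.type = memory
```

Written out this way, the cost in question is visible: the Avro hop adds one extra sink, source, and channel solely so that Interceptor2 runs after the original data has already been copied to C1.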