From: IT CTO
Date: Mon, 21 Sep 2015 15:18:50 +0000
Subject: Re: Interceptor vrs Serialeztion
To: user@flume.apache.org

Thanks, I will try it.

On Mon, Sep 21, 2015 at 6:15 PM, Ahmed Vila <avila@devlogic.eu> wrote:

> I think an interceptor is the way to go, and you can use the
> regex_extractor interceptor instead of building your own, which
> simplifies deployment.
> https://flume.apache.org/FlumeUserGuide.html#regex-extractor-interceptor
>
> Further, you can set this interceptor's serializer type to
> RegexExtractorInterceptorMillisSerializer to parse the datetime, so that
> it can be formatted on the fly to suit your HDFS folder (e.g. extract
> year-month).
> https://flume.apache.org/FlumeUserGuide.html#example-2
>
> Later on, you can use a header replacement string in the HDFS path to
> inject a header defined with the serializer's name ("timestamp" in the
> example above).
> The HDFS sink supports the timestamp header out of the box and applies
> escape sequences against it by default, unless
> hdfs.useLocalTimeStamp = true.
> So, a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S will have
> replacement values from the timestamp header produced by the regex
> extractor.
>
> On Mon, Sep 21, 2015 at 5:03 PM, IT CTO <goi.cto@gmail.com> wrote:
>
>> I want to read files and write them to HDFS, but I want to write them
>> to a date-partitioned folder based on a date value IN THE ROW.
>> Should I write a custom interceptor or a custom serializer?
>> Eran
>> --
>> Eran | "You don't need eyes to see, you need vision" (Faithless)
>
> --
>
> Best regards,
> Ahmed Vila | Senior software developer
> DevLogic | Sarajevo | Bosnia and Herzegovina
>
> Office : +387 33 942 123
> Mobile: +387 62 139 348
>
> Website: www.devlogic.eu
> E-mail : avila@devlogic.eu
> ---------------------------------------------------------------------
> This e-mail and any attachment is for authorised use by the intended
> recipient(s) only. This email contains confidential information. It should
> not be copied, disclosed to, retained or used by, any party other than the
> intended recipient. Any unauthorised distribution, dissemination or copying
> of this E-mail or its attachments, and/or any use of any information
> contained in them, is strictly prohibited and may be illegal. If you are
> not an intended recipient then please promptly delete this e-mail and any
> attachment and all copies and inform the sender directly via email. Any
> emails that you send to us may be monitored by systems or persons other
> than the named communicant for the purposes of ascertaining whether the
> communication complies with the law and company policies.

--
Eran | "You don't need eyes to see, you need vision" (Faithless)

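
The setup described in the quoted reply could look roughly like the fragment below. This is only a sketch under assumptions, not a tested configuration: the agent name a1 and sink name k1 come from the message, while the source name r1, interceptor name i1, and serializer name s1 are hypothetical labels. It also assumes each row starts with a date such as "2015-09-21 15:18"; the regex and pattern would need adjusting to the real row format. Source, channel, and the remaining sink settings are omitted.

```
# Attach a regex_extractor interceptor to the (hypothetical) source r1.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor

# Assumed row format: the row begins with "yyyy-MM-dd HH:mm".
# Backslashes are doubled because this is a Java properties file.
a1.sources.r1.interceptors.i1.regex = ^(\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d)

# One serializer for the single capture group: it parses the matched text
# with the given pattern and stores epoch milliseconds in the "timestamp" header.
a1.sources.r1.interceptors.i1.serializers = s1
a1.sources.r1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
a1.sources.r1.interceptors.i1.serializers.s1.name = timestamp
a1.sources.r1.interceptors.i1.serializers.s1.pattern = yyyy-MM-dd HH:mm

# The HDFS sink resolves the escape sequences in the path from the
# "timestamp" header as long as hdfs.useLocalTimeStamp is not set to true.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
```

With a fragment along these lines, a row dated 2015-09-21 should be written under a folder derived from that date rather than from the time the agent processed it.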