Subject: Re: Review Request 39711: Lifecycle does not allow feed with frequency greater than days(1)
From: "Ajay Yadava"
To: "Sowmya Ramesh", "Falcon", "Ajay Yadava"
Date: Thu, 29 Oct 2015 04:40:01 -0000

> On Oct. 28, 2015, 11:46 p.m., Sowmya Ramesh wrote:
> > common/src/main/java/org/apache/falcon/entity/FeedHelper.java, line 813
> >
> > Sorry for the multiple comments. I didn't review the Lifecycle feature, so I didn't have the complete picture.
> >
> > Frequency in the retention stage is not mandatory, and if the frequency is not set by the user then
> > 1> If feed frequency < 6 hrs, it's set to 6 hrs
> > 2> If it's > 6 hrs, it's set to the feed frequency
> >
> > Shouldn't it fall back to the current behavior for data retention? < 6 hrs set to 6 hrs and > 6 hrs set to 1 day?
> >
> > This is required for 2 reasons
> > 1> Current understanding of users is that if feed frequency > 6 hrs, the retention job will run every day. We shouldn't deviate from this.
> >
> > 2> I also spoke with Venkatesh about why it was set to 1 day. He mentioned that in case retention fails and the reruns fail too, we don't want to keep the data till it runs next time if feed frequency is used. This can cause an SEC retention violation and also cause memory issues if the feed frequency is, say, one year. If the job runs every day it catches up for the scenario mentioned above.
> >
> > Any specific reason to change the old behavior?

Sowmya and I had an offline discussion to address this. Updating the gist here.

We try to fall back to the old behaviour as much as possible, but it fails the extra validations in lifecycle retention. The current behaviour is to retain the old behaviour as much as possible within the new constraints (specifically, retention shouldn't be more frequent than data availability).

Keeping retention frequency as a fallback to retries is not the best thing to do in such scenarios. If it fails all retries, there is no guarantee that it will succeed next time either. It means the system is not able to recover on its own and needs manual intervention. The best way to deal with such scenarios is to have appropriate monitoring and alerting (e.g. users can now have email alerts on failure of the retention workflow).

That kind of setup also fails for a majority of frequencies: for minutely, hourly and daily feeds (everything apart from roll-ups such as monthly), the reasoning mentioned above does not ensure the guarantee. So the guarantee is already broken, if it was ever the intent.

Also, the above behaviour wastes resources 99% of the time to solve for that rare 1% case. Coordinators will run and have nothing to do.

- Ajay

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39711/#review104372
-----------------------------------------------------------

On Oct. 28, 2015, 6:04 p.m., Ajay Yadava wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39711/
> -----------------------------------------------------------
>
> (Updated Oct. 28, 2015, 6:04 p.m.)
>
> Review request for Falcon.
>
> Bugs: FALCON-1560
>     https://issues.apache.org/jira/browse/FALCON-1560
>
> Repository: falcon-git
>
> Description
> -------
>
> Lifecycle does not allow feed with frequency greater than days(1)
>
> Diffs
> -----
>
>   common/src/main/java/org/apache/falcon/entity/FeedHelper.java 5c252a8
>   common/src/test/java/org/apache/falcon/entity/FeedHelperTest.java 4020d36
>   common/src/test/java/org/apache/falcon/entity/parser/FeedEntityParserTest.java 905be68
>
> Diff: https://reviews.apache.org/r/39711/diff/
>
> Testing
> -------
>
> Added unit test for the scenarios.
>
> Thanks,
>
> Ajay Yadava
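
For readers following the thread, the two default rules being compared can be sketched roughly as below. This is only an illustration of the rules as described in the discussion, not the code in FeedHelper.java: the class name, method names, and the use of java.time.Duration are all assumptions made for the sketch, and the thread does not say what happens at exactly 6 hrs for the old rule (the sketch treats it as "1 day").

```java
import java.time.Duration;

/** Illustrative sketch only; names are hypothetical and not taken from Falcon. */
final class RetentionFrequencySketch {
    // The 6-hour floor mentioned in the review thread.
    static final Duration SIX_HOURS = Duration.ofHours(6);

    /** Lifecycle default as described: below 6 hrs use 6 hrs, otherwise use the feed frequency. */
    static Duration lifecycleDefault(Duration feedFrequency) {
        return feedFrequency.compareTo(SIX_HOURS) < 0 ? SIX_HOURS : feedFrequency;
    }

    /** Old default as described: below 6 hrs use 6 hrs, otherwise run once a day (assumed at exactly 6 hrs too). */
    static Duration legacyDefault(Duration feedFrequency) {
        return feedFrequency.compareTo(SIX_HOURS) < 0 ? SIX_HOURS : Duration.ofDays(1);
    }

    /** The lifecycle constraint from the thread: retention must not run more often than data arrives. */
    static boolean runsMoreOftenThanData(Duration retentionFrequency, Duration feedFrequency) {
        return retentionFrequency.compareTo(feedFrequency) < 0;
    }

    public static void main(String[] args) {
        Duration monthlyFeed = Duration.ofDays(30); // stand-in for a feed with frequency greater than days(1)
        Duration legacy = legacyDefault(monthlyFeed);
        Duration lifecycle = lifecycleDefault(monthlyFeed);
        System.out.println("legacy violates constraint: " + runsMoreOftenThanData(legacy, monthlyFeed));
        System.out.println("lifecycle violates constraint: " + runsMoreOftenThanData(lifecycle, monthlyFeed));
    }
}
```

Under these assumptions, the old daily default for a monthly feed would run more often than the data arrives, which is the case the lifecycle validation rejects, while the lifecycle default never does.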