Subject: Re: Feeds UDF
To: dev@asterixdb.incubator.apache.org
References: <8CEA091E-1E8A-46AE-98F7-9EBE5FF9AEE0@gmail.com>
From: Mike Carey
Message-ID: <5667DC12.3020800@gmail.com>
Date: Tue, 8 Dec 2015 23:45:22 -0800
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------080405010800010105080706"

--------------080405010800010105080706
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

I'm confused: Why is "self-join" (or any join) an issue? I think the
alleged case (:-)) against self-join is equivalent to a case against ever
doing any queries against any data set under any circumstances where data
is being inserted... I don't think we want to restrict the system to only
querying read-only datasets...

A feed lets you run a query against the system based on the contents of a
current incoming record R. Unless I am missing something (which is not
unlikely because it's been a long day and I just got home from traveling
:-)), this is equivalent to:

    let $r = ... (picture a constant constructor that yields the same
                  content as R) ...
    return feed_processor ($r)

Right? I.e., the new record R is not yet in the dataset - so - what's the
issue? What's special about this?

Cheers,
Mike

PS - Again, apologies if a long day has led to extra cluelessness on my
part...

On 12/8/15 9:52 PM, abdullah alamoudi wrote:
> I think that we probably should restrict feed applied functions somehow
> (needs further thoughts and discussions) and I know for sure that we don't.
> As for the case you present, I would imagine that it could be allowed
> theoretically but I think everyone sees why it should be disallowed.
>
> One thing to keep in mind is that we introduce a materialize if the dataset
> was part of an insert pipeline. Now think about how this would work with a
> continuous feed. One choice would be that the feed will materialize all
> records to be inserted and once the feed stops, it would start inserting
> them but I still think we should not allow it.
>
> My 2c,
> Any opposing argument?
>
>
> Amoudi, Abdullah.
>
> On Tue, Dec 8, 2015 at 6:28 PM, Ildar Absalyamov wrote:
>> Hi All,
>>
>> As a part of feed ingestion we do allow preprocessing incoming data with
>> AQL UDFs.
>> I was wondering if we somehow restrict the kind of UDFs that could be
>> used? Do we allow joins in these UDFs? Especially joins with the same
>> dataset, which is used for intake.
>> Ex:
>>
>> create type TweetType as open {
>>   id: string,
>>   username : string,
>>   location : string,
>>   text : string,
>>   timestamp : string
>> }
>> create dataset Tweets(TweetType)
>> primary key id;
>> create function feed_processor($x) {
>>   for $y in dataset Tweets
>>   // self-join with Tweets dataset on some predicate($x, $y)
>>   return $y
>> }
>> create feed TweetFeed
>> apply function feed_processor;
>>
>> The query above fails in runtime, but I was wondering if that
>> theoretically could work at all.
>>
>> Best regards,
>> Ildar
>>
>>

--------------080405010800010105080706--
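The point in the reply above (the applied function runs against the dataset as it stands before the incoming record R is inserted) can be modelled with a small sketch. This is plain Python, not AQL, and it is not how AsterixDB implements feeds. The names `run_feed` and `Record`, the username-equality predicate, and the choice to insert R itself after the function runs are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]  # hypothetical stand-in for TweetType


def feed_processor(x: Record, tweets: List[Record]) -> List[Record]:
    """Model of the UDF: self-join of x against Tweets.

    The thread only says "some predicate($x, $y)"; equality on
    username is an assumed example predicate.
    """
    return [y for y in tweets if y["username"] == x["username"]]


def run_feed(incoming: Iterable[Record], tweets: List[Record]) -> List[List[Record]]:
    """Apply the UDF to each incoming record, then insert that record.

    Each call sees only records inserted before R, so the read never
    observes R's own pending insert.
    """
    results = []
    for r in incoming:
        snapshot = list(tweets)  # contents of Tweets before R is inserted
        results.append(feed_processor(r, snapshot))
        tweets.append(r)  # assumption: the feed inserts R itself
    return results
```

Under this model each incoming record behaves like a constant passed to an ordinary query. The sketch does not cover the materialization concern raised for continuous feeds.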