Subject: Re: Feeds UDF
From: Mike Carey
To: dev@asterixdb.incubator.apache.org
Date: Tue, 8 Dec 2015 23:36:29 -0800

I have the impression that AQL UDFs are poorly tested right now, so this failure may be unrelated to the sameness of the dataset. We don't have any restrictions currently, and I'm not sure we need to - joins are an important use case, actually. (Not self joins specifically, but joins for sure.)

Cheers,
Mike

On 12/8/15 6:28 PM, Ildar Absalyamov wrote:
> Hi All,
>
> As a part of feed ingestion we do allow preprocessing incoming data with AQL UDFs.
> I was wondering if we somehow restrict the kind of UDFs that could be used? Do we allow joins in these UDFs? Especially joins with the same dataset, which is used for intake. Ex:
>
> create type TweetType as open {
>   id: string,
>   username : string,
>   location : string,
>   text : string,
>   timestamp : string
> }
> create dataset Tweets(TweetType)
> primary key id;
> create function feed_processor($x) {
>   for $y in dataset Tweets
>   // self-join with Tweets dataset on some predicate($x, $y)
>   return $y
> }
> create feed TweetFeed
> apply function feed_processor;
>
> The query above fails in runtime, but I was wondering if that theoretically could work at all.
>
> Best regards,
> Ildar