Return-Path: X-Original-To: apmail-asterixdb-users-archive@minotaur.apache.org Delivered-To: apmail-asterixdb-users-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id F3CFD18076 for ; Wed, 9 Dec 2015 07:22:32 +0000 (UTC) Received: (qmail 59273 invoked by uid 500); 9 Dec 2015 07:22:32 -0000 Delivered-To: apmail-asterixdb-users-archive@asterixdb.apache.org Received: (qmail 59244 invoked by uid 500); 9 Dec 2015 07:22:32 -0000 Mailing-List: contact users-help@asterixdb.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: users@asterixdb.incubator.apache.org Delivered-To: mailing list users@asterixdb.incubator.apache.org Received: (qmail 59218 invoked by uid 99); 9 Dec 2015 07:22:32 -0000 Received: from Unknown (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 09 Dec 2015 07:22:32 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 2EC3C1A246F; Wed, 9 Dec 2015 07:22:32 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.9 X-Spam-Level: ** X-Spam-Status: No, score=2.9 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=3, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-eu-west.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id 6xYNP2H14yPj; Wed, 9 Dec 2015 07:22:22 +0000 (UTC) Received: from mail-pf0-f171.google.com (mail-pf0-f171.google.com [209.85.192.171]) by mx1-eu-west.apache.org (ASF Mail Server at mx1-eu-west.apache.org) with ESMTPS id C7C5E21270; Wed, 9 Dec 2015 07:22:21 +0000 (UTC) Received: by pfbg73 with SMTP id g73so26138807pfb.1; Tue, 08 Dec 2015 23:17:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:content-type:mime-version:subject:from:in-reply-to:date:cc :message-id:references:to; bh=yJUobd1eZFDFqHbs4qGyFzUPKkTJLe5eNwXZX+GErKM=; b=BfMrqOfWX7urLrvpkuRhyT+JUSQI6OhlB9m/EZH6ijz4dBK+clZycsYyoymdAm3kbQ Y2zzv19Hzc8N8qJOn0K6vgfbFAn8tY6TxinfFwAqz/k8+DmhjY6dNXaWaVnC4KPDQ2UH nVTlCKmjJh7Ft82cwNOG/Chp+Imw7yHPrimV2/n4r0Sd5KjwakdDvR3kjDe0tDyhd54f 8UtZ4CBqYuSCxEQge0L6UUtuYzmYZ6wbSX6zcpI0XRBd6bnBrKB4TfU/k60l7NFcQmav 1E7kqskCrT2N5sT45I7QClr9ADHf193SR+JOzKTDg1oclG7oEwJ00zKVZ2GwlFSeqf0B jbYg== X-Received: by 10.98.17.131 with SMTP id 3mr10739108pfr.57.1449645440253; Tue, 08 Dec 2015 23:17:20 -0800 (PST) Received: from [192.168.1.149] (66-215-226-0.dhcp.rvsd.ca.charter.com. [66.215.226.0]) by smtp.gmail.com with ESMTPSA id 72sm9076419pfi.33.2015.12.08.23.17.19 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 08 Dec 2015 23:17:19 -0800 (PST) Sender: Ildar Absalyamov Content-Type: multipart/alternative; boundary="Apple-Mail=_AEF891DD-A069-4673-B01D-2989ACCA3355" Mime-Version: 1.0 (Mac OS X Mail 9.1 \(3096.5\)) Subject: Re: Aggregate function on collection of ordered list From: Ildar Absalyamov In-Reply-To: Date: Tue, 8 Dec 2015 23:17:17 -0800 Cc: users@asterixdb.incubator.apache.org Message-Id: <17F867CF-C495-4F93-8959-CF2CBE0ABD2B@gmail.com> References: <5665A5B2.3010800@ics.uci.edu> <92985D31-248B-4104-8A3B-C10901DF320B@gmail.com> To: dev@asterixdb.incubator.apache.org X-Mailer: Apple Mail (2.3096.5) --Apple-Mail=_AEF891DD-A069-4673-B01D-2989ACCA3355 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 I believe we need to do a major refactoring of all user-facing = functions. Created a root issue for that = https://issues.apache.org/jira/browse/ASTERIXDB-1219 = > On Dec 8, 2015, at 22:11, Wail Alkowaileet wrote: >=20 > Me again ... > One way to workaround Namrata's problem is to enforce the type either = by > specifying the schema or at runtime: > let $l :=3D [[1.2, 2.3, 3.4],[6,3,7,2]] > for $x in $l // for each list in the outer list > let $k :=3D (for $y in $x > return abs($y) > ) > return sql-avg($k) >=20 > This will work only if your list doesn't contain negative numbers. I = think > we need to unify the behavior in all functions on how to deal with = type ANY. >=20 >=20 >=20 > On Tue, Dec 8, 2015 at 11:34 AM, Wail Alkowaileet > wrote: >=20 >> That's one thing I observed in the built-in functions. Some work = perfectly >> fine with the open type and some are not. >> As for instance, if I want to do string-length on a string that's not >> declared in my schema. I have to trick the compiler as such >> string-length(string-concat(["",$mystring]) to infer the type of = $mystring >> as UNION(NULL, STRING) instead of ANY to satisfies the check = conditions. >>=20 >> I really don't know what would be the best solution. However, I think = it >> would be better for open type queries to fail at runtime instead of = compile >> time. But ... from a user experience point-of-view, runtime fail can = be >> problematic in a situation where I can apply the function to the = first n-1 >> of the records and fails at the last record. >>=20 >> On Tue, Dec 8, 2015 at 1:04 AM, Ildar Absalyamov < >> ildar.absalyamov@gmail.com> wrote: >>=20 >>> That=E2=80=99s true, the trick will work only for homogeneous lists. >>>=20 >>> On Dec 7, 2015, at 13:00, Ian Maxon wrote: >>>=20 >>> We still can't declare a list of mixed type though, I don't think. I >>> was trying that earlier and ran into some cryptic errors about Java >>> typecasting. Hopefully that isn't necessary though as the NetCDF (or >>> the json representation thereof) isn't dynamically structured (e.g. >>> open types aren't necessary)? >>>=20 >>> On Mon, Dec 7, 2015 at 12:48 PM, Ildar Absalyamov >>> wrote: >>>=20 >>> Namrata, >>>=20 >>> I assume the aforementioned query with record defined in let clause = was >>> only the example. >>> That query indeed has a bug, but is happen only because the type of = the >>> list is not statically enforced. >>>=20 >>> Do you load your data into dataset? I so what is the type of that = dataset? >>> If you enforce the type of your nested ordered lists upon data = ingestion >>> you can calculate the average: >>>=20 >>> drop dataverse test if exists >>> create dataverse test >>> use dataverse test >>>=20 >>> create type testType as { >>> id: int32, >>> list: [[double]] >>> } >>>=20 >>> create dataset testDS(testType) primary key id; >>> insert into dataset testDS({"id": 1, "list": [[1.2, 2.3, >>> 3.4],[6,3,7,2]]}); >>>=20 >>> for $x in dataset testDS >>> for $y in $x.list >>> return {"avg": avg($y)} >>>=20 >>> On Dec 7, 2015, at 09:57, Malarout, Namrata (398M-Affiliate) < >>> Namrata.Malarout@jpl.nasa.gov> wrote: >>>=20 >>> Hi, >>>=20 >>> Wail, thanks for looking into it and explaining the use of for. I = will be >>> following the issue. However, working with my sample data may be a = little >>> more tricky. I have a couple hundred of records which contain such = nested >>> ordered lists. I would like to perform an aggregation over all the = values >>> across all the records. Any suggestions on how to do it? >>>=20 >>> Mike, thanks for understanding :) Appreciate all the help. >>> -Namrata >>>=20 >>> From: Michael Carey >> >> >>> Reply-To: "users@asterixdb.incubator.apache.org < >>> mailto:users@asterixdb.incubator.apache.org >>> >" < >>> users@asterixdb.incubator.apache.org < >>> mailto:users@asterixdb.incubator.apache.org >>> >> >>> Date: Monday, December 7, 2015 at 7:28 AM >>> To: "users@asterixdb.incubator.apache.org < >>> mailto:users@asterixdb.incubator.apache.org >>> >" < >>> users@asterixdb.incubator.apache.org < >>> mailto:users@asterixdb.incubator.apache.org >>> >>, " >>> dev@asterixdb.incubator.apache.org < >>> mailto:dev@asterixdb.incubator.apache.org >>> >" < >>> dev@asterixdb.incubator.apache.org< >>> mailto:dev@asterixdb.incubator.apache.org >>> >> >>> Subject: Re: Aggregate function on collection of ordered list >>>=20 >>> + Looping in the dev list to try and get fast attention to the fix, = if >>> it's easy! >>> (I know that Namarata's under time pressure in a NASA bakeoff = exercise. >>> :-)) >>>=20 >>> On 12/7/15 4:59 AM, Wail Alkowaileet wrote: >>>=20 >>> It's an easy fix... >>> Thanks for reporting that. >>>=20 >>> I reported it in = https://issues.apache.org/jira/browse/ASTERIXDB-1216 < >>> https://issues.apache.org/jira/browse/ASTERIXDB-1216> >>>=20 >>> On Mon, Dec 7, 2015 at 3:33 PM, Wail Alkowaileet >> mailto:wael.y.k@gmail.com >> wrote: >>> Hi Namrata, >>>=20 >>> The best way to think of for in lists is to think it works as = foreach in >>> java. >>> So .. >>> in your first query, it should be like: >>>=20 >>> let $l :=3D [[1.2, 2.3, 3.4],[6,3,7,2]] >>> for $x in $l // for each list in the outer list >>> return {"avg=E2=80=9D: avg($y)} >>>=20 >>> However, I tried it and it seems that there is a bug for applying >>> aggregation on nested open field. >>>=20 >>> I'll look into it to see if it's an easy fix >>>=20 >>>=20 >>>=20 >>> On Mon, Dec 7, 2015 at 2:52 PM, Malarout, Namrata (398M-Affiliate) < >>> Namrata.Malarout@jpl.nasa.gov>> >> wrote: >>> Hi, >>>=20 >>> I am trying to perform avg, sum, min and max functions on a = collection of >>> ordered lists. An example is: >>> let $l :=3D [[1.2, 2.3, 3.4],[6,3,7,2]] >>> return {"avg=E2=80=9D: avg($l)} >>>=20 >>> I have tried both avg and sql-avg. But I get the following error: >>> Cannot compute AVG for values of type ORDEREDLIST >>> [NotImplementedException]. >>>=20 >>> I=E2=80=99ve attached the sample data that I=E2=80=99m working with = (sample.adm). My AQL >>> query to find the average of analysis_error looks like: >>>=20 >>> use dataverse Test; >>> for $f in dataset sample >>> where not(is-null($f.analysis_error)) >>> return avg($f.analysis_error); >>>=20 >>> The error seen is as follows: >>> Type of argument in function-call: asterix:avg, Args:[function-call: >>> asterix:field-access-by-name, Args:[%0->$$0, AString: = {analysis_error}]] >>> should be a collection type instead of ANY [AlgebricksException] >>>=20 >>> I would like to know what is the correct syntax to find the average. >>> Appreciate the help. >>> Thanks, >>> Namrata >>>=20 >>>=20 >>>=20 >>>=20 >>>=20 >>> -- >>>=20 >>> Regards, >>> Wail Alkowaileet >>>=20 >>>=20 >>>=20 >>> -- >>>=20 >>> Regards, >>> Wail Alkowaileet >>>=20 >>>=20 >>>=20 >>> Best regards, >>> Ildar >>>=20 >>>=20 >>> Best regards, >>> Ildar >>>=20 >>>=20 >>=20 >>=20 >> -- >>=20 >> *Regards,* >> Wail Alkowaileet >>=20 >=20 >=20 >=20 > --=20 >=20 > *Regards,* > Wail Alkowaileet Best regards, Ildar --Apple-Mail=_AEF891DD-A069-4673-B01D-2989ACCA3355 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8
I believe we need to do a major refactoring = of all user-facing functions.
<= br class=3D"">
On= Dec 8, 2015, at 22:11, Wail Alkowaileet <wael.y.k@gmail.com> = wrote:

Me again ...
One way to workaround Namrata's = problem is to enforce the type either by
specifying the = schema or at runtime:
let $l :=3D [[1.2, 2.3, = 3.4],[6,3,7,2]]
for $x in $l // for each list in the outer = list
let $k :=3D (for $y in $x
return = abs($y)
)
return sql-avg($k)
This will work only if your list doesn't contain negative = numbers. I think
we need to unify the behavior in all = functions on how to deal with type ANY.



On Tue, Dec 8, 2015 at 11:34 AM, Wail = Alkowaileet <wael.y.k@gmail.com>
wrote:

That's = one thing I observed in the built-in functions. Some work perfectly
fine with the open type and some are not.
As = for instance, if I want to do string-length on a string that's not
declared in my schema. I have to trick the compiler as = such
string-length(string-concat(["",$mystring]) to infer = the type of $mystring
as UNION(NULL, STRING) instead of = ANY to satisfies the check conditions.

I = really don't know what would be the best solution. However, I think = it
would be better for open type queries to fail at = runtime instead of compile
time. But ... from a user = experience point-of-view, runtime fail can be
problematic = in a situation where I can apply the function to the first n-1
of the records and fails at the last record.

On Tue, Dec 8, 2015 at 1:04 AM, Ildar Absalyamov <
ildar.absalyamov@gmail.com> wrote:

That=E2=80=99s true, the = trick will work only for homogeneous lists.

On Dec 7, 2015, at 13:00, Ian Maxon <imaxon@uci.edu> = wrote:

We still can't declare a list of = mixed type though, I don't think. I
was trying that = earlier and ran into some cryptic errors about Java
typecasting. Hopefully that isn't necessary though as the = NetCDF (or
the json representation thereof) isn't = dynamically structured (e.g.
open types aren't = necessary)?

On Mon, Dec 7, 2015 at 12:48 = PM, Ildar Absalyamov
<ildar.absalyamov@gmail.com> wrote:

Namrata,

I assume the = aforementioned query with record defined in let clause was
only the example.
That query indeed has a bug, = but is happen only because the type of the
list is not = statically enforced.

Do you load your data = into dataset? I so what is the type of that dataset?
If = you enforce the type of your nested ordered lists upon data ingestion
you can calculate the average:

drop dataverse test if exists
create dataverse = test
use dataverse test

create = type testType as {
id: int32,
list: = [[double]]
}

create dataset = testDS(testType) primary key id;
insert into dataset = testDS({"id": 1, "list": [[1.2, 2.3,
3.4],[6,3,7,2]]});

for $x in dataset  testDS
for= $y in $x.list
return {"avg": avg($y)}

On Dec 7, 2015, at 09:57, Malarout, Namrata (398M-Affiliate) = <
Namrata.Malarout@jpl.nasa.gov> wrote:

Hi,

Wail, thanks for looking = into it and explaining the use of for. I will be
following = the issue. However, working with my sample data  may be a little
more tricky. I have a couple hundred of records which contain = such nested
ordered lists. I would like to perform an = aggregation over all the values
across all the records. = Any suggestions on how to do it?

Mike, = thanks for understanding :) Appreciate all the help.
-Namrata

From: Michael Carey = <mjcarey@ics.uci.edu <mailto:mjcarey@ics.uci.edu
<mjcarey@ics.uci.edu>>>
Reply-To: = "users@asterixdb.incubator.apache.org <
mailto:users@asterixdb.incubator.apache.org
<users@asterixdb.incubator.apache.org>>" <
users@asterixdb.incubator.apache.org <
mailto:users@asterixdb.incubator.apache.org
<users@asterixdb.incubator.apache.org>>>
Date: Monday, December 7, 2015 at 7:28 AM
To: = "users@asterixdb.incubator.apache.org <
mailto:users@asterixdb.incubator.apache.org
<users@asterixdb.incubator.apache.org>>" <
users@asterixdb.incubator.apache.org <
mailto:users@asterixdb.incubator.apache.org
<users@asterixdb.incubator.apache.org>>>, "
dev@asterixdb.incubator.apache.org <
mailto:dev@asterixdb.incubator.apache.org
<dev@asterixdb.incubator.apache.org>>" <
dev@asterixdb.incubator.apache.org<
mailto:dev@asterixdb.incubator.apache.org
<dev@asterixdb.incubator.apache.org>>>
Subject: Re: Aggregate function on collection of ordered = list

+ Looping in the dev list to try and = get fast attention to the fix, if
it's easy!
(I know that Namarata's under time pressure in a NASA bakeoff = exercise.
:-))

On 12/7/15 = 4:59 AM, Wail Alkowaileet wrote:

It's an = easy fix...
Thanks for reporting that.

I reported it in = https://issues.apache.org/jira/browse/ASTERIXDB-1216 <
https://issues.apache.org/jira/browse/ASTERIXDB-1216>

On Mon, Dec 7, 2015 at 3:33 PM, Wail = Alkowaileet <wael.y.k@gmail.com <
mailto:wael.y.k@gmail.com <wael.y.k@gmail.com>>> = wrote:
Hi Namrata,

The best = way to think of for in lists is to think it works as foreach in
java.
So ..
in your first query, = it should be like:

let $l :=3D [[1.2, 2.3, = 3.4],[6,3,7,2]]
for $x in $l // for each list in the outer = list
return {"avg=E2=80=9D: avg($y)}

However, I tried it and it seems that there is a bug for = applying
aggregation on nested open field.
I'll look into it to see if it's an easy fix


On Mon, Dec 7, 2015 at 2:52 = PM, Malarout, Namrata (398M-Affiliate) <
Namrata.Malarout@jpl.nasa.gov<mailto:Namrata.Malarout@jpl.na= sa.gov
<Namrata.Malarout@jpl.nasa.gov>>> = wrote:
Hi,

I am trying to = perform avg, sum, min and max functions on a collection of
ordered lists. An example is:
let $l :=3D = [[1.2, 2.3, 3.4],[6,3,7,2]]
return {"avg=E2=80=9D: = avg($l)}

I have tried both avg and sql-avg. = But I get the following error:
Cannot compute AVG for = values of type ORDEREDLIST
[NotImplementedException].

I=E2=80=99ve attached the sample data that = I=E2=80=99m working with (sample.adm). My AQL
query to = find the average of analysis_error looks like:

use dataverse Test;
for $f in dataset sample
where not(is-null($f.analysis_error))
return = avg($f.analysis_error);

The error seen is = as follows:
Type of argument in function-call: = asterix:avg, Args:[function-call:
asterix:field-access-by-name, Args:[%0->$$0, AString: = {analysis_error}]]
should be a collection type instead of = ANY [AlgebricksException]

I would like to = know what is the correct syntax to find the average.
Appreciate the help.
Thanks,
Namrata





--

Regards,
Wail Alkowaileet



--

Regards,
Wail Alkowaileet



Best regards,
Ildar


Best = regards,
Ildar




--
*Regards,*
Wail Alkowaileet




-- =

*Regards,*
Wail = Alkowaileet

Best = regards,
Ildar

= --Apple-Mail=_AEF891DD-A069-4673-B01D-2989ACCA3355--