asterixdb-users mailing list archives

From "Malarout, Namrata (398M-Affiliate)" <Namrata.Malar...@jpl.nasa.gov>
Subject Re: Aggregate function on collection of ordered list
Date Wed, 09 Dec 2015 18:31:09 GMT
Hi Wail,
Thanks for suggesting this. Unfortunately, for some of the variables we do have negative values.
- Namrata

From: Wail Alkowaileet <wael.y.k@gmail.com>
Reply-To: users@asterixdb.incubator.apache.org
Date: Tuesday, December 8, 2015 at 10:11 PM
To: users@asterixdb.incubator.apache.org
Cc: dev@asterixdb.incubator.apache.org
Subject: Re: Aggregate function on collection of ordered list

Me again ...
One way to work around Namrata's problem is to enforce the type, either by specifying the schema
or at runtime:
let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
for $x in $l // for each list in the outer list
let $k := (for $y in $x
return abs($y)
)
return sql-avg($k)

This will work only if your lists don't contain negative numbers, since abs() would change their
values. I think we need to unify how all functions deal with type ANY.



On Tue, Dec 8, 2015 at 11:34 AM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
That's one thing I observed in the built-in functions. Some work perfectly fine with open
types and some do not.
For instance, if I want to call string-length on a string that's not declared in my schema,
I have to trick the compiler with string-length(string-concat(["", $mystring])) so that it infers
the type of $mystring as UNION(NULL, STRING) instead of ANY, which satisfies the check conditions.
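
A minimal sketch of that trick inside a complete query. The dataset name myDS and the loop variable $r are hypothetical, and mystring is assumed to be an open (undeclared) field of its records:

```
for $r in dataset myDS
// Concatenating with "" lets the compiler infer a string type for the
// argument instead of ANY, so the string-length type check passes.
return string-length(string-concat(["", $r.mystring]))
```

This only illustrates the shape of the workaround described above; it has not been checked against a particular AsterixDB release.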

I really don't know what the best solution would be. However, I think it would be better for
open-type queries to fail at runtime instead of at compile time. But from a user-experience
point of view, a runtime failure can be problematic when the function applies cleanly
to the first n-1 records and fails on the last one.

On Tue, Dec 8, 2015 at 1:04 AM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
That’s true, the trick will work only for homogeneous lists.

On Dec 7, 2015, at 13:00, Ian Maxon <imaxon@uci.edu> wrote:

We still can't declare a list of mixed type though, I don't think. I
was trying that earlier and ran into some cryptic errors about Java
typecasting. Hopefully that isn't necessary though, as the NetCDF (or
the JSON representation thereof) isn't dynamically structured (i.e.
open types aren't necessary)?

On Mon, Dec 7, 2015 at 12:48 PM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
Namrata,

I assume the aforementioned query, with the record defined in a let clause, was only an example.
That query does hit a bug, but it happens only because the type of the list is not statically
enforced.

Do you load your data into a dataset? If so, what is the type of that dataset?
If you enforce the type of your nested ordered lists upon data ingestion, you can calculate
the average:

drop dataverse test if exists
create dataverse test
use dataverse test

create type testType as {
id: int32,
list: [[double]]
}

create dataset testDS(testType) primary key id;
insert into dataset testDS({"id": 1, "list": [[1.2, 2.3, 3.4],[6,3,7,2]]});

for $x in dataset testDS
for $y in $x.list
return {"avg": avg($y)}
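
Building on the typed testDS dataset above, one way to get a single average over every value in every record (rather than one average per inner list) might be sketched as follows. It assumes the list field is declared as [[double]] as in the type above; the variable $z is introduced here only for illustration:

```
use dataverse test;

// Flatten all nested lists across all records into one collection of
// numbers, then aggregate once over that collection.
avg(
  for $x in dataset testDS   // each record
  for $y in $x.list          // each inner list of the record
  for $z in $y               // each number in the inner list
  return $z
)
```

Because the values are flattened before aggregating, each number carries equal weight, which differs from averaging the per-list averages when the inner lists have different lengths. The same shape should apply to sum, min and max by swapping the outer function.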

On Dec 7, 2015, at 09:57, Malarout, Namrata (398M-Affiliate) <Namrata.Malarout@jpl.nasa.gov> wrote:

Hi,

Wail, thanks for looking into it and explaining the use of for. I will be following the issue.
However, working with my sample data may be a little trickier. I have a couple hundred
records that contain such nested ordered lists. I would like to perform an aggregation
over all the values across all the records. Any suggestions on how to do it?

Mike, thanks for understanding :) Appreciate all the help.
-Namrata

From: Michael Carey <mjcarey@ics.uci.edu>
Reply-To: users@asterixdb.incubator.apache.org
Date: Monday, December 7, 2015 at 7:28 AM
To: users@asterixdb.incubator.apache.org, dev@asterixdb.incubator.apache.org
Subject: Re: Aggregate function on collection of ordered list

+ Looping in the dev list to try and get fast attention to the fix, if it's easy!
(I know that Namrata's under time pressure in a NASA bakeoff exercise. :-))

On 12/7/15 4:59 AM, Wail Alkowaileet wrote:
It's an easy fix...
Thanks for reporting that.

I reported it in https://issues.apache.org/jira/browse/ASTERIXDB-1216

On Mon, Dec 7, 2015 at 3:33 PM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
Hi Namrata,

The best way to think of for over lists is that it works like foreach in Java.
So, in your first query, it should be like:

let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
for $x in $l // for each list in the outer list
return {"avg": avg($x)}

However, I tried it, and it seems there is a bug when applying aggregation to a nested open
field.

I'll look into it to see if it's an easy fix.



On Mon, Dec 7, 2015 at 2:52 PM, Malarout, Namrata (398M-Affiliate) <Namrata.Malarout@jpl.nasa.gov> wrote:
Hi,

I am trying to perform avg, sum, min and max functions on a collection of ordered lists. An
example is:
let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
return {"avg": avg($l)}

I have tried both avg and sql-avg. But I get the following error:
Cannot compute AVG for values of type ORDEREDLIST [NotImplementedException].

I’ve attached the sample data that I’m working with (sample.adm). My AQL query to find
the average of analysis_error looks like:

use dataverse Test;
for $f in dataset sample
where not(is-null($f.analysis_error))
return avg($f.analysis_error);

The error seen is as follows:
Type of argument in function-call: asterix:avg, Args:[function-call: asterix:field-access-by-name,
Args:[%0->$$0, AString: {analysis_error}]] should be a collection type instead of ANY [AlgebricksException]

I would like to know what the correct syntax is to find the average. Appreciate the help.
Thanks,
Namrata





--

Regards,
Wail Alkowaileet





Best regards,
Ildar
