asterixdb-dev mailing list archives

From Wail Alkowaileet <wael....@gmail.com>
Subject Re: Aggregate function on collection of ordered list
Date Wed, 09 Dec 2015 06:11:37 GMT
Me again ...
One way to work around Namrata's problem is to enforce the type, either by
specifying the schema or at runtime:
let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
for $x in $l // for each list in the outer list
let $k := (for $y in $x
return abs($y)
)
return sql-avg($k)

This works only if your list doesn't contain negative numbers, since abs()
would change them. I think we need to unify how all functions deal with type ANY.
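
For the follow-up question quoted below (aggregating over all the values across all the records), one possible approach is to flatten the nested lists first and then aggregate the flat result. This is an untested sketch that assumes the typed testDS dataset from Ildar's example quoted below, where list is declared as [[double]]; the variable names are only illustrative:

```
use dataverse test;

// Assumption: testDS and its "list" field are as declared in Ildar's example.
let $values := (for $x in dataset testDS  // each record
for $y in $x.list                         // each inner list
for $z in $y                              // each number
return $z
)
return {"avg": avg($values), "sum": sum($values), "min": min($values), "max": max($values)}
```

Because the list type is declared in the schema in this sketch, the abs() trick should not be needed.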



On Tue, Dec 8, 2015 at 11:34 AM, Wail Alkowaileet <wael.y.k@gmail.com>
wrote:

> That's one thing I observed in the built-in functions. Some work perfectly
> fine with the open type and some do not.
> For instance, if I want to call string-length on a string that's not
> declared in my schema, I have to trick the compiler like this:
> string-length(string-concat(["", $mystring])), so that it infers the type
> as UNION(NULL, STRING) instead of ANY and satisfies the check conditions.
>
> I really don't know what the best solution would be. However, I think it
> would be better for open-type queries to fail at runtime instead of at compile
> time. But from a user-experience point of view, a runtime failure can be
> problematic when the function applies successfully to the first n-1
> records and then fails on the last one.
>
> On Tue, Dec 8, 2015 at 1:04 AM, Ildar Absalyamov <
> ildar.absalyamov@gmail.com> wrote:
>
>> That’s true, the trick will work only for homogeneous lists.
>>
>> On Dec 7, 2015, at 13:00, Ian Maxon <imaxon@uci.edu> wrote:
>>
>> We still can't declare a list of mixed type though, I don't think. I
>> was trying that earlier and ran into some cryptic errors about Java
>> typecasting. Hopefully that isn't necessary though as the NetCDF (or
>> the json representation thereof) isn't dynamically structured (e.g.
>> open types aren't necessary)?
>>
>> On Mon, Dec 7, 2015 at 12:48 PM, Ildar Absalyamov
>> <ildar.absalyamov@gmail.com> wrote:
>>
>> Namrata,
>>
>> I assume the aforementioned query, with the record defined in a let clause, was
>> only an example.
>> That query indeed has a bug, but it happens only because the type of the
>> list is not statically enforced.
>>
>> Do you load your data into a dataset? If so, what is the type of that dataset?
>> If you enforce the type of your nested ordered lists upon data ingestion,
>> you can calculate the average:
>>
>> drop dataverse test if exists
>> create dataverse test
>> use dataverse test
>>
>> create type testType as {
>> id: int32,
>> list: [[double]]
>> }
>>
>> create dataset testDS(testType) primary key id;
>> insert into dataset testDS({"id": 1, "list": [[1.2, 2.3, 3.4],[6,3,7,2]]});
>>
>> for $x in dataset testDS
>> for $y in $x.list
>> return {"avg": avg($y)}
>>
>> On Dec 7, 2015, at 09:57, Malarout, Namrata (398M-Affiliate) <
>> Namrata.Malarout@jpl.nasa.gov> wrote:
>>
>> Hi,
>>
>> Wail, thanks for looking into it and explaining the use of for. I will be
>> following the issue. However, working with my sample data may be a little
>> trickier. I have a couple hundred records that contain such nested
>> ordered lists. I would like to perform an aggregation over all the values
>> across all the records. Any suggestions on how to do it?
>>
>> Mike, thanks for understanding :) Appreciate all the help.
>> -Namrata
>>
>> From: Michael Carey <mjcarey@ics.uci.edu>
>> Reply-To: "users@asterixdb.incubator.apache.org" <users@asterixdb.incubator.apache.org>
>> Date: Monday, December 7, 2015 at 7:28 AM
>> To: "users@asterixdb.incubator.apache.org" <users@asterixdb.incubator.apache.org>,
>> "dev@asterixdb.incubator.apache.org" <dev@asterixdb.incubator.apache.org>
>> Subject: Re: Aggregate function on collection of ordered list
>>
>> + Looping in the dev list to try to get fast attention to the fix, if
>> it's easy!
>> (I know that Namrata's under time pressure in a NASA bakeoff exercise.
>> :-))
>>
>> On 12/7/15 4:59 AM, Wail Alkowaileet wrote:
>>
>> It's an easy fix...
>> Thanks for reporting that.
>>
>> I reported it in https://issues.apache.org/jira/browse/ASTERIXDB-1216
>>
>> On Mon, Dec 7, 2015 at 3:33 PM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
>> Hi Namrata,
>>
>> The best way to think of for over lists is that it works like foreach in
>> Java.
>> So, your first query should look like this:
>>
>> let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
>> for $x in $l // for each list in the outer list
>> return {"avg": avg($x)}
>>
>> However, I tried it, and there seems to be a bug when applying
>> aggregation to a nested open field.
>>
>> I'll look into it to see if it's an easy fix.
>>
>>
>>
>> On Mon, Dec 7, 2015 at 2:52 PM, Malarout, Namrata (398M-Affiliate)
>> <Namrata.Malarout@jpl.nasa.gov> wrote:
>> Hi,
>>
>> I am trying to perform avg, sum, min and max functions on a collection of
>> ordered lists. An example is:
>> let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
>> return {"avg": avg($l)}
>>
>> I have tried both avg and sql-avg. But I get the following error:
>> Cannot compute AVG for values of type ORDEREDLIST
>> [NotImplementedException].
>>
>> I’ve attached the sample data that I’m working with (sample.adm). My AQL
>> query to find the average of analysis_error looks like:
>>
>> use dataverse Test;
>> for $f in dataset sample
>> where not(is-null($f.analysis_error))
>> return avg($f.analysis_error);
>>
>> The error seen is as follows:
>> Type of argument in function-call: asterix:avg, Args:[function-call:
>> asterix:field-access-by-name, Args:[%0->$$0, AString: {analysis_error}]]
>> should be a collection type instead of ANY [AlgebricksException]
>>
>> I would like to know what is the correct syntax to find the average.
>> Appreciate the help.
>> Thanks,
>> Namrata
>>
>>
>>
>>
>>
>> --
>>
>> Regards,
>> Wail Alkowaileet
>>
>>
>>
>> --
>>
>> Regards,
>> Wail Alkowaileet
>>
>>
>>
>> Best regards,
>> Ildar
>>
>>
>> Best regards,
>> Ildar
>>
>>
>
>
> --
>
> *Regards,*
> Wail Alkowaileet
>



-- 

*Regards,*
Wail Alkowaileet
