asterixdb-users mailing list archives

From Michael Carey <mjca...@ics.uci.edu>
Subject Re: Aggregate function on collection of ordered list
Date Wed, 09 Dec 2015 07:32:40 GMT
+1

On 12/8/15 11:17 PM, Ildar Absalyamov wrote:
> I believe we need to do a major refactoring of all user-facing functions.
> Created a root issue for that 
> https://issues.apache.org/jira/browse/ASTERIXDB-1219
>
>> On Dec 8, 2015, at 22:11, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
>>
>> Me again ...
>> One way to work around Namrata's problem is to enforce the type, either by
>> specifying the schema or at runtime:
>> let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
>> for $x in $l // for each list in the outer list
>> let $k := (for $y in $x
>> return abs($y)
>> )
>> return sql-avg($k)
>>
>> This will work only if your list doesn't contain negative numbers, since
>> abs() changes their values. I think we need to unify how all functions
>> deal with type ANY.
>>
>>
>>
>> On Tue, Dec 8, 2015 at 11:34 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>> wrote:
>>
>>> That's one thing I observed in the built-in functions. Some work
>>> perfectly fine with open types and some do not.
>>> For instance, if I want to call string-length on a string that's not
>>> declared in my schema, I have to trick the compiler like this:
>>> string-length(string-concat(["",$mystring])), so that the type of
>>> $mystring is inferred as UNION(NULL, STRING) instead of ANY, which
>>> satisfies the check conditions.
>>>
>>> I really don't know what the best solution would be. However, I think it
>>> would be better for open-type queries to fail at runtime instead of at
>>> compile time. But ... from a user-experience point of view, a runtime
>>> failure can be problematic when the function applies cleanly to the
>>> first n-1 records and then fails on the last one.
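
The string-length trick described in the message above might look like this inside a full query. This is only a sketch: the dataset name `sample` is borrowed from the original question, `name` is a hypothetical undeclared (open) string field, and the query has not been run against an AsterixDB instance.

```
use dataverse Test;

for $r in dataset sample
// "name" is a hypothetical open field that is not declared in the schema.
// Wrapping it in string-concat is the workaround described above: the
// argument is then typed as UNION(NULL, STRING) rather than ANY.
return string-length(string-concat(["", $r.name]))
```

The extra string-concat call exists only to steer type inference. Concatenating with the empty string leaves the value unchanged.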
>>>
>>> On Tue, Dec 8, 2015 at 1:04 AM, Ildar Absalyamov
>>> <ildar.absalyamov@gmail.com> wrote:
>>>
>>>> That’s true, the trick will work only for homogeneous lists.
>>>>
>>>> On Dec 7, 2015, at 13:00, Ian Maxon <imaxon@uci.edu> wrote:
>>>>
>>>> We still can't declare a list of mixed type though, I don't think. I
>>>> was trying that earlier and ran into some cryptic errors about Java
>>>> typecasting. Hopefully that isn't necessary though as the NetCDF (or
>>>> the json representation thereof) isn't dynamically structured (e.g.
>>>> open types aren't necessary)?
>>>>
>>>> On Mon, Dec 7, 2015 at 12:48 PM, Ildar Absalyamov
>>>> <ildar.absalyamov@gmail.com> wrote:
>>>>
>>>> Namrata,
>>>>
>>>> I assume the aforementioned query, with the record defined in a let
>>>> clause, was only an example.
>>>> That query does indeed hit a bug, but it happens only because the type of
>>>> the list is not statically enforced.
>>>>
>>>> Do you load your data into a dataset? If so, what is the type of that
>>>> dataset?
>>>> If you enforce the type of your nested ordered lists upon data ingestion,
>>>> you can calculate the average:
>>>>
>>>> drop dataverse test if exists
>>>> create dataverse test
>>>> use dataverse test
>>>>
>>>> create type testType as {
>>>>   id: int32,
>>>>   list: [[double]]
>>>> }
>>>>
>>>> create dataset testDS(testType) primary key id;
>>>> insert into dataset testDS({"id": 1, "list": [[1.2, 2.3, 3.4],[6,3,7,2]]});
>>>>
>>>> for $x in dataset testDS
>>>> for $y in $x.list
>>>> return {"avg": avg($y)}
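
For the follow-up question of aggregating over all the values across all the records, one possible approach is to unnest both list levels and aggregate once over the flattened collection. The sketch below assumes the typed `testDS` dataset from the message above and assumes that `avg` accepts the collection produced by the subquery. It has not been verified against a running instance.

```
use dataverse test;

// Flatten every inner list of every record into one collection of doubles,
// then aggregate once over the whole collection.
avg(
  for $x in dataset testDS   // each record
  for $y in $x.list          // each inner list of that record
  for $z in $y               // each value of that inner list
  return $z
)
```

The same pattern should carry over to sum, min and max by swapping the outer function. As with the per-list query, it relies on the list type being declared in the schema rather than left open.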
>>>>
>>>> On Dec 7, 2015, at 09:57, Malarout, Namrata (398M-Affiliate)
>>>> <Namrata.Malarout@jpl.nasa.gov> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Wail, thanks for looking into it and explaining the use of for. I will be
>>>> following the issue. However, working with my sample data may be a little
>>>> trickier. I have a couple hundred records that contain such nested
>>>> ordered lists. I would like to perform an aggregation over all the values
>>>> across all the records. Any suggestions on how to do it?
>>>>
>>>> Mike, thanks for understanding :) Appreciate all the help.
>>>> -Namrata
>>>>
>>>> From: Michael Carey <mjcarey@ics.uci.edu>
>>>> Reply-To: "users@asterixdb.incubator.apache.org"
>>>> <users@asterixdb.incubator.apache.org>
>>>> Date: Monday, December 7, 2015 at 7:28 AM
>>>> To: "users@asterixdb.incubator.apache.org"
>>>> <users@asterixdb.incubator.apache.org>,
>>>> "dev@asterixdb.incubator.apache.org"
>>>> <dev@asterixdb.incubator.apache.org>
>>>> Subject: Re: Aggregate function on collection of ordered list
>>>>
>>>> + Looping in the dev list to try and get fast attention to the fix, if
>>>> it's easy!
>>>> (I know that Namrata's under time pressure in a NASA bakeoff exercise.
>>>> :-))
>>>>
>>>> On 12/7/15 4:59 AM, Wail Alkowaileet wrote:
>>>>
>>>> It's an easy fix...
>>>> Thanks for reporting that.
>>>>
>>>> I reported it in https://issues.apache.org/jira/browse/ASTERIXDB-1216
>>>>
>>>> On Mon, Dec 7, 2015 at 3:33 PM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>> wrote:
>>>> Hi Namrata,
>>>>
>>>> The best way to think of for over lists is that it works like foreach in
>>>> Java.
>>>> So, in your first query, it should be:
>>>>
>>>> let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
>>>> for $x in $l // for each list in the outer list
>>>> return {"avg": avg($x)}
>>>>
>>>> However, I tried it, and there seems to be a bug when applying an
>>>> aggregate to a nested open field.
>>>>
>>>> I'll look into it to see if it's an easy fix.
>>>>
>>>>
>>>>
>>>> On Mon, Dec 7, 2015 at 2:52 PM, Malarout, Namrata (398M-Affiliate)
>>>> <Namrata.Malarout@jpl.nasa.gov> wrote:
>>>> Hi,
>>>>
>>>> I am trying to perform avg, sum, min and max functions on a collection of
>>>> ordered lists. An example is:
>>>> let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
>>>> return {"avg": avg($l)}
>>>>
>>>> I have tried both avg and sql-avg. But I get the following error:
>>>> Cannot compute AVG for values of type ORDEREDLIST
>>>> [NotImplementedException].
>>>>
>>>> I’ve attached the sample data that I’m working with (sample.adm). My AQL
>>>> query to find the average of analysis_error looks like:
>>>>
>>>> use dataverse Test;
>>>> for $f in dataset sample
>>>> where not(is-null($f.analysis_error))
>>>> return avg($f.analysis_error);
>>>>
>>>> The error seen is as follows:
>>>> Type of argument in function-call: asterix:avg, Args:[function-call:
>>>> asterix:field-access-by-name, Args:[%0->$$0, AString: 
>>>> {analysis_error}]]
>>>> should be a collection type instead of ANY [AlgebricksException]
>>>>
>>>> I would like to know what is the correct syntax to find the average.
>>>> Appreciate the help.
>>>> Thanks,
>>>> Namrata
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Regards,
>>>> Wail Alkowaileet
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Regards,
>>>> Wail Alkowaileet
>>>>
>>>>
>>>>
>>>> Best regards,
>>>> Ildar
>>>>
>>>>
>>>> Best regards,
>>>> Ildar
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> *Regards,*
>>> Wail Alkowaileet
>>>
>>
>>
>>
>> -- 
>>
>> *Regards,*
>> Wail Alkowaileet
>
> Best regards,
> Ildar
>

