asterixdb-users mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Aggregate function on collection of ordered list
Date Wed, 09 Dec 2015 20:58:17 GMT
Q:  Isn't the actual data stored and well-typed, though?  (So the types 
are known?)

On 12/9/15 10:31 AM, Malarout, Namrata (398M-Affiliate) wrote:
> Hi Wail,
> Thanks for suggesting this. Unfortunately, for some of the variables 
> we do have negative values.
> - Namrata
>
> From: Wail Alkowaileet <wael.y.k@gmail.com>
> Reply-To: "users@asterixdb.incubator.apache.org" <users@asterixdb.incubator.apache.org>
> Date: Tuesday, December 8, 2015 at 10:11 PM
> To: "users@asterixdb.incubator.apache.org" <users@asterixdb.incubator.apache.org>
> Cc: "dev@asterixdb.incubator.apache.org" <dev@asterixdb.incubator.apache.org>
> Subject: Re: Aggregate function on collection of ordered list
>
> Me again ...
> One way to work around Namrata's problem is to enforce the type, either 
> by specifying the schema or at runtime:
> let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
> for $x in $l // for each list in the outer list
> let $k := (for $y in $x
> return abs($y)
> )
> return sql-avg($k)
>
> This will work only if your list doesn't contain negative numbers. I 
> think we need to unify the behavior in all functions on how to deal 
> with type ANY.
>
>
>
> On Tue, Dec 8, 2015 at 11:34 AM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
>
>     That's one thing I observed in the built-in functions. Some work
>     perfectly fine with open types and some do not.
>     For instance, if I want to call string-length on a string that's
>     not declared in my schema, I have to trick the compiler with
>     string-length(string-concat(["",$mystring])) so that the type is
>     inferred as UNION(NULL, STRING) instead of ANY, which satisfies
>     the check conditions.
>
>     I really don't know what the best solution would be. However, I
>     think it would be better for open-type queries to fail at runtime
>     instead of at compile time. But from a user-experience point of
>     view, a runtime failure can be problematic when the function
>     applies to the first n-1 records and then fails on the last one.
>
>     On Tue, Dec 8, 2015 at 1:04 AM, Ildar Absalyamov
>     <ildar.absalyamov@gmail.com> wrote:
>
>         That’s true, the trick will work only for homogeneous lists.
>
>>         On Dec 7, 2015, at 13:00, Ian Maxon <imaxon@uci.edu> wrote:
>>
>>         We still can't declare a list of mixed type though, I don't
>>         think. I was trying that earlier and ran into some cryptic
>>         errors about Java typecasting. Hopefully that isn't necessary
>>         though, as the NetCDF (or the JSON representation thereof)
>>         isn't dynamically structured (e.g. open types aren't
>>         necessary)?
>>
>>         On Mon, Dec 7, 2015 at 12:48 PM, Ildar Absalyamov
>>         <ildar.absalyamov@gmail.com> wrote:
>>>         Namrata,
>>>
>>>         I assume the aforementioned query, with the record defined
>>>         in a let clause, was only an example.
>>>         That query does indeed hit a bug, but it happens only because
>>>         the type of the list is not statically enforced.
>>>
>>>         Do you load your data into a dataset? If so, what is the type
>>>         of that dataset?
>>>         If you enforce the type of your nested ordered lists upon
>>>         data ingestion, you can calculate the average:
>>>
>>>         drop dataverse test if exists
>>>         create dataverse test
>>>         use dataverse test
>>>
>>>         create type testType as {
>>>         id: int32,
>>>         list: [[double]]
>>>         }
>>>
>>>         create dataset testDS(testType) primary key id;
>>>         insert into dataset testDS({"id": 1, "list": [[1.2, 2.3, 3.4],[6,3,7,2]]});
>>>
>>>         for $x in dataset testDS
>>>         for $y in $x.list
>>>         return {"avg": avg($y)}
>>>
>>>>         On Dec 7, 2015, at 09:57, Malarout, Namrata
>>>>         (398M-Affiliate) <Namrata.Malarout@jpl.nasa.gov> wrote:
>>>>
>>>>         Hi,
>>>>
>>>>         Wail, thanks for looking into it and explaining the use of
>>>>         for. I will be following the issue. However, working with
>>>>         my sample data may be a little trickier. I have a couple
>>>>         hundred records that contain such nested ordered lists. I
>>>>         would like to perform an aggregation over all the values
>>>>         across all the records. Any suggestions on how to do it?
>>>>
>>>>         Mike, thanks for understanding :) Appreciate all the help.
>>>>         -Namrata
>>>>
>>>>         From: Michael Carey <mjcarey@ics.uci.edu>
>>>>         Reply-To: "users@asterixdb.incubator.apache.org" <users@asterixdb.incubator.apache.org>
>>>>         Date: Monday, December 7, 2015 at 7:28 AM
>>>>         To: "users@asterixdb.incubator.apache.org" <users@asterixdb.incubator.apache.org>, "dev@asterixdb.incubator.apache.org" <dev@asterixdb.incubator.apache.org>
>>>>         Subject: Re: Aggregate function on collection of ordered list
>>>>
>>>>         + Looping in the dev list to try and get fast attention to
>>>>         the fix, if it's easy!
>>>>         (I know that Namrata's under time pressure in a NASA
>>>>         bakeoff exercise. :-))
>>>>
>>>>         On 12/7/15 4:59 AM, Wail Alkowaileet wrote:
>>>>>         It's an easy fix...
>>>>>         Thanks for reporting that.
>>>>>
>>>>>         I reported it in
>>>>>         https://issues.apache.org/jira/browse/ASTERIXDB-1216
>>>>>
>>>>>         On Mon, Dec 7, 2015 at 3:33 PM, Wail Alkowaileet
>>>>>         <wael.y.k@gmail.com> wrote:
>>>>>         Hi Namrata,
>>>>>
>>>>>         The best way to think of for over lists is that it works
>>>>>         like foreach in Java.
>>>>>         So, in your first query, it should be like:
>>>>>
>>>>>         let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
>>>>>         for $x in $l // for each list in the outer list
>>>>>         return {"avg": avg($x)}
>>>>>
>>>>>         However, I tried it and it seems that there is a bug when
>>>>>         applying aggregation to a nested open field.
>>>>>
>>>>>         I'll look into it to see if it's an easy fix.
>>>>>
>>>>>
>>>>>
>>>>>         On Mon, Dec 7, 2015 at 2:52 PM, Malarout, Namrata
>>>>>         (398M-Affiliate) <Namrata.Malarout@jpl.nasa.gov> wrote:
>>>>>         Hi,
>>>>>
>>>>>         I am trying to perform avg, sum, min and max functions on
>>>>>         a collection of ordered lists. An example is:
>>>>>         let $l := [[1.2, 2.3, 3.4],[6,3,7,2]]
>>>>>         return {"avg": avg($l)}
>>>>>
>>>>>         I have tried both avg and sql-avg. But I get the following
>>>>>         error:
>>>>>         Cannot compute AVG for values of type ORDEREDLIST
>>>>>         [NotImplementedException].
>>>>>
>>>>>         I’ve attached the sample data that I’m working with
>>>>>         (sample.adm). My AQL query to find the average of
>>>>>         analysis_error looks like:
>>>>>
>>>>>         use dataverse Test;
>>>>>         for $f in dataset sample
>>>>>         where not(is-null($f.analysis_error))
>>>>>         return avg($f.analysis_error);
>>>>>
>>>>>         The error seen is as follows:
>>>>>         Type of argument in function-call: asterix:avg,
>>>>>         Args:[function-call: asterix:field-access-by-name,
>>>>>         Args:[%0->$$0, AString: {analysis_error}]] should be a
>>>>>         collection type instead of ANY [AlgebricksException]
>>>>>
>>>>>         I would like to know what the correct syntax is to find
>>>>>         the average. Appreciate the help.
>>>>>         Thanks,
>>>>>         Namrata
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>         --
>>>>>
>>>>>         Regards,
>>>>>         Wail Alkowaileet
>>>>>
>>>>>
>>>>>
>>>>>         --
>>>>>
>>>>>         Regards,
>>>>>         Wail Alkowaileet
>>>>
>>>
>>>         Best regards,
>>>         Ildar
>
>         Best regards,
>         Ildar
>
>
>
>
>     -- 
>
>     *Regards,*
>     Wail Alkowaileet
>
>
>
>
> -- 
>
> *Regards,*
> Wail Alkowaileet
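
For reference, the two aggregations discussed in the thread (one average per inner list, and one average over all values across all records) can be modelled outside AsterixDB to check results. The following is a minimal Python sketch, not AQL and not taken from any message above. The names per_list_averages, overall_average, and records are hypothetical, and the second record is made-up data included only to show that negative values need no abs() workaround.

```python
from statistics import fmean

# Sample value from the thread: an outer list holding two inner lists.
nested = [[1.2, 2.3, 3.4], [6, 3, 7, 2]]


def per_list_averages(outer):
    """Average each inner list separately (one result per inner list)."""
    return [fmean(inner) for inner in outer]


def overall_average(records):
    """Average every value across all inner lists of all records."""
    values = [v for outer in records for inner in outer for v in inner]
    return fmean(values)


# Hypothetical second record with negative values (not from the thread).
records = [nested, [[-1.5, 0.5]]]

print(per_list_averages(nested))
print(overall_average(records))
```

Flattening before averaging weights every value equally. Averaging the per-list averages would instead weight every inner list equally, which gives a different result when the inner lists differ in length.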

