drill-dev mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: question about correlated arrays and flatten
Date Mon, 01 Jun 2015 22:28:53 GMT
How could we make functional primitives work without lambda?



On Mon, Jun 1, 2015 at 9:55 PM, Hanifi Gunes <hgunes@maprtech.com> wrote:

> Idea of having functional primitives with Drill sounds really handy. It
> would be great if we could support left-right folding as well. I can see
> many great use cases of project/map, fold/reduce, zip, flatten when
> combined.
>
> On Sat, May 30, 2015 at 12:57 AM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
>
> > OK.  I will file a JIRA for a zip function.  No idea if I will be able to
> > get one written in the available cracks of time.
> >
> >
> >
> > On Fri, May 29, 2015 at 7:17 PM, Steven Phillips <sphillips@maprtech.com>
> > wrote:
> >
> > > > I think your use case could be solved by adding a UDF that can combine
> > > > multiple arrays into a single array. The result of this function could then
> > > > be handled by our current implementation of flatten.
> > > >
> > > > I think this is preferable to enhancing flatten itself to handle it, since
> > > > flatten is not an ordinary UDF, and thus more difficult to modify and
> > > > maintain.
> > >
> > > On Fri, May 29, 2015 at 3:20 PM, Ted Dunning <ted.dunning@gmail.com>
> > > wrote:
> > >
> > > > > My particular use case can throw an error if the lists are different
> > > > > length.
> > > > >
> > > > > I think our real goal should be to have a logically complete set of simple
> > > > > primitives that lets any sort of back and forward conversions of this kind.
> > > >
> > > >
> > > >
> > > >
> > > > > On Fri, May 29, 2015 at 9:58 AM, Jason Altekruse <altekrusejason@gmail.com>
> > > > > wrote:
> > > >
> > > > > > I understand what you want to do, unfortunately we don't have support for
> > > > > > this right now. A UDF is the best I can suggest at this point.
> > > > > >
> > > > > > Just to explore the idea a little further for the sake of creating a
> > > > > > complete feature request, I assume you would just want nulls filled in for
> > > > > > the cases where the lists were different lengths?
> > > > >
> > > > > > On Fri, May 29, 2015 at 8:58 AM, Ted Dunning <ted.dunning@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > > Input is here: https://gist.github.com/tdunning/07ce66e7e4d4af41afd7
> > > > > > >
> > > > > > > Output is here: https://gist.github.com/tdunning/3aa841c56bfcdc0ab90e
> > > > > >
> > > > > > log-synth schema for generating input data is here:
> > > > > > https://gist.github.com/tdunning/638dd52c00569ffa9582
> > > > > >
> > > > > >
> > > > > > Preferred syntax would be like
> > > > > >
> > > > > > select flatten(t, v1, v2) from ...
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, May 29, 2015 at 7:04 AM, Neeraja Rentachintala <
> > > > > > nrentachintala@maprtech.com> wrote:
> > > > > >
> > > > > > > Ted
> > > > > > > > can you pls give an example with few data elements in a, b and the
> > > > > > > > expected output you are looking from the query.
> > > > > > >
> > > > > > > -Neeraja
> > > > > > >
> > > > > > > > On Fri, May 29, 2015 at 6:43 AM, Ted Dunning <ted.dunning@gmail.com>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > > I have two arrays.  Their elements are correlated times and values.  I
> > > > > > > > > would like to flatten them into rows, each with two elements.
> > > > > > > >
> > > > > > > > The query
> > > > > > > >
> > > > > > > >    select flatten(a), flatten(b) from ...
> > > > > > > >
> > > > > > > > > doesn't work because I get the cartesian product (of course).  The
> > > > > > > > > query
> > > > > > > >
> > > > > > > >    select flatten(a, b) from ...
> > > > > > > >
> > > > > > > > > also doesn't work because flatten doesn't have a multi-argument form.
> > > > > > > > >
> > > > > > > > > Going crazy, this query kind of sort of almost works, but not really:
> > > > > > > >
> > > > > > > > >      select r.x.`key`, flatten(r.x.`value`)  from (
> > > > > > > > >          select flatten(kvgen(x)) as x from ...) r;
> > > > > > > >
> > > > > > > > What I really want to see is something like this:
> > > > > > > >    select zip(flatten(a), flatten(b)) from ...
> > > > > > > >
> > > > > > > > Any pointers?  Is my next step to write a UDF?
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >  Steven Phillips
> > >  Software Engineer
> > >
> > >  mapr.com
> > >
> >
>
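
The thread above raises two possible behaviors for a zip over correlated arrays: throw an error when the lists differ in length, or fill the shorter list with nulls. The sketch below illustrates both in plain Python, outside Drill. It is not a Drill UDF and does not use any Drill API; the names zip_strict, zip_padded and flatten_zipped, and the sample record, are hypothetical and chosen only for illustration.

```python
from itertools import zip_longest


def zip_strict(*arrays):
    """Pair elements by position; raise if the arrays differ in length."""
    lengths = {len(a) for a in arrays}
    if len(lengths) > 1:
        raise ValueError(f"arrays have different lengths: {sorted(lengths)}")
    return [list(row) for row in zip(*arrays)]


def zip_padded(*arrays):
    """Pair elements by position; pad shorter arrays with None (null)."""
    return [list(row) for row in zip_longest(*arrays, fillvalue=None)]


def flatten_zipped(record, keys):
    """Emit one row per array position instead of a cartesian product.

    This mimics the intent of the proposed `flatten(t, v1, v2)` form,
    using the strict (error on mismatch) behavior.
    """
    rows = zip_strict(*(record[k] for k in keys))
    return [dict(zip(keys, row)) for row in rows]


if __name__ == "__main__":
    # Hypothetical record with correlated times and values.
    record = {"t": [1, 2], "v1": [10, 20], "v2": [5, 6]}
    for row in flatten_zipped(record, ["t", "v1", "v2"]):
        print(row)
```

With the strict variant, a record whose arrays disagree in length fails loudly rather than producing misaligned rows; the padded variant corresponds to the nulls-filled-in alternative raised earlier in the thread.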
