incubator-drill-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: What do you want out of Apache Drill?
Date Wed, 13 Mar 2013 15:47:58 GMT
On Wed, Mar 13, 2013 at 7:07 AM, David Alves <davidralves@gmail.com> wrote:

>         I was going through the list looking for the current stance on
> joins and I found Ted's answer.
>         What is the main point behind not doing large joins on Drill?
>

Not doing large joins *yet*.


>          Is it just simplicity (as in optimizer, etc.) or is there
> something else?
>

Simplicity in early implementation.


>          I mention this because I'm particularly interested in large self
> joins (I can volunteer to work on them myself, of course).
>

I would love to see large self joins.  In Pig notation, I would be
interested in co-group of multiple fields on a single key field followed by
counting of all pairs in each of the groups.  Counting the cross-group
pairs is also interesting.  Saying this concisely in SQL is hard for me,
especially since I would like to down-sample each of the groups.  I can say
it with many queries and multiple temp tables, but I expect that this would
be difficult for the optimizer to understand.  I can also say it concisely
in Drill's intermediate language.
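
To make the co-group-and-count idea above a bit more concrete, here is a
rough in-memory sketch in Python.  It is not Pig, SQL, or Drill's
intermediate language, and it is only one reading of the description: rows
are assumed to be (key, field, value) triples, each key gets one bag of
values per field, each bag is down-sampled to at most `limit` values, and
then pairs are counted both inside each bag and across bags.  The names
`cogroup`, `downsample`, `count_pairs`, `limit`, and `seed` are all
hypothetical.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations, product


def cogroup(rows):
    """Collect values into one bag per (key, field), loosely like a co-group.

    rows: iterable of (key, field, value) triples (an assumed layout).
    Returns {key: {field: [values...]}}.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for key, field, value in rows:
        groups[key][field].append(value)
    return groups


def downsample(values, limit, rng):
    """Keep at most `limit` values, chosen uniformly without replacement."""
    if len(values) <= limit:
        return list(values)
    return rng.sample(values, limit)


def count_pairs(rows, limit=100, seed=0):
    """Count within-bag and cross-bag value pairs, summed over all keys."""
    rng = random.Random(seed)
    within = Counter()  # (field, x, y) -> count, with x <= y
    cross = Counter()   # (field1, x, field2, y) -> count, with field1 < field2
    for bags in cogroup(rows).values():
        sampled = {f: downsample(v, limit, rng) for f, v in bags.items()}
        # Pairs of values that sit in the same bag.
        for field, values in sampled.items():
            for x, y in combinations(sorted(values), 2):
                within[(field, x, y)] += 1
        # Pairs of values drawn from two different bags under the same key.
        for (f1, v1), (f2, v2) in combinations(sorted(sampled.items()), 2):
            for x, y in product(v1, v2):
                cross[(f1, x, f2, y)] += 1
    return within, cross


rows = [
    ("u1", "a", "x"), ("u1", "a", "y"), ("u1", "b", "p"),
    ("u2", "a", "x"), ("u2", "a", "y"), ("u2", "b", "q"),
]
within, cross = count_pairs(rows)
```

The down-sampling step is what bounds the work per key: without it, a
single very large group would contribute a quadratic number of pairs.  A
distributed version would need the grouping to happen as a shuffle on the
key, which is where the large self join comes in.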
