incubator-drill-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Nested collections (e.g. JSON arrays) and drill queries
Date Fri, 26 Oct 2012 04:43:21 GMT
Does the WITHIN clause help?  In BigQuery, this is described here:

https://developers.google.com/bigquery/docs/query-reference#within
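
To make the idea concrete, here is a minimal Python sketch of what a within-record aggregate means for the sample record quoted below: the aggregate runs over the repeated propertyB entries of one record and produces one output row per record. The helper name count_within_record and the output column x_count are hypothetical, and this only illustrates the semantics; it is not how BigQuery or Drill implement it.

```python
import json

# The sample record from the quoted message.
record = json.loads("""
{
    "propertyA": "valueA",
    "propertyB": [
        {"propertyX": "value1", "propertyY": "value2"},
        {"propertyX": "value3", "propertyY": "value4"}
    ]
}
""")


def count_within_record(rec, repeated_field, leaf):
    """Count the non-null `leaf` values inside one record's repeated field."""
    return sum(
        1 for item in rec.get(repeated_field, []) if item.get(leaf) is not None
    )


# One output row per input record: the aggregate never crosses record boundaries.
result = {
    "propertyA": record["propertyA"],
    "x_count": count_within_record(record, "propertyB", "propertyX"),
}
print(result)  # {'propertyA': 'valueA', 'x_count': 2}
```

The point of the sketch is that the N values of propertyB.propertyX collapse to a single value per record, so the result can sit next to propertyA without multiplying rows.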

On Thu, Oct 25, 2012 at 2:51 PM, Evan Pollan <evan.pollan@gmail.com> wrote:

> Hi,
>
> I attended Tomer's Strata/HadoopWorld presentation on Drill yesterday, and
> was very impressed.  Lots of features that map directly to my needs.
>
> He specifically cited support for, on the HDFS side, JSON/BSON, Avro, and
> sequence files and emphasized the ability to access nested data.  We use
> JSON heavily, so it sounds like Drill would support base-case queries over
> nested properties within my dataset.  One question I didn't get the chance
> to ask, though:  what about querying over records with nested collections?
> For example, I have some JSON datasets that look like:
>
> {
>     "propertyA": "valueA",
>     "propertyB": [
>         {
>             "propertyX": "value1",
>             "propertyY": "value2"
>         },
>         {
>             "propertyX": "value3",
>             "propertyY": "value4"
>         }
>     ]
> }
>
> In this case, I have users that would like to be able to access
> propertyB.propertyX and leverage it in joins and aggregations.  Since each
> record has N propertyB.propertyX values, though, I'm wondering how Drill's
> query planner and execution engine would handle this?
>
> thanks,
> Evan
>
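
For joins and cross-record aggregations on propertyB.propertyX, the other common approach is to flatten the repeated field so that each nested element becomes its own row. The Python sketch below shows that under stated assumptions: the flatten helper is hypothetical, the second record is invented purely for illustration, and nothing here describes Drill's actual planner or execution engine.

```python
from collections import Counter

records = [
    # The sample record from the message above.
    {
        "propertyA": "valueA",
        "propertyB": [
            {"propertyX": "value1", "propertyY": "value2"},
            {"propertyX": "value3", "propertyY": "value4"},
        ],
    },
    # Hypothetical second record, added only so the aggregation has
    # something to group across.
    {
        "propertyA": "valueB",
        "propertyB": [{"propertyX": "value1", "propertyY": "value5"}],
    },
]


def flatten(rec, repeated_field):
    """Yield one flat row per element of the repeated field.

    The parent's other fields are copied onto every row, and nested keys
    are exposed under dotted names such as 'propertyB.propertyX'.
    """
    parent = {k: v for k, v in rec.items() if k != repeated_field}
    for item in rec.get(repeated_field, []):
        row = dict(parent)
        for key, value in item.items():
            row[f"{repeated_field}.{key}"] = value
        yield row


rows = [row for rec in records for row in flatten(rec, "propertyB")]

# With flat rows, propertyB.propertyX behaves like an ordinary column and
# can serve as a grouping or join key.
counts = Counter(row["propertyB.propertyX"] for row in rows)
print(len(rows))     # 3
print(dict(counts))  # {'value1': 2, 'value3': 1}
```

Note the trade-off in this sketch: a record whose propertyB array is empty or missing produces no rows at all, and propertyA is repeated once per nested element, so any aggregate over propertyA after flattening counts it N times.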
