drill-user mailing list archives

From Jim Bates <jba...@maprtech.com>
Subject Re: 6 to 7 min delay in closing query when pulling over multiple json files using drill-0.6.0.28642.r2-1.noarch
Date Wed, 26 Nov 2014 03:02:08 GMT
Didn't get a hit on this so I'm sending it for round 2...

When a query targets a specific file and is limited to 1 returned row, it
returns in under a second. With the same limit but the scope widened to
several directories of JSON files, the single row still comes back quickly,
but the query can take 7 to 10 min to "finish". That delay forces one to
configure a timeout of 600 to 1200 sec in the ODBC connector, or the query
will fail.

Any workarounds for this?

Query to a single file:
select * FROM (select `dir0` as `city`, to_timestamp(
`executionTime`,'YYYY-MM-dd hh:mm:ss a') as `executionTime`,
flatten(`stationBeanList`) as `stations` FROM
 `data`.`all_bikes`.`../bikes/chicago/bikestations/1416875401.json` limit
1) a limit 1;
+------------+---------------+------------+
|    city    | executionTime |  stations  |
+------------+---------------+------------+
| null       | 2014-11-24 18:29:01.0 | {"id":5,"stationName":"State St &
Harrison St","availableDocks":12,"totalDocks":19,"latitude":41.8739580629,"longitude":-87.6277394859,"statusValue":"In
Service","statusKey":1,"availableBikes":7,"stAddress1":"State St & Harrison
St","stAddress2":"","city":"","postalCode":"","location":"620 S. State
St.","altitude":"","testStation":false,"landMark":"030"} |
+------------+---------------+------------+
1 row selected (0.542 seconds)

When executing over a larger scope it returns the first row in 3 sec but
does not close the query for another 6 or 7 minutes:
select * FROM (select `dir0` as `city`, to_timestamp(
`executionTime`,'YYYY-MM-dd hh:mm:ss a') as `executionTime`,
flatten(`stationBeanList`) as `stations` FROM
 `data`.`all_bikes`.`../bikes` limit 1) a limit 1;
+------------+---------------+------------+
|    city    | executionTime |  stations  |
+------------+---------------+------------+
| chicago    | 2014-11-17 23:29:01.0 | {"id":5,"stationName":"State St &
Harrison St","availableDocks":8,"totalDocks":19,"latitude":41.8739580629,"longitude":-87.6277394859,"statusValue":"In
Service","statusKey":1,"availableBikes":11,"stAddress1":"State St &
Harrison St","stAddress2":"","city":"","postalCode":"","location":"620 S.
State St.","altitude":"","testStation":false,"landMark":"030"} | * <--- At
this point in 3 sec*
+------------+---------------+------------+
1 row selected (683.15 seconds)
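
For anyone trying to reproduce the gap between "first row back" and "query
closed" from a client rather than from sqlline, here is a rough sketch of how
the two timings could be captured. It assumes nothing about Drill itself: it
works on any Python iterable of rows (for example a DB-API cursor), and both
time_first_row_and_close and slow_to_close_rows are hypothetical names made up
for this illustration. The fake row source only simulates the behaviour
described above.

```python
import time
from typing import Any, Callable, Iterable, Optional, Tuple


def time_first_row_and_close(
    rows: Iterable[Any],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[float], float, int]:
    """Return (seconds to first row, seconds until exhausted, row count)."""
    start = clock()
    first_row_at = None
    count = 0
    for _ in rows:
        if first_row_at is None:
            # Elapsed time when the first row reached the client.
            first_row_at = clock() - start
        count += 1
    # Elapsed time when the row source reported that it was finished.
    total = clock() - start
    return first_row_at, total, count


def slow_to_close_rows(first_delay: float, close_delay: float):
    """Hypothetical stand-in for a result set: one row, then a slow close."""
    time.sleep(first_delay)
    yield {"city": "chicago"}
    time.sleep(close_delay)


if __name__ == "__main__":
    first, total, n = time_first_row_and_close(slow_to_close_rows(0.05, 0.2))
    print(f"rows={n} first_row={first:.2f}s closed={total:.2f}s")
```

Swapping the fake generator for a real cursor over the ODBC connection should
show whether the client sees the same gap that sqlline reports.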


On Mon, Nov 24, 2014 at 10:00 PM, Jim Bates <jbates@maprtech.com> wrote:

> When executing a query to a specific file and limiting to 1 the query
> returns in under a second:
> select * FROM (select `dir0` as `city`, to_timestamp(
> `executionTime`,'YYYY-MM-dd hh:mm:ss a') as `executionTime`,
> flatten(`stationBeanList`) as `stations` FROM
>  `data`.`all_bikes`.`../bikes/chicago/bikestations/1416875401.json` limit
> 1) a limit 1;
> +------------+---------------+------------+
> |    city    | executionTime |  stations  |
> +------------+---------------+------------+
> | null       | 2014-11-24 18:29:01.0 | {"id":5,"stationName":"State St &
> Harrison St","availableDocks":12,"totalDocks":19,"latitude":41.8739580629,"longitude":-87.6277394859,"statusValue":"In
> Service","statusKey":1,"availableBikes":7,"stAddress1":"State St & Harrison
> St","stAddress2":"","city":"","postalCode":"","location":"620 S. State
> St.","altitude":"","testStation":false,"landMark":"030"} |
> +------------+---------------+------------+
> 1 row selected (0.567 seconds)
>
> When executing over a larger scope it returns the first row in 3 sec but
> does not close the query for another 6 or 7 minutes:
> select * FROM (select `dir0` as `city`, to_timestamp(
> `executionTime`,'YYYY-MM-dd hh:mm:ss a') as `executionTime`,
> flatten(`stationBeanList`) as `stations` FROM
>  `data`.`all_bikes`.`../bikes` limit 1) a limit 1;
> +------------+---------------+------------+
> |    city    | executionTime |  stations  |
> +------------+---------------+------------+
> | chicago    | 2014-11-17 23:29:01.0 | {"id":5,"stationName":"State St &
> Harrison St","availableDocks":8,"totalDocks":19,"latitude":41.8739580629,"longitude":-87.6277394859,"statusValue":"In
> Service","statusKey":1,"availableBikes":11,"stAddress1":"State St &
> Harrison St","stAddress2":"","city":"","postalCode":"","location":"620 S.
> State St.","altitude":"","testStation":false,"landMark":"030"} | * <---
> At this point in 3 sec*
> +------------+---------------+------------+
> 1 row selected (496.05 seconds)
>
> Any reason that might be?
>
>
