drill-dev mailing list archives

From Zelaine Fong <zf...@maprtech.com>
Subject Re: Performance question
Date Thu, 11 Aug 2016 16:35:35 GMT
What does the query plan look like when you're using SqlServer with Drill?
I'm guessing that the join isn't being pushed down to SqlServer.  If so,
you've hit DRILL-4818.  There are known limitations with the JDBC storage
plugin that prevent it from generating the optimal query plan in cases like
this.
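
One way to check is to prefix the query with EXPLAIN PLAN FOR. This is only a
sketch: it assumes the JDBC storage plugin is registered under the hypothetical
name `sqlserver`, that the tables live in the `dbo` schema, and that they are
named Facts and Product. None of these names appear in the original post.

```sql
-- Hypothetical plugin name (sqlserver), schema (dbo), and table names;
-- adjust them to match your own storage plugin configuration.
EXPLAIN PLAN FOR
select sum(f.Sales), p.`Product Category`
from sqlserver.dbo.Facts f
join sqlserver.dbo.Product p on p.productKey = f.productKey
group by p.`Product Category`;
```

If the join is pushed down, the plan should show a single JDBC scan whose
generated SQL contains the join. If it instead shows two separate scans
combined by a Drill join operator, the join is running inside Drill.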

-- Zelaine

On Thu, Aug 11, 2016 at 9:22 AM, imbar marinescu <imbarma@gmail.com> wrote:

> Hi,
>
> I'm looking into Drill, to use it as an in-memory DB.
> I wanted to handle data that I have in a SQL Server DB.
> I connected with a SQL Server JDBC plugin, and my test query ran for
> about 2 sec.
> When run directly on SQL Server, it took 0.15 sec.
>
> I ran a "create table" to write the data as Parquet files and then queried
> them with the dfs plugin.
> The query ran for 0.5 sec (after caching; the first run takes about 3 sec).
> I also tried "REFRESH TABLE METADATA", but it didn't change anything.
>
> My Test query is:
> select sum(f.Sales), p.`Product Category`
> from dfs.tmp.`/Demo/Facts/` f
> join dfs.tmp.`/Demo/Product/` p on p.productKey = f.productKey
> group by p.`Product Category`;
>
> Facts table has 422,833 rows, product has 606.
> The result set is 4 rows.
>
> This was done running Drill locally (embedded) on a Windows machine.
> I tried a Linux machine, but the results were even slower.
>
> I didn't configure anything, just used the install as-is.
>
> Am I doing something wrong? Is an RDBMS going to be faster anyway?
> I read about Drill's performance, and I feel I'm not getting there.
>
> SqlServer: 0.15 sec.
> SqlServer in drill: 2 sec.
> Parquet in drill: 0.5 sec.
>
> Thank you,
> Imbar
>
