drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Laurent Goujon <laur...@dremio.com>
Subject Re: [HANGOUT] Topics for 10/04/16
Date Thu, 06 Oct 2016 20:54:52 GMT
For the PR I didn't squash the changes into one single commit to make it
easier to review, but let me know if you would prefer the changes or pull
requests organized differently to speed up the review...

On Thu, Oct 6, 2016 at 11:50 AM, Parth Chandra <pchandra@maprtech.com>
wrote:

> I see, so it is not the prepare query where we see the improvement. I
> didn't realise that the BI tools can make so many information schema
> queries.
>
> I haven't been able to go thru the PR in detail. There is extensive
> refactoring that always makes it hard to tell the exact changes.  (Make me
> nervous as we don't have _any_ tests to test for memory leaks, concurrency
> and error conditions). I'll try to test some of these conditions out.
>
> Re DRILL-4280 - I was basing my comment on the PR comment.  I'm assuming
> Sudheesh will follow up and address this.
>
>
> On Wed, Oct 5, 2016 at 1:37 PM, Jacques Nadeau <jacques@dremio.com> wrote:
>
> > -user (since this is really a dev discussion)
> >
> > Looking at DRILL-4280, I see the comment about C++ changes but don't see
> > them there or in the linked squash branch. Maybe I'm missing something
> > obvious?
> >
> > The key performance benefits from the provided patch are about all the
> > other interrogations that Tableau (and other BI tools) makes against the
> > ODBC metadata interfaces (not prepare). With the new driver, those items
> > can be answered directly (such as list of schemas) rather than through
> ODBC
> > driver generated queries. I've commonly seen situations where even a
> simple
> > tableau workflow can take 7-10s even in the situation that the actual
> > "work" query (and associated limit 0 query) is quite short (<1s). This
> has
> > been especially problematic in situations where you're interacting with a
> > large number of sources since those information schema queries are not
> > always optimal to what is actually needed.
> >
> > With regards to benchmarks, we haven't done anything formal but you
> should
> > be able to easily do a comparison and see the improvements, especially
> when
> > dealing with larger catalogs.
> >
> >
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Wed, Oct 5, 2016 at 9:49 AM, Parth Chandra <pchandra@maprtech.com>
> > wrote:
> >
> > > Yup. I agree that we need to make sure that both clients are in sync. I
> > > believe DRILL-4280's PR refers to making changes in both APIs as well.
> > >
> > > Do you have a sense of how these changes give us a performance boost?
> As
> > > far as I can see, the APIs result in nearly the same code path being
> > > executed, with the difference being that the limit 0 query is now
> > submitted
> > > by the server instead of the client.
> > >
> > > I don't know much about the tweaking of performance for various BI
> tools;
> > > is there something that Tableau et al do different? I don't see how,
> > since
> > > the the ODBC/JDBC interface remains the same. Just trying to understand
> > > this.
> > >
> > > Anyway, any performance gain is wonderful. Do you have any numbers to
> > > share?
> > >
> > >
> > > On Tue, Oct 4, 2016 at 10:29 AM, Jacques Nadeau <jacques@dremio.com>
> > > wrote:
> > >
> > > > Both the C++ and the JDBC changes are updates that leverage a number
> of
> > > > pre-existing APIs already on the server. Our initial evaluations, we
> > have
> > > > already seen substantially improved BI tool performance with the
> > proposed
> > > > changes (with no additional server side changes). Are you seeing
> > > something
> > > > different? If you haven't yet looked at the changes in that light, I
> > > > suggest you do.
> > > >
> > > > If anything, I'm more concerned about client feature proposals that
> > don't
> > > > cover both the C++ and Java client. For example, I think we should be
> > > > cautious about merging something like DRILL-4280. We should be
> cautious
> > > > about introducing new server APIs unless there is a concrete plan
> > around
> > > > support in all clients.
> > > >
> > > > So I agree with the spirit of your ask: change proposals should be
> > > > "complete". However, I don't think it reasonably applies to the
> changes
> > > > proposed by Laurent. His changes "complete" the already introduced
> > > metadata
> > > > and prepare apis the server exposes. It provides an improved BI user
> > > > experience. It also introduces unit tests in the C++ client,
> something
> > > that
> > > > was previously sorely missing.
> > > >
> > > >
> > > >
> > > > --
> > > > Jacques Nadeau
> > > > CTO and Co-Founder, Dremio
> > > >
> > > > On Tue, Oct 4, 2016 at 9:47 AM, Parth Chandra <pchandra@maprtech.com
> >
> > > > wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > >   I won't be able to join the hangout but it would be good to
> discuss
> > > the
> > > > > plan for the related backend changes.
> > > > >
> > > > >   As I mentioned before I would like to see a concrete proposal for
> > the
> > > > > backend that will accompany these changes. Without that, I feel
> there
> > > is
> > > > no
> > > > > point to adding so much new code.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Parth
> > > > >
> > > > >
> > > > > On Mon, Oct 3, 2016 at 7:52 PM, Laurent Goujon <laurent@dremio.com
> >
> > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm currently working on improving metadata support for both
the
> > JDBC
> > > > > > driver and the C++ connector, more specifically the following
> > JIRAs:
> > > > > >
> > > > > > DRILL-4853: Update C++ protobuf source files
> > > > > > DRILL-4420: Server-side metadata and prepared-statement support
> for
> > > C++
> > > > > > connector
> > > > > > DRILL-4880: Support JDBC driver registration using ServiceLoader
> > > > > > DRILL-4925: Add tableType filter to GetTables metadata query
> > > > > > DRILL-4730: Update JDBC DatabaseMetaData implementation to use
> new
> > > > > Metadata
> > > > > > APIs
> > > > > >
> > > > > > I  already opened multiple pull requests for those (the list
is
> > > > available
> > > > > > at https://github.com/apache/drill/pulls/laurentgo)
> > > > > >
> > > > > > I'm planning to join tomorrow hangout in case people have
> questions
> > > > about
> > > > > > those.
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Laurent
> > > > > >
> > > > > > On Mon, Oct 3, 2016 at 10:28 AM, Subbu Srinivasan <
> > > > > ssrinivasan@zscaler.com
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Can we close on https://github.com/apache/drill/pull/518
?
> > > > > > >
> > > > > > > On Mon, Oct 3, 2016 at 10:27 AM, Sudheesh Katkam <
> > > > sudheesh@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi drillers,
> > > > > > > >
> > > > > > > > Our bi-weekly hangout is tomorrow (10/04/16, 10 AM
PT). If
> you
> > > have
> > > > > any
> > > > > > > > suggestions for hangout topics, you can add them to
this
> > thread.
> > > We
> > > > > > will
> > > > > > > > also ask around at the beginning of the hangout for
topics.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Sudheesh
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message