hawq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jon Roberts <jrobe...@pivotal.io>
Subject Re: PXF JDBC plugin
Date Thu, 27 Apr 2017 14:01:19 GMT
https://issues.apache.org/jira/browse/HAWQ-1445

Jon Roberts

On Tue, Apr 25, 2017 at 6:31 AM, Jon Roberts <jroberts@pivotal.io> wrote:

> No I did not.  My post was intended to start a discussion about the plugin
> and I thought the original contributor would chime in.  But I can make the
> tickets now if you think that is the best way to move forward.
>
> Jon Roberts
> Principal Engineer | jroberts@pivotal.io | 615-426-8661 <(615)%20426-8661>
>
> On Tue, Apr 25, 2017 at 2:58 AM, Michael Pearce <Michael.Pearce@ig.com>
> wrote:
>
>> Hi Jon,
>>
>> Did you make JIRA issues for the two improvements you suggested to be
>> made?
>>
>> Cheers
>> Mike
>>
>> On 04/04/2017, 04:15, "Michael Pearce" <Michael.Pearce@ig.com> wrote:
>>
>>     Hi Jon
>>
>>     I think on issue 1 and 2 are valid improvements that can be made, add
>> some Jira's for these. Looking at issue 1 it seems quite trivial for
>> someone to contribute the solution for.
>>
>>     On the question front:
>>
>>     1) from my understanding and using this plugin so far this is what
>> the partion by interval allows you to control the number of partitions that
>> would be generated, e.g. This is why the sample in document has 1 year so
>> it only creates 2 fragments.  You can also partition by an enum field here
>> the number of partitions is the number of enums.
>>
>>     2) I haven't checked specifically, saying that I haven't run into an
>> issue when trialling this myself. Obviously this is contributed by Devin he
>> will be the definitive source here. Though I guess can always specially
>> test these scenarios.
>>
>>     Sent using OWA for iPhone
>>     ________________________________________
>>     From: Jon Roberts <jroberts@pivotal.io>
>>     Sent: Monday, April 3, 2017 11:57:59 PM
>>     To: dev@hawq.incubator.apache.org
>>     Subject: Re: PXF JDBC plugin
>>
>>     https://github.com/apache/incubator-hawq/tree/master/pxf/pxf-jdbc
>>
>>     Issue 1: Security
>>     The example has the username and password in the connection string.
>>
>>     LOCATION ('pxf://localhost:51200/demodb.myclass'
>>                   '?PROFILE=JDBC'
>>                   '&JDBC_DRIVER=com.mysql.jdbc.Driver'
>>                   '&DB_URL=jdbc:mysql://
>>     192.168.200.6:3306/demodb&USER=root&PASS=root'
>>                   )
>>
>>     Any chance we can get this changed to a connection profile that
>> points to a
>>     file outside of the database?  For Greenplum database and S3, the
>> LOCATION
>>     syntax includes "config=/path/to/config_file".  The config_file
>> contains
>>     the S3 credentials.  This seems like a good pattern to follow.
>>
>>     As it is right now, anyone that can connect to the database will be
>> able to
>>     see the username and password of the JDBC connection.
>>
>>     Issue 2: Extra Properties
>>
>>     Some JDBC drivers will need many additional properties beyond the URL
>> and
>>     this requires setting it with a put to a Properties variable.  An
>> example
>>     of this is Oracle's defaultRowPrefetch property that needs to be
>> updated
>>     from the default of 10 which is designed for OLTP to something larger
>> like
>>     2000 which is more ideal for data extracts.
>>
>>     Additionally, you will need the ability to set the isolation level
>> which is
>>     done with setTransactionIsolation on the Connection.  I don't believe
>> you
>>     can set this on the connection URL either.  Many SQL Server and DB2
>>     database still don't use snapshot isolation and use dirty reads
>> instead.
>>     Without a dirty read here, your query will block modifications to the
>> table
>>     with a "blocking lock".
>>
>>     So for the configuration file will need both an extra properties
>> variable
>>     that is delimited so you can multiple and a isolation level indicator.
>>
>>     Questions.
>>     1.  How do you manage the maximum number of pxf instances?  For
>> example, if
>>     you partition like this:
>>     "PARTITION_BY=cdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:day",
>> will
>>     you create 730 pxf instances and thus 730 concurrent queries to the
>> source
>>     database?
>>     2.  Have you tested special characters like null, escape, carriage
>> return,
>>     newline, and your delimiter?  Maybe that is handled automatically by
>>     pxfwritable_import so this isn't an issue.
>>
>>
>>
>>     Jon Roberts
>>     Principal Engineer | jroberts@pivotal.io | 615-426-8661
>>
>>     On Mon, Apr 3, 2017 at 3:18 PM, Michael Pearce <Michael.Pearce@ig.com
>> >
>>     wrote:
>>
>>     > I mean the readme file :) at that location.
>>     > ________________________________________
>>     > From: Michael Pearce
>>     > Sent: Monday, April 3, 2017 9:15:48 PM
>>     > To: dev@hawq.incubator.apache.org
>>     > Subject: Re: PXF JDBC plugin
>>     >
>>     > On master there is the document that was added.
>>     >
>>     > https://github.com/apache/incubator-hawq/tree/master/pxf/pxf-jdbc
>>     >
>>     > Reading this I think answers some of the questions.
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > Sent using OWA for iPhone
>>     > ________________________________________
>>     > From: Vineet Goel <vvineet@apache.org>
>>     > Sent: Monday, April 3, 2017 6:33:07 PM
>>     > To: dev@hawq.incubator.apache.org
>>     > Subject: Re: PXF JDBC plugin
>>     >
>>     > Devin and others,
>>     >
>>     > It would be great if you can expand a bit more on the
>> implementation or
>>     > point to a document?
>>     >
>>     > Thanks!
>>     >
>>     >
>>     >
>>     > On Mon, Apr 3, 2017 at 7:53 AM Jon Roberts <jroberts@pivotal.io>
>> wrote:
>>     >
>>     > > JDBC PXF is pretty exciting.  Are there details on how this works?
>>     > >
>>     > > Does PXF randomly pick a segment to connect to the JDBC source or
>> do you
>>     > > specify a particular node to execute the query?  I'm assuming it
>> isn't a
>>     > > parallel query so you don't flood the JDBC source with x number of
>>     > "select
>>     > > *" queries but I could be wrong.
>>     > >
>>     > > How do you register the JDBC connection string with PXF?
>>     > >
>>     > > How do you manage credentials?
>>     > >
>>     > > How do you register the JDBC jar files?
>>     > >
>>     > >
>>     > > Jon Roberts
>>     > >
>>     > > On Sat, Apr 1, 2017 at 5:05 PM, Ed Espino <espino@apache.org>
>> wrote:
>>     > >
>>     > > > Thank you Michael, this is a well-deserved report highlight. I
>> have
>>     > added
>>     > > > the following to the report:
>>     > > >
>>     > > >   3. Community contribution highlight(s):
>>     > > >
>>     > > >      * Leveraging the extensible PXF design, a JDBC PXF plugin
>>     > > >        was contributed by Devin Jia (github id: jiadexin). This
>>     > > >        contribution came from the community and not from the
>>     > > >        company which originally donated HAWQ to the ASF.
>>     > > >
>>     > > > Warm regrds,
>>     > > > -=e
>>     > > >
>>     > > > On Sat, Apr 1, 2017 at 1:14 PM, Michael André Pearce <
>>     > > > michael.andre.pearce@me.com> wrote:
>>     > > >
>>     > > > >
>>     > > > > I would add note about the new JDBC PXF plugin being
>> contributed by
>>     > > Devin
>>     > > > > Jia. (Kudos to him)
>>     > > > >
>>     > > > > It's a good show of the community contributing and growing
>> outside of
>>     > > > just
>>     > > > > Pivotal staff (I think for incubator review this is
>> important) and
>>     > also
>>     > > > the
>>     > > > > benefit of the extensible PXF design the project has done.
>>     > >
>>     > The information contained in this email is strictly confidential
>> and for
>>     > the use of the addressee only, unless otherwise indicated. If you
>> are not
>>     > the intended recipient, please do not read, copy, use or disclose
>> to others
>>     > this message or any attachment. Please also notify the sender by
>> replying
>>     > to this email or by telephone (+44(020 7896 0011) and then delete
>> the email
>>     > and any copies of it. Opinions, conclusion (etc) that do not relate
>> to the
>>     > official business of this company shall be understood as neither
>> given nor
>>     > endorsed by it. IG is a trading name of IG Markets Limited (a
>> company
>>     > registered in England and Wales, company number 04008957) and IG
>> Index
>>     > Limited (a company registered in England and Wales, company number
>>     > 01190902). Registered address at Cannon Bridge House, 25 Dowgate
>> Hill,
>>     > London EC4R 2YA. Both IG Markets Limited (register number 195355)
>> and IG
>>     > Index Limited (register number 114059) are authorised and regulated
>> by the
>>     > Financial Conduct Authority.
>>     >
>>
>>
>> The information contained in this email is strictly confidential and for
>> the use of the addressee only, unless otherwise indicated. If you are not
>> the intended recipient, please do not read, copy, use or disclose to others
>> this message or any attachment. Please also notify the sender by replying
>> to this email or by telephone (+44(020 7896 0011) and then delete the email
>> and any copies of it. Opinions, conclusion (etc) that do not relate to the
>> official business of this company shall be understood as neither given nor
>> endorsed by it. IG is a trading name of IG Markets Limited (a company
>> registered in England and Wales, company number 04008957) and IG Index
>> Limited (a company registered in England and Wales, company number
>> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>> Index Limited (register number 114059) are authorised and regulated by the
>> Financial Conduct Authority.
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message