apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chandni Singh <chan...@datatorrent.com>
Subject Re: Changes to JdbcOutputOperator
Date Mon, 28 Dec 2015 17:56:16 GMT
Yeah that sounds good.

Chandni

On Mon, Dec 28, 2015 at 2:46 AM, Bhupesh Chawda <bhupesh@datatorrent.com>
wrote:

> Thanks Chandni for the comments.
>
> Additionally I think, we should do away with constructing the SQL query in
> the operator. This is because it is only possible in case of simple insert
> statements. In case a where clause is needed in the insert statement, we
> cannot construct the SQL easily. If we are accepting a parametrized query
> from the user for update/merge, why not do the same for insert statements
> as well? Then the hierarchy would look like:
> - AbstractJdbcPOJOOutputOperator (As suggested)
> -- JdbcPOJOOutputOperator (Takes care of all statements insert, update,
> merge, delete)
>
> We don't need a separate operator for inserts and update/merge.
>
> Thanks.
> -Bhupesh
>
>
> On Mon, Dec 28, 2015 at 3:32 PM, Chandni Singh <chandni@datatorrent.com>
> wrote:
>
> >   1. FieldInfo, is ultimately a custom object and any user who uses this
> >    operator has to construct an object, populate it and then use it. We
> are
> >    trying to avoid using any custom object and allow any user to use the
> >    operator without writing any extra code; just configuration.
> > FieldInfo is a way to provide configuration in a UI friendly way.
> Providing
> > configuration as JSON is not UI friendly.
> >
> >    2. In case of update / merge, we need what SQL data types that the
> >    expression provided by the user would evaluate to. In case of insert
> we
> > can
> >    go and fetch the data types of the columns directly from DB. However,
> > the
> >    same is not possible for custom expressions; "avg(salary)" for
> instance.
> > Ok so here is where you can make a change.
> > - JDBCPojoOutput can be renamed to AbstractJDBCPojoOutpuOperator.
> > - Abstraction is to fetch the type of column.
> > - Add a concrete JDBCPojoInsertOutput  that derives the types of columns
> > directly from DB. Please note that FieldInfo can also provide type of
> > derived column.
> >
> >
> > You mentioned "We are trying to avoid using any custom object and allow
> any
> > user to use the
> >    operator without writing any extra code".
> > This I think is specific to your use case. You can create an extension of
> > the above which takes JSON blob and creates FieldInfos from it.
> >
> > Chandni
> >
> >
> >
> >
> >
> > On Mon, Dec 28, 2015 at 1:48 AM, Bhupesh Chawda <bhupesh@datatorrent.com
> >
> > wrote:
> >
> > > Hi Chandni,
> > >
> > > Following are the issues:
> > >
> > >    1. FieldInfo, is ultimately a custom object and any user who uses
> this
> > >    operator has to construct an object, populate it and then use it. We
> > are
> > >    trying to avoid using any custom object and allow any user to use
> the
> > >    operator without writing any extra code; just configuration.
> > >    2. In case of update / merge, we need what SQL data types that the
> > >    expression provided by the user would evaluate to. In case of insert
> > we
> > > can
> > >    go and fetch the data types of the columns directly from DB.
> However,
> > > the
> > >    same is not possible for custom expressions; "avg(salary)" for
> > instance.
> > >
> > > Thanks.
> > >
> > > -Bhupesh
> > >
> > >
> > > On Mon, Dec 28, 2015 at 2:57 PM, Chandni Singh <
> chandni@datatorrent.com>
> > > wrote:
> > >
> > > > Hi Bhupesh,
> > > >
> > > > JDBCPojoOutputOperator was written for a demo and therefore it was
> > marked
> > > > Evolving which is why I had mentioned that you should feel free to
> > modify
> > > > it.
> > > >
> > > > I think an insert query can be as complex as any other query. It uses
> > > > FieldInfo because in the app builder it is easy for the user to
> provide
> > > > that instead of JSON String.
> > > >
> > > > Can you please provide specifics about what it is that you find
> > difficult
> > > > to change/implement for providing update/merge/delete support in
> > > > JDBCPojoOutputOperator?
> > > >
> > > > Chandni
> > > >
> > > > On Mon, Dec 28, 2015 at 1:09 AM, Bhupesh Chawda <
> > bhupesh@datatorrent.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > It has been pointed out that adding another class for handling
> > update /
> > > > > merge queries would not be a good option.
> > > > >
> > > > > Here are the current implementation details:
> > > > >
> > > > >    - We have an existing class: JdbcPOJOOutputOperator which
> accepts
> > a
> > > > list
> > > > >    of FieldInfo objects. Each element of this list indicates the
> > > details
> > > > >    (column name, pojo field expression and datatype) of fields that
> > > need
> > > > > to be
> > > > >    inserted. Using this, the operator formulates the insert query
> in
> > > the
> > > > > setup
> > > > >    method and identifies the sql datatypes of these columns from
> the
> > > > > database
> > > > >    using the table name.
> > > > >
> > > > >
> > > > >    - Now, coming to the update / merge feature, it is difficult to
> > > > >    formulate the update / merge query in the operator logic due to
> > the
> > > > > complex
> > > > >    structure of these statements. For this reason, we plan to take
> a
> > > > >    parametrized SQL query from the user. This may look like:
> *"update
> > > > table
> > > > >    set x = ?, y = ? where z + w > ? and a == 1;"*. Such statements
> > can
> > > be
> > > > >    accepted from the user in addition to a json string which
> > indicates
> > > > the
> > > > >    details for the parameters: *column name, the pojo expression
> and
> > > the
> > > > >    sql data type* of the expression. Note that this information is
> > > > similar
> > > > >    to the FieldInfo object, but is a string which can be configured
> > > > easily
> > > > > by
> > > > >    the user. Also note that the data type which is accepted is the
> > SQL
> > > > data
> > > > >    type. Using this info we can populate the parametrized query and
> > run
> > > > it
> > > > >    based on the incoming POJO.
> > > > >
> > > > > The second approach is able to handle all kinds of queries (insert
> /
> > > > update
> > > > > / merge / delete). However, since we already have the
> > > > > JdbcPOJOOutputOperator, we would like to merge the new
> functionality
> > > into
> > > > > the same class.
> > > > >
> > > > > Here we have the following options:
> > > > >
> > > > >    1. Change the existing class (JdbcPOJOOutputOperator) to the
> > second
> > > > >    approach which is more generic and also handles inserts.
> > > > >    2. Add the update/ merge functionality to the existing class
> > without
> > > > >    changing the existing functionality. This will have two
> different
> > > ways
> > > > > that
> > > > >    insert queries may be handled in the operator.
> > > > >    3. Add another class which extends from JdbcPOJOOutputOperator
> and
> > > > have
> > > > >    the update/merge functionality there. (This is not recommended.)
> > > > >    4. Any other approach.
> > > > >
> > > > > Please suggest.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > -Bhupesh.
> > > > >
> > > > >
> > > > > On Mon, Dec 14, 2015 at 11:04 AM, Chandni Singh <
> > > chandni@datatorrent.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > No I don't think we are restricting Malhar to just abstract
> > classes.
> > > > > > Whenever they are couple of use cases that we see quite often,
we
> > add
> > > > > > concrete implementations.
> > > > > >
> > > > > > For eg. FileLineInputOperator which is a concrete implementation
> of
> > > > > > AbstractFileInputOperator. FSSliceReader is an example as well.
> > > > > >
> > > > > > In AbstractJdbcOutputOperator case there hasn't been such common
> > > > > > insert/update query.
> > > > > >
> > > > > > Also if you look at the example I provided, it is very simple
to
> > > > provide
> > > > > a
> > > > > > concrete implementation.
> > > > > >
> > > > > > If you would like to change JdbcPOJOOutputOperator to work for
> > > > > > "UPDATE/MERGE" then please go ahead.
> > > > > >
> > > > > > Chandni
> > > > > >
> > > > > > On Sun, Dec 13, 2015 at 8:47 PM, Bhupesh Chawda <
> > > > bhupesh@datatorrent.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > I see. So, just to understand more, do we plan to keep
Malhar
> > > > > restricted
> > > > > > to
> > > > > > > the base functionality (as in abstract classes)? And put
the
> > > > > > configuration
> > > > > > > aspect / concrete implementations in apps that use these
> > operators?
> > > > > > >
> > > > > > > Thanks.
> > > > > > > Bhupesh
> > > > > > >
> > > > > > > On Sat, Dec 12, 2015 at 5:43 AM, Chandni Singh <
> > > > > chandni@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Here is an example of doing Upsert with JDBC:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/chandnisingh/Malhar/blob/examples/apps/jdbc/src/main/java/com/datatorrent/jdbc/JdbcWriter.java
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Chandni
> > > > > > > >
> > > > > > > > On Fri, Dec 11, 2015 at 11:19 AM, Chandni Singh <
> > > > > > chandni@datatorrent.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > The operators are under Malhar/lib/db/jdbc.
> > > > > > > > >
> > > > > > > > > Here is one of them:
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-apex-malhar/blob/devel-3/library/src/main/java/com/datatorrent/lib/db/jdbc/AbstractJdbcTransactionableOutputOperator.java
> > > > > > > > >
> > > > > > > > > They work with any kind PreparedStatement - insert
or
> update
> > > > > > > > >
> > > > > > > > > Chandni
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Dec 11, 2015 at 10:59 AM, Bhupesh Chawda
<
> > > > > > > > bhupesh@datatorrent.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >> Hi Chandni,
> > > > > > > > >>
> > > > > > > > >> I don't see an update query being handled
in the operator.
> > > Could
> > > > > you
> > > > > > > > >> please
> > > > > > > > >> point me to the appropriate class?
> > > > > > > > >> Or did you mean that handling a update query
is just a
> > matter
> > > of
> > > > > > > > extending
> > > > > > > > >> the class and providing a concrete implementation?
> > > > > > > > >>
> > > > > > > > >> Thanks.
> > > > > > > > >> -Bhupesh
> > > > > > > > >>
> > > > > > > > >> On Fri, Dec 11, 2015 at 10:42 PM, Chandni
Singh <
> > > > > > > > chandni@datatorrent.com>
> > > > > > > > >> wrote:
> > > > > > > > >>
> > > > > > > > >> > Hi Bhupesh,
> > > > > > > > >> >
> > > > > > > > >> > The current abstract JDBC Output Operators
 in library
> are
> > > > > generic
> > > > > > > and
> > > > > > > > >> have
> > > > > > > > >> > already been used in multiple POCs and
applications. In
> > fact
> > > > > this
> > > > > > > > >> operator
> > > > > > > > >> > has matured through customer use cases.
It is not just
> an
> > > > insert
> > > > > > > > >> operator.
> > > > > > > > >> > We have used it to perform update and
inserts.
> > > > > > > > >> >
> > > > > > > > >> > That said, I don't think it is a good
idea to introduce
> > > input
> > > > > > format
> > > > > > > > in
> > > > > > > > >> >  these abstract implementations. It
is written to handle
> > any
> > > > > type
> > > > > > of
> > > > > > > > >> query,
> > > > > > > > >> > be it a procedure call (that was an
actual customer use
> > > case).
> > > > > > > > >> >
> > > > > > > > >> > Chandni
> > > > > > > > >> >
> > > > > > > > >> > On Fri, Dec 11, 2015 at 2:50 AM, Bhupesh
Chawda <
> > > > > > > > >> bhupesh@datatorrent.com>
> > > > > > > > >> > wrote:
> > > > > > > > >> >
> > > > > > > > >> > > Hi All,
> > > > > > > > >> > >
> > > > > > > > >> > > We are planning to proceed with
the following approach
> > for
> > > > > JDBC
> > > > > > > > >> *update*
> > > > > > > > >> > > operator:
> > > > > > > > >> > >
> > > > > > > > >> > >    - *Update Query Configuration*
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > >    - Example Update Query: *update
tableName set a =
> ?**
> > > > > where b
> > > > > > > = ?
> > > > > > > > >> and
> > > > > > > > >> > c
> > > > > > > > >> > >    > ?;*
> > > > > > > > >> > >    - Example JSON input array for
parameter
> > > instantiations:
> > > > > > *[{a,
> > > > > > > > >> > >    expression, INTEGER}, {b, expression,
VARCHAR}, {c,
> > > > > > expression,
> > > > > > > > >> > DATE}]*
> > > > > > > > >> > >
> > > > > > > > >> > > We are also planning to change
the JDBC Output
> Operator
> > in
> > > > > > Malhar
> > > > > > > > >> Library
> > > > > > > > >> > > which currently does just insert.
We plan to make the
> > > input
> > > > > > format
> > > > > > > > >> > > consistent for both insert and
update and hence the
> > change
> > > > to
> > > > > > the
> > > > > > > > >> current
> > > > > > > > >> > > way of configuration using JSON.
Following would be
> the
> > > > config
> > > > > > for
> > > > > > > > >> > inserts:
> > > > > > > > >> > >
> > > > > > > > >> > >    - *Insert Query Configuration*
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > >    - Example Insert Query: *insert
into tableName
> values
> > > (?,
> > > > > ?,
> > > > > > > .. ,
> > > > > > > > >> ?);*
> > > > > > > > >> > >    - Example JSON input array for
parameter
> > > instantiations:
> > > > > > *[{a,
> > > > > > > > >> > >    expression, INTEGER}, {b, expression,
VARCHAR}, ..
> ,
> > > {c,
> > > > > > > > >> expression,
> > > > > > > > >> > > DATE}]*
> > > > > > > > >> > >
> > > > > > > > >> > > Please let us know your thoughts.
> > > > > > > > >> > >
> > > > > > > > >> > > Thanks.
> > > > > > > > >> > >
> > > > > > > > >> > > -Bhupesh
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > On Wed, Dec 9, 2015 at 6:38 PM,
Bhupesh Chawda <
> > > > > > > > >> bhupesh@datatorrent.com>
> > > > > > > > >> > > wrote:
> > > > > > > > >> > >
> > > > > > > > >> > > > Hi All,
> > > > > > > > >> > > >
> > > > > > > > >> > > > Would it be a good idea to
introduce the update
> > > > > functionality
> > > > > > to
> > > > > > > > the
> > > > > > > > >> > JDBC
> > > > > > > > >> > > > output operator in Apache
Apex Malhar library.
> > > > > > > > >> > > >
> > > > > > > > >> > > > The following are possible
approaches:
> > > > > > > > >> > > >
> > > > > > > > >> > > >    1. Accept a update query
from the user with place
> > > > holders
> > > > > > for
> > > > > > > > >> > values.
> > > > > > > > >> > > >    Example: *update tableName
set a = ?, b = ?
> where c
> > > = ?
> > > > > and
> > > > > > > d >
> > > > > > > > >> ?*.
> > > > > > > > >> > > >    Here "?" will be provided
by the user as java
> > > > expressions
> > > > > > > which
> > > > > > > > >> will
> > > > > > > > >> > > be
> > > > > > > > >> > > >    evaluated from the incoming
tuple.
> > > > > > > > >> > > >    2. Another option is to
accept in some
> > configuration
> > > > > format
> > > > > > > > >> (json /
> > > > > > > > >> > > >    xml) the following and
formulate the query in the
> > > > > operator.
> > > > > > > > This
> > > > > > > > >> can
> > > > > > > > >> > > become
> > > > > > > > >> > > >    arbitrarily complex.
> > > > > > > > >> > > >    1. update clause columns
> > > > > > > > >> > > >       2. update clause expressions
> > > > > > > > >> > > >       3. where clause columns
> > > > > > > > >> > > >       4. where clause expressions
> > > > > > > > >> > > >
> > > > > > > > >> > > > I am thinking about going
ahead with 1. Please let
> me
> > > know
> > > > > if
> > > > > > > any
> > > > > > > > >> other
> > > > > > > > >> > > > option is possible and whether
such a functionality
> > > > already
> > > > > > > exists
> > > > > > > > >> in
> > > > > > > > >> > > some
> > > > > > > > >> > > > other class.
> > > > > > > > >> > > >
> > > > > > > > >> > > > Thanks.
> > > > > > > > >> > > >
> > > > > > > > >> > > > -Bhupesh
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message