apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chandni Singh <chan...@datatorrent.com>
Subject Re: Changes to JdbcOutputOperator
Date Mon, 28 Dec 2015 10:02:12 GMT
  1. FieldInfo, is ultimately a custom object and any user who uses this
   operator has to construct an object, populate it and then use it. We are
   trying to avoid using any custom object and allow any user to use the
   operator without writing any extra code; just configuration.
FieldInfo is a way to provide configuration in a UI friendly way. Providing
configuration as JSON is not UI friendly.

   2. In case of update / merge, we need what SQL data types that the
   expression provided by the user would evaluate to. In case of insert we
can
   go and fetch the data types of the columns directly from DB. However, the
   same is not possible for custom expressions; "avg(salary)" for instance.
Ok so here is where you can make a change.
- JDBCPojoOutput can be renamed to AbstractJDBCPojoOutpuOperator.
- Abstraction is to fetch the type of column.
- Add a concrete JDBCPojoInsertOutput  that derives the types of columns
directly from DB. Please note that FieldInfo can also provide type of
derived column.


You mentioned "We are trying to avoid using any custom object and allow any
user to use the
   operator without writing any extra code".
This I think is specific to your use case. You can create an extension of
the above which takes JSON blob and creates FieldInfos from it.

Chandni





On Mon, Dec 28, 2015 at 1:48 AM, Bhupesh Chawda <bhupesh@datatorrent.com>
wrote:

> Hi Chandni,
>
> Following are the issues:
>
>    1. FieldInfo, is ultimately a custom object and any user who uses this
>    operator has to construct an object, populate it and then use it. We are
>    trying to avoid using any custom object and allow any user to use the
>    operator without writing any extra code; just configuration.
>    2. In case of update / merge, we need what SQL data types that the
>    expression provided by the user would evaluate to. In case of insert we
> can
>    go and fetch the data types of the columns directly from DB. However,
> the
>    same is not possible for custom expressions; "avg(salary)" for instance.
>
> Thanks.
>
> -Bhupesh
>
>
> On Mon, Dec 28, 2015 at 2:57 PM, Chandni Singh <chandni@datatorrent.com>
> wrote:
>
> > Hi Bhupesh,
> >
> > JDBCPojoOutputOperator was written for a demo and therefore it was marked
> > Evolving which is why I had mentioned that you should feel free to modify
> > it.
> >
> > I think an insert query can be as complex as any other query. It uses
> > FieldInfo because in the app builder it is easy for the user to provide
> > that instead of JSON String.
> >
> > Can you please provide specifics about what it is that you find difficult
> > to change/implement for providing update/merge/delete support in
> > JDBCPojoOutputOperator?
> >
> > Chandni
> >
> > On Mon, Dec 28, 2015 at 1:09 AM, Bhupesh Chawda <bhupesh@datatorrent.com
> >
> > wrote:
> >
> > > Hi All,
> > >
> > > It has been pointed out that adding another class for handling update /
> > > merge queries would not be a good option.
> > >
> > > Here are the current implementation details:
> > >
> > >    - We have an existing class: JdbcPOJOOutputOperator which accepts a
> > list
> > >    of FieldInfo objects. Each element of this list indicates the
> details
> > >    (column name, pojo field expression and datatype) of fields that
> need
> > > to be
> > >    inserted. Using this, the operator formulates the insert query in
> the
> > > setup
> > >    method and identifies the sql datatypes of these columns from the
> > > database
> > >    using the table name.
> > >
> > >
> > >    - Now, coming to the update / merge feature, it is difficult to
> > >    formulate the update / merge query in the operator logic due to the
> > > complex
> > >    structure of these statements. For this reason, we plan to take a
> > >    parametrized SQL query from the user. This may look like: *"update
> > table
> > >    set x = ?, y = ? where z + w > ? and a == 1;"*. Such statements can
> be
> > >    accepted from the user in addition to a json string which indicates
> > the
> > >    details for the parameters: *column name, the pojo expression and
> the
> > >    sql data type* of the expression. Note that this information is
> > similar
> > >    to the FieldInfo object, but is a string which can be configured
> > easily
> > > by
> > >    the user. Also note that the data type which is accepted is the SQL
> > data
> > >    type. Using this info we can populate the parametrized query and run
> > it
> > >    based on the incoming POJO.
> > >
> > > The second approach is able to handle all kinds of queries (insert /
> > update
> > > / merge / delete). However, since we already have the
> > > JdbcPOJOOutputOperator, we would like to merge the new functionality
> into
> > > the same class.
> > >
> > > Here we have the following options:
> > >
> > >    1. Change the existing class (JdbcPOJOOutputOperator) to the second
> > >    approach which is more generic and also handles inserts.
> > >    2. Add the update/ merge functionality to the existing class without
> > >    changing the existing functionality. This will have two different
> ways
> > > that
> > >    insert queries may be handled in the operator.
> > >    3. Add another class which extends from JdbcPOJOOutputOperator and
> > have
> > >    the update/merge functionality there. (This is not recommended.)
> > >    4. Any other approach.
> > >
> > > Please suggest.
> > >
> > > Thanks.
> > >
> > > -Bhupesh.
> > >
> > >
> > > On Mon, Dec 14, 2015 at 11:04 AM, Chandni Singh <
> chandni@datatorrent.com
> > >
> > > wrote:
> > >
> > > > No I don't think we are restricting Malhar to just abstract classes.
> > > > Whenever they are couple of use cases that we see quite often, we add
> > > > concrete implementations.
> > > >
> > > > For eg. FileLineInputOperator which is a concrete implementation of
> > > > AbstractFileInputOperator. FSSliceReader is an example as well.
> > > >
> > > > In AbstractJdbcOutputOperator case there hasn't been such common
> > > > insert/update query.
> > > >
> > > > Also if you look at the example I provided, it is very simple to
> > provide
> > > a
> > > > concrete implementation.
> > > >
> > > > If you would like to change JdbcPOJOOutputOperator to work for
> > > > "UPDATE/MERGE" then please go ahead.
> > > >
> > > > Chandni
> > > >
> > > > On Sun, Dec 13, 2015 at 8:47 PM, Bhupesh Chawda <
> > bhupesh@datatorrent.com
> > > >
> > > > wrote:
> > > >
> > > > > I see. So, just to understand more, do we plan to keep Malhar
> > > restricted
> > > > to
> > > > > the base functionality (as in abstract classes)? And put the
> > > > configuration
> > > > > aspect / concrete implementations in apps that use these operators?
> > > > >
> > > > > Thanks.
> > > > > Bhupesh
> > > > >
> > > > > On Sat, Dec 12, 2015 at 5:43 AM, Chandni Singh <
> > > chandni@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Here is an example of doing Upsert with JDBC:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/chandnisingh/Malhar/blob/examples/apps/jdbc/src/main/java/com/datatorrent/jdbc/JdbcWriter.java
> > > > > >
> > > > > > Thanks,
> > > > > > Chandni
> > > > > >
> > > > > > On Fri, Dec 11, 2015 at 11:19 AM, Chandni Singh <
> > > > chandni@datatorrent.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > The operators are under Malhar/lib/db/jdbc.
> > > > > > >
> > > > > > > Here is one of them:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-apex-malhar/blob/devel-3/library/src/main/java/com/datatorrent/lib/db/jdbc/AbstractJdbcTransactionableOutputOperator.java
> > > > > > >
> > > > > > > They work with any kind PreparedStatement - insert or update
> > > > > > >
> > > > > > > Chandni
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Dec 11, 2015 at 10:59 AM, Bhupesh Chawda <
> > > > > > bhupesh@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi Chandni,
> > > > > > >>
> > > > > > >> I don't see an update query being handled in the operator.
> Could
> > > you
> > > > > > >> please
> > > > > > >> point me to the appropriate class?
> > > > > > >> Or did you mean that handling a update query is just
a matter
> of
> > > > > > extending
> > > > > > >> the class and providing a concrete implementation?
> > > > > > >>
> > > > > > >> Thanks.
> > > > > > >> -Bhupesh
> > > > > > >>
> > > > > > >> On Fri, Dec 11, 2015 at 10:42 PM, Chandni Singh <
> > > > > > chandni@datatorrent.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > Hi Bhupesh,
> > > > > > >> >
> > > > > > >> > The current abstract JDBC Output Operators  in
library are
> > > generic
> > > > > and
> > > > > > >> have
> > > > > > >> > already been used in multiple POCs and applications.
In fact
> > > this
> > > > > > >> operator
> > > > > > >> > has matured through customer use cases. It is
not just an
> > insert
> > > > > > >> operator.
> > > > > > >> > We have used it to perform update and inserts.
> > > > > > >> >
> > > > > > >> > That said, I don't think it is a good idea to
introduce
> input
> > > > format
> > > > > > in
> > > > > > >> >  these abstract implementations. It is written
to handle any
> > > type
> > > > of
> > > > > > >> query,
> > > > > > >> > be it a procedure call (that was an actual customer
use
> case).
> > > > > > >> >
> > > > > > >> > Chandni
> > > > > > >> >
> > > > > > >> > On Fri, Dec 11, 2015 at 2:50 AM, Bhupesh Chawda
<
> > > > > > >> bhupesh@datatorrent.com>
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > Hi All,
> > > > > > >> > >
> > > > > > >> > > We are planning to proceed with the following
approach for
> > > JDBC
> > > > > > >> *update*
> > > > > > >> > > operator:
> > > > > > >> > >
> > > > > > >> > >    - *Update Query Configuration*
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > >    - Example Update Query: *update tableName
set a = ?**
> > > where b
> > > > > = ?
> > > > > > >> and
> > > > > > >> > c
> > > > > > >> > >    > ?;*
> > > > > > >> > >    - Example JSON input array for parameter
> instantiations:
> > > > *[{a,
> > > > > > >> > >    expression, INTEGER}, {b, expression,
VARCHAR}, {c,
> > > > expression,
> > > > > > >> > DATE}]*
> > > > > > >> > >
> > > > > > >> > > We are also planning to change the JDBC Output
Operator in
> > > > Malhar
> > > > > > >> Library
> > > > > > >> > > which currently does just insert. We plan
to make the
> input
> > > > format
> > > > > > >> > > consistent for both insert and update and
hence the change
> > to
> > > > the
> > > > > > >> current
> > > > > > >> > > way of configuration using JSON. Following
would be the
> > config
> > > > for
> > > > > > >> > inserts:
> > > > > > >> > >
> > > > > > >> > >    - *Insert Query Configuration*
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > >    - Example Insert Query: *insert into tableName
values
> (?,
> > > ?,
> > > > > .. ,
> > > > > > >> ?);*
> > > > > > >> > >    - Example JSON input array for parameter
> instantiations:
> > > > *[{a,
> > > > > > >> > >    expression, INTEGER}, {b, expression,
VARCHAR}, .. ,
> {c,
> > > > > > >> expression,
> > > > > > >> > > DATE}]*
> > > > > > >> > >
> > > > > > >> > > Please let us know your thoughts.
> > > > > > >> > >
> > > > > > >> > > Thanks.
> > > > > > >> > >
> > > > > > >> > > -Bhupesh
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > On Wed, Dec 9, 2015 at 6:38 PM, Bhupesh Chawda
<
> > > > > > >> bhupesh@datatorrent.com>
> > > > > > >> > > wrote:
> > > > > > >> > >
> > > > > > >> > > > Hi All,
> > > > > > >> > > >
> > > > > > >> > > > Would it be a good idea to introduce
the update
> > > functionality
> > > > to
> > > > > > the
> > > > > > >> > JDBC
> > > > > > >> > > > output operator in Apache Apex Malhar
library.
> > > > > > >> > > >
> > > > > > >> > > > The following are possible approaches:
> > > > > > >> > > >
> > > > > > >> > > >    1. Accept a update query from the
user with place
> > holders
> > > > for
> > > > > > >> > values.
> > > > > > >> > > >    Example: *update tableName set a
= ?, b = ? where c
> = ?
> > > and
> > > > > d >
> > > > > > >> ?*.
> > > > > > >> > > >    Here "?" will be provided by the
user as java
> > expressions
> > > > > which
> > > > > > >> will
> > > > > > >> > > be
> > > > > > >> > > >    evaluated from the incoming tuple.
> > > > > > >> > > >    2. Another option is to accept in
some configuration
> > > format
> > > > > > >> (json /
> > > > > > >> > > >    xml) the following and formulate
the query in the
> > > operator.
> > > > > > This
> > > > > > >> can
> > > > > > >> > > become
> > > > > > >> > > >    arbitrarily complex.
> > > > > > >> > > >    1. update clause columns
> > > > > > >> > > >       2. update clause expressions
> > > > > > >> > > >       3. where clause columns
> > > > > > >> > > >       4. where clause expressions
> > > > > > >> > > >
> > > > > > >> > > > I am thinking about going ahead with
1. Please let me
> know
> > > if
> > > > > any
> > > > > > >> other
> > > > > > >> > > > option is possible and whether such
a functionality
> > already
> > > > > exists
> > > > > > >> in
> > > > > > >> > > some
> > > > > > >> > > > other class.
> > > > > > >> > > >
> > > > > > >> > > > Thanks.
> > > > > > >> > > >
> > > > > > >> > > > -Bhupesh
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message