apex-dev mailing list archives

From Hitesh Kapoor <hit...@datatorrent.com>
Subject Re: APEXMALHAR-2382 User needs to create dt_meta table while using JdbcPOJOInsertOutputOperator
Date Tue, 17 Jan 2017 11:27:27 GMT
Hi All,

Thank you for the opinions and suggestions. As per the consensus, we will
continue to rely on the user to create the dt_meta table.
A Jira for the documentation already exists (APEXMALHAR-2383) and is in progress.
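
As a possible starting point for that documentation, here is a minimal sketch of the sample table creation query requested in the thread, together with the suggestion to log the DDL when the table is missing rather than create it. The column names and types (dt_app_id, dt_operator_id, dt_window) are assumptions and should be checked against the operator's store. The class and method names are hypothetical and are not part of Malhar.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.logging.Logger;

/** Hypothetical helper: builds the meta table DDL and reports a missing table. */
public class MetaTableDdlSketch {
  private static final Logger LOG = Logger.getLogger(MetaTableDdlSketch.class.getName());

  /** Generic DDL; column names and types are assumptions, adjust per database. */
  public static String createTableSql(String tableName) {
    return "CREATE TABLE " + tableName + " ("
        + "dt_app_id VARCHAR(100) NOT NULL, "
        + "dt_operator_id INT NOT NULL, "
        + "dt_window BIGINT NOT NULL, "
        + "UNIQUE (dt_app_id, dt_operator_id, dt_window))";
  }

  /** Message with instructions and the DDL, intended for the operator log. */
  public static String missingTableMessage(String tableName) {
    return "Meta table '" + tableName + "' not found. Ask your dbadmin to create it, "
        + "for example with: " + createTableSql(tableName);
  }

  /** Looks the table up via JDBC metadata; name matching is database dependent. */
  public static boolean tableExists(Connection conn, String tableName) throws SQLException {
    DatabaseMetaData md = conn.getMetaData();
    try (ResultSet rs = md.getTables(null, null, tableName, new String[] {"TABLE"})) {
      return rs.next();
    }
  }

  /** Fails fast with a readable message instead of modifying the schema. */
  public static void checkMetaTable(Connection conn, String tableName) throws SQLException {
    if (!tableExists(conn, tableName)) {
      String message = missingTableMessage(tableName);
      LOG.severe(message);
      throw new IllegalStateException(message);
    }
  }
}
```

The sketch never issues the CREATE TABLE itself, which keeps the "no side-effects unless the user asked for it" position from the thread.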

Regards,
Hitesh
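
For readers new to the thread, the reason the meta table is needed for exactly-once can be sketched roughly as follows: the window id is written in the same transaction as the user rows, so after a failure any replayed window at or below the stored id can be skipped. This is only an illustration under stated assumptions, not the operator's actual code. The table user_table, its payload column, the dt_meta column names, and all class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical sketch: commit user rows and the window id atomically. */
public class WindowCommitSketch {
  // Assumed table and column names, for illustration only.
  static final String INSERT_ROW = "INSERT INTO user_table (payload) VALUES (?)";
  static final String UPDATE_WINDOW =
      "UPDATE dt_meta SET dt_window = ? WHERE dt_app_id = ? AND dt_operator_id = ?";

  /** A replayed window is skipped if it is not newer than the last committed one. */
  public static boolean alreadyCommitted(long windowId, long lastCommittedWindowId) {
    return windowId <= lastCommittedWindowId;
  }

  /**
   * Writes the rows of one window and records the window id in one transaction.
   * Assumes a dt_meta row for (appId, operatorId) was inserted during setup.
   */
  public static void commitWindow(Connection conn, String appId, int operatorId,
      long windowId, List<String> rows) throws SQLException {
    conn.setAutoCommit(false);
    try (PreparedStatement insert = conn.prepareStatement(INSERT_ROW);
         PreparedStatement meta = conn.prepareStatement(UPDATE_WINDOW)) {
      for (String row : rows) {
        insert.setString(1, row);
        insert.addBatch();
      }
      insert.executeBatch();
      meta.setLong(1, windowId);
      meta.setString(2, appId);
      meta.setInt(3, operatorId);
      meta.executeUpdate();
      // User data and window metadata become visible together, or not at all.
      conn.commit();
    } catch (SQLException e) {
      conn.rollback();
      throw e;
    }
  }
}
```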


On Mon, Jan 16, 2017 at 9:16 PM, Thomas Weise <thw@apache.org> wrote:

> For JDBC exactly-once results, the window metadata needs to be committed
> along with the user data, hence the need for additional schema in the
> target database. There have been webinars and blog posts about processing
> guarantees; please have a look at those.
>
> Table creation cannot be automatic by default. As I already said on the
> JIRA, it could be an option for testing or for development; it just needs to
> be off by default. I would like to see the table name changed.
>
> If the library can generate the DDL for the table for the target database,
> then possibly when the table is missing it can output that along with
> instructions into the log to make things easier for users.
>
> It might also be a good idea to document the setup steps for multiple
> operators to write to the same schema.
>
> Thanks
>
> On Sun, Jan 15, 2017 at 11:55 PM, Devendra Tagare <devendrat@datatorrent.com>
> wrote:
>
> > Hi,
> >
> > -1 on auto table creation. Reasons have been eloquently elaborated in the
> > earlier posts.
> >
> > +1 on revisiting the approach taken for exactly once in the
> > JDBCPollInputOperator.
> >
> > One way could be to move the JDBC read and write operators into a separate
> > module (like apex-malhar/kafka) and add statistics, metrics, and meta-data
> > features along similar lines. This module can have the required concrete
> > implementations for mysql, psql, etc.
> >
> > Thanks,
> > Dev
> >
> > On Sun, Jan 15, 2017 at 10:34 PM, Chinmay Kolhatkar <chinmay@apache.org>
> > wrote:
> >
> > > -1 for automatic schema creation...
> > >
> > > Moreover, I am wondering whether asking the user to create a dt_meta
> > > table is the right way. From an admin's perspective, a request to create
> > > a meta table looks wrong to me. The dt_meta table is created for the
> > > purpose of exactly once, but it does not hold any user data. On this
> > > logic, an admin might refuse to let developers create the table.
> > >
> > > I suggest starting a separate thread to do exactly once for JDBC insert
> > > in a cleaner way. We can take a look at the Kafka or File outputs to see
> > > how they achieve exactly once without creating a meta location at the
> > > destination.
> > >
> > > -Chinmay.
> > >
> > >
> > > On Mon, Jan 16, 2017 at 11:16 AM, Pradeep Kumbhar <pradeep@datatorrent.com>
> > > wrote:
> > >
> > > > +1 on having the operator documentation explicitly mention that the
> > > > "dt_meta" table is mandatory for the operator to work correctly. Also
> > > > provide a sample table creation query for reference.
> > > >
> > > > On Sat, Jan 14, 2017 at 1:05 PM, AJAY GUPTA <ajaygit158@gmail.com> wrote:
> > > >
> > > > > Since the query can be different for different databases, the user
> > > > > will have to provide the query to the operator. Rather than this, I
> > > > > believe it's easier for the user to execute the create table query
> > > > > directly on the DB.
> > > > >
> > > > > Also, the create table script won't be heavy enough to justify a
> > > > > separate script. Adding a generic query to the docs should probably
> > > > > suffice.
> > > > >
> > > > >
> > > > > Ajay
> > > > >
> > > > > On Sat, 14 Jan 2017 at 10:27 AM, Yogi Devendra <yogidevendra@apache.org>
> > > > > wrote:
> > > > >
> > > > > > As Aniruddha pointed out, table creation should be done by the
> > > > > > dbadmin. In that case, a utility script will be helpful.
> > > > > >
> > > > > > If we embed this code inside the operator or application, it will
> > > > > > be difficult for the dbadmin to use it.
> > > > > >
> > > > > > ~ Yogi
> > > > > >
> > > > > > On 14 January 2017 at 03:43, Thomas Weise <thw@apache.org> wrote:
> > > > > >
> > > > > > > -1 for automatic schema modification, unless the user asked for
> > > > > > > it. See comment on JIRA.
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 5:11 AM, Aniruddha Thombare
> > > > > > > <aniruddha@datatorrent.com> wrote:
> > > > > > >
> > > > > > > > The tables should be created / altered by the dbadmin.
> > > > > > > > We shouldn't worry about table creation, as it's a one-time
> > > > > > > > activity.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > A
> > > > > > > >
> > > > > > > > _____________________________________
> > > > > > > > Sent with difficulty, I mean handheld ;)
> > > > > > > >
> > > > > > > > On 13 Jan 2017 6:37 pm, "Yogi Devendra" <yogidevendra@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > I am not very keen on having a utility script.
> > > > > > > > But "no side-effects without explicit ask by the end-user" is
> > > > > > > > important.
> > > > > > > >
> > > > > > > > ~ Yogi
> > > > > > > >
> > > > > > > > On 13 January 2017 at 16:44, Priyanka Gugale <priyag@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > IMO it's okay to create the table in Java code. We should
> > > > > > > > > document it in the operator guide as well as put a log
> > > > > > > > > message when we create the table.
> > > > > > > > > And in case you don't have privileges, the operator should
> > > > > > > > > throw a meaningful message.
> > > > > > > > >
> > > > > > > > > -Priyanka
> > > > > > > > >
> > > > > > > > > On Fri, Jan 13, 2017 at 4:07 PM, Yogi Devendra
> > > > > > > > > <yogidevendra@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > My suggestions:
> > > > > > > > > >
> > > > > > > > > >    1. Have a separate utility script for creating this
> > > > > > > > > >    table.
> > > > > > > > > >    2. Have a README for the utility script.
> > > > > > > > > >    3. Mention the utility script in the operator javadocs.
> > > > > > > > > >    4. Mention the utility script in the application README.
> > > > > > > > > >    5. If at all you wish to ease the process, you can
> > > > > > > > > >    introduce a flag like autoPopulateMetaTable. But the
> > > > > > > > > >    default value of this flag should be off.
> > > > > > > > > >    6. I would prefer to avoid side-effects unless explicitly
> > > > > > > > > >    asked by the end user.
> > > > > > > > > >    7. Relevant exceptions should be caught and should have a
> > > > > > > > > >    message which can be understood by the end user.
> > > > > > > > > >
> > > > > > > > > > ~ Yogi
> > > > > > > > > >
> > > > > > > > > > On 13 January 2017 at 15:57, Hitesh Kapoor
> > > > > > > > > > <hitesh@datatorrent.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi All,
> > > > > > > > > > >
> > > > > > > > > > > Currently, to use JdbcPOJOInsertOutputOperator, the user
> > > > > > > > > > > needs to create a "dt_meta" table to enforce exactly-once
> > > > > > > > > > > processing semantics. If the user fails to create this
> > > > > > > > > > > table before launching the application, an exception is
> > > > > > > > > > > thrown.
> > > > > > > > > > > To handle this scenario we can automate the process of
> > > > > > > > > > > creating this table, assuming the user has the appropriate
> > > > > > > > > > > privileges. The problem with this approach is that it may
> > > > > > > > > > > not be a very good idea to modify the user's database
> > > > > > > > > > > automatically. Also, if the user doesn't have the
> > > > > > > > > > > appropriate privileges, it will eventually throw an
> > > > > > > > > > > exception (however, a different exception).
> > > > > > > > > > > So I need your opinion on whether we should automate the
> > > > > > > > > > > creation of this internal table (if it doesn't exist),
> > > > > > > > > > > continue with the existing behaviour, or do anything else.
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Hitesh
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *regards,*
> > > > *~pradeep*
> > > >
> > >
> >
>
