apex-dev mailing list archives

From Mohit Jotwani <mo...@datatorrent.com>
Subject Re: JdbcPOJOInputOperator Behaviour
Date Mon, 09 May 2016 15:56:39 GMT
+1 for incremental data.

Regards,
Mohit
On 9 May 2016 19:59, "Yogi Devendra" <devendra.vyavahare@gmail.com> wrote:

> +1 for incremental data fetching.
> As for the fetchDirection variable, it is better to get input from the
> original author (if possible).
>
> ~ Yogi
>
> On 9 May 2016 at 19:04, Akshay Gore <akshay@datatorrent.com> wrote:
>
> > +1 for incremental data fetching. This is a must-have feature.
> >
> > -Akshay
> > On 09-May-2016 3:39 pm, "Sandeep Deshmukh" <sandeep@datatorrent.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > > I am using JdbcPOJOInputOperator to ingest data from MySQL to HDFS. I
> > > > observed that once the existing data is ingested, data newly added to
> > > > MySQL is not ingested. However, if I add data to MySQL while the
> > > > ingestion is still in progress, the newly added data is also ingested
> > > > into HDFS.
> > >
> > > > In the code, fetching data in batches is achieved using the fetchSize
> > > > parameter, which limits the number of tuples to fetch per result set.
> > > > pageNumber is used internally to calculate the offset as
> > > > (fetchSize * pageNumber). The pageNumber is incremented per window.
> > >
> > > > Once the existing tuples are ingested, no further data is ingested,
> > > > but the pageNumber variable is still incremented. This results in
> > > > trying to fetch data beyond the number of tuples in the table/query
> > > > result.
> > >
> > > > Changing the offset calculation to use the number of tuples read so
> > > > far will fix this issue, and the operator can then be used to poll
> > > > for newer data in the table.
> > >
> > > > If you need to have a quick look at the code:
> > > > https://github.com/apache/incubator-apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/db/jdbc/JdbcPOJOInputOperator.java
> > >
> > > Side observation: fetchDirection variable is unused in the code. Will
> > > remove it from the class.
> > >
> > > > I would like to get your thoughts on my observations. I will create a
> > > > JIRA and open a PR based on the input received on this thread.
> > >
> > > Regards,
> > > Sandeep
> > >
> >
>
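
To make the behaviour described in the thread concrete, here is a minimal, self-contained simulation. It is not the operator's code: the class name, the simulate and fetch helpers, and the in-memory list standing in for the MySQL table are all hypothetical, and it assumes that rows are returned in a stable order and that new rows are appended at the end. It compares an offset computed as fetchSize * pageNumber with an offset equal to the number of tuples read so far.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an in-memory list stands in for the table, and
// fetch() stands in for a "LIMIT fetchSize OFFSET offset" style query.
class OffsetPollingSketch {

    // Returns at most fetchSize rows starting at offset; empty if offset is past the end.
    static List<Integer> fetch(List<Integer> table, int offset, int fetchSize) {
        if (offset >= table.size()) {
            return new ArrayList<>();
        }
        int end = Math.min(table.size(), offset + fetchSize);
        return new ArrayList<>(table.subList(offset, end));
    }

    // Runs the given number of windows against a table that starts with
    // initialRows rows and gains lateRows more at the start of arrivalWindow.
    // Returns the total number of rows ingested.
    static int simulate(boolean offsetFromRowsRead, int initialRows, int lateRows,
                        int arrivalWindow, int windows, int fetchSize) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < initialRows; i++) {
            table.add(i);
        }
        int pageNumber = 0;
        int rowsRead = 0;
        for (int w = 0; w < windows; w++) {
            if (w == arrivalWindow) {
                for (int i = 0; i < lateRows; i++) {
                    table.add(initialRows + i);
                }
            }
            // The two offset strategies discussed in the thread.
            int offset = offsetFromRowsRead ? rowsRead : fetchSize * pageNumber;
            rowsRead += fetch(table, offset, fetchSize).size();
            pageNumber++; // incremented every window, even when nothing was fetched
        }
        return rowsRead;
    }

    public static void main(String[] args) {
        // 10 initial rows, 5 rows added at window 6, 10 windows, fetchSize 5.
        System.out.println("page-based offset:      " + simulate(false, 10, 5, 6, 10, 5));
        System.out.println("rows-read-based offset: " + simulate(true, 10, 5, 6, 10, 5));
    }
}
```

In this run the page-based offset ingests 10 rows and skips the 5 late ones, because the offset has already moved past them during the idle windows. The rows-read-based offset ingests all 15. If the late rows arrive while ingestion is still in progress (for example arrivalWindow = 1), both strategies ingest all 15, which matches the behaviour reported above.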
