flink-dev mailing list archives

From <ruben.casado.teje...@accenture.com>
Subject Re: <Dev>Need guidance for write a client connector for 'Flink'
Date Thu, 19 Jan 2017 09:28:03 GMT

Just in case it is useful: we are working on a Flink-Kudu integration [1]. It is still a
work in progress, but we had to implement an InputFormat to read from Kudu tables, so maybe
the code is useful for you [2].


[1] https://github.com/rubencasado/Flink-Kudu
[2] https://github.com/rubencasado/Flink-Kudu/blob/master/src/main/java/es/accenture/flink/Sources/KuduInputFormat.java
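
For readers following the quoted thread below, here is a minimal sketch of the two
phases Fabian describes: first generate splits, then let parallel readers pull one split
at a time. It is plain Java with no Flink dependency. TwoPhaseInputSketch, RowRangeSplit,
RangeReader, createSplits and readAll are hypothetical names used only for illustration,
and the shared queue is just a stand-in for Flink's InputSplitProvider. A real connector
would implement Flink's InputFormat interface instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustration only: mimics the two-phase model, it is NOT Flink's API. */
final class TwoPhaseInputSketch {

    /** Hypothetical split: a half-open row range [start, end) of a table. */
    public static final class RowRangeSplit {
        public final int start;
        public final int end;

        public RowRangeSplit(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Phase 1: cut the input into chunks that can be read independently. */
    public static List<RowRangeSplit> createSplits(int totalRows, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be >= 1");
        }
        List<RowRangeSplit> splits = new ArrayList<>();
        int chunk = (totalRows + numSplits - 1) / numSplits; // ceiling division
        for (int start = 0; start < totalRows; start += chunk) {
            splits.add(new RowRangeSplit(start, Math.min(start + chunk, totalRows)));
        }
        return splits;
    }

    /** Hypothetical reader: one instance per parallel task, one split at a time. */
    public static final class RangeReader {
        private final String[] table;
        private int pos;
        private int end;

        public RangeReader(String[] table) {
            this.table = table;
        }

        public void open(RowRangeSplit split) {
            pos = split.start;
            end = split.end;
        }

        public boolean reachedEnd() {
            return pos >= end;
        }

        public String nextRecord() {
            return table[pos++];
        }
    }

    /** Phase 2: parallel readers request splits from a shared queue until it is empty. */
    public static List<String> readAll(String[] table, int numSplits, int parallelism) {
        ConcurrentLinkedQueue<RowRangeSplit> pending =
                new ConcurrentLinkedQueue<>(createSplits(table.length, numSplits));
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            Thread worker = new Thread(() -> {
                RangeReader reader = new RangeReader(table);
                RowRangeSplit split;
                while ((split = pending.poll()) != null) { // ask for the next split
                    reader.open(split);
                    while (!reader.reachedEnd()) {
                        out.add(reader.nextRecord());
                    }
                }
            });
            worker.start();
            workers.add(worker);
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] table = {"r0", "r1", "r2", "r3", "r4", "r5", "r6"};
        List<String> rows = readAll(table, 3, 2);
        System.out.println(createSplits(table.length, 3).size() + " splits, "
                + rows.size() + " rows read");
    }
}
```

Every row is read exactly once, but the order of the rows depends on which reader picks
up which split, so the result should be treated as unordered.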

On 19/1/17 6:03, "Pawan Manishka Gunarathna" <pawan.manishka@gmail.com> wrote:

    When we implement the InputFormat interface, if our data analytics server
    APIs already provide the input split part, can we go directly to the
    second phase that you described earlier?

    Since our data source has a database-table architecture, I am thinking of
    following the 'JDBCInputFormat' in Flink. Can you provide some information
    on how JDBCInputFormat execution happens?


    On Mon, Jan 16, 2017 at 3:37 PM, Pawan Manishka Gunarathna <
    pawan.manishka@gmail.com> wrote:

    > Hi Fabian,
    > Thanks for providing that information.
    > On Mon, Jan 16, 2017 at 2:36 PM, Fabian Hueske <fhueske@gmail.com> wrote:
    >> Hi Pawan,
    >> this sounds like you need to implement a custom InputFormat [1].
    >> An InputFormat is basically executed in two phases. In the first phase it
    >> generates InputSplits. An InputSplit references a chunk of data that
    >> needs to be read. Hence, InputSplits define how the input data is split to
    >> be read in parallel. In the second phase, multiple InputFormat instances
    >> are started and request InputSplits from an InputSplitProvider. Each
    >> instance of the InputFormat processes one InputSplit at a time.
    >> It is hard to give general advice on implementing InputFormats because
    >> this very much depends on the data source and data format to read from.
    >> I'd suggest having a look at other InputFormats.
    >> Best, Fabian
    >> [1]
    >> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java
    >> 2017-01-16 6:18 GMT+01:00 Pawan Manishka Gunarathna <
    >> pawan.manishka@gmail.com>:
    >> > Hi,
    >> >
    >> > we have a data analytics server that has analytics data tables. So I need
    >> > to write a custom *Java* implementation to read data from that data source
    >> > and do processing (*batch* processing) using Apache Flink. Basically it's
    >> > like a new client connector for Flink.
    >> >
    >> > So it would be great if you could provide some guidance for my requirement.
    >> >
    >> > Thanks,
    >> > Pawan
    >> >
    > --
    > *Pawan Gunaratne*
    > *Mob: +94 770373556*


    *Pawan Gunaratne*
    *Mob: +94 770373556*

