flink-user-zh mailing list archives

From "Yun Gao" <yungao...@aliyun.com.INVALID>
Subject Re: [DISCUSS] FLIP-115: Filesystem connector in Table
Date Fri, 13 Mar 2020 10:43:13 GMT
       Hi,
       Many thanks to Jingsong for bringing up this discussion! Enhancing the FileSystem
       connector in Table should greatly improve usability.

       I have the same question as Piotr. From my side, I think it would be better to
       reuse the existing StreamingFileSink. We have already begun enhancing the supported
       file formats (e.g., ORC, Avro...), and reusing StreamingFileSink should avoid
       repeating that work in the Table library. Besides, the bucket concept also seems
       to match the semantics of a partition.

       For the notification of adding partitions, I wonder whether the watermark mechanism
       alone is enough, since a Bucket/Partition might span multiple subtasks. It depends on
       the level of notification: if we want to notify for the bucket on each subtask, using
       the watermark to notify each subtask should be OK, but if we want to notify for the
       whole Bucket/Partition, we might also need some coordination between subtasks.
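
       To make the second case concrete, here is a minimal sketch of such coordination. It is
       only an illustration under assumptions: PartitionCommitCoordinator, add_partition and
       on_watermark are hypothetical names, not part of Flink or of the FLIP, and partitions
       are identified simply by their end timestamp. Each subtask reports its watermark, and
       a partition is announced only once the minimum watermark across all subtasks has
       passed the partition's end.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionCommitCoordinator:
    """Hypothetical coordinator for illustration; not a Flink API."""

    parallelism: int
    # Latest watermark reported by each subtask (subtask index -> watermark).
    watermarks: dict = field(default_factory=dict)
    # End timestamps of partitions that have not been announced yet.
    pending: set = field(default_factory=set)

    def add_partition(self, end_ts: int) -> None:
        """Register a partition that is complete once time passes end_ts."""
        self.pending.add(end_ts)

    def on_watermark(self, subtask: int, watermark: int) -> list:
        """Record one subtask's watermark; return partitions that are now complete."""
        # A subtask's watermark never moves backwards.
        previous = self.watermarks.get(subtask, watermark)
        self.watermarks[subtask] = max(previous, watermark)

        # Until every subtask has reported, no global decision is possible.
        if len(self.watermarks) < self.parallelism:
            return []

        # The whole partition is complete only when the slowest subtask has passed it.
        global_watermark = min(self.watermarks.values())
        done = sorted(p for p in self.pending if p <= global_watermark)
        self.pending.difference_update(done)
        return done


coordinator = PartitionCommitCoordinator(parallelism=2)
coordinator.add_partition(100)
coordinator.on_watermark(0, 150)   # subtask 1 has not reported yet
coordinator.on_watermark(1, 90)    # slowest subtask is still before 100
ready = coordinator.on_watermark(1, 120)  # both subtasks are now past 100
```

       With per-subtask notification, each subtask could act on its own watermark directly;
       the extra step here is taking the minimum across subtasks before notifying.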


     Best, 
      Yun




------------------------------------------------------------------
From:Piotr Nowojski <piotr@ververica.com>
Send Time:2020 Mar. 13 (Fri.) 18:03
To:dev <dev@flink.apache.org>
Cc:user <user@flink.apache.org>; user-zh <user-zh@flink.apache.org>
Subject:Re: [DISCUSS] FLIP-115: Filesystem connector in Table

Hi,

Which actual sinks/sources are you planning to use in this feature? Is it about exposing StreamingFileSink
in the Table API? Or do you want to implement new Sinks/Sources?

Piotrek

> On 13 Mar 2020, at 10:04, jinhai wang <jinhai.me@gmail.com> wrote:
> 
> Thanks for FLIP-115. It is a really useful feature for platform developers who manage
> hundreds of Flink-to-Hive jobs in production.
> I think we need to add 'connector.sink.username' for UserGroupInformation when data is
> written to HDFS.
> 
> 
>  On 2020/3/13 at 3:33 PM, "Jingsong Li" <jingsonglee0@gmail.com> wrote:
> 
>    Hi everyone,
> 
>    I'd like to start a discussion about FLIP-115 Filesystem connector in Table
>    [1].
>    This FLIP will bring:
>    - Introduce Filesystem table factory in table, support
>    csv/parquet/orc/json/avro formats.
>    - Introduce streaming filesystem/hive sink in table
> 
>    CC to user mail list, if you have any unmet needs, please feel free to
>    reply~
> 
>    Look forward to hearing from you.
> 
>    [1]
>    https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table
> 
>    Best,
>    Jingsong Lee
> 
> 
> 
