flink-user mailing list archives

From Dawid Wysakowicz <dwysakow...@apache.org>
Subject Re: Joining streamed data to reference data
Date Fri, 20 Jul 2018 11:22:16 GMT
Hi James,

1) Unfortunately, Flink does not support joining a DataSet with a
DataStream as of now. If the "batch" table is small enough, you might try
the solution suggested by Vino and load it in a UDTF. You can also try
implementing a streaming version of this table source yourself, using
org.apache.flink.table.sources.CsvTableSource and
org.apache.flink.orc.OrcRowInputFormat as examples.
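
To make the UDTF idea more concrete, here is a minimal sketch of the lookup
logic such a function would wrap. It is plain Java with no Flink dependencies,
and every name in it (LookupJoinSketch, innerJoin, leftOuterJoin, the sample
rows) is hypothetical. It assumes the reference table fits in memory and is
loaded once; in an actual job I would expect the loading to happen when the
table function is opened and the lookup to happen per input row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, NOT Flink API: the in-memory lookup a UDTF could perform.
final class LookupJoinSketch {

    // Small reference ("batch") table, keyed by the join key.
    private final Map<String, List<String>> reference = new HashMap<>();

    // Load the reference rows once; each row is assumed to be {key, value}.
    LookupJoinSketch(List<String[]> rows) {
        for (String[] row : rows) {
            reference.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
    }

    // Inner join: one output row per matching reference row, none without a match.
    List<String> innerJoin(String key, String payload) {
        List<String> out = new ArrayList<>();
        for (String ref : reference.getOrDefault(key, Collections.<String>emptyList())) {
            out.add(key + "," + payload + "," + ref);
        }
        return out;
    }

    // Left outer join: keep the stream record and pad with "null" if nothing matches.
    List<String> leftOuterJoin(String key, String payload) {
        List<String> out = innerJoin(key, payload);
        if (out.isEmpty()) {
            out.add(key + "," + payload + ",null");
        }
        return out;
    }

    public static void main(String[] args) {
        LookupJoinSketch join = new LookupJoinSketch(Arrays.asList(
                new String[] {"GB", "United Kingdom"},
                new String[] {"DE", "Germany"}));
        // Each call stands in for one record arriving on the stream.
        System.out.println(join.innerJoin("GB", "order-1"));
        System.out.println(join.leftOuterJoin("FR", "order-2"));
    }
}
```

The obvious limitation is that the reference data is a snapshot: it will not
pick up changes unless the function reloads it, which is why this only suits
small, slowly changing tables.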

2) Providing better out-of-the-box support for multiple sources and formats
is high on the roadmap for upcoming releases, so I would guess you can
expect ORC support in streaming in the near future.

Best,
Dawid

On Fri, 20 Jul 2018 at 11:59, vino yang <yanghua1127@gmail.com> wrote:

> Hi Porritt,
>
> Flink does not currently support joins between streaming and batch
> data; streaming and batch jobs are independent of each other.
>
> I guess your use case is a join between a stream and a dimension table?
> Unfortunately, it is not yet possible for the Flink SQL API to join a
> stream with a regular DataSet.
>
> 1)
> As a workaround, if the table is small, you can achieve an
> inner/left outer join with user-defined table functions:
>
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#joins
>
> 2)
> I have not seen any plans for this.
>
> Thanks, vino.
>
>
> 2018-07-20 17:29 GMT+08:00 Porritt, James <James.Porritt@uk.mlp.com>:
>
>> I was hoping to join a StreamTableSource to a BatchTableSource, but I
>> find it’s not simple. A couple of questions:
>>
>>
>>
>> 1) Other than just pushing the DataSet to a Kafka topic (either
>> internally or externally to the application) and reading it into a
>> DataStream, are there any means of doing the conversion?
>>
>> 2) Are there any plans to get OrcTableSource to be both
>> StreamTableSource and BatchTableSource instead of just a BatchTableSource?
>>
>>
>>
>> Thanks,
>>
>> James.
>> ######################################################################
>> The information contained in this communication is confidential and
>> intended only for the individual(s) named above. If you are not a named
>> addressee, please notify the sender immediately and delete this email
>> from your system and do not disclose the email or any part of it to any
>> person. The views expressed in this email are the views of the author
>> and do not necessarily represent the views of Millennium Capital Partners
>> LLP (MCP LLP) or any of its affiliates. Outgoing and incoming electronic
>> communications of MCP LLP and its affiliates, including telephone
>> communications, may be electronically archived and subject to review
>> and/or disclosure to someone other than the recipient. MCP LLP is
>> authorized and regulated by the Financial Conduct Authority. Millennium
>> Capital Partners LLP is a limited liability partnership registered in
>> England & Wales with number OC312897 and with its registered office at
>> 50 Berkeley Street, London, W1J 8HD.
>> ######################################################################
>>
>>
>
