flink-user mailing list archives

From "Tzu-Li (Gordon) Tai" <tzuli...@gmail.com>
Subject Re: API request to submit job takes over 1hr
Date Tue, 14 Jun 2016 03:57:30 GMT
Hi Shannon,

Thanks for investigating the issue and for filing the JIRA. There's actually
an earlier JIRA for this problem already:
https://issues.apache.org/jira/browse/FLINK-4023. Would you be OK with
tracking this issue on FLINK-4023 and closing FLINK-4069 as a duplicate?
As you can see, I've also added a link to FLINK-4069 on FLINK-4023 so that
your additional info on the problem is referenced there.

A little help with answering your last questions:
1. We're doing the partition distribution across consumers ourselves: the
Kafka consumer connector creates a Kafka client on the subtasks, and each
subtask independently determines which partitions it should be in charge of.
There's more information on this in the last FAQ section of this blog post:
http://data-artisans.com/kafka-flink-a-practical-how-to/. As Robert has
mentioned, the consumer currently depends on the same fixed, ordered list of
partitions being sent to all subtasks, so that each of them always
determines the same set of partitions to fetch from across restarts.
2. Following the above description, the consumer currently subscribes only
to the fixed partition list queried in the constructor. So at the moment the
Flink Kafka consumer doesn't handle repartitioning of topics, but it's
definitely on the todo list for the Kafka connector and won't be too hard to
implement once querying in the consumer is resolved (perhaps Robert can
clarify this a bit more).
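
To illustrate point 1, here is a minimal sketch of how each subtask could pick its own partitions from a shared, fixed-order list without any coordination. It assumes a simple round-robin rule by list position. The class name PartitionAssignmentSketch and the method assign are hypothetical names for illustration, and this is not necessarily the exact logic in the connector.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the connector's actual class.
final class PartitionAssignmentSketch {

    // Returns the partitions that the subtask with the given index would own.
    // Assumption: every subtask receives the same list in the same order,
    // and ownership is decided by position modulo the number of subtasks.
    static List<Integer> assign(List<Integer> orderedPartitions,
                                int numSubtasks,
                                int subtaskIndex) {
        if (numSubtasks <= 0 || subtaskIndex < 0 || subtaskIndex >= numSubtasks) {
            throw new IllegalArgumentException("invalid subtask index or parallelism");
        }
        List<Integer> mine = new ArrayList<>();
        for (int i = 0; i < orderedPartitions.size(); i++) {
            // Same input list + same rule => same result after every restart.
            if (i % numSubtasks == subtaskIndex) {
                mine.add(orderedPartitions.get(i));
            }
        }
        return mine;
    }
}
```

With partitions [0, 1, 2, 3, 4] and two subtasks, subtask 0 would take [0, 2, 4] and subtask 1 would take [1, 3]. The result depends only on the list order and the parallelism, which is why a list that changed between restarts (for example after repartitioning a topic) would break the assignment.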

Best,
Gordon



