flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From tzulitai <...@git.apache.org>
Subject [GitHub] flink pull request #5108: [FLINK-8181] [kafka] Make FlinkFixedPartitioner in...
Date Fri, 01 Dec 2017 15:13:30 GMT
Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5108#discussion_r154368266
  
    --- Diff: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/partitioner/FlinkFixedPartitioner.java
---
    @@ -68,6 +78,13 @@ public int partition(T record, byte[] key, byte[] value, String targetTopic,
int
     			partitions != null && partitions.length > 0,
     			"Partitions of the target topic is empty.");
     
    -		return partitions[parallelInstanceId % partitions.length];
    +		if (topicToFixedPartition.containsKey(targetTopic)) {
    --- End diff --
    
    @aljoscha yes, the semantics is a bit odd / needs some clarification before we move on.
I've been having a go at implementing state checkpointing for the `FlinkFixedPartitioner`
today, and for example one unclear case I bumped into was the following:
    Subtask 1 writes to partition X for "some-topic"
    Subtask 2 writes to partition Y for "some-topic"
    On restore and say the sink is rescaled to DOP of 1, should the single subtask continue
writing to partition X or Y for "some-topic"?
    
    Regarding the default Kafka behaviour:
    It's hash partitioning on the attached key for the records. I've also thought about using
that as the default instead of the fixed partitioner; see the relevant discussion here: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FlinkKafkaProducerXX-td16951.html


---

Mime
View raw message