kafka-dev mailing list archives

From "Jay Kreps (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-691) Fault tolerance broken with replication factor 1
Date Wed, 09 Jan 2013 16:18:13 GMT
Jay Kreps created KAFKA-691:
-------------------------------

             Summary: Fault tolerance broken with replication factor 1
                 Key: KAFKA-691
                 URL: https://issues.apache.org/jira/browse/KAFKA-691
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8
            Reporter: Jay Kreps


In 0.7, if a partition was down we would just send the message elsewhere. This meant that the
partitioning was really more of a "stickiness" than a hard guarantee, which made it impossible
to depend on it for partitioned, stateful processing.

In 0.8, when running with replication, this should not generally be a problem, as the partitions
are now highly available and fail over to other replicas. However, with replication
factor = 1, fault tolerance no longer really works for most cases, as a dead broker will now
give errors for the partitions on that broker.

I am not sure of the best fix. Intuitively, I think this is something that should be handled
by the Partitioner interface. However, the partitioner currently has no knowledge of which
nodes are available. So you could use a random partitioner, but that would keep going back
to the down node.
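
To make the idea concrete, here is a minimal sketch of what an availability-aware
partitioner could look like. It is only an illustration: the class name
AvailabilityAwarePartitioner, the partition() signature, and the "available" set are all
hypothetical and are not part of the existing Partitioner interface. It also assumes the
producer can somehow learn which partitions currently have a live broker. Note that the
fallback gives up the hard partitioning guarantee, much like the 0.7 "stickiness".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; NOT Kafka's Partitioner interface.
final class AvailabilityAwarePartitioner {
    private AvailabilityAwarePartitioner() {}

    /**
     * Picks a partition for the key, skipping partitions that are not available.
     *
     * @param key           message key (may be null)
     * @param numPartitions total number of partitions for the topic
     * @param available     partition ids assumed to have a live broker (hypothetical input)
     * @return the chosen partition id, or -1 if no partition is available
     */
    static int partition(Object key, int numPartitions, Set<Integer> available) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        int hash = Objects.hashCode(key);

        // Normal case: the hashed partition is up, so keep the key-to-partition mapping.
        int preferred = Math.floorMod(hash, numPartitions);
        if (available.contains(preferred)) {
            return preferred;
        }

        // Fallback: choose deterministically among the live partitions, so the same key
        // keeps landing on the same partition while the set of live partitions is stable.
        List<Integer> live = new ArrayList<>(new TreeSet<>(available));
        live.removeIf(p -> p < 0 || p >= numPartitions);
        if (live.isEmpty()) {
            return -1;
        }
        return live.get(Math.floorMod(hash, live.size()));
    }
}
```

A caller that needs strict partitioning for stateful processing would want to treat the
fallback case as an error instead of rerouting, so whether to fall back is probably best
left to the partitioner implementation.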



