kafka-jira mailing list archives

From "Matthias J. Sax (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-6144) Allow state stores to serve stale reads during rebalance
Date Sat, 04 Nov 2017 14:52:00 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax updated KAFKA-6144:
-----------------------------------
    Description: 
Currently, when expanding a Kafka Streams (KS) cluster, the new node's partitions are unavailable
during the rebalance. For large state stores this can take a very long time, and even for small
state stores more than a few ms of unavailability can be a deal breaker for microservice use cases.

One workaround is to allow stale data to be read from the state stores when the use case allows it.
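
A minimal sketch of what a stale-tolerant read could look like from the caller's side. Everything
here is an assumption made for illustration: StaleTolerantReader is a hypothetical helper, not part
of Kafka Streams, and IllegalStateException stands in for whatever the real store lookup throws
while the store is unavailable during a rebalance.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper (not a Kafka Streams class): remembers the last value seen per key. */
class StaleTolerantReader<K, V> {
    // Live lookup against the state store; assumed to throw IllegalStateException
    // while the store is unavailable (for example, during a rebalance).
    private final Function<K, V> liveLookup;
    // Last successfully read value per key, used only as a stale fallback.
    private final Map<K, V> lastSeen = new ConcurrentHashMap<>();

    StaleTolerantReader(Function<K, V> liveLookup) {
        this.liveLookup = liveLookup;
    }

    /** Returns the live value if the store answers; otherwise the last seen value if allowStale. */
    Optional<V> get(K key, boolean allowStale) {
        try {
            V value = liveLookup.apply(key);
            if (value != null) {
                lastSeen.put(key, value);   // refresh the fallback copy
            } else {
                lastSeen.remove(key);       // key no longer present in the store
            }
            return Optional.ofNullable(value);
        } catch (IllegalStateException storeUnavailable) {
            if (!allowStale) {
                throw storeUnavailable;     // caller requires a fresh read
            }
            return Optional.ofNullable(lastSeen.get(key));
        }
    }
}
```

The allowStale flag keeps the choice with the caller, matching the "when the use case allows it"
condition above. A real implementation would also need to bound the size and age of the fallback
data.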

Relates to KAFKA-6145 (Warm up new KS instances before migrating tasks), potentially via a
two-phase rebalance.


This is the description from KAFKA-6031 (keeping this JIRA as the title is more descriptive):

{quote}
Currently, reads for a key are served by a single replica, which has 2 drawbacks:
 - if the replica is down, there is downtime in serving reads for the keys it was responsible for
until a standby replica takes over
 - in the case of semantic partitioning, some replicas might become hot, and there is no easy way
to scale the read load

If standby replicas had endpoints exposed in StreamsMetadata, reads could be served from several
replicas, which would mitigate the above drawbacks.
Due to the lag between replicas, reading from multiple replicas simultaneously would give weaker
(eventual) consistency compared to reads from a single replica. This, however, should be an
acceptable tradeoff in many cases.
{quote}
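
The routing idea in the quoted description can be sketched as follows. This is only an
illustration under stated assumptions: ReplicaSelector and Replica are hypothetical types, the lag
value (how far a standby is behind the active replica) is assumed to be known to the caller, and
none of this reflects an existing StreamsMetadata API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical router: picks which replica should serve a read for a key. */
class ReplicaSelector {

    /** Hypothetical view of one replica that hosts a copy of the state store. */
    static final class Replica {
        final String host;
        final int port;
        final boolean active;     // true for the active replica, false for a standby
        final boolean reachable;  // result of some health check (assumed)
        final long lag;           // how far behind the active replica this copy is (assumed known)

        Replica(String host, int port, boolean active, boolean reachable, long lag) {
            this.host = host;
            this.port = port;
            this.active = active;
            this.reachable = reachable;
            this.lag = lag;
        }
    }

    /**
     * Prefers a reachable active replica. If none is reachable, falls back to the
     * least-lagging reachable standby whose lag does not exceed maxLag.
     */
    static Optional<Replica> choose(List<Replica> replicas, long maxLag) {
        Optional<Replica> active = replicas.stream()
                .filter(r -> r.active && r.reachable)
                .findFirst();
        if (active.isPresent()) {
            return active;
        }
        return replicas.stream()
                .filter(r -> !r.active && r.reachable && r.lag <= maxLag)
                .min(Comparator.comparingLong((Replica r) -> r.lag));
    }
}
```

This variant addresses only the first drawback (downtime while the active replica is down). To
spread load away from a hot replica, the selection could instead pick among all reachable replicas
within maxLag, at the cost of serving eventually consistent reads even when the active replica is
healthy.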

  was:
Currently when expanding the KS cluster, the new node's partitions will be unavailable during
the rebalance, which for large states can take a very long time, or for small state stores
even more than a few ms can be a deal breaker for micro service use cases.

One workaround is to allow stale data to be read from the state stores when use case allows.

Relates to KAFKA-6145 - Warm up new KS instances before migrating tasks - potentially a two
phase rebalance


This is the description from KAFKA-6031 (keeping this JIRA as the title is more descriptive):

{quote}
Currently reads for a key are served by single replica, which has 2 drawbacks:
 - if replica is down there is a down time in serving reads for keys it was responsible for
until a standby replica takes over
in case of semantic partitioning some replicas might become hot and there is no easy way to
scale the read load
If standby replicas would have endpoints that are exposed in StreamsMetadata it would enable
serving reads from several replicas, which would mitigate the above drawbacks. 
Due to the lag between replicas reading from multiple replicas simultaneously would have weaker
(eventual) consistency comparing to reads from single replica. This however should be acceptable
tradeoff in many cases.
{quote}


> Allow state stores to serve stale reads during rebalance
> --------------------------------------------------------
>
>                 Key: KAFKA-6144
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6144
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>            Reporter: Antony Stubbs
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
