samza-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jay Kreps <jay.kr...@gmail.com>
Subject Re: Question about stream partitions
Date Fri, 23 Aug 2013 21:24:47 GMT
Ah, one thing that is worth mentioning is that the way partitions are
handled changed between 0.7 and 0.8. What you describe sounds like the 0.7
partitioning behavior. Samza only supports 0.8. This change was actual made
to better support partitioned stream processing which requires hard
partition guarantees and highly available partitions (otherwise your
partitioning will change when the partition count changes as you add
machines or they die). As of 0.8 the number of partitions is set at topic
creation time and is independent of the number of servers (as Chris
describes).

Hopefully that helps explain.

-Jay


On Fri, Aug 23, 2013 at 12:42 PM, Mathias Herberts <
mathias.herberts@gmail.com> wrote:

> Hi,
>
> first of all kudos for putting Samza into the Apache Incubator, it's good
> to have yet another approach to stream processing.
>
> IIRC in a multi-node Kafka cluster (let's assume P nodes), a topic T with N
> partitions will have N partitions on each node, so the total number of
> partitions will be P*N.
>
> My question relates to the notion of partition in the Samza stream linked
> to T, will the Samza partition number be N or P*N ?
>
> Mathias.
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message