incubator-commits mailing list archives

Subject [15/29] incubator-rocketmq-site git commit: Post a blog-How to Support More Queues in RocketMQ.
Date Sat, 24 Dec 2016 06:56:29 GMT
Post a blog-How to Support More Queues in RocketMQ.


Branch: refs/heads/asf-site
Commit: 665dc7d5e839dab55465d3c1474746e4e6311004
Parents: 2041fbe
Author: yukon <>
Authored: Fri Dec 23 14:33:03 2016 +0800
Committer: yukon <>
Committed: Fri Dec 23 14:33:03 2016 +0800

---------------------------------------------------------------------- |  53 +++++++++++++++++++ |   2 -
 assets/images/blog/rocketmq-queues.png          | Bin 0 -> 46754 bytes
 3 files changed, 53 insertions(+), 2 deletions(-)
diff --git a/_posts/ b/_posts/
new file mode 100644
index 0000000..c413d30
--- /dev/null
+++ b/_posts/
@@ -0,0 +1,53 @@
+title: "How to Support More Queues in RocketMQ?"
+  - RocketMQ
+  - RocketMQ
+  - Queue
+  - Partition
+  - Message Oriented Middleware
+# Summary
+Kafka is a distributed streaming platform, which was born from [logging aggregation cases](
That use case does not need very high concurrency. In some large-scale cases at Alibaba, we found that
the original model could not meet our actual needs. So we developed a messaging
middleware named RocketMQ, which handles a broad set of use cases, ranging from traditional
publish/subscribe scenarios to demanding, high-volume, real-time transaction systems that tolerate
no message loss. Today at Alibaba, RocketMQ clusters process more than 500 billion events every
day and serve more than 3000 core applications.
+{% include toc %}
+# Partition design in Kafka
+1. The write parallelism of producers is bounded by the number of partitions.
+2. The consumption parallelism of consumers is also bounded by the number of partitions
being consumed. For example, if there are 20 partitions, the maximum consumption
concurrency is 20.
+3. Each topic consists of a fixed number of partitions. The number of partitions per topic determines
how many topics a single broker can support.
+For more details, please refer to [here](
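
To illustrate point 2 above, here is a minimal sketch of round-robin partition assignment. The function names `assign_partitions` and `active_consumers` are hypothetical and are not part of Kafka's API; the sketch only assumes that each partition is owned by exactly one consumer at a time.

```python
def assign_partitions(num_partitions, num_consumers):
    """Assign each partition to exactly one consumer, round-robin."""
    assignment = {consumer: [] for consumer in range(num_consumers)}
    for partition in range(num_partitions):
        assignment[partition % num_consumers].append(partition)
    return assignment


def active_consumers(assignment):
    """Count the consumers that own at least one partition."""
    return sum(1 for partitions in assignment.values() if partitions)


# 30 consumers compete for 20 partitions: only 20 of them get any work.
print(active_consumers(assign_partitions(20, 30)))  # prints 20
```

Adding consumers beyond the partition count leaves the extra consumers idle, which is why raising consumption parallelism requires more partitions.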
+## Why Kafka can't support more partitions
+1. Each partition stores its complete message data. Although writes to each partition are
sequential on disk, many partitions being written sequentially at the same time look like
random writes from the operating system's point of view.
+2. Because the data files are scattered, it is difficult to use the Linux I/O group commit mechanism.
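
The first point can be illustrated with a toy model. The sketch below assumes each file occupies its own contiguous disk region and that producers write to all files in turn; `count_seeks` is a hypothetical helper, and real file systems and I/O schedulers batch and reorder writes, so the numbers are illustrative only.

```python
def count_seeks(num_files, writes_per_file, record_size=1024, region_size=1 << 30):
    """Count writes that do not start where the previous write ended."""
    # Assumption: file f owns the disk region starting at f * region_size.
    next_offset = [f * region_size for f in range(num_files)]
    head = None  # disk position right after the previous write
    seeks = 0
    for _ in range(writes_per_file):
        for f in range(num_files):  # appends to all files are interleaved
            if head is not None and next_offset[f] != head:
                seeks += 1
            head = next_offset[f] + record_size
            next_offset[f] = head
    return seeks


print(count_seeks(num_files=1, writes_per_file=1000))   # prints 0
print(count_seeks(num_files=100, writes_per_file=10))   # prints 999
```

Both runs issue 1000 appends. With a single log every write continues where the last one ended, while with 100 files every write after the first lands in a different region.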
+# How to support more partitions in RocketMQ?
+1. All message data is stored in CommitLog files. Writes are fully sequential and reads are random.
+2. ConsumeQueue stores the actual consumption location information of users, and it is flushed
to disk sequentially.
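
The two points above can be sketched as a toy in-memory model: one shared append-only log plus a small index per queue. `MiniStore` and its record layout (a 4-byte length prefix followed by the body) are assumptions made for illustration and do not reflect RocketMQ's actual on-disk format.

```python
import struct


class MiniStore:
    """Toy model: one shared append-only log and a small index per queue."""

    def __init__(self):
        self.commit_log = bytearray()  # all message bodies, in arrival order
        self.consume_queues = {}       # (topic, queue_id) -> list of (offset, size)

    def put(self, topic, queue_id, body):
        """Append the body to the shared log and index it in its queue."""
        offset = len(self.commit_log)
        # Hypothetical record layout: 4-byte big-endian length, then the body.
        self.commit_log += struct.pack(">I", len(body)) + body
        self.consume_queues.setdefault((topic, queue_id), []).append((offset, len(body)))
        return offset

    def get(self, topic, queue_id, index):
        """Look up the index entry first, then read the body from the log."""
        offset, size = self.consume_queues[(topic, queue_id)][index]
        start = offset + 4  # skip the length prefix
        return bytes(self.commit_log[start:start + size])


store = MiniStore()
store.put("orders", 0, b"first")
store.put("orders", 1, b"second")
store.put("orders", 0, b"third")
print(store.get("orders", 0, 1))  # prints b'third'
```

Each queue costs only a list of small fixed-size entries, so adding queues does not add more files receiving message bodies.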
+> pros:
+1. Each consume queue holds a very small amount of data, so it is lightweight.
+2. Disk access is sequential, which avoids disk lock contention and does not incur high disk iowait
as more and more queues are created.
+> cons:
+1. Message consumption reads the ConsumeQueue first, then reads the CommitLog if not found. This
process adds some cost in the worst case.
+2. CommitLog and ConsumeQueue must be kept completely consistent, which increases the complexity
of the programming model.
+> Design Motivation:
+1. Random read. Read as much as possible to increase the PAGECACHE hit rate and reduce read
I/O operations, so more memory is still better. If a large backlog of messages accumulates,
will read performance fall badly? The answer is no, for the following reasons:
+	- [PAGECACHE prefetch](, even if the message is only 1K,
the system will read more data in advance, so the next read may hit memory.
+	- Random access to the CommitLog on disk. If the I/O scheduler is set to noop on an SSD, read QPS
will be much higher than with other scheduler algorithms.
+2. ConsumeQueue stores little information, mainly consumption locations, and it also
supports random reads. With PAGECACHE prefetch, its read performance stays close to that of
main memory, even when messages accumulate. In this particular case, ConsumeQueue
will not hinder read performance.
+3. CommitLog stores all the meta information, including the message data, similar to a database's redo log.
So as long as the CommitLog exists, the data can be recovered even if the ConsumeQueue data is lost.
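
The recovery idea in point 3 can be sketched as a single scan over the log that rebuilds every queue index. The record layout used here (topic length, queue id, body length, topic, body) and the helper names `append_record` and `rebuild_consume_queues` are hypothetical and chosen only to make the example self-contained.

```python
import struct

# Hypothetical header: topic length (2 bytes), queue id (4 bytes), body length (4 bytes).
HEADER = struct.Struct(">HiI")


def append_record(log, topic, queue_id, body):
    """Append one self-describing record to the log (a bytearray)."""
    topic_bytes = topic.encode("utf-8")
    log.extend(HEADER.pack(len(topic_bytes), queue_id, len(body)))
    log.extend(topic_bytes)
    log.extend(body)


def rebuild_consume_queues(log):
    """Scan the log from the start and rebuild (offset, size) entries per queue."""
    queues = {}
    pos = 0
    while pos < len(log):
        topic_len, queue_id, body_len = HEADER.unpack_from(log, pos)
        topic_start = pos + HEADER.size
        topic = bytes(log[topic_start:topic_start + topic_len]).decode("utf-8")
        queues.setdefault((topic, queue_id), []).append((pos, body_len))
        pos = topic_start + topic_len + body_len  # jump to the next record
    return queues


log = bytearray()
append_record(log, "t", 0, b"aa")
append_record(log, "t", 1, b"b")
append_record(log, "t", 0, b"ccc")
print(rebuild_consume_queues(log)[("t", 0)])  # prints [(0, 2), (25, 3)]
```

Because every record carries its own topic and queue id, the index is derived data: losing it costs a rescan, not the messages.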
diff --git a/_posts/ b/_posts/
index c39662a..02c0f5e 100644
--- a/_posts/
+++ b/_posts/
@@ -9,8 +9,6 @@ tags:
   - Maven
-# Preface
 This article has three parts. First, I will introduce the compatibility principle (more
details see [here]( briefly, followed
by a detailed discussion of Java component compatible dependency, including interface-oriented
programming, single-component signature protection, single-component compatibility protection,
and multi-component compile-time compatibility checking. Finally, there is a review and outlook, especially
regarding the **Dependency Mediator** project.
 {% include toc %}
diff --git a/assets/images/blog/rocketmq-queues.png b/assets/images/blog/rocketmq-queues.png
new file mode 100644
index 0000000..228f313
Binary files /dev/null and b/assets/images/blog/rocketmq-queues.png differ
