hbase-issues mailing list archives

From "Akashnil (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-6361) Change the compaction queue to a round robin scheduler
Date Tue, 10 Jul 2012 10:18:35 GMT

     [ https://issues.apache.org/jira/browse/HBASE-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Akashnil updated HBASE-6361:
----------------------------

    Description: 
Currently, compaction requests are submitted to the minor/major compaction queue of a region server by every column family/region belonging to it, and the requests are processed from the queue in FIFO (first-in, first-out) order. We want to replace the current queue-based scheduler with a lazy one. The idea of lazy scheduling is that it is better to defer a decision (here, compaction selection) until the moment the decision actually matters. Presently, when the queue is backed up, there is a delay between the compaction selection of a request and its execution. Instead, we can postpone compaction selection until the request is about to execute, when we will have more information (new flush files will have changed the store's state) and can make a better decision.

In place of the queue, we propose to implement a round-robin scheduler. All column families across all regions are visited in sequence, periodically. On each visit, if the column family generates a valid compaction request, that request is executed before moving on to the next one. We do not plan to change the current compaction selection algorithm for now; we expect it to automatically make better decisions simply because selection now happens just in time. How do we know that? Let us consider an example.
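The visit loop described above can be sketched as follows. This is a minimal sketch with hypothetical names (`round_robin_pass`, `select_all_flushes`, a dict-based store), not HBase's actual scheduler interfaces:

```python
MIN_FILES_FOR_COMPACTION = 4  # stand-in for the min-files threshold used below

def round_robin_pass(stores, select_compaction, execute_compaction):
    """Visit every column family (store) once, in order.

    Selection happens just in time: if a store currently has enough
    flush files, the request is selected and executed immediately,
    so no stale requests ever accumulate in a queue.
    """
    for store in stores:
        request = select_compaction(store)  # may return None
        if request is not None:
            execute_compaction(store, request)

def select_all_flushes(store):
    """Hypothetical selection policy: once at least
    MIN_FILES_FOR_COMPACTION flush files exist, compact all of them."""
    files = store["flush_files"]
    if len(files) >= MIN_FILES_FOR_COMPACTION:
        return list(files)
    return None
```

Because selection and execution happen in the same visit, the request always reflects the store's state at execution time.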

Note that the existing compaction queue only matters as a buffer: when flushes outpace compactions for a period of time, or a relatively large compaction takes a long time to complete, the queue accumulates requests. Suppose such a short-term bottleneck has occurred, and suppose min-files for compaction = 4. For an active column family, as new flushes are written, new compaction requests of size 4 each are added to the queue continuously until the queue starts processing them.

Now consider the round-robin scheduler. A bottleneck in compaction IO instead shows up as a longer latency before the same column family is visited again. When the same active column family is finally revisited after a long delay, suppose 16 new flush files have been written there. The selection algorithm will then issue one compaction request of size 16, as opposed to the 4 requests of size 4 that would have been generated in the queued case.

A compaction of 16 flush files is more IOPs-efficient than the same set of files compacted 4 at a time: both consume the same total reads and writes, but the former produces one file of size 16 where the latter produces 4 files of size 4. We therefore get the 4*4->16 compaction for free. With the queue, those four size-4 files would have cost additional IOPs to be merged into larger files later.
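The write-amplification arithmetic behind the 4*4->16 claim can be checked directly, assuming unit-sized flush files:

```python
FLUSH = 1  # size of one flush file, in arbitrary units

# Queued case: 16 flushes compacted 4 at a time.
queued_writes = 4 * (4 * FLUSH)   # four compactions, 4 units written each
queued_result = [4 * FLUSH] * 4   # four size-4 files remain

# Round-robin case: one compaction of all 16 flushes.
rr_writes = 16 * FLUSH            # one compaction, 16 units written
rr_result = [16 * FLUSH]          # a single size-16 file

# Same bytes written so far, but to reach one size-16 file the queued
# case still owes a further 4 -> 1 compaction that rewrites all 16 units.
extra_writes_owed = sum(queued_result)
```

Both paths have written 16 units, yet only the round-robin path already holds a single size-16 file; the queued path must pay 16 more units of writes to catch up.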

I ran some experiments in the simulator on how a queue bottleneck affects compaction selection. It appears that a filled-up queue makes all future compaction selections less and less IOPs-efficient, a runaway positive feedback loop that can sometimes make the compaction queue explode. The main effect of this change should be on bursty loads: when a bottleneck occurs, compaction selection becomes more IOPs-efficient rather than less efficient, giving negative feedback and an easier return to stability. As for monitoring, the compaction queue size will no longer be available as a metric; however, the number of files in each compaction will indicate whether a bottleneck has occurred.
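The feedback behaviour can be illustrated with a toy model (my own simplified sketch, not the simulator mentioned above): under a sustained burst, eager selection pins every request at min-files while the queue grows without bound, whereas just-in-time selection keeps no queue and the per-compaction file count rises instead, which is exactly the metric proposed for monitoring.

```python
def simulate(flushes_per_visit, visits, eager_queue):
    """Toy model of one store: before each scheduler visit,
    `flushes_per_visit` new flush files arrive; each visit has
    capacity to execute exactly one compaction."""
    MIN_FILES = 4
    backlog = 0         # flush files not yet covered by any request
    queue = []          # pending size-4 requests (queued case only)
    request_sizes = []  # files per executed compaction (the metric)
    for _ in range(visits):
        backlog += flushes_per_visit
        if eager_queue:
            # Eager selection: carve off size-4 requests immediately.
            while backlog >= MIN_FILES:
                queue.append(MIN_FILES)
                backlog -= MIN_FILES
            if queue:
                request_sizes.append(queue.pop(0))
        else:
            # Just-in-time selection: compact everything accumulated.
            if backlog >= MIN_FILES:
                request_sizes.append(backlog)
                backlog = 0
    return request_sizes, len(queue)
```

With 8 flushes per visit and capacity for only one compaction per visit, the eager queue executes size-4 requests forever while its length grows by one per visit; the just-in-time scheduler executes one size-8 request per visit with no queue at all.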

  was:
Currently the compaction requests are submitted to the minor/major compaction queue of a region server by every column family/region belonging to it. The queue processes those requests in FIFO (first-in, first-out) order. We want to replace the current scheduler with a lazy one. The idea of lazy scheduling is that it is better to defer a decision (compaction selection) until the moment it actually matters. Currently, if the queue grows large, newly generated requests are not processed until all preceding requests have executed. Instead, we can postpone compaction selection until the queue is empty, when we will have more information (new flush files will have affected the state) to make a better decision.

In place of the queue, we propose to implement a round-robin scheduler. All column families across all regions are visited in sequence, periodically. On each visit, if the column family generates a valid compaction request, that request is executed before moving on to the next one. We do not plan to change the current compaction selection algorithm for now; we expect it to automatically make better decisions simply because selection now happens just in time. How do we know that? Let us consider an example.

Note that the existing compaction queue only matters as a buffer: when flushes outpace compactions for a period of time, or a relatively large compaction takes a long time to complete, the queue accumulates requests. Suppose such a scenario has occurred, and suppose min-files for compaction = 4. For an active column family, new compaction requests of size 4 each are added to the queue continuously until the queue starts processing them.

Now consider a round-robin scheduler. A bottleneck in compaction IO instead shows up as a longer latency before the same column family is visited again. By that time, suppose there are 16 new flush files in this column family. The compaction selection algorithm will select one compaction request of size 16, as opposed to the 4 requests of size 4 that would have been generated in the previous case.

A compaction request with 16 flush files is more IOPs-efficient than the same set of files being compacted 4 at a time: both consume the same total reads and writes, but the former produces one file of size 16 where the latter produces 4 files of size 4. So we obtained the 4*4->16 compaction for free. With the queue, those smaller files would have consumed more IOPs to become bigger later.

Under a uniform steady-state load this change should make no difference, because the compaction queue would have been empty anyway. Under a bursty load, however, the scheduler automatically adapts to consume fewer IOPs in times of high flush rate. This negative feedback should mainly improve the failure resistance of the system. If something does go wrong, monitoring can still give feedback, not in the form of queue size, but via the number of files in each compaction, which goes up when a bottleneck occurs. If there are no important downsides, this should be a very good change, since it applies to all use cases.

    
> Change the compaction queue to a round robin scheduler
> ------------------------------------------------------
>
>                 Key: HBASE-6361
>                 URL: https://issues.apache.org/jira/browse/HBASE-6361
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Akashnil
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
