hbase-issues mailing list archives

From "Eshcar Hillel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
Date Thu, 28 Jan 2016 13:04:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121385#comment-15121385 ]

Eshcar Hillel commented on HBASE-14918:
---------------------------------------

I reviewed the mslab-move patch. From a software-engineering standpoint, I am not at all
convinced that the HStore level is the right place for MSLAB.
The compacting memstore is an example in which cells are allocated at the memstore level
rather than at the store level.

But more important is what you say about off-heap memory. I have no experience with off-heap allocation.
Can you please elaborate on why the suggested design cannot be off-heap, and what is needed to
allow it to be off-heap?
In addition, you refer to the write path, but the write path actually goes through the mutable
segment, which stores the data in a CSLM (ConcurrentSkipListMap) structure.
Only reads and scans access the cell block.

It is good that we are having this discussion now, since it relates to the design of task #4
and can also affect task #3.
However, [~stack], is there anything that prevents committing the patch for task #1? Is it
not committed because of the MSLAB issue?
IMO, the MSLAB question is orthogonal to task #1. If it is decided that it needs to move, it is
possible to do so even after the patch is committed.

> In-Memory MemStore Flush and Compaction
> ---------------------------------------
>
>                 Key: HBASE-14918
>                 URL: https://issues.apache.org/jira/browse/HBASE-14918
>             Project: HBase
>          Issue Type: Umbrella
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>             Fix For: 0.98.18
>
>         Attachments: CellBlocksSegmentDesign.pdf, MSLABMove.patch
>
>
> A memstore serves as the in-memory component of a store unit, absorbing all updates to
the store. From time to time these updates are flushed to a file on disk, where they are compacted
(by eliminating redundancies) and compressed (i.e., written in a compressed format to reduce
their storage size).
> We aim to speed up data access, and therefore suggest applying an in-memory memstore flush,
that is, flushing the active in-memory segment into an intermediate buffer where it can still be
accessed by the application. Data in the buffer is subject to compaction and can be stored
in any format that takes up less space in RAM. The less space the buffer consumes,
the longer it can reside in memory before data is flushed to disk, resulting in better performance.
> Specifically, the optimization is beneficial for workloads with medium-to-high key churn
which incur many redundant cells, like persistent messaging. 
> We suggest structuring the solution as 4 subtasks (and, respectively, 4 patches):
> (1) Infrastructure - refactoring of the MemStore hierarchy, introducing the segment (StoreSegment)
as a first-class citizen, and decoupling the memstore scanner from the memstore implementation;
> (2) Adding a StoreServices facility at the region level to allow memstores to update region
counters and access the region-level synchronization mechanism;
> (3) Implementation of a new memstore (CompactingMemstore) with a non-optimized immutable
segment representation; and
> (4) Memory optimization, including a compressed format representation and off-heap allocations.
> This Jira continues the discussion in HBASE-13408.
> Design documents, evaluation results and previous patches can be found in HBASE-13408.
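
The in-memory flush described above can be illustrated with a minimal Java sketch. It is a toy model, not HBase code: the class ToyCompactingMemStore, its nested Key class, and all method names are hypothetical and exist only for illustration. It assumes timestamps only increase over time, so anything in the active segment is newer than the compacted snapshot, and it is not thread-safe. Writes go to a mutable ConcurrentSkipListMap; the in-memory flush folds that map into an immutable snapshot that keeps only the newest version of each row.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** Toy model of an in-memory flush with compaction. Not the HBase API; all names are hypothetical. */
final class ToyCompactingMemStore {

    /** Hypothetical cell key: row plus timestamp; within a row, the newest version sorts first. */
    static final class Key implements Comparable<Key> {
        final String row;
        final long ts;

        Key(String row, long ts) {
            this.row = row;
            this.ts = ts;
        }

        @Override
        public int compareTo(Key other) {
            int c = row.compareTo(other.row);
            return c != 0 ? c : Long.compare(other.ts, ts); // descending timestamp
        }
    }

    // Mutable active segment: the write path only touches this map.
    private ConcurrentSkipListMap<Key, String> active = new ConcurrentSkipListMap<>();

    // Compacted in-memory snapshot: one (latest) value per row.
    private TreeMap<String, String> compacted = new TreeMap<>();

    void put(String row, long ts, String value) {
        active.put(new Key(row, ts), value);
    }

    /** Folds the active segment into the snapshot, dropping all but the newest version per row. */
    void flushInMemory() {
        TreeMap<String, String> next = new TreeMap<>(compacted);
        String lastRow = null;
        for (Map.Entry<Key, String> e : active.entrySet()) {
            String row = e.getKey().row;
            if (!row.equals(lastRow)) {
                // First entry seen for a row is its newest version (see Key.compareTo).
                // Assumption: active data is newer than anything already in the snapshot.
                next.put(row, e.getValue());
                lastRow = row;
            }
        }
        compacted = next;
        active = new ConcurrentSkipListMap<>();
    }

    /** Reads consult the active segment first, then the compacted snapshot. */
    String get(String row) {
        // Key(row, Long.MAX_VALUE) sorts before every real version of this row.
        Map.Entry<Key, String> e = active.ceilingEntry(new Key(row, Long.MAX_VALUE));
        if (e != null && e.getKey().row.equals(row)) {
            return e.getValue();
        }
        return compacted.get(row);
    }

    int activeCells() {
        return active.size();
    }

    int compactedCells() {
        return compacted.size();
    }
}
```

With high key churn, many versions of the same row collapse into a single entry at flush time, which is why the snapshot can stay in RAM longer than the raw active segment would. A real implementation would also have to handle deletes, multiple retained versions, and concurrent readers, none of which this sketch attempts.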




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
