hbase-issues mailing list archives

From "Jingcheng Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11339) HBase MOB
Date Fri, 20 Jun 2014 07:34:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038559#comment-14038559 ]

Jingcheng Du commented on HBASE-11339:

Thanks for the comments, [~jmhsieh] and [~ndimiduk]!

Does this mean the MOB file approach is not feasible?

bq. Why not just improve/use existing column family functionality and use a cf for lob/mob
fields? Couldn't we just do a combination of per-cf compaction and per-cf flushes (not sure
if all or some of those features are in already) and get to good performance while avoiding
write amplification penalties?
Do you mean saving the MOB data directly in HBase and using a different compaction policy for
the MOB cf? Compaction on a MOB cf inside HBase is costly, and it will probably delay flushes
and block updates. A large MOB store also leads to frequent region splits. All of these can
potentially affect HBase as a whole.

In the current design (described in the attached PDF), if users care more about write performance
than about consistency and replication, how about simply disabling the WAL? If users
want to keep the WAL enabled but do not want to write the data twice, they could write the MOB on the
client side (along the lines of Lars's suggestion). The scanner and sweep tool would still work in
that case, as long as the locator (reference) column follows the specified format.
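
As a rough illustration of the client-side option, here is a minimal sketch of what a locator (reference) value could look like if a client wrote the MOB to a file itself and stored only a pointer in HBase. The class name MobLocator, its fields, and the byte layout are all hypothetical; the actual format is defined in the design PDF and is not reproduced here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

/**
 * Hypothetical locator stored in the reference column.
 * Assumed layout (NOT the format from the design PDF):
 * [int nameLen][UTF-8 file name][long offset][int length]
 */
final class MobLocator {
    final String fileName; // name of the file holding the MOB bytes
    final long offset;     // start of the MOB within that file
    final int length;      // size of the MOB in bytes

    MobLocator(String fileName, long offset, int length) {
        this.fileName = Objects.requireNonNull(fileName);
        this.offset = offset;
        this.length = length;
    }

    /** Serializes the locator into the bytes a client would put in the cell. */
    byte[] encode() {
        byte[] name = fileName.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + name.length + 8 + 4);
        buf.putInt(name.length).put(name).putLong(offset).putInt(length);
        return buf.array();
    }

    /** Parses a cell value back into a locator, as a scanner or sweep tool would. */
    static MobLocator decode(byte[] cell) {
        ByteBuffer buf = ByteBuffer.wrap(cell);
        byte[] name = new byte[buf.getInt()];
        buf.get(name);
        long offset = buf.getLong();
        int length = buf.getInt();
        return new MobLocator(new String(name, StandardCharsets.UTF_8), offset, length);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MobLocator)) return false;
        MobLocator other = (MobLocator) o;
        return fileName.equals(other.fileName)
                && offset == other.offset
                && length == other.length;
    }

    @Override
    public int hashCode() {
        return Objects.hash(fileName, offset, length);
    }
}
```

The point of the sketch is only that any writer, server side or client side, has to produce byte-for-byte the same locator layout, since the scanner and sweep tool rely on parsing it.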

> HBase MOB
> ---------
>                 Key: HBASE-11339
>                 URL: https://issues.apache.org/jira/browse/HBASE-11339
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Scanners
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBase LOB Design.pdf
>   It's quite useful to save medium-sized binary data such as images and documents in Apache
HBase. Unfortunately, saving binary MOBs (medium objects) directly in HBase degrades
performance because of frequent splits and compactions.
>   In this design, the MOB data are stored in a more efficient way, which keeps
write/read performance high and guarantees data consistency in Apache HBase.
