hbase-issues mailing list archives

From "Jingcheng Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11339) HBase LOB
Date Wed, 18 Jun 2014 09:51:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035526#comment-14035526 ]

Jingcheng Du commented on HBASE-11339:

In the current design, the LOB files are saved by date (for example, tableName/columnfamily/date/lobFileName),
which makes it easy to delete the LOB files that have expired under the TTL.
The date used in the path is the date on which the file was committed.

1. If the date of commit is used in the suggested way, we need to update the reference KVs after
the LOB files are committed (that is, after the file is renamed from the temp directory to the date directory).
If the MemStore flush fails while the LOB file commit succeeds, the date of commit is
lost when the WALEdits are replayed, and the LOB data can no longer be connected to its reference KV in HBase.
2. If we don't save LOB files by date, all the LOB files for a column family are saved together.
Expired LOB files then become difficult to delete (although a sweep tool could delete them instead).
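
To make the date-based layout concrete, here is a minimal sketch of how a path of the form
tableName/columnfamily/date/lobFileName could be built, and how whole date directories could be
picked for deletion once they are older than the TTL. Everything in it is an assumption for
illustration: the class name LobPathSketch, the methods lobPath and expiredDateDirs, the yyyyMMdd
directory format, and a TTL measured in days are hypothetical and are not taken from the design
document or from HBase itself.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: illustrates the date-partitioned layout only.
class LobPathSketch {
    // Assumed directory name format (yyyyMMdd); the original message does not specify one.
    private static final DateTimeFormatter DATE_DIR = DateTimeFormatter.BASIC_ISO_DATE;

    // Builds tableName/columnfamily/date/lobFileName from the date of commit.
    static String lobPath(String table, String family, LocalDate commitDate, String fileName) {
        return table + "/" + family + "/" + commitDate.format(DATE_DIR) + "/" + fileName;
    }

    // Returns the date directories that are strictly older than (today - ttlDays).
    // Because files are grouped by date, expiry works on whole directories
    // rather than on individual LOB files.
    static List<String> expiredDateDirs(List<String> dateDirs, LocalDate today, int ttlDays) {
        LocalDate cutoff = today.minusDays(ttlDays);
        List<String> expired = new ArrayList<>();
        for (String dir : dateDirs) {
            if (LocalDate.parse(dir, DATE_DIR).isBefore(cutoff)) {
                expired.add(dir);
            }
        }
        return expired;
    }
}
```

With this layout, expiring data only requires comparing directory names against a cutoff date.
Without the date level (point 2 above), each file in the column family directory would have to be
checked individually.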

> HBase LOB
> ---------
>                 Key: HBASE-11339
>                 URL: https://issues.apache.org/jira/browse/HBASE-11339
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Scanners
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBase LOB Design.pdf
>   It is quite useful to store massive binary data such as images and documents in Apache
> HBase. Unfortunately, saving binary LOBs (large objects) directly to HBase leads to worse
> performance because of frequent splits and compactions.
>   In this design, the LOB data are stored in a more efficient way, which keeps
> write/read performance high and guarantees data consistency in Apache HBase.

This message was sent by Atlassian JIRA
