hbase-issues mailing list archives

From "Phabricator (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6597) Block Encoding Size Estimation
Date Mon, 08 Oct 2012 19:16:03 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471784#comment-13471784 ]

Phabricator commented on HBASE-6597:
------------------------------------

mbautin has commented on the revision "[jira] [HBASE-6597] [89-fb] Incremental data block
encoding".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:76 includesMemstoreTS
means that the memstore timestamp is part of both the input and the output. We don't change that
aspect of the data format during data block encoding/decoding.
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:119 Added
an assertion to BufferedEncodedWriter. The code below won't make currentState null if it is
not null initially.
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:447 As you
can see, the DataBlockEncoder class does not have a lot of state (unlike the EncodedWriter)
so I don't know what else I could include here.
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java:96 Oops. That was for
debugging. Good catch!

REVISION DETAIL
  https://reviews.facebook.net/D5895

To: Kannan, Karthik, Liyin, aaiyer, avf, JIRA, mbautin
Cc: tedyu

                
> Block Encoding Size Estimation
> ------------------------------
>
>                 Key: HBASE-6597
>                 URL: https://issues.apache.org/jira/browse/HBASE-6597
>             Project: HBase
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.89-fb
>            Reporter: Brian Nixon
>            Assignee: Mikhail Bautin
>            Priority: Minor
>         Attachments: D5895.1.patch, D5895.2.patch, D5895.3.patch
>
>
> Block boundaries as created by current writers are determined by the size of the unencoded
data. However, blocks in memory are kept encoded. By using an estimate for the encoded size
of the block, we can get greater consistency in the size of blocks held in memory.
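
The idea in the description can be illustrated with a minimal sketch. It is not taken from the D5895 patch, and every name in it (EncodedSizeEstimator, append, finishBlock) is hypothetical rather than part of the HBase API. It assumes the writer remembers the encoded/unencoded ratio of the last finished block and uses it to predict when the current block reaches the target encoded size.

```java
/**
 * Hypothetical sketch, not HBase code: decides where to end a block based on
 * an estimate of its encoded size instead of its unencoded size.
 */
final class EncodedSizeEstimator {
    private final int targetEncodedBlockSize;
    // Assumption: the ratio observed on the previous block predicts the next one.
    // Starts at 1.0, i.e. "encoding saves nothing", until a block has been finished.
    private double encodedToUnencodedRatio = 1.0;
    private long unencodedBytesInBlock = 0;

    EncodedSizeEstimator(int targetEncodedBlockSize) {
        this.targetEncodedBlockSize = targetEncodedBlockSize;
    }

    /** Records one key/value; returns true if the block should be closed now. */
    boolean append(int unencodedSize) {
        unencodedBytesInBlock += unencodedSize;
        double estimatedEncodedSize = unencodedBytesInBlock * encodedToUnencodedRatio;
        return estimatedEncodedSize >= targetEncodedBlockSize;
    }

    /** Called after the block is encoded; refreshes the ratio and resets the counter. */
    void finishBlock(long actualEncodedSize) {
        if (unencodedBytesInBlock > 0) {
            encodedToUnencodedRatio = (double) actualEncodedSize / unencodedBytesInBlock;
        }
        unencodedBytesInBlock = 0;
    }
}
```

With this scheme, a data set that encodes to half its original size would produce blocks holding roughly twice as much unencoded data, so the encoded blocks kept in memory stay close to the target size. Reusing the previous block's ratio is only one possible estimator.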

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
