hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6597) Block Encoding Size Estimation
Date Mon, 29 Oct 2012 00:03:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485756#comment-13485756 ]

Ted Yu commented on HBASE-6597:
-------------------------------

Update on my recent findings.
I came up with a patch for the 0.94 branch.
Most of the data block encoding tests pass.
TestHFileBlockCompatibility poses a small challenge. The 0.89-fb branch has no embedded
checksum feature, so this test is unique to 0.94 / trunk.
The test contains a copy of the Writer class, which I assume shouldn't be modified, at least
not for a point release.
The test also reuses some code from TestHFileBlock.java, where the usage of Writer has
changed:
{code}
- static int writeTestKeyValues(OutputStream dos, int seed, boolean includesMemstoreTS)
+ static void writeTestKeyValues(OutputStream dos, Writer hbw, int seed, boolean includesMemstoreTS)
{code}
This is the test failure I am observing now:
{code}
testDataBlockEncoding[0](org.apache.hadoop.hbase.io.hfile.TestHFileBlockCompatibility)  Time elapsed: 0.129 sec  <<< FAILURE!
org.junit.ComparisonFailure: Content mismath for compression NONE, encoding PREFIX, pread false, commonPrefix 2, expected length 1859, actual length 1859 expected:<\x00\x00\x0[B\xB8]*\x0A\x00\x00\x0A\x0...> but was:<\x00\x00\x0[0\x00]*\x0A\x00\x00\x0A\x0...>
  at org.junit.Assert.assertEquals(Assert.java:125)
  at org.apache.hadoop.hbase.io.hfile.TestHFileBlock.assertBuffersEqual(TestHFileBlock.java:463)
  at org.apache.hadoop.hbase.io.hfile.TestHFileBlockCompatibility.testDataBlockEncoding(TestHFileBlockCompatibility.java:337)
{code}
                
> Block Encoding Size Estimation
> ------------------------------
>
>                 Key: HBASE-6597
>                 URL: https://issues.apache.org/jira/browse/HBASE-6597
>             Project: HBase
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.89-fb
>            Reporter: Brian Nixon
>            Assignee: Mikhail Bautin
>            Priority: Minor
>         Attachments: 6597-trunk.txt, D5895.1.patch, D5895.2.patch, D5895.3.patch, D5895.4.patch, D5895.5.patch
>
>
> Block boundaries as created by the current writers are determined by the size of the
> unencoded data. However, blocks in memory are kept encoded. By using an estimate for the
> encoded size of the block, we can get greater consistency in size.
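
The idea in the issue description can be illustrated with a minimal sketch. This is not the code from the attached patches and not the HBase writer API: the class `EncodedSizeEstimator`, its methods, and the running-ratio heuristic are all assumptions made for illustration. The sketch tracks how well previously finished blocks compressed under encoding and uses that ratio to decide when the current block's estimated encoded size has reached the target.

```java
/**
 * Hypothetical sketch only: names and heuristic are assumptions,
 * not taken from HBase or from the patches attached to HBASE-6597.
 */
final class EncodedSizeEstimator {
    private final int targetBlockSize;   // desired encoded block size in bytes
    private long totalUnencoded = 0;     // unencoded bytes of all finished blocks
    private long totalEncoded = 0;       // encoded bytes of all finished blocks
    private long currentUnencoded = 0;   // unencoded bytes in the block being built

    EncodedSizeEstimator(int targetBlockSize) {
        this.targetBlockSize = targetBlockSize;
    }

    /** Observed encoded/unencoded ratio; 1.0 until a block has been finished. */
    double ratio() {
        return totalUnencoded == 0 ? 1.0 : (double) totalEncoded / totalUnencoded;
    }

    /**
     * Accounts for one appended cell and reports whether the block should be
     * closed because its estimated encoded size has reached the target.
     */
    boolean append(int unencodedCellSize) {
        currentUnencoded += unencodedCellSize;
        return currentUnencoded * ratio() >= targetBlockSize;
    }

    /** Records the real encoded size of the block that was just closed. */
    void finishBlock(long actualEncodedSize) {
        totalUnencoded += currentUnencoded;
        totalEncoded += actualEncodedSize;
        currentUnencoded = 0;
    }
}
```

With no history the estimator falls back to the unencoded size, which matches the behaviour the description attributes to the current writers. Once a block has been finished, later boundaries move according to the observed ratio.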

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
