hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2055) Serialize WAL as Avro records
Date Thu, 24 Dec 2009 09:27:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794366#action_12794366 ]

Andrew Purtell commented on HBASE-2055:

Sorry, above I meant SYNC_INTERVAL, not SYNC_SIZE. Also, it looks like the DataFileWriter as
implemented for AVRO-160 will hold up to SYNC_INTERVAL bytes in a buffer before writing out
the block. We want to hsync after each group of related commits in the WAL whether or not
SYNC_INTERVAL has been reached, but also have the stream marked with a sync marker at each
SYNC_INTERVAL. This is basically what my v3 or v4 patch does. It also writes a copy of the
schema just after the sync marker, so we have an opportunity to resynchronize a reader on each
block regardless of how many previous blocks are corrupt (perhaps all).
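
To make the layout concrete, here is a rough sketch of the idea in Python. It is not the patch and it does not use the Avro API. The byte framing, the fixed marker value, the tiny SYNC_INTERVAL and the names WalWriter, append_group and read_recovering are all assumptions made up for illustration, and os.fsync only stands in for hsync. A real marker would be random, so that it is unlikely to show up inside record data.

```python
import os
import struct

SYNC_INTERVAL = 64                   # bytes between markers; tiny so a demo spans several blocks
SYNC_MARKER = bytes(range(0xF0, 0x100))  # hypothetical fixed 16-byte marker

class WalWriter:
    """Hypothetical writer: marker + schema copy every SYNC_INTERVAL bytes, sync per commit group."""

    def __init__(self, out, schema):
        self.out = out
        self.schema = schema
        self.since_marker = 0
        self.syncs = 0
        self._write_marker()

    def _write_marker(self):
        # Sync marker, immediately followed by a length-prefixed copy of the schema.
        self.out.write(SYNC_MARKER)
        self.out.write(struct.pack(">I", len(self.schema)))
        self.out.write(self.schema)
        self.since_marker = 0

    def append_group(self, records):
        """Write one group of related commits, then sync whether or not SYNC_INTERVAL was reached."""
        for rec in records:
            if self.since_marker >= SYNC_INTERVAL:
                self._write_marker()
            framed = struct.pack(">I", len(rec)) + rec
            self.out.write(framed)
            self.since_marker += len(framed)
        self.out.flush()
        try:
            os.fsync(self.out.fileno())  # stand-in for hsync on a real file
        except (AttributeError, OSError):
            pass                         # in-memory streams have no file descriptor
        self.syncs += 1

def _parse_block(block):
    """Parse the bytes between two markers; raise ValueError if a length runs past the block."""
    def take(off, n):
        if off + n > len(block):
            raise ValueError("truncated or corrupt block")
        return block[off:off + n], off + n

    raw, off = take(0, 4)
    schema, off = take(off, struct.unpack(">I", raw)[0])
    records = []
    while off < len(block):
        raw, off = take(off, 4)
        rec, off = take(off, struct.unpack(">I", raw)[0])
        records.append(rec)
    return schema, records

def read_recovering(data):
    """Yield (schema, record) pairs, resynchronizing at the next marker after a bad block."""
    pos = data.find(SYNC_MARKER)
    while pos != -1:
        start = pos + len(SYNC_MARKER)
        nxt = data.find(SYNC_MARKER, start)
        block = data[start:nxt if nxt != -1 else len(data)]
        try:
            schema, records = _parse_block(block)
        except ValueError:
            schema, records = None, []   # drop the corrupt block entirely
        for rec in records:
            yield schema, rec
        pos = nxt
```

Because every block carries its own copy of the schema, the reader in this sketch needs nothing from earlier blocks: if the first block is damaged, it skips ahead to the next marker and carries on from there.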

> Serialize WAL as Avro records
> -----------------------------
>                 Key: HBASE-2055
>                 URL: https://issues.apache.org/jira/browse/HBASE-2055
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>            Priority: Minor
>         Attachments: HBASE-2055-v2.patch, HBASE-2055-v3.patch, HBASE-2055-v4.patch, HBASE-2055.patch,
> jackson-core-asl-1.0.1.jar, jackson-mapper-asl-1.0.1.jar, paranamer-1.5.jar, TEST-org.apache.hadoop.hbase.regionserver.wal.TestHLog.txt.gz,
> TEST-org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.txt.gz, TEST-org.apache.hadoop.hbase.TestFullLogReconstruction.txt.gz,
>
> There was some advocacy of using Avro for serialization of HBase WAL records up on hbase-dev@.
> The idea is that Hadoop core is getting away from Writables and Avro is the blessed replacement.
> I think we have these criteria for its use:
> 1) Performance of writing Avro records is no worse than that of writing Writables into
> a SequenceFile.
> 2) Space consumed by Avro serialization is no worse than that of Writables.
> 3) File format is amenable to appends (cannot require valid trailers, etc.).
> I'll put up a patch so we can try it out.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
