accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1216) reconsider using SequenceFiles for the WAL
Date Thu, 28 Mar 2013 14:38:13 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616302#comment-13616302 ]

Keith Turner commented on ACCUMULO-1216:
----------------------------------------

bq. Say I have a 1GB WAL sequence file, could I treat it as two 500MB files and replay them
in parallel? I think the answer is no as the order of mutations applied is important, correct?

Right, the order is important.  Having a splittable file would allow you to sort those two
500MB chunks in parallel.  Then you could do a merged read of the two sorted chunks.
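
The sort-then-merge idea in this comment can be sketched roughly as follows. This is a minimal illustration only, not Accumulo's recovery code: the entry layout (sequence number, row, value), the in-memory list standing in for a file, and the names sort_key, sort_chunk, and replay are all assumptions made for the example. Each chunk is sorted independently by (row, sequence number), so per-row write order survives, and a streaming merge then reads the two sorted chunks as one ordered stream.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical log entries: (sequence number, row, value).
# This is NOT Accumulo's WAL format, only a stand-in for illustration.
log = [
    (0, "row_b", "v1"),
    (1, "row_a", "v1"),
    (2, "row_b", "v2"),
    (3, "row_c", "v1"),
    (4, "row_a", "v2"),
    (5, "row_b", "v3"),
]


def sort_key(entry):
    seq, row, _ = entry
    # Group by row, and keep the original write order within each row.
    return (row, seq)


def sort_chunk(chunk):
    return sorted(chunk, key=sort_key)


def replay(entries):
    """Apply entries in the given order; the last write to a row wins."""
    state = {}
    for _, row, value in entries:
        state[row] = value
    return state


# Split the log into two halves, standing in for two splits of one file.
mid = len(log) // 2
chunks = [log[:mid], log[mid:]]

# Sort each chunk independently, here on two worker threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    sorted_chunks = list(pool.map(sort_chunk, chunks))

# Merged read: a streaming merge of the two sorted chunks.
merged = list(heapq.merge(*sorted_chunks, key=sort_key))

parallel_state = replay(merged)
sequential_state = replay(log)
```

Because the sequence number is part of the sort key, replaying the merged stream gives the same final state as replaying the original log in order, even though the two chunks were sorted separately.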
                
> reconsider using SequenceFiles for the WAL
> ------------------------------------------
>
>                 Key: ACCUMULO-1216
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1216
>             Project: Accumulo
>          Issue Type: Task
>          Components: tserver
>    Affects Versions: 1.5.0
>            Reporter: Eric Newton
>            Priority: Minor
>
> Observing the code in HBase we learned that WAL files written/flushed to HDFS would *not*
> present the correct file size, which made using SequenceFile for WALs problematic.  So we
> just write Writables.  In a sequence.
> However, it would be nice to go back and use SequenceFile.
> It might be possible since learning how to properly close a file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
