hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Hlog Group Commit Question: SequenceFileLogReader
Date Tue, 26 Jan 2010 22:02:49 GMT
HBase 0.20 had a hack that would recognize the presence of Dhruba's
HDFS-200.  If it had been applied, then we'd do the open-for-append,
close, and reopen to recover edits written to an unclosed WAL/HLog
file (grep for 'syncfs' in HLog on the 0.20 branch).

In HBase TRUNK, the above hackery was stripped out.  In TRUNK we are
leaning on the new hflush/HDFS-265 rather than HDFS-200.  For hflush,
when we do FSDataInputStream::available(), it's returning the 'right'
answer (WALReaderFsDataInputStream::getPos() was added before an API
was available; HBASE-2069 is about using the new API instead of this
getPos fancy-dancing).
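
To make the getPos idea concrete, here is a minimal stand-alone sketch.
It is not the HBase code: the class and method names are hypothetical,
and it assumes a reader that computes its end offset as getPos() plus
the file length it was told, a length that can be stale for an unclosed
file. The first getPos() call returns an offset chosen so that this sum
lands on what available() reports.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; names are not taken from HBase.
final class GetPosSketch {

    /** Wraps a stream whose recorded length may lag behind the readable bytes. */
    static final class LengthFixingStream {
        private final InputStream in;
        private final long reportedLength; // possibly stale length from file metadata
        private boolean firstGetPos = true;
        private long pos = 0;

        LengthFixingStream(InputStream in, long reportedLength) {
            this.in = in;
            this.reportedLength = reportedLength;
        }

        /** First call returns an adjusted offset; later calls return the true position. */
        long getPos() throws IOException {
            if (firstGetPos) {
                firstGetPos = false;
                long available = in.available();
                // The reader adds reportedLength to this value, so subtract it here.
                return Math.max(0L, available - reportedLength);
            }
            return pos;
        }

        /** Reads one byte and tracks the real position. */
        int read() throws IOException {
            int b = in.read();
            if (b >= 0) {
                pos++;
            }
            return b;
        }
    }

    /** Mimics a reader that assumes end = current position + reported length. */
    static long computeEnd(LengthFixingStream s, long reportedLength) throws IOException {
        return s.getPos() + reportedLength;
    }
}
```

With 100 readable bytes and a stale reported length of 40, computeEnd()
comes out at 100 rather than 40, and later getPos() calls report the
real position again.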

It sounds like you need to do a bit of merging of TRUNK group commit
and the old hbase code that exploited HDFS-200?


On Tue, Jan 26, 2010 at 12:35 PM, Nicolas Spiegelberg
<nspiegelberg@facebook.com> wrote:
> Hi,
> I am trying to backport the HLog group commit functionality to Hbase
> 0.20.  For proper reliability, I am working with Dhruba to get the 0.21
> syncFs() changes from HDFS ported back to HDFS 0.20 as well.  When going
> through a peer review of the modified code, my group had a question
> about the SequenceFileLogReader.java (WALReader).  I am hoping that you
> guys could be of assistance.
> I know that there is an open issue [HBASE-2069] where Hlog::splitLog()
> does not call DFSDataInputStream::getVisibleLength(), which would
> properly sync hflushed, but unclosed, file lengths.  I believe the
> current workaround is to open an HDFS file in append mode & then close,
> which would cause the namenode to get updates from the datanodes.
> However, I don’t see that shim present in Hlog::splitLog() on the 0.21
> trunk.  Is this a pending issue to fix or is calling
> FSDataInputStream::available() within
> WALReaderFsDataInputStream::getPos() sufficient to force the namenode to
> sync up with the datanodes?
> Nicolas Spiegelberg
