hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-895) Allow hflush/sync to occur in parallel with new writes to the file
Date Mon, 08 Nov 2010 21:00:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929734#action_12929734 ]

Todd Lipcon commented on HDFS-895:

Thanks for the review.

bq. Does this work with the heartbeat packet?

Lines 657-658 check for the heartbeat sequence number before we set lastAckedSeqno in ResponseProcessor.run(),
so heartbeat acks should be handled as before.
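
A minimal sketch of that ordering, for illustration only. HeartbeatAckTracker, onAck() and the value -1 for the heartbeat sequence number are assumptions made for this example, not the actual ResponseProcessor code:

```java
// Illustrative sketch only: this class and its members are hypothetical
// stand-ins and are not taken from the HDFS-895 patch.
final class HeartbeatAckTracker {
    // Assumption: heartbeat packets carry a reserved sequence number.
    static final long HEARTBEAT_SEQNO = -1L;

    // Sequence number of the last data packet acknowledged by the pipeline.
    private long lastAckedSeqno = -1L;

    // Called for every ack read from the pipeline.
    synchronized void onAck(long seqno) {
        if (seqno == HEARTBEAT_SEQNO) {
            // Heartbeat acks carry no data, so they must not move lastAckedSeqno.
            return;
        }
        lastAckedSeqno = seqno;
        // Wake any thread that is waiting for this sequence number.
        notifyAll();
    }

    synchronized long getLastAckedSeqno() {
        return lastAckedSeqno;
    }
}
```

Because the heartbeat check comes first, a thread waiting on a given seqno never observes lastAckedSeqno moving backwards when a heartbeat ack arrives.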

bq. dataQueue.wait(1000); is it possible to use dataQueue.wait();

Yes, I think that would also work. In theory, the timeout isn't necessary, but I've seen bugs
before where a slight race causes us to miss a notify. Perhaps we can switch the synchronized
(dataQueue) and the while (!closed) in this function and avoid the race? It seems to me that
waking up once a second just to be extra safe doesn't really harm us, but maybe it's a bit
of a band-aid.
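
One reading of that suggestion, as a rough sketch: take the monitor first and re-check the condition in a loop inside it, so an ack that arrives just before the wait is never lost. The class AckWaiter and its methods are hypothetical names for this example; only dataQueue, closed and lastAckedSeqno echo names from the discussion above.

```java
// Illustrative sketch only: not the DFSClient implementation.
final class AckWaiter {
    // Stand-in for the packet queue, used here only as the monitor object.
    private final Object dataQueue = new Object();
    private long lastAckedSeqno = -1L;
    private boolean closed = false;

    // Blocks until the given seqno has been acked or the stream is closed.
    void waitForAckedSeqno(long seqno) throws InterruptedException {
        synchronized (dataQueue) {
            // The condition is tested while holding the lock, so a notify
            // cannot slip in between the test and the wait. The timeout is
            // kept only as an extra safety net.
            while (!closed && lastAckedSeqno < seqno) {
                dataQueue.wait(1000);
            }
        }
    }

    // Called by the (hypothetical) response thread when an ack arrives.
    void ack(long seqno) {
        synchronized (dataQueue) {
            lastAckedSeqno = seqno;
            dataQueue.notifyAll();
        }
    }

    void close() {
        synchronized (dataQueue) {
            closed = true;
            dataQueue.notifyAll();
        }
    }

    long lastAcked() {
        synchronized (dataQueue) {
            return lastAckedSeqno;
        }
    }

    // Small demo: one thread acks seqno 3 while the caller waits for it.
    static boolean demo() {
        AckWaiter w = new AckWaiter();
        Thread acker = new Thread(() -> w.ack(3L));
        acker.start();
        try {
            w.waitForAckedSeqno(3L);
            acker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return w.lastAcked() >= 3L;
    }
}
```

With this shape the waiter returns correctly whether the ack lands before or after it enters the loop, and plain wait() would behave the same as wait(1000).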

bq. lines 1294-1299: when would this happen?

This is a bit subtle - it was one of the bugs in the original version of the patch. Here's
the sequence of events that it covers:
- write 10 bytes to a new file (i.e., no packet yet)
- call hflush()
-- it calls flushBuffer, so that it enqueues a new "packet"
-- it sends that packet - now we have no currentPacket, and the 10 bytes are still in the
checksum buffer, lastFlushOffset=10
- we call hflush() again without writing any more data
-- it calls flushBuffer, so it creates a new "packet" with the same 10 bytes
-- we notice that the bytesCurBlock is the same as lastFlushOffset, hence we don't want to
actually re-send this packet (there's no new data, so it's a no-op flush)
-- hence, we need to get rid of the packet and also decrement the sequence number so we don't
"skip" one

Without this fix, we were triggering 'assert seqno == lastAckedSeqno + 1;' when calling hflush()
twice without writing any data in between.
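
The sequence above can be modelled with a toy example. This is a simplified sketch, not the patch itself: FlushSeqnoModel and its methods are hypothetical, and the gap check is done at enqueue time here, whereas the real assert fires when the ack is processed.

```java
// Illustrative model only: mimics the seqno bookkeeping described above.
final class FlushSeqnoModel {
    private final boolean applyFix;     // true = give the seqno back on a no-op flush
    private long nextSeqno = 0L;        // seqno handed to the next packet created
    private long lastQueuedSeqno = -1L; // seqno of the last packet actually sent
    private long bytesCurBlock = 0L;    // bytes written to the current block
    private long lastFlushOffset = 0L;  // bytesCurBlock at the last real flush

    FlushSeqnoModel(boolean applyFix) {
        this.applyFix = applyFix;
    }

    void write(int numBytes) {
        bytesCurBlock += numBytes;
    }

    // Returns the seqno the caller would have to wait on.
    long hflush() {
        // Flushing the buffer always creates a new packet with a fresh seqno.
        long packetSeqno = nextSeqno++;
        if (bytesCurBlock != lastFlushOffset) {
            // New data since the last flush: send the packet.
            lastFlushOffset = bytesCurBlock;
            enqueue(packetSeqno);
        } else if (applyFix) {
            // No new data: drop the packet and give its seqno back.
            nextSeqno--;
        }
        // Without the fix the packet is dropped but its seqno stays consumed.
        return lastQueuedSeqno;
    }

    private void enqueue(long seqno) {
        if (seqno != lastQueuedSeqno + 1) {
            throw new IllegalStateException(
                "seqno gap: got " + seqno + " after " + lastQueuedSeqno);
        }
        lastQueuedSeqno = seqno;
    }
}
```

In this model, write(10), hflush(), hflush(), write(5), hflush() succeeds with the decrement in place. Without it, the third hflush() hits the gap check because one seqno was skipped by the no-op flush.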

> Allow hflush/sync to occur in parallel with new writes to the file
> ------------------------------------------------------------------
>                 Key: HDFS-895
>                 URL: https://issues.apache.org/jira/browse/HDFS-895
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>         Attachments: hdfs-895-0.20-append.txt, hdfs-895-20.txt, hdfs-895-review.txt,
hdfs-895-trunk.txt, hdfs-895.txt, hdfs-895.txt
> In the current trunk, the HDFS client methods writeChunk() and hflush/sync are synchronized.
This means that if an hflush/sync is in progress, an application cannot write data to the
HDFS client buffer. This reduces the write throughput of the transaction log in HBase.
> The hflush/sync should allow new writes to happen to the HDFS client even when an hflush/sync
is in progress. It can record the seqno of the message for which it should receive the ack,
indicate to the DataStream thread to start flushing those messages, exit the synchronized section,
and just wait for that ack to arrive.

