hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
Date Tue, 11 Aug 2015 09:55:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681550#comment-14681550

Rakesh R commented on HDFS-8287:

Thanks [~kaisasak], nice work. I've few comments, please take a look at it:
# Please close the {{parityGenerator}} daemon thread when closing the DFSStripedOutputStream.
Here, please make sure the writeParityCellsForLastStripe() gets a fair chance to finish.
# Few queries about {{synchronized (bufferQueue)}} statement/block. Is that your intention
is to block the {{cellBufferses(currentIndex)}} till writeParity() function is completed,
considering that encoder will take more time than next {{ParityGenerator#enqueue}} arrives.
If yes, then I think we should make {{ParityGenerator#enqueue()}} function also synchronized
on {{bufferQueue}}. If not, then will wrap only {{bufferQueue.wait();}}, right?
# Please pass the exception object as argument. This will help to get the complete trace.
LOG.warn("Caught exception " + e);
modify to 
LOG.warn("Caught exception ", e);
# Presently {{DFSStripedOutputStream#writeParity}} method have package/default visibility.
With this change {{ParityGenerator}} is consuming the buffer, its good to limit this function
accessibility to private.
  void writeParity(int index, ByteBuffer buffer, byte[] checksumBuf
      ) throws IOException {
# Can we make {{class ParityGeneratorEntity}} and {{class ParityGenerator}} private?
# IMHO, keep the var name {{cellBuffers[]}} instead of {{cellBufferses[]}}. Probably we can
add comment saying that 'Uses double buffering. In the beginning, client writes data to the
first buffer. When the first cell stripe is full, client continues writing to the second buffer.
Now, the ParityGenerator will pick the first buffer and writes parity, then releases the first
buffer so that when the second buffer is full, the client can continue write to the first
buffer again.'
# For {{ParityGenerator#run}} thread, its good to catch try/throwable or use {{java.lang.Thread.UncaughtExceptionHandler}}.
Improves debuggability to know the Thread abruptly terminates due to an uncaught exception.

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> ---------------------------------------------------------------------
>                 Key: HDFS-8287
>                 URL: https://issues.apache.org/jira/browse/HDFS-8287
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Kai Sasaki
>         Attachments: HDFS-8287-HDFS-7285.00.patch, HDFS-8287-HDFS-7285.01.patch
> When a stripping cell is full, writeChunk computes and generates parity packets.  It
sequentially calls waitAndQueuePacket so that user client cannot continue to write data until
it finishes.
> We should allow user client to continue writing instead but not blocking it when writing

This message was sent by Atlassian JIRA

View raw message