hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11882) Client fails if acknowledged size is greater than bytes sent
Date Fri, 01 Sep 2017 00:54:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149869#comment-16149869
] 

Kai Zheng commented on HDFS-11882:
----------------------------------

Thanks [~andrew.wang] for adding so many comments in the code; they are very helpful for understanding
the complex logic. Some minor comments below; please check whether they make sense.

1. How about {{waitCreatingNewStreams}} => {{waitCreatingStreamers}}, to be consistent with the existing {{checkStreamerUpdates}}?
2. "Get the acked file length" => "Get the length of the acked bytes in the block group";
"A full stripe is acked when at least numDataBlocks streamers have that cell" => "... streamers
have corresponding cells of the stripe". As for "Parity cells are the length of the longest
data cells", I didn't quite follow; could you clarify it a bit?
{code}
   /**
-   * Get the number of acked stripes. An acked stripe means at least data block
-   * number size cells of the stripe were acked.
+   * Get the acked file length.
+   *
+   * <p>
+   *   A full stripe is acked when at least numDataBlocks streamers have
+   *   that cell, and all previous full stripes are also acked. This enforces
+   *   the constraint that there is at most one partial stripe.
+   * </p>
+   * <p>
+   *   Partial stripes write all parity cells. Empty data cells are not written.
+   *   Parity cells are the length of the longest data cells.
+   *   To be considered acked, a partial stripe needs at least numDataBlocks
+   *   empty or written cells.
+   * </p>
+   * <p>
+   *   Currently, partial stripes can only happen when closing the file at a
+   *   non-stripe boundary, but this could also happen during (currently
+   *   unimplemented) hflush/hsync support.
+   * </p>
    */
-  private long getNumAckedStripes() {
-    int minStripeNum = Integer.MAX_VALUE;
+  private long getAckedLength() {
{code}
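
To make sure I read the javadoc correctly, here is a minimal standalone sketch of the acking rule as I understand it. It is not the code in the patch. The class name, the method signature and the {{ackedPerBlock}} array (acked bytes per internal block, data blocks first, then parity) are made up for illustration. It also assumes that an empty data cell only counts when its streamer has acked all previous full stripes, and it reads "the longest data cell" as the first data cell of the partial stripe. Both are my assumptions.

```java
import java.util.Arrays;

final class AckedLengthSketch {

  /**
   * Hypothetical helper: returns the number of data bytes of the block group
   * that can be considered acked.
   *
   * @param ackedPerBlock acked bytes per internal block (data first, then parity)
   * @param sentBytes     data bytes written to the block group so far
   */
  static long ackedLength(long[] ackedPerBlock, long sentBytes,
                          int numDataBlocks, int cellSize) {
    final long stripeDataSize = (long) numDataBlocks * cellSize;
    final long numFullStripes = sentBytes / stripeDataSize;

    // Full cells acked per block, capped at the number of full stripes sent.
    long[] fullCells = new long[ackedPerBlock.length];
    for (int i = 0; i < ackedPerBlock.length; i++) {
      fullCells[i] = Math.min(ackedPerBlock[i] / cellSize, numFullStripes);
    }
    // Acks are cumulative per block, so the numDataBlocks-th largest count is
    // the number of leading stripes held by at least numDataBlocks streamers.
    Arrays.sort(fullCells);
    final long ackedFullStripes = fullCells[fullCells.length - numDataBlocks];
    if (ackedFullStripes < numFullStripes) {
      return ackedFullStripes * stripeDataSize;
    }

    final long partialLength = sentBytes - numFullStripes * stripeDataSize;
    if (partialLength == 0) {
      return sentBytes; // the file ends on a stripe boundary
    }

    // Partial stripe: count cells that are fully acked or expected to be empty.
    final long base = numFullStripes * cellSize;
    // Assumption: parity cells are as long as the first (longest) data cell.
    final long parityCellLength = Math.min(cellSize, partialLength);
    int ackedCells = 0;
    for (int i = 0; i < ackedPerBlock.length; i++) {
      long expected = i < numDataBlocks
          ? Math.max(0, Math.min(cellSize, partialLength - (long) i * cellSize))
          : parityCellLength;
      if (ackedPerBlock[i] >= base + expected) {
        ackedCells++;
      }
    }
    return ackedCells >= numDataBlocks ? sentBytes : numFullStripes * stripeDataSize;
  }

  public static void main(String[] args) {
    // 3 data + 2 parity blocks, 4-byte cells, 2 full stripes plus 6 bytes.
    long[] acked = {12, 8, 8, 12, 8};
    System.out.println(ackedLength(acked, 30, 3, 4));
  }
}
```

In this sketch the result can never exceed {{sentBytes}}, which is the invariant the failing precondition in this issue appears to be about.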

May post more later today.

> Client fails if acknowledged size is greater than bytes sent
> ------------------------------------------------------------
>
>                 Key: HDFS-11882
>                 URL: https://issues.apache.org/jira/browse/HDFS-11882
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>            Reporter: Akira Ajisaka
>            Assignee: Andrew Wang
>            Priority: Critical
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11882.01.patch, HDFS-11882.02.patch, HDFS-11882.03.patch, HDFS-11882.04.patch, HDFS-11882.05.patch, HDFS-11882.regressiontest.patch
>
>
> Some erasure coding tests fail with the following exception. The test below was removed by HDFS-11823; however, this type of error can happen in a real cluster.
> {noformat}
> Running org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> Tests run: 14, Failures: 0, Errors: 1, Skipped: 10, Time elapsed: 89.086 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure
> testMultipleDatanodeFailure56(org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure) Time elapsed: 38.831 sec  <<< ERROR!
> java.lang.IllegalStateException: null
> 	at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> 	at org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780)
> 	at org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664)
> 	at org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> 	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> 	at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472)
> 	at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTestWithMultipleFailure(TestDFSStripedOutputStreamWithFailure.java:381)
> 	at org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testMultipleDatanodeFailure56(TestDFSStripedOutputStreamWithFailure.java:245)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
