hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files
Date Fri, 10 Apr 2015 01:49:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488742#comment-14488742 ]

Kai Zheng commented on HDFS-7889:
---------------------------------

Thanks Bo for the hard work and Zhe for the great review!
Looking at the latest patch, here are some comments so far:
1. Regarding the code below, is it safe? Is it possible that some cluster would rather use
replication 1 for some file, without the striping layout? Should we declare the layout
explicitly? Maybe we could have {{isStripped}} in stat?
{code}
+      if(stat.getReplication() == 0) {
+        out = new DFSStripedOutputStream(dfsClient, src, stat,
+            flag, progress, checksum, favoredNodes);
+      }
{code}
2. Regarding the code below: 1) should {{cellSize}} be HdfsConstants.BLOCK_STRIPED_CELL_SIZE?
2) should {{blockGroupSize}} be renamed to {{blockGroupBlocks}}? 3) which variable does the
comment "bytes written in current block group" describe?
{code}
+  private int cellSize = 64 * 1024;
+  private ByteBuffer[] cellBuffers;
+  private final short blockGroupSize = HdfsConstants.NUM_DATA_BLOCKS
+      + HdfsConstants.NUM_PARITY_BLOCKS;
+  private final short blockGroupDataBlocks = HdfsConstants.NUM_DATA_BLOCKS;
+  private int curIdx = 0;
+  /* bytes written in current block group */
+  private int lastStripeLen = 0;
{code}
3. Regarding the code below: why do we initialize {{cellBuffers[i]}} twice?
{code}
+    for (int i = 0; i < blockGroupSize; i++) {
+      stripeBlocks.add(new LinkedBlockingQueue<LocatedBlock>(blockGroupSize));
+      try {
+        cellBuffers[i] = ByteBuffer.wrap(byteArrayManager.newByteArray(cellSize));
+      } catch (InterruptedException ie) {
...
+      }
+      cellBuffers[i] = ByteBuffer.allocate(cellSize);
+    }
{code}
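
On point 1 above, here is a minimal sketch of what an explicit layout flag could look like, so that the choice of stream does not hinge on {{stat.getReplication() == 0}}. The classes {{FileStatusSketch}} and {{OutputStreamChooser}} and the accessor {{isStriped()}} are hypothetical stand-ins for illustration, not the real HDFS types or anything in the patch:

```java
// Hypothetical stand-in for the file status object; NOT the real HDFS class.
final class FileStatusSketch {
  private final short replication;
  private final boolean striped; // explicit layout flag (assumption for this sketch)

  FileStatusSketch(short replication, boolean striped) {
    this.replication = replication;
    this.striped = striped;
  }

  short getReplication() {
    return replication;
  }

  boolean isStriped() {
    return striped;
  }
}

// Hypothetical helper that picks the stream kind from the explicit flag only.
final class OutputStreamChooser {
  enum Kind { CONTIGUOUS, STRIPED }

  static Kind choose(FileStatusSketch stat) {
    // The replication factor is deliberately ignored here: the layout is
    // declared by the flag, not inferred from a special replication value.
    return stat.isStriped() ? Kind.STRIPED : Kind.CONTIGUOUS;
  }
}
```

With a flag like this, a file with any replication value stays on the normal write path unless it is explicitly marked as striped.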
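
On point 3 above, here is one way each {{cellBuffers[i]}} could be assigned exactly once. This is only a sketch: the {{ArraySource}} interface is a hypothetical stand-in for the patch's byteArrayManager, and falling back to a plain heap buffer on interruption is my assumption, since the catch body is elided in the snippet above:

```java
import java.nio.ByteBuffer;

final class CellBufferInitSketch {
  /** Hypothetical stand-in for the patch's byteArrayManager. */
  interface ArraySource {
    byte[] newByteArray(int size) throws InterruptedException;
  }

  static ByteBuffer[] initCellBuffers(ArraySource source, int count, int cellSize) {
    ByteBuffer[] cellBuffers = new ByteBuffer[count];
    for (int i = 0; i < count; i++) {
      ByteBuffer buf;
      try {
        // Preferred path: wrap an array obtained from the manager.
        buf = ByteBuffer.wrap(source.newByteArray(cellSize));
      } catch (InterruptedException ie) {
        // Assumed fallback: keep the interrupt status and allocate directly.
        Thread.currentThread().interrupt();
        buf = ByteBuffer.allocate(cellSize);
      }
      // Single assignment per slot, so the managed array is never discarded.
      cellBuffers[i] = buf;
    }
    return cellBuffers;
  }
}
```

Written this way, the unconditional {{ByteBuffer.allocate(cellSize)}} no longer overwrites the buffer obtained in the try block.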

> Subclass DFSOutputStream to support writing striping layout files
> -----------------------------------------------------------------
>
>                 Key: HDFS-7889
>                 URL: https://issues.apache.org/jira/browse/HDFS-7889
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, HDFS-7889-003.patch, HDFS-7889-004.patch, HDFS-7889-005.patch, HDFS-7889-006.patch, HDFS-7889-007.patch, HDFS-7889-008.patch, HDFS-7889-009.patch, HDFS-7889-010.patch
>
>
> After HDFS-7888, we can subclass {{DFSOutputStream}} to support writing striping layout files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
