hadoop-hdfs-issues mailing list archives

From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8136) Client gets and uses EC schema when reads and writes a stripping file
Date Wed, 22 Apr 2015 06:18:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506504#comment-14506504 ]

Vinayakumar B commented on HDFS-8136:

Hi [~kaisasak],
Thanks for working on this.
Here are some comments on the patch.

1. {{cellSize}} can also be {{final}}.
{code}+  private int cellSize;{code}
2. {{cellSize}} is an {{int}}, not a {{short}}, so the cast to {{short}} should be dropped.
{code}+    cellSize = (short)ecInfo.getSchema().getChunkSize();{code}

3. I think this check is unnecessary, as the value is returned directly from {{ECSchema}}, which will not change once defined.
{code}
+    if (numAllBlocks <= 1) {
+      throw new IOException("The block group must contain more than one block.");
+    }
{code}

4. IMO the cluster need not be recreated for every test. It can be initialized once for the class and
shut down after all tests have run.

5. In {{testOneFileUsingDFSStripedInputStream()}}, {{dis}} needs to be closed in a try-finally block.
6. When I tried to run the tests, all but one failed with {{ArithmeticException: / by zero}}.
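
Comments 2 and 6 may be related, though this is a guess rather than a diagnosis of the patch. If the chunk size returned by the schema is a multiple of 65536, the {{(short)}} cast keeps only the low 16 bits and yields 0, and any later division by {{cellSize}} throws. The standalone sketch below illustrates this; {{CellSizeDemo}}, {{ASSUMED_CHUNK_SIZE}}, {{withShortCast}} and {{cellsIn}} are hypothetical names, and the 256 KiB value is an assumption, not something taken from the patch.

```java
public class CellSizeDemo {
    // Hypothetical stand-in for ecInfo.getSchema().getChunkSize().
    // 256 KiB is an assumed value, not one read from the patch.
    static final int ASSUMED_CHUNK_SIZE = 256 * 1024;

    // Mirrors the cast in the patch: narrowing to short keeps only the
    // low 16 bits, then the result is widened back to int on assignment.
    static int withShortCast(int chunkSize) {
        return (short) chunkSize;
    }

    // Stand-in for any per-cell arithmetic that divides by cellSize.
    static int cellsIn(int length, int cellSize) {
        return length / cellSize;
    }

    public static void main(String[] args) {
        int narrowed = withShortCast(ASSUMED_CHUNK_SIZE);
        System.out.println("cellSize with (short) cast: " + narrowed);
        System.out.println("cellSize without cast:      " + ASSUMED_CHUNK_SIZE);

        try {
            cellsIn(1024 * 1024, narrowed);
        } catch (ArithmeticException e) {
            // Integer division by zero is reported as an ArithmeticException.
            System.out.println("ArithmeticException: " + e.getMessage());
        }

        System.out.println("cells without cast: "
            + cellsIn(1024 * 1024, ASSUMED_CHUNK_SIZE));
    }
}
```

Under that assumption the narrowed value is 0, while assigning the {{int}} directly keeps the full chunk size, so removing the cast may also be worth trying before re-running the failing tests.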

> Client gets and uses EC schema when reads and writes a stripping file
> ---------------------------------------------------------------------
>                 Key: HDFS-8136
>                 URL: https://issues.apache.org/jira/browse/HDFS-8136
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Kai Zheng
>            Assignee: Kai Sasaki
>         Attachments: HDFS-8136-005.patch, HDFS-8136.1.patch, HDFS-8136.2.patch, HDFS-8136.3.patch,
> Discussed with [~umamaheswararao] and [~vinayrpet], in client when reading and writing
a stripping file, it can invoke a separate call to NameNode to request the EC schema associated
with the EC zone where the file is in. Then the schema can be used to guide the reading and
writing. Currently it uses hard-coded values.
> Optionally, as an optimization consideration, client may cache schema info per file or
per zone or per schema name. We could add schema name in {{HdfsFileStatus}} for that.

This message was sent by Atlassian JIRA
