hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3177) Allow DFSClient to find out and use the CRC type being used for a file.
Date Thu, 23 Aug 2012 23:06:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440750#comment-13440750 ]

Kihwal Lee commented on HDFS-3177:
----------------------------------

I think checksum-type consistency in concat() can be enforced by inserting checks. The check
hits the datanodes, but the cost shouldn't be too bad, since each datanode only reads its
checksum file and sends back an MD5 of it.

I tested this and it passes TestHDFSConcat.

{code}
  public void concat(String trg, String [] srcs) throws IOException {
    checkOpen();
    try {
+      // Verify that all sources share a single, non-mixed checksum type.
+      MD5MD5CRC32FileChecksum csum = null;
+      String src = "";
+      for (String s : srcs) {
+        MD5MD5CRC32FileChecksum csumToCompare = getFileChecksum(s);
+        if (csumToCompare.getChecksumOpt().getChecksumType() ==
+            DataChecksum.Type.MIXED) {
+          throw new IOException("Mixed checksum type detected in " +
+              s + ". This is not supported in concat()");
+        }
+        if (csum == null) {
+          csum = csumToCompare;
+          src = s;
+          continue;
+        }
+        if (csum.getChecksumOpt().getChecksumType() !=
+            csumToCompare.getChecksumOpt().getChecksumType()) {
+          throw new IOException("Checksum types are different between " + s
+              + " and " + src);
+        }
+      }
      namenode.concat(trg, srcs);
    } catch(RemoteException re) {
      throw re.unwrapRemoteException(AccessControlException.class,
                                     UnresolvedPathException.class);
    }
  }
{code}
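The patch above only shows the loop in context of DFSClient. As a standalone illustration of the same pattern, here is a minimal, self-contained sketch; `ChecksumType` and `FILE_CHECKSUMS` are hypothetical stand-ins for `DataChecksum.Type` and the per-file result of `getFileChecksum()`, not real Hadoop APIs.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Standalone sketch of the pre-concat consistency check in the patch above.
public class ConcatChecksumCheck {
  enum ChecksumType { CRC32, CRC32C, MIXED }

  // Hypothetical stand-in for what getFileChecksum() would report per file.
  static final Map<String, ChecksumType> FILE_CHECKSUMS = Map.of(
      "/a", ChecksumType.CRC32,
      "/b", ChecksumType.CRC32,
      "/c", ChecksumType.CRC32C,
      "/d", ChecksumType.MIXED);

  // Mirrors the loop in the patch: reject MIXED files outright and
  // require every source to match the first file's checksum type.
  static void checkConsistent(List<String> srcs) throws IOException {
    ChecksumType first = null;
    String firstSrc = "";
    for (String s : srcs) {
      ChecksumType t = FILE_CHECKSUMS.get(s);
      if (t == ChecksumType.MIXED) {
        throw new IOException("Mixed checksum type detected in " + s
            + ". This is not supported in concat()");
      }
      if (first == null) {
        first = t;
        firstSrc = s;
      } else if (first != t) {
        throw new IOException("Checksum types are different between "
            + s + " and " + firstSrc);
      }
    }
  }

  public static void main(String[] args) {
    try {
      checkConsistent(Arrays.asList("/a", "/b"));
      System.out.println("consistent: ok");
    } catch (IOException e) {
      System.out.println("unexpected: " + e.getMessage());
    }
    try {
      checkConsistent(Arrays.asList("/a", "/c"));
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```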
                
> Allow DFSClient to find out and use the CRC type being used for a file.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-3177
>                 URL: https://issues.apache.org/jira/browse/HDFS-3177
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, hdfs client
>    Affects Versions: 0.23.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>             Fix For: 2.1.0-alpha, 3.0.0
>
>         Attachments: hdfs-3177-after-hadoop-8239-8240.patch.txt, hdfs-3177-after-hadoop-8239.patch.txt,
hdfs-3177-branch2-trunk.patch.txt, hdfs-3177-branch2-trunk.patch.txt, hdfs-3177-branch2-trunk.patch.txt,
hdfs-3177-branch2-trunk.patch.txt, hdfs-3177-branch2-trunk.patch.txt, hdfs-3177.patch, hdfs-3177-with-hadoop-8239-8240.patch.txt,
hdfs-3177-with-hadoop-8239-8240.patch.txt, hdfs-3177-with-hadoop-8239-8240.patch.txt, hdfs-3177-with-hadoop-8239.patch.txt
>
>
> To support HADOOP-8060, DFSClient should be able to find out the checksum type being
used for files in hdfs.
> In my prototype, DataTransferProtocol was extended to include the checksum type in the
blockChecksum() response. DFSClient uses it in getFileChecksum() to determine the checksum
type. Also, append() can be configured to use the existing checksum type instead of the
configured one.
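The append() behavior described in the last sentence amounts to a small selection policy. A hedged sketch, not the actual Hadoop API; `ChecksumType`, `chooseForAppend`, and `reuseExisting` are illustrative names:

```java
import java.util.Optional;

// Illustrative sketch of the append() policy described above: when
// configured to, reuse the file's existing checksum type; otherwise
// fall back to the client's configured type.
public class AppendChecksumPolicy {
  enum ChecksumType { CRC32, CRC32C }

  static ChecksumType chooseForAppend(Optional<ChecksumType> existing,
                                      ChecksumType configured,
                                      boolean reuseExisting) {
    // Reuse the on-disk type only when the flag is set and the type is known.
    return (reuseExisting && existing.isPresent()) ? existing.get() : configured;
  }

  public static void main(String[] args) {
    System.out.println(chooseForAppend(
        Optional.of(ChecksumType.CRC32), ChecksumType.CRC32C, true));
    System.out.println(chooseForAppend(
        Optional.of(ChecksumType.CRC32), ChecksumType.CRC32C, false));
  }
}
```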

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
