hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7018) Implement C interface for libhdfs3
Date Wed, 28 Jan 2015 04:20:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294691#comment-14294691 ]

Colin Patrick McCabe commented on HDFS-7018:

Hi Zhanwei,

The full text of the Google C++ Style Guide says "On their face, the benefits of using exceptions
outweigh the costs, especially in new projects."  Then it goes on to explain why exceptions
in C++ are not really a good idea.  I realize this is a tricky verbal construction.  "On
their face" means "at first glance" or "if we did not look deeper, we would think that..."

The reality is that exceptions in C++ are just a bad idea.  Most people who have worked
with really big C++ codebases share this belief.  The LLVM / Clang compiler doesn't use exceptions,
nor does Firefox's C++ codebase.  Nor does the V8 JavaScript engine, or the Chrome browser,
LevelDB, the protobuf compiler, or Cloudera Impala.

bq. blocksize is 64bit integer in JAVA code and JAVA API. I cannot find any reason to restrict
it to 32bit in C API.

I agree with you that the API is poorly designed.  It should have been an {{int64_t}} to match
the Java code, not a 32-bit number.

But unfortunately, changing the type is a compatibility-breaking change.  All existing applications
will break if we change this.  And the way that they will break is undefined.  The library
will try to interpret a 32-bit number on the stack or in a register as 64-bit.  This could
result in garbage being used for the block size.  It is like casting a structure of one type
to another.
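
To make the failure mode concrete, here is a minimal sketch that simulates the mismatch in well-defined C rather than actually triggering it.  The argument slot, the helper names, and the 0xAB filler bytes are all hypothetical and are not part of libhdfs.  An 8-byte buffer stands in for the stack slot or register, the "old caller" writes only 4 bytes, and the "new library" reads all 8.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an 8-byte argument slot (a stack slot or register)
 * that still holds leftover bytes from earlier use. */
typedef struct {
    unsigned char bytes[8];
} arg_slot;

/* Old caller: compiled against the 32-bit prototype, so it writes only
 * 4 bytes and leaves the other 4 untouched. */
static void old_caller_pass_blocksize(arg_slot *slot, int32_t blocksize) {
    memcpy(slot->bytes, &blocksize, sizeof blocksize);
}

/* New library: compiled against the 64-bit prototype, so it reads all
 * 8 bytes, including the 4 that the caller never wrote. */
static int64_t new_library_read_blocksize(const arg_slot *slot) {
    int64_t blocksize;
    memcpy(&blocksize, slot->bytes, sizeof blocksize);
    return blocksize;
}

/* Runs the mismatched call and returns what the library ends up seeing. */
int64_t simulate_mismatched_call(int32_t blocksize) {
    arg_slot slot;
    memset(slot.bytes, 0xAB, sizeof slot.bytes); /* stale garbage (assumed) */
    old_caller_pass_blocksize(&slot, blocksize);
    return new_library_read_blocksize(&slot);
}
```

Calling simulate_mismatched_call(134217728) (a 128 MB block size) returns a value that is not 134217728, because half of the 64-bit value comes from the stale bytes.  In a real binary the outcome depends on the platform's calling convention, so the garbage may or may not show up on any given call.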

We cannot assume that all applications will be recompiled every time libhdfs changes.  It
must maintain backwards compatibility at a binary level.  Adding new functions is fine, but
changing the types in existing ones is not OK.
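
As a rough sketch of the "add a new function" approach, assuming hypothetical names (example_open_file and example_open_file64 are illustrations, not the real libhdfs entry points): the existing prototype stays exactly as it was, and a new function takes the wider type.

```c
#include <stdint.h>

/* Hypothetical new entry point: accepts the full 64-bit block size.
 * It returns the size it would use so the behavior is easy to check. */
int64_t example_open_file64(int64_t blocksize) {
    return blocksize;
}

/* Hypothetical existing entry point: the prototype is unchanged, so
 * binaries built against the old header keep working.  It widens the
 * 32-bit argument and forwards to the new function. */
int64_t example_open_file(int32_t blocksize) {
    return example_open_file64((int64_t)blocksize);
}
```

Old applications keep calling example_open_file with a 32-bit value and see the same behavior as before, while newly compiled code can call example_open_file64 with block sizes above 2 GB.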

Once this branch is merged, we can make a change or two in Hadoop 3.x.  We should get rid
of time_t in seconds as well, and just use a 64-bit type that is in milliseconds.  But please,
let's not open that can of worms right now, or else we will not be able to merge this to Hadoop

It's very rare for people to want block sizes above 2 GB anyway.  They were broken in HDFS
for a long time, and nobody noticed (although it was fixed recently).  Please let's just leave
this alone for now.  I would say, open a JIRA against trunk (not this branch) to fix it in
the Hadoop 3.0 release.

> Implement C interface for libhdfs3
> ----------------------------------
>                 Key: HDFS-7018
>                 URL: https://issues.apache.org/jira/browse/HDFS-7018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Zhanwei Wang
>            Assignee: Zhanwei Wang
>         Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch,
> Implement C interface for libhdfs3
