hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10672) libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ version of cat tool
Date Tue, 02 Aug 2016 16:39:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404346#comment-15404346

James Clampffer commented on HDFS-10672:

Hey Anatoli, it looks like this no longer applies cleanly to the current head of HDFS-8707,
could you take a look?  I let my turnaround time for finishing reviews get longer than it
should be which could have caused this; sorry about that.

  virtual Status PositionRead(void *buf, size_t buf_size, off_t offset, size_t *bytes_read)
= 0;
  virtual Status Read(void *buf, size_t buf_size, size_t *bytes_read) = 0;
Thanks for changing this and the bits of code the change touched.  The old API that used nbyte
as an in/out parameter was a bad design choice for every real-world situation I tried using
it on.

Overall the code looks good.  I think there's a few things worth doing make it a bit better
before committing.

1) In cat.c you have some nice uri parsing code.  It looks like you are working on several
other file system tools that are going to be using something similar if they have C versions.
 I think it'd be worth pulling out that parsing code into it's own file so it can be shared
between your tools and used by others who need a nice C API for that.  You could also write
some C wrappers around the C++ URI parsing code because it's more functionally complete if
you want to go that route.  You can defer this work until some of your other utility tools
start landing; I just want to make sure we don't have several tools with the same copy/pasted
block of code.

2) In cat.cpp you use the same C-style URI code.  Since part of the value of this work is
to provide good examples of libhdfs API usage I think you should reuse our URI parsing class
from libhdfspp/lib/common/uri.h here.

3) Like Bob mentioned earlier it's worth updating this to use the ConfigurationLoader to give
you an HdfsConfiguration object that can give you an Options object that reflects what people
have in /etc/hadoop/conf or $HADOOP_CONF_DIR.  I can give you a hand here if you need some
references on how to do this (at least the way I'd do it).

4) In some of the unit tests:
file_info->file_length_ = 1; //To avoid running into EOF
Best to add at least one test that makes sure we do get the EOF status when expected.  I've
tried to get away with similar "small" changes to other tests without new tests and it always
ended up being more pain later on than writing the test would have.

5) In the changes to status:
bool invalid_offset() const { return code_ == kInvalidOffset; }
Not really a blocker, but I try to follow the pattern of is_<some predicate on object's
current state> to make it really clear that what you're doing is strictly a test even when
the object isn't const qualified.

General comment not related to your changes but could be worth discussing somewhere:
Factory functions that return a Status and assign to a user supplied pointer leads to weird
looking code (not at all your fault, this is what the API forces)
FileHandle *file_raw = nullptr;
stat = fs->Open(uri.path, &file_raw);
if (!stat.ok()) {
  cerr << "Could not open file " << uri.path << endl;
  return 1;
//wrapping file_raw into a unique pointer to guarantee deletion
unique_ptr<FileHandle> file(file_raw);

It'd sure be nice if FileSystem::Open could just return a unique_ptr or raw pointer IMO. 
Google's coding standards pretend/assert that exceptions don't exist.  For their code that
might be true, but I think this is going to be a boilerplate pattern that shows up in any
code that uses libhdfs++ but is written to use RAII in idiomatic C++.  Do we want to make
life easier for those users with a more idiomatic constructor?  One thing you might want to
do in the error catching block is assert that the pointer is still null (or just call free
on it).

> libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ version of
cat tool
> ----------------------------------------------------------------------------------------------
>                 Key: HDFS-10672
>                 URL: https://issues.apache.org/jira/browse/HDFS-10672
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>         Attachments: HDFS-10672.HDFS-8707.000.patch, HDFS-10672.HDFS-8707.001.patch,
HDFS-10672.HDFS-8707.002.patch, HDFS-10672.HDFS-8707.003.patch
> src/main/libhdfspp/examples should be structured like examples/language/utility instead
of examples/utility/language for easier access by different developers.
> Additionally implementing C++ version of cat tool.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

View raw message