hadoop-hdfs-issues mailing list archives

From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10740) libhdfs++: Implement recursive directory generator
Date Tue, 09 Aug 2016 22:09:21 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414326#comment-15414326 ]

James Clampffer commented on HDFS-10740:
----------------------------------------

Cool stuff, relatively minor comments.  A little more picky than usual since this is an API
usage example.

You might want to rename the "width" argument in the usage message to something like "fanout".
To me, "width" could mean either the number of subdirectories in each directory or the total
number of leaf nodes at the top level of the tree.  This is just my preference; I'm fine with
leaving it as is.

This comment is no longer applicable; could you please replace it with "man"-style instructions
covering usage and any optional flags?  At some point we also need to start commenting in the
same markup as the rest of the HDFS native code.
{code}
/*
  A a stripped down version of unix's "cat".
  Doesn't deal with any flags for now, will just attempt to read the whole file.
*/
{code}
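
A hypothetical sketch of what I'm after (the tool name, arguments, and descriptions below are
placeholders, not a final spec):
{code}
/*
 * gen_dirs: recursively generate a directory tree on HDFS.
 *
 * Usage: gen_dirs <connection-uri> <base-path> <depth> <fanout>
 *   <connection-uri>  hdfs://host:port URI of the cluster to connect to
 *   <base-path>       directory under which the tree is generated
 *   <depth>           number of levels to create
 *   <fanout>          subdirectories created at each level
 */
{code}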

Since this is at least partly intended to be an example of how to take advantage of the async
operations, some more comments would go a long way: just basic things about how the recursion
doesn't block and how everything waits on the promises later on.
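
Something like the following standalone sketch (placeholder names, not the actual tool code)
is roughly the level of explanation I have in mind: the async call returns immediately and the
only blocking point is the promise/future at the very end.
{code}
#include <future>
#include <iostream>
#include <thread>

int main() {
  std::promise<bool> done;
  std::future<bool> result = done.get_future();

  // Stand-in for an async namenode call: the handler runs on another
  // thread and fulfills the promise; this statement does not block.
  std::thread worker([&done]() { done.set_value(true); });

  // ...the recursive generator would keep issuing more async calls here...

  // Only now do we block, waiting for the whole async chain to finish.
  std::cout << (result.get() ? "ok" : "failed") << std::endl;
  worker.join();
  return 0;
}
{code}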

Bringing whole namespaces into scope can lead to weirdness.  You could instead pull in just the
names you want, e.g. "using std::string".  Not a blocker since it's a standalone util, but let's
try to avoid this outside of the examples and test directories.
{code}
using namespace std;
using namespace hdfs;
{code}
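
For example (the exact set of names depends on what the tool actually uses):
{code}
using std::cerr;
using std::endl;
using std::string;
using hdfs::FileSystem;
{code}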

Generally, taking a reference to any kind of smart pointer as an argument is a bad idea.  Taking
a reference to a unique_ptr is a really bad idea because it sidesteps the contract that the
pointer is supposed to be unique and will lead to confusion.  Grabbing a raw pointer with .get()
and passing that around is slightly better because the loss of uniqueness is explicit.
{code}
unique_ptr<FileSystem> & fs
{code}
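
A minimal sketch of what I mean, using a stand-in type rather than the real hdfs::FileSystem:
ownership stays with the caller's unique_ptr, and helpers receive a raw, non-owning pointer via
.get().
{code}
#include <iostream>
#include <memory>
#include <string>

struct FileSystem {  // stand-in for hdfs::FileSystem
  void Mkdirs(const std::string &path) { std::cout << "mkdir " << path << "\n"; }
};

// Takes a raw, non-owning pointer; the caller keeps ownership.
void GenerateLevel(FileSystem *fs, const std::string &base, int fanout) {
  for (int i = 0; i < fanout; ++i)
    fs->Mkdirs(base + "/dir" + std::to_string(i));
}

int main() {
  std::unique_ptr<FileSystem> fs(new FileSystem());
  GenerateLevel(fs.get(), "/gen", 3);  // .get() makes the loss of uniqueness explicit
  return 0;
}
{code}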

get_port returns an optional; if that hasn't been set, it's undefined behavior to use the
dereference operator or otherwise retrieve the value.  That might end up making the error message
useless depending on what libstdc++ feels like doing.
{code}
cerr << "Could not connect to " << uri->get_host() << ":" << *(uri->get_port())
<< endl;
{code}
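
Checking the optional before dereferencing keeps the message meaningful either way, e.g.
(assuming get_port() converts to bool when a value is present):
{code}
cerr << "Could not connect to " << uri->get_host();
if (uri->get_port()) {
  cerr << ":" << *(uri->get_port());
}
cerr << endl;
{code}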



> libhdfs++: Implement recursive directory generator
> --------------------------------------------------
>
>                 Key: HDFS-10740
>                 URL: https://issues.apache.org/jira/browse/HDFS-10740
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>         Attachments: HDFS-10740.HDFS-8707.000.patch
>
>
> This tool will allow us to do benchmarking/testing of our find functionality, and will be a
> good example showing how to call a large number of namenode operations recursively.




