hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Anatoli Shein (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, and hdfs_chmod
Date Mon, 22 Aug 2016 20:04:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431511#comment-15431511

Anatoli Shein commented on HDFS-10754:

Thanks for the review, [~James C].
I have addressed your comments in the new patch as follows:
1) There's a good bit of code duplication in the config parsing code for each tool. Factoring
that out into a set of shared files makes each tool cleaner and also helps improve consistency
in how they deal with configuration.
(/) I changed the way configs are parsed, and separated them into a separate function. For
examples we just now look at the configs and do not care about the given uri, and for tools
we look at both.

2) Very minor, disregard if you want, but you rather than including google/protobuf/io/coded_stream.h
you could include google/protobuf/stubs/common.h if you just need something to declare ShutdownProtobufLibrary.
It should be a bit smaller so it'd help with compile times and brings in fewer symbols.
(/) Fixed.

3) In places you call "optional<T>::value" you can also call "optional<T>::value_or"
to get rid of some if blocks without making the code any harder to read (IMO). Nested value_or
calls do tend to make things hard to read. The block below and the equivalent block for host
are repeated in multiple places, perhaps make a function that takes the uri, options, and
default and stick the logic there and share it between executables in the same was as the
config stuff.
(/) I got rid of this whole part altogether, since ConfigLoader is now doing this part.

4) One of the things that's getting more and more important is adding tests; admittedly I
haven't been getting good CI coverage on my recent patches either. Could you include some
simple tests, even a bash script would do, where you run these tools? One method would be
to assume there's a cluster running and you're passed in something like $HADOOP_CLI (if you
need to write anything), $NN_HOST, and $NN_PORT just so it's easy enough for people to run
the tests. One of the major blockers we have in the way of merging this branch into trunk
is getting really solid tests so if you think of a way to test these in the CI test environment
that's ideal.
(/) I will add tests for this when the find tool patch (HDFS-10679) lands, because I need
it for the result verification.

5) Use std::make_shared rather than passing a pointer into the constructor of shared_ptr.
make_shared will allocate the reference count variable contiguously with the rest of the class
so you get less heap fragmentation and better locality. It's also exception safe which is
always a good thing.
(/) Done. I updated all occurrences of this.

6) Not an issue in your current code, but in case anyone comes in later to modify it I'd get
rid of pointer aliases like the one below. You can still do the if(!whatever) on shared_ptrs,
this is a case where constructing the shared_ptr with a pointer is a reasonable thing to do.
(/) Fixed this for the FileSystem object.

Additionally, I replaced SetOwnerState and SetPermissionState with just RecursionState to
reduce boilerplate code, since they mostly match. A few function specific parameters are now
passed though outside of this struct.

> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, and
> ------------------------------------------------------------------------------------------------
>                 Key: HDFS-10754
>                 URL: https://issues.apache.org/jira/browse/HDFS-10754
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>         Attachments: HDFS-10754.HDFS-8707.000.patch, HDFS-10754.HDFS-8707.001.patch,
HDFS-10754.HDFS-8707.002.patch, HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch,
HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, HDFS-10754.HDFS-8707.007.patch

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

View raw message