hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9663) Optimize some RPC call using lighter weight construct than DatanodeInfo
Date Tue, 26 Jan 2016 03:57:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116634#comment-15116634 ]

Colin Patrick McCabe commented on HDFS-9663:
--------------------------------------------

Is that stuff actually sent over the wire in every case?  These fields are optional in the
protobuf structures.

{code}
/**
 * The status of a Datanode
 */
message DatanodeInfoProto {
  required DatanodeIDProto id = 1;
  optional uint64 capacity = 2 [default = 0];
  optional uint64 dfsUsed = 3 [default = 0];
  optional uint64 remaining = 4 [default = 0];
  optional uint64 blockPoolUsed = 5 [default = 0];
  optional uint64 lastUpdate = 6 [default = 0];
  optional uint32 xceiverCount = 7 [default = 0];
  optional string location = 8;
  enum AdminState {
    NORMAL = 0;
    DECOMMISSION_INPROGRESS = 1;
    DECOMMISSIONED = 2;
  }
  
  optional AdminState adminState = 10 [default = NORMAL];
  optional uint64 cacheCapacity = 11 [default = 0];
  optional uint64 cacheUsed = 12 [default = 0];
  optional uint64 lastUpdateMonotonic = 13 [default = 0];
  optional string upgradeDomain = 14;
}
{code}

I agree that it's messy that these fields are optional, but it's hard to see how to change
it compatibly at this point.
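
To make the wire-size question concrete, here is a rough sketch of how proto2 encodes optional varint fields: a field that was never set contributes no bytes, while a field that was set costs a key plus its varint value. This hand-rolls the encoding with the JDK only; it is not the Hadoop or protobuf library code, and the class and method names (OptionalFieldWireSize, encode) are made up for illustration. Only field numbers 2-4 from DatanodeInfoProto are modeled.

```java
import java.io.ByteArrayOutputStream;

public class OptionalFieldWireSize {
    // Writes a base-128 varint: 7 payload bits per byte,
    // with the high bit set on every byte except the last.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one varint field (wire type 0): key = (fieldNumber << 3) | 0, then the value.
    static void writeUint64Field(ByteArrayOutputStream out, int fieldNumber, long value) {
        writeVarint(out, ((long) fieldNumber << 3) | 0);
        writeVarint(out, value);
    }

    // Hypothetical stand-in for serializing part of DatanodeInfoProto:
    // null means "optional field not set", so nothing is written for it.
    static byte[] encode(Long capacity, Long dfsUsed, Long remaining) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (capacity != null) writeUint64Field(out, 2, capacity);
        if (dfsUsed != null) writeUint64Field(out, 3, dfsUsed);
        if (remaining != null) writeUint64Field(out, 4, remaining);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40; // arbitrary example value, not a measurement
        System.out.println("nothing set: " + encode(null, null, null).length + " bytes");
        System.out.println("all three set: "
            + encode(oneTiB, oneTiB / 2, oneTiB / 2).length + " bytes");
    }
}
```

So whether these fields cost anything on the wire depends on whether the sender actually calls the setters when it builds the message, which is what would need checking in the conversion code.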

> Optimize some RPC call using lighter weight construct than DatanodeInfo
> -----------------------------------------------------------------------
>
>                 Key: HDFS-9663
>                 URL: https://issues.apache.org/jira/browse/HDFS-9663
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>
> While working on HDFS-8430 and adding an RPC to DataTransferProtocol, I noticed that a
> very heavy construct, either {{DatanodeInfo}} or {{DatanodeInfoWithStorage}}, is used to represent
> a datanode, in most cases just to connect to it. It is very fat and contains much more
> information than is needed. See how it's defined:
> {code}
> public class DatanodeInfo extends DatanodeID implements Node {
>   private long capacity;
>   private long dfsUsed;
>   private long remaining;
>   private long blockPoolUsed;
>   private long cacheCapacity;
>   private long cacheUsed;
>   private long lastUpdate;
>   private long lastUpdateMonotonic;
>   private int xceiverCount;
>   private String location = NetworkTopology.DEFAULT_RACK;
>   private String softwareVersion;
>   private List<String> dependentHostNames = new LinkedList<>();
>   private String upgradeDomain;
> ...
> {code}
> On the client and datanode sides, for RPC calls like {{DataTransferProtocol#writeBlock}},
> it looks like the information contained in {{DatanodeID}} is almost enough.
> I did a quick hack using a lightweight construct, {{SimpleDatanodeInfo}}, that
> simply extends {{DatanodeID}} (no other fields added, though any field that turns out to be
> needed can be added) and changed the {{DataTransferProtocol#writeBlock}} call to use it. I manually
> ran many relevant tests and they worked fine. To see how much network traffic is saved, I did a
> simple test with this code in {{Sender}}:
> {code}
>   private static void send(final DataOutputStream out, final Op opcode,
>       final Message proto) throws IOException {
>     LOG.trace("Sending DataTransferOp {}: {}",
>         proto.getClass().getSimpleName(), proto);
>     int before = out.size();
>     op(out, opcode);
>     proto.writeDelimitedTo(out);
>     int after = out.size();
>     System.out.println("XXXXXXXXXXXXXXXXX sent=" + (after - before));
>     out.flush();
>   }
> {code}
> I ran the test {{TestWriteRead#testWriteAndRead}}; the change saves about 100 bytes
> per call in most cases. The saving may not look big because only 3 datanodes are sent,
> but in situations like {{BlockECRecoveryCommand}}, where there can be 6 + 3 datanodes as targets
> and sources to send, the saving will be significant.
> Hence, I suggest using a more lightweight construct to represent a datanode in RPC calls where
> possible, or other ideas for avoiding unnecessary data on the wire. This may make sense because, as
> noted, there were some discussions in HDFS-8999 about saving datanode bandwidth.
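
For anyone who wants to repeat the measurement from the description outside of Hadoop, the before/after {{DataOutputStream#size()}} delta can be tried in isolation roughly like this. It is only a sketch: the class name SentBytesProbe, the single-byte opcode, and the payload sizes are placeholders chosen for illustration, not values taken from {{Sender}} or from the test run above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SentBytesProbe {
    // Returns how many bytes this call added to the stream, using the same
    // before/after size() delta as the quoted hack in Sender#send.
    static int sendAndMeasure(DataOutputStream out, int opcode, byte[] payload)
            throws IOException {
        int before = out.size();
        out.writeByte(opcode);                 // stand-in for op(out, opcode)
        out.write(payload, 0, payload.length); // stand-in for proto.writeDelimitedTo(out)
        int after = out.size();
        out.flush();
        return after - before;
    }

    // Convenience wrapper that measures against an in-memory stream.
    static int measureInMemory(byte[] payload) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            return sendAndMeasure(out, 80, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder payloads: the sizes are arbitrary and only show how two
        // encodings of the same call would be compared.
        byte[] heavy = new byte[120];
        byte[] light = new byte[20];
        int saved = measureInMemory(heavy) - measureInMemory(light);
        System.out.println("bytes saved per call (placeholder data): " + saved);
    }
}
```

Note that {{size()}} returns an int counter of bytes written so far, so the delta is only meaningful when nothing else writes to the same stream between the two reads.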



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
