hadoop-common-issues mailing list archives

From "Binglin Chang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8990) Some minor issues in protobuf based ipc
Date Tue, 22 Jan 2013 01:11:11 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559262#comment-13559262 ]

Binglin Chang commented on HADOOP-8990:
---------------------------------------

bq. Note that since rpc has pluggable rpc engine, the rpc request itself is serialized based on the pluggable rpc engine. For protobuf-rpc-engine the length is serialized as varint.
I don't understand why the protobuf-rpc-engine has to be varint-prefixed. Originally it was prefixed with a 4-byte int; HADOOP-8084 changed it to a varint prefix, and then at some later point RpcResponseWritable changed to using a varint while RpcRequestWritable kept using a 4-byte int.
I just think a 4-byte int is the better choice, because it is easy to implement in non-blocking code.
Just for example, see Todd's ipc implementation:
https://github.com/toddlipcon/libhrpc/blob/master/rpc_client.cpp
{code}
  uint32_t length;
  CodedInputStream cis(&rbuf_[rbuf_consumed_], rbuf_available());
  bool success = cis.ReadVarint32(&length);
  if (!success) {
    if (rbuf_available() >= kMaxVarint32Size) {
      // TODO: error handling
      LOG(FATAL) << "bad varint";
    }
    // Not enough bytes in buffer to read varint, ask for at least another byte
    EnsureAvailableLength(rbuf_available() + 1,
        boost::bind(&RpcClient::ReadResponseHeaderLengthCallback, this,
          asio::placeholders::error));
    return;
  }
{code}
There is a lot of retry logic involved in reading a varint in non-blocking code; it would be much easier, and unified between request and response, if the length were a fixed 4-byte int.

> Some minor issues in protobuf based ipc
> --------------------------------------
>
>                 Key: HADOOP-8990
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8990
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Binglin Chang
>            Priority: Minor
>
> 1. proto file naming
> RpcPayloadHeader.proto includes not only RpcPayLoadHeaderProto but also RpcResponseHeaderProto, which is irrelevant to the file name.
> hadoop_rpc.proto only includes HadoopRpcRequestProto, and the filename "hadoop_rpc" is strange compared to the other .proto file names.
> How about merging those two files into HadoopRpc.proto?
> 2. proto class naming
> In the rpc request the callId is carried in RpcPayloadHeaderProto, but in the rpc response it is carried in RpcResponseHeaderProto, and there is also HadoopRpcRequestProto; this is just too confusing.
> 3. The rpc system is not fully protobuf based; there are still some Writables:
> RpcRequestWritable and RpcResponseWritable, and the rpc response exception name and stack trace strings.
> Also, RpcRequestWritable uses a protobuf-style varint32 prefix, but RpcResponseWritable uses an int32 prefix; why this inconsistency?
> Currently an rpc request is split into length, PayLoadHeader, and PayLoad, and a response into RpcResponseHeader, response, and error message.
> I think wrapping the request and response into a single RequestProto and ResponseProto would be better, because that gives a formal, complete wire-format definition;
> otherwise developers need to look into the source code and hard-code the communication format.
> These issues make it very confusing and hard for developers to use these rpc interfaces.
> Some of them can be solved without breaking compatibility, but some cannot; at the least we need to know what will change and what will stay stable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
