hadoop-hdfs-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-524) Further DataTransferProtocol code refactoring.
Date Tue, 04 Aug 2009 20:30:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739139#action_12739139 ]

Tsz Wo (Nicholas), SZE commented on HDFS-524:

bq. in DataXceiver: is datanode.getXceiverCount() is so costly that it really make sense performance
getXceiverCount() alone is not costly.  However, constructing the message string inside LOG.debug(..)
creates intermediate objects unnecessarily.  Since we often run into GC problems, I think it is
good to minimize object creation.
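The point about avoiding string construction can be made concrete with the usual isDebugEnabled()-style guard. Hadoop uses commons-logging; the sketch below is a self-contained illustration of the same idiom using java.util.logging, where buildMessage() is a hypothetical stand-in for the concatenation inside LOG.debug(..):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogGuardSketch {
  static final Logger LOG = Logger.getLogger("sketch");
  static int evaluations = 0;  // counts how many times the message was built

  // stands in for the string concatenation inside LOG.debug(..)
  static String buildMessage() {
    evaluations++;
    return "Number of active connections is: " + 5;
  }

  // returns the number of message evaluations after one guarded call
  static int guardedDebug() {
    evaluations = 0;
    LOG.setLevel(Level.INFO);            // debug-level (FINE) logging disabled
    if (LOG.isLoggable(Level.FINE)) {    // guard: skip message construction
      LOG.fine(buildMessage());
    }
    return evaluations;
  }

  public static void main(String[] args) {
    System.out.println(guardedDebug()); // 0 -- no intermediate objects created
  }
}
```

With the guard in place, the concatenation is never evaluated when debug logging is off, so no garbage is produced on the hot path.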

bq. it seems that the switch statement won't grow anytime soon, but wouldn't it be better looking and perhaps more efficient to replace it with polymorphism? It would increase the number of classes, but the control flow would be better enclosed within the new classes. I.e. the operations like opReadBlock(), opBlockChecksum(), etc. could be moved to smaller classes like OpBlockRead, OpBlockCheckSum, which would expose only one public method, say, doOperation().
This is a good idea.  Let's discuss the class structure in more detail.

In the patch, we have
  public enum Op {

  public static abstract class Receiver {
    protected final void processOp(Op op, DataInputStream in
        ) throws IOException {
      switch(op) {
      case WRITE_BLOCK:

    private final void opWriteBlock(DataInputStream in) throws IOException {
      final long blockId = in.readLong();
      opWriteBlock(in, blockId, blockGs, pipelineSize, isRecovery,
          client, src, targets, accesstoken);

    protected abstract void opWriteBlock(DataInputStream in,
        long blockId, long blockGs, int pipelineSize, boolean isRecovery,
        String client, DatanodeInfo src, DatanodeInfo[] targets,
        AccessToken accesstoken) throws IOException;
In the implementation, DataXceiver extends DataTransferProtocol.Receiver and implements all
the abstract methods.
class DataXceiver extends DataTransferProtocol.Receiver
    implements Runnable, FSConstants {
  public void opWriteBlock(DataInputStream in, long blockId, long blockGs,
      int pipelineSize, boolean isRecovery,
      String client, DatanodeInfo srcDataNode, DatanodeInfo[] targets,
      AccessToken accessToken) throws IOException {
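To make the shape of the patch concrete, here is a minimal, self-contained sketch of the same template-method structure: Receiver owns the switch and the wire parsing, and a subclass supplies the behavior. Op and the field list are trimmed down to one field, and Xceiver and demo() are hypothetical stand-ins for the real DataXceiver:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ReceiverSketch {
  // simplified stand-in for DataTransferProtocol.Op
  enum Op { READ_BLOCK, WRITE_BLOCK }

  // Receiver parses the wire fields, then hands them to an abstract hook
  static abstract class Receiver {
    protected final void processOp(Op op, DataInputStream in) throws IOException {
      switch (op) {
      case WRITE_BLOCK:
        opWriteBlock(in, in.readLong());  // read blockId, delegate the rest
        break;
      default:
        throw new IOException("Unknown op " + op);
      }
    }
    protected abstract void opWriteBlock(DataInputStream in, long blockId)
        throws IOException;
  }

  // DataXceiver-style subclass supplies the behavior
  static class Xceiver extends Receiver {
    long lastBlockId;
    public void opWriteBlock(DataInputStream in, long blockId) {
      lastBlockId = blockId;
    }
  }

  // writes a blockId to a buffer, dispatches it, and returns what arrived
  static long demo() {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      new DataOutputStream(buf).writeLong(42L);
      Xceiver x = new Xceiver();
      x.processOp(Op.WRITE_BLOCK,
          new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
      return x.lastBlockId;
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // 42
  }
}
```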
If we have the proposed classes OpBlockRead, OpBlockCheckSum, etc., then how should DataXceiver
implement the abstract methods?

Also, how should the new classes OpBlockRead and OpBlockCheckSum relate to the enum Op?
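One possible answer, sketched under the assumption of a single-method interface as proposed in the review: keep the enum as the wire tag and relate each constant to its operation class through an EnumMap registry, so processOp becomes a lookup instead of a switch. All names below (Operation, REGISTRY, the doOperation signature) are hypothetical, not from the patch:

```java
import java.util.EnumMap;
import java.util.Map;

public class OpRegistrySketch {
  // simplified stand-in for DataTransferProtocol.Op
  enum Op { READ_BLOCK, BLOCK_CHECKSUM }

  // hypothetical single-method interface, as suggested in the review
  interface Operation { String doOperation(long blockId); }

  static class OpBlockRead implements Operation {
    public String doOperation(long blockId) { return "read " + blockId; }
  }
  static class OpBlockCheckSum implements Operation {
    public String doOperation(long blockId) { return "checksum " + blockId; }
  }

  // one way to relate the classes to the enum: a registry keyed by Op
  static final Map<Op, Operation> REGISTRY =
      new EnumMap<Op, Operation>(Op.class);
  static {
    REGISTRY.put(Op.READ_BLOCK, new OpBlockRead());
    REGISTRY.put(Op.BLOCK_CHECKSUM, new OpBlockCheckSum());
  }

  public static void main(String[] args) {
    // dispatch is a table lookup on the enum read from the wire
    System.out.println(REGISTRY.get(Op.READ_BLOCK).doOperation(7L));     // read 7
    System.out.println(REGISTRY.get(Op.BLOCK_CHECKSUM).doOperation(7L)); // checksum 7
  }
}
```

An alternative with the same effect is a constant-specific method on the enum itself (each Op constant overriding an abstract method), which avoids the separate map at the cost of putting behavior inside the enum.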

> Further DataTransferProtocol code refactoring.
> ----------------------------------------------
>                 Key: HDFS-524
>                 URL: https://issues.apache.org/jira/browse/HDFS-524
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 0.21.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.21.0
>         Attachments: h524_20090803.patch
> This is a further refactoring over HDFS-377 to move generic codes from DataXceiver to

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
