hadoop-common-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Date Mon, 24 Aug 2015 18:11:45 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Anu Engineer updated HADOOP-12325:
    Attachment: HADOOP-12325.006.patch

[~ajisakaa] Thanks for your review and changes to the test file. Please see my comments below.

bq. 1. Would you add a whitespace before "took " in the log message?

bq. 2. After running the regression test locally, I can't see any logs about sleep RPC.

On my machine, if I open the file org.apache.hadoop.ipc.TestProtoBufRpc-output.txt in the
surefire-reports directory, I can see the following line.

2015-08-24 10:52:16,713 WARN  ipc.Server (Server.java:logSlowRpcCalls(438)) - Slow RPC : sleep took 3004 milliseconds to process from client

bq. Attaching a patch to verify that the slow call is logged. Now the test fails.

With the new call {code} long after = getLongCounter("RpcSlowCalls", rpcMetrics); {code} the
mocking layer somehow still returns the old snapshotted value. I have modified the tests
to call the server layer directly, and they now behave as expected.
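
To illustrate the symptom described above, here is a minimal, self-contained sketch of how reading a counter from a previously captured snapshot can hide an increment. It is one plausible reading only; the thread does not confirm the actual cause. The `SnapshotDemo`, `Metrics`, and `Snapshot` names are hypothetical and are not Hadoop classes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; none of these types exist in Hadoop.
class SnapshotDemo {
    /** Stand-in for a live metrics source that owns the counter. */
    static final class Metrics {
        private final AtomicLong rpcSlowCalls = new AtomicLong();

        /** Called when a slow call is detected. */
        void markSlowCall() {
            rpcSlowCalls.incrementAndGet();
        }

        /** Returns an immutable copy of the counter at this instant. */
        Snapshot snapshot() {
            return new Snapshot(rpcSlowCalls.get());
        }
    }

    /** Immutable view: later increments on Metrics are not visible here. */
    static final class Snapshot {
        final long rpcSlowCalls;

        Snapshot(long value) {
            this.rpcSlowCalls = value;
        }
    }
}
```

In this sketch, a test that captures a `Snapshot` before the slow call and reads from the same object afterwards would see an unchanged value; taking a fresh snapshot, or querying the live source directly, would show the increment.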

> RPC Metrics : Add the ability track and log slow RPCs
> -----------------------------------------------------
>                 Key: HADOOP-12325
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12325
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc, metrics
>    Affects Versions: 2.7.1
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>         Attachments: Callers of WritableRpcEngine.call.png, HADOOP-12325.001.patch, HADOOP-12325.002.patch,
HADOOP-12325.003.patch, HADOOP-12325.004.patch, HADOOP-12325.005.patch, HADOOP-12325.005.test.patch,
> This JIRA proposes to add a counter called RpcSlowCalls and also a configuration setting
that allows users to log really slow RPCs. Slow RPCs are RPCs that fall at the 99th percentile.
This is useful for troubleshooting why certain services, such as the NameNode, freeze under heavy load.
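
The idea in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the attached patches: `SlowCallTracker`, its sliding window, the `minSamples` warm-up, and the nearest-rank percentile are all hypothetical choices made for this example. Only the counter name `RpcSlowCalls` and the "Slow RPC" log wording come from this thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Hadoop: counts and logs calls that are
// slower than the 99th percentile of recently observed processing times.
class SlowCallTracker {
    private final int windowSize;   // how many recent samples to keep
    private final int minSamples;   // warm-up: no call is flagged before this many samples
    private final Deque<Long> window = new ArrayDeque<>();
    private final AtomicLong slowCalls = new AtomicLong(); // plays the role of RpcSlowCalls

    SlowCallTracker(int windowSize, int minSamples) {
        if (minSamples < 1 || windowSize < minSamples) {
            throw new IllegalArgumentException("need 1 <= minSamples <= windowSize");
        }
        this.windowSize = windowSize;
        this.minSamples = minSamples;
    }

    /** Nearest-rank percentile (0 < p <= 100) of the samples currently in the window. */
    synchronized long percentile(double p) {
        if (window.isEmpty()) {
            throw new IllegalStateException("no samples recorded yet");
        }
        long[] sorted = window.stream().mapToLong(Long::longValue).sorted().toArray();
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Records one call's processing time; returns true if it was counted as slow. */
    synchronized boolean record(String method, long millis) {
        boolean slow = window.size() >= minSamples && millis > percentile(99.0);
        if (slow) {
            slowCalls.incrementAndGet();
            System.out.println("Slow RPC : " + method + " took " + millis
                + " milliseconds to process");
        }
        if (window.size() == windowSize) {
            window.removeFirst(); // evict the oldest sample
        }
        window.addLast(millis);
        return slow;
    }

    long getSlowCalls() {
        return slowCalls.get();
    }
}
```

The warm-up threshold is there so that the first few calls, for which no meaningful percentile exists, are never flagged. Sorting the window on every call is simple but costly; a production implementation would likely use a cheaper estimate.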
