hadoop-hdfs-issues mailing list archives

From "Yi Liu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-7964) Add support for async edit logging
Date Fri, 16 Oct 2015 06:41:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960114#comment-14960114 ]

Yi Liu edited comment on HDFS-7964 at 10/16/15 6:40 AM:
--------------------------------------------------------

Thanks [~daryn] for the work.

Further comments:

*1.* In FSEditLogAsync#run
{code}
  @Override
  public void run() {
    try {
      while (true) {
        ....
        if (doSync) {
          ...
          logSync(getLastWrittenTxId());
  ...
{code}
I think it's better to pass the txid of the current edit to {{logSync}}, so there is no need to wait for all written txids to be synced. Wouldn't that be more efficient and give the client a faster response?

*2.*
{code}
-log4j.rootLogger=OFF, CONSOLE
+log4j.rootLogger=DEBUG, CONSOLE
{code}
Any reason to change it?

*3.*
{code}
call.abortResponse(syncEx);
{code}
Seems this code is not available?
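
To illustrate comment 1, here is a minimal toy model; it is not the actual FSEditLog code. It assumes that one flush persists every edit buffered so far, so a caller that passes its own txid can return immediately when an earlier flush already covered it. {{ToyEditLog}} and all of its members are hypothetical names used only for this sketch.

```java
/** Toy model of an edit log. Hypothetical; NOT the real FSEditLog. */
class ToyEditLog {
    private long lastWrittenTxId = 0; // highest txid buffered so far
    private long syncedTxId = 0;      // highest txid assumed durable
    private int flushCount = 0;       // number of simulated flushes

    /** Buffer one edit and return the txid assigned to it. */
    synchronized long logEdit() {
        return ++lastWrittenTxId;
    }

    /** Ensure every edit up to targetTxId is durable. */
    synchronized void logSync(long targetTxId) {
        if (targetTxId <= syncedTxId) {
            return; // an earlier flush already covered this edit
        }
        // Assumption: a flush persists everything buffered so far.
        syncedTxId = lastWrittenTxId;
        flushCount++;
    }

    synchronized long getLastWrittenTxId() { return lastWrittenTxId; }
    synchronized long getSyncedTxId() { return syncedTxId; }
    synchronized int getFlushCount() { return flushCount; }
}
```

In this model, a caller that always passes the last written txid only takes the early return when no newer edit has been buffered since the last flush, whereas a caller that passes its own txid takes it whenever its edit is already covered.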


was (Author: hitliuyi):
Thanks [~daryn] for the work.

Further comments:

*1.* In FSEditLogAsync#run
{code}
  @Override
  public void run() {
    try {
      while (true) {
        ....
        if (doSync) {
          ...
          logSync(getLastWrittenTxId());
  ...
{code}
I think it's better to pass the txid of the current edit to {{logSync}}, so there is no need to wait for all written txids to be synced. Wouldn't that be more efficient and give the client a faster response?

*2.*
{code}
+          editsBatchedInSync = txid - synctxid - 1;
{code}
Shouldn't this be "txid - synctxid"? The txid is the max txid written, and synctxid is the max txid already synced. Suppose txid = 20 and synctxid = 10; then editsBatchedInSync should be (txid - synctxid) = (20 - 10) = 10. You can also see this in the existing log message:
{code}
final String msg =
                "Could not sync enough journals to persistent storage " +
                "due to " + e.getMessage() + ". " +
                "Unsynced transactions: " + (txid - synctxid);
{code}

*3.*
{code}
-log4j.rootLogger=OFF, CONSOLE
+log4j.rootLogger=DEBUG, CONSOLE
{code}
Any reason to change it?

*4.*
{code}
call.abortResponse(syncEx);
{code}
Seems this code is not available?

> Add support for async edit logging
> ----------------------------------
>
>                 Key: HDFS-7964
>                 URL: https://issues.apache.org/jira/browse/HDFS-7964
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.0.2-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-7964.patch, HDFS-7964.patch
>
>
> Edit logging is a major source of contention within the NN.  LogEdit is called within the namespace write lock, while logSync is called outside of the lock to allow greater concurrency.  The handler thread remains busy until logSync returns to provide the client with a durability guarantee for the response.
> Write-heavy RPC load and/or slow IO cause handlers to stall in logSync.  Although the write lock is not held, readers are limited/starved and the call queue fills.  Combining an edit log thread with postponed RPC responses from HADOOP-10300 will provide the same durability guarantee but immediately free up the handlers.
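
The idea in the issue description above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patch itself: a {{CompletableFuture}} stands in for the postponed RPC response, an in-memory list stands in for persistent storage, and txids are assigned by the edit log thread. {{AsyncEditLogSketch}}, {{PendingEdit}}, and {{logEditAsync}} are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: one thread syncs edits, handlers never block on the sync. */
class AsyncEditLogSketch implements Runnable {
    /** A queued edit plus the future standing in for the postponed response. */
    static final class PendingEdit {
        final String op;
        final CompletableFuture<Long> response = new CompletableFuture<>();
        PendingEdit(String op) { this.op = op; }
    }

    private final BlockingQueue<PendingEdit> queue = new LinkedBlockingQueue<>();
    // Stands in for persistent storage in this sketch.
    private final List<String> durable = Collections.synchronizedList(new ArrayList<>());
    private long txid = 0; // touched only by the edit log thread

    /** Called by a handler: enqueue the edit and return without waiting. */
    CompletableFuture<Long> logEditAsync(String op) {
        PendingEdit edit = new PendingEdit(op);
        queue.add(edit);
        return edit.response;
    }

    int durableCount() { return durable.size(); }

    @Override
    public void run() {
        List<PendingEdit> batch = new ArrayList<>();
        try {
            while (!Thread.currentThread().isInterrupted()) {
                batch.add(queue.take()); // wait for at least one edit
                queue.drainTo(batch);    // then take whatever else is queued
                for (PendingEdit e : batch) {
                    durable.add(e.op);   // simulated write and sync of the batch
                }
                // Responses are released only after the simulated sync.
                for (PendingEdit e : batch) {
                    e.response.complete(++txid);
                }
                batch.clear();
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // treat interrupt as shutdown
        }
    }
}
```

A handler in this sketch would call {{logEditAsync}} and attach the reply to the returned future, so the handler is free as soon as the edit is queued while the client still sees a response only after the sync.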



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
