hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1008) [performance] The replay of logs on server crash takes way too long
Date Thu, 14 May 2009 21:49:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709604#action_12709604
] 

stack commented on HBASE-1008:
------------------------------

I tested it and it works.

Please fix the following when you apply:

There are two lines emitted when HLog is done:

{code}
2009-05-14 21:40:08,467 [HMaster] INFO org.apache.hadoop.hbase.regionserver.HLog: Took 41393ms
2009-05-14 21:40:09,984 [HMaster] INFO org.apache.hadoop.hbase.regionserver.HLog: log file splitting completed for hdfs://aa0-000-12.u.powerset.com:9000/hbasetrunk2/.logs/aa0-000-15.u.powerset.com_1242336420277_60021
{code}

Can the time taken be added to the "file splitting completed" line?
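
A minimal sketch of what the merged line could look like, assuming the split code records a start timestamp before it begins. The class and method names (SplitLogMessage, completed) and the exact message wording are hypothetical and are not taken from the patch:

```java
// Hypothetical helper, not part of HLog: builds a single completion line
// that carries both the elapsed time and the log directory.
final class SplitLogMessage {
    private SplitLogMessage() {}

    static String completed(String logDir, long startMillis, long endMillis) {
        long elapsed = endMillis - startMillis;
        return "log file splitting completed in " + elapsed + "ms for " + logDir;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // ... the actual log splitting would happen here ...
        long end = System.currentTimeMillis();
        // Placeholder path for illustration only.
        System.out.println(completed("hdfs://example:9000/hbase/.logs/server_1", start, end));
    }
}
```

With this shape the separate "Took ...ms" line is no longer needed, since the duration and the path land in one INFO message.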

I think you can name executor threads, which would help with log lines like this:

2009-05-14 21:40:02,309 [pool-1-thread-2] DEBUG org.apache.hadoop.hbase.regionserver.HLog: Thread got 62947 to process
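
One way to name the threads is to hand the executor a custom java.util.concurrent.ThreadFactory. The sketch below is only an illustration: the class name NamedThreadFactory and the "HLog-splitter" prefix are assumptions, and the pool size of 3 simply mirrors the current default mentioned in this comment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory: names threads "<prefix>-1", "<prefix>-2", ...
// instead of the default "pool-N-thread-M".
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        return new Thread(r, prefix + "-" + counter.getAndIncrement());
    }

    public static void main(String[] args) {
        // Pool size 3 is an assumption matching the default discussed here.
        ExecutorService pool =
            Executors.newFixedThreadPool(3, new NamedThreadFactory("HLog-splitter"));
        // Each task reports the name of the thread that runs it.
        pool.submit(() -> System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

Any logging layout that prints the thread name would then show the chosen prefix in place of pool-1-thread-2.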

Who are the edits for?  I'd say add the region name to that line.

Otherwise, looks good.

We still need to rewrite it -- if we crash during this processing we're hosed -- but this is a
nice speedup.  J-D, I'd say up the default number of threads from 3 to 5, or even 10?

Good stuff.

+1 after making the above logging changes.



> [performance] The replay of logs on server crash takes way too long
> -------------------------------------------------------------------
>
>                 Key: HBASE-1008
>                 URL: https://issues.apache.org/jira/browse/HBASE-1008
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.20.0, 0.19.3
>
>         Attachments: 1008-v2.patch, hbase-1008-3.patch, hbase-1008-v4-0.19.patch, hbase-1008-v4.patch
>
>
> Watching recovery from a crash on streamy.com where there were 1048 logs and replay is
> running at a rate of about 20 seconds each.  Meantime these regions are not online.  This is
> way too long to wait on recovery for a live site.  Marking critical.  Performance related
> so priority and in 0.20.0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

