hbase-issues mailing list archives

From "Hadoop QA (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5078) DistributedLogSplitter failing to split file because it has edits for lots of regions
Date Wed, 21 Dec 2011 04:01:31 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173836#comment-13173836
] 

Hadoop QA commented on HBASE-5078:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508195/5078-v4.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/564//console

This message is automatically generated.
                
> DistributedLogSplitter failing to split file because it has edits for lots of regions
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-5078
>                 URL: https://issues.apache.org/jira/browse/HBASE-5078
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 5078-v2.txt, 5078-v3.txt, 5078-v4.txt, 5078.txt
>
>
> While testing the 0.92.0 RC, I ran into an interesting issue: a log file had edits for many regions,
and just opening a file per region took so long that we never updated our progress,
so the split of the log kept failing. In this case, the first 40 edits in a file
required opening 35 files, and opening 35 files took longer than the hard-coded 25 seconds
that "acquiring" the task is supposed to take.
> First, here is master's view:
> {code}
> 2011-12-20 17:54:09,184 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not
yet acquired /hbase/splitlog/hdfs%3A%2F%2Fsv4r11s38%3A7000%2Fhbase%2F.logs%2Fsv4r31s44%2C7003%2C1324365396770-splitting%2Fsv4r31s44%252C7003%252C1324365396770.1324403487679
ver = 0
> ...
> 2011-12-20 17:54:09,233 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/hdfs%3A%2F%2Fsv4r11s38%3A7000%2Fhbase%2F.logs%2Fsv4r31s44%2C7003%2C1324365396770-splitting%2Fsv4r31s44%252C7003%252C1324365396770.1324403487679
acquired by sv4r27s44,7003,1324365396664
> ...
> 2011-12-20 17:54:35,475 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not
yet acquired /hbase/splitlog/hdfs%3A%2F%2Fsv4r11s38%3A7000%2Fhbase%2F.logs%2Fsv4r31s44%2C7003%2C1324365396770-splitting%2Fsv4r31s44%252C7003%252C1324365396770.1324403573033
ver = 3
> {code}
> The master then hands the task to another worker.
> Over on the regionserver we see:
> {code}
> 2011-12-20 17:54:09,233 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker
sv4r27s44,7003,1324365396664 acquired task /hbase/splitlog/hdfs%3A%2F%2Fsv4r11s38%3A7000%2Fhbase%2F.logs%2Fsv4r31s44%2C7003%2C1324365396770-splitting%2Fsv4r31s44%252C7003%252C1324365396770.1324403487679
> ....
> 2011-12-20 17:54:10,714 DEBUG org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter:
Path=hdfs://sv4r11s38:7000/hbase/splitlog/sv4r27s44,7003,1324365396664_hdfs%3A%2F%2Fsv4r11s38%3A7000%2Fhbase%2F.logs%2Fsv4r31s44%2C7003%2C1324365396770-splitting%2Fsv4r31s44%252C7003%252C1324365396770.1324403487679/TestTable/6b6bfc2716dff952435ab26f018648b2/recovered.edits/0000000000000278862.temp, syncFs=true, hflush=false
> ....
> {code}
> .... and so on till:
> {code}
> 2011-12-20 17:54:36,876 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: task
/hbase/splitlog/hdfs%3A%2F%2Fsv4r11s38%3A7000%2Fhbase%2F.logs%2Fsv4r31s44%2C7003%2C1324365396770-splitting%2Fsv4r31s44%252C7003%252C1324365396770.1324403487679
preempted from sv4r27s44,7003,1324365396664, current task state and owner=owned sv4r28s44,7003,1324365396678
> ....
> 2011-12-20 17:54:37,112 WARN org.apache.hadoop.hbase.regionserver.SplitLogWorker: Failed
to heartbeat the task/hbase/splitlog/hdfs%3A%2F%2Fsv4r11s38%3A7000%2Fhbase%2F.logs%2Fsv4r31s44%2C7003%2C1324365396770-splitting%2Fsv4r31s44%252C7003%252C1324365396770.1324403487679
> ....
> {code}
> When the above happened, we'd only processed 40 edits.  As written, we only heartbeat every
1024 edits.
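
One way to picture the problem described above: if a heartbeat is triggered only by an edit count (1024 in the description), a worker that spends its time opening files may go silent for longer than the master's timeout. The sketch below is not taken from any of the attached patches; it is a minimal illustration, assuming a hypothetical `ProgressThrottle` class and a hypothetical `maxSilenceMillis` setting, of reporting progress when either an edit-count threshold or an elapsed-time threshold is reached.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch (not HBase code): decides when a log-splitting worker
 * should heartbeat, using an edit count OR elapsed time, whichever comes first.
 */
class ProgressThrottle {
    private final int editInterval;       // e.g. 1024, as in the description
    private final long maxSilenceMillis;  // assumed knob; should sit well below the master's timeout
    private final LongSupplier clock;     // injectable clock so the logic can be tested
    private int editsSinceReport = 0;
    private long lastReportMillis;
    private int reports = 0;

    ProgressThrottle(int editInterval, long maxSilenceMillis, LongSupplier clock) {
        this.editInterval = editInterval;
        this.maxSilenceMillis = maxSilenceMillis;
        this.clock = clock;
        this.lastReportMillis = clock.getAsLong();
    }

    /** Call after each edit is handled; returns true when a heartbeat is due. */
    boolean editProcessed() {
        editsSinceReport++;
        long now = clock.getAsLong();
        boolean countReached = editsSinceReport >= editInterval;
        boolean silentTooLong = now - lastReportMillis >= maxSilenceMillis;
        if (countReached || silentTooLong) {
            editsSinceReport = 0;
            lastReportMillis = now;
            reports++;
            return true;
        }
        return false;
    }

    /** Number of heartbeats that have been signalled so far. */
    int reports() {
        return reports;
    }
}
```

With a count-only rule, 40 slow edits never reach the 1024 threshold and no heartbeat is sent; with a time bound as well, the worker reports at regular intervals even while edits are slow. The caller would send the actual heartbeat whenever `editProcessed()` returns true.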

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
