hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16931) Setting cell's seqId to zero in compaction flow might cause RS down.
Date Mon, 24 Oct 2016 15:30:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602322#comment-15602322 ]

Yu Li commented on HBASE-16931:
-------------------------------

All of the timed-out cases failed because of OOME (why are OOMEs so frequent?):
{noformat}
Running org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd
Exception in thread "Thread-2475" java.lang.OutOfMemoryError: Java heap space
Running org.apache.hadoop.hbase.TestHBaseOnOtherDfsCluster
Running org.apache.hadoop.hbase.tool.TestCanaryTool
Exception in thread "process reaper" java.lang.OutOfMemoryError: Java heap space
Exception in thread "Thread-2505" java.lang.OutOfMemoryError: Java heap space
Exception in thread "Thread-2507" java.lang.OutOfMemoryError: Java heap space
{noformat}

And the one failed case seems to have hit an environment issue (_Unable to create region directory_):
{noformat}
Running org.apache.hadoop.hbase.mapreduce.TestMultiTableSnapshotInputFormat
Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 23.592 sec <<< FAILURE! - in org.apache.hadoop.hbase.mapreduce.TestMultiTableSnapshotInputFormat
testScanYZYToEmpty(org.apache.hadoop.hbase.mapreduce.TestMultiTableSnapshotInputFormat)  Time elapsed: 0.044 sec  <<< ERROR!
java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: Unable to create region directory: /tmp/scantest1_snapshot__8235bb48-4e7b-4e00-ad80-b2ce716c8522/data/default/scantest1/519e450e89d832d702a416a9bca04b5d
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
	at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:180)
	at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.cloneHdfsRegions(RestoreSnapshotHelper.java:527)
	at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:234)
	at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.restoreHdfsRegions(RestoreSnapshotHelper.java:170)
	at org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper.copySnapshotForScanner(RestoreSnapshotHelper.java:736)
	at org.apache.hadoop.hbase.mapreduce.MultiTableSnapshotInputFormatImpl.restoreSnapshot(MultiTableSnapshotInputFormatImpl.java:249)
	at org.apache.hadoop.hbase.mapreduce.MultiTableSnapshotInputFormatImpl.restoreSnapshots(MultiTableSnapshotInputFormatImpl.java:243)
	at org.apache.hadoop.hbase.mapreduce.MultiTableSnapshotInputFormatImpl.setInput(MultiTableSnapshotInputFormatImpl.java:80)
	at org.apache.hadoop.hbase.mapreduce.MultiTableSnapshotInputFormat.setInput(MultiTableSnapshotInputFormat.java:106)
	at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initMultiTableSnapshotMapperJob(TableMapReduceUtil.java:319)
	at org.apache.hadoop.hbase.mapreduce.TestMultiTableSnapshotInputFormat.initJob(TestMultiTableSnapshotInputFormat.java:72)
{noformat}

I ran the above 4 cases locally and confirmed that they all pass.

> Setting cell's seqId to zero in compaction flow might cause RS down.
> --------------------------------------------------------------------
>
>                 Key: HBASE-16931
>                 URL: https://issues.apache.org/jira/browse/HBASE-16931
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 2.0.0
>            Reporter: binlijin
>            Assignee: binlijin
>            Priority: Critical
>         Attachments: HBASE-16931-master.patch, HBASE-16931.branch-1.patch, HBASE-16931.branch-1.v2.patch, HBASE-16931_master_v2.patch, HBASE-16931_master_v3.patch, HBASE-16931_master_v4.patch, HBASE-16931_master_v5.patch
>
>
> Compactor#performCompaction
>       do {
>         hasMore = scanner.next(cells, scannerContext);
>         // output to writer:
>         for (Cell c : cells) {
>           if (cleanSeqId && c.getSequenceId() <= smallestReadPoint) {
>             CellUtil.setSequenceId(c, 0);
>           }
>           writer.append(c);
>         }
>         cells.clear();
>       } while (hasMore);
> scanner.next returns at most "hbase.hstore.compaction.kv.max" KVs per call, and the last
> cell returned is still referenced by StoreScanner.prevCell. So if that cell's seqId is reset
> to zero (cleanSeqId is true), the next scanner.next call may throw an exception from
> StoreScanner.checkScanOrder and bring the regionserver down.
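
The aliasing problem described above can be reproduced in isolation. The following is a minimal sketch, not HBase code: DemoCell, DemoScanner, and orderCheckFails are hypothetical stand-ins, and the ordering (key ascending, then seqId descending) is an assumption made for illustration. It shows how zeroing the seqId of a cell that the scanner still holds as its previous cell can make a later order check fail, while mutating a copy does not.

```java
import java.util.Comparator;

// Simplified model of the aliasing problem. These classes are hypothetical
// stand-ins for illustration, not HBase's real Cell or StoreScanner.
class SeqIdAliasingDemo {

    // A mutable cell: a key plus a sequence id.
    static final class DemoCell {
        final String key;
        long seqId;

        DemoCell(String key, long seqId) {
            this.key = key;
            this.seqId = seqId;
        }
    }

    // Assumed ordering: key ascending, then seqId descending (newer first).
    static final Comparator<DemoCell> ORDER = (a, b) -> {
        int byKey = a.key.compareTo(b.key);
        return byKey != 0 ? byKey : Long.compare(b.seqId, a.seqId);
    };

    // Scanner stand-in that remembers the last cell it returned.
    static final class DemoScanner {
        private final DemoCell[] cells;
        private int pos = 0;
        private DemoCell prevCell;

        DemoScanner(DemoCell... cells) {
            this.cells = cells;
        }

        DemoCell next() {
            DemoCell c = cells[pos++];
            // Order check against the previously returned cell.
            if (prevCell != null && ORDER.compare(prevCell, c) > 0) {
                throw new IllegalStateException("cells out of order");
            }
            prevCell = c;
            return c;
        }
    }

    // Returns true if the scanner's order check fails on the second next().
    static boolean orderCheckFails(boolean mutateReturnedCell) {
        DemoScanner scanner =
            new DemoScanner(new DemoCell("row1", 5), new DemoCell("row1", 3));
        DemoCell first = scanner.next();
        if (mutateReturnedCell) {
            // Mutates the same object the scanner still holds as prevCell.
            first.seqId = 0;
        } else {
            // Mutates a private copy; the scanner's prevCell is untouched.
            DemoCell copy = new DemoCell(first.key, first.seqId);
            copy.seqId = 0;
        }
        try {
            scanner.next();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("mutate in place: " + orderCheckFails(true));
        System.out.println("mutate a copy:   " + orderCheckFails(false));
    }
}
```

Copying is only one way to break the aliasing in this toy model; it is not meant to describe what the attached patches do.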



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
