hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18168) NoSuchElementException when rolling the log
Date Tue, 06 Jun 2017 11:01:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16038617#comment-16038617 ]

Hadoop QA commented on HBASE-18168:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 51s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} branch-1.1 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} branch-1.1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 12s {color} | {color:red} hbase-server in branch-1.1 has 80 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 33s {color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.8.0_131. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 13m 57s {color} | {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 24s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_131. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 90m 24s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 2s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 122m 16s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestAssignmentManager |
|   | hadoop.hbase.mapreduce.TestCellCounter |
| Timed out junit tests | org.apache.hadoop.hbase.wal.TestWALSplitCompressed |
|   | org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap |
|   | org.apache.hadoop.hbase.util.TestMiniClusterLoadEncoded |
|   | org.apache.hadoop.hbase.snapshot.TestSecureExportSnapshot |
|   | org.apache.hadoop.hbase.snapshot.TestExportSnapshot |
|   | org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase |
|   | org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat |
|   | org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat |
|   | org.apache.hadoop.hbase.wal.TestWALSplit |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.10.1 Server=1.10.1 Image:yetus/hbase:70789c7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871527/HBASE-18168-branch-1.1.patch |
| JIRA Issue | HBASE-18168 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 72aacde2bed7 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | branch-1.1 / 70789c7 |
| Default Java | 1.7.0_131 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131 |
| findbugs | v3.0.0 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7098/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html |
| javadoc | https://builds.apache.org/job/PreCommit-HBASE-Build/7098/artifact/patchprocess/branch-javadoc-hbase-server-jdk1.8.0_131.txt |
| javadoc | https://builds.apache.org/job/PreCommit-HBASE-Build/7098/artifact/patchprocess/patch-javadoc-hbase-server-jdk1.8.0_131.txt |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7098/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs |  https://builds.apache.org/job/PreCommit-HBASE-Build/7098/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7098/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7098/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> NoSuchElementException when rolling the log
> -------------------------------------------
>
>                 Key: HBASE-18168
>                 URL: https://issues.apache.org/jira/browse/HBASE-18168
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.11
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-18168-branch-1.1.patch, HBASE-18168-branch-1.1.v2.patch
>
>
> Today, one of our servers aborted with the following log.
> {code}
> 2017-06-06 05:38:47,142 ERROR [regionserver/xxxx.logRoller] regionserver.LogRoller: Log rolling failed
> java.util.NoSuchElementException
>         at java.util.concurrent.ConcurrentSkipListMap$Iter.advance(ConcurrentSkipListMap.java:2224)
>         at java.util.concurrent.ConcurrentSkipListMap$ValueIterator.next(ConcurrentSkipListMap.java:2253)
>         at java.util.Collections.min(Collections.java:628)
>         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.findEligibleMemstoresToFlush(FSHLog.java:861)
>         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.findRegionsToForceFlush(FSHLog.java:886)
>         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:728)
>         at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:137)
>         at java.lang.Thread.run(Thread.java:756)
> 2017-06-06 05:38:47,142 FATAL [regionserver/xxxx.logRoller] regionserver.HRegionServer: ABORTING region server xxxx: Log rolling failed
> java.util.NoSuchElementException
> ......
> {code}
> The code is here: 
> {code}
> private byte[][] findEligibleMemstoresToFlush(Map<byte[], Long> regionsSequenceNums) {
>     List<byte[]> regionsToFlush = null;
>     // Keeping the old behavior of iterating unflushedSeqNums under oldestSeqNumsLock.
>     synchronized (regionSequenceIdLock) {
>       for (Map.Entry<byte[], Long> e: regionsSequenceNums.entrySet()) {
>         ConcurrentMap<byte[], Long> m =
>             this.oldestUnflushedStoreSequenceIds.get(e.getKey());
>         if (m == null) {
>           continue;
>         }
>         long unFlushedVal = Collections.min(m.values()); // The exception is thrown here
>         ......
> {code}
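
As a quick check on the failing call in the excerpt above: `Collections.min` throws `NoSuchElementException` when handed an empty collection, including the `values()` view of an empty `ConcurrentSkipListMap`. The following is a minimal standalone sketch, not HBase code; the class name `EmptyMinDemo`, the helper `minThrows`, and the `String` keys (used in place of `byte[]` so that no HBase comparator is needed) are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class; it only exercises JDK collections.
public class EmptyMinDemo {

    /** Returns true if Collections.min throws NoSuchElementException for the map's values. */
    public static boolean minThrows(ConcurrentMap<String, Long> m) {
        try {
            Collections.min(m.values());
            return false;
        } catch (NoSuchElementException e) {
            // Collections.min calls next() on the iterator without checking hasNext() first.
            return true;
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Long> m = new ConcurrentSkipListMap<>();
        System.out.println("empty map throws: " + minThrows(m));

        m.put("family", 42L);
        System.out.println("non-empty map throws: " + minThrows(m));
    }
}
```
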
> The only explanation I can think of for the NoSuchElementException is that the map 'm' is empty.
> I then looked at all the code that updates 'oldestUnflushedStoreSequenceIds'. Every update to
> 'oldestUnflushedStoreSequenceIds' is guarded by synchronization on 'regionSequenceIdLock'
> except here:
> {code}
> private ConcurrentMap<byte[], Long> getOrCreateOldestUnflushedStoreSequenceIdsOfRegion(
>       byte[] encodedRegionName) {
>     ......
>     oldestUnflushedStoreSequenceIdsOfRegion =
>         new ConcurrentSkipListMap<byte[], Long>(Bytes.BYTES_COMPARATOR);
>     ConcurrentMap<byte[], Long> alreadyPut =
>         oldestUnflushedStoreSequenceIds.putIfAbsent(encodedRegionName,
>           oldestUnflushedStoreSequenceIdsOfRegion); // Here, an empty map may be put into 'oldestUnflushedStoreSequenceIds' with no synchronization
>     return alreadyPut == null ? oldestUnflushedStoreSequenceIdsOfRegion : alreadyPut;
>   }
> {code}
> This should be a very rare bug, but it can cause the region server to abort. It exists only in branch-1.1.
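
To make the scenario described above concrete, here is a hedged sketch of one defensive option on the reading side: compute the minimum in a single pass over the inner map and skip a region whose map has no entries yet. It is not taken from the attached patches, and it is not the HBase implementation. The class `EmptyMapGuardSketch`, the methods `getOrCreate` and `findEligible`, the `threshold` parameter, and the `String` keys are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for the structures in the report; not HBase code.
public class EmptyMapGuardSketch {
    // Plays the role of oldestUnflushedStoreSequenceIds: region -> (store -> sequence id).
    private final ConcurrentMap<String, ConcurrentMap<String, Long>> oldestIds =
        new ConcurrentHashMap<>();

    /** Mirrors the unsynchronized putIfAbsent in the report: may publish an empty inner map. */
    public ConcurrentMap<String, Long> getOrCreate(String region) {
        ConcurrentMap<String, Long> fresh = new ConcurrentSkipListMap<>();
        ConcurrentMap<String, Long> existing = oldestIds.putIfAbsent(region, fresh);
        return existing == null ? fresh : existing;
    }

    /**
     * Returns the regions whose lowest recorded sequence id is at or below the threshold.
     * A region whose inner map is empty is skipped instead of raising an exception.
     */
    public List<String> findEligible(long threshold) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, ConcurrentMap<String, Long>> e : oldestIds.entrySet()) {
            boolean sawValue = false;
            long min = Long.MAX_VALUE;
            // Single traversal: avoids a separate isEmpty() check followed by Collections.min,
            // which could still race with a concurrent removal.
            for (long v : e.getValue().values()) {
                sawValue = true;
                min = Math.min(min, v);
            }
            if (sawValue && min <= threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        EmptyMapGuardSketch s = new EmptyMapGuardSketch();
        s.getOrCreate("r1");                 // empty inner map, as in the suspected race
        s.getOrCreate("r2").put("cf", 5L);
        s.getOrCreate("r3").put("cf", 50L);
        System.out.println(s.findEligible(10L));
    }
}
```

The alternative suggested by the report itself is to perform the `putIfAbsent` under the same lock as the other updates, so that a reader holding that lock never observes an empty inner map; which approach the patches take is not stated in this message.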




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
