Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Date: Fri, 11 Nov 2016 01:06:59 +0000 (UTC)
From: "Andrew Purtell (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17069) RegionServer writes invalid META entries for split daughters in some circumstances
archived-at: Fri, 11 Nov 2016 01:07:02 -0000

    [ https://issues.apache.org/jira/browse/HBASE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15655722#comment-15655722 ]

Andrew Purtell commented on HBASE-17069:
----------------------------------------

I can't say, [~mantonov]. I haven't tested 1.3 and up. Pretty busy here.
I'm assuming you have not seen it? I will aim to get a test of the latest 1.3.0 done tomorrow, but I can't promise it.

> RegionServer writes invalid META entries for split daughters in some circumstances
> ----------------------------------------------------------------------------------
>
> Key: HBASE-17069
> URL: https://issues.apache.org/jira/browse/HBASE-17069
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.2.4
> Reporter: Andrew Purtell
> Priority: Critical
> Attachments: daughter_1_d55ef81c2f8299abbddfce0445067830.log, daughter_2_08629d59564726da2497f70451aafcdb.log, logs.tar.gz, parent-393d2bfd8b1c52ce08540306659624f2.log
>
>
> I have been seeing frequent ITBLL failures testing various versions of 1.2.x.
> Over the lifetime of 1.2.x the following issues have been fixed:
> - HBASE-15315 (Remove always set super user call as high priority)
> - HBASE-16093 (Fix splits failed before creating daughter regions leave meta inconsistent)
> And this one is pending:
> - HBASE-17044 (Fix merge failed before creating merged region leaves meta inconsistent)
> I can apply all of the above to branch-1.2 and still see this failure:
> *The life of stillborn region d55ef81c2f8299abbddfce0445067830*
> *Master sees SPLITTING_NEW*
> {noformat}
> 2016-11-08 04:23:21,186 INFO [AM.ZK.Worker-pool2-t82] master.RegionStates: Transition null to {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW, ts=1478579001186, server=node-3.cluster,16020,1478578389506}
> {noformat}
> *The RegionServer creates it*
> {noformat}
> 2016-11-08 04:23:26,035 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created cacheConfig for GomnU: blockCache=LruBlockCache{blockCount=34, currentSize=14996112, freeSize=12823716208, maxSize=12838712320, heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> 2016-11-08 04:23:26,038 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created cacheConfig for big: blockCache=LruBlockCache{blockCount=34, currentSize=14996112, freeSize=12823716208, maxSize=12838712320, heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> 2016-11-08 04:23:26,442 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created cacheConfig for meta: blockCache=LruBlockCache{blockCount=63, currentSize=17187656, freeSize=12821524664, maxSize=12838712320, heapSize=17187656, minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> 2016-11-08 04:23:26,713 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created cacheConfig for nwmrW: blockCache=LruBlockCache{blockCount=96, currentSize=19178440, freeSize=12819533880, maxSize=12838712320, heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> 2016-11-08 04:23:26,715 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created cacheConfig for piwbr: blockCache=LruBlockCache{blockCount=96, currentSize=19178440, freeSize=12819533880, maxSize=12838712320, heapSize=19178440, minSize=12196776960, minFactor=0.95,
multiSize=6098388480, multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> 2016-11-08 04:23:26,717 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created cacheConfig for tiny: blockCache=LruBlockCache{blockCount=96, currentSize=19178440, freeSize=12819533880, maxSize=12838712320, heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5, singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> {noformat}
> *The RegionServer onlines it*
> {noformat}
> 2016-11-08 04:23:27,015 INFO [node-3.cluster,16020,1478578389506-daughterOpener=d55ef81c2f8299abbddfce0445067830] regionserver.HRegion: Onlined d55ef81c2f8299abbddfce0445067830; next sequenceid=19184
> 2016-11-08 04:23:27,029 INFO [regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099] regionserver.HRegionServer: Post open deploy tasks for IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
> 2016-11-08 04:23:27,047 INFO [regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099] hbase.MetaTableAccessor: Updated row IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
with server=node-3.cluster,16020,1478578389506
> {noformat}
> *The Master transitions state from SPLITTING_NEW to OPEN*
> {noformat}
> 2016-11-08 04:23:27,058 INFO [AM.ZK.Worker-pool2-t84] master.RegionStates: Transition {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW, ts=1478579007057, server=node-3.cluster,16020,1478578389506} to {d55ef81c2f8299abbddfce0445067830 state=OPEN, ts=1478579007058, server=node-3.cluster,16020,1478578389506}
> 2016-11-08 04:23:27,059 INFO [AM.ZK.Worker-pool2-t84] master.AssignmentManager: Handled SPLIT event; parent=IntegrationTestBigLinkedList,,1478577020916.393d2bfd8b1c52ce08540306659624f2., daughter a=IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830., daughter b=IntegrationTestBigLinkedList,/\xFB\x14,1478579001155.08629d59564726da2497f70451aafcdb., on node-3.cluster,16020,1478578389506
> {noformat}
> *RegionServer updates META - BUT APPARENTLY NOT CORRECTLY*
> {noformat}
> 2016-11-08 04:23:27,165 INFO [regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099] regionserver.SplitRequest: Region split, hbase:meta updated, and report to master. Parent=IntegrationTestBigLinkedList,,1478577020916.393d2bfd8b1c52ce08540306659624f2., new regions: IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830., IntegrationTestBigLinkedList,/\xFB\x14,1478579001155.08629d59564726da2497f70451aafcdb.. Split took 6sec
> {noformat}
> *RegionServer delays flush*
> (Is this important?)
> {noformat}
> 2016-11-08 04:24:14,639 WARN [MemStoreFlusher.0] regionserver.MemStoreFlusher: Region IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
has too many store files; delaying flush up to 90000ms
> {noformat}
> *Immediate warnings about No serialized HRegionInfo*
> {noformat}
> 2016-11-08 04:24:44,691 WARN [B.defaultRpcServer.handler=26,queue=2,port=16000] hbase.MetaTableAccessor: No serialized HRegionInfo in keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478579007029/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478579007029/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478579007029/Put/vlen=8/seqid=0}
> {noformat}
> *Master is not happy either*
> {noformat}
> 2016-11-08 04:24:51,148 WARN [MASTER_TABLE_OPERATIONS-node-1:16000-0] hbase.MetaTableAccessor: No serialized HRegionInfo in keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478579007029/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478579007029/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478579007029/Put/vlen=8/seqid=0}
> {noformat}
> *TestRunner MetaScanner complains about invalid entries in META missing HRegionInfo*
> {noformat}
> (standard input):9086:2016-11-08 05:04:17,230 WARN [B.defaultRpcServer.handler=4,queue=1,port=16000] hbase.MetaTableAccessor: No serialized HRegionInfo in keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> {noformat}
> *ITBLL MapReduce tasks fail because part of the
keyspace cannot be located:*
> {noformat}
> java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList, row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1293)
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1185)
> at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:410)
> at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:359)
> at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
> at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:154)
> at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:121)
> at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.persist(IntegrationTestBigLinkedList.java:486)
> at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:431)
> at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:375)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
> at java.security.AccessController.doPrivileged(Native Method)
> at
javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000008/syslog:920:java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList, row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000010/syslog:920:java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList, row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000011/syslog:909:java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList, row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
> {noformat}
>
{noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000030/syslog:48:java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList, row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000048/syslog:48:java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList, row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0, IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
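
The warnings quoted in the issue all show the same pattern: the hbase:meta row for daughter d55ef81c2f8299abbddfce0445067830 carries info:seqnumDuringOpen, info:server and info:serverstartcode cells, but no serialized HRegionInfo. Below is a minimal sketch of how collected logs could be scanned for rows in that state. It assumes the HRegionInfo cell would normally appear under the qualifier info:regioninfo (that name does not appear in the logs above), and the helper names parse_keyvalues and missing_regioninfo are hypothetical. It only parses log text and does not talk to a cluster.

```python
import re

# One cell as printed inside "keyvalues={...}":
#   <row>/<family>:<qualifier>/<timestamp>/<type>/vlen=<n>/seqid=<n>
# The row key may itself contain commas and slashes, so it is matched lazily
# and the family:qualifier part is used as the delimiter.
CELL = re.compile(
    r"(?P<row>.+?)/(?P<family>\w+):(?P<qualifier>\w+)"
    r"/(?P<ts>\d+)/(?P<type>\w+)/vlen=(?P<vlen>\d+)/seqid=(?P<seqid>\d+)(?:, |$)"
)

MARKER = "keyvalues={"


def parse_keyvalues(line):
    """Return a list of dicts, one per cell found in a keyvalues={...} dump."""
    start = line.find(MARKER)
    end = line.rfind("}")
    if start < 0 or end < start:
        return []
    body = line[start + len(MARKER):end]
    return [m.groupdict() for m in CELL.finditer(body)]


def missing_regioninfo(line):
    """True if the line dumps a meta row that has cells but no info:regioninfo.

    Assumption: the serialized HRegionInfo lives under info:regioninfo.
    """
    cells = parse_keyvalues(line)
    present = {(c["family"], c["qualifier"]) for c in cells}
    return bool(cells) and ("info", "regioninfo") not in present


# Row key and warning copied from the log excerpt in the issue description.
ROW = "IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830."
SAMPLE = (
    "2016-11-08 04:24:44,691 WARN [B.defaultRpcServer.handler=26,queue=2,port=16000] "
    "hbase.MetaTableAccessor: No serialized HRegionInfo in keyvalues={"
    + ROW + "/info:seqnumDuringOpen/1478579007029/Put/vlen=8/seqid=0, "
    + ROW + "/info:server/1478579007029/Put/vlen=20/seqid=0, "
    + ROW + "/info:serverstartcode/1478579007029/Put/vlen=8/seqid=0}"
)

if __name__ == "__main__":
    for cell in parse_keyvalues(SAMPLE):
        print(cell["row"], cell["family"] + ":" + cell["qualifier"], cell["ts"])
    print("missing regioninfo:", missing_regioninfo(SAMPLE))
```

Running the same check over the master, RegionServer and MapReduce container logs attached to the issue would give a quick list of every row key reported in this state, which may help confirm whether only this one daughter is affected.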