hbase-user mailing list archives

From Guanghao Zhang <zghao...@gmail.com>
Subject Re: Major compaction cannot remove deleted rows until the region is split. Strange!
Date Mon, 13 Jun 2016 10:45:27 GMT
We added some logging in our production cluster.
2016-06-06,21:37:28,443 INFO org.apache.hadoop.hbase.regionserver.HRegion:
scannerReadPoints size is 0 and smallest read point is 4037995
2016-06-06,21:37:51,429 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
java.io.IOException: Could not seek StoreFileScanner
2016-06-06,21:38:02,103 INFO org.apache.hadoop.hbase.regionserver.HRegion:
scannerReadPoints size is 1 and smallest read point is 4038037
2016-06-13,17:17:10,516 INFO org.apache.hadoop.hbase.regionserver.HRegion:
scannerReadPoints size is 1 and smallest read point is 4038037

When a new RegionScanner is created, it adds a scanner read point to
scannerReadPoints. But if we get an exception after the read point is added,
the read point stays in the region server, and deletes newer than this MVCC
number will never be compacted away.
Created an issue about this: https://issues.apache.org/jira/browse/HBASE-16012
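The failure mode can be sketched with a toy model (hypothetical names and a plain ConcurrentHashMap stand-in; this is not the actual HRegion code): the read point is registered first, and if the subsequent seek throws, nothing removes the entry, so the smallest read point is pinned forever.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the leak: a read point registered in scannerReadPoints is
// never removed when scanner construction fails partway through.
public class ReadPointLeakDemo {
    static final Map<String, Long> scannerReadPoints = new ConcurrentHashMap<>();

    // Buggy shape: register the read point, then do work that may throw.
    static void openScannerLeaky(String name, long readPoint, boolean seekFails) {
        scannerReadPoints.put(name, readPoint);  // registered first
        if (seekFails) {
            // on failure, the entry above leaks
            throw new RuntimeException("Could not seek StoreFileScanner");
        }
    }

    // Fixed shape: roll back the registration if anything after it throws.
    static void openScannerSafe(String name, long readPoint, boolean seekFails) {
        scannerReadPoints.put(name, readPoint);
        try {
            if (seekFails) {
                throw new RuntimeException("Could not seek StoreFileScanner");
            }
        } catch (RuntimeException e) {
            scannerReadPoints.remove(name);  // undo the registration before rethrowing
            throw e;
        }
    }

    // Smallest read point still held; deletes newer than this cannot be
    // removed by compaction.
    static long smallestReadPoint(long defaultPoint) {
        return scannerReadPoints.values().stream().min(Long::compare).orElse(defaultPoint);
    }
}
```

With the leaky shape, a single failed openScanner pins the smallest read point at the leaked value; with the safe shape, the map returns to empty and compaction sees the default read point again.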

2016-06-10 3:08 GMT+08:00 Stack <stack@duboce.net>:

> Tian-ying:
>
> Can you try the suggestion below to see if it helps?
>
> Do you see the lease expired logs?
>
> St.Ack
>
>
> On Thu, Jun 2, 2016 at 7:03 PM, hao zhang <zghaobac@gmail.com> wrote:
>
> > Hi, Stack
> >
> >   We found this in our production cluster, too.  I took a look at the
> > code and found one case which will make major compaction not work.
> >
> >   1. put some rows
> >   2. scan
> >   3. delete
> >   4. the scanner is not closed properly, so its read point stays in the
> > region's scannerReadPoints.
> >   5. major compact. It can't work because there is still a scanner with a
> > smaller read point than the delete.
> >
> >   But if we move the region to a new RS, the scannerReadPoints will be
> > rebuilt from the biggest memstoreTs found in the storefiles, so major
> > compaction will work.
> >   I think the try...catch block in the Scanner.leaseExpired() method needs
> > a finally block to close the region scanner properly.
> >
> >     public void leaseExpired() {
> >       RegionScannerHolder rsh = scanners.remove(this.scannerName);
> >       if (rsh != null) {
> >         RegionScanner s = rsh.s;
> >         LOG.info("Scanner " + this.scannerName + " lease expired on region "
> >           + s.getRegionInfo().getRegionNameAsString());
> >         try {
> >           Region region =
> >             regionServer.getRegion(s.getRegionInfo().getRegionName());
> >           if (region != null && region.getCoprocessorHost() != null) {
> >             region.getCoprocessorHost().preScannerClose(s);
> >           }
> >           s.close();
> >           if (region != null && region.getCoprocessorHost() != null) {
> >             region.getCoprocessorHost().postScannerClose(s);
> >           }
> >         } catch (IOException e) {
> >           LOG.error("Closing scanner for "
> >             + s.getRegionInfo().getRegionNameAsString(), e);
> >         }
> >       } else {
> >         LOG.warn("Scanner " + this.scannerName + " lease expired, but no related"
> >           + " scanner found, hence no chance to close that related scanner!");
> >       }
> >     }
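The suggested restructure can be sketched with toy stand-in types (Hook and closeScanner are hypothetical names, not the actual HBase patch): the point is that a finally guarantees close() runs even when a coprocessor hook throws, so the scanner, and with it the read point, is always released.

```java
import java.io.Closeable;
import java.io.IOException;

// Toy sketch of the suggested fix: guarantee close() even if a pre/post
// hook throws. Types are simplified stand-ins, not HBase classes.
public class LeaseExpiredSketch {
    interface Hook { void run() throws IOException; }

    static boolean closeScanner(Closeable scanner, Hook preClose, Hook postClose) {
        boolean closed = false;
        try {
            preClose.run();       // may throw, e.g. a coprocessor error
            scanner.close();
            closed = true;
            postClose.run();
        } catch (IOException e) {
            // log and fall through; the finally below still closes the scanner
        } finally {
            if (!closed) {
                try { scanner.close(); closed = true; } catch (IOException ignored) {}
            }
        }
        return closed;
    }
}
```

In the original shape, an exception from preScannerClose() skips s.close() entirely, which is exactly the leak described above.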
> >
> >
> > 2016-06-02 2:50 GMT+08:00 Stack <stack@duboce.net>:
> >
> > > On Wed, Jun 1, 2016 at 10:56 AM, Tianying Chang <tychang@gmail.com>
> > > wrote:
> > >
> > > > Hi, Stack
> > > >
> > > > After moving the region and issuing a major compact on that region,
> > > > its size shrank from 99G down to 24G. So it looks like the region was
> > > > in a bad state that it could not recover from, and close/open fixed
> > > > the issue. From the region size metric graph, we can see major
> > > > compaction stopped working since March 31, so some bug caused the
> > > > region to enter the bad state... Unfortunately, we don't have DEBUG
> > > > enabled and that was the last region with the issue, so it is hard to
> > > > figure out which bug caused the bad state...
> > > >
> > > >
> > > Interesting. So moving it to another RS makes it major-compactable? That
> > > would seem to indicate some state kept in the RS memory is preventing the
> > > major compaction from running. Is moving the region a workaround for you
> > > until we figure out what it is, Tian-Ying?
> > >
> > > St.
> > >
> > >
> > >
> > > > Thanks
> > > > Tian-Ying
> > > >
> > > > > On Tue, May 31, 2016 at 3:43 PM, Tianying Chang <tychang@gmail.com>
> > > > > wrote:
> > > >
> > > > > Hi, Stack
> > > > >
> > > > > Based on the log, the major compaction was run, and it took 5+
> > > > > hours. I also manually ran major_compact from the hbase shell
> > > > > explicitly to verify.
> > > > >
> > > > > I just moved the region to a different RS and issued a major_compact
> > > > > on that region again; let me see if the major compaction can succeed,
> > > > > and I will report back.
> > > > >
> > > > > Thanks
> > > > > Tian-Ying
> > > > >
> > > > > On Sun, May 29, 2016 at 4:35 PM, Stack <stack@duboce.net> wrote:
> > > > >
> > > > >> On Fri, May 27, 2016 at 3:17 PM, Tianying Chang <tychang@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >> > Yes, it is 94.26.  At a quick glance, I didn't see any put that
> > > > >> > is older than the delete marker's TS, which could go back about a
> > > > >> > couple of weeks, since it seems major compaction hasn't worked on
> > > > >> > it for a long time.
> > > > >> >
> > > > >> > Also it is really strange that if the region is split, then
> > > > >> > everything seems to work as expected. Also we noticed that the
> > > > >> > same region replicated on the slave side is totally normal, i.e.
> > > > >> > at 20+G....
> > > > >> >
> > > > >> If you move the region to another server, does that work?
> > > > >>
> > > > >> Looking in the 0.94 codebase, I see this in Compactor#compact:
> > > > >>
> > > > >>       // For major compactions calculate the earliest put timestamp
> > > > >>       // of all involved storefiles. This is used to remove
> > > > >>       // family delete marker during the compaction.
> > > > >>       if (majorCompaction) {
> > > > >>         tmp = fileInfo.get(StoreFile.EARLIEST_PUT_TS);
> > > > >>         if (tmp == null) {
> > > > >>           // There's a file with no information, must be an old one
> > > > >>           // assume we have very old puts
> > > > >>           earliestPutTs = HConstants.OLDEST_TIMESTAMP;
> > > > >>         } else {
> > > > >>           earliestPutTs = Math.min(earliestPutTs, Bytes.toLong(tmp));
> > > > >>         }
> > > > >>       }
> > > > >>
> > > > >> The above is followed by this log line:
> > > > >>
> > > > >>       if (LOG.isDebugEnabled()) {
> > > > >>         LOG.debug("Compacting " + file +
> > > > >>           ", keycount=" + keyCount +
> > > > >>           ", bloomtype=" + r.getBloomFilterType().toString() +
> > > > >>           ", size=" + StringUtils.humanReadableInt(r.length()) +
> > > > >>           ", encoding=" + r.getHFileReader().getEncodingOnDisk() +
> > > > >>           (majorCompaction ? ", earliestPutTs=" + earliestPutTs : ""));
> > > > >>       }
> > > > >>
> > > > >> This prints out earliestPutTs. You see that in the logs? You are
> > > > >> running with DEBUG? Does the earliest put ts preclude our dropping
> > > > >> the delete family?
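The min-over-storefiles logic quoted above can be condensed into a small self-contained sketch (EarliestPutTs is a hypothetical class; OLDEST_TIMESTAMP is assumed here to be Long.MIN_VALUE, standing in for HConstants.OLDEST_TIMESTAMP):

```java
import java.util.List;

// Toy sketch of the quoted earliestPutTs logic: take the minimum
// EARLIEST_PUT_TS across storefiles; a file missing the field (an old
// file) forces the oldest possible timestamp, so its puts are assumed
// very old.
public class EarliestPutTs {
    // Stand-in for HConstants.OLDEST_TIMESTAMP (assumed Long.MIN_VALUE here).
    static final long OLDEST_TIMESTAMP = Long.MIN_VALUE;

    // One entry per storefile; null models a file with no EARLIEST_PUT_TS info.
    static long earliestPutTs(List<Long> perFileTs) {
        long earliest = Long.MAX_VALUE;
        for (Long tmp : perFileTs) {
            if (tmp == null) {
                // There's a file with no information, must be an old one;
                // assume we have very old puts.
                earliest = OLDEST_TIMESTAMP;
            } else {
                earliest = Math.min(earliest, tmp);
            }
        }
        return earliest;
    }
}
```

A single old storefile with no EARLIEST_PUT_TS info thus drags earliestPutTs all the way down, which in turn forces family delete markers to be retained.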
> > > > >>
> > > > >>
> > > > >> Looking more in the code, we retain deletes in the following
> > > > >> circumstances:
> > > > >>
> > > > >>     this.retainDeletesInOutput =
> > > > >>         scanType == ScanType.MINOR_COMPACT || scan.isRaw();
> > > > >>
> > > > >> So, are we sure we are running a major compaction?
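That one-liner can be mirrored in a tiny self-contained sketch (hypothetical names; a toy enum standing in for the real ScanType) to make the retention rule explicit:

```java
// Toy mirror of the quoted condition: deletes are written back to the
// compaction output only for minor compactions or raw scans, so a true
// major compaction with a normal scan should drop them.
public class RetainDeletes {
    enum ScanType { MINOR_COMPACT, MAJOR_COMPACT }

    static boolean retainDeletesInOutput(ScanType scanType, boolean isRawScan) {
        return scanType == ScanType.MINOR_COMPACT || isRawScan;
    }
}
```

So if delete markers survive the compaction, either it was not really a major compaction, or something else (like a pinned read point) prevented them from being dropped.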
> > > > >>
> > > > >> Otherwise, I'll have to dig in a bit more here... This stuff is a
> > > > >> little involved.
> > > > >> St.Ack
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> > On Fri, May 27, 2016 at 3:13 PM, Stack <stack@duboce.net> wrote:
> > > > >> >
> > > > >> > > On Fri, May 27, 2016 at 2:32 PM, Tianying Chang <tychang@gmail.com>
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > Hi,
> > > > >> > > >
> > > > >> > > > We saw a very strange case in one of our production
> > > > >> > > > clusters. A couple of regions cannot get their deleted rows
> > > > >> > > > or delete markers removed even after major compaction.
> > > > >> > > > However, when the region triggered a split (we set 100G for
> > > > >> > > > auto split), the deletion worked. The 100G region became two
> > > > >> > > > 10G daughter regions, and all the delete markers are gone.
> > > > >> > > >
> > > > >> > > > Also, the same region in the slave cluster (through
> > > > >> > > > replication) has a normal size at about 20+G.
> > > > >> > > >
> > > > >> > > > BTW, the delete markers in the regions are mostly
> > > > >> > > > deleteFamily, if it matters.
> > > > >> > > >
> > > > >> > > > This is really weird. Anyone have any clue about this strange
> > > > >> > > > behavior?
> > > > >> > > >
> > > > >> > > > Thanks
> > > > >> > > > Tian-Ying
> > > > >> > > >
> > > > >> > > These are 0.94, Tian-Ying?
> > > > >> > >
> > > > >> > > It looks like only the DeleteFamily is retained; do you see any
> > > > >> > > incidence where there may have been versions older than the
> > > > >> > > DeleteFamily that are also retained post-major-compaction?
> > > > >> > >
> > > > >> > > St.Ack
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > > > A snippet of the HFile generated by the major compaction:
> > > > >> > > >
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459808114380/DeleteFamily/vlen=0/ts=2292870047
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459808114011/DeleteFamily/vlen=0/ts=2292869794
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459805381104/DeleteFamily/vlen=0/ts=2291072240
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459805380673/DeleteFamily/vlen=0/ts=2291071997
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459802643449/DeleteFamily/vlen=0/ts=2290248886
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459802643246/DeleteFamily/vlen=0/ts=2290248786
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459799913003/DeleteFamily/vlen=0/ts=2289446916
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459797181831/DeleteFamily/vlen=0/ts=2288670451
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459794447388/DeleteFamily/vlen=0/ts=2287911443
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459791708803/DeleteFamily/vlen=0/ts=2287213792
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459788978387/DeleteFamily/vlen=0/ts=2286488738
> > > > >> > > > V:
> > > > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459786243642/DeleteFamily/vlen=0/ts=2285778927
> > > > >> > > > V:
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
