hbase-user mailing list archives

From Sergey Shelukhin <ser...@hortonworks.com>
Subject Re: Why is this region compacting?
Date Wed, 25 Sep 2013 02:21:56 GMT
In the meantime, you can mitigate as described earlier in the thread (quoted
below) by temporarily disabling expired file deletion.
Please report back if it doesn't work...
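A minimal sketch of that mitigation, assuming the hbase.store.delete.expired.storefile
setting named later in this thread is added to hbase-site.xml on the affected region
server (which is then restarted):

  <property>
    <name>hbase.store.delete.expired.storefile</name>
    <value>false</value>
  </property>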


On Tue, Sep 24, 2013 at 4:08 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Tom,
>
> Thanks for reporting this and for providing all this information.
>
> I have attached a patch to the JIRA that Sergey opened.
>
> This will need to be reviewed, and we will need a committer to push it if
> it's accepted.
>
> JM
>
>
> 2013/9/24 Tom Brown <tombrown52@gmail.com>
>
> > I tried the workaround, and it is working very well. The number of store
> > files for all regions is now sane (went from about 8000 total store files
> > to 1000), and scans are now much more efficient.
> >
> > Thanks for all your help, Jean-Marc and Sergey!
> >
> > --Tom
> >
> >
> > On Tue, Sep 24, 2013 at 2:11 PM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> > > Hi Tom,
> > >
> > > Thanks for this information and the offer. I think we have enough to
> > > start looking at this issue. I'm still trying to reproduce that locally.
> > > In the meantime, I sent a patch to fix the NullPointerException you faced
> > > before.
> > >
> > > I will post back here if I'm able to reproduce. Have you tried Sergey's
> > > workaround?
> > >
> > > JM
> > >
> > >
> > > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > >
> > > > Yes, it is empty.
> > > >
> > > > 13/09/24 13:03:03 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 2.9g
> > > > 13/09/24 13:03:03 ERROR metrics.SchemaMetrics: Inconsistent configuration. Previous configuration for using table name in metrics: true, new configuration: false
> > > > 13/09/24 13:03:03 WARN metrics.SchemaConfigured: Could not determine table and column family of the HFile path /fca0882dc7624342a8f4fce4b89420ff. Expecting at least 5 path components.
> > > > 13/09/24 13:03:03 WARN snappy.LoadSnappy: Snappy native library is available
> > > > 13/09/24 13:03:03 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> > > > 13/09/24 13:03:03 INFO snappy.LoadSnappy: Snappy native library loaded
> > > > 13/09/24 13:03:03 INFO compress.CodecPool: Got brand-new decompressor
> > > > Stats:
> > > > no data available for statistics
> > > > Scanned kv count -> 0
> > > >
> > > > If you want to examine the actual file, I would be happy to email it to
> > > > you directly.
> > > >
> > > > --Tom
> > > >
> > > >
> > > > On Tue, Sep 24, 2013 at 12:42 PM, Jean-Marc Spaggiari <
> > > > jean-marc@spaggiari.org> wrote:
> > > >
> > > > > Can you try with fewer parameters and see if you are able to get
> > > > > something from it? This exception is caused by "printMeta", so if you
> > > > > remove -m it should be OK. However, printMeta was what I was looking for ;)
> > > > >
> > > > > getFirstKey for this file seems to return null. So it might simply be
> > > > > an empty file, not necessarily a corrupted one.
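Concretely, dropping -m should still print the scan stats and key count without
hitting printMeta; something like the following, with the same file:

  /usr/lib/hbase/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -s -v -f /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff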
> > > > >
> > > > >
> > > > > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > >
> > > > > > /usr/lib/hbase/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
> > > > > > 13/09/24 12:33:40 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
> > > > > > Scanning -> /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
> > > > > > 13/09/24 12:33:41 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 2.9g
> > > > > > 13/09/24 12:33:41 ERROR metrics.SchemaMetrics: Inconsistent configuration. Previous configuration for using table name in metrics: true, new configuration: false
> > > > > > 13/09/24 12:33:41 WARN snappy.LoadSnappy: Snappy native library is available
> > > > > > 13/09/24 12:33:41 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> > > > > > 13/09/24 12:33:41 INFO snappy.LoadSnappy: Snappy native library loaded
> > > > > > 13/09/24 12:33:41 INFO compress.CodecPool: Got brand-new decompressor
> > > > > > Block index size as per heapsize: 336
> > > > > > Exception in thread "main" java.lang.NullPointerException
> > > > > >         at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:716)
> > > > > >         at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toStringFirstKey(AbstractHFileReader.java:138)
> > > > > >         at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toString(AbstractHFileReader.java:149)
> > > > > >         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.printMeta(HFilePrettyPrinter.java:318)
> > > > > >         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:234)
> > > > > >         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:189)
> > > > > >         at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:756)
> > > > > >
> > > > > >
> > > > > > Does this mean the problem might have been caused by a corrupted
> > > > > > file(s)?
> > > > > >
> > > > > > --Tom
> > > > > >
> > > > > >
> > > > > > On Tue, Sep 24, 2013 at 12:21 PM, Jean-Marc Spaggiari <
> > > > > > jean-marc@spaggiari.org> wrote:
> > > > > >
> > > > > > > One more thing, Tom:
> > > > > > >
> > > > > > > When you have been able to capture the HFile locally, please run the
> > > > > > > HFile class on it to see the number of keys (is it empty?) and the
> > > > > > > other specific information.
> > > > > > >
> > > > > > > bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f HFILENAME
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > JM
> > > > > > >
> > > > > > >
> > > > > > > 2013/9/24 Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> > > > > > >
> > > > > > > > We get -1 because of this:
> > > > > > > >
> > > > > > > >       byte [] timerangeBytes = metadataMap.get(TIMERANGE_KEY);
> > > > > > > >       if (timerangeBytes != null) {
> > > > > > > >         this.reader.timeRangeTracker = new TimeRangeTracker();
> > > > > > > >         Writables.copyWritable(timerangeBytes, this.reader.timeRangeTracker);
> > > > > > > >       }
> > > > > > > >
> > > > > > > > this.reader.timeRangeTracker will return -1 for the maximumTimestamp
> > > > > > > > value. So now, we need to figure out whether it's normal or not to
> > > > > > > > have TIMERANGE_KEY not null here.
> > > > > > > >
> > > > > > > > I have created the same table locally on 0.94.10 with the same
> > > > > > > > attributes and I'm not facing this issue.
> > > > > > > >
> > > > > > > > We need to look at the related HFile, but files are rolled VERY
> > > > > > > > quickly, so it might be difficult to get one.
> > > > > > > >
> > > > > > > > Maybe something like
> > > > > > > > hadoop fs -get hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/* .
> > > > > > > >
> > > > > > > > might help to get the file? Then we can start to look at it and see
> > > > > > > > what exactly triggers this behaviour?
> > > > > > > >
> > > > > > > > JM
> > > > > > > >
> > > > > > > >
> > > > > > > > 2013/9/24 Sergey Shelukhin <sergey@hortonworks.com>
> > > > > > > >
> > > > > > > >> Yeah, I think the c3580bdb62d64e42a9eeac50f1c582d2 store file is a
> > > > > > > >> good example.
> > > > > > > >> Can you grep for c3580bdb62d64e42a9eeac50f1c582d2 and post the log
> > > > > > > >> just to be sure? Thanks.
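A sketch of that grep, assuming a typical region server log location (the path is
an assumption and varies by install):

  grep c3580bdb62d64e42a9eeac50f1c582d2 /var/log/hbase/hbase-*-regionserver-*.log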
> > > > > > > >> It looks like an interaction of deleting expired files and
> > > > > > > >>           // Create the writer even if no kv(Empty store file is also ok),
> > > > > > > >>           // because we need record the max seq id for the store file, see
> > > > > > > >>           // HBASE-6059
> > > > > > > >> in the compactor.
> > > > > > > >> The newly created file is immediately collected the same way and
> > > > > > > >> replaced by another file, which does not seem like intended behavior,
> > > > > > > >> even though both pieces of code are technically correct (the empty
> > > > > > > >> file is expired, and the new file is generally needed).
> > > > > > > >>
> > > > > > > >> I filed HBASE-9648
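A simplified model of the cycle described above (a sketch only, not the actual
HBase compaction code; the TTL value is taken from the table definition later in
the thread):

  // Sketch only: models why an empty store file keeps getting selected.
  public class ExpiredFileCycleSketch {

      // TTL from the table definition in this thread: 8640000 seconds (100 days).
      static final long TTL_MS = 8640000L * 1000L;

      // Mirrors the check behind the CompactSelection log message:
      // "maxTimeStamp is -1 while the max expired timestamp is <now - TTL>".
      static boolean looksExpired(long maxTimeStamp, long now) {
          return maxTimeStamp < now - TTL_MS;
      }

      public static void main(String[] args) {
          long now = System.currentTimeMillis();
          // An empty store file never tracked a cell, so its max timestamp stays -1.
          long emptyFileMaxTimestamp = -1L;

          for (int round = 1; round <= 3; round++) {
              if (looksExpired(emptyFileMaxTimestamp, now)) {
                  System.out.println("round " + round + ": file looks expired, compact it away");
                  // The compactor still writes a new (empty) store file so the max
                  // sequence id is preserved (HBASE-6059); that file's max timestamp
                  // is again -1, so the next selection round picks it up again.
                  emptyFileMaxTimestamp = -1L;
              }
          }
      }
  }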
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> On Tue, Sep 24, 2013 at 10:55 AM, Sergey Shelukhin
> > > > > > > >> <sergey@hortonworks.com>wrote:
> > > > > > > >>
> > > > > > > >> > To mitigate, you can change hbase.store.delete.expired.storefile to
> > > > > > > >> > false on one region server, or for the entire table, and restart this
> > > > > > > >> > RS. This will hopefully trigger a different compaction.
> > > > > > > >> > We'd need to find what the bug is. My gut feeling (which is known to
> > > > > > > >> > be wrong often) is that it has to do with selecting one file, probably
> > > > > > > >> > an invalid check somewhere, or with the interaction with the code that
> > > > > > > >> > ensures at least one file is written to preserve metadata; it might
> > > > > > > >> > just be cycling through such files.
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > On Tue, Sep 24, 2013 at 10:20 AM, Jean-Marc Spaggiari <
> > > > > > > >> > jean-marc@spaggiari.org> wrote:
> > > > > > > >> >
> > > > > > > >> >> So, looking at the code, this sounds like a bug to me.
> > > > > > > >> >>
> > > > > > > >> >> I will try to reproduce it locally. It seems to be related to the
> > > > > > > >> >> combination of TTL + BLOOM.
> > > > > > > >> >>
> > > > > > > >> >> Creating a table for that right now; will keep you posted very shortly.
> > > > > > > >> >>
> > > > > > > >> >> JM
> > > > > > > >> >>
> > > > > > > >> >>
> > > > > > > >> >> 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > > > > >> >>
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       2194 2013-09-21 14:32 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/014ead47a9484d67b55205be16802ff1
> > > > > > > >> >> > -rw-------   1 hadoop supergroup      31321 2013-09-24 05:49 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1305d625bd4a4be39a98ae4d91a66140
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1350 2013-09-24 10:31 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1352e0828f974f08b1f3d7a9dff04abd
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       4194 2013-09-21 10:38 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/17a546064bd840619816809ae0fc4c49
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1061 2013-09-20 22:55 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1cb3df115da244288bd076968ab4ccf6
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1375 2013-08-24 10:17 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1e41a96c49fc4e5ab59392d26935978d
> > > > > > > >> >> > -rw-------   1 hadoop supergroup      96296 2013-08-26 15:48 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/22d72fd897e34424b5420a96483a571e
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1356 2013-08-26 15:23 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/25fee1ffadbe42549bd0b7b13d782b72
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       6229 2013-09-21 11:14 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/26289c777ec14dc5b7021b4d6b1050c5
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1223 2013-09-21 02:42 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2757d7ba9c8448d6a3d5d46bd4d59758
> > > > > > > >> >> > -rw-------   1 hadoop supergroup    5302248 2013-08-24 02:22 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2ec40943787246ea983608dd6591db24
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1596 2013-08-24 03:37 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3157fd1cabe4483aaa4d9a21f75e4d88
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1338 2013-09-22 04:25 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/36b0f80a4a7b492f97358b64d879a2df
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       3264 2013-09-21 12:05 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/39e249fcb532400daed73aed6689ceeb
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       4549 2013-09-21 08:56 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3bc9e2a566ad460a9b0ed336b2fb5ed9
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1630 2013-09-22 03:22 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/48026d08aae748f08aad59e4eea903be
> > > > > > > >> >> > -rw-------   1 hadoop supergroup     105395 2013-09-20 21:12 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/53198825f085401cbbd4322faa0e3aae
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       3859 2013-09-21 09:09 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/71c2f9b2a8ff4c049fcc5a9a22af5cfe
> > > > > > > >> >> > -rw-------   1 hadoop supergroup     311688 2013-09-20 21:12 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/97ff16d6da974c30835c6e0acc7c737a
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       1897 2013-08-24 08:43 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/a172d7577641434d82abcce88a433213
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       3380 2013-09-21 13:04 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/be678e5c60534c65a012a798fbc7e284
> > > > > > > >> >> > -rw-------   1 hadoop supergroup      43710 2013-09-22 02:15 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/e2508a23acf1491f9d38b9a8594e41e8
> > > > > > > >> >> > -rw-------   1 hadoop supergroup       5409 2013-09-21 10:10 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f432846182714b93a1c3df0f5835c09b
> > > > > > > >> >> > -rw-------   1 hadoop supergroup        491 2013-09-24 11:18 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f7d8669cf7a047b98c1d3b13c16cfaec
> > > > > > > >> >> > -rw-------   1 hadoop supergroup        491 2013-09-24 11:18 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fa1b8f6cc9584eb28365dcd8f10d3f0a
> > > > > > > >> >> > -rw-------   1 hadoop supergroup        491 2013-09-13 11:28 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
> > > > > > > >> >> >
> > > > > > > >> >> >
> > > > > > > >> >> >
> > > > > > > >> >> > On Tue, Sep 24, 2013 at 11:14 AM, Jean-Marc Spaggiari <
> > > > > > > >> >> > jean-marc@spaggiari.org> wrote:
> > > > > > > >> >> >
> > > > > > > >> >> > > TTL seems to be fine.
> > > > > > > >> >> > >
> > > > > > > >> >> > > -1 is the default value for TimeRangeTracker.maximumTimestamp.
> > > > > > > >> >> > >
> > > > > > > >> >> > > Can you run:
> > > > > > > >> >> > > hadoop fs -lsr hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/
> > > > > > > >> >> > >
> > > > > > > >> >> > > Thanks,
> > > > > > > >> >> > >
> > > > > > > >> >> > > JM
> > > > > > > >> >> > >
> > > > > > > >> >> > >
> > > > > > > >> >> > > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > > > > >> >> > >
> > > > > > > >> >> > > > 1. Hadoop version is 1.1.2.
> > > > > > > >> >> > > > 2. All servers are synched with NTP.
> > > > > > > >> >> > > > 3. Table definition is: 'compound0', {
> > > > > > > >> >> > > > NAME => 'd',
> > > > > > > >> >> > > > DATA_BLOCK_ENCODING => 'NONE',
> > > > > > > >> >> > > > BLOOMFILTER => 'ROW',
> > > > > > > >> >> > > > REPLICATION_SCOPE => '0',
> > > > > > > >> >> > > > VERSIONS => '1',
> > > > > > > >> >> > > > COMPRESSION => 'SNAPPY',
> > > > > > > >> >> > > > MIN_VERSIONS => '0',
> > > > > > > >> >> > > > TTL => '8640000',
> > > > > > > >> >> > > > KEEP_DELETED_CELLS => 'false',
> > > > > > > >> >> > > > BLOCKSIZE => '65536',
> > > > > > > >> >> > > > IN_MEMORY => 'false',
> > > > > > > >> >> > > > ENCODE_ON_DISK => 'true',
> > > > > > > >> >> > > > BLOCKCACHE => 'true'
> > > > > > > >> >> > > > }
> > > > > > > >> >> > > >
> > > > > > > >> >> > > > The TTL is supposed to be 100 days.
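For reference, the TTL attribute is specified in seconds, and 8640000 / 86400 = 100,
so the configured value does correspond to the intended 100 days.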
> > > > > > > >> >> > > >
> > > > > > > >> >> > > > --Tom
> > > > > > > >> >> > > >
> > > > > > > >> >> > > >
> > > > > > > >> >> > > > On Tue, Sep 24, 2013 at 10:53 AM, Jean-Marc
> > Spaggiari <
> > > > > > > >> >> > > > jean-marc@spaggiari.org> wrote:
> > > > > > > >> >> > > >
> > > > > > > >> >> > > > > Another important piece of information that might be the
> > > > > > > >> >> > > > > root cause of this issue...
> > > > > > > >> >> > > > >
> > > > > > > >> >> > > > > Do you have any TTL defined for this table?
> > > > > > > >> >> > > > >
> > > > > > > >> >> > > > > JM
> > > > > > > >> >> > > > >
> > > > > > > >> >> > > > >
> > > > > > > >> >> > > > > 2013/9/24 Jean-Marc Spaggiari <
> > > jean-marc@spaggiari.org
> > > > >
> > > > > > > >> >> > > > >
> > > > > > > >> >> > > > > > Strange.
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > > Few questions then.
> > > > > > > >> >> > > > > > 1) What is your hadoop version?
> > > > > > > >> >> > > > > > 2) Is the clock on all your servers synched with NTP?
> > > > > > > >> >> > > > > > 3) What is your table definition? Bloom filters, etc.?
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > > This is the reason why it keeps compacting:
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > > 2013-09-24 10:04:00,548 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting the expired store file by compaction: hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/7426f128469242ec8ee09f3965fd5a1a whose maxTimeStamp is -1 while the max expired timestamp is 1371398640548
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > > maxTimeStamp = -1
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > > Each time there is a comparison between the maxTimeStamp for
> > > > > > > >> >> > > > > > this store file and the configured maxExpiredTimeStamp, and
> > > > > > > >> >> > > > > > since maxTimeStamp returns -1, the file is always elected for
> > > > > > > >> >> > > > > > a compaction. Now, we need to find out why...
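Worked through with the numbers from the log line above: the max expired timestamp
1371398640548 is roughly the compaction time (2013-09-24) minus the 100-day TTL, and
-1 < 1371398640548 is always true, so the store file is selected as expired on every
selection round.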
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > > JM
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > > > > >> >> > > > > >
> > > > > > > >> >> > > > > >> My cluster is fully distributed (2
> regionserver
> > > > > nodes).
> > > > > > > >> >> > > > > >>
> > > > > > > >> >> > > > > >> Here is a snippet of log entries that may explain why it
> > > > > > > >> >> > > > > >> started: http://pastebin.com/wQECif8k. I had to go back 2
> > > > > > > >> >> > > > > >> days to find when it started for this region.
> > > > > > > >> >> > > > > >>
> > > > > > > >> >> > > > > >> This is not the only region experiencing this issue (but
> > > > > > > >> >> > > > > >> this is the smallest one it's happened to).
> > > > > > > >> >> > > > > >>
> > > > > > > >> >> > > > > >> --Tom
> > > > > > > >> >> > > > > >>
> > > > > > > >> >> > > > > >>
> > > > > > > >> >> > > > > >> On Tue, Sep 24, 2013 at 10:13 AM, Jean-Marc
> > > > Spaggiari
> > > > > <
> > > > > > > >> >> > > > > >> jean-marc@spaggiari.org> wrote:
> > > > > > > >> >> > > > > >>
> > > > > > > >> >> > > > > >> > Can you paste logs from a bit before that, to see if
> > > > > > > >> >> > > > > >> > anything triggered the compaction? Before the 1M
> > > > > > > >> >> > > > > >> > compaction entries.
> > > > > > > >> >> > > > > >> >
> > > > > > > >> >> > > > > >> > Also, what is your setup? Are you running in Standalone?
> > > > > > > >> >> > > > > >> > Pseudo-Dist? Fully-Dist?
> > > > > > > >> >> > > > > >> >
> > > > > > > >> >> > > > > >> > Thanks,
> > > > > > > >> >> > > > > >> >
> > > > > > > >> >> > > > > >> > JM
> > > > > > > >> >> > > > > >> >
> > > > > > > >> >> > > > > >> >
> > > > > > > >> >> > > > > >> > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > > > > >> >> > > > > >> >
> > > > > > > >> >> > > > > >> > > There is one column family, d. Each row has about 10
> > > > > > > >> >> > > > > >> > > columns, and each row's total data size is less than 2K.
> > > > > > > >> >> > > > > >> > >
> > > > > > > >> >> > > > > >> > > Here is a small snippet of logs from the region server:
> > > > > > > >> >> > > > > >> > > http://pastebin.com/S2jE4ZAx
> > > > > > > >> >> > > > > >> > >
> > > > > > > >> >> > > > > >> > > --Tom
> > > > > > > >> >> > > > > >> > >
> > > > > > > >> >> > > > > >> > >
> > > > > > > >> >> > > > > >> > > On Tue, Sep 24, 2013 at 9:59 AM, Bharath
> > > > > > Vissapragada
> > > > > > > <
> > > > > > > >> >> > > > > >> > > bharathv@cloudera.com
> > > > > > > >> >> > > > > >> > > > wrote:
> > > > > > > >> >> > > > > >> > >
> > > > > > > >> >> > > > > >> > > > It would help if you can show your RS log (via
> > > > > > > >> >> > > > > >> > > > pastebin?). Are there frequent flushes for this
> > > > > > > >> >> > > > > >> > > > region too?
> > > > > > > >> >> > > > > >> > > >
> > > > > > > >> >> > > > > >> > > >
> > > > > > > >> >> > > > > >> > > > On Tue, Sep 24, 2013 at 9:20 PM, Tom
> > Brown <
> > > > > > > >> >> > > > tombrown52@gmail.com>
> > > > > > > >> >> > > > > >> > wrote:
> > > > > > > >> >> > > > > >> > > >
> > > > > > > >> >> > > > > >> > > > > I have a region that is very small, only 5MB. Despite
> > > > > > > >> >> > > > > >> > > > > its size, it has 24 store files. The logs show that
> > > > > > > >> >> > > > > >> > > > > it's compacting (over and over again).
> > > > > > > >> >> > > > > >> > > > >
> > > > > > > >> >> > > > > >> > > > > The odd thing is that even though there are 24 store
> > > > > > > >> >> > > > > >> > > > > files, it only does one at a time. Even more strange
> > > > > > > >> >> > > > > >> > > > > is that my logs are filling up with compactions of
> > > > > > > >> >> > > > > >> > > > > this one region. In the last 10 hours, there have been
> > > > > > > >> >> > > > > >> > > > > 1,876,200 log entries corresponding to compacting this
> > > > > > > >> >> > > > > >> > > > > region alone.
> > > > > > > >> >> > > > > >> > > > >
> > > > > > > >> >> > > > > >> > > > > My cluster is 0.94.10, and uses almost all default
> > > > > > > >> >> > > > > >> > > > > settings. Only a few are not default:
> > > > > > > >> >> > > > > >> > > > > hbase.hregion.max.filesize = 4294967296
> > > > > > > >> >> > > > > >> > > > > hbase.hstore.compaction.min = 6
> > > > > > > >> >> > > > > >> > > > >
> > > > > > > >> >> > > > > >> > > > > I am at a total loss as to why this behavior is
> > > > > > > >> >> > > > > >> > > > > occurring. Any help is appreciated.
> > > > > > > >> >> > > > > >> > > > >
> > > > > > > >> >> > > > > >> > > > > --Tom
> > > > > > > >> >> > > > > >> > > > >
> > > > > > > >> >> > > > > >> > > >
> > > > > > > >> >> > > > > >> > > >
> > > > > > > >> >> > > > > >> > > >
> > > > > > > >> >> > > > > >> > > > --
> > > > > > > >> >> > > > > >> > > > Bharath Vissapragada
> > > > > > > >> >> > > > > >> > > > <http://www.cloudera.com>
> > > > > > > >> >> > > > > >> > > >

