hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Why is this region compacting?
Date Tue, 24 Sep 2013 20:11:24 GMT
Hi Tom,

Thanks for this information and the offer. I think we have enough to start
looking at this issue. I'm still trying to reproduce it locally. In the
meantime, I sent a patch to fix the NullPointerException you faced before.

I will post back here if I'm able to reproduce it. Have you tried Sergey's
workaround?

JM
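[Editor's note] For context, the NullPointerException discussed below comes from KeyValue.keyToString being handed a null first key when the HFile is empty. A guard along these lines (a hypothetical sketch with assumed names, not the actual patch) illustrates the kind of fix involved:

```java
// Hypothetical sketch of the kind of null guard involved (assumed names,
// not the actual HBase patch): render a key as hex, but tolerate null for
// empty HFiles that have no first key.
public final class KeyToStringSketch {
    public static String keyToString(byte[] key) {
        if (key == null || key.length == 0) {
            return "";  // empty store file: nothing to print
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b));
        }
        return sb.toString();
    }
}
```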


2013/9/24 Tom Brown <tombrown52@gmail.com>

> Yes, it is empty.
>
> 13/09/24 13:03:03 INFO hfile.CacheConfig: Allocating LruBlockCache with
> maximum size 2.9g
> 13/09/24 13:03:03 ERROR metrics.SchemaMetrics: Inconsistent configuration.
> Previous configuration for using table name in metrics: true, new
> configuration: false
> 13/09/24 13:03:03 WARN metrics.SchemaConfigured: Could not determine table
> and column family of the HFile path /fca0882dc7624342a8f4fce4b89420ff.
> Expecting at least 5 path components.
> 13/09/24 13:03:03 WARN snappy.LoadSnappy: Snappy native library is
> available
> 13/09/24 13:03:03 INFO util.NativeCodeLoader: Loaded the native-hadoop
> library
> 13/09/24 13:03:03 INFO snappy.LoadSnappy: Snappy native library loaded
> 13/09/24 13:03:03 INFO compress.CodecPool: Got brand-new decompressor
> Stats:
> no data available for statistics
> Scanned kv count -> 0
>
> If you want to examine the actual file, I would be happy to email it to you
> directly.
>
> --Tom
>
>
> On Tue, Sep 24, 2013 at 12:42 PM, Jean-Marc Spaggiari
> <jean-marc@spaggiari.org> wrote:
>
> > Can you try with fewer parameters and see if you are able to get something
> > from it? This exception is caused by "printMeta", so if you remove -m
> > it should be ok. However, printMeta was what I was looking for ;)
> >
> > getFirstKey for this file seems to return null. So it might simply be an
> > empty file, not necessarily a corrupted one.
> >
> >
> > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> >
> > > /usr/lib/hbase/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f
> > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
> > > 13/09/24 12:33:40 INFO util.ChecksumType: Checksum using
> > > org.apache.hadoop.util.PureJavaCrc32
> > > Scanning ->
> > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
> > > 13/09/24 12:33:41 INFO hfile.CacheConfig: Allocating LruBlockCache with
> > > maximum size 2.9g
> > > 13/09/24 12:33:41 ERROR metrics.SchemaMetrics: Inconsistent configuration.
> > > Previous configuration for using table name in metrics: true, new
> > > configuration: false
> > > 13/09/24 12:33:41 WARN snappy.LoadSnappy: Snappy native library is
> > > available
> > > 13/09/24 12:33:41 INFO util.NativeCodeLoader: Loaded the native-hadoop
> > > library
> > > 13/09/24 12:33:41 INFO snappy.LoadSnappy: Snappy native library loaded
> > > 13/09/24 12:33:41 INFO compress.CodecPool: Got brand-new decompressor
> > > Block index size as per heapsize: 336
> > > Exception in thread "main" java.lang.NullPointerException
> > >         at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:716)
> > >         at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toStringFirstKey(AbstractHFileReader.java:138)
> > >         at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toString(AbstractHFileReader.java:149)
> > >         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.printMeta(HFilePrettyPrinter.java:318)
> > >         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:234)
> > >         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:189)
> > >         at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:756)
> > >
> > >
> > > Does this mean the problem might have been caused by a corrupted file (or
> > > files)?
> > >
> > > --Tom
> > >
> > >
> > > On Tue, Sep 24, 2013 at 12:21 PM, Jean-Marc Spaggiari
> > > <jean-marc@spaggiari.org> wrote:
> > >
> > > > One more thing, Tom:
> > > >
> > > > When you have been able to capture the HFile locally, please run the
> > > > HFile class on it to see the number of keys (is it empty?) and the
> > > > other specific information.
> > > >
> > > > bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f HFILENAME
> > > >
> > > > Thanks,
> > > >
> > > > JM
> > > >
> > > >
> > > > 2013/9/24 Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> > > >
> > > > > We get -1 because of this:
> > > > >
> > > > >       byte [] timerangeBytes = metadataMap.get(TIMERANGE_KEY);
> > > > >       if (timerangeBytes != null) {
> > > > >         this.reader.timeRangeTracker = new TimeRangeTracker();
> > > > >         Writables.copyWritable(timerangeBytes,
> > > > >             this.reader.timeRangeTracker);
> > > > >       }
> > > > >
> > > > > this.reader.timeRangeTracker will return -1 for the maximumTimestamp
> > > > > value. So now, we need to figure out whether or not it's normal for
> > > > > TIMERANGE_KEY to be null here.
> > > > >
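[Editor's note] The mechanics can be sketched with a simplified stand-in for TimeRangeTracker (field and method names assumed): when the TIMERANGE_KEY metadata is absent, nothing ever populates the tracker, so the maximum timestamp stays at its -1 default.

```java
// Simplified stand-in for HBase's TimeRangeTracker (names assumed).
// If TIMERANGE_KEY is null, copyWritable is never called, the tracker is
// never populated, and the file reports a maximum timestamp of -1.
public final class TimeRangeSketch {
    private long maximumTimestamp = -1;  // default: "no timestamps seen"

    public void includeTimestamp(long ts) {
        if (ts > maximumTimestamp) {
            maximumTimestamp = ts;
        }
    }

    public long getMaximumTimestamp() {
        return maximumTimestamp;
    }
}
```

A file reporting -1 looks older than any expiry cutoff, which is exactly what the compaction log further down in the thread shows.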
> > > > > I have created the same table locally on 0.94.10 with the same
> > > > > attributes and I'm not facing this issue.
> > > > >
> > > > > We need to look at the related HFile, but files are rolled VERY
> > > > > quickly, so it might be difficult to get one.
> > > > >
> > > > > Maybe something like
> > > > >
> > > > > hadoop fs -get hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/* .
> > > > >
> > > > > might help to get the file? Then we can start to look at it and see
> > > > > what exactly triggers this behaviour.
> > > > >
> > > > > JM
> > > > >
> > > > >
> > > > > 2013/9/24 Sergey Shelukhin <sergey@hortonworks.com>
> > > > >
> > > > >> Yeah, I think the c3580bdb62d64e42a9eeac50f1c582d2 store file is a
> > > > >> good example. Can you grep for c3580bdb62d64e42a9eeac50f1c582d2 and
> > > > >> post the log, just to be sure? Thanks.
> > > > >>
> > > > >> It looks like an interaction of deleting expired files and
> > > > >>           // Create the writer even if no kv(Empty store file is also ok),
> > > > >>           // because we need record the max seq id for the store file, see
> > > > >>           // HBASE-6059
> > > > >> in the compactor.
> > > > >>
> > > > >> The newly created file is immediately collected the same way and
> > > > >> replaced by another file, which seems like not an intended behavior,
> > > > >> even though both pieces of code are technically correct (the empty
> > > > >> file is expired, and the new file is generally needed).
> > > > >>
> > > > >> I filed HBASE-9648
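[Editor's note] The interaction described above can be modeled roughly as follows (a hypothetical simulation, not HBase code): the empty file's maxTimeStamp of -1 always loses the expiry comparison, and each deletion produces another empty file, so the cycle never ends.

```java
// Rough model of the HBASE-9648 cycle (hypothetical, not HBase code):
// an empty store file reports maxTimeStamp -1, so it is always "expired";
// compacting it must still write a new empty file to record the max seq id,
// and that new file is immediately eligible again.
public final class CompactionLoopSketch {
    static boolean isExpired(long maxTimeStamp, long maxExpiredTimestamp) {
        return maxTimeStamp < maxExpiredTimestamp;  // -1 is below any cutoff
    }

    // Count delete-and-rewrite cycles over `rounds` compaction checks.
    static int simulate(int rounds, long maxExpiredTimestamp) {
        int compactions = 0;
        long fileMaxTs = -1;  // empty file: no KVs, so no timestamps
        for (int i = 0; i < rounds; i++) {
            if (isExpired(fileMaxTs, maxExpiredTimestamp)) {
                compactions++;   // expired file deleted...
                fileMaxTs = -1;  // ...but replaced by another empty file
            }
        }
        return compactions;
    }
}
```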
> > > > >>
> > > > >>
> > > > >> On Tue, Sep 24, 2013 at 10:55 AM, Sergey Shelukhin
> > > > >> <sergey@hortonworks.com> wrote:
> > > > >>
> > > > >> > To mitigate, you can change hbase.store.delete.expired.storefile
> > > > >> > to false on one region server, or for the entire table, and
> > > > >> > restart that RS. This will trigger a different compaction,
> > > > >> > hopefully.
> > > > >> >
> > > > >> > We'd need to find what the bug is. My gut feeling (which is known
> > > > >> > to be wrong often) is that it has to do with it selecting one
> > > > >> > file, probably an invalid check somewhere, or interaction with the
> > > > >> > code that ensures that at least one file needs to be written to
> > > > >> > preserve metadata; it might be just cycling through such files.
> > > > >> >
> > > > >> >
> > > > >> > On Tue, Sep 24, 2013 at 10:20 AM, Jean-Marc Spaggiari
> > > > >> > <jean-marc@spaggiari.org> wrote:
> > > > >> >
> > > > >> >> So, looking at the code, this sounds like a bug to me.
> > > > >> >>
> > > > >> >> I will try to reproduce it locally. It seems to be related to the
> > > > >> >> combination of TTL + BLOOM.
> > > > >> >>
> > > > >> >> Creating a table for that right now; will keep you posted very
> > > > >> >> shortly.
> > > > >> >>
> > > > >> >> JM
> > > > >> >>
> > > > >> >>
> > > > >> >> 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > >> >>
> > > > >> >> > -rw-------   1 hadoop supergroup       2194 2013-09-21 14:32 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/014ead47a9484d67b55205be16802ff1
> > > > >> >> > -rw-------   1 hadoop supergroup      31321 2013-09-24 05:49 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1305d625bd4a4be39a98ae4d91a66140
> > > > >> >> > -rw-------   1 hadoop supergroup       1350 2013-09-24 10:31 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1352e0828f974f08b1f3d7a9dff04abd
> > > > >> >> > -rw-------   1 hadoop supergroup       4194 2013-09-21 10:38 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/17a546064bd840619816809ae0fc4c49
> > > > >> >> > -rw-------   1 hadoop supergroup       1061 2013-09-20 22:55 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1cb3df115da244288bd076968ab4ccf6
> > > > >> >> > -rw-------   1 hadoop supergroup       1375 2013-08-24 10:17 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1e41a96c49fc4e5ab59392d26935978d
> > > > >> >> > -rw-------   1 hadoop supergroup      96296 2013-08-26 15:48 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/22d72fd897e34424b5420a96483a571e
> > > > >> >> > -rw-------   1 hadoop supergroup       1356 2013-08-26 15:23 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/25fee1ffadbe42549bd0b7b13d782b72
> > > > >> >> > -rw-------   1 hadoop supergroup       6229 2013-09-21 11:14 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/26289c777ec14dc5b7021b4d6b1050c5
> > > > >> >> > -rw-------   1 hadoop supergroup       1223 2013-09-21 02:42 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2757d7ba9c8448d6a3d5d46bd4d59758
> > > > >> >> > -rw-------   1 hadoop supergroup    5302248 2013-08-24 02:22 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2ec40943787246ea983608dd6591db24
> > > > >> >> > -rw-------   1 hadoop supergroup       1596 2013-08-24 03:37 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3157fd1cabe4483aaa4d9a21f75e4d88
> > > > >> >> > -rw-------   1 hadoop supergroup       1338 2013-09-22 04:25 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/36b0f80a4a7b492f97358b64d879a2df
> > > > >> >> > -rw-------   1 hadoop supergroup       3264 2013-09-21 12:05 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/39e249fcb532400daed73aed6689ceeb
> > > > >> >> > -rw-------   1 hadoop supergroup       4549 2013-09-21 08:56 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3bc9e2a566ad460a9b0ed336b2fb5ed9
> > > > >> >> > -rw-------   1 hadoop supergroup       1630 2013-09-22 03:22 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/48026d08aae748f08aad59e4eea903be
> > > > >> >> > -rw-------   1 hadoop supergroup     105395 2013-09-20 21:12 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/53198825f085401cbbd4322faa0e3aae
> > > > >> >> > -rw-------   1 hadoop supergroup       3859 2013-09-21 09:09 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/71c2f9b2a8ff4c049fcc5a9a22af5cfe
> > > > >> >> > -rw-------   1 hadoop supergroup     311688 2013-09-20 21:12 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/97ff16d6da974c30835c6e0acc7c737a
> > > > >> >> > -rw-------   1 hadoop supergroup       1897 2013-08-24 08:43 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/a172d7577641434d82abcce88a433213
> > > > >> >> > -rw-------   1 hadoop supergroup       3380 2013-09-21 13:04 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/be678e5c60534c65a012a798fbc7e284
> > > > >> >> > -rw-------   1 hadoop supergroup      43710 2013-09-22 02:15 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/e2508a23acf1491f9d38b9a8594e41e8
> > > > >> >> > -rw-------   1 hadoop supergroup       5409 2013-09-21 10:10 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f432846182714b93a1c3df0f5835c09b
> > > > >> >> > -rw-------   1 hadoop supergroup        491 2013-09-24 11:18 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f7d8669cf7a047b98c1d3b13c16cfaec
> > > > >> >> > -rw-------   1 hadoop supergroup        491 2013-09-24 11:18 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fa1b8f6cc9584eb28365dcd8f10d3f0a
> > > > >> >> > -rw-------   1 hadoop supergroup        491 2013-09-13 11:28 /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
> > > > >> >> >
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > On Tue, Sep 24, 2013 at 11:14 AM, Jean-Marc Spaggiari
> > > > >> >> > <jean-marc@spaggiari.org> wrote:
> > > > >> >> >
> > > > >> >> > > TTL seems to be fine.
> > > > >> >> > >
> > > > >> >> > > -1 is the default value for TimeRangeTracker.maximumTimestamp.
> > > > >> >> > >
> > > > >> >> > > Can you run:
> > > > >> >> > >
> > > > >> >> > > hadoop fs -lsr hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/
> > > > >> >> > >
> > > > >> >> > > Thanks,
> > > > >> >> > >
> > > > >> >> > > JM
> > > > >> >> > >
> > > > >> >> > >
> > > > >> >> > > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > >> >> > >
> > > > >> >> > > > 1. Hadoop version is 1.1.2.
> > > > >> >> > > > 2. All servers are synched with NTP.
> > > > >> >> > > > 3. Table definition is: 'compound0', {
> > > > >> >> > > > NAME => 'd',
> > > > >> >> > > > DATA_BLOCK_ENCODING => 'NONE',
> > > > >> >> > > > BLOOMFILTER => 'ROW',
> > > > >> >> > > > REPLICATION_SCOPE => '0',
> > > > >> >> > > > VERSIONS => '1',
> > > > >> >> > > > COMPRESSION => 'SNAPPY',
> > > > >> >> > > > MIN_VERSIONS => '0',
> > > > >> >> > > > TTL => '8640000',
> > > > >> >> > > > KEEP_DELETED_CELLS => 'false',
> > > > >> >> > > > BLOCKSIZE => '65536',
> > > > >> >> > > > IN_MEMORY => 'false',
> > > > >> >> > > > ENCODE_ON_DISK => 'true',
> > > > >> >> > > > BLOCKCACHE => 'true'
> > > > >> >> > > > }
> > > > >> >> > > >
> > > > >> >> > > > The TTL is supposed to be 100 days.
> > > > >> >> > > >
> > > > >> >> > > > --Tom
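[Editor's note] As a sanity check on that number, HBase TTLs are expressed in seconds, and 100 days does come out to 8640000:

```java
// Sanity check: HBase TTL is expressed in seconds.
// 100 days * 24 h * 3600 s = 8,640,000 s, matching TTL => '8640000' above.
public final class TtlCheck {
    public static long daysToSeconds(long days) {
        return days * 24 * 3600;
    }
}
```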
> > > > >> >> > > >
> > > > >> >> > > >
> > > > >> >> > > > On Tue, Sep 24, 2013 at 10:53 AM, Jean-Marc Spaggiari
> > > > >> >> > > > <jean-marc@spaggiari.org> wrote:
> > > > >> >> > > >
> > > > >> >> > > > > One more piece of information that might be the root
> > > > >> >> > > > > cause of this issue...
> > > > >> >> > > > >
> > > > >> >> > > > > Do you have any TTL defined for this table?
> > > > >> >> > > > >
> > > > >> >> > > > > JM
> > > > >> >> > > > >
> > > > >> >> > > > >
> > > > >> >> > > > > 2013/9/24 Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> > > > >> >> > > > >
> > > > >> >> > > > > > Strange.
> > > > >> >> > > > > >
> > > > >> >> > > > > > A few questions then.
> > > > >> >> > > > > > 1) What is your Hadoop version?
> > > > >> >> > > > > > 2) Are the clocks on all your servers synched with NTP?
> > > > >> >> > > > > > 3) What is your table definition? Bloom filters, etc.?
> > > > >> >> > > > > >
> > > > >> >> > > > > > This is the reason why it keeps compacting:
> > > > >> >> > > > > >
> > > > >> >> > > > > > 2013-09-24 10:04:00,548 INFO
> > > > >> >> > > > > > org.apache.hadoop.hbase.regionserver.compactions.CompactSelection:
> > > > >> >> > > > > > Deleting the expired store file by compaction:
> > > > >> >> > > > > > hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/7426f128469242ec8ee09f3965fd5a1a
> > > > >> >> > > > > > whose maxTimeStamp is -1 while the max expired timestamp
> > > > >> >> > > > > > is 1371398640548
> > > > >> >> > > > > >
> > > > >> >> > > > > > maxTimeStamp = -1
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > > > Each time there is a comparison between maxTimeStamp
> > > > >> >> > > > > > for this store file and the configured
> > > > >> >> > > > > > maxExpiredTimeStamp, and since maxTimeStamp returns -1,
> > > > >> >> > > > > > it's always elected for a compaction. Now, we need to
> > > > >> >> > > > > > find why...
> > > > >> >> > > > > >
> > > > >> >> > > > > > JM
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > > > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > >> >> > > > > >
> > > > >> >> > > > > >> My cluster is fully distributed (2 regionserver nodes).
> > > > >> >> > > > > >>
> > > > >> >> > > > > >> Here is a snippet of log entries that may explain why
> > > > >> >> > > > > >> it started: http://pastebin.com/wQECif8k. I had to go
> > > > >> >> > > > > >> back 2 days to find when it started for this region.
> > > > >> >> > > > > >>
> > > > >> >> > > > > >> This is not the only region experiencing this issue
> > > > >> >> > > > > >> (but this is the smallest one it's happened to).
> > > > >> >> > > > > >>
> > > > >> >> > > > > >> --Tom
> > > > >> >> > > > > >>
> > > > >> >> > > > > >>
> > > > >> >> > > > > >> On Tue, Sep 24, 2013 at 10:13 AM, Jean-Marc Spaggiari
> > > > >> >> > > > > >> <jean-marc@spaggiari.org> wrote:
> > > > >> >> > > > > >>
> > > > >> >> > > > > >> > Can you paste logs from a bit before that? To see if
> > > > >> >> > > > > >> > anything triggered the compaction, before the 1M
> > > > >> >> > > > > >> > compaction entries.
> > > > >> >> > > > > >> >
> > > > >> >> > > > > >> > Also, what is your setup? Are you running
> > > > >> >> > > > > >> > Standalone? Pseudo-Dist? Fully-Dist?
> > > > >> >> > > > > >> >
> > > > >> >> > > > > >> > Thanks,
> > > > >> >> > > > > >> >
> > > > >> >> > > > > >> > JM
> > > > >> >> > > > > >> >
> > > > >> >> > > > > >> >
> > > > >> >> > > > > >> > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > > >> >> > > > > >> >
> > > > >> >> > > > > >> > > There is one column family, d. Each row has about
> > > > >> >> > > > > >> > > 10 columns, and each row's total data size is less
> > > > >> >> > > > > >> > > than 2K.
> > > > >> >> > > > > >> > >
> > > > >> >> > > > > >> > > Here is a small snippet of logs from the region
> > > > >> >> > > > > >> > > server: http://pastebin.com/S2jE4ZAx
> > > > >> >> > > > > >> > >
> > > > >> >> > > > > >> > > --Tom
> > > > >> >> > > > > >> > >
> > > > >> >> > > > > >> > >
> > > > >> >> > > > > >> > > On Tue, Sep 24, 2013 at 9:59 AM, Bharath
> > > > >> >> > > > > >> > > Vissapragada <bharathv@cloudera.com> wrote:
> > > > >> >> > > > > >> > >
> > > > >> >> > > > > >> > > > It would help if you can show your RS log (via
> > > > >> >> > > > > >> > > > pastebin?). Are there frequent flushes for this
> > > > >> >> > > > > >> > > > region too?
> > > > >> >> > > > > >> > > >
> > > > >> >> > > > > >> > > > On Tue, Sep 24, 2013 at 9:20 PM, Tom Brown
> > > > >> >> > > > > >> > > > <tombrown52@gmail.com> wrote:
> > > > >> >> > > > > >> > > >
> > > > >> >> > > > > >> > > > > I have a region that is very small, only 5MB.
> > > > >> >> > > > > >> > > > > Despite its size, it has 24 store files. The
> > > > >> >> > > > > >> > > > > logs show that it's compacting (over and over
> > > > >> >> > > > > >> > > > > again).
> > > > >> >> > > > > >> > > > >
> > > > >> >> > > > > >> > > > > The odd thing is that even though there are 24
> > > > >> >> > > > > >> > > > > store files, it only does one at a time. Even
> > > > >> >> > > > > >> > > > > more strange is that my logs are filling up
> > > > >> >> > > > > >> > > > > with compactions of this one region. In the
> > > > >> >> > > > > >> > > > > last 10 hours, there have been 1,876,200 log
> > > > >> >> > > > > >> > > > > entries corresponding to compacting this
> > > > >> >> > > > > >> > > > > region alone.
> > > > >> >> > > > > >> > > > >
> > > > >> >> > > > > >> > > > > My cluster is 0.94.10, using almost all
> > > > >> >> > > > > >> > > > > default settings. Only a few are not default:
> > > > >> >> > > > > >> > > > > hbase.hregion.max.filesize = 4294967296
> > > > >> >> > > > > >> > > > > hbase.hstore.compaction.min = 6
> > > > >> >> > > > > >> > > > >
> > > > >> >> > > > > >> > > > > I am at a total loss as to why this behavior
> > > > >> >> > > > > >> > > > > is occurring. Any help is appreciated.
> > > > >> >> > > > > >> > > > >
> > > > >> >> > > > > >> > > > > --Tom
> > > > >> >> > > > > >> > > > >
> > > > >> >> > > > > >> > > >
> > > > >> >> > > > > >> > > >
> > > > >> >> > > > > >> > > >
> > > > >> >> > > > > >> > > > --
> > > > >> >> > > > > >> > > > Bharath Vissapragada
> > > > >> >> > > > > >> > > > <http://www.cloudera.com>
> > > > >> >> > > > > >> > > >
> > > > >> >> > > > > >> > >
> > > > >> >> > > > > >> >
> > > > >> >> > > > > >>
> > > > >> >> > > > > >
> > > > >> >> > > > > >
> > > > >> >> > > > >
> > > > >> >> > > >
> > > > >> >> > >
> > > > >> >> >
> > > > >> >>
> > > > >> >
> > > > >> >
> > > > >>
> > > > >> --
> > > > >> CONFIDENTIALITY NOTICE
> > > > >> NOTICE: This message is intended for the use of the individual or
> > > > >> entity to which it is addressed and may contain information that is
> > > > >> confidential, privileged and exempt from disclosure under applicable
> > > > >> law. If the reader of this message is not the intended recipient,
> > > > >> you are hereby notified that any printing, copying, dissemination,
> > > > >> distribution, disclosure or forwarding of this communication is
> > > > >> strictly prohibited. If you have received this communication in
> > > > >> error, please contact the sender immediately and delete it from
> > > > >> your system. Thank You.
> > > > >
> > > > >
> > > >
> > >
> >
>
