hbase-user mailing list archives

From Tom Brown <tombrow...@gmail.com>
Subject Re: Why is this region compacting?
Date Tue, 24 Sep 2013 17:18:49 GMT
-rw-------   1 hadoop supergroup       2194 2013-09-21 14:32
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/014ead47a9484d67b55205be16802ff1
-rw-------   1 hadoop supergroup      31321 2013-09-24 05:49
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1305d625bd4a4be39a98ae4d91a66140
-rw-------   1 hadoop supergroup       1350 2013-09-24 10:31
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1352e0828f974f08b1f3d7a9dff04abd
-rw-------   1 hadoop supergroup       4194 2013-09-21 10:38
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/17a546064bd840619816809ae0fc4c49
-rw-------   1 hadoop supergroup       1061 2013-09-20 22:55
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1cb3df115da244288bd076968ab4ccf6
-rw-------   1 hadoop supergroup       1375 2013-08-24 10:17
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1e41a96c49fc4e5ab59392d26935978d
-rw-------   1 hadoop supergroup      96296 2013-08-26 15:48
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/22d72fd897e34424b5420a96483a571e
-rw-------   1 hadoop supergroup       1356 2013-08-26 15:23
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/25fee1ffadbe42549bd0b7b13d782b72
-rw-------   1 hadoop supergroup       6229 2013-09-21 11:14
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/26289c777ec14dc5b7021b4d6b1050c5
-rw-------   1 hadoop supergroup       1223 2013-09-21 02:42
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2757d7ba9c8448d6a3d5d46bd4d59758
-rw-------   1 hadoop supergroup    5302248 2013-08-24 02:22
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2ec40943787246ea983608dd6591db24
-rw-------   1 hadoop supergroup       1596 2013-08-24 03:37
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3157fd1cabe4483aaa4d9a21f75e4d88
-rw-------   1 hadoop supergroup       1338 2013-09-22 04:25
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/36b0f80a4a7b492f97358b64d879a2df
-rw-------   1 hadoop supergroup       3264 2013-09-21 12:05
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/39e249fcb532400daed73aed6689ceeb
-rw-------   1 hadoop supergroup       4549 2013-09-21 08:56
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3bc9e2a566ad460a9b0ed336b2fb5ed9
-rw-------   1 hadoop supergroup       1630 2013-09-22 03:22
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/48026d08aae748f08aad59e4eea903be
-rw-------   1 hadoop supergroup     105395 2013-09-20 21:12
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/53198825f085401cbbd4322faa0e3aae
-rw-------   1 hadoop supergroup       3859 2013-09-21 09:09
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/71c2f9b2a8ff4c049fcc5a9a22af5cfe
-rw-------   1 hadoop supergroup     311688 2013-09-20 21:12
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/97ff16d6da974c30835c6e0acc7c737a
-rw-------   1 hadoop supergroup       1897 2013-08-24 08:43
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/a172d7577641434d82abcce88a433213
-rw-------   1 hadoop supergroup       3380 2013-09-21 13:04
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/be678e5c60534c65a012a798fbc7e284
-rw-------   1 hadoop supergroup      43710 2013-09-22 02:15
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/e2508a23acf1491f9d38b9a8594e41e8
-rw-------   1 hadoop supergroup       5409 2013-09-21 10:10
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f432846182714b93a1c3df0f5835c09b
-rw-------   1 hadoop supergroup        491 2013-09-24 11:18
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f7d8669cf7a047b98c1d3b13c16cfaec
-rw-------   1 hadoop supergroup        491 2013-09-24 11:18
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fa1b8f6cc9584eb28365dcd8f10d3f0a
-rw-------   1 hadoop supergroup        491 2013-09-13 11:28
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
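
One way to check what time range a given store file actually carries (a hedged
suggestion, assuming the HFile tool bundled with 0.94) is to dump its file
info, which normally includes the TIMERANGE entry the time-range tracker was
persisted into; the exact flags and output may differ by version. For example,
against the first file in the listing above:

hbase org.apache.hadoop.hbase.io.hfile.HFile -m -f /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/014ead47a9484d67b55205be16802ff1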



On Tue, Sep 24, 2013 at 11:14 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> TTL seems to be fine.
>
> -1 is the default value for TimeRangeTracker.maximumTimestamp.
>
> Can you run:
> hadoop fs -lsr hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/
>
> Thanks,
>
> JM
>
>
> 2013/9/24 Tom Brown <tombrown52@gmail.com>
>
> > 1. Hadoop version is 1.1.2.
> > 2. All servers are synched with NTP.
> > 3. Table definition is: 'compound0', {
> > NAME => 'd',
> > DATA_BLOCK_ENCODING => 'NONE',
> > BLOOMFILTER => 'ROW',
> > REPLICATION_SCOPE => '0',
> > VERSIONS => '1',
> > COMPRESSION => 'SNAPPY',
> > MIN_VERSIONS => '0',
> > TTL => '8640000',
> > KEEP_DELETED_CELLS => 'false',
> > BLOCKSIZE => '65536',
> > IN_MEMORY => 'false',
> > ENCODE_ON_DISK => 'true',
> > BLOCKCACHE => 'true'
> > }
> >
> > The TTL is supposed to be 100 days.
> >
> > --Tom
> >
> >
> > On Tue, Sep 24, 2013 at 10:53 AM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> > > Another important piece of information that might be the root cause of
> > > this issue...
> > >
> > > Do you have any TTL defined for this table?
> > >
> > > JM
> > >
> > >
> > > 2013/9/24 Jean-Marc Spaggiari <jean-marc@spaggiari.org>
> > >
> > > > Strange.
> > > >
> > > > A few questions then.
> > > > 1) What is your Hadoop version?
> > > > 2) Are the clocks on all your servers synched with NTP?
> > > > 3) What is your table definition? Bloom filters, etc.?
> > > >
> > > > This is the reason why it keeps compacting:
> > > >
> > > > 2013-09-24 10:04:00,548 INFO
> > > > org.apache.hadoop.hbase.regionserver.compactions.CompactSelection:
> > > > Deleting the expired store file by compaction:
> > > > hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/7426f128469242ec8ee09f3965fd5a1a
> > > > whose maxTimeStamp is -1 while the max expired timestamp is 1371398640548
> > > >
> > > > maxTimeStamp = -1
> > > >
> > > >
> > > > Each time, there is a comparison between the maxTimeStamp for this
> > > > store file and the configured maxExpiredTimeStamp, and since
> > > > maxTimeStamp returns -1, the file is always elected for a compaction.
> > > > Now, we need to find out why...
> > > >
> > > > JM
> > > >
> > > >
> > > > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > >
> > > >> My cluster is fully distributed (2 regionserver nodes).
> > > >>
> > > >> Here is a snippet of log entries that may explain why it started:
> > > >> http://pastebin.com/wQECif8k. I had to go back 2 days to find when it
> > > >> started for this region.
> > > >>
> > > >> This is not the only region experiencing this issue (but this is the
> > > >> smallest one it's happened to).
> > > >>
> > > >> --Tom
> > > >>
> > > >>
> > > >> On Tue, Sep 24, 2013 at 10:13 AM, Jean-Marc Spaggiari <
> > > >> jean-marc@spaggiari.org> wrote:
> > > >>
> > > >> > Can you paste logs from a bit before that? To see if anything
> > > >> > triggered the compaction? Before the 1M compaction entries.
> > > >> >
> > > >> > Also, what is your setup? Are you running in Standalone, Pseudo-Dist,
> > > >> > or Fully-Dist?
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > JM
> > > >> >
> > > >> >
> > > >> > 2013/9/24 Tom Brown <tombrown52@gmail.com>
> > > >> >
> > > >> > > There is one column family, d. Each row has about 10 columns, and
> > > >> > > each row's total data size is less than 2K.
> > > >> > >
> > > >> > > Here is a small snippet of logs from the region server:
> > > >> > > http://pastebin.com/S2jE4ZAx
> > > >> > >
> > > >> > > --Tom
> > > >> > >
> > > >> > >
> > > >> > > On Tue, Sep 24, 2013 at 9:59 AM, Bharath Vissapragada <
> > > >> > > bharathv@cloudera.com> wrote:
> > > >> > >
> > > >> > > > It would help if you can show your RS log (via pastebin?). Are
> > > >> > > > there frequent flushes for this region too?
> > > >> > > >
> > > >> > > >
> > > >> > > > On Tue, Sep 24, 2013 at 9:20 PM, Tom Brown <tombrown52@gmail.com>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > > > I have a region that is very small, only 5MB. Despite its
> > > >> > > > > size, it has 24 store files. The logs show that it's
> > > >> > > > > compacting (over and over again).
> > > >> > > > >
> > > >> > > > > The odd thing is that even though there are 24 store files, it
> > > >> > > > > only does one at a time. Even more strange is that my logs are
> > > >> > > > > filling up with compacting this one region. In the last 10
> > > >> > > > > hours, there have been 1,876,200 log entries corresponding to
> > > >> > > > > compacting this region alone.
> > > >> > > > >
> > > >> > > > > My cluster is 0.94.10, and using almost all default settings.
> > > >> > > > > Only a few are not default:
> > > >> > > > > hbase.hregion.max.filesize = 4294967296
> > > >> > > > > hbase.hstore.compaction.min = 6
> > > >> > > > >
> > > >> > > > > I am at a total loss as to why this behavior is occurring. Any
> > > >> > > > > help is appreciated.
> > > >> > > > >
> > > >> > > > > --Tom
> > > >> > > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > --
> > > >> > > > Bharath Vissapragada
> > > >> > > > <http://www.cloudera.com>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
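
For reference, a minimal sketch of the expiry check described above (the class
and method names below are illustrative, not the actual 0.94 CompactSelection
source; the "now" value and the UTC-6 offset are assumptions): a store file
whose time-range tracker was never populated reports maxTimeStamp = -1, which
is always below "now - TTL", so the same file keeps being selected as expired
on every compaction check.

import java.util.ArrayList;
import java.util.List;

class ExpiredFileSelectionSketch {

    // Hypothetical stand-in for a store file's metadata.
    static class StoreFileInfo {
        final String path;
        final long maxTimestamp; // -1 when the time range was never recorded
        StoreFileInfo(String path, long maxTimestamp) {
            this.path = path;
            this.maxTimestamp = maxTimestamp;
        }
    }

    // ttlSeconds = 8640000 from the table definition, i.e. 8640000 / 86400
    // = 100 days, so the TTL value itself matches what Tom intended.
    static List<StoreFileInfo> selectExpired(List<StoreFileInfo> files,
                                             long nowMillis, long ttlSeconds) {
        long maxExpiredTimestamp = nowMillis - ttlSeconds * 1000L;
        List<StoreFileInfo> expired = new ArrayList<StoreFileInfo>();
        for (StoreFileInfo f : files) {
            // With maxTimestamp == -1 this is always true, so the file is
            // elected for an "expired file" compaction over and over again.
            if (f.maxTimestamp < maxExpiredTimestamp) {
                expired.add(f);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        // 2013-09-24 10:04:00,548 local (assumed UTC-6) is 1380038640548 ms;
        // minus 100 days gives 1371398640548 ms, matching the "max expired
        // timestamp" printed in the log line quoted above.
        long now = 1380038640548L;
        List<StoreFileInfo> files = new ArrayList<StoreFileInfo>();
        files.add(new StoreFileInfo("7426f128469242ec8ee09f3965fd5a1a", -1L));
        for (StoreFileInfo f : selectExpired(files, now, 8640000L)) {
            System.out.println("would compact (expired): " + f.path);
        }
    }
}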
