hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Major Compaction Concerns
Date Thu, 12 Jan 2012 22:09:23 GMT
Thanks for the tips, Nicolas.

About lazy seek: if you were referring to HBASE-4465, that was only
integrated into TRUNK and 0.89-fb.
I was thinking about backporting it to 0.92.


On Thu, Jan 12, 2012 at 1:44 PM, Nicolas Spiegelberg <nspiegelberg@fb.com> wrote:

> Mikael,
> >The system is an OLTP system with strict latency and throughput
> >requirements; regions are pre-split and throughput is controlled.
> >
> >The system has heavy load periods lasting a few hours; by heavy load I
> >mean a high proportion of inserts/updates and a small proportion of reads.
> I'm not sure about the production status of your system, but it sounds
> like you have a critical need for dozens of optimization features coming
> out in 0.92 and even some trunk patches.  In particular, update speed has
> been drastically improved by lazy seek.  Although you can get incremental
> wins from different compaction features, you will get far larger wins
> from looking at those other features right now.
> >we fall into the memstore flush throttling ("will wait 90000 ms before
> >flushing the memstore"), retaining more logs and triggering more flushes
> >that can't complete, adding pressure on the system memory (the memstore is
> >not flushed on time)
> Filling up the logs faster than you can flush them normally indicates
> that you have disk or network saturation.  If you have an increment
> workload, I
> know there are a number of patches in 0.92 that will drastically reduce
> your flush size (1: read memstore before going to disk, 2: don't flush all
> versions).  You don't have a compaction problem, you have a write/read
> problem.
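> As a concrete illustration of what I mean by an increment workload,
> something like the following (untested sketch against the 0.90/0.92
> client API; the table, family, and qualifier names are made up) does a
> single atomic read-modify-write per cell on the server side:
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.client.HTable;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   Configuration conf = HBaseConfiguration.create();
>   HTable table = new HTable(conf, "mytable");
>   // Atomically bump a counter cell; the server reads the current value
>   // and writes the incremented one in a single call.
>   long newValue = table.incrementColumnValue(
>       Bytes.toBytes("row1"),     // row key
>       Bytes.toBytes("cf1"),      // column family
>       Bytes.toBytes("counter"),  // qualifier
>       1L);                       // amount to add
>   table.close();
>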
> In 0.92, you can try setting your compaction.ratio down (0.25 is a good
> start) to increase the StoreFile count, which slows reads but saves
> network IO on the write path.  This setting is very similar to the
> defaults suggested in the BigTable paper.  However, it is only going to
> cut your network IO roughly in half.  The LevelDB and BigTable algorithms
> can reduce your outlier StoreFile count, but they wouldn't be able to cut
> this IO volume down much either.
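> To be explicit about the knob (untested sketch; I'm assuming the
> 0.92-era property name hbase.hstore.compaction.ratio, and in production
> you would set this in hbase-site.xml on the region servers rather than
> in code):
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>
>   Configuration conf = HBaseConfiguration.create();
>   // A lower ratio makes minor compactions select fewer/smaller files,
>   // so more StoreFiles stay on disk (slower reads) but less data is
>   // rewritten per compaction (less network/disk IO on the write path).
>   conf.setFloat("hbase.hstore.compaction.ratio", 0.25f);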
> >Please remember I'm on 0.90.1, so when a major compaction is running,
> >minor compactions are blocked, and when the memstore for one column family
> >is flushed, the memstores for all other column families are flushed as
> >well (no matter whether they are smaller or not).  As you already wrote,
> >the best way is to manage compactions, and that is what I tried to do.
> Per-storefile compactions & multi-threaded compactions were added in 0.92 to
> address this problem.  However, a high StoreFile count is not necessarily
> a bad thing.  For an update workload, you only have to read the newest
> StoreFile and lazy seek optimizes your situation a lot (again 0.92).
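> If you do live with more StoreFiles, the related setting to watch is the
> per-Store blocking file count (assuming the property name
> hbase.hstore.blockingStoreFiles; again this normally goes in
> hbase-site.xml).  Raising it keeps writers from stalling while
> compactions catch up, e.g. reusing the Configuration from the sketch
> above:
>
>   // Allow more StoreFiles per Store before updates are blocked waiting
>   // for compactions to catch up (the default was 7, if I recall).
>   conf.setInt("hbase.hstore.blockingStoreFiles", 20);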
> >Regarding the compaction pluggability needs:
> >Let's suppose that the data you are inserting into different column
> >families has different patterns; for example, in CF1 (column family #1)
> >you update fields in the same row key, while in CF2 you add new fields
> >each time, or CF2 gets new rows and older rows are never updated.
> >Wouldn't you use different algorithms for compacting these CFs?
> There are mostly 3 different workloads that require different
> optimizations (not necessarily compaction-related):
> 1. Reads of old data.  These should use bloom filters to filter out
> StoreFiles (see the sketch after this list).
> 2. Read + write.  This benefits far more from lazy seeks & cache-on-write
> (0.92) than from any compaction algorithm.
> 3. Write-mostly.  Compactions don't really matter here; you just don't
> want them sucking too much IO.
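> For workload #1, here is a rough example of enabling a row bloom filter
> on a column family (untested; I'm assuming the 0.92 API where the bloom
> type lives on StoreFile.BloomType, and the table/family names are made
> up):
>
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.HColumnDescriptor;
>   import org.apache.hadoop.hbase.HTableDescriptor;
>   import org.apache.hadoop.hbase.client.HBaseAdmin;
>   import org.apache.hadoop.hbase.regionserver.StoreFile;
>
>   HTableDescriptor desc = new HTableDescriptor("mytable");
>   HColumnDescriptor cf1 = new HColumnDescriptor("cf1");
>   // ROW blooms let a Get skip StoreFiles that cannot contain the row;
>   // use ROWCOL if lookups always specify row + qualifier.
>   cf1.setBloomFilterType(StoreFile.BloomType.ROW);
>   desc.addFamily(cf1);
>   new HBaseAdmin(HBaseConfiguration.create()).createTable(desc);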
> >Finally, the schema design is guided by the ACID properties of a row.  We
> >have only 2 CFs; both CFs hold different volumes of data even though they
> >are updated with approximately the same amount of data (cells updated vs.
> >cells created).
> Note that 0.90 only had row-based write atomicity.  HBASE-2856 is
> necessary for row-based read atomicity across column families.
