hbase-user mailing list archives

From Anoop John <anoop.hb...@gmail.com>
Subject Re: Questions about HBase
Date Wed, 05 Jun 2013 08:24:40 GMT
Why are there so many misses for the index blocks? What block cache memory
size are you using?

On Wed, Jun 5, 2013 at 12:37 PM, ramkrishna vasudevan
<ramkrishna.s.vasudevan@gmail.com> wrote:

> I get your point, Pankaj.
> Going through the code to confirm it:
>
>     // Data index. We also read statistics about the block index written
>     // after the root level.
>     dataBlockIndexReader.readMultiLevelIndexRoot(
>         blockIter.nextBlockWithBlockType(BlockType.ROOT_INDEX),
>         trailer.getDataIndexCount());
>
>     // Meta index.
>     metaBlockIndexReader.readRootIndex(
>         blockIter.nextBlockWithBlockType(BlockType.ROOT_INDEX),
>         trailer.getMetaIndexCount());
>
> We read the root level of the multi-level index and the actual root index.
> So as and when we need new index blocks we will be hitting the disk, and
> your observation is correct.  Sorry if I confused you on this.
> The new version of HFile was mainly to address the concern in the previous
> version, where the entire index was in memory.  Version V2 addresses that
> concern by keeping only the root level (something like metadata of the
> indices) in memory; from there you can get to the deeper index blocks.
> But there is a chance that if your region size is small you may have only
> one level, and the entire thing may be in memory.
>
> Regards
> Ram
>
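To make that lookup concrete, here is a minimal, self-contained Java sketch
of a two-level index of the kind described above. The class and names are
illustrative only, not HBase's actual reader internals: the point is that
only the root level lives in memory, so resolving a key can require loading
a leaf index block, which is a disk read whenever that block is not in the
block cache.

    // Illustrative two-level index: a binary search over the in-memory root
    // picks a leaf index block, which may have to be read from disk.
    class TwoLevelIndexSketch {
      private final byte[][] rootFirstKeys;   // first key covered by each leaf
      private final long[] leafBlockOffsets;  // file offset of each leaf block

      TwoLevelIndexSketch(byte[][] rootFirstKeys, long[] leafBlockOffsets) {
        this.rootFirstKeys = rootFirstKeys;
        this.leafBlockOffsets = leafBlockOffsets;
      }

      // Returns the offset of the leaf index block that may contain the key.
      long findLeafOffset(byte[] key) {
        int lo = 0, hi = rootFirstKeys.length - 1, pos = 0;
        while (lo <= hi) {
          int mid = (lo + hi) >>> 1;
          if (compare(rootFirstKeys[mid], key) <= 0) { pos = mid; lo = mid + 1; }
          else { hi = mid - 1; }
        }
        // Loading the leaf block at leafBlockOffsets[pos] is exactly the
        // step that shows up as an "index block miss" when it isn't cached.
        return leafBlockOffsets[pos];
      }

      private static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
          int d = (a[i] & 0xff) - (b[i] & 0xff);
          if (d != 0) return d;
        }
        return a.length - b.length;
      }
    }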
>
> On Wed, Jun 5, 2013 at 11:56 AM, Pankaj Gupta <pankaj@brightroll.com>
> wrote:
>
> > Sorry, forgot to mention that I added the log statements to the
> > readBlock method in HFileReaderV2.java. I'm on HBase 0.94.2.
> >
> >
> > On Tue, Jun 4, 2013 at 11:16 PM, Pankaj Gupta <pankaj@brightroll.com>
> > wrote:
> >
> > > Some context on how I observed bloom filters being loaded constantly. I
> > > added the following logging statements to HFileReaderV2.java:
> > >
> > >         if (!useLock) {
> > >           // Check cache again with lock.
> > >           useLock = true;
> > >           continue;
> > >         }
> > >
> > >         // Load block from filesystem.
> > >         long startTimeNs = System.nanoTime();
> > >         HFileBlock hfileBlock = fsBlockReader.readBlockData(dataBlockOffset,
> > >             onDiskBlockSize, -1, pread);
> > >         hfileBlock = dataBlockEncoder.diskToCacheFormat(hfileBlock,
> > >             isCompaction);
> > >         validateBlockType(hfileBlock, expectedBlockType);
> > >         passSchemaMetricsTo(hfileBlock);
> > >         BlockCategory blockCategory =
> > >             hfileBlock.getBlockType().getCategory();
> > >
> > >         // My logging statements ---->
> > >         if (blockCategory == BlockCategory.INDEX) {
> > >           LOG.info("index block miss, reading from disk " + cacheKey);
> > >         } else if (blockCategory == BlockCategory.BLOOM) {
> > >           LOG.info("bloom block miss, reading from disk " + cacheKey);
> > >         } else {
> > >           LOG.info("block miss other than index or bloom, reading from disk "
> > >               + cacheKey);
> > >         }
> > >         // -------------->
> > >
> > >         final long delta = System.nanoTime() - startTimeNs;
> > >         HFile.offerReadLatency(delta, pread);
> > >         getSchemaMetrics().updateOnCacheMiss(blockCategory, isCompaction,
> > >             delta);
> > >
> > >         // Cache the block if necessary.
> > >         if (cacheBlock && cacheConf.shouldCacheBlockOnRead(
> > >             hfileBlock.getBlockType().getCategory())) {
> > >           cacheConf.getBlockCache().cacheBlock(cacheKey, hfileBlock,
> > >               cacheConf.isInMemory());
> > >         }
> > >
> > >         if (hfileBlock.getBlockType() == BlockType.DATA) {
> > >           HFile.dataBlockReadCnt.incrementAndGet();
> > >         }
> > >
> > > With these in place I saw the following statements in the log:
> > > 2013-06-05 01:04:55,281 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_30361506
> > > 2013-06-05 01:05:00,579 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_28779560
> > > 2013-06-05 01:07:41,335 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_4199735
> > > 2013-06-05 01:08:58,460 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_8519720
> > > 2013-06-05 01:11:01,545 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_12838948
> > > 2013-06-05 01:11:03,035 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_3973250
> > > 2013-06-05 01:11:36,339 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_17159812
> > > 2013-06-05 01:12:35,398 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_21478349
> > > 2013-06-05 01:13:02,572 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_25798003
> > > 2013-06-05 01:13:03,260 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_8068381
> > > 2013-06-05 01:13:20,265 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_30118048
> > > 2013-06-05 01:13:20,522 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_60833137
> > > 2013-06-05 01:13:32,261 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_34545951
> > > 2013-06-05 01:13:48,504 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_38865311
> > > 2013-06-05 01:13:49,951 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_12161793
> > > 2013-06-05 01:14:02,073 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_43185677
> > > 2013-06-05 01:14:12,956 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_47506066
> > > 2013-06-05 01:14:25,132 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_51825831
> > > 2013-06-05 01:14:25,946 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_16257519
> > > 2013-06-05 01:14:34,478 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_56145793
> > > 2013-06-05 01:14:45,319 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_60466405
> > > 2013-06-05 01:14:45,998 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_91304775
> > > 2013-06-05 01:14:58,203 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_64893493
> > > 2013-06-05 01:14:58,463 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_20352561
> > > 2013-06-05 01:15:09,299 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_69214092
> > > 2013-06-05 01:15:32,944 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_73533616
> > > 2013-06-05 01:15:46,903 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_77865906
> > > 2013-06-05 01:15:47,273 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_24448138
> > > 2013-06-05 01:15:55,312 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_82185687
> > > 2013-06-05 01:16:07,591 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_86506129
> > > 2013-06-05 01:16:20,728 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_90825624
> > > 2013-06-05 01:16:22,551 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_28542144
> > > 2013-06-05 01:16:22,810 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_121777484
> > > 2013-06-05 01:16:23,035 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_57670002
> > > 2013-06-05 01:16:33,196 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_95253904
> > > 2013-06-05 01:16:48,187 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_99574899
> > > 2013-06-05 01:17:06,648 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_103895087
> > > 2013-06-05 01:17:10,526 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_32744846
> > > 2013-06-05 01:17:22,939 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_108214936
> > > 2013-06-05 01:17:36,010 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_112535209
> > > 2013-06-05 01:17:46,028 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_116855742
> > > 2013-06-05 01:17:47,029 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_36838416
> > > 2013-06-05 01:17:54,472 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_121174753
> > > 2013-06-05 01:17:55,491 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_152248177
> > > 2013-06-05 01:18:05,912 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_125601238
> > > 2013-06-05 01:18:15,417 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_129921797
> > > 2013-06-05 01:18:16,713 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_40933856
> > > 2013-06-05 01:18:29,521 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_134242324
> > > 2013-06-05 01:18:38,653 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_138561860
> > > 2013-06-05 01:18:49,280 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_142881436
> > > 2013-06-05 01:18:50,052 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 52cded0c399b48fdbccd8b3d4e25502f_45029905
> > > 2013-06-05 01:18:58,339 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_147201737
> > > 2013-06-05 01:19:06,371 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: bloom block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_151533253
> > > 2013-06-05 01:19:07,782 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_182719269
> > >
> > > I kept seeing these statements appear constantly over a long period.
> > > That seemed to confirm that bloom filter blocks are being loaded over a
> > > period of time, which also matched what I read about HFile v2. Maybe I
> > > am wrong about both. Would love to understand what's really going on.
> > >
> > > Thanks in advance,
> > > Pankaj
> > >
> > >
> > >
> > > On Tue, Jun 4, 2013 at 11:05 PM, ramkrishna vasudevan <
> > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > >
> > >> Whenever the region is opened, all the bloom filter metadata are loaded
> > >> into memory.  I think his concern is that every time, all the store
> > >> files are read and then loaded into memory, and he wants some faster
> > >> way of doing it.
> > >> Asaf, you are right.
> > >>
> > >> Regards
> > >> Ram
> > >>
> > >>
> > >> On Wed, Jun 5, 2013 at 11:22 AM, Asaf Mesika <asaf.mesika@gmail.com>
> > >> wrote:
> > >>
> > >> > When you do the first read of this region, wouldn't this load all
> > >> > bloom filters?
> > >> >
> > >> >
> > >> >
> > >> > On Wed, Jun 5, 2013 at 8:43 AM, ramkrishna vasudevan <
> > >> > ramkrishna.s.vasudevan@gmail.com> wrote:
> > >> >
> > >> > > As for the question of whether you will be able to do a warm-up for
> > >> > > the bloom and block cache: I don't think it is possible now.
> > >> > >
> > >> > > Regards
> > >> > > Ram
> > >> > >
> > >> > >
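There is no built-in warm-up in 0.94, but a workaround sometimes used is to
run one full scan with block caching enabled right after a restart, so the
data and index blocks it touches land back in the block cache as a side
effect. Below is a minimal sketch against the 0.94 client API; the table
name and caching value are placeholders, and note that a plain scan does not
consult bloom filters, so bloom blocks would still be faulted in by
subsequent Gets.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    // Hypothetical warm-up: one pass over the table with block caching on,
    // so blocks read along the way are cached on the region servers.
    public class CacheWarmup {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "my_table");  // placeholder name
        try {
          Scan scan = new Scan();
          scan.setCacheBlocks(true);  // cache blocks touched by this scan
          scan.setCaching(1000);      // rows per RPC; illustrative value
          ResultScanner scanner = table.getScanner(scan);
          try {
            long rows = 0;
            for (Result r : scanner) {
              rows++;  // results are discarded; the caching is the point
            }
            System.out.println("Scanned " + rows + " rows to warm the cache");
          } finally {
            scanner.close();
          }
        } finally {
          table.close();
        }
      }
    }

One caveat with this approach: the scan caches data blocks too, which can
evict other working-set blocks, so it is best done before serving traffic.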
> > >> > > On Wed, Jun 5, 2013 at 10:57 AM, Asaf Mesika <asaf.mesika@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > If you read the HFile v2 document on the HBase site you will
> > >> > > > understand completely how the search for a record works, and why
> > >> > > > there is a linear search within a block but a binary search to
> > >> > > > get to the right block.
> > >> > > > Also bear in mind that the number of keys in a block is not big,
> > >> > > > since a block in an HFile is 64 KB by default; thus from a 10 GB
> > >> > > > HFile you are only fully scanning 64 KB of it.
> > >> > > >
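As a back-of-envelope check of that claim, a tiny sketch using the 64 KB
block size and 10 GB file size from above (the arithmetic, not the code, is
the point):

    // A 10 GB HFile with 64 KB blocks: the index narrows a lookup to one
    // block, so at most ~64 KB is ever scanned linearly.
    public class SearchCostSketch {
      public static void main(String[] args) {
        long fileBytes = 10L * 1024 * 1024 * 1024;  // 10 GB
        long blockBytes = 64L * 1024;               // 64 KB default block size
        long blocks = fileBytes / blockBytes;       // 163,840 blocks
        // ceil(log2(blocks)) comparisons to locate the right block.
        int comparisons = 64 - Long.numberOfLeadingZeros(blocks - 1);
        System.out.println(blocks + " blocks, ~" + comparisons
            + " comparisons to find the right one");
      }
    }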
> > >> > > > On Wednesday, June 5, 2013, Pankaj Gupta wrote:
> > >> > > >
> > >> > > > > Thanks for the replies. I'll take a look at
> > >> > > > > src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java.
> > >> > > > >
> > >> > > > > @ramkrishna: I do want to have the bloom filter and block index
> > >> > > > > in memory all the time. For good read performance they're
> > >> > > > > critical in my workflow. The worry is that when HBase is
> > >> > > > > restarted it will take a long time for them to get populated
> > >> > > > > again, and performance will suffer. If there were a way of
> > >> > > > > loading them quickly and warming up the table, then we'd be
> > >> > > > > able to restart HBase without causing a slowdown in processing.
> > >> > > > >
> > >> > > > >
> > >> > > > > On Tue, Jun 4, 2013 at 9:29 PM, Ted Yu <yuzhihong@gmail.com>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > bq. But i am not very sure if we can control the files getting
> > >> > > > > > selected for compaction in the older versions.
> > >> > > > > >
> > >> > > > > > The same mechanism is available in 0.94.
> > >> > > > > >
> > >> > > > > > Take a look at
> > >> > > > > > src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
> > >> > > > > > where you will find the following methods (and more):
> > >> > > > > >
> > >> > > > > >   public void preCompactSelection(
> > >> > > > > >       final ObserverContext<RegionCoprocessorEnvironment> c,
> > >> > > > > >       final Store store, final List<StoreFile> candidates,
> > >> > > > > >       final CompactionRequest request)
> > >> > > > > >
> > >> > > > > >   public InternalScanner preCompact(
> > >> > > > > >       ObserverContext<RegionCoprocessorEnvironment> e,
> > >> > > > > >       final Store store, final InternalScanner scanner)
> > >> > > > > >       throws IOException {
> > >> > > > > >
> > >> > > > > > Cheers
> > >> > > > > >
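A sketch of how preCompactSelection could be used for question 2, keeping
the newest files out of compactions. The observer is hypothetical: the
one-day cutoff is a made-up value, and the store file's HDFS modification
time is used as a stand-in for its age.

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.List;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.Store;
    import org.apache.hadoop.hbase.regionserver.StoreFile;
    import org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest;

    // Hypothetical observer: drop store files younger than a cutoff from the
    // candidate list, so recent files stay uncompacted and time-range scans
    // can keep skipping the older files.
    public class SkipRecentFilesObserver extends BaseRegionObserver {
      private static final long MIN_AGE_MS = 24L * 60 * 60 * 1000; // made up

      @Override
      public void preCompactSelection(
          final ObserverContext<RegionCoprocessorEnvironment> c,
          final Store store, final List<StoreFile> candidates,
          final CompactionRequest request) {
        long now = System.currentTimeMillis();
        Iterator<StoreFile> it = candidates.iterator();
        while (it.hasNext()) {
          StoreFile sf = it.next();
          try {
            Path p = sf.getPath();
            FileSystem fs =
                p.getFileSystem(c.getEnvironment().getConfiguration());
            long mtime = fs.getFileStatus(p).getModificationTime();
            if (now - mtime < MIN_AGE_MS) {
              it.remove();  // keep this recent file out of the compaction
            }
          } catch (IOException e) {
            // If the age cannot be determined, leave the file in the list.
          }
        }
      }
    }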
> > >> > > > > > On Tue, Jun 4, 2013 at 8:14 PM, ramkrishna vasudevan
> > >> > > > > > <ramkrishna.s.vasudevan@gmail.com> wrote:
> > >> > > > > >
> > >> > > > > > > >> Does Minor compaction remove HFiles in which all entries
> > >> > > > > > > >> are out of TTL, or does only Major compaction do that?
> > >> > > > > > > Yes, it applies to Minor compactions as well.
> > >> > > > > > >
> > >> > > > > > > >> Is there a way of configuring major compaction to compact
> > >> > > > > > > >> only files older than a certain time, or to compact all
> > >> > > > > > > >> the files except the latest few?
> > >> > > > > > > In the latest trunk version the compaction algorithm itself
> > >> > > > > > > can be plugged.  There are some coprocessor hooks that give
> > >> > > > > > > control over the scanner that gets created for compaction,
> > >> > > > > > > with which we can control the KVs being selected.  But I am
> > >> > > > > > > not very sure if we can control the files getting selected
> > >> > > > > > > for compaction in the older versions.
> > >> > > > > > >
> > >> > > > > > > >> The above excerpt seems to imply to me that the search
> > >> > > > > > > >> for a key inside a block is linear and I feel I must be
> > >> > > > > > > >> reading it wrong. I would expect the scan to be a binary
> > >> > > > > > > >> search.
> > >> > > > > > > Once the data block is identified for a key, we seek to the
> > >> > > > > > > beginning of the block and then do a linear search until we
> > >> > > > > > > reach the exact key that we are looking for.  That is
> > >> > > > > > > because internally the data (KVs) are stored as byte buffers
> > >> > > > > > > per block, following this pattern (see the sketch after this
> > >> > > > > > > message):
> > >> > > > > > > <keylength><valuelength><keybytearray><valuebytearray>
> > >> > > > > > >
> > >> > > > > > > >> Is there a way to warm up the bloom filter and block
> > >> > > > > > > >> index cache for a table?
> > >> > > > > > > You always want the bloom and block index to be in cache?
> > >> > > > > > >
> > >> > > > > > >
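A small self-contained sketch of that linear in-block search, assuming a
decoded block laid out as repeated
<keylength><valuelength><keybytearray><valuebytearray> records; the helper
is illustrative, not HBase's actual scanner:

    import java.nio.ByteBuffer;
    import java.util.Arrays;

    // Linear scan of one data block in the layout described above.
    public class BlockScanSketch {
      // Returns the value for an exact key match, or null if absent.
      static byte[] seekInBlock(ByteBuffer block, byte[] wanted) {
        ByteBuffer buf = block.duplicate();
        buf.rewind();
        while (buf.remaining() >= 8) {
          int keyLen = buf.getInt();   // <keylength>
          int valLen = buf.getInt();   // <valuelength>
          byte[] key = new byte[keyLen];
          buf.get(key);                // <keybytearray>
          if (Arrays.equals(key, wanted)) {
            byte[] val = new byte[valLen];
            buf.get(val);              // <valuebytearray>
            return val;
          }
          buf.position(buf.position() + valLen);  // skip value; next record
        }
        return null;  // key not in this block
      }
    }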
> > >> > > > > > > On Wed, Jun 5, 2013 at 7:45 AM, Pankaj Gupta
> > >> > > > > > > <pankaj@brightroll.com> wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi,
> > >> > > > > > > >
> > >> > > > > > > > I have a few small questions regarding HBase. I've
> > >> > > > > > > > searched the forum but couldn't find clear answers, hence
> > >> > > > > > > > asking them here:
> > >> > > > > > > >
> > >> > > > > > > >    1. Does Minor compaction remove HFiles in which all
> > >> > > > > > > >    entries are out of TTL, or does only Major compaction
> > >> > > > > > > >    do that? I found this JIRA:
> > >> > > > > > > >    https://issues.apache.org/jira/browse/HBASE-5199 but I
> > >> > > > > > > >    don't know if the compaction being talked about there
> > >> > > > > > > >    is minor or major.
> > >> > > > > > > >    2. Is there a way of configuring major compaction to
> > >> > > > > > > >    compact only files older than a certain time, or to
> > >> > > > > > > >    compact all the files except the latest few? We
> > >> > > > > > > >    basically want to use the time-based filtering
> > >> > > > > > > >    optimization in HBase to get the latest additions to
> > >> > > > > > > >    the table, and since major compaction bunches
> > >> > > > > > > >    everything into one file, it would defeat the
> > >> > > > > > > >    optimization.
> > >> > > > > > > >    3. Is there a way to warm up the bloom filter and
> > >> > > > > > > >    block index cache for a table? This is for a case
> > >> > > > > > > >    where I always want the bloom filters and index to be
> > >> > > > > > > >    all in memory, but not the
