hbase-user mailing list archives

From Bing Jiang <jiangbinglo...@gmail.com>
Subject Re: Hbase scan using TIMERANGE
Date Fri, 06 Feb 2015 03:24:03 GMT
Thanks a lot for the pointers, Ted.

Yes, with a tight time range the scan is very slow to fill the cache.

I will investigate HBASE-5032 further and report back if there is any
progress or improvement.

Thank you!

-Bing

2015-02-05 11:34 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:

> bq. set a sparse TimeRange
>
> You mean a TimeRange whose span is short ?
>
> bq. and large scan cache
>
> Can you try smaller number of rows for caching ?
>
> A preliminary search led me to HBASE-5032 'Add other DELETE type
> information into the delete bloom filter to optimize the time range query'
>
> Cheers
>
> On Wed, Feb 4, 2015 at 7:26 PM, Bing Jiang <jiangbinglover@gmail.com>
> wrote:
>
> > hi, Ted.
> >
> > Do you know whether there is optimization on scan with TimeRange?
> >
> > Actually, if I set a sparse TimeRange and a large scan cache, it
> > sometimes causes an RPC timeout.
> >
> >
> > I also want to know whether the scan has to read each KV to check
> > its timestamp.
> >
> > Thanks,
> > -Bing
> >
> > 2014-06-28 21:25 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:
> >
> > > Have you looked at the following method in AggregationClient ?
> > >
> > >   long rowCount(final HTable table,
> > >       final ColumnInterpreter<R, S, P, Q, T> ci, final Scan scan)
> > >       throws Throwable {
> > >
> > > You can specify timerange through scan parameter.
> > >
> > > See this method of Scan:
> > >
> > >   public Scan setTimeRange(long minStamp, long maxStamp)
> > >
> > > Cheers
> > >
> > >
> > > On Sat, Jun 28, 2014 at 3:42 AM, yogi <yogidave15@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > > I need to write a shell script that scans about 6 huge HBase tables
> > > > > and gets the count of records in each. I also need the counts per
> > > > > day, with the date passed as a parameter to the shell script that
> > > > > runs the scan commands. I found a way to convert the date to epoch
> > > > > time and pass it to the scan command, but the scan keeps running
> > > > > forever. Can someone help me make this faster?
> > > > >
> > > > > Note: I am scanning the tables based on TIMERANGE, as all the tables
> > > > > have this field.
> > > >
> > > > Thanks,
> > > > Yogi
> > > >
> > > >
> > > >
> > > > --
> > > > > View this message in context:
> > > > > http://apache-hbase.679495.n3.nabble.com/Hbase-scan-using-TIMERANGE-tp4060851.html
> > > > Sent from the HBase User mailing list archive at Nabble.com.
> > > >
> > >
> >
>
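
For the per-day counts in the original question quoted above, here is a
minimal sketch of the date-to-epoch step. It assumes cell timestamps are the
default epoch milliseconds and that a "day" means a UTC calendar day; the
class and method names (DayTimeRange, forDay) are made up for illustration
and are not part of HBase. It only computes the two bounds; it does not make
the scan itself any faster.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper, not an HBase class: turns a calendar day into the
// pair of bounds expected by a time-range scan.
final class DayTimeRange {

    // Returns {minStamp, maxStamp} in epoch milliseconds for the given
    // ISO date (yyyy-MM-dd). minStamp is the start of that day and
    // maxStamp is the start of the next day, so the pair is meant to be
    // used as a half-open range [minStamp, maxStamp).
    static long[] forDay(String isoDate, ZoneId zone) {
        LocalDate day = LocalDate.parse(isoDate);
        long minStamp = day.atStartOfDay(zone).toInstant().toEpochMilli();
        long maxStamp = day.plusDays(1).atStartOfDay(zone).toInstant().toEpochMilli();
        return new long[] {minStamp, maxStamp};
    }

    public static void main(String[] args) {
        // Assumption: UTC day boundaries; pass another ZoneId if the
        // data was written with local-time days in mind.
        String date = args.length > 0 ? args[0] : "2014-06-28";
        long[] range = forDay(date, ZoneOffset.UTC);
        System.out.println(range[0] + " " + range[1]);
    }
}
```

The two values can then be handed to Scan.setTimeRange(minStamp, maxStamp),
which treats minStamp as inclusive and maxStamp as exclusive, or to TIMERANGE
in an hbase shell scan. Using the start of the next day as the upper bound
avoids dropping cells written in the last millisecond of the day.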



-- 
Bing Jiang
