hbase-user mailing list archives

From Kristoffer Sjögren <sto...@gmail.com>
Subject Re: Timerange scan
Date Mon, 02 Mar 2015 08:42:42 GMT
Thanks, great explanation!

Forgive my laziness, but do you happen to know which part(s) of the code
base to look into for more details?

On Sun, Mar 1, 2015 at 9:38 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> I was going to say something similar. But as soon as you have a major
> compaction you end up with a single file with everything in it. So
> depending on your key distribution you might still read everything. If you
> read just the last few minutes of a huge table, then yes, the skip will help.
> Otherwise, I'm not sure it will help that much :(
>
> 2015-02-28 18:25 GMT-05:00 Nick Dimiduk <ndimiduk@gmail.com>:
>
> > A Scan without start and end rows will be issued to all regions in the
> > table -- a full table scan. Within each region, store files will be
> > selected to participate in the scan based on the min/max timestamps
> > from their headers.
> >
> > On Saturday, February 28, 2015, Kristoffer Sjögren <stoffe@gmail.com>
> > wrote:
> >
> > > If Scan.setTimeRange is a full table scan then it runs surprisingly fast
> > > on tables that host a few hundred million rows :-)
> > >
> > >
> > >
> > > On Sat, Feb 28, 2015 at 8:05 PM, Kristoffer Sjögren <stoffe@gmail.com>
> > > wrote:
> > >
> > > > Hi Jean-Marc
> > > >
> > > > I was thinking of Scan.setTimeRange to only get the x latest rows,
> > > > but I would like to avoid a full table scan.
> > > >
> > > > The alternative would be to set the timestamp in the key and use
> > > > start and stop keys. But since HBase is already aware of timestamps
> > > > I thought it might optimize Scan.setTimeRange scans?
> > > >
> > > > Cheers,
> > > > -Kristoffer
> > > >
> > > > On Sat, Feb 28, 2015 at 7:45 PM, Jean-Marc Spaggiari <
> > > > jean-marc@spaggiari.org> wrote:
> > > >
> > > >> Hi Kristoffer,
> > > >>
> > > > >> What do you mean by "timerange scans"? If you want to scan
> > > > >> everything from your table, you will always end up with a full
> > > > >> table scan, no?
> > > >>
> > > >> JM
> > > >>
> > > >> 2015-02-28 13:41 GMT-05:00 Kristoffer Sjögren <stoffe@gmail.com
> > > <javascript:;>>:
> > > >>
> > > >> > Hi
> > > >> >
> > > > >> > I want to understand the effectiveness of timerange scans without
> > > > >> > setting start and stop keys. Will HBase do a full table scan, or
> > > > >> > will the scan be optimized in any way?
> > > >> >
> > > >> > Cheers,
> > > >> > -Kristoffer
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
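
To illustrate the store file selection described in the thread, here is a minimal sketch in plain Java. It is a toy model, not HBase's actual code: the class TimeRangePruningSketch, the StoreFileInfo type and the selectFiles method are hypothetical names made up for this example. It assumes each store file records the minimum and maximum cell timestamp it contains, and that the requested range is [minStamp, maxStamp) with an exclusive upper bound, as with Scan.setTimeRange.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of time-range pruning of store files; NOT HBase's implementation. */
final class TimeRangePruningSketch {

    /** Hypothetical stand-in for the min/max timestamp metadata of one store file. */
    static final class StoreFileInfo {
        final String name;
        final long minTimestamp; // oldest cell timestamp in the file
        final long maxTimestamp; // newest cell timestamp in the file

        StoreFileInfo(String name, long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Returns the files whose [minTimestamp, maxTimestamp] range overlaps the
     * requested [minStamp, maxStamp) range. Files that cannot contain a
     * matching cell are skipped without being read.
     */
    static List<StoreFileInfo> selectFiles(List<StoreFileInfo> files,
                                           long minStamp, long maxStamp) {
        List<StoreFileInfo> selected = new ArrayList<>();
        for (StoreFileInfo file : files) {
            boolean overlaps = file.maxTimestamp >= minStamp
                    && file.minTimestamp < maxStamp;
            if (overlaps) {
                selected.add(file);
            }
        }
        return selected;
    }
}
```

With files covering timestamps 0-100 and 900-1000, a request for [950, 1001) selects only the second file. A single file covering 0-1000, as you would have after a major compaction, is always selected, which matches the caveat above: once everything sits in one file, the timestamp check alone no longer lets the scan skip data.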
