hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: How does hbase find regionservers for scans
Date Tue, 18 Jul 2017 19:11:10 GMT
For #1, the timestamp is currently stored as epoch milliseconds on the
server side.
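
To make the unit explicit, here is a minimal Python sketch (plain Python,
not HBase code; the helper names are hypothetical) that converts the
epoch-second bounds from the scan that returned 0 rows into the
millisecond bounds from the scan that returned rows:

```python
def to_epoch_millis(epoch_seconds: int) -> int:
    """Convert an epoch timestamp in seconds to milliseconds."""
    return epoch_seconds * 1000


def timerange_millis(start_seconds: int, end_seconds: int) -> list:
    """Build a [min, max] pair in milliseconds for use as a TIMERANGE."""
    return [to_epoch_millis(start_seconds), to_epoch_millis(end_seconds)]


# Bounds from the scan that returned 0 rows (epoch seconds).
bounds = timerange_millis(1499205600, 1499206200)
print(bounds)  # -> [1499205600000, 1499206200000]

# The cell timestamp shown in the scan output falls inside the converted range.
cell_timestamp = 1499206083343
print(bounds[0] <= cell_timestamp < bounds[1])  # -> True
```

The second-based bounds are about a thousand times smaller than any stored
cell timestamp, which is consistent with that scan matching nothing.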

For #2, PrefixFilter should also work.
I performed the following on a small table:

scan 't', {FILTER => "PrefixFilter('111')"}

which returned the two matching rows.
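
On a large table the difference between the two options matters. As far
as I understand it, ROWPREFIXFILTER turns the prefix into a start row and
a stop row, so the scan only visits that slice of the sorted key space,
while a bare PrefixFilter with no start row begins at the first row of the
table. The sketch below is plain Python over an in-memory sorted list, not
the HBase implementation; the function names and sample row keys are
hypothetical.

```python
from bisect import bisect_left


def prefix_stop_row(prefix: bytes) -> bytes:
    """Smallest key that sorts after every key starting with `prefix`.

    Returns b"" when there is no upper bound (prefix is empty or all 0xFF).
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return b""
    # Increment the last byte that is not 0xFF and drop anything after it.
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def rows_with_prefix(sorted_keys, prefix: bytes):
    """Slice the sorted keys to [prefix, stop_row) without touching other keys."""
    start = bisect_left(sorted_keys, prefix)
    stop_row = prefix_stop_row(prefix)
    stop = bisect_left(sorted_keys, stop_row) if stop_row else len(sorted_keys)
    return sorted_keys[start:stop]


# Hypothetical row keys in the hash_servername_timestamp style, byte-sorted.
keys = sorted([
    b"0_a.example.com_1",
    b"111_x",
    b"1112_y",
    b"26_h_1",
    b"28_h_1",
    b"28_h_2",
    b"33_myserver.mydomain.com_1234567890",
])

print(prefix_stop_row(b"28"))         # -> b'29'
print(rows_with_prefix(keys, b"28"))  # -> [b'28_h_1', b'28_h_2']
print(bisect_left(keys, b"28"))       # -> 4 keys sort before the prefix
```

In this toy list four keys sort before "28". A scan that starts at the
first row has to pass over all such keys first, which would fit the
pattern of prefix '0' returning quickly and prefix '28' timing out.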

Cheers

On Tue, Jul 18, 2017 at 11:36 AM, S L <slouie.at.work@gmail.com> wrote:

> Thanks for the tip with RowPrefixFilter.  THAT works compared to
> PrefixFilter.  However, regarding the timerange, when I type in the
> epoch time in seconds, the scan returns 0 rows.  If I use epoch time in
> milliseconds, it returns tons of rows.
>
> I have more questions now:
> 1) Why does my hbase work with epoch in milliseconds when your example
> says to use epoch seconds?
> 2) Also, how do you use PrefixFilter?  I thought PrefixFilter was what
> I needed via common sense, but apparently my common sense didn't work.
>
> Thanks for answering all my questions the last couple weeks.
>
>  scan 'dbi_based_data', {ROWPREFIXFILTER=> '26', COLUMNS =>
> 'raw_data:processlist', TIMERANGE => [1499205600, 1499206200]}
>
> ROW                                COLUMN+CELL
>
> 0 row(s) in 0.2350 seconds
>
>
> scan 'dbi_based_data', {ROWPREFIXFILTER=> '26', COLUMNS =>
> 'raw_data:processlist', TIMERANGE => [1499205600000, 1499206200000]}
>
> <snip snip>
> <snip snip>
>
> 26_p3419.db160151.ycg1.dbi_149920 column=raw_data:processlist,
> timestamp=1499206083343, value= <snip snip>
>
> 351 row(s) in 184.6360 seconds
>
>
>
>
> On Fri, Jul 14, 2017 at 8:14 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > I wonder what time unit you were using.
> >
> > From the example in hbase-shell/src/main/ruby/shell/commands/scan.rb :
> >
> >   hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
> >
> > You can see the time range having much smaller values.
> >
> > Please look at ROWPREFIXFILTER example in the same scan.rb
> >
> > If you check the table UI for dbi_based_data, you would see the start key
> > of each region.
> > From there it is easy to pinpoint which server hosts the relevant region.
> >
> > Cheers
> >
> > On Fri, Jul 14, 2017 at 7:51 PM, S L <slouie.at.work@gmail.com> wrote:
> >
> > > Sorry if this is a basic question.  How does hbase determine which
> > > regionserver the rows are supposed to be stored on?  My rowkey looks
> > > like hash_servername_timestamp, e.g.
> > >
> > > 33_myserver.mydomain.com_1234567890
> > >
> > > If I run the following command:
> > >
> > > scan 'dbi_based_data', {FILTER => "PrefixFilter('0')", COLUMNS =>
> > > 'raw_data:processlist', TIMERANGE => [1499205600000, 1499206200000]}
> > >
> > > I get all the rows that start with "0".  Since hbase stores things in
> > > lexical order, it seems like all rows that were stored lexically first
> > > get returned.
> > >
> > > However, if I run the following command, hbase times out.  Even if I
> > > extend the timeout period to 3 minutes, it still times out.
> > >
> > > scan 'dbi_based_data', {FILTER => "PrefixFilter('28')", COLUMNS =>
> > > 'raw_data:processlist', TIMERANGE => [1499205600000, 1499206200000]}
> > >
> > > It seems like any prefix other than "0" times out (like prefix = 28
> > > above).  I don't understand why it would time out, since it should be
> > > able to calculate which region/regionserver to go to from the prefix
> > > I gave it.
> > >
> > >
> > > I performed "hbase hbck" and it says that
> > >
> > > 9 region servers are alive, 2 are dead
> > >
> > > # of total regions is 15850 for the db but there are only 350 for the
> > > table I'm querying.  There are 0 inconsistencies so the status is "OK".
> > >
> > > Thanks in advance for any help you can give me.
> > >
>
