hbase-user mailing list archives

From Liam Slusser <lslus...@gmail.com>
Subject Re: first scan returns nothing and how big is big?
Date Mon, 30 Jun 2014 22:44:47 GMT
I'll try to put together a unit test and report back.

thanks,
liam



On Mon, Jun 30, 2014 at 3:25 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> FuzzyRowFilter is an interesting filter around which there has been user
> feedback on various scenarios.
>
> If you can write a unit test which exhibits the problem in your first
> point, that would help us track down the root cause.
>
> I checked FuzzyRowFilter in 0.94 branch - last fix for FuzzyRowFilter
> was HBASE-7628, which you already have in 0.94.15
>
> Cheers
>
>
> On Mon, Jun 30, 2014 at 2:59 PM, Liam Slusser <lslusser@gmail.com> wrote:
>
> > Hey Hbase list,
> >
> > First question - It seems that the first time I do a scan with a few
> > filters the system returns nothing - it also takes a long time (20-30
> > seconds) - but I can run the exact same request over again and it goes
> > much quicker (2-3 seconds for a total scan, I figured things are cached
> > the second time which is fine) but the 2nd time around I get results.
> > It is the exact same scan request.  I don't get any errors and nothing
> > in the log files...
> >
> > Has anybody else noticed anything like this?  I'm running HBase
> > 0.94.15-cdh4.6.0 and using FuzzyRowFilter with SingleColumnValueFilter on
> > top of my scan.
> >
> > Second question - how big is too big?  I am using my hbase database to
> > store parsed logs, currently I am breaking the logs into monthly tables.
> > I am inputting around 350 million logs a day so near the end of the
> > month there is an estimated 8-10 billion rows per table.  All seems to
> > be fine, I am able to use FuzzyRowFilter+SingleColumnValueFilter and
> > scan over an hour of logs in about 10 seconds so the performance is
> > still very decent.  Is there any advantage to breaking the table up
> > into separate days?  Is there a best practices guide for tables this
> > big?
> >
> > thanks!
> > liam
> >
>
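
For readers following the thread, the row-key matching that FuzzyRowFilter
performs can be illustrated without a cluster. The sketch below is plain Java,
not HBase code: the class name, the `matches` helper, and the `????_web01` key
layout are hypothetical and chosen only for illustration. It assumes the
documented mask convention (0 = the byte at that position is fixed and must
match, 1 = the byte may be anything) and, as a simplification that may differ
from HBase's own behavior, treats keys shorter than the pattern as
non-matching.

```java
import java.nio.charset.StandardCharsets;

public class FuzzyMatchSketch {

    // Returns true when every fixed position (mask byte 0) of the pattern
    // equals the corresponding byte of the row key. Positions with mask
    // byte 1 are ignored. Simplifying assumption: a row key shorter than
    // the pattern never matches.
    static boolean matches(byte[] rowKey, byte[] pattern, byte[] mask) {
        if (pattern.length != mask.length) {
            throw new IllegalArgumentException("pattern and mask must have equal length");
        }
        if (rowKey.length < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && rowKey[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical key layout: 4 arbitrary bytes, then "_web01".
        byte[] pattern = "????_web01".getBytes(StandardCharsets.US_ASCII);
        byte[] mask = {1, 1, 1, 1, 0, 0, 0, 0, 0, 0};

        String[] keys = {"0630_web01", "0701_web01", "0630_web02"};
        for (String key : keys) {
            boolean hit = matches(key.getBytes(StandardCharsets.US_ASCII), pattern, mask);
            System.out.println(key + " -> " + hit);
        }
    }
}
```

A check like this can help separate "the mask does not describe the keys I
think it does" from a server-side problem before writing a full unit test
against a mini cluster.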
