hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: querying questions
Date Fri, 28 Oct 2011 16:23:15 GMT
On Fri, Oct 28, 2011 at 4:30 AM, Rita <rmorgan466@gmail.com> wrote:
> Couple of questions:
> What is the best delimiter for a key? Does it even matter? I read somewhere
> that using a \t is optimal for a reason.

Do without a delimiter if you can.  Just make the row key elements of
fixed size.
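
As a minimal sketch of what a fixed-size key could look like (plain Java,
no HBase dependency): the field names, the widths and the
server/user/timestamp layout below are all assumptions made for
illustration, not the schema from this thread. Each string field is
zero-padded to a fixed width, so the parser slices by offset and never
needs a delimiter.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical layout: [server: 16 bytes][user: 8 bytes][timestamp: 8 bytes]
final class FixedWidthRowKey {
    static final int SERVER_WIDTH = 16; // assumed maximum server-name length
    static final int USER_WIDTH = 8;    // assumed maximum user-name length
    static final int KEY_LENGTH = SERVER_WIDTH + USER_WIDTH + Long.BYTES;

    // Right-pad an ASCII value with zero bytes up to the fixed width.
    static byte[] pad(String value, int width) {
        byte[] raw = value.getBytes(StandardCharsets.US_ASCII);
        if (raw.length > width) {
            throw new IllegalArgumentException("value longer than " + width + " bytes: " + value);
        }
        return Arrays.copyOf(raw, width);
    }

    // Concatenate the fixed-width fields; the long is written big-endian.
    static byte[] encode(String server, String user, long timestampMillis) {
        return ByteBuffer.allocate(KEY_LENGTH)
                .put(pad(server, SERVER_WIDTH))
                .put(pad(user, USER_WIDTH))
                .putLong(timestampMillis)
                .array();
    }

    static String server(byte[] key) {
        return unpad(key, 0, SERVER_WIDTH);
    }

    static String user(byte[] key) {
        return unpad(key, SERVER_WIDTH, USER_WIDTH);
    }

    static long timestamp(byte[] key) {
        return ByteBuffer.wrap(key, SERVER_WIDTH + USER_WIDTH, Long.BYTES).getLong();
    }

    // Strip the trailing zero padding and decode the remaining bytes.
    private static String unpad(byte[] key, int offset, int width) {
        int end = offset + width;
        while (end > offset && key[end - 1] == 0) {
            end--;
        }
        return new String(key, offset, end - offset, StandardCharsets.US_ASCII);
    }
}
```

FixedWidthRowKey.encode("web01", "rita", 1319818995000L) always yields a
32-byte key with these widths. Because the timestamp is big-endian,
non-negative timestamps under the same server and user sort
chronologically in byte order.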

It looks, though, like your key schema would require a delimiter (I'm
guessing 'server' can be anything -- or can it be constrained so all
servers have same-sized names?)

If you have to have a delimiter, choose one that is illegal in a
server name or user name so you can be sure it doesn't show up in
either ever and throw off your parse.
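
A rough sketch of enforcing that rule at write time, in plain Java. The
'|' delimiter is only an assumption here (it is not a legal hostname
character, but check it against your own user names), and the class and
method names are made up for illustration. The point is to reject any
field containing the delimiter when the key is built, so the parse can
never be thrown off later.

```java
import java.util.ArrayList;
import java.util.List;

final class DelimitedRowKey {
    // Assumed delimiter; pick one your server and user names can never contain.
    static final char DELIMITER = '|';

    // Join the fields, refusing any field that contains the delimiter.
    static String join(String... fields) {
        for (String field : fields) {
            if (field.indexOf(DELIMITER) >= 0) {
                throw new IllegalArgumentException("field contains delimiter: " + field);
            }
        }
        return String.join(String.valueOf(DELIMITER), fields);
    }

    // Split on the literal delimiter character; no regex is involved.
    static List<String> split(String key) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = key.indexOf(DELIMITER); i >= 0; i = key.indexOf(DELIMITER, start)) {
            parts.add(key.substring(start, i));
            start = i + 1;
        }
        parts.add(key.substring(start));
        return parts;
    }
}
```

With this, DelimitedRowKey.join("web01", "rita") builds the key, and
DelimitedRowKey.join("web|01", "rita") fails up front instead of
producing a key that splits into three parts.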

> For these types of queries I have been using filters particularly,
> RegexStringComparator
> (w/start&stop) and things seem to work to an extent. I was wondering is this
> the correct way to query or is there a more optimal way?

Regex'ing over keys will be expensive.  HBase is all bytes.  To regex,
you need to change the bytes into a String.  Java Strings are UTF-16
internally, so every key the filter looks at has to be decoded into a
new String before the regex can run.  Can you keep your key as raw
bytes and do byte compares in your filtering?
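
To illustrate what a byte compare looks like, here is a small plain-Java
sketch; it is not HBase's implementation, and the class and method names
are hypothetical. It checks a row key against a prefix without building
a String, and it derives a stop key for a prefix scan by incrementing
the last byte that is not 0xFF. HBase itself ships PrefixFilter and
BinaryPrefixComparator in org.apache.hadoop.hbase.filter for this kind
of matching.

```java
import java.util.Arrays;

final class RowKeyPrefixMatch {
    // True if rowKey starts with prefix, comparing raw bytes only.
    static boolean hasPrefix(byte[] rowKey, byte[] prefix) {
        if (rowKey.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Smallest key that sorts after every key starting with prefix
    // (unsigned byte order). An empty result means "no upper bound",
    // which happens when the prefix is empty or all 0xFF bytes.
    static byte[] stopKeyForPrefix(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                byte[] stop = Arrays.copyOf(prefix, i + 1);
                stop[i]++;
                return stop;
            }
        }
        return new byte[0];
    }
}
```

Using the prefix as the scan's start row and stopKeyForPrefix(prefix) as
its stop row limits the scan to matching rows, so no per-row filter has
to run for a pure prefix query.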

> I also couldnt find any examples using filters for timeseries data, is there
> a place I should be looking at?

I thought tsdb (OpenTSDB) used filters?

